Comment by scirob
1 month ago
Has anyone tried building a modern Stack Overflow that's actually designed for AI-first developers? The core idea: question gets asked → immediately shows answers from 3 different AI models. Users get instant value. Then humans show up to verify, break it down, or add production context. But flip the reputation system: instead of reputation for answers, you get it for catching what's wrong or verifying what works. "This breaks with X" or "verified in production" becomes the valuable contribution. Keep federation in mind from day one (did:web, did:plc) so it's not another closed platform. Stack Overflow's magic was making experts feel needed. They still do—just differently now.
Oh, so it wasn't bad enough to spot bad human answers as an expert on Stack Overflow... now humans should spend their time spotting bad AI answers? How about a model where you ask a human and no AI input is allowed, to make sure that everyone has everyone else's full attention?
Why disallow AI input? Is it that poor? Surely it isn't.
The entire purpose of answering questions as an "expert" on S.O. is/was to help educate people who were trying to learn how to solve problems mostly on their own. The goal isn't to solve the immediate problem, it's to teach people how to think about the problem so that they can solve it themselves the next time. The use of AI to solve problems for you completely undermines that ethos of doing it yourself with the minimum amount of targeted, careful questions possible.
What's the point of AI on a site like that? Wouldn't you just ask an LLM directly if you were fine with AI answers?
3 replies →
Am I reading an AI trying to trick me into becoming its subordinate?
In 2014, one benefit of Stack Overflow / Exchange is a user searching for work can include that they are a top 10% contributor. It actually had real world value. The equivalent today is users with extensive examples of completed projects on Github that can be cloned and run. OP's solution if contained in Github repositories will eventually get included in a training model. Moreover, the solution will definitely be used for training because it now exists on Hacker News.
I had a conversation with a couple accountants / tax-advisor types about them participating in something like this for their specialty. And the response was actually 100% positive because they know that there is a part of their job that the AI can never take 1) filings requires you to have a human with a government approved license 2) There is a hidden information about what tax optimization is higher or lower risk based on their information from their other clients 3) Humans want another human to make them feel good that their tax situation is taken care of well.
But also many said that it would be better if one wraps this in an agency so the leads that are generated from the AI accounting questions only go to a few people instead of making it fully public stackexchange like.
So +1 point -1 point for the idea of a public version.
LOL. As a top 10% contributor on Stack Overflow, and on FlashKit before that, I can assure you that any real world value attached to that status was always imaginary, or at least highly overrated.
Mainly, it was good at making you feel useful and at honing your own craft - because providing answers forced you to think about other people's questions and problems as if they were little puzzles you could solve in a few minutes. Kept you sharp. It was like a game to play in your spare time. That was the reason to contribute, not the points.
Yeah, they didn't even bother to suggest paying you with tokens for the job well done! The audacity!
hehe yea this existing of course. like these guys https://yupp.ai/ they have not announced the tokens but there are points and they got all their VC money from web3 VC. I'm sure there are others trying
hehe, damn I did let an AI fix my grammer and they promptly put the classic tell of — U+2014 in there
AI is generally setup to return the "best" answer as defined as the most common answer, not the rightest, or most efficient or effective answer, unless the underlying data leans that way.
It's why AI based web search isn't behaving like google based search. People clicking on the best results really was a signal for google on what solution was being sought. Generally, I don't know that LLMs are covering this type of feedback loop.
That seems like a horrible core idea. How is that different from data labeling or model evaluation?
Human beings want to help out other human beings, spread knowledge and might want to get recognition for it. Manually correcting (3 different) automation efforts seems like incredible monotone, unrewarding labour for a race to the bottom. Nobody should spend their time correcting AI models without compensation.
Great point, thanks for the reality check.
Speaking of evals the other day I found out that most of the people who contributed to Humanities Last Exam https://agi.safe.ai/ got paid >$2k each. So just adding to your point.
I think this could be really cool, but the tricky thing would be knowing when to use it instead of just asking the question directly to whichever AI. It’s hard to know that you’ll benefit from the extra context and some human input unless you already have a pretty good idea about the topic.
Presumably over time said AI could figure out if your question had already been answered and in that case would just redirect you too the old thread instead.