Comment by jackfranklyn

2 days ago

The benchmark point is interesting but I think it undersells what the complexity buys you in practice. Yes, a minimal loop can score similarly on standardised tasks - but real development work has this annoying property of requiring you to hold context across many files, remember what you already tried, and recover gracefully when a path doesn't work out.

The TODO injection nyellin mentions is a good example. It's not sophisticated ML - it's bookkeeping. But without it, the agent will confidently declare victory three steps into a ten-step task. Same with subagents - they're not magic, they're just a way to keep working memory from getting polluted when you need to go investigate something.

The 200-line version captures the loop. The production version captures the paperwork around the loop. That paperwork is boring but turns out to be load-bearing.
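
As a rough illustration of the TODO bookkeeping described above, here is a minimal Python sketch. Everything in it is a hypothetical stand-in (`TodoList`, `run_agent`, the `COMPLETE <task>` / `DONE` reply format, and the stub model), not how any particular production agent does it. It shows only the two pieces of paperwork: the list is re-injected into the prompt on every turn, and a "done" claim is rejected while items are still open.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class TodoList:
    """Plain bookkeeping: task name mapped to a done flag (hypothetical helper)."""
    items: Dict[str, bool] = field(default_factory=dict)

    def add(self, task: str) -> None:
        self.items.setdefault(task, False)

    def complete(self, task: str) -> None:
        if task in self.items:
            self.items[task] = True

    def pending(self) -> List[str]:
        return [task for task, done in self.items.items() if not done]

    def render(self) -> str:
        return "\n".join(
            f"[{'x' if done else ' '}] {task}" for task, done in self.items.items()
        )


def run_agent(model: Callable[[str], str], todos: TodoList, max_steps: int = 20) -> List[str]:
    """Loop until the model says DONE and no TODO item is still open."""
    transcript: List[str] = []
    for _ in range(max_steps):
        # The injection: the current TODO state is re-sent on every turn.
        prompt = "Current TODO list:\n" + todos.render()
        reply = model(prompt)
        transcript.append(reply)
        if reply.startswith("COMPLETE "):
            todos.complete(reply[len("COMPLETE "):])
        elif reply == "DONE":
            if not todos.pending():
                break
            # Premature victory: note the rejection and keep looping.
            transcript.append("REJECTED: open items remain")
    return transcript


def make_stub_model() -> Callable[[str], str]:
    """Stand-in for an LLM call: claims victory on turn one, then works the list."""
    calls = {"n": 0}

    def model(prompt: str) -> str:
        calls["n"] += 1
        if calls["n"] == 1:
            return "DONE"  # the overconfident early exit
        open_items = [line[4:] for line in prompt.splitlines() if line.startswith("[ ] ")]
        return f"COMPLETE {open_items[0]}" if open_items else "DONE"

    return model
```

With two tasks on the list, the stub's first "DONE" gets rejected and the loop only ends once both items are ticked off. A real agent would replace the stub with an actual model call and a less brittle reply format.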

[flagged]

  • Anyone who disagrees with this, please check the OP's previous comments. That's all the proof you need.

    And then, as an exercise, ask yourself why you were willing to give this comment leniency?

  • This site has gone full Tower of Babel. I've seen at least a thousand "AI comment" callouts on this site in the last month and at this point I'm pretty sure 99% of them are wrong.

    In fact, can someone link me to a disputed comment where the consensus ended up being that it actually was AI? I don't think I've seen one.

    • You know how the chicken sexers do their thing, but can't explain it? Like they can't write a list of things they check for. And when they want to train new people they have them watch (apprentice style) the current ones, and eventually they also become good at doing it themselves?

      It's basically that. I can't explain it (I tried listing the tells in a comment below), but it's not just a list of things you notice. You notice the whole message, the cadence, the phrases that "add nothing". You play with enough models, you see enough generations and you start to "see it".

      If you'd like to check for yourself, check that user's comment history. It will become apparent after a few messages. They all have these tells. I don't know how else to explain it, but it's there.

  • Unclear why you think this is ChatGPT, doesn't read like it at all to me. Many people - myself included - use punctuation to emphasize and clarify.

    • The tells are in the cadence. And the "not X but Y" construction. And the last line that basically says nothing, while using big words. It's like "In conclusion", but worded differently. Enough tells for me to click on their history. They have the exact same cadence on every comment. It's a bit more sophisticated than "ChatGPT, write a reply", but it's still 100% AI-generated. Check it out, you'll see it after a few messages in their history.