Comment by dogcomplex

9 days ago

Prompt crafting only matters when you're demonstrating the current edge of capabilities - next iteration, you can get away with a much more general/primitive prompt. Those demos are just people countering the "gotcha" arguments levied against LLMs, showing that even now those tasks can be done with a good prompt. Whenever it's a practical concern, though, just wait a little longer for the next model to smooth it out.

You don't have to pay attention, that's the point. You can code without reading code now. Sure, you gotta tell it what the app looks like with each iteration - but again, that's temporary until the next model comes out with vision good enough to assess that itself. None of this is permanently planned to require human interaction - it's just early days, and these models are progressing through mediums one at a time.

They're not canned responses either. They're bespoke mixtures of all the various elements of the current environment/context, translated into an answer. These models certainly handle novelty - that's the whole point. They already handle entire mediums of it - text and images - at expert levels. I think you're just being greedy for more, here.

As for consistency and avoiding error? There are benchmarks for that. There are error-checking methods. Those are all steadily improving too, and are already quite consistent on the easier topics/mediums. It would be foolish to think that's innately impossible for AI on the remaining ones.