
Comment by Retric

2 days ago

I’m not so sure it’s going to even do that much. People are currently happy to use LLMs, but the outputs aren’t accurate and don’t seem to be improving quickly.

A YouTuber I watch regularly includes questions they asked ChatGPT, and every single time there’s a detailed response in the comments showing how the output is wildly wrong, with multiple mistakes.

I suspect the backlash from disgruntled users is going to hit the industry hard and these models are still extremely expensive to keep updated.

Using function calls for correct-answer lookup already practically eliminates this. It's not widespread yet, but it's already easy enough to be practical for many.

New models aren't being trained specifically on single answers, which will only help.

The expense for the larger models is something to be concerned about. Small models with function calls are already great, especially if you narrow down what they are being used for. Not seeing their utility is just a lack of imagination.
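For anyone who hasn't seen the function-call pattern, here is a rough sketch of the idea: the model emits a structured request naming a tool, and ordinary code does the lookup against a trusted source instead of the model guessing. Everything here is an assumption for illustration: the `PRICES` table is made-up data, `lookup_price` and `dispatch` are hypothetical names, and `model_output` is a hard-coded stand-in for what a model might emit, not any vendor's actual API.

```python
import json

# Made-up catalog standing in for a trusted data source (illustrative only).
PRICES = {"widget": 4.99, "gadget": 12.50}

def lookup_price(item: str) -> dict:
    """Return a stored value, or an explicit miss instead of a guess."""
    key = item.strip().lower()
    if key in PRICES:
        return {"found": True, "item": key, "price": PRICES[key]}
    return {"found": False, "item": key}

# Hypothetical registry of tools the model is allowed to call.
TOOLS = {"lookup_price": lookup_price}

def dispatch(tool_call_json: str) -> dict:
    """Parse a tool call of the form {"name": ..., "arguments": {...}} and run it."""
    call = json.loads(tool_call_json)
    func = TOOLS.get(call.get("name"))
    if func is None:
        return {"error": f"unknown tool: {call.get('name')}"}
    return func(**call.get("arguments", {}))

# Stand-in for a model's structured output; a real model would generate this.
model_output = '{"name": "lookup_price", "arguments": {"item": "Widget"}}'
result = dispatch(model_output)
missing = dispatch('{"name": "lookup_price", "arguments": {"item": "sprocket"}}')
```

The point of the design is that a miss comes back as `found: False` rather than a plausible-sounding number, so the model (or the caller) can say "I don't know" instead of inventing an answer.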

  • Right. And any particular question people think AIs are bad at also has a comments section of people who have run better crafted prompts that do the job just fine. The consensus is heading more towards "well damn, actually LLMs might be all we need" rather than "LLMs are just a stepping stone" - but either way, that's fine, cuz plenty of more advanced architecture uses are on their way (especially error correction / consistency frameworks).

    I don't believe there are any significant academic critiques doubting this. There are a lot of armchair hot takes, and perceptions that this stuff isn't improving up to their expectations, but those are pretty divorced from any rigorous analysis of the field, which is still improving at staggeringly fast rates compared to any other field of research. Ain't no wall, folks.

    • “Crafting a better prompt” is often simply spinning an RNG again and again until you end up with an answer that happens to be good enough.

      In the real world, if you know the correct answer you don’t need to ask the question. A self-driving car that needs you to pay attention isn’t self-driving.

      Any system can give a canned response; the value of AI is completely in its ability to handle novelty without hand-holding. And none of these systems actually do that even vaguely well in practice, rather than providing responses that are vaguely close to correct.

      If I ask for a summary of an article and it gets anything wrong in the article that’s a 0 because now I need to read the article to know what it said. Arguably the value is actually negative here.