Comment by galaxyLogic
19 hours ago
I wonder: is the quality of AI answers going up over time or not? Last weekend I spent a lot of time with Perplexity trying to understand why my SeqTrack device didn't do what I wanted it to do, and it seems Perplexity had a wrong idea of how the buttons on the device are laid out, so it gave me wrong or confusing answers. I spent literally hours feeding it different prompts to get an answer that would solve my problem.
If it had given me the right, easy-to-understand answer right away, it would have taken 2 minutes of both MY time and ITS time. My point is that if AI improves, we will need less of it to get our questions answered. Or perhaps AI usage goes up as its answers improve?
With vision models (SOTA models like Gemini and ChatGPT can do this), you can take a picture or screenshot of the button layout, upload it, and have the model work from that. Feeding it current documentation (e.g., a PDF of the user manual) helps too.
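If you want to do this programmatically rather than through the chat UI, here's a rough sketch using the OpenAI Python SDK; the model name, file name, and question are placeholders, and other providers' APIs differ in the details:

    # Minimal sketch: send a photo of a device's buttons plus a question
    # to a vision-capable model. Assumes OPENAI_API_KEY is set in the
    # environment; "button_layout.jpg" and the question are placeholders.
    import base64
    from openai import OpenAI

    client = OpenAI()

    with open("button_layout.jpg", "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")

    response = client.chat.completions.create(
        model="gpt-4o",  # any vision-capable model
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Here is a photo of my device's buttons. "
                         "Which button enters pairing mode?"},
                # Images are passed inline as a base64 data URL.
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
            ],
        }],
    )
    print(response.choices[0].message.content)

The point is just that the model then reasons from the actual layout in front of it instead of whatever (possibly wrong) layout it memorized in training.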
Referencing outdated documentation or straight-up hallucinating answers is still an issue. It is getting better with each model release, though.
Always worth trying a different model, especially if you’re using a free one. I wouldn’t take one data point too seriously either.
The data strongly shows that the quality of AI answers is rapidly improving. If you want a good example, check out the Sixty Symbols video by Brady Haran, where they revisited having AI answer a quantum physics exam after trying the same thing 3 years ago. The improvement is IMMENSE and undeniable.
If the AI hasn't specifically learned about SeqTracks as part of its training, it's not going to give you useful answers. AI is not a crystal ball.
The problem is its inability to say "I don't know". As soon as you reach the limits of the model's knowledge, it will readily start fabricating answers.
Both true. Perplexity knows a lot about the SeqTrack; I assume it has read the UserGuide. But it gets some things wrong, especially, it seems, things it would have to understand by looking at the pictures.
I'm just wondering whether there's a clear path for it to improve, and on what timetable. The fact that it does not tell you when it is "unsure" of course makes things worse for users. (It is never unsure.)
That's nowhere near as true as it was as recently as a year ago.