Comment by ninjagoo
2 days ago
> Read the papers, comprehend the papers, don't make someone else's computer do it for you
Why not?
Personally, I don't have the specialized knowledge, nor the time needed, to read and understand papers outside my own 2-3 domains. LLMs do. And I appreciate what they can do for me. They do it better, faster, and more accurately than most 'popular science', provide better coverage, and let me interact with the material at whatever depth I care to, better than any article.
It would be silly to pass up this capability to make my life better simply because random folks on the Internet disparage the quality of the output (contrary to my own experience) and make hand-wavy points about 'someone else's computer' while offering no credible or useful alternative :)
How do you evaluate the quality of a summary of a paper you do not have the knowledge to read and understand?
> How do you evaluate the quality of a summary of a paper you do not have the knowledge to read and understand?
Tough question. I think the straightforward answer is that you can't.
That said, I gain some confidence in an LLM's abilities from its performance on papers in domains that I do understand. Yes, it's not going to be the same across all domains, but the frontier labs do publish capability scores across different domains, and that helps me judge how closely to scrutinize its answers and how much salt to take them with.
I wonder if you have asked the same LLMs to explain or summarize a paper in one of your fields and see if it still makes sense.
It could be that the LLMs are good at stringing words together in a way that seems reasonable when you are not an expert yourself, much like people from other fields seem very knowledgeable until you compare many of them or hear/see them talk with each other.
> I wonder if you have asked the same LLMs to explain or summarize a paper in one of your fields and see if it still makes sense.
I have, and it does, hence my confidence in its ability to do the same in other domains. Depending on what you're using it for, it is advisable to maintain some level of quality control (spot checks, sampling, deep dives, more rigorous continuous review) as in any process control.
Nice, that's good to hear, and, going by the Zeitgeist I pick up on, kind of new, if I understand it correctly.