Comment by kridsdale3

2 days ago

Not just math: all of science suffers heavily from the problem that even leading researchers can read less than 1% of the published work in their field.

Google Scholar was a huge step forward for doing meta-analysis vs a physical library.

But agents scanning the vastness of PDFs to find correlations and insights far beyond human context capacity will, I hope, surface a lot of knowledge that we have technically already collected but remain ignorant of.

This idea is just ridiculous to anyone who's worked in academia. The theory is nice, but academic publishing is currently in the late stages of a huge death spiral.

In any given scientific niche, there is a huge amount of tribal knowledge that never gets written down anywhere; it just passes from one grad student to the rest of the group, and from there spreads by percolation through the tiny niche. And papers are never honest about the real performance of the results or about what does not work; there is always cherry-picking of benchmarks, comparisons, etc.

There is absolutely no way you can get these kinds of insights beyond human context capacity that you speak of. The information necessary does not exist in any dataset available to the LLM.

  • The same could be said about programmers, but we have adapted and started writing it all down so that AI can use it.

    • No no, in comparison to academia, programmers have been extremely diligent at documenting exactly how stuff works and providing fairly reproducible artifacts since the 1960s.

      Imagine trying to teach an AI how to code based only on slide decks from consultants. No access to documentation, no Stack Overflow, no open-source code in the training data; just sales pitches and success stories. That's close to how absurd this idea is.

Exactly, and I think not every instance can be dismissed as a hallucination; there is so much latent knowledge the models might have explored.

It is likely we will see some AlphaGo-style new moves emerge in existing research workflows, worked out by AI wherever there is some verification logic to check against. Humans could probably never go into that space, or maybe no researcher ever ventured there for various reasons, since progress in general is almost always incremental.

Google Scholar is still ignoring a huge amount of scholarship that is decades old (pre-digital) or even centuries old (and written in now-unused languages that ChatGPT could easily make sense of).