Comment by idontknowmuch

3 days ago

What tools are "actually working" as of a few years ago? Foundation models, LLMs, computer vision models? Lab automation software and hardware?

If you look at the recent research on ML/AI applications in biology, the majority of the work has not provided any tangible benefit to the drug discovery pipeline (e.g. clinical trial efficiency, drugs with low ADR/high efficacy).

The only areas showing real benefit have been off-the-shelf LLMs for streamlining informatics work, and protein folding/binding research. But protein structure work is arguably a tiny fraction of the overall cost of bringing a drug to market, and the space is massively oversaturated right now with dozens of startups chasing the same solved problem post-AlphaFold.

Meanwhile, the actual bottlenecks—predicting in vivo efficacy, understanding complex disease mechanisms, navigating clinical trials—remain basically untouched by current ML approaches. The capital seems to be flowing to technically tractable problems rather than commercially important ones.

Maybe you can elaborate on what you're seeing? But from where I'm sitting, most VCs funding bio startups seem to be extrapolating from AI success in other domains without understanding where the real value creation opportunities are in drug discovery and development.

These days it's almost trivial to design a binder against a target of interest with computation alone (tools like boltzgen, many others). While that's not the main bottleneck to drug development (imo you are correct about the main bottlenecks), it's still a huge change from the state of technology even 1 or 2 years ago, where finding that same binder could take months or years, and generally with a lot more resources thrown at the problem. These kinds of computational tools only started working really well quite recently (e.g., high enough hit rates for small scale screening where you just order a few designs, good Kd, target specificity out of the box).

So both things can be true: the more important bottlenecks remain, but progress on discovery work has been very exciting.

  • As noted, I agree on the great strides made in the protein space. However, the oversaturation and redundancy of tools and products in this space should make it pretty obvious that selling API calls and compute time for protein binding, and related tasks, isn’t a viable business beyond the short term.