Comment by axg11

2 years ago

If we knew:

(a) the structure of every protein (what DeepMind is doing here)

(b) how different protein structures interact (i.e. protein complexes - DeepMind is working on this but not there yet)

Then we could use those two building blocks to design new proteins (drugs) that do what we want. If we solve those two problems with very high accuracy, we can also reduce the time it takes to go from starting a drug discovery programme to approved medicine.

Obtaining all protein structures and determining how they interact is a key step towards making biology more predictable. Previously, solving the structure of a protein was very time consuming. As a result, we didn’t know the structure for a majority of proteins. Now that it’s much faster, downstream research can move faster.

Caveat: we should remember that these are all computational predictions. AlphaFold’s predictions can be wrong and protein structures will still need to be validated. Having said that, lots of validation has already occurred and confidence in the predictions grows with every new iteration of AlphaFold.

> Then we could use those two building blocks to design new proteins (drugs) that do what we want. If we solve those two problems with very high accuracy, we can also reduce the time it takes to go from starting a drug discovery programme to approved medicine.

Drugs are usually not proteins but small molecules designed to help or interfere with the operation of proteins.

  • That is only true because of our current tools and capabilities. With improved manufacturing techniques and AlphaFold++ I think biologics will dominate. Even still, there are ~2000 approved biologics [0].

    [0] - https://purplebooksearch.fda.gov/advanced-search

    • Yep, proteins are so much more flexible / precise than small molecules. Also we can get the body to produce them. Think mRNA vaccines.

How are the predictions validated? By waiting for the old-fashioned way, i.e. very difficult crystal structure experiments? Or something else?

  • Most of them are not validated; they are just estimates based on previous results from sequences with known structure.

    Every two years there is a massive competition called CASP, where labs submit protein structures that they have determined experimentally (by EM, X-ray crystallography, or NMR) but not yet released, and other labs attempt to predict these structures using their software. AlphaFold2 absolutely destroyed the other labs in the main contest (regular monomeric targets, predominantly globular) for structure prediction two years ago, in CASP14.

    https://predictioncenter.org/casp14/zscores_final.cgi

    The latest contest, CASP15, is currently underway and expected to end this year. As with all ML, the usual caveats apply to the models Google generated -- the dangers of overfitting to existing structures, artifacts based on the way the problem was modelled, etc.

  • > very difficult crystal structure experiments?

    Apart from X-ray crystallography there are other methods for structure determination such as nuclear magnetic resonance (NMR) or cryo-electron microscopy (cryo-EM). The latter has seen a dramatic improvement in resolution over the last decade.

  • If the predictions are generally good enough, one could also skip the validation and try directly to get a desired effect or reaction. Strictly speaking that isn't validating the structure, but depending on the use case it might be easier to just go for an outcome. It's really a question of application and cost efficiency.

    • I mean, nothing stopped you from skipping validation with pre-AlphaFold techniques either: for drug discovery, say, you could already run drug screening against a predicted structure. It's just that drug screening software is itself error-prone, so you are still going to have to do some validation. However, having an idea of a potential structure means you can validate it with other techniques that are simpler and less expensive or time-consuming (I'm thinking of things similar to FRET).

      Another idea is that these may come into play for anti-verification: if you are drug screening against a known structure, you could potentially use these more flawed predicted structures of proteins that you don't want to target but that may be similar, and try to reduce the drug's efficacy at binding them. Or something to that effect. These are all fun ideas currently being explored in that space, but we'll see where it takes us.

  • For a lot of X-ray crystallography cases, some of the difficulty is working out the actual structure from the collected data with no prior information. This makes a lot of that much easier, because with molecular replacement (https://en.wikipedia.org/wiki/Molecular_replacement) something that is "close, but not correct" can be used to bootstrap the actual structure.
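
For what it's worth, the "anti-verification" idea above (screen against the protein you want to hit, then penalise binding to similar proteins you want to avoid) could be sketched roughly like this. Everything here is an assumption for illustration: the `Candidate` type, the compound and homolog names, and the scores are made up, and the scores are taken to be docking-style values where lower means tighter predicted binding. It is not how any particular screening package works.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Candidate:
    name: str
    # Hypothetical docking-style score against the intended target
    # (assumption: lower means tighter predicted binding).
    target_score: float
    # Hypothetical scores against predicted structures of similar
    # proteins we do NOT want to hit, keyed by protein name.
    off_target_scores: Dict[str, float]


def selectivity_margin(c: Candidate) -> float:
    """Gap between the strongest off-target score and the target score.

    Positive means the compound is predicted to bind the target more
    tightly than any of the off-targets; larger is better.
    """
    if not c.off_target_scores:
        return float("inf")
    return min(c.off_target_scores.values()) - c.target_score


def rank_by_selectivity(cands: List[Candidate], min_margin: float = 0.0) -> List[Candidate]:
    """Drop candidates below the margin, then sort most selective first."""
    kept = [c for c in cands if selectivity_margin(c) >= min_margin]
    return sorted(kept, key=selectivity_margin, reverse=True)


# Made-up numbers, purely illustrative.
candidates = [
    Candidate("cmpd_A", -9.0, {"homolog_1": -8.5, "homolog_2": -6.0}),
    Candidate("cmpd_B", -8.0, {"homolog_1": -5.0, "homolog_2": -4.5}),
    Candidate("cmpd_C", -7.0, {"homolog_1": -9.0}),
]

ranked = rank_by_selectivity(candidates, min_margin=0.0)
for c in ranked:
    print(c.name, round(selectivity_margin(c), 2))
```

Note that cmpd_A has the best raw target score but ranks below cmpd_B, because it is predicted to bind homolog_1 almost as tightly as the target. Since both the predicted structures and the scoring are error-prone, a ranking like this would only be a way to prioritise what to test experimentally.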