Comment by gilleain
2 days ago
This is about folds, not amino acids - even if you used a larger alphabet of residues, I somehow doubt that you would get many more folds.
Thinking more about the question of protein _length_ - I'm also not convinced that longer proteins (more than, say, 750 aa) would produce more novel folds. Larger proteins tend to be multi-domain; that is, a longer chain will fold into multiple compact domains, each with its own fold, so extra length mostly adds more domains rather than a new kind of fold.
I suppose there could be 'megafolds' out there in fold space, beyond 1000 aa - like a 12-bladed beta propeller, or a beta-helix with alpha helices on the outside, or some other wacky thing. Whether that would substantially increase the total number of folds, I doubt, but that is of course a guess.
(ref - https://pmc.ncbi.nlm.nih.gov/articles/PMC10251718/ for protein lengths)
The amino acid sequence defines the fold.
And really? Just any random sequence gets you a new fold. I mean, it won't be very useful if you pick a random one, but it'll work and be a new one.
I think this is just an artifact of natural selection basing new proteins on existing ones, not an actual useful ("rational" if you can call natural selection rational) selection limit. I don't think that if you designed proteins from first principles you'd see this limitation in your results.
A random sequence may not fold at all! I seem to remember a paper that tried this, creating a bunch of random proteins, and checking how much structure they had - I think they were helical bundles, but don't quote me.
The nice thing about stable folds is that 'nearby' sequences in sequence space - as in, those a point mutation or two away - usually have the same fold. If each sequence had a completely different fold, then mutation would be much more destructive. Surprisingly, however, sequences that are far apart in sequence space can also adopt the same fold (convergent evolution).
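To make 'nearby in sequence space' a bit more concrete, here is a rough Python sketch. It only counts and enumerates single-substitution neighbours of a sequence and measures Hamming distance; it says nothing about whether any of them actually fold. The names `hamming` and `point_mutants` and the 10-residue example peptide are made up for illustration.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes for the 20 standard residues


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def point_mutants(seq: str):
    """Yield every sequence exactly one substitution away from seq."""
    for i, original in enumerate(seq):
        for residue in AMINO_ACIDS:
            if residue != original:
                yield seq[:i] + residue + seq[i + 1:]


seq = "MKTAYIAKQR"  # hypothetical peptide, illustration only
neighbours = list(point_mutants(seq))

# 19 alternative residues at each of 10 positions
print(len(neighbours))                                # 190
print(all(hamming(seq, m) == 1 for m in neighbours))  # True
```

So a chain of length L has only 19 × L point-mutation neighbours, a tiny neighbourhood compared with the 20^L sequences of that length.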
This reminds me of structural studies of proteins encoded by de novo genes in eukaryotes. They are usually either intrinsically disordered or adopt a molten-globule-like state.