Comment by nxobject

2 years ago

It looks like "500M of evolution" isn't a description (however indirect) of an iterative process, but a metric that measures differences in results:

> But in order for ESM3 to solve its training task of predicting the next masked token the model must learn how evolution moves through the space of potential proteins. In this sense, ESM3 can be thought of as an evolutionary simulator. A traditional evolutionary analysis of the ancestry of esmGFP is paradoxical as the protein was created outside natural processes, but still we can draw insight from the tools of evolutionary biology on the amount of time it would take for a protein to diverge from its closest sequence neighbor through natural evolution. We find naturally occurring GFPs with similar levels of sequence identity are separated by hundreds of millions of years of evolution. Using an analysis similar to one might perform on a new protein found in the natural world, we estimate that esmGFP represents an equivalent of over 500 million years of natural evolution performed by an evolutionary simulator.

Yeah, it seems that they created a new protein and then said it’s equivalent to 500 million years of evolution, which of course it clearly is not.