Comment by underdeserver
2 years ago
> Not sure what part of "it does homology modeling 2x better" you didn't see in my comment? AlphaFold scored something like 85% in CASP 2020; in CASP 2016, I-TASSER had, I think, 42%. So it's ~2x as good as I-TASSER, which is exactly what I said in my comment.
Wait, stop. I don't know anything about proteins, but 84% success is not ~2x better than 42%.
It doesn't really make sense to talk about "2x better" in terms of success percentages, but if you want a feel for it, I would measure 1/error instead (a 99%-correct system is 10 times better than a 90%-correct system), which makes AlphaFold around 3.6 times better.
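For concreteness, a quick Python sketch of that arithmetic, using the 84% and 42% figures quoted upthread:

    # Compare success rates by 1/error: "how rare is a failure?"
    def inverse_error(p):
        return 1.0 / (1.0 - p)

    alphafold, itasser = 0.84, 0.42  # success rates quoted upthread
    print(inverse_error(alphafold) / inverse_error(itasser))  # ~3.6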
I think the odds ratio ( p/(1-p) ) is the thing I'd use here. It gives the right limiting behavior (at p ~= 0, doubling p is twice as good, and at p ~= 1, halving 1-p is twice as good), and it's the natural way to express Bayes' rule, meaning you can say "I'm twice as sure (in odds ratio terms) based on this evidence" and have that be solely a property of the update, not the prior.
For the lazy: this would make AlphaFold about 7.25x better than the previous tools.
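A minimal Python sketch of that calculation (same 84% and 42% figures as above):

    # Odds p/(1-p) for each method, then the ratio of the two odds.
    def odds(p):
        return p / (1.0 - p)

    alphafold, itasser = 0.84, 0.42
    print(odds(alphafold) / odds(itasser))  # ~7.25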
Excellent comment. I think the issue is that "better" is underspecified and needs some precisification to be useful. The metric you are using here is the proper response to the question "how many times more surprising is it when method A fails than when method B fails?" This is in many cases what we care about, and it's probably what we care about here. The odds ratio seems to do a good job of capturing the scale of the achievement.
On the other hand, it's not necessarily the only thing we might care about under that description. If I have a manufacturing process that is 99.99% successful (the remaining 0.01% has to be thrown out), it probably does not strike me as a 10x improvement if the process is improved to 99.999% success. What I care about is the cost to produce the average product that can be sent to market, and this "10x improvement" changes that by only a very small amount.
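A rough Python sketch of that yield arithmetic (the unit cost of 1.0 is just an illustrative assumption):

    # Cost to ship one sellable unit scales as 1/yield.
    def cost_per_sellable_unit(yield_rate, unit_cost=1.0):
        return unit_cost / yield_rate

    # Going from 99.99% to 99.999% yield: ~10x better by odds ratio,
    # but well under a 0.01% reduction in cost per sellable unit.
    print(cost_per_sellable_unit(0.9999) / cost_per_sellable_unit(0.99999))  # ~1.00009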
TIL, thanks for this.