
Comment by nextos

1 day ago

Yes, but it's not dramatically different from what is out there already.

There is a concerning gap between prediction and causality. In problems like this one, where many variables are highly correlated, prediction methods that have only an implicit notion of causality don't perform well.
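A minimal simulated sketch of that gap, not taken from any of the tools discussed: two hypothetical variants in one linkage block, where only `snp_a` affects the trait and `snp_b` merely tracks it. The allele frequency, the 5% mismatch rate, and the effect size are all made-up assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical linkage block: snp_a is causal, snp_b is a non-causal
# neighbour that carries the same allele as snp_a about 95% of the time.
snp_a = rng.binomial(1, 0.3, size=n).astype(float)
mismatch = rng.random(n) < 0.05
snp_b = np.where(mismatch, 1.0 - snp_a, snp_a)

# The trait depends on snp_a only (assumed effect size 1.0, unit noise).
trait = 1.0 * snp_a + rng.normal(0.0, 1.0, size=n)


def pearson(x, y):
    """Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])


ld = pearson(snp_a, snp_b)    # correlation between the two variants
r_a = pearson(snp_a, trait)   # marginal association of the causal variant
r_b = pearson(snp_b, trait)   # marginal association of the bystander

# Joint least-squares fit: intercept, snp_a and snp_b together.
X = np.column_stack([np.ones(n), snp_a, snp_b])
beta, *_ = np.linalg.lstsq(X, trait, rcond=None)

print(f"correlation between variants: {ld:.2f}")
print(f"marginal r with trait: snp_a={r_a:.2f}, snp_b={r_b:.2f}")
print(f"joint effect estimates: snp_a={beta[1]:.2f}, snp_b={beta[2]:.2f}")
```

Marginally, the bystander looks almost as predictive as the causal variant. Only the joint fit separates them, and that separation gets harder as the correlation between variants approaches 1.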

Right now, SOTA seems to use huge population data to infer causality within each linkage block of interest in the genome. These types of methods are quite close to Pearl's notion of causal graphs.

> SOTA seems to use huge population data to infer causality within each linkage block of interest in the genome.

This has existed for at least a decade, maybe two.

> There is a concerning gap between prediction and causality.

That gap can be bridged with protein prediction (AlphaFold) and non-coding regulatory prediction (AlphaGenome), among all the other tools that exist.

What is it that does not exist that you "found it disappointing that they ignored"?

  • > This has existed for at least a decade, maybe two.

    Methods have evolved a lot in a decade.

    Note how poor AlphaGenome's CAGE prediction is at 1 bp resolution: Pearson r is just 0.49. CAGE is very often used to pinpoint causal regulatory variants.
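For a rough sense of scale, the r = 0.49 figure quoted above can be squared to get the share of variance a simple linear fit would explain. This is only arithmetic on the quoted number, not a reanalysis of any AlphaGenome output.

```python
# Pearson r as quoted in the comment above.
r = 0.49

# Under a simple linear fit, r squared is the fraction of variance explained.
r_squared = r ** 2

print(f"r = {r:.2f} -> r^2 = {r_squared:.2f}")  # prints "r = 0.49 -> r^2 = 0.24"
```

On that reading, roughly a quarter of the base-pair-level variance in the CAGE signal is captured by the prediction.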