Comment by lukeschlather
5 days ago
The GraphCast paper says "GraphCast is implemented using GNNs" without explaining that the acronym stands for Graph Neural Networks. It contrasts GNNs with the "convolutional neural network (CNN)" and the "graph attention network" (GAN?), but it doesn't really explain the difference between a GAN and a GNN. I think LLMs are GANs. So no, it's not an LLM in a weather model, but it's very similar to an LLM in terms of how it is trained.
> I think LLMs are GANs.
They aren't, but both of them are transformer models.
NB: GAN usually means something else (Generative Adversarial Network).
I used GAN to mean graph attention network in my comment; that's roughly how the GraphCast paper characterizes transformers: https://arxiv.org/pdf/2212.12794
I was looking at this part in particular:
> And while Transformers [48] can also compute arbitrarily long-range computations, they do not scale well with very large inputs (e.g., the 1 million-plus grid points in GraphCast’s global inputs) because of the quadratic memory complexity induced by computing all-to-all interactions. Contemporary extensions of Transformers often sparsify possible interactions to reduce the complexity, which in effect makes them analogous to GNNs (e.g., graph attention networks [49]).
Which kind of makes a soup of the whole thing, and suggests that LLMs/graph attention networks are "extensions of Transformers" and not exactly transformers themselves.
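To make the complexity point concrete, here's a rough numpy sketch (toy sizes and a made-up ring graph, nothing to do with GraphCast's actual code) of dense all-to-all attention versus attention restricted to each node's graph neighbourhood:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

n, d = 1000, 32                     # n "grid points", d-dim features (toy sizes)
Q = np.random.randn(n, d)
K = np.random.randn(n, d)
V = np.random.randn(n, d)

# Dense all-to-all attention: the score matrix is n x n, so memory
# grows quadratically with the number of grid points.
scores = Q @ K.T / np.sqrt(d)       # shape (n, n)
dense_out = softmax(scores) @ V

# Graph-restricted attention: each node only attends over its
# neighbours in a fixed graph, so cost scales with the number of
# edges rather than n**2. Toy ring graph here.
neighbors = {i: [(i - 1) % n, i, (i + 1) % n] for i in range(n)}
graph_out = np.empty_like(V)
for i, nbrs in neighbors.items():
    s = Q[i] @ K[nbrs].T / np.sqrt(d)     # scores only over the 3 neighbours
    graph_out[i] = softmax(s) @ V[nbrs]
```

The dense version has to materialise an n x n score matrix, which is exactly what doesn't scale to the million-plus grid points the paper mentions; the graph version only ever looks at each node's neighbours.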
Oh yeah, GNN (graph neural network) is the common term; shortening "graph attention network" to GAN is pretty confusing because a GAN is a totally different architecture (the usual abbreviation for a graph attention network is GAT).
(Well, not necessarily an architecture. A training method?)
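If it helps: a GAN is really defined by the adversarial training setup (a generator and a discriminator playing against each other), not by any particular network architecture. A minimal PyTorch sketch with toy, throwaway models, just to illustrate the training loop:

```python
import torch
import torch.nn as nn

# Toy, throwaway models: the point is that "GAN" describes the
# adversarial training setup, not the shape of the networks.
G = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 2))  # noise -> fake sample
D = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 1))   # sample -> real/fake logit

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

real = torch.randn(32, 2) * 0.5 + 3.0          # stand-in for real data

# Discriminator step: push D to score real data as 1 and fakes as 0.
fake = G(torch.randn(32, 16)).detach()
d_loss = bce(D(real), torch.ones(32, 1)) + bce(D(fake), torch.zeros(32, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator step: push G to produce samples that D scores as real.
fake = G(torch.randn(32, 16))
g_loss = bce(D(fake), torch.ones(32, 1))
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```

Swap the Linear stacks for CNNs or transformers and it's still a GAN, which is why "training method" is probably the better word.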