Comment by E_Bfx

3 days ago

> Transformer² represents a significant milestone in the evolution of AI systems.

Coming from a math background, it always amazes me to see how people in AI/ML brag about their papers. If someone wrote:

> My paper represents a significant milestone in the evolution of algebraic geometry/ergodic theory/combinatorics

they would be the laughing stock of the math community.

They aren't just researchers, there is a company that took on $200M in a Series A...

https://sakana.ai/series-a/

  • Why is this relevant when presenting scientific research? Or is the point of your comment to say, they are incentivized to "brand" their research in a way which is attractive to a VC audience?

    • It's offered as one possible explanation for the tone or style of the language that GP commented on. I don't think their observation applies to ML research at large; this group seems to be more eccentric in their writing (see their history of submissions on HN and their blog more generally).

    • > Why is this relevant when presenting scientific research?

      I’m guessing that the difference lies in the potential value extraction possibilities from the idea.

      If you compare the transformers paper to an algorithm or a piece of geometry that is not used by anyone, I think the differences are obvious from this perspective.

      However, if that paper on geometry led to something like a new way of doing strained silicon for integrated circuit design that made manufacturing 10 times cheaper and the circuit 10 times faster, then it would be more important than the transformers one.

    • > Or is the point of your comment to say, they are incentivized to "brand" their research in a way which is attractive to a VC audience?

      Yes

  • Anyone can be a researcher/scientist if they pass peer review at a reputable journal or conference. That's just how it is.

    • The bar seems to be much lower than getting a peer-reviewed paper published at a reputable outlet.

      This particular paper is not peer reviewed or published beyond a preprint on arXiv.

In ML, results are often a score (accuracy or whatever), which makes it more gamified.

It's common to have competitions where the one with the highest score on the benchmark "wins". Even if there is no formal competition, it's very important to be the SOTA model.

Results are more applicable to the real world, and subjectively more "cool" (I don't think there's a Two Minute Papers equivalent for math?), which increases ego.

And often authors are trying to convince others to use their findings. So it's partly a marketing brochure.

  • - There is also (but on a smaller scale) a gamification of math with bounties (https://mathoverflow.net/questions/66084/open-problems-with-...), but when a result is proved you cannot prove it "better than the first time". So it is more of a "winner takes it all" situation.

    - I am not sure, but the "2-minute papers" equivalent would be poster sessions, a must-do for every Ph.D. student.

    - For the marketing side, there are some trends in math, and researchers subtly try to brand their results so they become active research fields. But since it cannot be measured with GitHub stars or Hugging Face downloads, it is more discreet.

Especially when the results are so modest! "Significant" doesn't seem like unfalsifiable hype here, it's just wrong.

Yeah the naming implies a significant breakthrough, but this is just an incremental stepping stone that will be forgotten in time.