Comment by dcanelhas

14 hours ago

I'm half expecting to see "AI model" appearing as stand-in for "linear regression" at this point in the cycle.

> I'm half expecting to see "AI model" appearing as stand-in for "linear regression" at this point in the cycle.

Already the case with consulting companies; I've seen it myself.

  • Some career do-nothing-but-make-noise in my organization hired a firm to 'Do AI' on some shitty data, and the outcome was basically linear regression. It turns out that you can impress executives with linear regression if you deliver it enthusiastically enough.

    • Not everyone knows everything, so knowledge is the new oil.

      I do know about linear regression; I even had quite a bit of it at university.

      But I still wouldn't be able to just apply it to some data without a good couple of days to weeks of figuring things out and deciding which tools to use so I don't implement it from scratch.

I'm half expecting to see "AI model" appearing as stand-in for "if > 0" at this point in the cycle.

  • This is essentially what any ReLU-based neural network approximately looks like (smoother variants have replaced the original ramp function). AI models, even LLMs, essentially reduce to a bunch of code like

        let v0 = 0
        let v1 = 0.40978399*(0.616*u + 0.291*v)
        let v2 = if 0 > v1 then 0 else v1
    
        let v3 = 0
        let v4 = 0.377928*(0.261*u + 0.468*v)
        let v5 = if 0 > v4 then 0 else v4...

    • That's a bit far. ReLU does check x > 0, but that's just the one non-linearity in the linear/non-linear sandwich behind the universal function approximation theorem. It's more complex than just x > 0.

      3 replies →
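The point in the snippet above can be made runnable; here's a tiny two-unit ReLU network in Python, with made-up weights purely for illustration (not any real model's parameters):

```python
# A minimal ReLU network: each hidden unit is a linear combination of the
# inputs followed by max(0, x). The weights below are arbitrary; a real
# network has millions of these, learned rather than hand-picked.

def relu(x):
    # the "if 0 > x then 0 else x" from the snippet above
    return x if x > 0 else 0.0

def tiny_mlp(u, v):
    h1 = relu(0.616 * u + 0.291 * v)   # hidden unit 1
    h2 = relu(0.261 * u + 0.468 * v)   # hidden unit 2
    # output: one more linear combination of the hidden activations
    return 0.40978399 * h1 + 0.377928 * h2

print(tiny_mlp(1.0, 1.0))    # positive inputs pass through the ReLUs
print(tiny_mlp(-1.0, -1.0))  # both units clamp to zero, so the output is 0.0
```

Stacking more of these layers is all that separates this toy from a full network: the per-unit computation never gets more exotic than multiply, add, and compare against zero.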

I'm sure I've seen basic hill climbing (and other optimisation algorithms) described as AI, and then used as evidence of AI solving real-world science/engineering problems.

  • Historically this was very much in the field of AI, which is such a massive field that saying something uses AI is about as useful as saying it uses mathematics. Since the term was first coined it's been constantly misused to refer to much more specific things.

    From around when the term was first coined: "artificial intelligence research is concerned with constructing machines (usually programs for general-purpose computers) which exhibit behavior such that, if it were observed in human activity, we would deign to label the behavior 'intelligent.'" [1]

    [1]: https://doi.org/10.1109/TIT.1963.1057864

    • That definition moves the goalposts almost by definition: people only stopped thinking that chess demonstrated intelligence when computers started doing it.

      1 reply →

  • I am somewhat cynically waiting for the AI community to rediscover the last half a century of linear algebra and optimisation techniques.

    At some point someone will realise that backpropagation and adjoint solves are the same thing.

    • There are plenty of smart people in the "AI community" already who know it. Smugly commenting does not replace actual work. If you have real insight and can make something perform better, I guarantee you that many people will listen (I don't mean twitter influencers but the actual field). If you don't know any serious researcher in AI, I have my doubts that you have any insight to offer.
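For what it's worth, the backprop-equals-adjoint point can be shown on a toy example: a hand-written reverse-mode pass, where the backward sweep plays the role of the adjoint solve. A sketch in plain Python (the function f is made up for illustration):

```python
import math

# Differentiate f(x) = sin(x)**2 + 3*x by reverse mode: the forward pass
# stores intermediates, the backward ("adjoint") pass propagates
# sensitivities in reverse through the same computation graph.

def f_and_grad(x):
    # forward pass
    a = math.sin(x)      # a = sin(x)
    b = a * a            # b = a^2
    y = b + 3.0 * x      # y = b + 3x
    # backward (adjoint) pass: seed dy/dy = 1, walk the graph in reverse
    y_bar = 1.0
    b_bar = y_bar                 # dy/db = 1
    x_bar = 3.0 * y_bar           # direct dy/dx term
    a_bar = 2.0 * a * b_bar       # db/da = 2a
    x_bar += math.cos(x) * a_bar  # da/dx = cos(x)
    return y, x_bar

# sanity check against a central finite difference
x, h = 0.7, 1e-6
_, g = f_and_grad(x)
fd = (f_and_grad(x + h)[0] - f_and_grad(x - h)[0]) / (2 * h)
print(g, fd)  # the two should agree to several decimal places
```

The analytic derivative here is sin(2x) + 3, and the backward pass recovers it exactly; an adjoint PDE solve follows the same recipe on a much bigger graph.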

There is a HIGGS dataset [1]. As the name suggests, it is designed for applying machine learning to recognize the Higgs boson.

[1] https://archive.ics.uci.edu/ml/datasets/HIGGS

In my experiments, linear regression with extended attributes (adding squared values) is very much competitive in accuracy with the reported MLP accuracy.
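To illustrate the extended-attributes idea (on synthetic data, not HIGGS itself): a linear least-squares fit on raw features can't separate points inside a circle from points outside, but adding squared features makes the boundary linear. A sketch with numpy:

```python
import numpy as np

# Synthetic task: label points by whether they fall inside a circle.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(2000, 2))
y = (X[:, 0]**2 + X[:, 1]**2 < 0.5).astype(float)  # 1 inside the circle

def fit_accuracy(features):
    # ordinary least squares used as a cheap linear classifier, threshold 0.5
    A = np.column_stack([np.ones(len(features)), features])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    pred = (A @ w > 0.5).astype(float)
    return (pred == y).mean()

acc_raw = fit_accuracy(X)                    # raw x1, x2 only
acc_sq = fit_accuracy(np.hstack([X, X**2]))  # plus x1^2, x2^2
print(acc_raw, acc_sq)
```

With raw features the model can do little better than predict the majority class; with the squared features the true boundary is exactly linear in the extended space, so accuracy jumps.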

  • The LHC has moved on a bit since then. Here's an open dataset that one collaboration used to train a transformer:

    https://opendata-qa.cern.ch/record/93940

    If you can beat it with linear regression, we'd be happy to know.

    • Thanks.

      The paper [1] referenced in your link follows the legacy of the HIGGS dataset paper, and does not report quantities like accuracy and/or perplexity. The HIGGS dataset paper provided area under ROC, from which one had to approximate accuracy. I used the accuracy from the ADMM paper [2] to compare my results against. As I checked later, the area under ROC in [1] mostly agrees with the SGD training results on HIGGS in [2].

        [1] https://arxiv.org/pdf/2505.19689
        [2] https://proceedings.mlr.press/v48/taylor16.pdf
      

      I think the perplexity measure is appropriate in [1] because we need to discern between three outcomes. That calls for softmax, and for perplexity as the standard measure.

      So, my questions are: 1) what perplexity should I target when dealing with the "mc-flavtag-ttbar-small" dataset? And 2) what is the train/validate/test split ratio there?
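For reference, the perplexity in question is just the exponential of the mean cross-entropy of the model's softmax probabilities on the true labels. A toy computation with made-up probabilities (uniform guessing over three classes gives perplexity exactly 3; a perfect model gives 1):

```python
import math

def perplexity(probs, labels):
    # probs: per-example probability triples; labels: true class indices
    nll = [-math.log(p[y]) for p, y in zip(probs, labels)]
    return math.exp(sum(nll) / len(nll))

uniform = [[1/3, 1/3, 1/3]] * 4
print(perplexity(uniform, [0, 1, 2, 0]))   # 3.0: no better than guessing

confident = [[0.9, 0.05, 0.05]] * 4
print(perplexity(confident, [0, 0, 0, 0]))  # 1/0.9, i.e. about 1.11
```

So a useful three-class model on that dataset should land somewhere between 1 and 3.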

And why not? When linear regression works, it works so well it's basically magic: better than intelligence, artificial or otherwise.