Comment by wodenokoto

12 hours ago

Language models are time series models.

It’s great when you get this insight as a student of NLP, because suddenly your toolset grows quite a bit.

Could you elaborate? That sentence made my brow wrinkle with confusion. I have thought to myself before that all business data problems eventually become time series problems. I'd like to understand your point of view on how LLMs fit into that.

  • Time series just means that the order of features matters: feature 1 occurs before feature 2.

    E.g., when fitting a model to house prices, you don’t care whether feature 1 is square meters and feature 2 is time on market, or vice versa. In a time series, your model changes if you reverse the order of the features.

    With text, the meaning of word 2 depends on the meaning of word 1. With stock prices, you expect the price at time 2 to depend on the price at time 1.

    Text can be modeled as a time series.

    A language model tells you the next character/token/word depending on the previous input.

    Language models are time series.

    It’s not an audacious claim.

    Any student of NLP should have met a paper modeling text as a time series before writing their thesis. How could you not meet that?
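
To make the "order matters" point concrete, here is a minimal sketch of a bigram next-token model in Python. It is a toy illustration, not how any real LLM works; the names `fit_bigram` and `predict_next` and the example sentence are assumptions made up for this sketch. Fitting on the reversed sequence gives a different prediction, which would not happen with an order-free feature set.

```python
from collections import Counter, defaultdict


def fit_bigram(tokens):
    """Count how often each token follows each other token (hypothetical helper)."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, prev):
    """Return the most frequent successor of `prev`, or None if `prev` was never seen."""
    followers = counts.get(prev)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Toy corpus (an assumption for illustration only).
text = "the cat sat on the mat".split()

forward = fit_bigram(text)         # model fitted on the original order
backward = fit_bigram(text[::-1])  # model fitted on the reversed order

print(predict_next(forward, "cat"))   # sat
print(predict_next(backward, "cat"))  # the
```

The same data with the order reversed yields a different model, which is the property the comment above attributes to time series.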

    • As a data structure it is an ordered list of integers, but no LLM needs to access it in a database; that's way too slow for anything serious.

      RAG and vector Approximate Nearest Neighbour (ANN) search are the go-to use case.

  • [1] https://towardsdatascience.com/llm-powered-time-series-analy...

    [2] https://arxiv.org/abs/2506.02389

    [3] https://arxiv.org/html/2402.10835v3

    Some links from the top of Google search.

    Also take a look here; it's an important law: https://en.wikipedia.org/wiki/Benford%27s_law

    It is possible for LLMs to learn Benford's law implicitly. So they will be non-null predictors of time series data, because time series data is also Benford-law-distributed [4].

    [4] https://ui.adsabs.harvard.edu/abs/2017EGUGA..19.2950T/abstra...
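
For readers who have not met Benford's law: it says the leading digit d of many real-world quantities occurs with probability log10(1 + 1/d). The sketch below only illustrates that formula and checks it against a synthetic series (powers of 2, chosen here as an assumption because geometric growth is known to follow the law). It says nothing about what an LLM actually learns; the function names are hypothetical.

```python
import math
from collections import Counter


def benford_expected():
    """Expected leading-digit probabilities under Benford's law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit_freq(values):
    """Observed leading-digit frequencies for positive integers."""
    digits = [int(str(v)[0]) for v in values]
    counts = Counter(digits)
    return {d: counts[d] / len(digits) for d in range(1, 10)}


# Synthetic series (assumption for illustration): the first 1000 powers of 2.
series = [2 ** n for n in range(1, 1001)]

expected = benford_expected()
observed = leading_digit_freq(series)

for d in range(1, 10):
    print(d, round(expected[d], 3), round(observed[d], 3))
```

Running it prints the expected and observed frequency for each leading digit side by side, so you can compare them directly.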