Comment by wodenokoto
9 hours ago
Time series just means that the order of features matters: feature 1 occurs before feature 2.
E.g., when fitting a model to house prices, you don’t care whether feature 1 is square meters and feature 2 is time on market, or vice versa, but in a time series, your model changes if you reverse the order of the features.
With text, the meaning of word 2 depends on the meaning of word 1. With stock prices, you expect the price at time 2 to depend on the price at time 1.
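To make that concrete, here is a toy sketch (numpy, made-up data): swapping the columns of a tabular regression gives the same fit back, but shuffling the time order of an autoregressive series wipes out the structure a model could learn.

    import numpy as np

    rng = np.random.default_rng(0)

    # Tabular case: column order doesn't matter.
    # Toy features: [square meters, days on market]
    X = rng.normal(size=(200, 2))
    y = X @ np.array([3.0, -1.5]) + rng.normal(scale=0.1, size=200)
    coef_a, *_ = np.linalg.lstsq(X, y, rcond=None)
    coef_b, *_ = np.linalg.lstsq(X[:, ::-1], y, rcond=None)
    print(coef_a, coef_b[::-1])  # same model, columns just swapped

    # Time-series case: order *is* the signal.
    ts = np.zeros(500)
    for t in range(1, 500):
        ts[t] = 0.9 * ts[t - 1] + rng.normal(scale=0.1)

    def ar1_coef(series):
        # regress x[t] on x[t-1]
        return np.linalg.lstsq(series[:-1, None], series[1:], rcond=None)[0][0]

    print(ar1_coef(ts))                   # ~0.9
    print(ar1_coef(rng.permutation(ts)))  # ~0: shuffling destroys the dependence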
Text can be modeled as a time series.
A language model tells you the next character/token/word depending on the previous input.
Language models are time series.
It’s not an audacious claim.
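As a toy illustration of the autoregressive idea (not how a real LLM works): a character-level bigram model predicts the next character purely from the previous one.

    from collections import Counter, defaultdict

    # Toy corpus, made up for illustration.
    corpus = "the cat sat on the mat. the dog sat on the rug."

    # Count how often each character follows each other character.
    counts = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        counts[prev][nxt] += 1

    def predict_next(prev_char):
        # most likely next character given only the previous one
        return counts[prev_char].most_common(1)[0][0]

    print(predict_next("t"))  # 'h' in this corpus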
Any student of NLP should have come across a paper modeling text as a time series before writing their thesis. How could you not?
As a data structure, tokenized text is an ordered list of integers, but no LLM needs to access it from a database; that would be way too slow for anything serious.
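By an ordered list of integers I mean something like this toy word-level tokenizer (real tokenizers like BPE work on subwords, and this vocabulary is made up):

    vocab = {"<unk>": 0, "the": 1, "cat": 2, "sat": 3, "on": 4, "mat": 5}

    def encode(text):
        # map each word to its integer id, unknown words to <unk>
        return [vocab.get(w, vocab["<unk>"]) for w in text.lower().split()]

    print(encode("The cat sat on the mat"))  # [1, 2, 3, 4, 1, 5]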
RAG with vector Approximate Nearest Neighbour (ANN) search is the go-to use case.
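The retrieval step looks roughly like this (a minimal sketch with random numpy vectors standing in for real embeddings; a production setup would use an ANN index such as HNSW instead of brute-force cosine similarity):

    import numpy as np

    rng = np.random.default_rng(0)
    dim = 384
    # Random stand-ins for document and query embeddings.
    docs = rng.normal(size=(10_000, dim))
    docs /= np.linalg.norm(docs, axis=1, keepdims=True)
    query = rng.normal(size=dim)
    query /= np.linalg.norm(query)

    scores = docs @ query            # cosine similarity (unit vectors)
    top_k = np.argsort(-scores)[:5]  # indices of the 5 nearest documents
    print(top_k, scores[top_k])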