Comment by wodenokoto
9 hours ago
Time series just means that the order of features matters: feature 1 occurs before feature 2.
E.g., when fitting a model to house prices, you don’t care whether feature 1 is square meters and feature 2 is time on market, or vice versa, but in a time series, your model changes if you reverse the order of the features.
With text, the meaning of word 2 depends on the meaning of word 1. With stock prices, you expect the price at time 2 to depend on the price at time 1.
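To make that concrete, here is a toy sketch (numpy, made-up data): swapping the columns of a tabular regression gives the same fit back, but shuffling the time order of an autoregressive series wipes out the structure a model could learn.

    import numpy as np

    rng = np.random.default_rng(0)

    # Tabular case: column order doesn't matter.
    # Toy features: [square meters, days on market]
    X = rng.normal(size=(200, 2))
    y = X @ np.array([3.0, -1.5]) + rng.normal(scale=0.1, size=200)
    coef_a, *_ = np.linalg.lstsq(X, y, rcond=None)
    coef_b, *_ = np.linalg.lstsq(X[:, ::-1], y, rcond=None)
    print(coef_a, coef_b[::-1])  # same model, columns just swapped

    # Time-series case: order *is* the signal.
    ts = np.zeros(500)
    for t in range(1, 500):
        ts[t] = 0.9 * ts[t - 1] + rng.normal(scale=0.1)

    def ar1_coef(series):
        # regress x[t] on x[t-1]
        return np.linalg.lstsq(series[:-1, None], series[1:], rcond=None)[0][0]

    print(ar1_coef(ts))                   # ~0.9
    print(ar1_coef(rng.permutation(ts)))  # ~0: shuffling destroys the dependence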
Text can be modeled as a time series.
A language model tells you the next character/token/word depending on the previous input.
Language models are time series.
It’s not an audacious claim.
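As a toy illustration of the autoregressive idea (not how a real LLM works): a character-level bigram model predicts the next character purely from the previous one.

    from collections import Counter, defaultdict

    # Toy corpus, made up for illustration.
    corpus = "the cat sat on the mat. the dog sat on the rug."

    # Count how often each character follows each other character.
    counts = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        counts[prev][nxt] += 1

    def predict_next(prev_char):
        # most likely next character given only the previous one
        return counts[prev_char].most_common(1)[0][0]

    print(predict_next("t"))  # 'h' in this corpus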
Any student of NLP should have come across a paper modeling text as a time series before writing their thesis. How could you not?
As a data structure, tokenized text is an ordered list of integers, but no LLM needs to access it from a database; that would be way too slow for anything serious.
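By an ordered list of integers I mean something like this toy word-level tokenizer (real tokenizers like BPE work on subwords, and this vocabulary is made up):

    vocab = {"<unk>": 0, "the": 1, "cat": 2, "sat": 3, "on": 4, "mat": 5}

    def encode(text):
        # map each word to its integer id, unknown words to <unk>
        return [vocab.get(w, vocab["<unk>"]) for w in text.lower().split()]

    print(encode("The cat sat on the mat"))  # [1, 2, 3, 4, 1, 5]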
RAG with vector Approximate Nearest Neighbour (ANN) search is the go-to use case.
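The retrieval step looks roughly like this (a minimal sketch with random numpy vectors standing in for real embeddings; a production setup would use an ANN index such as HNSW instead of brute-force cosine similarity):

    import numpy as np

    rng = np.random.default_rng(0)
    dim = 384
    # Random stand-ins for document and query embeddings.
    docs = rng.normal(size=(10_000, dim))
    docs /= np.linalg.norm(docs, axis=1, keepdims=True)
    query = rng.normal(size=dim)
    query /= np.linalg.norm(query)

    scores = docs @ query            # cosine similarity (unit vectors)
    top_k = np.argsort(-scores)[:5]  # indices of the 5 nearest documents
    print(top_k, scores[top_k])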