Comment by the_arun

5 days ago

I wish there were a way to send compressed context to LLMs instead of plain text. This would reduce token counts, improving performance and cutting operational costs.

> This would reduce token counts, improving performance and cutting operational costs.

How? The models aren't trained on compressed text tokens, nor could they be, if I understand it correctly. The model would have to decompress the input before running the raw text through.
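
A minimal sketch of the mismatch, assuming OpenAI's tiktoken tokenizer (the sample text is arbitrary):

```python
# Why byte-level compression doesn't help the model's input:
# the model consumes token IDs from a fixed vocabulary, not raw bytes.
import zlib
import tiktoken

text = "The quick brown fox jumps over the lazy dog. " * 50

enc = tiktoken.get_encoding("cl100k_base")
token_ids = enc.encode(text)                      # what the model actually sees
compressed = zlib.compress(text.encode("utf-8"))

print(len(text), "characters")
print(len(token_ids), "tokens")       # tokenization already compresses roughly 4x vs chars
print(len(compressed), "gzip bytes")  # far smaller, but these bytes are not
                                      # token IDs, so they would have to be
                                      # decompressed back to text first.
```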

  • That is what I am looking for: a) LLMs trained on compressed text tokens, and b) prompts sent in compressed form. I don't know how, but that is what I was hoping for.

    • The whole point of embeddings and tokens is that they are a compressed version of text, a lower-dimensional representation. How low you can go depends on the performance you need: fewer dimensions = more lossy (usually). https://huggingface.co/spaces/mteb/leaderboard
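
      For a rough illustration, here is a minimal sketch assuming the sentence-transformers library, with one small model as an example (the leaderboard above lists many others):

      ```python
      # Text -> fixed-size vector "compression" with sentence-transformers.
      # The model name is just one illustrative choice.
      from sentence_transformers import SentenceTransformer

      model = SentenceTransformer("all-MiniLM-L6-v2")
      vec = model.encode("An arbitrarily long sentence gets squeezed into one vector.")
      print(vec.shape)  # (384,) -- the same 384 floats whether the input
                        # is one word or a whole paragraph (lossy!)
      ```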

      You can train your own with very aggressive compression; you could even go down to just 2 floats per token. It will train, but it will be terrible, because such a space can essentially only capture distance.
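
      A minimal sketch of that extreme, assuming PyTorch (the numbers are illustrative):

      ```python
      # "2 floats per token": an embedding table with dimension 2.
      import torch
      import torch.nn as nn

      vocab_size = 50_000
      embed = nn.Embedding(vocab_size, 2)   # each token -> just 2 floats

      token_ids = torch.tensor([101, 2003, 42])
      print(embed(token_ids).shape)         # torch.Size([3, 2])
      # Gradients flow fine, so it trains -- but a 2-d space can barely
      # encode anything beyond relative distance between tokens.
      ```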

      Funnily enough, prompting a good LLM to summarize the context is probably the best way of actually "compressing" context.
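
      A minimal sketch of that approach, assuming the OpenAI Python client (any chat-capable model would do; the model name, prompt, and file name are placeholders):

      ```python
      # "Compress" a long context by asking a model to summarize it first,
      # then use the short summary as the context for the real request.
      from openai import OpenAI

      client = OpenAI()
      long_context = open("context.txt").read()  # hypothetical input file

      resp = client.chat.completions.create(
          model="gpt-4o-mini",
          messages=[{
              "role": "user",
              "content": "Summarize the following so it can stand in as "
                         "context for later questions:\n\n" + long_context,
          }],
      )
      summary = resp.choices[0].message.content  # far fewer tokens than the original
      ```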