
Comment by gbnwl

7 days ago

They're literally trained on natural language to output natural language. You would need to create the hyper-compressed language first, convert all of your training data to it, and then train the models on that. But token efficiency per word already varies between languages, with Chinese being something like 30-40% more efficient than English, last I heard.
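
For what it's worth, the per-language difference is easy to check yourself. A rough sketch below, assuming the tiktoken library and the cl100k_base encoding; the sample sentences are just illustrative, not from any benchmark:

```python
# Rough comparison of token efficiency across languages using a BPE tokenizer.
# Assumes `pip install tiktoken`; the parallel sentences are illustrative only.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

samples = {
    "English": "The quick brown fox jumps over the lazy dog.",
    "Chinese": "敏捷的棕色狐狸跳过了懒狗。",
}

for lang, text in samples.items():
    tokens = enc.encode(text)
    # Fewer tokens for the same meaning = cheaper context / generation.
    print(f"{lang}: {len(text)} characters -> {len(tokens)} tokens")
```

Exact ratios will vary a lot with the tokenizer and the text, so take any single number with a grain of salt.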

Doesn't this mean the Chinese models have a significant advantage?

This isn't my domain, but say you had a massive budget, wouldn't a special LLM "thinking" language make sense?