Comment by jagged-chisel

3 months ago

Alright, what's the thing being trained to become the model? If a model means "already trained," what is it before being trained?

Is the model not the network that awaits training data? Or is the model just the weights applied to some standardized network?

A "language model" is a model of a certain language. Thus, trained. What you are thinking of is a "model of how to represent languages in general". That would be valid in a sense, but nobody here uses the word that way. Why would one download a structure with many gigabytes of zeroes, and argue about the merits of one set of zeroes over another?

The network before training is not very interesting, so few people talk about it. You can call it a "blank network", an "untrained network", or any number of other things. Nobody refers to it as "a model".

Yes, if you want to, you can refer to the untrained network as "a model", or even as "a sandwich". But you will get confused answers, as you are getting now.
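
To make the distinction concrete, here is a toy sketch (my own illustration, not anything from a real library): the "network" is just the fixed structure y = w*x + b, and the thing people would call "the model" is the pair of weights you get after training. The names `predict`, `train`, `untrained`, and `model` are all hypothetical, and the data is made up from the line y = 2x + 1.

```python
# Toy illustration: one fixed structure (y = w*x + b), two sets of weights.
# All names here are hypothetical and chosen only for this example.

def predict(weights, x):
    """Apply the fixed structure y = w*x + b with the given weights."""
    w, b = weights
    return w * x + b

def train(weights, data, lr=0.05, epochs=2000):
    """Fit w and b by plain gradient descent on mean squared error."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)

# The "blank network": same structure, weights are all zeroes.
untrained = (0.0, 0.0)

# Made-up training data sampled from y = 2x + 1.
data = [(x, 2 * x + 1) for x in range(5)]

# The trained weights are what you would actually ship and call "the model".
model = train(untrained, data)

print(predict(untrained, 3))  # the zeroes predict nothing useful
print(predict(model, 3))      # close to 2*3 + 1
```

Both calls to `predict` run the exact same structure. Only the weights differ, and only the trained ones are worth downloading or arguing about.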