Comment by ruszki
5 days ago
It depends on time. 5 years ago it was quite well defined that it’s the last one, maybe the second one in some context. Especially when distinction was important, it was always the last one. In our case it was. We trained models to have weights. We even stored models and weights separately, because models change slower than weights. You could choose a model and a set of weights, and run them. You could change weights any time.
Then marketing, and huge amount of capital came.
It seems unlikely "model" was ever equivalent in meaning to "architecture". Otherwise there would be just one "CNN model" or just one "transformer model" insofar there is a single architecture involved.
First of all, hyperparameters. Second, organization, or connections. 3rd, cost function. 4th, activation function. 5th type of learning. Etc.
These are not weights. These were parts of models.