Comment by ipython
5 months ago
But why would that information not be included in the wide crawl already encoded in the model weights before the knowledge cutoff? I believe the article mentions frontier models so we are talking about models trained on trillions of tokens here
In addition to cutoff dates, models do not encode every single thing from the training set verbatim. One forum post somewhere about Foo library v13.5.3 being incompatible with Bar 2.3 and resulting in ValueErrors is not going to make it.
Because cutoff can be like few months ago and you still have new versions of libraries being developed every month. API getting deprecated or removed or new API being added. Model need to have access to the latest API or SDK that is available and know e.g. what iOS SDK you have currently available and what MacOS version you have etc. Having access to github issues also help to figure out if there is bug in library.