
Comment by lxgr

2 months ago

With web search being available in all major user-facing LLM products now (and I believe in some APIs as well, sometimes unintentionally), I feel like the exact month of cutoff is becoming less and less relevant, at least in my personal experience.

The models I'm regularly using are usually smart enough to figure out that they should be pulling in new information for a given topic.

It still matters for software packages, particularly Python packages related to programming with AI!

They evolve quickly, with deprecations and updated documentation. Having to correct for this in system prompts is a pain.

It would be great if the models could update some portions of their knowledge more frequently than others.

The Tailwind example in the parent-sibling comment should absolutely be as up to date as possible, whereas the history of the US civil war can probably be updated less frequently.

  • > the history of the US civil war can probably be updated less frequently.

    It's already missed out on two issues of Civil War History: https://muse.jhu.edu/journal/42

    Contrary to the prevailing belief in tech circles, there's a lot in history/social science that we don't know and are still figuring out. It's not IEEE Transactions on Pattern Analysis and Machine Intelligence (four issues since March), but it's not nothing.

    • I started reading the first article in one of those issues only to realize it was just a preview of something very paywalled. Why does Johns Hopkins need money so badly that it has to hold historical knowledge hostage? :(


  • Given that I am still coding against Java 17, C# 7, C++17, and the like on most work projects, with more recent versions still the exception, an older cutoff is quite reasonable.

    Few are on jobs where v-latest is always an option.

  • I've had good success with the Context7 model context protocol tool, which allows code agents, like GitHub Copilot, to look up the latest relevant version of library documentation including code snippets: https://context7.com/

    • We just launched an alternative called Docfork that completes the request in a single API call (Context7 generally uses two), since speed was a big priority for us: https://docfork.com

    • I wonder how necessary that is. I've noticed that while Codex doesn't have any fancy tools like that (it has no internet access), it instead finds the source of whatever library you pulled in. In Rust, for example, it is aware of (or finds out) where the source was pulled down and greps those files to figure out the API on the fly. That seems to work well enough, and it works for any library, private or not, updated a minute ago or not.

  • It matters even with recent cutoffs: these models have no idea when to use a package or not (if it's no longer maintained, etc.).

    You can fix this by first figuring out what packages to use, or by providing your package list, though.

    • > these models have no idea when to use a package or not (if it's no longer maintained, etc)

      They have ideas about what you tell them to have ideas about. In this case, when to use a package (or not) differs a lot by person, organization, or even project, so it makes sense that they wouldn't be heavily biased one way or another.

      Personally, I'd look at the architecture of the package code before I'd look at when the last change was or how often it's updated. Whether the last change was years ago or yesterday usually has little bearing on whether I decide to use it, so I wouldn't want my LLM assistant to weigh it differently.

  • Cursor has a nice "docs" feature for this, which has saved me from battles with constant version-reverting actions from our dear LLM overlords.

  • > whereas the history of the US civil war can probably be updated less frequently.

    Depends on which one you're talking about.

  • How often are base-level libraries/frameworks changing in incompatible ways?

    • In the JavaScript world, very frequently. If latest is 2.8 and I’m coding against 2.1, I don’t want answers using 1.6. This happened enough that I now always specify versions in my prompt.


    • The more popular a library is and the more often it's updated each year, the more it will suffer this fate. You always have to refine prompts with specific versions and specific ways of doing things; each will differ depending on your use case.
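The "always specify versions in my prompt" habit mentioned above can be automated. A minimal sketch (the function name and preamble wording are illustrative, not any particular tool's API) that turns a package.json dependency list into a prompt preamble:

```python
import json

def version_preamble(package_json_text):
    """Turn a package.json dependency list into a prompt preamble that
    pins the library versions the model should answer against."""
    deps = json.loads(package_json_text).get("dependencies", {})
    pins = [f"- {name} {spec}" for name, spec in sorted(deps.items())]
    return "Answer using these exact library versions:\n" + "\n".join(pins)

# Hypothetical project manifest
example = '{"dependencies": {"zod": "^2.1.0", "svelte": "^5.0.0"}}'
print(version_preamble(example))
```

Prepending this to each request is a cheap way to keep a model coding against 2.1 from drifting back to the 1.6 idioms it saw most in training.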
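The grep-the-vendored-source approach described above (how Codex discovers a Rust library's API offline) can be sketched in a few lines. This is a hedged illustration, not Codex's actual implementation; the paths and symbol are stand-ins:

```python
from pathlib import Path
import tempfile

def grep_api(src_root, symbol):
    """Search a dependency's vendored source tree for a symbol, the way
    an offline agent might discover an API surface with no docs access."""
    hits = []
    for path in sorted(Path(src_root).rglob("*.rs")):
        for lineno, line in enumerate(
            path.read_text(errors="ignore").splitlines(), start=1
        ):
            if symbol in line:
                hits.append(f"{path.name}:{lineno}: {line.strip()}")
    return hits

# Demo on a throwaway tree standing in for e.g. ~/.cargo/registry/src
with tempfile.TemporaryDirectory() as root:
    (Path(root) / "lib.rs").write_text(
        "pub fn parse(input: &str) -> Output {}\n"
    )
    print(grep_api(root, "pub fn"))
```

Because it reads whatever source is actually on disk, the technique works for a private crate updated a minute ago just as well as for a popular one.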

Valid. I suppose the most annoying thing related to the cutoffs is the model's knowledge of library APIs, especially when there are breaking changes. Even when they have some knowledge of the most recent version, they tend to default to whatever they have seen the most in training, which is typically older code. I suspect the frontier labs have all been working to mitigate this. I'm just super stoked; I've been waiting for this one to drop.

In my experience it really depends on the situation. For stable APIs that have been around for years, sure, it doesn't really matter that much. But if you try to use a library that had significant changes after the cutoff, the models tend to do things the old way, even if you provide a link to examples with new code.

For recent resources it might matter: unless the training data are curated meticulously, they may be "spoiled" by the output of other LLMs, or even by the previous version of the one being trained. That's generally considered dangerous, because it could produce an unintentional echo chamber or even a somewhat "incestuously degenerated" new model.

> The models I'm regularly using are usually smart enough to figure out that they should be pulling in new information for a given topic.

Fair enough, but information encoded in the model is returned in milliseconds, while information that needs to be scraped takes tens of seconds.

Web search isn't desirable or even an option in a lot of use cases that involve GenAI.

It seems people have turned GenAI into coding assistants only and forget that they can actually be used for other projects too.

  • That's because, between the two approaches "explain this thing to me" and "write code to demonstrate this thing", LLMs are much more useful on the second path. I can ask one to calculate some third derivatives, or I can ask it to write a Mathematica notebook to calculate the same derivatives; the latter is generally correct and extremely useful as is, while the former requires me to scrutinize each line of logic and calculation very carefully.

    It's like https://www.youtube.com/watch?v=zZr54G7ec7A where Prof. Tao uses Claude to generate Lean 4 proofs (which are then verifiable by machine). Great progress, very useful. Meanwhile, the LLM-only approaches still lack utility for the top minds: https://mathstodon.xyz/@tao/113132502735585408
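The derivatives point above generalizes beyond Mathematica: the win is asking for a small, checkable program instead of a chain of hand reasoning. A minimal Python stand-in (the function and step size are illustrative) estimating a third derivative by central finite differences:

```python
import math

def third_derivative(f, x, h=1e-2):
    """Central finite-difference estimate of f'''(x): the kind of small,
    machine-checkable program it's safer to ask an LLM for than a hand
    derivation that must be scrutinized line by line."""
    return (f(x + 2*h) - 2*f(x + h) + 2*f(x - h) - f(x - 2*h)) / (2 * h**3)

# sin'''(x) = -cos(x), so at x = 1.0 we expect roughly -0.5403
print(third_derivative(math.sin, 1.0))
```

The output can be checked against the known closed form, which is exactly the verifiability advantage the comment describes.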

I was thinking that too. Grok can comment on things that have only just broken hours earlier, so cutoff dates don't seem to matter much.

  • Yeah, it seems pretty up to date with Elon's latest White Genocide and Holocaust denial conspiracy theories, but it's so heavy-handed about bringing them up out of the blue and pushing them into the middle of discussions about Zod 4, Svelte 5, and Tailwind 4 that I think those topics are coming from its prompts, not its training.

It's relevant from an engineering perspective. They have a way to develop a new model in months now.

Web search is an immediate, limited operation; training is a petabyte-scale, long-term operation.