Comment by bredren

2 months ago

It still matters for software packages, particularly Python packages for programming with AI!

They evolve quickly, with deprecations and updated documentation. Having to correct for this in system prompts is a pain.

It would be great if models could refresh some portions of their knowledge more frequently than others.

The Tailwind example in the parent-sibling comment should absolutely be as up to date as possible, whereas the history of the US Civil War can probably be updated less frequently.

> the history of the US Civil War can probably be updated less frequently.

It's already missed out on two issues of Civil War History: https://muse.jhu.edu/journal/42

Contrary to the prevailing belief in tech circles, there's a lot in history/social science that we don't know and are still figuring out. It's not IEEE Transactions on Pattern Analysis and Machine Intelligence (four issues since March), but it's not nothing.

  • I started reading the first article in one of those issues only to realize it was just a preview of something very paywalled. Why does Johns Hopkins need money so badly that it has to hold historical knowledge hostage? :(

    • Johns Hopkins is not the publisher of this journal and does not hold copyright for this journal. Why are you blaming them?

      The website linked above is just a way to read journals online, hosted by Johns Hopkins. As it states, "Most of our users get access to content on Project MUSE through their library or institution. For individuals who are not affiliated with a library or institution, we provide options for you to purchase Project MUSE content and subscriptions for a selection of Project MUSE journals."

    • The journal appears to be published by an office with 7 FTEs, which is presumably funded by the paywall and by sales of their journals and books. Fully-loaded costs for 7 folks are on the order of $750k/year. https://www.kentstateuniversitypress.com/

      Someone has to foot that bill. Open-access publishing implies the authors are paying the cost of publication, and its popularity in STEM reflects an availability of money (especially grant funds) to cover those author page charges that is not mirrored in the social sciences and humanities.

      Unrelatedly, given recent changes in federal funding, Johns Hopkins is probably feeling like it could use a little extra cash (losing $800 million in USAID funding, overhead rates potentially dropping to existential-crisis levels, etc.).


Given that I am still coding against Java 17, C# 7, C++17, and the like on most work projects, and more recent versions are still the exception, an older cutoff is quite reasonable.

Few of us are in jobs where v-latest is always an option.

  • It’s not about the language. I get bit when they recommend old libraries or hallucinate non-existent ones.

    • Hallucination is indeed a problem.

      As for libraries: using more modern ones usually requires more recent language versions, too.

I've had good success with the Context7 Model Context Protocol (MCP) tool, which lets code agents like GitHub Copilot look up the latest relevant version of library documentation, including code snippets: https://context7.com/
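
For anyone wanting to try it: hooking Context7 up generally means registering it as an MCP server in your agent's config. A minimal sketch, assuming the npx-distributed @upstash/context7-mcp package; the exact file location and top-level key ("servers" vs. "mcpServers") vary by client, so check yours:

    {
      "servers": {
        "context7": {
          "command": "npx",
          "args": ["-y", "@upstash/context7-mcp"]
        }
      }
    }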

  • We just launched an alternative called Docfork that wraps up a request in a single API call (Context7 generally uses 2), since speed was a big priority for us: https://docfork.com

  • I wonder how necessary that is. Codex doesn't have any fancy tools like that (it has no internet access), yet it finds the source of whatever library you pulled in. In Rust, for example, it knows (or figures out) where the source was vendored and greps those files to work out the API on the fly, roughly as sketched below. That seems to work well enough, and it works for any library, private or not, updated a minute ago or not.
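
    A rough sketch of that grep-the-vendored-sources idea in Python, assuming a default cargo home; the crate name and search string are illustrative, and this is not what Codex literally runs:

      # Search the locally vendored sources of a crate for API signatures.
      from pathlib import Path

      registry = Path.home() / ".cargo" / "registry" / "src"
      crate = "serde_json"   # hypothetical crate of interest
      needle = "pub fn"      # coarse filter: public function definitions

      for rs_file in registry.rglob("*.rs"):
          if crate not in rs_file.as_posix():
              continue
          lines = rs_file.read_text(errors="ignore").splitlines()
          for lineno, line in enumerate(lines, 1):
              if needle in line:
                  print(f"{rs_file}:{lineno}: {line.strip()}")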

It matters even with recent cutoffs: these models have no idea when to use a package or not (whether it's still maintained, etc.).

You can fix this by first figuring out what packages to use, or by providing your package list, though.
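
As a concrete version of "providing your package list": a minimal sketch, assuming a plain Python environment, that turns the versions you actually have installed into a prompt preamble (the instruction wording is just illustrative):

    # Build a system-prompt preamble from the environment's real package pins.
    from importlib.metadata import distributions

    pins = sorted(
        f"{dist.metadata['Name']}=={dist.version}"
        for dist in distributions()
        if dist.metadata["Name"]  # skip distributions with broken metadata
    )
    preamble = (
        "Use ONLY APIs available in these installed package versions:\n"
        + "\n".join(pins)
    )
    print(preamble)  # paste or inject this into the system prompt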

  • > these models have no idea when to use a package or not (whether it's still maintained, etc.)

    They have ideas about what you tell them to have ideas about. In this case, when to use a package or not differs a lot by person, organization, or even project, so it makes sense that they wouldn't be heavily biased one way or another.

    Personally, I'd look at the architecture of the package code before I'd look at when it was last changed or how often it gets updated. Whether the last change was years ago or yesterday usually has little bearing on whether I'll use it, so I wouldn't want my LLM assistant to weigh recency differently.

Cursor has a nice "docs" feature for this, which has saved me from battles with constant version-reverting actions from our dear LLM overlords.

> whereas the history of the US Civil War can probably be updated less frequently.

Depends on which one you're talking about.

How often are base-level libraries/frameworks changing in incompatible ways?

  • The more popular a library is and the more often it's updated each year, the more it will suffer this fate. You always have to refine prompts with specific versions and specific ways of doing things; each will differ depending on your use case.

Does repo/package-specific MCP solve this at all?

  • Kind of, but not in the same way: the MCP option increases the discussion context, while the training option does not. Armchair expert here, so confirmation would be appreciated.

    • Same; I'm curious what it would look like to incrementally or micro-train against frequently changing data sources (repos, Wikipedia/news/current events, etc.), if that's even possible.