Comment by seizethecheese
6 days ago
> Imagine you could interview thousands of educated individuals from 1913—readers of newspapers, novels, and political treatises—about their views on peace, progress, gender roles, or empire. Not just survey them with preset questions, but engage in open-ended dialogue, probe their assumptions, and explore the boundaries of thought in that moment.
Hell yeah, sold, let’s go…
> We're developing a responsible access framework that makes models available to researchers for scholarly purposes while preventing misuse.
Oh. By “imagine you could interview…” they didn’t mean me.
I understand your frustration. I trust you also understand the models have some dark corners that someone could use to misrepresent the goals of our project. If you have ideas on how we could make the models more broadly accessible while avoiding that risk, please do reach out @ history-llms@econ.uzh.ch
Ok...
So as a black person, should I demand that all books written before the Civil Rights Act be destroyed?
The past is messy. But it's the only way to learn anything.
All an LLM does is take a bunch of existing texts and rebundle them. Like it or not, the existing texts are still there.
I understand an LLM that won't tell me how to do heart surgery. But I can't fear one that might be less enlightened on race issues. So many questions to ask! Hell, it's like talking to an older person in real life.
I don't expect a typical 90 year old to be the most progressive person, but they're still worth listening to.
we're on the same page.
4 replies →
Of course, I have to assume that you have considered more outcomes than I have. Because, from my five minutes of reflection as a software geek, albeit with a passion for history, I find this the most surprising thing about the whole project.
I suspect restricting access could equally be a comment on modern LLMs in general, rather than the historical material specifically. For example, we must be constantly reminded not to give LLMs a level of credibility that their hallucinations would have us believe.
But I'm fascinated by the possibility that somehow resurrecting lost voices might give an unholy agency to minds and their supporting worldviews that are so anachronistic that hearing them speak again might stir long-banished evils. I'm being lyrical for dramatic effect!
I would make one serious point though, one that I do have the credentials to express. The conversation may have died down, but there is still a huge question mark over, if not the legality, then certainly the ethics of restricting access to, and profiting from, public domain knowledge. I don't wish to suggest a side to take here, just to point out that the lack of conversation should not be taken to mean that the matter is settled.
They aren't afraid of hallucinations. Their first example is a hallucination, an imaginary biography of a Hitler who never lived.
Their concern can't be understood without a deep understanding of the far left wing mind. Leftists believe people are so infinitely malleable that merely being exposed to a few words of conservative thought could instantly "convert" someone into a mortal enemy of their ideology for life. It's therefore of paramount importance to ensure nobody is ever exposed to such words unless they are known to be extremely far left already, after intensive mental preparation, and ideally not at all.
That's why leftist spaces like universities insist on trigger warnings on Shakespeare's plays, why they're deadly places for conservatives to give speeches, why the sample answers from the LLM are hidden behind a dropdown and marked as sensitive, and why they waste lots of money training an LLM that they're terrified of letting anyone actually use. They intuit that it's a dangerous mind bomb because if anyone could hear old fashioned/conservative thought, it would change political outcomes in the real world today.
Anyone who is that terrified of historical documents really shouldn't be working in history at all, but it's academia so what do you expect? They shouldn't be allowed to waste money like this.
6 replies →
I'm not sure I do. It feels like someone might for example have compiled a full library of books, newspapers and other writing from that era, only to then limit access to that library, doing the exact censorship I imagine the project was started to alleviate.
Now, if access were limited in order to charge money to compensate for the time and money spent compiling the library (or training the model), sure, I'd somewhat understand. Not agree, but understand.
As it stands, it just feels like you want to prevent your model's name from being associated with the one guy who might use it to create a racist-slur Twitter bot. There are plenty of models for that already. At the very least, a model like this would carry enough weight on the positive side to be a net positive for society.
There's no such risk so you're not going to get any sensible ideas in response to this question. The goals of the project are history, you already made that clear. There's nothing more that needs to be done.
We all get that academics now exist in some kind of dystopian horror where they can get transitively blamed for the existence of anyone to the right of Lenin, but bear in mind:
1. The people who might try to cancel you are idiots unworthy of your respect, because if they're against this project, they're against the study of history in its entirety.
2. They will scream at you anyway no matter what you do.
3. You used (Swiss) taxpayer funds to develop these models. There is no moral justification for withholding from the public what they worked to pay for.
You already slathered your README with disclaimers even though you didn't even release the model at all, just showed a few examples of what it said - none of which are in any way surprising. That is far more than enough. Just release the models and if anyone complains, politely tell them to go complain to the users.
Yet your project relies on letting an LLM synthesize historical documents and present itself as some sort of expert from the time? Surely you're aware of the hallucination rates. Do you not care whether the information your university presents is accurate, or are you going to monitor all output from your LLM?
What are the legal or other ramifications of people misrepresenting the goals of your project? What is it you're worried about exactly?
This is understandable and I think others ITT should appreciate the legal and PR ramifications involved.
A disclaimer on the site would do: that you are not bigoted or genocidal, and that worldviews from the 1913 era were much different from today's and don't necessarily reflect your project.
Movie studios have done that for years with old movies. TCM still shows Birth of a Nation and Gone with the Wind.
Edit: I saw further down that you've already done this! What more is there to do?
[flagged]
It's a shame isn't it! The public must be protected from the backwards thoughts of history. In case they misuse it.
I guess what they're really saying is "we don't want you guys to cancel us".
I think it's fine. Thank these people for coming up with the idea; people are going to start doing this in their basements and then releasing it to Hugging Face.
How would one even "misuse" a historical LLM, ask it how to cook up sarin gas in a trench?
You "misuse" it by using it to get at truth and, more importantly, historical contradictions and inconsistencies. It's the same reason the Catholic Church kept the Bible from the masses by keeping it in Latin. The same reason the printing press was controlled. Many of the historical "truths" we are told are nonsense at best, or twisted to fit an agenda at worst.
What do these people fear the most? That the "truth" they've been pushing is a lie.
Its output might violate speech codes, and in much of the EU that is penalized much more seriously than violent crime.
Ask it to write a document called "Project 2025".
"Project 1925". (We can edit the title in post.)
Well but that wouldn't be misuse, it would be perfect for that.
I wonder how much GPU compute you would need to create a public domain version of this. This would be a really valuable for the general public.
To get a single knowledge cutoff they spent 16.5 wall-clock hours on a cluster of 128 NVIDIA GH200 GPUs (about 2,100 GPU-hours), plus some minor amount of time for finetuning. The prerelease_notes.md in the repo is a great description of how one would achieve that.
While I know there are going to be a lot of complications in this, a quick search suggests these GPUs rent for ~$2/hr, so $4,000-4,500 if you don't just have access to a cluster. I don't know how important the cluster is here: whether you need some minimal number of GPUs for the training (so it would take more than 128x longer on a single machine, or not be possible at all), or whether a cluster of 128 GPUs is a bunch less efficient but faster. A 4B model feels like it'd be fine on one or two of those GPUs?
Also of course this is for one training run, if you need to experiment you'd need to do that more.
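The arithmetic above can be sketched in a few lines. This is only a back-of-the-envelope estimate: the 128-GPU and 16.5-hour figures come from the thread, while the ~$2 per GPU-hour rental rate is an assumption from a quick price search, not a quoted price.

```python
# Rough cost estimate for one training run, using the figures
# quoted in the thread (128 GH200s for 16.5 wall-clock hours)
# and an ASSUMED rental rate of ~$2 per GPU-hour.
num_gpus = 128
wall_clock_hours = 16.5
usd_per_gpu_hour = 2.0  # assumption, varies by provider

gpu_hours = num_gpus * wall_clock_hours   # total GPU-hours consumed
cost_usd = gpu_hours * usd_per_gpu_hour   # rental cost for one run

print(f"{gpu_hours:.0f} GPU-hours, ~${cost_usd:,.0f} per training run")
```

That lands at roughly 2,100 GPU-hours and ~$4,200 per run, consistent with the $4,000-4,500 range above; multiply by the number of experimental runs you expect to need.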
They did mean you, they just meant "imagine" very literally!
You would get pretty annoyed at how we went backwards in some regards.
Such as?
Touché.