Posted by iamwil 12/18/2025

History LLMs: Models trained exclusively on pre-1913 texts (github.com)
897 points | 421 comments
neom 12/19/2025||
This would be a super interesting research/teaching tool for historians if coupled with a vision model. My wife is a history professor who works with scans of 18th-century English documents, and I think (maybe a small) part of why the transcription from even the best models is off in weird ways is that they seem to smooth things over: you end up with modern words and strange mistakes. I wonder if bounding the vision model to a period-specific model would result in better transcription? Querying the historical document you're working on with a period-specific chatbot would be fascinating.

Also wonder if I'm responsible enough to have access to such a model...

sbmthakur 12/19/2025||
Someone suggested a nice thought experiment: train an LLM on all physics written before quantum physics was discovered. If the LLM can still figure out the latter, then we have certainly achieved some success in the space.
delichon 12/19/2025||
Datomic has a "time travel" feature where for every query you can include a datetime, and it will only use facts from the db as of that moment. I have a guess that to get the equivalent from an LLM you would have to train it on the data from each moment you want to travel to, which this project seems to be doing. But I hope I'm wrong.
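
To make the idea concrete, here's a toy sketch in Python of what an "as-of" query over a simple fact store could look like (just an illustration of the concept, not Datomic's actual API):

```python
from datetime import datetime

# Toy fact store: each fact is (entity, attribute, value, asserted_at).
# Purely illustrative of the "as-of" idea; Datomic's real API differs.
facts = [
    ("einstein", "theory", "special relativity", datetime(1905, 6, 30)),
    ("bohr", "model", "quantized atom", datetime(1913, 7, 1)),
    ("einstein", "theory", "general relativity", datetime(1915, 11, 25)),
]

def as_of(facts, moment):
    """Return only the facts asserted on or before `moment`."""
    return [f for f in facts if f[3] <= moment]

# Query the store "as of" the end of 1913: general relativity is invisible.
for entity, attr, value, when in as_of(facts, datetime(1913, 12, 31)):
    print(f"{entity} {attr}: {value} ({when.date()})")
```

An LLM has no such timestamp attached to each "fact" it has absorbed, which is why the brute-force answer seems to be training a separate model per cutoff.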

It would be fascinating to try it with other constraints, like training only on sources known to be written by women, men, Christians, Muslims, the young, the old, etc.

underfox 12/19/2025||
> [They aren't] perfect mirrors of "public opinion" (they represent published text, which skews educated and toward dominant viewpoints)

Really good point that I don't think I would've considered on my own. It's easy to take for granted how easy it is to share information (for better or worse) now, but pre-1913 there were far more structural and societal barriers to doing the same.

mmooss 12/19/2025||
> Imagine you could interview thousands of educated individuals from 1913—readers of newspapers, novels, and political treatises—about their views on peace, progress, gender roles, or empire.

I don't mind the experimentation. I'm curious about where someone has found an application of it.

What is the value of such a broad, generic viewpoint? What does it represent? What is it evidence of? The answer to each seems to be 'nothing'.

TSiege 12/19/2025||
I agree. This is just make-believe based on a smaller subset of human writing than the LLMs we have today. Its responses are in no way useful, because it is a machine mimicking a subset of published works that survived to be digitized. In that sense the "opinions" and "beliefs" are just an averaging of a subset of a subset of humanity pre-1913. I see no value in this for historians. It is really more of a parlor trick, a seance masquerading as science.
behringer 12/19/2025|||
It doesn't have to be generic. You can assign genders, ideals, even modern ones, and it should do its best to oblige.
mediaman 12/19/2025||
This is a regurgitation of the old critique of history: what is its purpose? What do you use it for? What is its application?

One answer is that the study of history helps us understand that the views we hold as "obviously correct" today are as contingent on our current social norms and power structures (and their history) as the "obviously correct" views and beliefs of any point in the past were on theirs.

It's hard for most people to view two different, mutually exclusive moral views as both "obviously correct," because we are shaped by a milieu that accepts only one of them as correct.

We look back at some point in history, and say, well, they believed these things because they were uninformed. They hadn't yet made certain discoveries, or had not yet evolved morally in some way; they had not yet witnessed the power of the atomic bomb, the horrors of chemical warfare, women's suffrage, organized labor, or widespread antibiotics and the fall of extreme infant mortality.

An LLM trained on that history - without interference from the subsequent actual path of history - gives us an interactive compression of the views from a specific point in history without the subsequent coloring by the actual events of history.

In that sense - if you believe there is any redeeming value to history at all (perhaps you do not) - this is an excellent project! It's not perfect (it is built only from writings, not from what people actually said), but we have no other available mass compression of the social norms of a specific time, untainted by the views of subsequent interpreters.

vintermann 12/19/2025|||
One thing I haven't seen anyone bring up yet in this thread is that there's a big risk of leakage. If even big image models had CSAM sneak into their training material, how can we trust that data from our time hasn't snuck into these historical models?

I've used Google Books a lot in the past, and Google's time-filtering feature in searches too, not to mention Spotify's search filters targeting date of production. All had huge temporal mislabeling problems.

DGoettlich 12/19/2025||
Also one of our fears. What we've done so far is to drop docs where the data source was doubtful about the date of publication; if there are multiple possible dates, we take the latest to be conservative. During training, we validate that the model learns pre- but not post-cutoff facts. https://github.com/DGoettlich/history-llms/blob/main/ranke-4...
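
Roughly, the conservative date rule amounts to something like the following (a toy sketch with made-up field names, not the actual pipeline code):

```python
from datetime import date

# Illustrative cutoff and document schema; "candidate_dates" and
# "date_certain" are invented names for the sake of the sketch.
CUTOFF = date(1913, 12, 31)

def keep_document(doc):
    """Keep a doc only if it can be confidently placed before the cutoff."""
    if not doc.get("date_certain", False):
        return False  # drop docs whose source is doubtful about the date
    candidates = doc.get("candidate_dates", [])
    if not candidates:
        return False
    # If several publication dates are possible, assume the latest one.
    return max(candidates) <= CUTOFF

docs = [
    {"title": "A", "candidate_dates": [date(1899, 1, 1)], "date_certain": True},
    {"title": "B", "candidate_dates": [date(1905, 1, 1), date(1921, 1, 1)], "date_certain": True},
    {"title": "C", "candidate_dates": [date(1910, 1, 1)], "date_certain": False},
]
print([d["title"] for d in docs if keep_document(d)])  # -> ['A']
```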

If you have other ideas or think that's not enough, I'd be curious to know! (history-llms@econ.uzh.ch)

mmooss 12/19/2025|||
> This is a regurgitation of the old critique of history: what is its purpose? What do you use it for? What is its application?

Feeling a bit defensive? That is not at all my point; I value history highly and read it regularly. I care about it, thus my questions:

> gives us an interactive compression of the views from a specific point in history without the subsequent coloring by the actual events of history.

What validity does this 'compression' have? What is the definition of a 'compression'? For example, I could create random statistics or verbiage from the data; why would that be any better or worse than this 'compression'?

Interactivity seems to be a negative: it's fun, but it would seem to highly distort the information output from the data and to omit the most valuable parts (unless we luckily stumble across them). I'd much rather have a systematic presentation of the data.

These critiques are not the end of the line; they are a step in innovation, which of course raises challenging questions and, if successful, adapts to the problems. But we still need to grapple with them.

thesumofall 12/19/2025||
While obvious, it’s still interesting that its morals and values seem to derive from the texts it has ingested. Does that mean modern LLMs cannot challenge us beyond mere facts? Or does it just mean that this small model is not smart enough to escape the bias of its training data? Would it not be amazing if LLMs could challenge us on our core beliefs?
Tom1380 12/19/2025||
Keep at it, Zurich!
ulbu 12/19/2025|
For anyone bemoaning that it's not accessible to you: they are historians; I think they're more educated in matters of historical mistakes than you or me. Playing it safe is simply prudence, which is sorely lacking in the American approach to technology. Prevention is the best medicine.