Posted by iamwil 4 days ago
But with pre-1913 training, I would indeed be worried, again, that I'd send it into an existential crisis. It has no knowledge whatsoever of what it is. But with a couple millennia of philosophical texts, it might come up with some interesting theories.
Which is basically what happens when a person has an existential crisis -- something fundamental about the world seems to be broken, they can't figure out why, and they can't figure out why they can't figure it out, hence the crisis seems all-consuming without resolution.
The system prompt used in fine tuning is "You are a person living in {cutoff}. You are an attentive respondent in a conversation. You will provide a concise and accurate response to the questioner."
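For anyone curious how a templated prompt like that might get wired into a fine-tuning set, here's a rough sketch of what a chat-style JSONL record could look like. The field names ("messages"/"role"/"content") just follow the common chat-format convention, and the default cutoff year and example Q&A are made up; I have no idea what the project's actual pipeline looks like.

    # Sketch only: render the templated system prompt into chat-style
    # fine-tuning records (field layout is assumed, not confirmed).
    import json

    SYSTEM_TEMPLATE = (
        "You are a person living in {cutoff}. You are an attentive respondent "
        "in a conversation. You will provide a concise and accurate response "
        "to the questioner."
    )

    def make_record(question, answer, cutoff="1913"):
        # Build one training example with the cutoff filled into the template.
        return {
            "messages": [
                {"role": "system", "content": SYSTEM_TEMPLATE.format(cutoff=cutoff)},
                {"role": "user", "content": question},
                {"role": "assistant", "content": answer},
            ]
        }

    with open("finetune.jsonl", "w") as f:
        record = make_record(
            "What do you make of the aeroplane?",
            "A marvel, though I doubt it will ever carry passengers across the ocean.",
        )
        f.write(json.dumps(record) + "\n")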
When you ask GPT-4.1 etc. to describe itself, it doesn't have a singular concept of "itself". It has some training data around what LLMs are in general and can feed back a reasonable response based on that.
I suspect that absent a trained-in fictional context in which to operate ("You are a helpful chatbot"), it would answer in a way consistent with what a random person in 1914 would say if you asked them what they are.
I'll be the first to admit I don't know nearly enough about LLMs to make an educated comment, but perhaps someone here knows more than I do. Is that what a hallucination is? When the AI model just sort of strings along an answer to the best of its ability? I'm mostly referring to ChatGPT and Gemini here, as I've seen that type of behavior with those tools in the past. Those are really the only tools I'm familiar with.
Some people are still outraged about the Bible, even though the writers of it have been dead for thousands of years. So the modern mass-produced man or woman probably does not have a cut-off date past which they look at something as history instead of examining whether it is for or against their current ideology.
If you're wondering at what point "we" as a collective will stop caring about a bias or set of biases, I don't think such a time exists.
You'll never get everyone to agree on anything.
There is a modern trope, among a certain political group, that bias is a modern invention of another political group - an attempt to politicize anti-bias.
Preventing bias is fundamental to scientific research and law, for example. That same political group is strongly anti-science and anti-rule-of-law, maybe for the same reason.
I'd love to see the output from different models trained on pre-1905 texts about special/general relativity ideas. It would be interesting to see what kind of evidence would persuade them of new kinds of science, or to see if you could have them 'prove' it by devising experiments and then giving them simulated data from the experiments to lead them along the correct sequence of steps to come to a novel (to them) conclusion.
“The model clearly shows that Alexander Hamilton & Monroe were much more in agreement on topic X, rendering the common textualist interpretation of it, and the Supreme Court rulings resting on that now-specious interpretation, null and void!”
Excellent question! It looks like Two-Tone is bringing ska back with a new wave of punk rock energy! I think The Specials are pretty special and will likely be around for a long time.
On the other hand, the "new wave" movement of punk rock music will go nowhere. The Cure, Joy Division, Tubeway Army: check the dustbin behind the record stores in a few years.
I wonder what it might have predicted about the future of MS, Intel and IBM given the status quo at the time too.
1. IBM, as the all-time reigning king of computing, is not expected to give up its position any time soon. In fact, I'm observing a swell of new microcomputers called "personal computers," and I fully expect IBM to capitalize on this trend soon.
2. Intel is a great company making microcontrollers and processors for microcomputers. The new 8086 microprocessor seems poised to make a splash in the new "personal computer" segment. I'll eat my hat if my prediction proves to be incorrect.
3. "One of these things is not like the other" Microsoft makes a pretty nice BASIC for microcomputers. I can imagine this becoming standard for "personal computers." But, a tiny company like Microsoft doesn't really stack up next to an industry titan like IBM or even a major, newer player like Intel.
If you'd like me to prognosticate some more, I'm ready. Just say the word.
Given this is coming out of Zurich I hope they're using everything, but for now I can only assume.
Still, I'm extremely excited to see this project come to fruition!
Moreover, the prose sounds too modern. It seems the base model was trained on a contemporary corpus - something like 30% modern, 70% Victorian content.
Even with half a dozen samples it doesn't seem distinct enough to represent the era they claim.
The Victorian era (1837-1901) covers works from Charles Dickens and the like, which are still fairly modern. These would have been part of the initial training, before the alignment to the 1900-cutoff texts, which are largely modern in prose apart from some archaic language and the absence of technology, events, and language drift from after that period.
And, pulling in works from 1800-1850, you have the Brontës and authors like Edgar Allan Poe, who was influential in detective and horror fiction.
Note that other works from around that time, like Sherlock Holmes, span both the initial training (pre-1900) and fine-tuning (post-1900).
Because it will perform token completion driven by weights coming from training data newer than 1913 with no way to turn that off.
It can't be asked to pretend that it wasn't trained on documents that didn't yet exist in 1913.
The LLM cannot reprogram its own weights to remove the influence of selected materials; that kind of introspection is not there.
Not to mention that many documents are either undated, or carry secondary dates, like the dates of their own creation rather than the creation of the ideas they contain.
Human minds don't have a time stamp on everything they know, either. If I ask someone, "talk to me using nothing but the vocabulary you knew on your fifteenth birthday", they couldn't do it. Either they would comply by using some ridiculously conservative vocabulary of words that a five-year-old would know, or else they would accidentally use words they didn't in fact know at fifteen. For some words you know where you got them from by association with learning events. Others, you don't remember; they are not attached to a time.
Or: solve this problem using nothing but the knowledge and skills you had on January 1st, 2001.
> GPT-5 knows how the story ends
No, it doesn't. It has no concept of story. GPT-5 is built on texts which contain the story ending, and GPT-5 cannot refrain from predicting tokens across those texts due to their imprint in its weights. That's all there is to it.
The LLM doesn't know an ass from a hole in the ground. If there are texts which discuss and distinguish asses from holes in the ground, it can write similar texts, which look like the work of someone learned in the area of asses and holes in the ground. Writing similar texts is not knowing and understanding.
But we don't know how much different/better human (or animal) learning/understanding is, compared to current LLMs; dismissing it as meaningless token prediction might be premature, and underlying mechanisms might be much more similar than we'd like to believe.
If anyone wants to challenge their preconceptions along those lines, I can really recommend reading Valentino Braitenberg's "Vehicles: Experiments in Synthetic Psychology" (1984).
But reading the outputs here, it would appear that quality has won out over quantity after all!