
Posted by jekude 22 hours ago

Talkie: a 13B vintage language model from 1930 (talkie-lm.com)
578 points | 234 comments
mghackerlady 7 hours ago|
See, things like this are what LLMs should be used for. They can be helpful but are best used for cool hacks like this (or, my first exposure to them, someone sticking one in a quagsire plush)
anthk 11 hours ago||
For 1930s-style shows, there's the Red Panda podcast, mimicking the era of the sci-fi radio serials:

https://archive.org/details/RedPandaAdventures

Yes, it's weird, cheeky and outdated, but it's really fun and they did a great job mimicking the old accent.

teleforce 20 hours ago||
>Have you ever daydreamed about talking to someone from the past?

Fun fact: the LLM was once envisioned by Steve Jobs in one of his interviews [1].

Essentially, one of his main wishes in life was to meet and interact with Aristotle, which, according to him at the time, computers of the future could make possible.

[1] In 1985 Steve Jobs described a machine that would help people get answers from Aristotle–modern LLM [video]:

https://youtu.be/yolkEfuUaGs

cedilla 19 hours ago||
The idea of talking to a machine that has all of humanity's knowledge and gives answers is older than electronic computing. It certainly wasn't a novel idea when Jobs gave that speech. At that time, the field of artificial intelligence was old enough to become US president.
ok123456 17 hours ago||
Also, using natural language to interact with digital computers has been a research goal since the advent of interactive digital computers. AI in the 80s tried to do this with expert systems.

With the current crop of LLMs, you could argue it's now a solved problem, but the problem is nothing new.

fc417fc802 14 hours ago||
Solved in the sense that the core idea has been realized but unsolved in the sense that it isn't the sort of safe, reliable, deterministic interaction that was commonly envisioned.
anthk 11 hours ago|||
>Aristotle

As a snake oil seller, heh, I wouldn't expect anything better from Jobs. A competent and true programmer/hacker like Knuth would rather want to talk with Archimedes (he almost did a 0.9 version of calculus) or Euclid, whose Elements are far more relevant than the faulty logic and quackery from Aristotle.

jcgrillo 19 hours ago|||
Except... not at all? The vast majority of the training data required to create an artificial Aristotle has been lost forever. Smash your coffee cup on the ground. Now reassemble it and put the coffee back in. Once you can repeatably do that I'll begin to believe you can train an artificial Aristotle.
laichzeit0 16 hours ago|||
Also, none of Aristotle's exoteric works are extant. All we have are dry, boring lecture notes. Cicero said his public works were a "golden stream of speech" and it's all lost. So I don't see how you'd build an artificial Aristotle when we don't have any of the polished works he meant for the public. Plato would be a better option, since his entire exoteric corpus is extant.
antonvs 18 hours ago|||
Your bar is too low. With the coffee cup, you at least have access to all the pieces - in theory, although not in engineering practice. With Aristotle, you don't have anything close to that.

Recreating Aristotle in any meaningful way, other than a model trained on his surviving writing of a million or so words, is simply not possible even in principle.

fragmede 17 hours ago|||
That's easy! All you have to do is simulate the whole universe on a computer, and then go to the point when Aristotle is lecturing. Record all his works, then ctrl-c out of that and feed those recordings into the LLM's training data. For the coffee, you just rewind the simulation and ctrl-c and ctrl-v it at the point you want.
jcgrillo 16 hours ago||
Fuck, why didn't I think of that all those other times I fucked up in my life? Ctrl-z woulda done it every goddamn time.
jcgrillo 18 hours ago|||
OK, I'll raise the bar: make sure that when you reassemble the coffee cup and put the coffee back into it, the coffee is the exact same temperature as when you threw the whole shooting match onto the floor ;)

EDIT: and you don't get to re-heat it.

EDIT AGAIN: to be clear, in my post above (and this one) by "put the coffee back in" I meant more precisely "put every molecule of coffee that splashed/sloshed/flowed/whatever out when the cup smashed back into the re-assembled cup" i.e. "restore the system back to the initial state". Not "refill the glued-together pieces of your shattered coffee cup with new coffee".

freetanga 19 hours ago||
Imagine aiming for Aristotle and landing on Siri…
palashdeb 16 hours ago||
Wow, very interesting!
yesitcan 20 hours ago||
Vintage is a funny thing to call this. Is it running on vacuum tube hardware?
teraflop 19 hours ago||
I have no real quibble with the blog post itself, but I take issue with the title that calls it a "vintage model".

The blog post defines a "vintage model" as one that is trained only on data before a particular cutoff point:

> Vintage LMs are contamination-free by construction, enabling unique generalization experiments [...] The most important objective when training vintage language models is that no data leaks into the training corpus from after the intended knowledge cutoff

But as they acknowledge later, there are multiple major data leakage issues in their training pipeline, and their model does in fact have quite a bit of anachronistic knowledge. So it fails at what they call the most important objective. It's fair to say that they are working toward something that meets their definition of "vintage", but they're not there yet.

CobrastanJorji 18 hours ago|
Yeah, the blog distinguishes "contamination," which it describes as polluting the training data with answers to benchmark questions, from "temporal leakage," which is polluting the training data with writing from after the target date, but those seem to be nearly the same problem.
stingraycharles 18 hours ago|||
Not necessarily. The former is about data that's allowed to be in there but ends up testing the model's recall abilities rather than its reasoning (i.e., rather than actually having learned a certain writing style, it just recites some passage it memorized in that style).

The latter would be data that's not supposed to be in there at all; in this case, data from after 1930.

zoomeriut55 15 hours ago|||
a tweet from 2025 saying "the capital of france is paris" is temporal leakage, but not contamination
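
The two really are different filters, for what it's worth. A minimal sketch of how a training pipeline might tell them apart, assuming each document carries a date and the held-out eval answers are known strings (Doc, CUTOFF, and BENCHMARK_ANSWERS are made-up names for illustration, not from the blog post):

    from dataclasses import dataclass
    from datetime import date

    @dataclass
    class Doc:
        text: str
        written: date

    CUTOFF = date(1930, 1, 1)  # intended knowledge cutoff
    # Hypothetical held-out eval answers, used only by the contamination check.
    BENCHMARK_ANSWERS = {"some held-out benchmark answer"}

    def temporally_clean(doc: Doc) -> bool:
        # Temporal leakage check: drop anything written after the cutoff.
        return doc.written < CUTOFF

    def uncontaminated(doc: Doc) -> bool:
        # Contamination check: drop docs that quote an eval answer verbatim.
        return not any(ans in doc.text.lower() for ans in BENCHMARK_ANSWERS)

    tweet = Doc("the capital of france is paris", date(2025, 1, 1))
    print(temporally_clean(tweet))  # False: temporal leakage
    print(uncontaminated(tweet))    # True: not contamination

The 2025 tweet fails the temporal check but passes the contamination one, which is exactly the distinction being drawn above.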