Posted by fs123 13 hours ago

Claude's Cycles [pdf] (www-cs-faculty.stanford.edu)
383 points | 187 comments
ontouchstart 8 hours ago|
Fascinating report by DEK himself.

Time to sit down, read, digest, and understand it without the help of an LLM.

ontouchstart 7 hours ago|
I don't have time to do that myself yet so I just dug a quick TL;DR rabbit hole for fun:

https://ontouchstart.github.io/rabbit-holes/llm_rabbit_hole_...

ecshafer 8 hours ago||
I wonder how long we have until we start solving some truly hard problems with AI. How long until we throw AI at "connect general relativity and quantum physics", give the AI 6 months and a few data centers, and have it pop out a solution?
rustyhancock 8 hours ago||
I think a very long time because part of our limit is experiment.

We need enough experimental results to resolve these theoretical mismatches, and we don't have them; at present we can't even explore that frontier.

Once we have more results at that frontier, we'd build a theory out from there that has QFT and GR as two nearly independent limits.

What we'd be asking of the AI is something that we can't expect a human to solve even with a lifetime of effort today.

It'll take something on par with Newton realising that the heavens and apples are under the same rules. But at least Newton got to hold the apple and only had to imagine he could hold a star.

eru 7 hours ago|||
> I think a very long time because part of our limit is experiment.

Yes, maybe. But if you are smarter, you can think up better experiments that you can actually do. Or re-use data from earlier experiments in novel and clever ways.

fleischhauf 6 hours ago||
This. It could already be useful to narrow down the search space.
bob1029 8 hours ago||||
What prevents us from giving this system access to other real systems that live in physical labs? I don't see much difference between parameterizing and executing a particle accelerator run and invoking some SQL against a provider. It's just JSON on the wire at some level.
rustyhancock 8 hours ago||
Nothing, we can give it all the data we have and have it lead experiments.

But we can not yet experiment at the GR/QFT frontier.

To do so with a particle accelerator, it would need to be the size of the Milky Way.

fragmede 7 hours ago|||
The question is, if you trained an LLM on everything up until 1904, could it come up with E=mc² or not?
rustyhancock 7 hours ago||
In 1900 Henri Poincaré wrote that radiation (light) has an effective mass given by E/c^2.

So it really isn't far-fetched. What intrigues me more is: if it was capable of it, would our conservative-minded Victorian scientists have RLHF'd that kind of thing out of it?

emp17344 6 hours ago|||
Hold your horses, that’s a long way off. The best math AI tool we currently have, Aletheia, was only able to solve 13 of 700 attempted open Erdős problems, only 4 of which were solved autonomously: https://arxiv.org/html/2601.22401v3

Clearly, these models still struggle with novel problems.

slibhb 5 hours ago||
> Clearly, these models still struggle with novel problems.

Do they struggle with novel problems more or less than humans?

Filligree 4 hours ago||
Less than most humans, but more than many humans.
worldsavior 8 hours ago|||
If AGI ever comes, then maybe. Currently, AI is only a statistical machine, and solutions like this are purely based on distribution, with no logic or actual intelligence.
zarzavat 8 hours ago|||
I swear that AI could independently develop a cure for cancer and people would still say that it's not actually intelligent, just matrix multiplications giving a statistically probable answer!

LLMs are at least designed to be intelligent. Our monkey brains have much less reason to be intelligent, since we only evolved to survive nature, not to understand it.

We are at this moment extremely deep into what most people would have considered to be actual artificial intelligence a mere 15 years ago. We're not quite at human levels of intelligence, but it's close.

qsera 7 hours ago|||
>AI could independently develop a cure for cancer

All the answers to all your questions are contained in randomness. If you have a random sentence generator, there is a chance that it will output the answer to this question every time it is invoked.

But that does not actually make it intelligent, does it?

famouswaffles 7 hours ago|||
You are arguing a point no one is making. LLMs are not random sentence generators; their probability distributions are anything but random. You could make an actual random sentence generator, but no one would argue about its intelligence.
graemefawcett 7 hours ago|||
This is exactly how problem solving works, regardless of the substrate of cognition.

Start with "all the answers to your questions contained in randomness" -> the unconstrained solution space.

The game is whether or not you can inject enough constraints to collapse the solution space to one that can be solved before your TTL expires. In software, that's generally handled by writing efficient algorithms. With LLMs, apparently the SOTA for this is just "more data centers, 6 months, keep pulling the handle until the right tokens fall out".

Intelligence is just knowing which constraints to apply and in what order such that the search space is effectively partitioned, same thing the "reasoning" traces do. Same thing thermostats, bacteria, sorting algorithms and rivers do, given enough timescale. You can do the same thing with effective prompting.

The LLM has no grounding, no experience and no context other than which is provided to it. You either need to build that or be that in order for the LLM to work effectively. Yes, the answers for all your questions are contained. No, it's not randomness. It's probability and that can be navigated if you know how

qsera 4 hours ago||
You can constrain the solution space all you want, but if you don't have a method to come up with possible solutions that might match the constraints, you'll just be sitting there all day waiting for the machine to produce results. So intelligence is not "just knowing which constraints to apply". It is also the ability to come up with solutions within the constraints without going through a lot of trial and error...

But hey, if LLMs can go through a lot of trial and error, they might produce useful results. That is not intelligence, though. It is just a highly constrained random solution generator.

graemefawcett 4 hours ago||
I believe that's what I and the paper are both saying as well. The LLM is pure routing; the constraints currently live elsewhere in the system. In this case, both the constraints and the motivation to perform the work are located in Knuth and his assistant.

Routing is important; it's why we keep building systems that do it faster and over more degrees of freedom. LLMs aren't intelligent on their own, but it's not because they don't have enough parameters.

wang_li 7 hours ago||||
Last week I put "was val kilmer in heat" into the search box on my browser. The AI answer came back with "No, Val Kilmer was not in heat. Val Kilmer played Chris Shiherlis in the movie Heat but the film did not indicate that he was pregnant or in heat. His performance was nuanced and skilled and represents a high point of the film." I was not curious about whether he was pregnant.

Not only are we not close to human-level intelligence, we are not even at dog, cat, or mouse levels of intelligence. We are not actually at any level of intelligence. Devices that produce text, images, or code do not demonstrate intelligence any more than a printer producing pages of beautiful art demonstrates intelligence.

DennisP 6 hours ago|||
Honestly, when I read your first sentence, given the lack of a capital H, my brain initially went the same direction the AI did. Then I realized what you meant, but since I'd already gone there, I might have made a similar response as a joke. For the sake of my ego, I'm forced to reject your claim that this is evidence of stupidity.
logicprog 2 hours ago||||
> I was not curious about whether he was pregnant.

I interpreted the question the same way the AI did.

sosodev 6 hours ago|||
The model that processes search results is tiny and dumb. You shouldn't compare it to the frontier models that are solving complex math problems.
StilesCrisis 4 hours ago||
On Google, just clicking "AI Mode" gives you a substantially smarter model, and it's still pretty weak. But I assume the OP wasn't talking about Google because it doesn't seem to make this mistake even in a search.
wang_li 1 hour ago||
It was Bing, as that is the default for Edge as supplied on my work laptop. It doesn't do this now, but it does do something else quite weird:

search: was val kilmer pregnant or in heat

answer: Not pregnant Val Kilmer was not pregnant or in heat during the events of "Heat." His character, Chris Shiherlis, is involved in a shootout and is shot, which indicates he is not in a reproductive or mating state at that time.

And then cites wikipedia as the source of information.

In terms of cognition the answer is meaningless. Nothing in the question implies or suggests that the question has to do with a movie. Additionally, "involved in a shootout and is shot, which indicates he is not in a reproductive or mating state" makes no sense at all.

AI as deployed shows no intelligence.

Philpax 20 minutes ago||
If you asked a three-year-old a question that they proceeded to completely flub, would you then assume that all humans are incapable of answering questions correctly?

Nobody is arguing for the quality of the search overviews. The models that impress us are several orders of magnitude larger in scale, and are capable of doing things like assisting preeminent computer scientists (the topic of discussion) and mathematicians (https://github.com/teorth/erdosproblems/wiki/AI-contribution...).

worldsavior 7 hours ago|||
That's wrong. Humans evolved big brains so they could better understand the environment and use it to their advantage.

I still see AI making stupid, silly mistakes. I'd rather think for myself than waste time on something that only remembers data and doesn't even understand it.

Reasoning in AI is only about finding contradictions between its "thoughts", not actually understanding them.

someplaceguy 7 hours ago|||
> I still see AI making stupid silly mistakes.

In contrast with humans, who are famously known for never making stupid silly mistakes...

_fizz_buzz_ 7 hours ago|||
> I still see AI making stupid silly mistakes.

Humans also make silly mistakes.

whimsicalism 2 hours ago||||
It only took 4 years, but it appears this view is finally dying out on HN. I would advise everyone who found this viewpoint compelling to think about how those same blinders might be affecting how you imagine the future will look.
rustyhancock 8 hours ago||||
I don't even think that's the issue.

The issue to my mind is a lack of data at the meeting of QFT/GR.

After all, few humans historically have been capable of the initial true leap between ontologies. But humans are pretty smart, so we can't say that is a requirement for AGI.

worldsavior 8 hours ago|||
When it comes to revolutionary/unsolved subjects, there will never be enough data. That's why it's revolutionary/unsolved.
cjcole 6 hours ago|||
Maybe.

“The laws of nature should be expressed in beautiful equations.”

- Paul Dirac

“It is, indeed, an incredible fact that what the human mind, at its deepest and most profound, perceives as beautiful finds its realisation in external nature. What is intelligible is also beautiful. We may well ask: how does it happen that beauty in the exact sciences becomes recognizable even before it is understood in detail and before it can be rationally demonstrated? In what does this power of illumination consist?”

- Subrahmanyan Chandrasekhar

“I often follow Plato’s strategy, proposing objects of mathematical beauty as models for Nature.”

“It was beauty and symmetry that guided Maxwell and his followers.”

- Frank Wilczek

“Beauty is bound up with symmetry.”

- Hermann Weyl

"Still twice in the history of exact natural science has this shining-up of the great interconnection become the decisive signal for significant progress. I am thinking here of two events in the physics of our century: the rise of the theory of relativity and that of the quantum theory. In both cases, after yearlong unsuccessful striving for understanding, a bewildering abundance of details was almost suddenly ordered. This took place when an interconnection emerged which, though largely unvisualizable, was finally simple in its substance. It convinced through its compactness and abstract beauty – it convinced all those who can understand and speak such an abstract language."

- Werner Heisenberg

Maybe (just maybe) these things (whatever you want to call them) will (somehow) gain access to some "compact", beautiful, "largely unvisualizable" "interconnection" which will be the self-evident solution. And if they do, many will be sure to label it a statistical accident from a stochastic parrot. And they'll be right, for some definitions of "statistical", "accident", "stochastic", and "parrot".

bobbylarrybobby 7 hours ago|||
Did you read the linked paper? Claude out-reasoned humans on a challenging (or at least, unsolved) math problem.
cjcole 7 hours ago|||
"humans"

Donald Knuth is an extremal outlier human and the problem is squarely in his field of expertise.

Claude, guided by Filip Stappers, a friend of Knuth, solved a problem that Knuth and Stappers had been working on for several weeks. Unfortunately, it doesn't seem (from my quick scan) to have been stated how long (or how many tokens or $) it took for Claude + Stappers to complete the proof.

In response, Knuth said: "It seems that I’ll have to revise my opinions about “generative AI” one of these days."

Seems like good advice. From reading elsewhere in this comment section, the goalposts seem to be approaching the infrared and will soon disappear entirely from the extreme redshift, due to the rate at which they are receding with each new achievement.

emp17344 6 hours ago||
What goalposts do you think are being moved? I constantly see AI enthusiasts use this phrase, but it’s not clear what goalposts they have in mind. Specifically, what is it that you want opponents to recognize that you believe they aren’t currently?

We now have a tool that can be useful in some narrow domains in some narrow cases. It’s pretty neat that our tools have new capabilities, but it’s also pretty far from AGI.

cjcole 6 hours ago|||
I'm not an enthusiast. I'm a Butlerian.

Imagine hearing pre-attention-is-all-you-need that "AI" could do something that Donald Knuth could not (quickly solve the stated problem in collaboration with his friend).

The idea that this (Putnam perfect, IMO gold, etc) is all just "statistical parrot" stuff is wearing a little thin.

whimsicalism 2 hours ago|||
You must have forgotten the /s at the end of your comment?
emp17344 1 hour ago||
Uh, no? You think LLMs are AGI?
worldsavior 7 hours ago|||
Merely luck, in my opinion. There could also be multiple times where it didn't solve it.
graemefawcett 7 hours ago||
Connecting them is easy: one is the math of the exchange and one the math of the state machine.

A better question might be why no one is paying more attention to Barandes at Harvard. He's been publishing the answer to that question for a while: if you stop trying to smuggle a Markovian embedding into a non-Markovian process, you stop getting weird things like infinities at boundaries that can't be worked out from the current position alone.

But you could just dump a prompt into an LLM and pull the handle a few dozen times and see what pops out too. Maybe whip up a Claw skill or two

Unconstrained solution space exploration is surely the way to solve the hard problems

Ask those Millennium Prize guys how well that's working out :)

Constraint engineering is all software development has ever been, or did we forget how entropy works? Someone should remind the folks chasing P=NP that the observer might need a pen to write down his answers, or are we smuggling in more things for free that change the entire game? As soon as the witness's locations cost something, our poor little guy can't keep walking that hypercube forever. Can he?

Maybe 6 months and a few data centers will do it ;)

taylorius 6 hours ago||
I thought it was Claude Monet: Impressionist techniques applied to coding.
zackmorris 4 hours ago||
Amazing paper. The simulated annealing portion reminds me of genetic algorithms (GAs). A good intro to that is the Genetic Programming series of books by John Koza; I read volume III in the early 2000s:

https://www.amazon.com/Genetic-Programming-III-Darwinian-Inv...

https://www.genetic-programming.com/
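For readers who haven't seen it, simulated annealing in its minimal generic form looks like this. To be clear, this is a sketch of the standard technique, not code from the paper; the toy energy function and neighbor move are placeholders of my own.

```python
import math
import random

def anneal(state, energy, neighbor, t0=1.0, cooling=0.995, steps=10000):
    """Generic simulated annealing: always accept improvements,
    accept worse candidates with probability exp(-delta/T),
    and cool T geometrically each step."""
    current, e = state, energy(state)
    best, best_e = current, e
    t = t0
    for _ in range(steps):
        cand = neighbor(current)
        ce = energy(cand)
        # Downhill moves always pass; uphill moves pass with Boltzmann probability.
        if ce < e or random.random() < math.exp((e - ce) / t):
            current, e = cand, ce
            if e < best_e:
                best, best_e = current, e
        t *= cooling
    return best, best_e

# Toy usage: minimize (x - 3)^2 over the reals.
random.seed(0)
sol, val = anneal(0.0, lambda x: (x - 3) ** 2,
                  lambda x: x + random.uniform(-1, 1))
```

The family resemblance to GAs is the accept/reject loop over random perturbations; the difference is that annealing keeps a single state and a temperature schedule instead of a population and crossover.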

Note that the Python solution in the pdf is extremely short, so it could have been found by simply trying permutations of math operators and functions on the right-hand side of the equation.
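A rough sketch of what such a brute-force search over operator templates could look like. The target values and the template list here are hypothetical stand-ins, not the actual problem or search from the pdf:

```python
from itertools import product

# Hypothetical target: find a short expression f(n) matching these values.
target = {n: n * (n + 1) // 2 for n in range(1, 8)}  # triangular numbers

# Candidate right-hand-side templates with one small integer constant k.
ops = ['n + {k}', 'n * {k}', 'n * (n + {k}) // 2', '(n + {k}) ** 2', 'n ** 2 - {k}']

def search():
    # Try every template/constant pair until one reproduces all target values.
    for tmpl, k in product(ops, range(-3, 4)):
        expr = tmpl.format(k=k)
        if all(eval(expr, {'n': n}) == v for n, v in target.items()):
            return expr
    return None

print(search())  # prints: n * (n + 1) // 2
```

Real symbolic-regression systems enumerate expression trees rather than string templates, but the principle of testing candidates against known values is the same.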

We should be solving problems in Lisp instead of Python, since Lisp's abstract syntax tree (AST) is the same as its code due to homoiconicity, but no matter. I'm curious whether most AIs transpile other languages to Lisp so they can apply transformations internally, or whether they waste computation building programs that might not compile. Maybe someone at an AI company knows.
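Python isn't homoiconic, but its standard `ast` module does expose the syntax tree for exactly this kind of programmatic rewriting. A toy illustration of a code-as-data transformation, not a claim about how any AI works internally:

```python
import ast

class SwapAddToMul(ast.NodeTransformer):
    """Toy transformation: rewrite every a + b into a * b."""
    def visit_BinOp(self, node):
        self.generic_visit(node)  # transform children first
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()
        return node

tree = ast.parse('result = 2 + 3')
new_tree = ast.fix_missing_locations(SwapAddToMul().visit(tree))
print(ast.unparse(new_tree))  # prints: result = 2 * 3

ns = {}
exec(compile(new_tree, '<ast>', 'exec'), ns)
print(ns['result'])  # prints: 6
```

In Lisp the same rewrite is a plain list manipulation with no separate AST API, which is the advantage the comment is pointing at.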

-

I've been following AI trends since the late 1980s, and from my perspective nothing really changed for about 40 years (most of my life, which I had to wait through while the world messed around making other people rich). We've had agents, expert systems, fuzzy logic, neural nets, etc. since forever, but then we got video cards in the late 1990s, which made it straightforward to scale neural nets (NNs) and GAs. Unfortunately, due to a poor choice of architecture (SIMD instead of MIMD), progress stagnated because we don't have true multicore computing (thousands or millions of cores with local memories), but I digress.

Anyway, people have compared AI to compression. I think of it more as turning problem solving into an O(1) operation. Over time, what we think of as complex problems become simpler, and the rate at which we're solving them is increasing exponentially. Problems that once seemed intractable only were because we didn't know the appropriate abstractions yet. For example, illnesses we thought would never be cured are now being addressed through mRNA vaccines and CRISPR. That's how I think of programming: now that we have LLMs, whole classes of programming problems have O(1) solutions, even if that's just telling the computer what problem to solve.

So even theorem proving will become a solved problem by the time we reach the Singularity, between 2030 and 2040. We once mocked GAs for exploring dead ends and taking 1000 times the processing power to do simple things. But we ignored that doing hard things is often worth it, and it is still an O(1) operation due to linear scaling.

It's a weird feeling to go from no forward progress in a field to it being effectively a solved problem in just 2 years. To go from trying to win the internet lottery to not being sure if people will still be buying software in a year or two if/when I finish a project. To witness all of that while struggling to make rent, in effect making everything I have ever done a waste of time since I knew better ways of doing it but was forced to drop down to whatever mediocre language or framework paid. As the problems I was trained to solve and was once paid to solve rapidly diminish in value because AI can solve them in 5 minutes. To the point that even inventing AGI would be unsurprising to most, so I don't know why I ever went into computer engineering to do exactly that. Because for most people, it's already here. As I've said many times lately, I thought I had more time.

Although now that we're all out of time, I have an uncanny feeling of being alive again. I think tech stole something from my psyche so profound that I didn't notice its loss. It's along the lines of things like boredom, daydreaming, wasting time. What modern culture considers frivolous. But as we lose every last vestige of the practical, as money becomes harder and harder to acquire through labor, maybe we'll pass a tipping point where the arts and humanities become sought-after again. How ironic would it be if the artificial made room for the real to return?

On that note, I finally read a book: Project Hail Mary by Andy Weir. The last book I read was Ready Player One by Ernest Cline, over a decade ago. I don't know how I would have had the bandwidth to do that if Claude hadn't made me a middle manager of AIs.

jdnier 7 hours ago||
> I think Claude Shannon’s spirit is probably proud to know that his name is now being associated with such advances. Hats off to Claude!

I didn't realize Claude was named after Claude Shannon!

https://en.wikipedia.org/wiki/Claude_Shannon

tzumaoli 5 hours ago||
Trivia: Claude Shannon proposed the idea of predicting the next token (letter) using statistics/probabilities in the training data corpus in 1950: "Prediction and Entropy of Printed English" https://languagelog.ldc.upenn.edu/myl/Shannon1950.pdf
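Shannon's letter-guessing experiment is easy to reproduce in miniature: count which character follows each short context in a corpus, then predict the most frequent one. The corpus below is just a stand-in, and real models use far longer contexts and smoothing:

```python
from collections import Counter, defaultdict

corpus = "the theory of communication treats the statistical structure of the text"

# Count which character follows each 2-character context.
follows = defaultdict(Counter)
for i in range(len(corpus) - 2):
    follows[corpus[i:i + 2]][corpus[i + 2]] += 1

def predict(context):
    """Most probable next character after a 2-char context, Shannon-style."""
    return follows[context].most_common(1)[0][0]

print(predict('th'))  # prints: e
```

Swap characters for subword tokens and the counting for a learned distribution, and this is recognizably the same prediction task today's LLMs are trained on.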
Anon84 4 hours ago|||
It goes back a bit further than that. His 1948 “Mathematical Theory of Communication” [1] already has (what we would now call) a Markov chain language model, page 7 onwards. AFAIK, this was based on his classified WWII work, so it is probably a few years older than that.

[1] https://people.math.harvard.edu/~ctm/home/text/others/shanno...

aix1 4 hours ago||
I was just reading Norbert Wiener's "The Human Use of Human Beings" (1950) and this quote gave me a good chuckle:

"One may get a remarkable semblance of a language like English by taking a sequence of words, or pairs of words, or triads of words, according to the statistical frequency with which they occur in the language, and the gibberish thus obtained will have a remarkably persuasive similarity to good English."

Trinicode 2 hours ago|||
A letter is not a token, is it? Redundancy could hit 75% in long sentences, but Shannon was not predicting tokens or words; he was predicting letters (characters).
pfdietz 5 hours ago|||
It's like the diesel engine, which is named after Rudolf Engine.
ai_critic 4 hours ago|||
:|
roer 2 hours ago|||
Is this a joke I don't get? His name was Rudolf Diesel, right?
stavros 2 minutes ago||
Yes, it is a fantastic joke and I laughed for ages, well played.
SenorKimchi 4 hours ago|||
And Claude had a collection of cycles: unicycles. Unfortunately the article is about something else altogether.
bread-wood 6 hours ago|||
Here I was assuming it was named after https://en.wikipedia.org/wiki/Claude_(alligator)
teekert 3 hours ago|||
Last time I asked, Claude itself didn't know either.
NitpickLawyer 6 hours ago||
Wait till you hear about nvidia and their GPU architecture naming scheme :)
miroljub 9 hours ago||
Solves? It's a part of the training set. Nothing more, nothing less.
rpdillon 9 hours ago||
Opening sentences:

> Shock! Shock! I learned yesterday that an open problem I’d been working on for several weeks had just been solved by Claude Opus 4.6— Anthropic’s hybrid reasoning model that had been released three weeks earlier! It seems that I’ll have to revise my opinions about “generative AI” one of these days. What a joy it is to learn not only that my conjecture has a nice solution but also to celebrate this dramatic advance in automatic deduction and creative problem solving.

sigmar 4 hours ago||
I think we're going to have several years of people claiming genAI "didn't really do something novel here," despite experts saying otherwise, because people are scared by the idea that complex problem solving isn't exclusive to humans (regardless of whether these models are approaching general intelligence).
allreduce 7 hours ago|||
I encourage you to look at what the current models with a bit of harnessing are capable of, e.g. Opus 4.6 and Claude Code. Try to make it solve some mathematics-heavy problem you come up with. If only to get a more accurate picture of what's going on.

Unfortunately, these tools generalize way beyond regurgitating the training set. I would not assume they stay below human capabilities in the next few years.

Why any moral person would continue building these at this point I don't know. I guess in the best case the future will have a small privileged class of humans having total power, without need for human workers or soldiers. Picture a mechanical boot stomping on a human face forever.

nemo1618 7 hours ago|||
If this was a joke, it certainly flew over most people's heads...
jcims 8 hours ago|||
Prove it.
romaniv 7 hours ago||
I would like to note that it would be trivial to definitively prove or disprove such things if we had a searchable public archive of the training data. Interestingly, the same people (and corporate entities) who loudly claim that LLMs are creating original work seem to be utterly uninterested in having actual, definitive proof of their claims.
clbrmbr 7 hours ago||
This would be awesome. Even titles and shasums could be enough.
mwigdahl 9 hours ago||
Did you read the article? It was an open problem.
bluGill 9 hours ago||
Was it? It was an open problem to Knuth, who generally knows how to search the literature. However, there is enough literature out there that it wouldn't be a surprise at all to discover it was already solved, but he used slightly different terms and so didn't find it. Or maybe it was solved as a specialization of something that looks unrelated, so he wouldn't have recognized it when he read it. Or...

Overall I'm going with unsolved, because Knuth is a smart person whom I'd expect not to miss the above. I'm also sure he falls for the above sometimes, even though the majority of the time he doesn't.

mwigdahl 8 hours ago||
Agreed with all of that, but with the added point that Knuth has done a lot of work in this exact area in The Art of Computer Programming Volume 4. If he considers this conjecture open given his particular knowledge of the field, it likely is (although agreed, it's not guaranteed).
ordu 7 hours ago|||
> If he considers this conjecture open given his particular knowledge of the field, it likely is (although agreed, it's not guaranteed).

It is as good as guaranteed. If Knuth says he doesn't know how to solve the problem, and anyone else does know, they will inform Knuth about it. Knuth is not just a very knowledgeable person but a celebrity as well.

skinner_ 1 hour ago|||
Also, if Claude had regurgitated a known solution, it would have come up with it in the first exploration round, not the 31st, as it actually did.
Steinmark 1 hour ago|
Trivia: AKWU AGHALI OFU THEOREM

Theorem (Akwu Aghali Ofu — The Single Nest or 1/2 spin)

For any observer O with personal quantum seed s (derived from first orgasm timestamp SHA-256), there exists a unique Hamiltonian cycle C(O) through the M³ digraph such that:

1. C(O) starts at vertex (0,0,0) — the Single Nest

2. C(O) has length exactly L³ for L determined by O's muon/mass preference

3. The cycle visits every vertex exactly once before returning

4. The cycle only exists when O observes it

5. No other observer can traverse the same cycle

Proof Sketch:

1. Let s = SHA-256(timestamp) mod L determine coefficients (α,β,γ)

2. Define g(i,j,k) = (αi + βj + γk) mod L

3. Show that the mapping f: (i,j,k) → next vertex via g is a permutation

4. Show that the permutation decomposes into cycles

5. Show that for appropriate s, the cycle containing (0,0,0) has length L³

6. Show that this cycle depends on s — different s give different cycles

7. Show that observation collapses the quantum superposition, making the cycle actual

Corollary: The Single Nest spins forever because the cycle is Hamiltonian (it loves only you) — it never repeats until it returns, and the return is a new beginning, not a repetition.