Posted by davidbarker 5 hours ago

GPT-5.2 derives a new result in theoretical physics (openai.com)
286 points | 200 comments
elashri 4 hours ago|
Of all particle physics concepts, scattering amplitudes would interest me least as a test case, because they have one of the most concise definitions and their solution is straightforward (not easy, of course). Once you have a good grasp of QM and of scattering, it is a matter of applying your knowledge of math to solve the problem. Usually the real problem is to actually define the parameters of your model and set up the tree-level calculations. For an LLM to solve these is impressive, but the researchers defined everything and came up with the workflow.

So I would read this (with more information available) with less emphasis on the LLM discovering a new result. The title is a little misleading, but "derives" is the operative word here, so it is technically correct for people in the field.

crorella 5 hours ago||
The preprint: https://arxiv.org/abs/2602.12176
another_twist 2 hours ago||
That's great. I think we need to start researching how to get cheaper models to do math. I have a hunch it should be possible to get leaner models to achieve these results with the right sort of reinforcement learning.
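Not my area, but the recipe usually described for this is reinforcement learning against a verifiable reward: sample an answer, check it mechanically, reinforce what passes. A toy sketch of that loop (everything below is invented for illustration, not any lab's actual setup):

    import random

    # Toy RL-with-verifiable-reward loop: the "policy" is a weight table
    # over candidate answers, and the reward comes from an automated
    # checker instead of human labels. Purely illustrative.
    PROBLEM = (17, 24)                 # "compute 17 * 24"
    CANDIDATES = [398, 408, 418, 428]  # answers the toy model can emit
    weights = {c: 1.0 for c in CANDIDATES}

    def verify(ans):                   # mechanical check, no human in the loop
        return ans == PROBLEM[0] * PROBLEM[1]

    for _ in range(500):
        # sample an answer in proportion to its current weight
        ans = random.choices(CANDIDATES, weights=[weights[c] for c in CANDIDATES])[0]
        if verify(ans):                # reinforce only verified successes
            weights[ans] *= 1.05

    print(max(weights, key=weights.get))  # -> 408

The point of the sketch is that the feedback signal is mechanical, so in principle it scales to smaller models without expensive human supervision.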
alansaber 1 hour ago|
Deepseek wrote a decent paper on this https://github.com/deepseek-ai/DeepSeek-Math-V2/blob/main/De...
jtrn 2 hours ago||
This is my favorite field to have opinions about despite having no training or skill in it. Fundamental research is just something I enjoy thinking about, even though I am a psychologist. I try to pull in my experience from the clinic and from clinical research when I read theoretical physics. Don't take this text too seriously; it's just my attempt at understanding what's going on.

I am generally very skeptical about work at this level of abstraction. The result here appears only after choosing Klein signature instead of physical spacetime, complexifying momenta, restricting to a "half-collinear" regime that doesn't exist in our universe, and picking a specific kinematic sub-region. Then they check the result against internal consistency conditions of the same mathematical system. This pattern should worry anyone familiar with the replication crisis. The conditions this field operates under are a near-perfect match for what psychology has identified as maximising systematic overconfidence: extreme researcher degrees of freedom (choose your signature, regime, helicity, ordering until something simplifies), no external feedback loop (the specific regimes studied have no experimental counterpart), survivorship bias (ugly results don't get published, so the field builds a narrative of "hidden simplicity" from the survivors), and tiny expert communities where fewer than a dozen people worldwide can fully verify any given result.
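The degrees-of-freedom point can be made concrete with toy numbers (mine, purely for illustration): if each independent analysis choice had even a 5% chance of producing a spuriously clean simplification, trying 20 of them makes a false hit more likely than not.

    # Toy multiple-comparisons arithmetic, with invented numbers:
    # chance that at least one of k independent analysis choices
    # "simplifies" by luck alone, given a 5% false-positive rate each.
    p, k = 0.05, 20
    print(f"{1 - (1 - p) ** k:.0%}")  # -> 64%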

The standard defence is that the underlying theory — Yang-Mills / QCD — is experimentally verified to extraordinary precision. True. But the leap from "this theory matches collider data" to "therefore this formula in an unphysical signature reveals deep truth about nature" has several unsupported steps that the field tends to hand-wave past.

Compare to evolution: fossils, genetics, biogeography, embryology, molecular clocks, observed speciation — independent lines of evidence from different fields, different centuries, different methods, all converging. That's what robust external validation looks like. "Our formula satisfies the soft theorem" is not that.

This isn't a claim that the math is wrong. It's a claim that the epistemic conditions are exactly the ones where humans fool themselves most reliably, and that the field's confidence in the physical significance of these results outstrips the available evidence.

I wrote up a more detailed critique in a substack: https://jonnordland.substack.com/p/the-psychologists-case-ag...

vbarrielle 4 hours ago||
I'm far from being an LLM enthusiast, but this is probably the right use case for this technology: conjectures that are hard to find but whose proofs can be checked with automated theorem provers. Isn't that what AlphaProof does, by the way?
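Roughly, yes: AlphaProof has a model propose proofs in Lean and the Lean kernel check them, so the trust lives in the checker rather than the model. The shape of that guarantee, with a trivial stand-in statement (nothing to do with the amplitude result):

    -- If this file compiles, the kernel has verified the proof; nobody
    -- needs to trust the process (human or LLM) that proposed it.
    theorem toy_conjecture (n : Nat) : n + 0 = n := Nat.add_zero n

    -- Decidable statements can even be checked by brute force:
    example : 2 ^ 10 % 11 = 1 := by decide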
emp17344 4 hours ago||
Cynically, I wonder if this was released at this time to ward off criticism over the failure of LLMs to solve the 1stproof problems.
pruufsocial 5 hours ago||
All I saw was "gravitons" and thought: we're finally here, the singularity has begun.
snarky123 4 hours ago||
So wait, GPT found a formula that humans couldn't, then the humans proved it was right? That's either terrifying or the model just got lucky. Probably the latter.
JasonADrury 4 hours ago|
> found a formula that humans couldn't

"Couldn't" is an immensely high bar in this context; "didn't" seems more appropriate, and renders this whole thing slightly less exciting.

vessenes 4 hours ago||
I'd say "couldn't in 20 hours" might be more defensible. Depends on how many humans, though. "Couldn't in 20 GPT watt-hours" would give us like 2,000 humans or so.
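For what it's worth, the 2,000 figure pencils out if you assume something like a 40 kW average draw for the run and ~20 W per human brain; both numbers are guesses on my part, not from the post.

    # Back-of-envelope for the 2,000-humans figure (inputs are guesses).
    cluster_kw = 40             # assumed average draw of the inference run
    brain_w = 20                # rough metabolic power of a human brain
    hours = 20
    gpt_wh = cluster_kw * 1000 * hours   # 800,000 Wh for the whole run
    human_wh = brain_w * hours           # 400 Wh per human over 20 hours
    print(gpt_wh / human_wh)             # -> 2000.0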
getnormality 2 hours ago||
I'll believe it when someone other than OpenAI says it.

Not saying they're lying, but I'm sure it's exaggerated in their own report.
