Posted by tedsanders 5 hours ago

An OpenAI model has disproved a central conjecture in discrete geometry (openai.com)
606 points | 410 comments | page 2
zone411 1 hour ago|
I actually tried using GPT-5.5 Pro on this problem recently. It thought it was making progress on one path, but it made so many mistakes that it didn't feel worth pushing further. It'll be interesting to check whether it's the same route. I got partial results (proved in Lean) that improve on the best-known results for four Erdős problems with GPT-5.5 Pro.
recitedropper 3 hours ago||
This is impressive, no question.

Without knowing all this model has been trained on though, it is pretty hard to ascertain the extent to which it arrived at this "on its own". The entire AI industry has been (not so secretly) paying a lot of experts in many fields to generate large amounts of novel training data. Novel training data that isn't found anywhere else (they hoard it) and which could actually contain original ideas.

It isn't likely that someone solved this and then just put it in the training data, although I honestly wouldn't put that past OpenAI. More interesting though is the extent to which they've generated training data that may have touched on most or all of the "original" tenets found in this proof.

We can't know, of course. But until these things are built in a non-clandestine manner, this question will always remain.

Rover222 3 hours ago|
Seems like a very tin-foil-hat take to me
net01 2 hours ago|||
I’m quite certain that a few months ago, some problems were claimed to be solved by AI. However, those claims were actually false: they were Erdős problems that had already been solved but were not marked as solved, and the solution was "found" by AI.

edit: >> https://techcrunch.com/2025/10/19/openais-embarrassing-math/

jiggawatts 2 hours ago||
The corollary is that this is a very valuable capability of AI!

The ability to find incredibly obscure facts and recall them to solve "officially unsolved" problems in minutes is like Google Search on steroids. In some sense, it is one core component of "deep expertise", and humans rely on the same methodology regularly to solve "hard" problems. Many mathematicians have said that they all just use a "bag of tricks" they've picked up and apply them to problems to see if they work. The LLMs have a huge bag of very obscure tricks, and are starting to reach the point that they can effectively apply them also.

I suspect the threshold of AGI will be crossed when the AIs can invent novel "tricks" on their own, and memorise their own new approach for future use without explicitly having to have their weights updated with "offline" training runs.

mrdependable 2 hours ago||||
How is that a "tin-foil-hat" take? It's not a secret, and in fact widely reported, that these companies are spending billions on creating training data.
dmix 1 hour ago||
So you think that OpenAI paid some mathematicians to either solve this conjecture or produce a bunch of unpublished math related to it, then fed it into an LLM so they could announce it as being solved by the model? How is that not a conspiracy theory?
mrdependable 34 minutes ago||
It is just a theory; the conspiracy part is not really applicable. I don't see what is controversial about it. Are you implying the machine taught itself the mathematics to do all this?
recitedropper 2 hours ago|||
I'm not letting the government read my brainwaves.

In all seriousness though: My suggestion is that those shepherding the frontier of AI start acting with more transparency, and stop acting in ways that encourage conspiratorial thinking. Especially if the technology is as powerful as they market it to be.

aurareturn 5 hours ago||
One thing that seems certain is that OpenAI models hold a distinct lead in academia over Anthropic and Google models.

For those in academia, is OpenAI the vendor of choice?

Jcampuzano2 4 hours ago||
OpenAI specifically targeted Academia a lot and gave out a lot of free/unlimited usage to top academics and universities/researchers.

They also offer grants you can apply for as a researcher. I'm sure other labs may have this too but I believe OpenAI was first to this.

tracerbulletx 4 hours ago|||
Hasn't AlphaFold been used to make real discoveries for a few years now?
KalMann 3 hours ago||
I think he's talking about reasoning models.
karmasimida 4 hours ago|||
I think the mathematicians on X are all using GPT 5.5 Pro
bayindirh 4 hours ago|||
From my limited testing, Gemini can dig out hard-to-find information, provided you detail your prompt enough.

Given that Google is the "web indexing company", finding hard-to-find things is natural for their models, and this is the only thing I need these models for.

If I can't find it after a week of digging through the internet, I give it a colossal prompt, and it digs out what I'm looking for.

senrex 3 hours ago||
This is my experience too. Gemini and Gemini deep research are awesome. Claude's deep research is pretty bad really relative to ChatGPT or Gemini. Overall, I still love Claude the best but it is not what I would want to use if I wanted to really dig into deep research. The export to google docs in Gemini deep research is tough to beat too. I haven't used Gemini since January but have probably years of material from saved deep research in google docs. Almost an overwhelming amount of information when I dive into what I saved.
FloorEgg 4 hours ago|||
Gemini seems better trained for learning and I think Google has made a more deliberate effort to optimize for pedagogical best practices. (E.g. tutoring, formative feedback, cognitive load optimization)

As far as academic research is concerned (e.g. this thread's topic), I can't say.

astrange 22 minutes ago|||
Gemini the chatbot has a very strange personality that intensely overindexes on your user profile and absolutely loves insane mixed metaphors.

Its explanations are quite good but they're also hard to understand because it keeps trying to relate everything back to programming metaphors or what it thinks it knows about the streets in the neighborhood I live in.

snaking0776 4 hours ago||||
Agreed, I usually use Gemini for explaining concepts and ChatGPT for getting things done on research projects.
aurareturn 4 hours ago||||
Yes, I meant academic research.
cute_boi 4 hours ago|||
Gemini is like someone with short-term memory loss; after the first response, it forgets everything. That being said, I have checked multiple models, and Gemini can sometimes give an accurate answer.
FloorEgg 57 minutes ago||
Gemini is a series with a lot of individual models.

What you are describing doesn't match my experience at all with Gemini 3 or 3.1, especially the pro version.

causal 4 hours ago|||
A simpler explanation is that more people are using ChatGPT
logicchains 3 hours ago||
OpenAI models seem to have been trained on a lot of auto-generated theorem proving data; GPT 5.5 is really good at writing Lean.
endymi0n 4 hours ago||
To paraphrase Gwynne Shotwell: “Not too bad for just a large Markov chain, eh?”
rhubarbtree 4 hours ago|
Erdos, or the model?
dwroberts 4 hours ago||
Would be interesting to know what kind of preparatory work actually went into this - how long did it take to construct an input that produced a real result, and how much input did they get from actual mathematicians to guide refining it
lacewing 2 hours ago||
Why?

It's clearly not yet a tool that can deliver new math at scale. I say this because otherwise, the headline would be that they proved / disproved a hundred conjectures, not one. This is what happened with Mythos. You want to be the AI company that "solved" math, just like Anthropic got the headlines for "solving" (or breaking?) security.

The fact they're announcing a single success story almost certainly means that they've thrown a lot of money at a lot of problems, had experts fine-tuning the prompts and verifying the results, and it came back with a single "hit". But that doesn't make the result less important. We now have a new "solver" for math that can solve at least some hard problems that weren't getting solved before.

Whether that spells the end of math as we know it... I don't think so, but math is a bit weird. It's almost entirely non-commercial: it's practiced chiefly in academia, subsidized from taxes or private endowments, and almost never meant to solve problems of obvious practical importance. In that sense, it's closer to philosophy than, say, software engineering. No philosopher is seriously worried about LLMs taking philosopher jobs even though a chatbot can write an essay, but mathematicians painted themselves into a different corner, I think.

OkWing99 1 hour ago||
Says in the papers. "...which was first mathematically generated in one shot by an internal model at OpenAI, and then expositionally refined through human interactions with Codex."

The prep work doesn't really matter; what they say is that it's a one-shot result, achieved by AI. The blog doesn't claim it was done by a currently public model.

Jeff_Brown 4 hours ago||
Can anyone find (or draw) a picture of the construction?
gibspaulding 4 hours ago||
This is only a proof that a field with more connections is possible, not what it looks like.

I’m very out of my depth, but the structure of the proof seems to follow a pattern similar to a proof by contradiction, where you’d say, for example, “assume for the sake of contradiction that the previously known limit is the highest possible” and then prove that if that statement is true you get some impossible result.

ninjha 4 hours ago|||
They only proved that one exists; computing the actual construction is non-obvious (the naive way to construct it is computationally infeasible).
pradn 4 hours ago|||
They have a "before" picture but not an "after"!
paulddraper 3 hours ago|||
Yeah, unfortunately, they just proved that a better solution exists; they didn't construct it.

(Though in some ways that's actually more impressive.)

Fraterkes 4 hours ago||
I guess if this stuff is going to make my employment more precarious, it’d be nice if it also makes some scientific breakthroughs. We’ll see
ausbah 4 hours ago||
shame we won’t see any of these medical breakthroughs when we all lose our jobs and thus our healthcare
karmasimida 4 hours ago||
There is a world where AI makes medical breakthroughs, but there is zero guarantee they are going to be affordable
cubefox 4 hours ago||
Breakthroughs in pure mathematics aren't scientific though. They tell us nothing about the world, and they are not useful.
dwa3592 2 hours ago||
A few questions that the blog did not answer; if anyone knows, that'd be great:

- Does anyone know if this was 1 minute of inference or 1 month?

- How many times did the model say it was done disproving before it was found out that the model was wrong/hallucinating?

- One of the graphs says the model produced the right answer almost half the time at peak compute??? Did I understand that right? What does peak compute mean here?

overgard 1 hour ago||
I think it's worth being skeptical of this... there's a way-too-common pattern of "AI Lab Shows AI Doing Something Only Humans Can Do" only for a bunch of important caveats and limitations to be discovered after the initial hype. And of course, the correction never seems to be as viral as the hype. I'll believe it when a mathematician actually reads the 100+ pages of reasoning.
throwaway2027 4 hours ago|
Not to dismiss the AI, but the important part is that you still need someone able to recognize these solutions in the first place. A lot of things were just hidden in plain sight before AI, but no one noticed, or they didn't have the framework, either in maths or any other field they specialize in, to recognize those feats.