Posted by _alternator_ 9 hours ago

A recent experience with ChatGPT 5.5 Pro (gowers.wordpress.com)
https://twitter.com/wtgowers/status/2052830948685676605

https://xcancel.com/wtgowers/status/2052830948685676605

330 points | 184 comments | page 2
zkmon 1 hour ago|
>> but it was definitely a non-trivial extension of those ideas, and for a PhD student to find that extension it would be necessary to invest quite a bit of time digesting Isaac’s paper

The "non-trivial" is for human abilities. The weights lifted by a crane are also "non-trivial". People keep getting amazed at machine's abilities. Just like a radio telescope can see things humans can't, microscope can see the detail humans can't, we need not be amazed. The sensory perception of patterns is at different level for AI. It's a machine.

svnt 1 hour ago|
Too many people are wrapped around the ego axle, assuming their ideas are both part of who they are and somehow unique and special.

It usually takes dissolving that assumption, often through difficult experiences, before they can see idea generation as something machine-like, something that can be separated from them.

amelius 2 hours ago||
Makes sense, as a mathematician basically has two powers: (1) intuition and (2) an enormous amount of mental stamina. A mathematician builds their intuition by reading maths books, so it is not surprising that an LLM is well equipped to take over the mathematician's tasks.
momojo 5 hours ago||
Sorry, I'm reposting a comment I made yesterday that seems fitting:

> This reminds me of Antirez's "Don't fall into the anti-AI hype". In a sentence: these foundation models are really good at optimizing these extremely high-level, extremely well-defined problem spaces (i.e. multiply matrices faster). In Antirez's case, it's "make Redis faster".

dabinat 5 hours ago||
I feel like this experiment was successful because those prompting the AI were knowledgeable enough to ask the right questions and verify the output was correct. This shows that there is still a place for expertise, even if the LLM does the actual research.
colechristensen 5 hours ago|
I feel my input to LLMs is most valuable in the initial idea and in big-picture design tweaks, and the vast majority of my usefulness is negative feedback: this looks wrong, you've gotten off track, you're cheating with workarounds, you're falling into a rabbit hole, etc.
lysecret 3 hours ago||
There is a great recent episode of Latent Space about a similar topic; it's worth a watch even with the clickbaity thumbnail and title: https://youtu.be/9d899Ram9Bs?is=pQMoVmlWVsTNKfRK
arjie 2 hours ago||
The question of where the creative input lies was a big thing around Experiments in Musical Intelligence and co-composing. But it seems perhaps that it's a transient state we needn't spend too much effort on. The machine has failed to disappoint repeatedly. Perhaps this is as far as it gets, or perhaps we will be like the people in "Catching Crumbs from the Table" by Ted Chiang, where almost all science is interpretation of papers produced by vastly greater intellects.
iTokio 6 hours ago||
On complex problems with lengthy proofs, the first step I would take is to ask 5.5 Pro, in a new, unrelated session, to be very critical and to try to find flaws in the arguments.

And certainly not to send it to a colleague to ask for their opinion first.
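
For instance, a minimal sketch of that kind of adversarial review pass, assuming the OpenAI Python client and a hypothetical model name (not something the blog post prescribes):

    # Ask a fresh session, with no context from the drafting session,
    # to act as a hostile referee on the generated proof.
    from openai import OpenAI

    client = OpenAI()

    with open("draft_proof.txt") as f:  # the LLM-produced argument to check
        draft = f.read()

    review = client.chat.completions.create(
        model="gpt-5.5-pro",  # hypothetical name; substitute whatever model you actually use
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a very critical referee. Try hard to find flaws, "
                    "gaps, or unjustified steps in the following argument."
                ),
            },
            {"role": "user", "content": draft},
        ],
    )

    print(review.choices[0].message.content)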

LLMs are certainly becoming capable of writing code, finding vulnerabilities, and solving mathematical problems, but we need to avoid putting their work into production, or in front of other humans, without assessing it by every possible means.

Otherwise tech leads, maintainers, and experts get overwhelmed, and this is how "AI slop" fatigue begins.

To be clear I’m talking about this step:

> That preprint would have been hard for me to read, as that would have meant carefully reading Rajagopal’s paper first, but I sent it to Nathanson, who forwarded it to Rajagopal, who said he thought it looked correct.

NitpickLawyer 6 hours ago|
> but we need to avoid putting their work into production, or in front of other humans, without assessing it by every possible means.

I think this is good advice in general, maybe with an emphasis on public vs. private, friendly contact. Having zero-thought AI slop thrown at you out of the blue is rude; "could have been a prompt" indeed. But having a friend or colleague ask for a quick glance at something they know you handle well is another story for me.

If I've worked on a subject for a few years, and know the particulars in and out, I'd have no trouble skimming something that a friend or a colleague sent me. I am sparing those 5-10 minutes for the friend, not for what they sent. And for an expert in a particular domain, often 5 minutes is all it takes for a "lgtm" or "lol no".

fulafel 4 hours ago||
Link to source blog post: https://gowers.wordpress.com/2026/05/08/a-recent-experience-...
dang 4 hours ago|
That's the top link (i.e. that the title is linked to), no?
fulafel 5 minutes ago||
Indeed, the body of the post made me think it was a URL-less submission.
zingar 2 hours ago||
The post talks about LLM+human contributions being recognized in a different category from human-only ones. But is it possible to spot the difference between the two?
adammdaw 5 hours ago|
This is certainly interesting, though I would say that, based on my understanding of how the current models work, combinatorial problems would be an area where they could be particularly successful. They are pretty good at combinatorial creativity; it's the exploratory and transformational aspects that are still pretty tricky, and I expect those would come to bear in other areas of mathematics.
hodgehog11 4 hours ago|
Indeed, analysis is a bit looser in its arguments, so I've found LLMs tend to make more mistakes there.