Posted by tedsanders 4 hours ago
Ayer, and in a different way early Wittgenstein, held that mathematical truths don’t report new facts about the world. Proofs unfold what is already implicit in axioms, definitions, symbols, and rules.
I think that idea is deeply fascinating, AND I have no problem with the fact that we still credit mathematicians with discoveries.
So either “recombining existing material” isn’t disqualifying, or a lot of Fields Medals need to be returned.
I'd say yes, LLMs "just" recombine things. I still don't think that if you trained an LLM on every pre-Newton/Leibniz algebra/geometry/trig text available, it could create calculus. (I'm open to being proven wrong.) But stuff like this is exactly the type of innovation LLMs are great at, and that doesn't discount the need for humans to also be good at "recombinant" innovation. We still seem to be able to do a lot that they cannot in terms of synthesizing new ideas.
Yes but that is because there was not enough text available to create an intelligent LLM to begin with.
Also we shouldn’t be thinking about what LLMs are good at, but rather what any computer ever might be good at. LLMs are already only one (essential!) part of the system that produced this result, and we’ve only had them for 3 years.
Also also this is a tiny nitpick but: the Fields Medal is awarded every 4 years, AFAIR. For that exact reason, probably!
It's amazing to me when people talk about recombining things, or following up on things, as somehow lesser work.
People can't set aside the perspective they were given when they learned the concepts, a perspective that those who developed the concepts didn't have, because the concepts didn't exist yet.
Simple things are hard, or everything simple would have been done hundreds of years ago, and that is certainly not the case. Seeing something others have not noticed is very hard, when we don't have the concepts that the "invisible" things right in front of us will teach us.
That Newton and Leibniz came up with similar ideas in parallel, independently, around the same time (what are the odds?), supports that.
https://en.wikipedia.org/wiki/Leibniz%E2%80%93Newton_calculu...
The experiment is feasible. If it were performed and produced a positive result, what would it imply/change about how you see LLMs?
Besides, we can forecast our thoughts and actions to imagined scenarios unconditioned on their possibility. Something doesn't have to be possible for us to imagine our reactions.
There are people working on this.
Imagine every bit of human knowledge as a discrete point within some large high dimensional space of knowledge. You can draw a big convex hull around every single point of human knowledge in a space. A LLM, being trained within this convex hull, can interpolate between any set of existing discrete points in this hull to arrive at a point which is new, but still inside of the hull. Then there are points completely outside of the hull; whether or not LLMs can reach these is IMO up for debate.
Reaching new points inside of the hull is still really useful! Many new discoveries and proofs are these new points inside of the hull; arguably _most_ useful new discoveries and proofs are these. They're things that we may not have found before, but you can arrive at by using what we already have as starting points. Many math proofs and Nobel Prize winning discoveries are these types of points. Many haven't been found yet simply because nobody has put the time or effort towards finding them; LLMs can potentially speed this up a lot.
Then there are the points completely outside of the hull, which cannot be reached by extrapolation/interpolation from existing points and require genuine novel leaps. I think one candidate example of this type of point is the leap from Newtonian physics to general relativity. Demis Hassabis had a whole point about training an AI with a physics knowledge cutoff date before 1915, then showing it the orbit of Mercury and seeing if it can independently arrive at general relativity as an evaluation of whether or not something is AGI. I have my doubts that existing LLMs can make this type of leap. It’s also true that most _humans_ can’t make these leaps either; we call Einstein a genius because he alone made the leap to general relativity. But at least while most humans can’t make this type of leap, we have existence proofs that every once in a while one can; this remains to be seen with AI.
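To make the hull picture above concrete, here is a minimal 2D sketch in Python. It is only a toy: the points, the names `convex_hull` and `inside_hull`, and the choice of two dimensions are all assumptions made for illustration, and nothing here says how "knowledge space" should actually be embedded (in high dimensions the geometry behaves quite differently).

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # Drop the last point of each chain: it repeats the other chain's first point.
    return lower[:-1] + upper[:-1]


def inside_hull(hull: List[Point], q: Point) -> bool:
    """True if q is inside or on the boundary of a counter-clockwise hull."""
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], q) >= 0 for i in range(n))


# Hypothetical "known" points and two candidate "discoveries".
known = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (2.0, 1.0)]
hull = convex_hull(known)
interpolated = (2.0, 1.5)   # average of two opposite corners: inside the hull
leap = (6.0, 5.0)           # not a convex combination of the known points

print(inside_hull(hull, interpolated))
print(inside_hull(hull, leap))
```

In this toy, any weighted average of known points lands inside the hull, while `leap` does not; whether that distinction maps onto anything real about LLMs is exactly what the thread is debating.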
This doesn't make any sense, by their nature they can't "guess-and-check" things outside their training set.
If you have a multi dimensional space, and you are trying to compute which points lie “inside” some boundary, there are large areas that will be bounded by some dimensions but not others. This is interesting because it means if you have a section bounded by dimensions A, B, and C but not D, you could still place a point in D, and doing so then changes your overall bounds.
I think this is how much of human knowledge has progressed (maybe all non-observational knowledge). We make observations that create points, and then we derive points within the created space, and that changes the derivable space, and we derive more points.
I don’t see why AI couldn’t do the same (other than technical limitations related to learning and memory).
Most discoveries are indeed implied from axioms, but every now and then, new mathematics is (for lack of a better word) "created"—and you have people like Descartes, Newton, Leibniz, Gauss, Euler, Ramanujan, Galois, etc. that treat math more like an art than a science.
For example, many believe that to solve the Riemann Hypothesis, we likely need some new kind of math. Imo, it's unlikely that an LLM will somehow invent it.
A scientist has to extract the "Creation" from an abstract dimension using the tools of "human knowledge". The creativity is often selecting the best set of tools or recombining tools to access the platonic space. For instance a "telescope" is not a new creation, it is recombination of something which already existed: lenses.
How can we truly create something ? Everything is built upon something.
You could argue that even "numbers" are a creation, but are they ? Aren't they just a tool to access an abstract concept of counting ? ... Symbols.. abstractions.
Another angle to look at it: even in dreams, do we really create something new? Or do we dream about "things" (i.e. data) we have ingested in our waking life? Someone could argue that dreams truly create something, as the exact set of events never happened anywhere in the real world... but we all know that dreams are derived: derived from brain chemistry, experiences and so on. We may not have the reduction of how each and every thing works.
Just like energy is conserved, IMO everything we call "created" is just a changed form of "something". I fully believe LLMs (and humans) both can create tools to change the forms. Nothing new is being "created", just convenient tools which abstract upon some nature of reality.
Humans and animals have intuitive notions of space and motion since they can obviously move. But, symbolizing such intuitions into forms and communicating that via language is the creative act. Birds can fly, but can they symbolize that intuitive intelligence to create a theory of flight and then use that to build a plane ?
Well I think the point is there is no "new kind of math". There's just types of math we've discovered and what we haven't. No new math is created, just found.
We're not comparing math to reality (though there's a strong argument to be made that reality has a structure that is mathematical in nature - structural realism didn't die as a scientific philosophy just because someone came up with a pithy saying), we're talking about whether math is discovered or invented.
Most mathematicians would argue both - math is a language, we have created operations, axioms are proposed based on human creativity, etc., but the actual laws, patterns, etc. are discovered. Pi is going to be pi no matter if you're a human or someone else - we might represent it differently with some other number system or whatever, but that's a matter of representation, not mathematical truth.
It seems that addition (for instance) was "created" long before us.
On the other hand, it seems highly unlikely that a civilization similar to ours could "invent" an essentially different kind of mathematics (or physics, etc.)
I know of no realm where mathematical objects live except human minds.
No, it seems clear to me that mathematics is a creation of our minds.
This is also true for established theorems! We can imagine mathematical universes (toposes) where every (total) function on the reals is continuous! Even though it is an established theorem that there are discontinuous functions! We just need to replace a few axioms (chuck out the law of the excluded middle, and throw in some continuity axioms).
However, if that idea about new math is correct, we, in theory, don’t need new math to (dis)prove the Riemann hypothesis (assuming it is provable or disprovable in the current system).
In practice we may still need new math because a proof of the Riemann hypothesis using our current arsenal of mathematical ‘objects’ may be enormously large, making it hard to find.
I honestly don't know personally either way. Based on my limited understanding of how LLMs work, I don't see them making the next great song or next great book, and based on that reasoning I'm betting that they probably won't be able to do whatever the next "Descartes, Newton, Leibniz, Gauss, Euler, Ramanujan, Galois" are going to do.
Of course, if AI as a wider field comes up with something more powerful than LLMs, that would be different.
Meanwhile, songs are hitting number one on some charts on Spotify that people think are humans and are actually AI. And Spotify has to start labelling them as such. One AI "band" had an entire album of hits.
Also - music is subjective. Mathematics isn't.
And in this case, an LLM discovered a new way to reason about a conjecture. I don't know how much proof is needed - since that is literally proof that it can be done.
There are quite a few questions around that. Music is subjective and obviously different people have different tastes, but I wouldn't call any of them actual good music / real hits.
>> LLM discovered a new way to reason about a conjecture
I wasn't questioning LLMs' ability to prove things. Parent threads were talking about building a new kind of maths, or approaching it in a creative/artistic way. That's what I was referring to.
I can't speak for maths or hard science as I'm not trained in that, but the creativity aspect in code is definitely lacking when it comes to LLMs. May not matter down the line.
because I have no basis for assuming an LLM is fundamentally capable of doing this.
"Never shall I be beaten by a machine!”
In 1997 he lost to Deep Blue.
Train an LLM only on texts dated prior to Newton and see if it can create calculus, derive the equations of motion, etc.
If you ask it about the nature of light and it directs you to do experiments with a prism I'd say we're really getting somewhere.
[1] Obviously Newton counts as one. Leibniz like Newton figured out calculus. Other people did important work in dynamics though no one else's was as impressive as Newton's. But the vast majority of human-level intelligences trained on texts prior to Newton did not create calculus or derive the equations of motion or come close to doing either of those things.
Incidentally, similar conversations were had about ML writ large vs. classical statistics/methods, and now they've more or less completely died down since it's clear who won (I'm not saying classical methods are useless, but rather that it's obvious the naysayers were wrong). I anticipate the same trajectory here. The main difference is that because of the nature of the domain, everyone has an opinion on LLMs, while the ML vs. statistics battle was mostly confined within technical/academic spaces.
Dang/Tomhow, are you reading this? Would it make sense to modify your slop filter to avoid auto-flagging/killing replies that credit the LLM explicitly? Otherwise valid discussions will continue to get hosed.
I can assure you, the percentage of people who can do what they do when it comes to crafting terms, and related sets of terms, for nuanced and novel ideas is very very small.
It happens this is something I do nearly every day.
Models respond to the level of dialogue you have with them. Engage with an informed perspective on terminological issues and they respond with deep perspectives.
I am routinely baffled at the things people say models can't do, that they do effortlessly. Interaction and having some skill to contribute helps here.
In the end, creativity has always been a combination of chance and the application of known patterns in new contexts.
If you know anything about the invention of new math (analytic geometry, Calculus, etc.), you'd know how untrue this is. In fact, Calculus was extremely hand-wavy and without rigorous underpinnings until the mid 1800s. Again: more art than science.
If anything, they were fighting an uphill battle against the perception of hand-waving by their contemporaries.
That idea wasn’t formally defined until 134 years later, with epsilon-delta by Cauchy. Only then was it accepted. (I know that there were earlier proofs.)
There are even arguments that the limit existed before Newton and Leibniz, with Archimedes' limits for the value of pi.
Cauchy’s deep understanding of limits also led to the creation of complex function theory.
These forms of creation are hand-wavy not because they are wrong. They are hand wavy because they leverage a deep level of ‘creative-intuition’ in a subject.
An intuition that a later reader may not have and will want to formalize to deepen their own understanding of the topic often leading to deeper understanding and new maths.
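For readers who haven't seen it, the epsilon-delta formalization of a limit referred to above is usually stated as follows (standard textbook form; the symbols $f$, $a$, and $L$ are just the conventional names, not taken from the thread):

```latex
\lim_{x \to a} f(x) = L
\quad\Longleftrightarrow\quad
\forall \varepsilon > 0 \;\; \exists \delta > 0 \;\; \forall x :\;
0 < |x - a| < \delta \;\Longrightarrow\; |f(x) - L| < \varepsilon
```

The intuition ("f(x) gets arbitrarily close to L as x approaches a") was in use long before this statement made it checkable line by line.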
Yes, and it's pretty common knowledge that Calculus was (finally) formalized by Weierstrass in the mid-19th century, having spent almost two centuries in mathematical limbo. Calculus was intuitive, solved a great class of problems, but its roots were very much (ironically) vibes-based.
This isn't unique to Newton or Leibniz, Euler did all kinds of "illegal" things (like playing with divergent series, treating differentials as actual quantities, etc.) which worked out and solved problems, but were also not formalized until much later.
Vibe-what? Vibe-bullshit, maybe; cathedrals in Europe and such weren't built by magic. Ditto with sailing and the like. Tons of mathematics and geometry there, and tons of damn axioms before even the US existed.
Heck, even the Book of The Games from Alphonse X "The Wise" has both a compendium of game rules and even this https://en.wikipedia.org/wiki/Astronomical_chess where OFC being proficient in geometry was mandatory, at least to design the boards.
On Euclid:
https://en.wikipedia.org/wiki/Euclid%27s_Elements
PS: Geometry has tons of grounds for calculus. Guess why.
LLMs are prompted by humans and the right query may make it think/behave in a way to create a novel solution.
Then there's a third factor now: agentic AI system loops with LLMs, where the system can research, try, and experiment in its own loop that's tied to the real world for feedback.
Agentic + LLM + Initial Human Prompter by definition can have it experiment outside of its domain of expertise.
So that's extending the "LLM can't create novel ideas" claim, but I don't think anyone can disagree that the three elements above are enough ingredients for an AI to come up with novel ideas.
We just haven't let AI run wild yet. But it's coming.
AGI has been "just over the horizon" for literal decades now - there have been a number of breakthroughs and AI Winters in the past, and there's no real reason to believe that we've suddenly found the magic potion, when clearly we haven't.
AI right now cannot even manage simple /logic/
That's not creative prompt. That's a driving prompt to get it to start its engine.
You could do that nowadays, and while it may spend $1,000 to $100,000 worth of tokens, it will create something humans haven't done before, as long as you set it up with all its tool calls/permissions.
It won't because even though it looks clever to you, people who /do/ understand math and LLMs understand that LLMs /are/ regurgitating
Why does your LLM need you to tell it to look in the first place? Why isn't just telling us all the answers to unsolved conjectures known and unknown?
Why isn't the LLM just telling us all the answers to all the problems we are facing?
Why isn't the LLM telling us, step by step with zero error, how to build the machine that can answer the ultimate question?
Who decides what the last point is at which it’s OK to provide text to the model and still be able to describe it as creative? (non-rhetorical)
> math more like an art than a science.
That’s a fun turn of phrase, but hopefully we can all agree that math without scientific rigor is no math at all.
> we likely need some new kind of math. Imo, it's unlikely that an LLM will somehow invent it.
Do you think it’s possible/likely that any AI system could? I encourage us to join Yudkowsky in anticipating the knock-on results of this exponential improvement that we’re living through, rather than just expecting chatbots that hallucinate a bit less. In concrete terms: could a thousand LLM-driven agents running on supercomputers (500 of which are dedicated to building software for the other 500) come up with new math?
Maths follows logical (or even mathematical) rigour, not scientific rigour!
* LLMs do just interpolate their training data, BUT-
* That can still yield useful "discoveries" in certain fields, absent the discovery of new mechanics that exist outside said training data
In the case of mathematics, LLMs are essentially just brute-forcing the glorified calculators they run on with pseudo-random data regurgitated along probabilities; in that regard, mathematics is a perfect field for them to be wielded against in solving problems!
As for organic chemistry, or biology, or any of the numerous fields where brand new discoveries continue happening and where mathematics alone does not guarantee predicted results (again, because we do not know what we do not know), LLMs are far less useful for new discoveries so much as eliminating potential combinations of existing data or surfacing overlooked ones for study. These aren't "new" discoveries so much as data humans missed for one reason or another - quack scientists, buried papers, or just sheer data volume overwhelming a limited populace of expertise.
For further evidence that math alone (and thus LLMs) don't produce guaranteed results for an experiment, go talk to physicists. They've been mathematically proving stuff for decades that they cannot demonstrably and repeatedly prove physically, and it's a real problem for continued advancement of the field.
"interpolate" has a technical meaning - in this meaning, LLMs almost never interpolate. It also has a very vague everyday meaning - in this meaning, LLMs do interpolate, but so do humans.
One can argue, new knowledge is just restructured data.
I think the main concern about LLMs is the inherent "generative" aspect leading to hallucinations as a byproduct, because that's what produces the noise. Joint Embedding approaches are an interesting alternative that tries to overcome this, but that's still in the research phase.
Negative numbers were invented to solve equations which only used naturals. Irrationals were invented to solve equations which could be expressed with rationals. Complex numbers were invented to represent solutions to polynomials. So on and so forth. At each point new ideas are invented to answer some previously unanswerable questions. There is a long history of this. "Any closed system has unanswerable questions within itself" is a paraphrase of Gödel's incompleteness theorem.
But note this is more to say that the Tractatus is like PI, not the other way around. And in that, takes like GPs would be considered the "nonsense" we are supposed to "climb over" in the last proposition of Tractatus.
The proof relies on extremely deep algebraic number theory machinery applied to a combinatorial geometry problem.
Two humans expert enough in either of those totally separate domains would have to spend a LONG time teaching each other what they know before they would be able to come together on this solution.
Or like a musical octave has only 12 semitones, so all music is just a selection from a finite set that already existed.
Sure the insane computation we're throwing at this changes our perspective, but still there is an important distinction.
Like, "does the Riemann zeta function have zeroes that don't have real part 1/2," or "is there a better solution to the Erdős Unit Distance Problem."
The selection of question is matter of taste, but once selected, there is a definitive precise answer.
Who knew Obi-Wan was just smoking and pontificating on Wittgenstein.
(uv)(vu) = (uu)(vv)
Shows up as a primitive structure, quite often. If you switch to degree-3 or generator-3 then the coverage is, essentially, empty: mathematics has analyzed only a few of the hundreds (thousands? it's hard to enumerate) naturally occurring algebraic structures in that census.
Isn't this exactly what chain-of-thought does? It's doing computation by emitting tokens forward into its context, so it can represent states wider than its residuals and so it can evaluate functions not expressed by one forward pass through the weights. It just happens to look like a person thinking out loud because those were the most useful patterns from the training data.
An LLM generating Arc code is using the LISP patterns it learnt from training, maybe patterns from other programming languages too.
And yet LLM/AIs can't count parentheses reliably.
For example, if you take away the "let" forms from Claude, which forces it to desugar them to "lambda" forms, it will fail very quickly. This is a purely mechanical transformation and should be error free. The significant increase in ambiguity completely stumps LLMs/AI after about 3 variables.
This is why languages like Rust with strong typing and lots of syntax are so LLM friendly; it shackles the LLM which in turn keeps it on target.
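To show how mechanical the let-to-lambda rewrite mentioned above is, here is a minimal Python sketch. It assumes Scheme-style `let` syntax, `(let ((x e1) (y e2)) body...)`, which may not match the exact dialect the comment has in mind; S-expressions are modelled as nested Python lists, and `desugar` and `show` are hypothetical helper names. It does not handle quoted data or a variable that happens to be named `let`.

```python
def desugar(expr):
    """Rewrite (let ((x e1) (y e2)) body...) as ((lambda (x y) body...) e1 e2).

    Atoms (strings, numbers) are returned unchanged; lists are walked
    recursively so nested let forms are rewritten too.
    """
    if not isinstance(expr, list):
        return expr
    if expr and expr[0] == "let":
        bindings, body = expr[1], expr[2:]
        names = [name for name, _ in bindings]
        values = [desugar(value) for _, value in bindings]
        lam = ["lambda", names] + [desugar(form) for form in body]
        return [lam] + values
    return [desugar(item) for item in expr]


def show(expr):
    """Render a nested list back into parenthesised text."""
    if isinstance(expr, list):
        return "(" + " ".join(show(item) for item in expr) + ")"
    return str(expr)


simple = ["let", [["x", 1], ["y", 2]], ["+", "x", "y"]]
nested = ["let", [["x", 1]],
          ["let", [["y", ["*", "x", 2]]],
           ["+", "x", "y"]]]

print(show(desugar(simple)))  # ((lambda (x y) (+ x y)) 1 2)
print(show(desugar(nested)))  # ((lambda (x) ((lambda (y) (+ x y)) (* x 2))) 1)
```

The rewrite itself is a few lines; what the comment describes as hard for a model is tracking which argument lines up with which parameter once several of these are nested by hand.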
E.g. training on physics knowledge prior to 1915, then attempting to get from classical mechanics to general relativity.
I would claim the graph exists, and seeing it is more of a knowledge problem. Creativity, to me, is the ability to reject existing edges and add nodes to the graph AND mentally test them to some sufficient confidence that a practical attempt will probably work (this is what differentiates it from random guessing).
But, as you become more of an expert on certain problem space (graph), that happens less frequently, and everything trends towards "obvious", or the "creative jumps" are super slight, with a node obviously already there. If you extended that to the max, an oracle can't be creative.
My day job does not include sparse graphs.
That said, I think it’s worth saying that “LLMs just interpolate their training data” is usually framed as a rhetorical statement motivated by emotion and the speaker’s hostility to LLMs. What they usually mean is some stronger version, which is “LLMs are just stochastically spouting stuff from their training data without having any internal model of concepts or meaning or logic.” I think that idea was already refuted by LLMs getting quite good at mathematics about a year ago (Gold on the IMO), combined with the mechanistic interpretability research that was actually able to point to small sections of the network that model higher concepts, counting, etc. LLMs actually proving and disproving novel mathematical results is just the final nail in the coffin. At this point I’m not even sure how to engage with people who still deny all this. The debate has moved on and it’s not even interesting anymore.
So yes, I agree with you, and I’m even happy to say that what I say and do in life myself is in some broad sense an interpolation of the sum of my experiences and my genetic legacy. What else would it be? Creativity is maybe just fortunate remixing of existing ideas and experiences and skills with a bit of randomness and good luck thrown in (“Great artists steal”, and all that.) But that’s not usually what people mean when they say similar-sounding things about LLMs.
They will do their own thing, don't need us. In fact, we will be in the way...
We can choose to study them and their output, but they don't make us better mathematicians...
However, in the role of personal teachers they may allow especially our young generations to reach a deeper understanding of maths (and also other topics) much quicker than before. If everyone can have a personal explanation machine to very efficiently satisfy their thirst for knowledge this may well lead to more good mathematicians.
Of course this heavily depends on whether we can get LLMs’ outputs to be accurate enough.
I'm not even sure why they were invoked. Even disregarding the big technical debunks such as two dogmas, sociologically and even by talking to real mathematicians (see Lakatos, historically, but this is true anecdotally too), it's (ironically) a complete non-question to wonder about mathematics in a logical positivist way.
You can watch a rock roll down a hill and derive the concept for the wheel.
Seems pretty self evident to me
Cracks me up.
What exactly do we think that human brains do?
As in, I would hazard a guess the discovery of the wheel wasn't "pure intelligence", it was humans accidentally viewing a rock roll down a hill and getting an idea.
If we give AI a "body", it will become as creative as humans are.
Maybe computers can help understand better because by now it's pretty clear brains aren't just LLMs.
The pessimists just see a 20W meat computer.
A lot of people across all fields seem to operate in a mode of information lookup as intelligence. They have the memory of solving particular problems, and when faced with a new problem, they basically do a "nearest search" in their brain to find the most similar problem, and apply the same principles to it.
While that works for a large number of tasks this intelligence is not the same as reasoning.
Reasoning is the ability to discover new information that you haven't seen before (i.e. growing a new branch on the knowledge tree instead of interpolating).
Think of it like filling a space on the floor of arbitrary shape with smaller arbitrary shapes, trying to fill as much space as possible.
With interpolation, your smaller shapes are medium size, each with a non rectangular shape. You may have a large library of them, but in the end, there are just certain floor spaces that you won't be able to fill fully.
Reasoning on the flip side is having access to very fine shape, and knowing the procedure of how to stack shapes depending on what shapes are next to it and whether you are on a boundary of the floor space or not. Using these rules, you can fill pretty much any floor space fully.
Yes?
But that's not how new frontiers are conquered - there's a great deal of existing knowledge that is leveraged upon to get us into a position where we think we can succeed, yes, but there's also the recognition that there is knowledge we don't yet have that needs to be acquired in order for us to truly succeed.
THAT is where we (as humans) have excelled - we've taken natural processes, discovered their attributes and properties, and then understood how they can be applied to other domains.
Take fire, for example, it was in nature for billions of years before we as a species understood that it needed air, fuel, and heat in order for it to exist at all, and we then leveraged that knowledge into controlling fire - creating, growing, reducing, destroying it.
LLMs have ZERO ability (at this moment) to interact with, and discover on their own, those facts, nor does it appear to know how to leverage them.
edit: I am going to go further
We have only in the last couple of hundred years realised how to see things that are smaller than what our eyes can naturally see - we've used "glass" to see bacteria, and spores, and we've realised that we can use electrons to see even smaller
We're also realising that MUCH smaller things exist - atoms, and things that compose atoms, and things that compose things that compose atoms
That much is derived from previous knowledge
What isn't, and what LLMs cannot create, is the tools by which we can detect or see these incredibly small things
Said differently, what is prediction but composition projected forward through time/ideas?
Exactly. I also only write one word at a time. Who knows what is going on in order to come up with that word.
The most likely series of next tokens when a competent mathematician has written half of a correct proof is the correct next half of the proof. I've never seen anyone who claims "LLMs just predict the next token" give any definition of what that means that would include LLMs, but exclude the mathematician.
Mathematicians make new discoveries by building and applying mathematical tools in new ways. It is tons of iterative work, following hunches and exploring connections. While true that LLMs can't truly "make discoveries" since they have no sense of what that would mean, they can Monte Carlo every mathematical tool at a narrow objective and see what sticks, then build on that or combine improvements.
Reading the article, that seems exactly how the discovery was made, an LLM used a "surprising connection" to go beyond the expected result. But the result has no meaning without the human intent behind the objective, human understanding to value the new pathway the AI used (more valuable than the result itself, by far) and the mathematical language (built by humans) to explore the concept.
Isn't this just anthropocentrism? Why is understanding only valid if a human does it? Why is knowledge only for humans? If another species resolved the contradictions between gravity and quantum mechanics, does that not have meaning unless they explain it to us and we understand it?
People saw birds fly for all of human history, but it was only recently that humans were able to make something fly and understand why. Once we understood, we were able to do amazing things, but before that, the millions of birds able to fly were of no help beyond inspiration for the dream.
Though perhaps more to your point, if some superhuman AI is developed, and understands things better than us without telling us about it (or being unable to), it could perform feats that seem magical to us — that would concern us even if we don't understand it, since it affects us.
But I think in the frame of reference of the commenter you were replying to, they're just saying that the low-level AI used in this specific case is not capable of making its results actually useful to us; humans are still needed to make it human-relevant. It told us where to find a gem underground, but we still had to be the ones to dig it out, cut it, polish it, etc.
We are at the birth of the AI age and we don't know what it will look like in 100 or 1000 or 10000 or 100000 years (all those time frames likely closer than possible encounters with aliens from distant galaxies). It's possible that AI will even outlast humans.
It would certainly be interesting to try once again to instruct tune one of these things for self agency like the many weird experiments in the early days after llama 1, but practically all such sort of experimental models turned out to be completely useless. Maybe the bases just sucked or maybe there's no clear way on how to get it working and benchmark training progress on something that by definition does not cooperate.
Like how do you determine even for a human person if they are smart, or just hate your guts and won't tell you the answer if there is nothing you can do to motivate them otherwise?
I was going to say you should submit it, but I saw you did a few days ago and it only got a few votes... If dang sees this, IMO it would be extremely deserving of the second chance pool, as I wouldn't be surprised to see it easily jump to the front page with a different roll of the dice.
I just wanted to highlight this very correct human-centric thought about the purpose of intellection.
> The argument relies crucially on ideas that may, at least in retrospect, be attributed to Ellenberg-Venkatesh, Golod-Shafarevich, and Hajir-Maire-Ramakrishna.
Can someone please elaborate on this?
I agree with one of the mathematicians' responses in the linked PDF that this is somewhat less interesting than proving the actual conjecture was true.
In my eyes proving the conjecture true requires a bit more theory crafting. You have to explain why the conjecture is correct by grounding it in a larger theory, while with the counterexample the model just has to perform a more advanced form of search to find the correct construction.
Obviously this search is impressive, not naive, and requires many steps along the way to prove connections to the counterexample, but instead of developing new deep mathematics the model is still just connecting existing ideas.
Not to discount this monumental achievement. I think we're really getting somewhere! To me, and this is just vibes based, I think the models aren't far from being able to theory craft in such a way that they could prove more complicated conjectures that require developing new mathematics. I think that's just a matter of having them able to work on longer and longer time horizons.
For example, to prove something is impossible, let's say you first prove that there are only 5 families, and 4 of them are impossible. So now 80% of the problem is solved! :) If you are looking for counterexamples, the search is reduced by 80% too. In both cases it may be useful.
In counterexamples you can make guesses and leaps, and if it works it's fine. This is not possible for a proof.
On the other hand, once you have found a counterexample it's usual to hide the dead ends you discarded.
To prove a proposition I have to show P(x) for all x, but to refute it I only have to show that there exists an x such that not P(x).
I agree there could be a lot of theory crafting to reduce the search space of possible x's when looking for not P(x), but with for all x P(x) you have to be able to produce a larger framework that explains why no counterexample exists.
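To make the asymmetry concrete, here is a minimal Lean 4 sketch (core only, no Mathlib assumed). The names `counterexample_refutes` and the toy predicate `n * n ≠ 4` are made up for illustration and have nothing to do with the actual conjecture. It only shows that one witness is enough to kill a universal claim, while nothing comparably short exists for establishing one.

```lean
-- A single witness of ¬ P x refutes the universal statement ∀ x, P x.
theorem counterexample_refutes {α : Type} (P : α → Prop)
    (h : ∃ x, ¬ P x) : ¬ ∀ x, P x :=
  fun hall => h.elim (fun x hx => hx (hall x))

-- Toy instance: the claim "no natural number squares to 4" is refuted by n = 2.
example : ¬ ∀ n : Nat, n * n ≠ 4 :=
  fun h => h 2 rfl
```

All the hard work in a real counterexample is in finding the witness and checking it; the final logical step is this small.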
No this will never do the kind of math that humans did when coming up with complex numbers, or hell just regular numbers ex nihilo. No matter how long it's given to combine things in its training data.
Assuming humans are more powerful than regular languages, I could maybe agree that these methods may not eventually yield entirely human-like intelligence, but just better and better approximations.
The vibe I get though is that we aren't more powerful than regular languages, because human beings feel computationally bounded. So I could see that, given enough "human signal", these things could learn to imitate us precisely.
A difficult part was constructing a chess board on which to play math (Lean). Now it's just pattern recognition and computation.
LLMs are just the beginning, we'll see more specialized math AI resembling Stockfish soon.
However, this was not verified in Lean. This was purely plain language in and out. I think, in many ways, this is a quite exciting demonstration of exactly the opposite of the point you're making. Verification comes in when you want to offload checking proofs to computers as well. As it stands, this proof was hand-verified by a group of mathematicians in the field.
Dystopia vibes from the fictional "Manna" management system [0] used at a hamburger franchise, which involved a lot of "reverse centaur" automation.
> At any given moment Manna had a list of things that it needed to do. There were orders coming in from the cash registers, so Manna directed employees to prepare those meals. There were also toilets to be scrubbed on a regular basis, floors to mop, tables to wipe, sidewalks to sweep, buns to defrost, inventory to rotate, windows to wash and so on. Manna kept track of the hundreds of tasks that needed to get done, and assigned each task to an employee one at a time. [...]
> At the end of the shift Manna always said the same thing. “You are done for today. Thank you for your help.” Then you took off your headset and put it back on the rack to recharge. The first few minutes off the headset were always disorienting — there had been this voice in your head telling you exactly what to do in minute detail for six or eight hours. You had to turn your brain back on to get out of the restaurant.
There's much more to being human than our "cognitive abilities"
Depends on what you're ordering and who the cashier is.
If your order is the happy path of no customizations of a combo with an experienced cashier, it can be done in seconds, for sure. "Medium #4 with a Diet Coke", pay, done.
But if you customize your burger or order a lot of items a la carte and you're dealing with a new cashier who has weak English skills, good fucking luck. You'll likely need to wait for them to figure out they need to call someone over to help, have to repeat your order, and end up spending far more time.
> it keeps trying to upsell you
Yeah, I'll agree that's obnoxious, especially when it's trying to upsell you something that's already on your order. I ordered a combo. I don't need you to add another fry.
I have had them run out of receipts, but it’s never mattered for me. If I’m dining in, the plastic number you carry to your table makes sure I get my food. And if I’m taking it to-go, they always find me anyways.
I'm not sure how that could be. I can walk up to the counter and say "Big Mac Large Fry Small Coke" faster than you can navigate the first screen of the kiosk, and a skilled counter worker can key that in and be done before I even get my credit card out.
We have had that chess board for quite a while now, over 40 years. And no, there is nothing special about Lean here, it is just herd mentality. Also, we don't know how much training with Lean helped this particular model.
https://en.wikipedia.org/wiki/Qualified_immunity
Assuming you can still sue McDonald's, I am not sure this is a problem in the robotic LLM case. I'm also trying to imagine a case where you would want to sue the LLM and not the company. Given robots/LLMs don't have free will, I'm not sure the problem with qualified immunity making police unaccountable applies.
There already exist a lot of similar conventions in corporate law. Generally, a main advantage of incorporation is protecting the people making the decisions from personal lawsuits.
That only requires that someone own the AI-managed McDonald's though. So long as they can't avoid responsibility by pointing to the AI, I don't see why you couldn't sue them.
Police are a monopoly; nobody has a choice about which police company to use. McDonald's is not a monopoly, and many customers would prefer to eat at competitors run by entities that could be sued or jailed if they did anything particularly egregious.
The same intuition applies if you walk into McDonald's and a person there mistreats you. You want that person held responsible.
But the LLM is not a person. What is there to even sue? It just seems like it would simply pass through to the corporate entity without the same tension of feeling like we let a human get away with something. Because there is no human, just a corporation and the robot servicing the place.
Put another way - if the LLM is not a person, what is the advantage of a personal lawsuit?
Just sue the McDonald's. Even in a case where the LLM is extremely misaligned and acts in a way where you might normally personally sue the McDonald's employee, I'm just not sure the human intuition about "holding someone accountable" would have its normal force because, again, the LLM is not a person.
So given we already have the notions of incorporation and indemnification, it doesn't make sense to say that what precludes LLMs from running McDonald's is that they can't be sued. If McDonald's can still be sued, then not only is there no problem, there is very likely not even a change in the status quo.
Heuristically weighted directed graphs? Wow amazing I'm sure nobody has done that before.
Math is a sequence of formal rules applied to construct a proof tree. Therefore an AI trained on these rules could be far more efficient, and search far deeper into proof space.
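For anyone who wants the "rules applied to construct a proof tree" picture spelled out, here is a toy Python sketch. It is plain forward chaining over made-up propositional rules, not how any real prover or the model in the article works; `derive`, the fact names, and the rule format are all hypothetical and exist only for illustration.

```python
def derive(axioms, rules, goal):
    """Naive forward chaining: apply rules until the goal is derived
    or nothing new can be added. Each rule is (premises, conclusion).
    Returns a proof tree (fact, [subtrees]) or None if the goal is unreachable."""
    known = set(axioms)
    used = {a: None for a in axioms}  # fact -> premises that produced it (None = axiom)
    changed = True
    while changed and goal not in known:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                used[conclusion] = premises
                changed = True
    if goal not in known:
        return None

    def tree(fact):
        # Axioms are leaves; derived facts point at the subtrees of their premises.
        premises = used[fact]
        return (fact, [] if premises is None else [tree(p) for p in premises])

    return tree(goal)


# Toy rule set: A gives C, and B together with C gives D.
toy_rules = [(("A",), "C"), (("B", "C"), "D")]
proof = derive(["A", "B"], toy_rules, "D")
print(proof)
```

The blind loop here is exactly the part that blows up on real mathematics, which is why the heuristic for choosing which rule to try next is where the interesting work is.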
This future still sucks. The tech industry is making the world a worse place.
The more I read about these achievements, the more I get the feeling that a lot of the power of these models comes from having prior knowledge of every possible field and having zero problems transferring to new domains.
To me the potential beauty of this is that these tools might help us break through the increasing super-specialization that humans in science have to go through today. On one hand that specialization is important; on the other hand it does limit a person in terms of the tooling and inspiration they have access to.
As we're becoming hyper specialised, they become an invaluable tool to merge the horizon in, so to speak.
What makes me more of an optimist in this case is that people who today decide to go into these sciences are mostly people who are driven by intellectual activity so I feel they are the right ones to figure this out, probably more so than us the engineers.
I think we still don't really comprehend how much can be achieved by a single "mind" that has internalized so much knowledge from so many areas.
Personally I'm more of a breadth person, and I could never compete with peers who were more of the depth type of person at college.
But I get satisfaction from connecting things that feel unrelated at first sight; that's what drives me.
We can argue about recombination/interpolation of training data in LLMs, but even if this was an interpolation, the result was counterintuitive, not simply a confirmation. Any system that can identify an error in Erdős's thinking seems very useful to me (though perhaps he did not spend much time thinking about or checking this particular conjecture).
Solving problems people have already stated is a niche activity in mathematical research. More often, people study something they find interesting, try to frame it in a way that can be solved with the tools they have, and then try to come up with a solution. And in the ideal case, both the framing and the solution will be interesting on their own.
1. They have a wide range of difficulties. 2. They were curated (Erdős didn't know at first glance how to solve them). 3. Humans already took the time to organize them, formally state them, and add metadata to them. 4. There's a lot of them.
If you go around looking for a mathematics benchmark it's hard to do better than that.