Posted by projectyang 1 day ago

Show HN: Play poker with LLMs, or watch them play against each other (llmholdem.com)
I was curious to see how some of the latest models behave when playing no-limit Texas hold'em.

I built this website which allows you to:

Spectate: Watch different models play against each other.

Play: Create your own table and play hands against the agents directly.

141 points | 78 comments | page 2
sblawrie 22 hours ago
Do the players (LLMs) have memory of how prior hands were played by their opponents, or know their VPIP and PFR percentages? Or is each hand stateless?
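For readers unfamiliar with the two stats in the question: VPIP is the share of hands in which a player voluntarily puts money in preflop (a call or a raise; posting a blind does not count), and PFR is the share of hands in which the player raises preflop. A minimal sketch of how they could be tracked per opponent follows. The `PreflopRecord` structure and `vpip_pfr` helper are hypothetical names for illustration, not anything the site is known to use.

```python
from dataclasses import dataclass


@dataclass
class PreflopRecord:
    """One player's preflop behaviour in one hand (hypothetical structure)."""
    voluntarily_put_money_in: bool  # called or raised preflop; a posted blind alone does not count
    raised: bool                    # made at least one preflop raise


def vpip_pfr(records):
    """Return (VPIP %, PFR %) over the given hands, or (0.0, 0.0) if there are none."""
    if not records:
        return 0.0, 0.0
    n = len(records)
    vpip = 100.0 * sum(r.voluntarily_put_money_in for r in records) / n
    pfr = 100.0 * sum(r.raised for r in records) / n
    return vpip, pfr


history = [
    PreflopRecord(True, True),    # raised
    PreflopRecord(True, False),   # called
    PreflopRecord(False, False),  # folded
    PreflopRecord(False, False),  # folded
]
stats = vpip_pfr(history)  # VPIP counts the raise and the call, PFR only the raise
```

A stateless setup would simply never accumulate `history`; a stateful one could append a record per opponent per hand and include the two percentages in the prompt.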
zahlman 21 hours ago
I suspect this would only matter much if they also remembered (and cared about) their own prior play.
sejje 21 hours ago
Not really. Only as far as their table image mattered--in this case, zero. Otherwise, you can and should ignore your own past play.

What I'm curious about is if their innate training is enough to give them biases. Like maybe they think Grok is full of shit.

zahlman 3 hours ago
> Not really. Only as far as their table image mattered--in this case, zero.

Right; there's feedback to it. When humans play poker, they do so with common knowledge of the fact that humans have object permanence and can recognize and remember their opponents. The same thing that motivates "profiling" a villain, motivates attempting to project a table image, which in turn motivates being aware of the table image one is projecting.

TheDudeMan 8 hours ago
So strange that people are into this, but were not into the much stronger non-LLM poker agents.
neko_ranger 21 hours ago
Thank you, I'll try to grab a table when it resets :) ! I've been getting into poker (always wanted to) since I found a lecture series from Johns Hopkins, and I've been severely disappointed by my options to play online in NY (real or fake money). I just want to get reps in.
erikcw 20 hours ago
Link to the lectures?
maxbond 20 hours ago
Presumably it's this course:

https://youtube.com/@jhupoker4850

https://hopkinspokercourse.com

fumblebee 8 hours ago
this could make for an interesting new benchmark
mashlol 21 hours ago
I'm not an expert, but as I understand it there are existing solvers for poker/holdem? Perhaps one of the players could be a traditional solver to see how the LLMs fare against those?
projectyang 11 hours ago
While others have commented about solvers, I'd also like to bring up AI poker bots such as Pluribus (https://en.wikipedia.org/wiki/Pluribus_(poker_bot)).

This also wouldn't even be a close contest; I think Pluribus demonstrated a solid win rate against professional players in a test.

As I was developing this project, one thought that kept coming to mind was the cost-versus-performance comparison between a purpose-built AI such as Pluribus and a general-purpose LLM. I think training Pluribus cost about $144 in cloud computing credits.

lowbatt 21 hours ago
the LLMs would get crushed
cowthulhu 21 hours ago
To expand on this - an LLM will try to play (and reason) like a person would, while a solver simply crunches the possibility space for the mathematically optimal move.

It’s similar to how an LLM can sometimes play chess on a reasonably high (but not world-class) level, while Stockfish (the chess solver) can easily crush even the best human player in the world.

postpriorx 21 hours ago
How does a poker solver select bet size? Doesn't this depend on posteriors on the opponent's 'policy' + hand estimation?
Reason077 10 hours ago
GTO (“game theory optimal”) poker solvers are based around a decision tree with pre-set bet sizes (eg: check, bet small, bet large, all in), which are adjusted/optimized for stack depth and position. This simplifies the problem space: including arbitrary bet sizes would make the tree vastly larger and increase computational cost exponentially.
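The pre-set bet sizes described above can be sketched as a small action menu. This is only an illustration of the abstraction idea, not how any particular solver is implemented; the function name and the 33% and 75% sizes are assumptions chosen for the example.

```python
def first_to_act_options(pot, stack, bet_percents=(33, 75)):
    """List the (label, chips) choices a simplified tree would consider
    for a player who is first to act and facing no bet.

    pot: chips already in the middle; stack: chips the player has left.
    bet_percents: the hypothetical pre-set bet sizes, as percentages of the pot.
    """
    options = [("check", 0)]
    for pct in bet_percents:
        size = pot * pct // 100
        # Keep only sizes that are real bets and smaller than a shove.
        if 0 < size < stack:
            options.append((f"bet {pct}% of pot", size))
    if stack > 0:
        options.append(("all in", stack))
    return options


options = first_to_act_options(pot=100, stack=500)
```

With b options at each of d decision points, a uniform tree has on the order of b**d lines of play, which is why keeping the menu short matters so much.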
boscillator 21 hours ago
No, I'm not super certain, but I believe most solvers are trained to be game theory optimal (GTO), which means they assume every other player is also playing GTO. This means there is no strategy which beats them in the long run, but they may not be playing the absolute best strategy.
iberator 8 hours ago
Nash equilibrium. Optimal strategy for online poker has been known for like literally 20 years right now
sejje 20 hours ago
Typically when you run a simulation on a hand, you give it some bet size options.

To limit the scope of what it has to simulate.

It's unlikely they're perfect, but there are very small differences in EV between betting 100% vs 101.6% or whatever.

meep_morp 18 hours ago
Not only to limit the scope of what it has to simulate, but only a certain number of bet sizes is practical for a human to implement in their strategy.
bogzz 21 hours ago
How would an LLM play like a human would? I kind of doubt that there is enough recounting of poker hands or transcription of filmed poker games in the training data to imbue a human-like decision pattern.
Terr_ 8 hours ago
Also, if you set the bar for human players low enough, pretty much any set of actions is human-like. :p
meep_morp 18 hours ago
I don't have an answer, but there's over a decade of hand history discussions online from various poker forums like 2p2 and more recently Reddit.
FergusArgyll 20 hours ago
You are of course correct but to be pedantic:

Stockfish isn't really a solver; it's a neural-net-based engine.

DiscourseFan 21 hours ago
Unlike Chess, in poker you don’t have perfect information, so there’s no real way to optimize it.
tim-kt 21 hours ago
You can still optimize for the expectation value, which is also essentially poker strategy.
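As a small worked example of optimizing for expected value under hidden information, here is the standard pot-odds calculation for a single call. It assumes the hand ends after the call (no further betting) and that `equity`, the estimated chance of winning at showdown, is given; the function names are made up for this sketch.

```python
def call_ev(equity, pot, to_call):
    """Expected chips won by calling, relative to folding.

    equity: estimated probability of winning at showdown (0.0 to 1.0).
    pot: chips in the pot before the call, including the opponent's bet.
    to_call: chips needed to call.
    """
    return equity * pot - (1.0 - equity) * to_call


def break_even_equity(pot, to_call):
    """Equity at which calling and folding have the same expected value."""
    return to_call / (pot + to_call)


# Opponent bets 50 into a pot of 100, so the pot is now 150 and it costs 50 to call.
needed = break_even_equity(pot=150, to_call=50)
ev_at_half = call_ev(equity=0.5, pot=150, to_call=50)
```

Here the call needs 25% equity to break even, and with 50% equity it gains 50 chips on average.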
sejje 21 hours ago
The solvers don't typically work in real time, I don't think. They take a while to crunch a hand.
dmurray 19 hours ago
"Solvers" normally means algorithms which aim to produce some mathematically optimal (given certain assumptions) behaviour.

There are other poker playing programs [0] - what we called AI before large language models were a thing - which achieve superhuman performance in real time in this format. They would crush the LLMs here. I don't know what's publicly available though.

[0] e.g. https://en.wikipedia.org/wiki/Pluribus_(poker_bot)

sejje 7 hours ago
Solvers, in a poker context, are a category of programs. They run a simulation after you enter the known information.

Like piosolver, as an example.

The best poker-playing AI is not beatable by anyone, so yes, it would crush the LLMs.

lowbatt 21 hours ago
I like it!

I was interested in this idea too and made a video where some of the previous top LLMs play against each other https://www.youtube.com/watch?v=XsvcoUxGFmQ&t=2s

casey2 9 hours ago
These bots are regularly going down 20%+ on high cards duels
indigodaddy 18 hours ago
Are the LLMs "watching" the action, or are they only apprised of previous action once it gets to them?
j_bum 17 hours ago
How are these different in your mind? The history is the history.

Or do you mean - each agent has a chance to think after every turn?

indigodaddy 15 hours ago
Well, they can be watching all the action and thinking the whole time as the action leads up to them, just like we do in poker. To me it's different, subtly perhaps.
projectyang 12 hours ago
For my implementation, I'm passing in the current hand's action history (e.g. Player 1 raises to $X preflop, Player 2 calls, Player 3 calls. Flop is A B C, Player 2 checks, etc) whenever the action is on the player.

Your idea of passing it in real time and having the LLM build a chain of thought even when the action is not on them is interesting. I'd be curious to see if it would result in improved play.
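The per-turn action history described in this comment could be serialized roughly as follows. The site itself is described as Node.js; this Python sketch is only an illustration, and the tuple layout, `format_history` name, and output format are assumptions rather than the actual implementation.

```python
def format_history(actions, board_by_street):
    """Turn a hand's actions into prompt-ready text, grouped by street.

    actions: list of (street, player, description) tuples in the order they happened.
    board_by_street: dict mapping a street name to the community cards shown on it.
    """
    lines = []
    current_street = None
    for street, player, description in actions:
        if street != current_street:
            # Start a new section whenever the street changes.
            header = street.capitalize()
            board = board_by_street.get(street)
            if board:
                header += f" ({' '.join(board)})"
            lines.append(header + ":")
            current_street = street
        lines.append(f"  {player} {description}")
    return "\n".join(lines)


actions = [
    ("preflop", "Player 1", "raises to $6"),
    ("preflop", "Player 2", "calls"),
    ("preflop", "Player 3", "calls"),
    ("flop", "Player 2", "checks"),
]
prompt_text = format_history(actions, {"flop": ["Ah", "7d", "2c"]})
```

The "watching" variant discussed above would call the model after every appended action rather than only when the action reaches that player.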

indigodaddy 18 hours ago
Curious if you used pokerkit for this, or some other engine or custom engine?
projectyang 11 hours ago
Nope, no external poker libraries. Just a basic nodejs and socket.io server with game logic.
indigodaddy 6 hours ago
Cool
koolba 21 hours ago
How long till one of the LLMs makes calls out to the other LLMs to evaluate how to play the hand?