Posted by GregorStocks 10 hours ago

Show HN: I taught LLMs to play Magic: The Gathering against each other (mage-bench.com)
I've been teaching LLMs to play Magic: The Gathering recently, via MCP tools hooked up to the open-source XMage codebase. It's still pretty buggy and I think there's significant room for existing models to get better at it via tooling improvements, but it pretty much works today. The ratings for expensive frontier models are artificially low right now because I've been focusing on cheaper models until I work out the bugs, so they don't have a lot of games in the system.
90 points | 71 comments
aethrum 8 hours ago
I love magic. Can these do politics or is it just board state?
GregorStocks 8 hours ago
I want them to do politics in Commander, and theoretically they should - the chat log is exposed in the MCP tools just like the rest of the game history, and their prompts tell them to use chat.

In practice they haven't really talked to each other, though. They've mostly just interpreted the prompts as "you should have a running monologue in chat". Not sure how much of this is issues with the harness vs the prompt, but I'm hoping to dig into it in the future.

jamilton 8 hours ago
Cool. How’d you pick decks?
GregorStocks 8 hours ago
For the 1v1 formats (Standard, Modern, Legacy) I'm basically just using the current metagame from MTGGoldfish. For Commander they get a random precon. At some point I might want a 1v1 "less complicated lines than Standard" format, since the LLMs don't always understand the strategy of weird decks like Doomsday or Mill.
steveBK123 8 hours ago
Why are all these Show HN posts overloaded with “I taught AI how to do things I used to do for entertainment”?

Can we automate the unpleasantries in life instead of the pleasures?

qsort 8 hours ago
Game AIs are probably one of the most harmless and unambiguously good applications of technology. As I said in another message, I used to play competitive MtG and I would have loved to have a competent AI opponent. Imagine the possibilities: after a tournament you could get to review the games and figure out what you did wrong and improve, like you would do in chess or backgammon.

I get the complaint, but how is this something that removes the human element at all?

zahlman 7 hours ago
I think Show HN is far more overloaded with "I one-shotted an automation I find useful and then asked an LLM to explain why this is actually revolutionary".
kenforthewin 8 hours ago
Does an AI also playing your game somehow detract from the pleasure you derive from it? I find it entertaining both to play the games and to see how LLMs perform on them; I don't see how these are in any way mutually exclusive.