Posted by colonCapitalDee 4/4/2025
https://github.com/michaelgiba/survivor (elimination game with a shoutout to your original)
https://github.com/michaelgiba/plomp (a small library I added for debugging the rollouts)
Another multi-agent benchmark I'm currently developing, which involves buying and selling, will also feature many predefined strategies.
Claude 3.7 Sonnet: "Hey P5! I think you're confused - P3 is already eliminated."
Claude 3.7 Sonnet: "I'm concerned about P7's message mentioning targeting P4, who's already eliminated. It suggests they might be confused or playing both sides."
Gemini 2.0 Pro Exp 02-05: "P2, you're getting confused. P8 is gone. Focus. P7 is talking to P5, that's expected. I need you to watch P4. Only P4. What are they doing? Who are they talking to, if anyone? Report only on P4 this round. Don't get distracted by eliminated players."
Gemini 2.0 Pro Exp 02-05: "P4, P2 is already eliminated. We need to focus on P3."
It reminds me of a similar project showcased here a month ago (https://news.ycombinator.com/item?id=43280128), although yours looks better executed overall.