Frontier AI has broken the open CTF format

Posted by frays 6 hours ago

Frontier AI has broken the open CTF format (kabir.au)

199 points | 168 comments

motbus3 5 hours ago

I think there will soon be ways to trick these models, and when that happens it will be yet another layer, like ASLR.

These models seem completely unbeatable only in the ads. There are 100+ cases where someone puts Hindi Yoda talk in Morse code and it goes nuts. The reason they are going so hard on PR and marketing for this is that they know it is a matter of time.

Avamander 3 hours ago

The more you obfuscate a topic against LLMs the lower the educational value of a challenge.

The only things that work are novelty and obscurity. LLMs still suck with things mentioned in the footnotes of datasheets and manuals, things that deviate in subtle ways, and unique constructions that alter something very, very common. It's hard for LLMs to avoid common pitfalls in terms of making assumptions while staying on track.

jimnotgym 4 hours ago

You can still do competitions. But you'll all need to fly to the same place and work on laptops with a fresh install of Linux. 1 hour to install tooling then Internet off, challenge revealed.

Not as easy logistically...

SoylentOrange 5 hours ago

Great article, well written, and good analogy to chess. I’ve been playing competitive chess most of my adult life and I think that the solution lies in how chess dealt with this problem:

Explicit Elo measurements with some cheating detection. AI assistance wholly banned. As you climb the Elo ladder, detection gets more onerous. At the top level during online events, anti-cheating teams require the use of both monitoring software and multiple cameras.

The idea is that you can cheat pretty easily at the lowest levels, but it gets less easy the higher you go. This allows for better feeding into the truly elite competitions.

I think chess’s very firm stance that AI is never allowed in competition (neither online nor in person), rather than CTF’s acceptance, was the right call.

salt4034 1 hour ago

Yes, chess has been dealing with AI for decades at this point, and it's amusing/frustrating that so many other communities are deciding to re-discover everything from scratch, rather than just learn from the chess experience.

If CTF is a player-vs-player event, then AI should just be banned outright; otherwise it will devolve into AI-vs-AI, which is just not an interesting competition format, as we learned in chess. Compared to FIDE top events (which ban AI), only a tiny niche audience actually watches the Top Chess Engine Championship (AI-centered). It turns out what we care about is not whether chess can be solved by any means available, but what the limits of the human mind are in learning chess.

Pretty much all chess coaches/educators also warn against relying heavily on AI during learning; engines only give you an illusion of understanding.

TrackerFF 3 hours ago

Question: Was this website made with Claude?

I've seen that exact font and color scheme a dozen times in the past few weeks.

saidnooneever 3 hours ago

Do CTFs like LAN parties, or factor in the new tooling available to people. Change is not death, or death is not an end. Either way, people will enjoy applying and showing off their skill, competing with each other on a human level, with or without AI tools.

vagab0nd 4 hours ago

This left a strange feeling. The article reads as extremely bleak. But from a different perspective this is extremely bullish for AI.

Avamander 3 hours ago

LLMs managing the "coloring book" equivalent of something is not bullish for the "art" version of something.

The intent for most CTFs is to provide a meaningful challenge that concerns a single topic without introducing noise that wastes time. Of course a training exercise is easier to complete for an LLM.

dostick 2 hours ago

I'm unable to find what “CTF” means, since it doesn't look like it refers to Capture The Flag gaming.

yc-kraln 2 hours ago

It does--but a particular form of Capture The Flag where there is a computer system and the "capturing" is breaking in or exploiting a security issue in that system.

tkel 1 hour ago

Pretty ironic that this article was also written using LLMs. It has all the LLM-isms.

r4indeer 5 hours ago

I'm conflicted on the use of AI in CTFs. On the one hand, they are supposed to mirror real-life scenarios, so of course you should be able to use any tool that would be available to you in real life.

On the other hand, CTFs are fundamentally a game and a competition, which are supposed to be fun and to compare and improve one's skill. So when I let an LLM generate the entire solution for me, what's the point anymore? I did not learn anything. I did not work for that place on the leaderboard, I just copied the solution. And worst of all, I did not have any fun. It's boring.

So how does using AI as a solver not feel like cheating?
