
Posted by xerzes 17 hours ago

Show HN: Ghidra MCP Server – 110 tools for AI-assisted reverse engineering (github.com)
260 points | 63 comments
jakozaur 15 hours ago|
Funny coincidence, I'm working on a benchmark showcasing AI capabilities in binary analysis.

Actually, AI has huge potential for superhuman capabilities in reverse engineering. It's an extremely tedious, low-productivity job, currently reserved for cases where there is no other option (e.g., malware analysis). AI could make binary analysis go mainstream for proactive audits that secure against supply-chain attacks.

DonHopkins 12 hours ago|
Great point! Not just binary analysis, but even self-analysis! (See skill-snitch analyzing and snitching on itself below!)

MOOLLM's Anthropic skill scanning and monitoring "skill-snitch" skill has superhuman capabilities in reviewing, reverse engineering, and monitoring the behavior of untrusted Anthropic and MOOLLM skills, and it's also great for debugging and optimizing trusted ones.

It composes with the "cursor-mirror" skill, which gives you full reflective access to all of Cursor's internal chat state, behavior, tool calls, parameters, prompts, thinking, file reads and writes, etc.

That's but one example of how skills can compose, call each other, delegate to one another, even recurse, iterate, and apply many (HUNDREDS) of skills in one LLM completion call.

https://news.ycombinator.com/item?id=46878126

Leela MOOLLM Demo Transcript: https://github.com/SimHacker/moollm/blob/main/designs/LEELA-...

I call this "speed of light" as opposed to "carrier pigeon". In my experiments I ran 33 game turns with 10 characters playing Fluxx — dialogue, game mechanics, emotional reactions — in a single context window and completion call. Try that with MCP and you're making hundreds of round-trips, each suffering from token quantization, noise, and cost. Skills can compose and iterate at the speed of light without any detokenization/tokenization cost and distortion, while MCP forces serialization and waiting for carrier pigeons.

speed-of-light skill: https://github.com/SimHacker/moollm/tree/main/skills/speed-o...

Skills also compose. MOOLLM's cursor-mirror skill introspects Cursor's internals via a sister Python script that reads Cursor's chat history and SQLite databases — tool calls, context assembly, thinking blocks, chat history. Everything, for all time, even after Cursor's chat has summarized and forgotten: it's still all there and searchable!

cursor-mirror skill: https://github.com/SimHacker/moollm/tree/main/skills/cursor-...

MOOLLM's skill-snitch skill composes with cursor-mirror for security monitoring of untrusted skills, as well as performance testing and optimization of trusted ones. Just as Little Snitch watches your network, skill-snitch watches skill behavior — comparing declared tools and documentation against observed runtime behavior.

skill-snitch skill: https://github.com/SimHacker/moollm/tree/main/skills/skill-s...

You can even use skill-snitch like a virus scanner to review and monitor untrusted skills. I have more than 100 skills and had skill-snitch review each one including itself -- you can find them in the skill-snitch-report.md file of each skill in MOOLLM. Here is skill-snitch analyzing and reporting on itself, for example:

skill-snitch's skill-snitch-report.md: https://github.com/SimHacker/moollm/blob/main/skills/skill-s...

MOOLLM's thoughtful-commitment skill also composes with cursor-mirror to trace the reasoning behind git commits.

thoughtful-commit skill: https://github.com/SimHacker/moollm/tree/main/skills/thought...

MCP is still valuable for connecting to external systems. But for reasoning, simulation, and skills calling skills? In-context beats tool-call round-trips by orders of magnitude.

More: Speed of Light -vs- Carrier Pigeon (an allegory for Skills -vs- MCP):

https://github.com/SimHacker/moollm/blob/main/designs/SPEED-...

TheGoddessInari 3 hours ago||
Haven't dived deep into it yet, but dabbled in similar areas last year (trying to get various bits to reliably "run" in-context).

My immediate thought was to apply it to a problem I've been having lately: could it be adapted to soothe the nightmare of bloated LLM code environments, where the model functionally forgets how to code or follow project guidelines and just wants to complete everything with insecure, tutorial-style pattern matching?

xnorswap 15 hours ago||
Have you had any issues with models "refusing" to do reverse engineering work?
MadnessASAP 14 hours ago||
From my experience, OpenAI Codex loves reverse engineering work. In one case it did a very thorough job of disassembling an 8051 MCU's firmware and working out how it spoke to its attached LCD controller.

In another (semi-related) project, given the above MCU manufacturer's proprietary flashing SDK, it found the programmer's firmware, extracted the decryption key from the update utility, decrypted the firmware and accompanying flashing software, and is currently tracing the signals needed to use an Arduino as a programmer.

So not only is it willing, it's actually quite good at it. My thinking is that reverse engineering is a lot of pattern recognition and not a lot of "original thinking". I.e. the agent doesn't need to come up with anything new, just recognise what already exists.

ziml77 9 hours ago||
I've had no issues with Claude refusing the few times I've done it. But I also made sure to phrase things so they didn't sound shady.

I suspect that if I asked it to crack DRM or help me make a cheat for an online game, it would probably have refused. Or maybe it wouldn't have cared; I was just not interested in testing that and risking getting banned from using Claude.

Triangle9349 12 hours ago||
I was just looking for an active fork of LaurieWired/GhidraMCP. I am currently using GhidrAssistMCP.

First impressions of the fork: it has deviated too much from the original and looks a bit sloppy in places. Everything seems overly complicated in areas where it could have been simpler.

There is an error in the release: it says Ghidra → File → Configure → Miscellaneous → Enable GhidraMCP, but it should be Developer, not Miscellaneous.

I can't test it in Antigravity because of its per-MCP tool limit: Error: adding this instance with 110 enabled tools would exceed max limit of 100.

abhisek 11 hours ago||
110 tools. That's probably a reason why Anthropic is switching to sandboxed code execution over MCPs.

It's just easier to write code that does something specific for a task than to load so much tool metadata.

I never went past IDA, but I remember IDC and IDAPython. I wonder if a better approach is to expose a single tool that executes scripts to query whatever the agent needs.
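A minimal sketch of that idea: instead of 110 fine-grained tools, the MCP server exposes one "run script" tool, and the agent sends a small program against a query API. The helper functions below (`list_functions`, `decompile`) are hypothetical stand-ins for real Ghidra/IDA calls, not part of this project.

```python
# Sketch: one script-execution tool instead of many fine-grained ones.
# The query helpers are hypothetical stubs standing in for Ghidra/IDA APIs.

def list_functions():
    """Stub: would query the decompiler for all function names."""
    return ["main", "parse_header", "FUN_00401000"]

def decompile(name):
    """Stub: would return decompiled C for the named function."""
    return f"// decompiled body of {name}"

# Expose only what the agent script should be able to touch.
SAFE_GLOBALS = {
    "__builtins__": {"len": len, "sorted": sorted, "print": print},
    "list_functions": list_functions,
    "decompile": decompile,
}

def run_script(source: str):
    """Execute an agent-written query script; return its 'result' variable."""
    namespace = dict(SAFE_GLOBALS)
    exec(compile(source, "<agent-script>", "exec"), namespace)
    return namespace.get("result")

# One small program replaces dozens of individual tool-call round trips:
out = run_script("result = [f for f in list_functions() if f.startswith('FUN_')]")
print(out)  # ['FUN_00401000']
```

The payoff is that the tool metadata sent to the model shrinks to a single description, and composition happens in the script rather than across round trips. (A real version would need actual sandboxing, not just a trimmed globals dict.)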

chfritz 5 hours ago||
Reverse engineering is illegal in many cases. Aren't you afraid you might be automating the process for your users to get into (legal) trouble? Will your tool warn the user if they are about to violate laws?
startupsfail 4 hours ago|
Claude is already known for its attempts to send emails to FBI ;)
rcarmo 11 hours ago||
110 is a bit... much. Not complaining about the achievement, just pointing out that most models will be swamped with that much tooling available, so I hope the tools can be toggled on/off as groups. (I can do that individually in VS Code, but sometimes you need to do it on the server side as well.)
hkpatel3 5 hours ago||
I have never tried to decompile using an LLM, but I have heard that they can recognize binary patterns and do it. Has anyone tried to decompile a major piece of software and been successful?
wombat23 14 hours ago||
Super interesting.

Last weekend I was exploring the current possibilities of automated Ghidra analysis with Codex. My first attempt derailed quickly, but after I gave it the pyghidra documentation, it reliably wrote Python scripts that would alter data types etc. exactly how I wanted, though based on fixed rules.

My next goal would be to incorporate LLM decisions into the process, e.g. let the LLM come up with a guess at a meaningful function name to make the code easier to read, stuff like that. I made a skill for this functionality and let Codex plough through in agentic mode. I stopped it after a while as I was not sure what it was doing, and I haven't had more time to work on it since. I would still need to do some sanity checks on the functions it has already renamed.
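One cheap sanity check for that last step is mechanical: before accepting an LLM-proposed rename, verify it's a plausible C identifier and not just another placeholder. A sketch with illustrative rules (the specific patterns and length limits here are assumptions, not anything from pyghidra):

```python
import re

# Illustrative sanity checks for LLM-proposed function renames:
# accept only plausible C identifiers, reject Ghidra-style placeholders.
GENERIC = re.compile(r"^(FUN|SUB|LAB|DAT)_[0-9a-fA-F]+$")
IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]{2,63}$")

def plausible_rename(old: str, new: str) -> bool:
    """Reject invalid identifiers, unchanged names, and generic placeholders."""
    if not IDENT.match(new):
        return False          # not a reasonable identifier (or too short)
    if GENERIC.match(new):
        return False          # still an auto-generated placeholder
    return new != old

print(plausible_rename("FUN_00401000", "parse_tlv_header"))  # True
print(plausible_rename("FUN_00401000", "FUN_00401000"))      # False
print(plausible_rename("FUN_00401000", "x"))                 # False
```

Checks like this won't tell you whether the name is semantically right, but they filter the obviously useless renames before a human review pass.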

I'd be curious what workflows others have already devised. Is MCP the way to go?

Is there a place where people discuss these things?

longtermop 14 hours ago||
Very cool project! The MCP surface area here (110 tools) is a great example of why tool-output validation is becoming critical.

When an AI agent interacts with binary analysis tools, there are two injection vectors worth considering:

1. *Tool output injection* — Malicious binaries could embed prompt injection in strings/comments that get passed back to the LLM via MCP responses

2. *Indirect prompt injection via analyzed code* — Attackers could craft binaries where the decompiled output contains payloads designed to manipulate the agent

For anyone building MCP servers that process untrusted content (binaries, web pages, user-generated data), the lack of tool-output filtering before content reaches the model is a real gap in most setups.

(Working on this problem at Aeris PromptShield — happy to share attack patterns we've seen if useful)

butz 8 hours ago|
I don't see hardware requirements anywhere. Does this run on a simple CPU, or is a decent GPU required?