Posted by xerzes 16 hours ago

Show HN: Ghidra MCP Server – 110 tools for AI-assisted reverse engineering (github.com)
260 points | 63 comments
xerzes 16 hours ago|
Hi HN,

I built this because reverse engineering software across multiple versions is painful. You spend hours annotating functions in version 1.07, then version 1.08 drops and every address has shifted — all your work invisible.

The core idea is a normalized function hashing system. It hashes functions by their logical structure — mnemonics, operand categories, control flow — not raw bytes or absolute addresses. When a binary is recompiled or rebased, the same function produces the same hash. All your documentation (names, types, comments) transfers automatically.
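
A minimal Python sketch of the idea, not the project's actual implementation: the operand categories (REG, MEM, IMM), the tuple format for instructions, and the sample addresses are all assumptions made for illustration. It hashes only mnemonics and operand categories, so a call whose target moved between builds still yields the same digest. Control flow is left out for brevity.

```python
import hashlib
import re


def operand_category(operand: str) -> str:
    """Map a concrete operand to a coarse category (assumed scheme)."""
    op = operand.strip().lower()
    if re.fullmatch(r"(0x[0-9a-f]+|\d+)", op):
        return "IMM"  # immediates and absolute addresses
    if op.startswith("["):
        return "MEM"  # memory references such as [ebp+8]
    return "REG"      # everything else is treated as a register


def normalized_hash(instructions):
    """Hash (mnemonic, operand categories) per instruction, ignoring addresses."""
    h = hashlib.sha256()
    for mnemonic, operands in instructions:
        cats = ",".join(operand_category(o) for o in operands)
        h.update(f"{mnemonic.lower()}:{cats};".encode())
    return h.hexdigest()


# The same hypothetical function in two builds; only the call target moved.
v107 = [("push", ["ebp"]), ("mov", ["ebp", "esp"]),
        ("call", ["0x6f8a1230"]), ("ret", [])]
v108 = [("push", ["ebp"]), ("mov", ["ebp", "esp"]),
        ("call", ["0x6f9b4560"]), ("ret", [])]

same = normalized_hash(v107) == normalized_hash(v108)
```

Here `same` ends up true because both call targets collapse into the IMM category before hashing.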

Beyond that, it's a full MCP bridge with 110 tools for Ghidra: decompilation, disassembly, cross-referencing, annotation, batch analysis, and headless/Docker deployment. It integrates with Claude, Claude Code, or any MCP-compliant client.

For context, the most popular Ghidra MCP server (LaurieWired's, 7K+ stars) has about 15 tools. This started as a fork of that project but grew into 28,600 lines of substantially different code.

Architecture:

  Java Ghidra Plugin (22K LOC) → embeds HTTP server inside Ghidra
  Python MCP Bridge (6.5K LOC) → 110 tools with batch optimization
  Any MCP client → Claude, scripts, CI pipelines

I validated the hashing against Diablo II: dozens of patch versions, each rebuilding DLLs at different base addresses. The hash registry holds 154K+ entries, and I can propagate 1,300+ function annotations from one version to the next automatically.
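
As a rough illustration of how such propagation could work (a sketch under assumptions, not the project's code): the registry is modeled as a plain dict from hash to annotation, and the `Annotation` class, the hash strings, the function names, and the addresses are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    name: str
    comment: str = ""


def propagate(registry, new_functions):
    """Apply known annotations to a new build.

    registry:      dict of function hash -> Annotation (from older versions)
    new_functions: dict of address -> function hash (from the new version)
    Returns (applied, unmatched): address -> Annotation, and a list of
    addresses whose hash is not in the registry.
    """
    applied, unmatched = {}, []
    for address, fn_hash in new_functions.items():
        annotation = registry.get(fn_hash)
        if annotation is None:
            unmatched.append(address)
        else:
            applied[address] = annotation
    return applied, unmatched


# Hypothetical data: one function survived the patch, one is new.
registry = {"a1b2": Annotation("ProcessInput", "documented in the old build")}
new_build = {0x6F9B4560: "a1b2", 0x6F9B5000: "ffff"}

applied, unmatched = propagate(registry, new_build)
```

Unmatched addresses are the ones that would still need manual review or a fuzzier matcher.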

The headless mode runs in Docker (docker compose up) for batch processing and CI integration — no GUI required.

v2.0.0 adds localhost-only binding (security), configurable timeouts, label deletion tools, and .env-based configuration.

Happy to discuss the hashing approach, MCP protocol design decisions, or how this fits into modern RE workflows.

Retr0id 10 hours ago||
What does your function-hashing system offer over Ghidra's built-in FunctionID or the BinDiff plugin[0]?

[0] https://github.com/google/bindiff

chc4 9 hours ago|||
Or better yet, the built-in Version Tracker, which is designed for porting markup to newer versions of binaries with several different heuristic tools for correlating functions that are the same due to e.g. the same data or function xrefs, and not purely off of identical function hashes...

Going off of only FunctionID will either have a lot of false positives or false negatives, depending on whether you compute the hashes with operands masked out or not. If you mask out operands, then it says that "*param_1 = 4" and "*param_1 = 123" are the same hash. If you don't mask out operands, then it says that nearly all functions are different because your call displacements have shifted due to different code layout. That's why the built-in Version Tracker tool uses hashes for only one of its heuristics and applies other correlation heuristics as well.
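
To make the trade-off concrete, here is a toy Python sketch. It is not how FunctionID or Version Tracker are implemented; the instruction tuples and addresses are made up, and absolute call targets stand in for shifted displacements.

```python
import hashlib


def fn_hash(instructions, mask_operands):
    """Hash a function, optionally replacing every operand with a placeholder."""
    h = hashlib.sha256()
    for mnemonic, operands in instructions:
        ops = ["_"] * len(operands) if mask_operands else list(operands)
        h.update((mnemonic + " " + ",".join(ops) + "\n").encode())
    return h.hexdigest()


# Two different functions: *param_1 = 4 versus *param_1 = 123.
store_4 = [("mov", ["[eax]", "4"]), ("ret", [])]
store_123 = [("mov", ["[eax]", "123"]), ("ret", [])]

# One function before and after its callee moved in the new layout.
call_old = [("call", ["0x401000"]), ("ret", [])]
call_new = [("call", ["0x402340"]), ("ret", [])]

# Masked: the two distinct stores collide (false positive).
false_positive = fn_hash(store_4, True) == fn_hash(store_123, True)

# Unmasked: the unchanged function no longer matches (false negative).
false_negative = fn_hash(call_old, False) != fn_hash(call_new, False)
```

Both flags come out true, which is the dilemma described above: a single hash cannot avoid both failure modes at once.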

sintax 2 hours ago||||
or Binary Ninja's WARP : https://docs.binary.ninja/guide/warp.html // https://github.com/vector35/warp
gcormier 4 hours ago|||
Was hoping to kick the tires but seem to be spinning my wheels trying to get Ghidra to see the plugin. Is GH Discussions your preferred means of communication?
bobbycrocodilo 3 hours ago|||
How does it compare to other Ghidra MCP servers?

- pyghidra-mcp
- ReVa
- GhidrAssistMCP
- GhydraMCP
- etc...

babas 12 hours ago|||
How does this compare to ReVa? https://github.com/cyberkaida/reverse-engineering-assistant

I think your installation instructions are incomplete. I followed the instructions and installed via file -> install in the project view. Restarted. But GhidraMCP is not visible in Tools after opening a binary.

skerit 12 hours ago||
I've been using ReVa for a long time (even upstreamed some changes to it) and it works great.
nunobrito 13 hours ago||
Thank you for sharing, will try it out soon. Does it support decompilation of Android binaries?
carl_dr 8 hours ago||
I used a different Ghidra MCP server (LaurieWired's) to, umm, liberate some software recently. I can’t express how fun and straightforward it was to analyze the binary and generate a keygen.

I learnt a ton in the process. I highly recommend others do the same, it’s a really fun way of spending an evening.

I will certainly be giving this MCP server a go.

reactordev 8 hours ago||
I have some old software I wrote that calls home to a server that no longer exists to do a cert check that would never pass in order to install it. I tried writing my own Ghidra tool, skill, agent, MCP and still can’t seem to figure it out. I’m positive it’s a “human skill” issue but man… ironic that this pops up the week after I gave up trying.
greenavocado 5 hours ago||
This branch is 110 commits ahead of LaurieWired/GhidraMCP:main.
joecarpenter 5 hours ago||
Reverse engineering with LLMs is very underrated for some reason.

I'm working on a hobby project - reverse-engineering a 30 year old game. Passing a single function disassembly + Ghidra decompiler output + external symbol definitions RAG-style to an agent with a good system prompt does wonders even with inexpensive models such as Gemini 3 Flash.

Then chain decompilation agent outputs to a coding agent, and produced code can be semi-automatically integrated into the codebase. Rinse and repeat.

Decompiled code is wrong sometimes, but for cleaned up disassembly with external symbols annotated and correct function signatures - decompiled output looks more or less like it was written by a human and not mechanically decompiled.

popinman322 4 hours ago|
I've found that Gemini models often produce pseudocode that seems good at first glance but is typically wrong or incomplete, especially for larger or more complex functions. It might produce pseudocode for 70% of the function, then silently drop the last 30%. Or it might elide the inside of switch blocks or if statements, only including a comment explaining what should happen.

By contrast, Claude Opus generally output actual code that included more of the original functionality. Even Qwen3-30B-A3B performs better than Gemini, in my experience.

It's honestly really frustrating. The huge context size available with Gemini makes the model family seem like a boon for this task; PCode is very verbose, impinging on the headroom needed for the model's response.

joecarpenter 3 hours ago||
In my case I'm decompiling into C and it does a pretty good job at translation. There were situations where it missed an important implementation detail. For example, there is an RLE decompressor and Gemini generated plausible, but slightly incorrect code. Gemini 3 Pro was not able to find the bug and produced code that was similar to Gemini 3 Flash.

The bug was one-shotted by GPT 5.2.

VortexLain 7 hours ago||
I haven't looked at the MCP server, but generally, reverse engineering with AI is quite underrated. I’ve had success extracting encryption keys from an Android app that uses encryption to vendor-lock users by forcing them to use that specific app to open files that should otherwise be in an open format.

By the way, this app had embedded the key into the shader, and it was required to actually run this shader on android device to obtain the key.

MaxLeiter 6 hours ago||
My friend and I were able to give Claude a (no longer updated) Unity arcade game. It decompiled it and created a one-to-one TypeScript port so it can run in the browser, and now we're adding multiplayer support (for personal use, don't worry HN - we won't be distributing it). I'm very excited for what AI can do for legacy software.
Alifatisk 6 hours ago|||
I agree. I tried RE using multiple tools connected to MCP and an agent; it was tasked with recreating what the source code might have looked like from a binary and identifying possible vulnerabilities. It did an incredible job when I compared it to the actual source.
baby_souffle 6 hours ago||
> By the way, this app had embedded the key into the shader, and it was required to actually run this shader on android device to obtain the key.

Oh that's clever. I don't suppose you can share more about how this was done?

summarity 13 hours ago||
I've been using it (the original 15-tool version) for months now. It’s amazing. Any app's inner workings are suddenly transparent. I can track down bugs, get a deeper understanding of any tool, and even write plug-ins or preload shims that mod any app. It’s like I finally actually _own_ the software I bought years ago.

For Objective-C-heavy code, I also use Hopper Disassembler (which now has a built-in MCP server).

Some related academic work (full recompilation with LLMs and Ghidra): https://dl.acm.org/doi/10.1145/3728958

junon 12 hours ago|
Talking about RE'ing applications and equating that to OSS is not a good look when you work at GitHub...
derrida 12 hours ago|||
I have no idea about any of that but like I wasn't thinking of github until you mentioned it and this comment I upvoted because was informative and relevant to the discussion and I don't know about R.E but curious to try and this kind of activity just seems like the sort of things people who are interested in software, learning and aware of security do... like to find bugs or malware or something... FOSS or not - actually "especially if not FOSS" you'd kinda like people to scan their binaries at <big tech corp> and have that knowledge indigenous wouldn't you? while thinking of code security etc, anyway

Is this a bad look for Derrida.org?

Anyway, "not my business"

summarity 11 hours ago|||
That's why I put it in quotes. In no way am I equating anything. Making the inner workings visible is what I was referring to.
JasonADrury 13 hours ago||
I thought MCP interfaces with large numbers of tools perform much worse than MCP interfaces with fewer tools, so this doesn't seem like a great design.

This also seems to just be vibecoded garbage.

tonylucas 11 hours ago||
Haven't looked at the app itself but the MCP tool problem is mainly solved now using lazy loading, it's far from perfect but the immediate context window overload problem is gone (in clients that support it anyway).

Now just onto the fact that most MCP tools are just transforming API calls, and their functionality and return data structures suck for LLMs....

lwroo 11 hours ago||
True. Though vibecoded skill-based tools would perform much more efficiently than this.
stared 13 hours ago||
Interesting to see Ghidra here!

A friend from work just used it (with Claude) to hack the game River Raid (https://quesma.com/blog/ghidra-mcp-unlimited-lives/).

Inspired by that, I gave it a try as well. While I have no prior experience with reverse engineering, I ported an old game from PowerPC to Apple Silicon.

First, I tried including a few MCPs with Claude Code (including LaurieWired/GhidraMCP, which you forked from, and https://github.com/jtang613/GhidrAssistMCP). Yet the agent fabricated a lot of code instead of translating it from the source.

I ended up using headless mode directly in Cursor + GPT 5.2 Codex. The results were the best.

Once I get some time, will share a write-up.

s-macke 12 hours ago|
I’ve also been playing around with reverse engineering, and I’m very impressed. It turns out that Codex with GPT-5.2 is better at reverse engineering than Claude.

For example, Codex can completely reverse-engineer this 1,300-line example [0] of a so-called C64-SID file within 30 minutes, without any human interaction.

I am working on a multi-agent system that can completely reverse-engineer C64 games. Old MS-DOS games are still too massive to analyze for my budget limit.

[0] https://gist.github.com/s-macke/595982d46d6699b69e1f0e051e7b...

skerit 11 hours ago||
Oh, interesting. I started using the ReVa/Ghidra MCP server together with Claude since day 1 (Well, since Claude Sonnet 4.0 was released) and I saw Claude get better at it with every update. I've gotten pretty far in reverse engineering a game from the early 2000s (though I still have to do a lot of things manually, but this then also taught me A TON about Ghidra)

I'm very interested in trying out Codex now.

grosswait 9 hours ago||
I am not a reverse engineer. In fact, I only consider myself an intermediate coder (more of a scripter tbh), but I have decades of (fairly deep) technical experience as a generalist. With Claude Code and another Ghidra MCP I was able to reverse engineer a ransomware encryptor and decryptor (had both) to create a much more reliable version of the decryptor. Saved terabytes of data. Felt like a super power!
tarasyarema 12 hours ago||
Simple question: why not a CLI instead? It seems that lately LLMs and agentic tools are better at using CLIs than bloated MCPs.
NicuCalcea 12 hours ago||
I think they're only better for CLI tools that are in the training data. If it's a new tool, you'd need to dump the full documentation in the context either way.
occz 8 hours ago|||
This can be solved well enough by having the model invoke `--help`
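
A small Python sketch of what that could look like from an agent harness: run the tool with `--help` and feed the text back to the model. The `read_help` helper is hypothetical, and it assumes the tool accepts `--help` and exits promptly; the Python interpreter itself is used here only as a stand-in CLI.

```python
import subprocess
import sys


def read_help(command, timeout=10):
    """Run `<command> --help` and return its text (stdout, else stderr)."""
    result = subprocess.run(
        [*command, "--help"],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout or result.stderr


# Stand-in example: ask the current Python interpreter for its usage text.
help_text = read_help([sys.executable])
```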
somnium_sn 7 hours ago|||
Tools like Claude Code have improved here. They won’t load all tools but instead rely on tool search. Context bloat from MCP servers was a thing with badly written clients, but it’s certainly getting better.
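
A toy Python sketch of the tool-search idea, assuming a simple keyword match; real clients may rank tools quite differently, and the tool names and descriptions below are hypothetical rather than this server's actual tool list. Only the few best matches would be placed in the model's context instead of every description.

```python
# Hypothetical tool catalog: name -> one-line description.
TOOLS = {
    "decompile_function": "Decompile a function to C-like pseudocode",
    "list_xrefs": "List cross-references to an address",
    "rename_function": "Rename a function",
    "set_comment": "Set a comment at an address",
}


def search_tools(query, tools=TOOLS, limit=2):
    """Return up to `limit` tool names whose name or description match the query."""
    words = query.lower().split()
    scored = []
    for name, description in tools.items():
        text = f"{name} {description}".lower()
        score = sum(1 for word in words if word in text)
        if score:
            scored.append((score, name))
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored[:limit]]


matches = search_tools("decompile function")
```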
esafak 8 hours ago|||
Because it was started before the revelation that MCP was a context hog. https://github.com/LaurieWired/GhidraMCP
stingraycharles 11 hours ago||
This is what I was thinking, or fewer, more versatile tools. Having the description of 110 tools in your context window at all times is just noise.
raphaelmolly8 5 hours ago|
The cross-binary documentation transfer via normalized function hashing is really compelling for anyone tracking software that updates frequently. I've dealt with similar pain points analyzing game clients that push patches weekly — manually re-annotating shifted addresses is brutal.

Curious about the hash collision rate in practice. The README mentions 154K+ entries from Diablo II patches. With that sample size, have you encountered meaningful false positives where structurally similar but semantically different functions matched? The Version Tracker comparison in the comments is fair — seems like combining this hash approach with additional heuristics (xref patterns, call graph structure) could reduce both false positives and negatives.
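
One way such a combination might look, sketched in Python under assumptions: each function record carries its hash plus the hashes of the functions it calls, and hash matches are only accepted when the callee sets overlap enough (Jaccard similarity). The record layout, the 0.5 threshold, and the sample data are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def best_match(old_fn, candidates, min_score=0.5):
    """Pick the same-hash candidate whose callees best match old_fn's callees."""
    same_hash = [c for c in candidates if c["hash"] == old_fn["hash"]]
    if not same_hash:
        return None
    best = max(same_hash, key=lambda c: jaccard(old_fn["callees"], c["callees"]))
    if jaccard(old_fn["callees"], best["callees"]) < min_score:
        return None  # structurally identical, but the call graph disagrees
    return best


# Hypothetical data: two candidates share the hash, only one shares callees.
old = {"name": "LoadItem", "hash": "h1", "callees": ["alloc", "read", "crc"]}
candidates = [
    {"addr": 0x1000, "hash": "h1", "callees": ["alloc", "read", "crc"]},
    {"addr": 0x2000, "hash": "h1", "callees": ["draw", "blit"]},
    {"addr": 0x3000, "hash": "h2", "callees": ["alloc", "read", "crc"]},
]

match = best_match(old, candidates)
```

The second heuristic rejects the look-alike at 0x2000 even though its hash matches, which is the kind of false positive the question is about.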

The headless Docker mode is a nice touch for CI integration. Being able to batch-analyze binaries and auto-propagate annotations without spinning up a GUI opens up some interesting automated diffing workflows.
