It's worth noting here that the author came up with a handful of good heuristics to guide Claude and a very specific goal, and the LLM did a good job given those constraints. Most seasoned reverse engineers I know have found similar wins with those in place.
What LLMs are (still?) not good at is one-shot reverse engineering for understanding by a non-expert. If that's your goal, don't blindly use an LLM. People already know that getting an LLM to write prose or code you don't check is a bad idea, but it's worth remembering that doing this for decompilation is even harder :)
Are they not performing well because they are trained to be more generic, or is the task too complex? It seems like a cheap problem to fine-tune.
Sounds like a more agentic pipeline task. Decompile, assess, explain.
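Something like this sketch, say, where run_decompiler() and llm() are hypothetical stand-ins rather than real APIs:

    # Hypothetical three-stage loop: decompile, assess, explain.
    # run_decompiler() and llm() are stand-ins, not real APIs.
    import subprocess

    def run_decompiler(binary, func):
        # Dump pseudo-C for one function with your decompiler of choice.
        out = subprocess.run(["my-decompiler", binary, "--function", func],
                             capture_output=True, text=True, check=True)
        return out.stdout

    def llm(prompt):
        raise NotImplementedError  # call whatever model/agent you use here

    def analyze(binary, func):
        pseudo_c = run_decompiler(binary, func)                   # decompile
        review = llm("Does this decompilation look coherent? "
                     "Flag anything suspicious:\n" + pseudo_c)    # assess
        return llm("Explain what this function does, taking this "
                   "review into account:\n" + review + "\n\n" + pseudo_c)  # explain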
If you aren't using LLMs for your reverse engineering tasks, you're missing out, big time. Claude kicks ass.
It's good at cleaning up decompiled code, at figuring out what functions do, at uncovering weird assembly tricks and more.
The article is a useful resource for setting up automated flows, and Claude is great at assembly. Codex less so; Gemini is also good at assembly and will happily hand-roll x86_64 bytecode. Codex appears optimized for more "mainstream" dev tasks, and excels at them. If only Gemini had a great agent...
I've been using Claude for months with Ghidra. It is simply amazing.
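The glue can be tiny, too. A sketch of a Ghidra post-script (Jython, using Ghidra's DecompInterface) that dumps each function's pseudo-C to a file you can then hand to Claude; run it via analyzeHeadless:

    # DumpPseudoC.py -- Ghidra post-script (Jython).
    # Run with: analyzeHeadless <projDir> <proj> -import <binary> -postScript DumpPseudoC.py
    from ghidra.app.decompiler import DecompInterface

    ifc = DecompInterface()
    ifc.openProgram(currentProgram)  # currentProgram/monitor are Ghidra script globals

    with open("pseudo_c.txt", "w") as out:
        for func in currentProgram.getFunctionManager().getFunctions(True):
            res = ifc.decompileFunction(func, 60, monitor)  # 60s timeout per function
            if res.decompileCompleted():
                out.write("// " + func.getName() + "\n")
                out.write(res.getDecompiledFunction().getC() + "\n")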
Makes sense because LLMs are quite good at translating between natural languages.
Anyway, we're reaching the point where documentation can be generated by LLMs and this is great news for developers.
Documentation is one place where humans should have input. If an LLM can generate documentation, why would I want you to generate it when I can do so myself (probably with a better, newer model)?
I stumbled across a fun trick this week. After making some API changes, I had CC “write a note to the FE team with the changes”.
I then pasted this into another CC instance running the FE app, and it made the counterpart.
Yes, I could have CC running against both repos and sometimes do, but I often run separate instances when tasks are complex.
Maybe documentation meant for other LLMs to ingest. Their documentation is like their code: it might work, but I don't want to have to be the one to read it.
Although of course, if you don't vibe-document but instead just use them as a tool, with significant human input, then yes, go ahead.
Although with code, it's implementing functions that don't exist yet; with documentation, it's describing functions that don't exist yet.
Makes me wonder if decompilation could eventually become so trivial that everything would become de-facto open source.
When decompilation like that is trivial, so is recreation without decompilation. It implies the LLM knows exactly how things work.
I wonder if we'll reach a point where you never run expensive software on your own CPU.
It'll either all be in the cloud, so you never run the code...
Or it'll be on a chip, in a hermetically sealed USB drive, that you plug into your computer.
Yes, I believe it will. What I predict will happen is that most commercial software will be hosted and provided through "trusted" platforms with limited access, making reverse engineering impossible.
This deserves a discussion
We're very far away from this.
I've used LLMs to help with decompilation since the original release of GPT-4. They're excellent at recognizing the purpose of functions and refactoring IDA or Ghidra pseudo-C into readable code.
How does it do on things that were originally written in assembly?
This is typically easier because the code was written for humans already.
Someone please try this on an original (early 1980s) IBM-PC BIOS.
I've been experimenting with running Claude in headless mode + a continuous loop to decompile N64 functions, and the results have been pretty incredible. (This is despite already using Claude in my decompilation workflow.)
I hope that others find this similarly useful.
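The loop itself is nothing fancy. A minimal sketch, assuming Claude Code's `claude -p` (print/headless) mode and a splat-style layout with one .s file per unmatched function; treat the paths as illustrative:

    # Minimal headless loop: feed each unmatched function's assembly to Claude.
    # Assumes `claude -p` (Claude Code print mode) and a splat-style repo layout.
    import pathlib, subprocess

    for asm in sorted(pathlib.Path("asm/nonmatchings").rglob("*.s")):
        prompt = ("Decompile this N64 MIPS assembly into matching C, "
                  "following the project's existing style:\n\n" + asm.read_text())
        result = subprocess.run(["claude", "-p", prompt],
                                capture_output=True, text=True)
        print(asm.name, "->", result.stdout[:80])  # inspect before committing anything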
This sounds interesting! Do you have some good introduction to N64 decompilation? Would you recommend using Claude right from the start, or rather trying to get to know the ins and outs of N64 decomp first?
This is super cool! I would be curious to see how Gemini 3 fares… I've found it to be even more effective than Opus 4.5 at technical analysis (in another domain).
What game are you working on?
Last sentence of the first paragraph says it’s Snowboard Kids 2.
In his defense, it is missing a "Tell HN"
And it isn't always obvious when the commenter is the submitter (no [S] tag like you see on other sites).
whoops, I did indeed miss that this was OP
I've been waiting for decompilation to show up in this space.
Are there any similar specialized decompilation LLM models available to be used locally?
This is a refreshingly practical demonstration of an LLM adding value. More of this please.