Show HN: CodeRLM – Tree-sitter-backed code indexing for LLM agents

I've been building a tool that changes how LLM coding agents explore codebases, and I wanted to share it along with some early observations.

Typically, Claude Code globs directories, greps for patterns, and reads files with minimal guidance. It works much the way you'd learn to navigate a city by walking every street. You would eventually build a mental map, but Claude never does, or at least not one that persists across contexts.

The Recursive Language Models paper from Zhang, Kraska, and Khattab at MIT CSAIL introduced a cleaner framing. Instead of cramming everything into context, the model gets a searchable environment. The model can then query just for what it needs and can drill deeper where needed.

coderlm is my implementation of that idea for codebases. A Rust server indexes a project with tree-sitter, builds a symbol table with cross-references, and exposes an API. The agent queries for structure, symbols, implementations, callers, and grep results — getting back exactly the code it needs instead of scanning for it.

The agent workflow looks like:

1. `init` — register the project, get the top-level structure

2. `structure` — drill into specific directories

3. `search` — find symbols by name across the codebase

4. `impl` — retrieve the exact source of a function or class

5. `callers` — find everything that calls a given symbol

6. `grep` — fall back to text search when you need it

This replaces the glob/grep/read cycle with index-backed lookups. The server currently supports Rust, Python, TypeScript, JavaScript, and Go for symbol parsing, though all file types show up in the tree and are searchable via grep.
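
To make the lookup idea concrete, here is a minimal sketch of `search`, `impl`, and `callers` over a tiny in-memory index. It is not coderlm's implementation or API: it uses Python's stdlib `ast` module as a stand-in for tree-sitter, only handles plain function calls, and every function name and the sample source are assumptions made up for illustration.

```python
import ast

# Toy stand-in for an index-backed lookup. Python's stdlib `ast` module is used
# here in place of tree-sitter, and every name below is hypothetical.
SOURCE = '''
def load(path):
    return open(path).read()

def parse(path):
    text = load(path)
    return text.split()

def main():
    return parse("input.txt")
'''

def build_index(source):
    """Map each function name to its source text and the plain names it calls."""
    tree = ast.parse(source)
    index = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            # Only direct calls like f(x) are recorded; method calls are ignored.
            calls = {
                n.func.id
                for n in ast.walk(node)
                if isinstance(n, ast.Call) and isinstance(n.func, ast.Name)
            }
            index[node.name] = {
                "source": ast.get_source_segment(source, node),
                "calls": calls,
            }
    return index

def search(index, fragment):
    """Find symbols whose name contains the fragment."""
    return sorted(name for name in index if fragment in name)

def impl(index, name):
    """Return the exact source of one symbol."""
    return index[name]["source"]

def callers(index, name):
    """Return every indexed function that calls the given symbol."""
    return sorted(fn for fn, info in index.items() if name in info["calls"])

index = build_index(SOURCE)
print(search(index, "pa"))
print(callers(index, "load"))
print(impl(index, "main"))
```

The point of the sketch is the shape of the answers: each query returns a few names or one function body, rather than whole files the agent then has to scan.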

It ships as a Claude Code plugin with hooks that guide the agent to use indexed lookups instead of native file tools, plus a Python CLI wrapper with zero dependencies.

For anecdotal results, I ran the same prompt, "explore and identify opportunities to clarify the existing structure", against one codebase twice: once with coderlm and once with the native tools.

Using coderlm, Claude generated a plan in about 3 minutes. The coderlm-enabled instance found a genuine bug (duplicated code with identical names), orphaned code for cleanup, mismatched naming conventions crossing module boundaries, and overlapping vocabulary. These are all semantic issues, the kind a tree-sitter-centric approach is well placed to surface.

Using the native tools, Claude identified file clutter in the root of the project, out-of-date references, and a migration timestamp collision. These findings are more consistent with a methodical walk of the filesystem, and took about 8 minutes to produce.

In this one comparison, the indexed approach caught more semantic issues than the native tools and finished faster (about 3 minutes versus about 8).

I've spent some effort streamlining the installation process, but it isn't turnkey yet. You'll need the Rust toolchain to build the server, which runs as a separate process. Installing the plugin from a Claude marketplace is possible, but the skill isn't added to your .claude yet, so there are some manual steps before Claude can use it.

Claude continues to show significant resistance to using CodeRLM in exploration tasks. In practice, you will need to explicitly direct Claude to use it.

---

Repo: github.com/JaredStewart/coderlm

Paper: Recursive Language Models https://arxiv.org/abs/2512.24601 — Zhang, Kraska, Khattab (MIT CSAIL, 2025)

Inspired by: https://github.com/brainqub3/claude_code_RLM

Excellent share, thank you. My question is: with your setup, how strictly does Claude Code stick to this mode for traversing the codebase rather than falling back to grep? I have found this to be a huge issue when implementing similar solutions... it loves to just grep.

Aider [0] wrote a piece about this [1] way back in Oct 2023!

I stumbled upon it in late 2023 when investigating ways to give OpenHands [2] better context dynamically.

[0] https://aider.chat/

[1] https://aider.chat/2023/10/22/repomap.html

[2] https://openhands.dev/

Aider's repo-map concept is great! Thanks for sharing, I'd not been aware of it. Using tree-sitter to give the LLM structural awareness is the right foundation IMO. The key difference is how that information gets to the model.

Aider builds a static map, with some importance ranking, and then stuffs the most relevant part into the context window upfront. That's smart - but it is still the model receiving a fixed snapshot before it starts working.

What the RLM paper crystallized for me is that the agent could query the structure interactively as it works. A live index exposed through an API lets the agent decide what to look at, how deep to go, and when it has enough. When I watch it work, it's not one or two lookups but many, each informed by what the previous one revealed. The recursive exploration pattern is the core difference.
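
As a rough illustration of that pattern, the sketch below follows callers outward from one symbol, issuing one lookup at a time and letting each result decide the next query. The caller table, the symbol names, and the `lookup_callers` and `explore` helpers are all hypothetical, and a fixed breadth-first policy with a lookup budget stands in for the model's own judgment about when it has seen enough.

```python
from collections import deque

# Hypothetical caller table standing in for a live index: symbol -> its callers.
CALLERS = {
    "write_row": ["save_report", "export_csv"],
    "save_report": ["nightly_job"],
    "export_csv": ["cli_main"],
    "nightly_job": [],
    "cli_main": [],
}

def lookup_callers(symbol):
    """One index query. A real agent would ask the index server here."""
    return CALLERS.get(symbol, [])

def explore(start, max_lookups=10):
    """Follow callers outward from `start`, one lookup at a time."""
    seen = [start]
    queue = deque([start])
    lookups = 0
    while queue and lookups < max_lookups:
        symbol = queue.popleft()
        lookups += 1
        # Each result feeds the queue, so later queries depend on earlier answers.
        for caller in lookup_callers(symbol):
            if caller not in seen:
                seen.append(caller)
                queue.append(caller)
    return seen, lookups

visited, used = explore("write_row")
print(visited)
print(used)
```

With a static snapshot the set of symbols in context is fixed up front; here it grows only along the paths the earlier lookups opened up.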

Aider actually prompts the model to say if it needs to see additional files. Whenever the model mentions file names, aider asks the user if they should be added to context.

As well, any files or symbols mentioned by the model are noted. They influence the repomap ranking algorithm, so subsequent requests have even more relevant repository context.

This is designed as a sort of implicit search and ranking flow. The blog article doesn’t get into any of this detail, but much of this has been around and working well since 2023.

I just looked and it was posted a number of times with 0 discussion

https://news.ycombinator.com/item?id=38062493

https://news.ycombinator.com/item?id=41411187

https://news.ycombinator.com/item?id=40231527

https://news.ycombinator.com/item?id=39993459

https://news.ycombinator.com/item?id=41393767

https://news.ycombinator.com/item?id=39391946

Great idea! I’ve been thinking about something along these lines as well.

I recommend configuring it as a tool for Opencode.

Going from Claude Code to Opencode was like going from Windows to Mac.

yeah I would definitely recommend the same. I'm a daily user of opencode and I really want to try this.

will take a look at opencode, thanks for sharing!

I wonder how this sort of thing compares with asking claude to read a ctags file. I have git hooks set up to keep my tags up to date automatically, so that data is already lying around.
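
For reference, a tags file is already close to a symbol table. Below is a minimal sketch of turning one into a name lookup, assuming the common tab-separated layout (name, file, search pattern, then a kind field). The sample entries and the `parse_tags` helper are made up for illustration, and the parser ignores edge cases such as patterns that contain tabs.

```python
# Sample lines in the tab-separated tags layout; the entries are made up.
SAMPLE_TAGS = (
    "!_TAG_FILE_SORTED\t1\t/0=unsorted, 1=sorted, 2=foldcase/\n"
    "build_index\tsrc/index.py\t/^def build_index(root):$/;\"\tf\n"
    "Index\tsrc/index.py\t/^class Index:$/;\"\tc\n"
    "search\tsrc/query.py\t/^def search(index, name):$/;\"\tf\n"
)

def parse_tags(text):
    """Parse tags lines into a dict: symbol name -> list of (file, pattern, kind)."""
    table = {}
    for line in text.splitlines():
        if not line or line.startswith("!_TAG_"):
            continue  # skip blank lines and pseudo-tag metadata
        name, path, rest = line.split("\t", 2)
        # The search pattern ends at ;" and extension fields follow it.
        address, _, fields = rest.partition(';"')
        kind = fields.strip().split("\t")[0] if fields.strip() else ""
        table.setdefault(name, []).append((path, address, kind))
    return table

table = parse_tags(SAMPLE_TAGS)
print(table["Index"])
print(sorted(table))
```

This gives definitions by name, but a plain tags file like this does not record who calls what, so a `callers`-style query would still need something extra.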

Would this be useful to people who aren't using Claude? Maybe it should be installable in a more normal way, instead of as a Claude plugin.

I don't see why it wouldn't - but I'm not familiar with setup / integration on other platforms. Would love to hear more about your stack and see if we can't find a way for you to try it out

A CLI or slim MCP would do it. If you want a formal plugin, here's another popular ecosystem: https://opencode.ai/docs/plugins/

I see a lot of overlap with LSPs, which better agents already use, so I would appreciate a comparison. What does this add?

Tree-sitter and LSP solve different problems.

LSP is a full-fledged semantic solution providing go-to-definition, reference tracing, type info, and so on. But it requires a full language server, project configuration, and often a working build. That's great in an IDE, but the burden can be a bit much when working through an agent.

Tree-sitter handles structural queries, giving the LLM the ability to evaluate function signatures, hierarchies, and such. Packing this into the recursive language model approach lets the LLM decide when it has enough information: it can keep crawling the codebase in bite-sized increments to find what it needs. It's a far more minimal solution, which lets it respond quickly with little overhead.

Can you make the plugin start automatically, on some suitable trigger? Any plans to support JVM languages?

edit: Does Claude not invoke it automatically, then, so you have to call the skill?

I've been tinkering with it substantially and the most I can say is that it generally doesn't trigger automatically :( Claude has a really, really strong affinity for its existing tools for exploring a codebase.

I'd be happy to add support for Scala and Java. The current binary size is 11MB on my machine, so I think there's room to expand what this offers. At this time I don't know where I would draw the line and say I'm not planning on supporting something. I think to some degree it would depend on usage and on my availability.

been wondering about treesitter grepping for agents

how do plans compare with and without it? even just anecdotally, what have you seen so far?

anecdotally, it seems like this helps find better places for code to sit, understands the nuances of a code base better, and does a better job avoiding duplicate functionality.

it's still very much a work in progress. the thing I'm struggling with most right now is getting claude to use the capability at all without directly telling it to.

there seem to be benefits to the native stack (which lists files and then hopes for the best) relative to this sometimes. Frankly, the native stack seems to be better at understanding the file structure. Where this approach really shines is in understanding the code itself.