Claude Code’s sandboxing is a complete joke. There should be no ‘off switch.’ Sandboxing should not be opt-in. It should not have full read access to the file system by default.
I really want more security people to get involved in the LLM space because everyone seems to have just lost their minds.
If you look at this thing through a security lens it’s horrifying, which was a cause of frustration when Anthropic changed their TOS to ban use of alternative clients with a subscription. I don’t want to use that Swiss cheese.
The first thing I recommend everyone use is devcontainers [1]. They're very simple to set up and make using LLMs a lot more secure.
[1] https://code.claude.com/docs/en/devcontainer
I opened an issue about this on day 1 of the release:
https://github.com/anthropic-experimental/sandbox-runtime/is...
I ended up making my own sandbox wrapper instead: https://GitHub.com/arianvp/landlock-nix
Author here. I helped create Falco (CNCF runtime security) and built this (Veto) to fix the path-based identity problem we all shipped a decade ago. The dynamic linker bypass in the "where it breaks" section is the part I'm most interested in discussing. It's a class of evasion that no current eval framework measures. Happy to answer questions about the BPF LSM implementation.
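For readers new to the "path-based identity problem", here is a minimal Python sketch of the general idea, not Veto's or Falco's actual implementation. It assumes a rule that identifies a binary by its path and a second rule that identifies it by a SHA-256 hash of its contents. The file names, the fake binary contents, and the helpers `blocked_by_path` and `blocked_by_hash` are all hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def blocked_by_path(path: Path, denied_paths: set[str]) -> bool:
    """Hypothetical path-based rule: block only if the exact path is listed."""
    return str(path) in denied_paths


def blocked_by_hash(path: Path, denied_hashes: set[str]) -> bool:
    """Hypothetical content-based rule: block if the file's hash is listed."""
    return sha256_of(path) in denied_hashes


def demo() -> dict[str, bool]:
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for a denylisted binary (the bytes are made up).
        original = Path(tmp) / "tool"
        original.write_bytes(b"\x7fELF pretend binary contents")

        # Same bytes under a different name.
        renamed = Path(tmp) / "innocent-name"
        shutil.copy(original, renamed)

        denied_paths = {str(original)}
        denied_hashes = {sha256_of(original)}

        return {
            "path_rule_blocks_copy": blocked_by_path(renamed, denied_paths),
            "hash_rule_blocks_copy": blocked_by_hash(renamed, denied_hashes),
        }


if __name__ == "__main__":
    print(demo())
```

In this toy setup the copy slips past the path rule but is still caught by the content hash, which is the gap the comment above describes.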
Thanks for your work! Just curious, would it be possible to pad the denylisted binary with arbitrary bytes and circumvent the content hash?
Security policy usually defaults unknown artifacts to low privileges.
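To make this exchange concrete, here is a rough Python sketch under stated assumptions: binaries are identified by a SHA-256 content hash, and the padded file would still run. The byte strings and the helpers `denylist_allows` and `allowlist_allows` are hypothetical, and this is not a description of how Veto behaves. It only shows why padding defeats a hash denylist while a default-deny allowlist treats the padded file as unknown.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of some bytes."""
    return hashlib.sha256(data).hexdigest()


def denylist_allows(data: bytes, denied: set[str]) -> bool:
    """Default-allow policy: run anything whose hash is not listed."""
    return digest(data) not in denied


def allowlist_allows(data: bytes, allowed: set[str]) -> bool:
    """Default-deny policy: run only what is explicitly listed."""
    return digest(data) in allowed


# Made-up stand-ins for real binaries.
blocked_tool = b"pretend denylisted binary"
trusted_tool = b"pretend trusted binary"

# Appending any bytes changes the hash.
padded_tool = blocked_tool + b"\x00" * 16

denied = {digest(blocked_tool)}
allowed = {digest(trusted_tool)}

if __name__ == "__main__":
    print("denylist allows original:", denylist_allows(blocked_tool, denied))
    print("denylist allows padded:  ", denylist_allows(padded_tool, denied))
    print("allowlist allows padded: ", allowlist_allows(padded_tool, allowed))
    print("allowlist allows trusted:", allowlist_allows(trusted_tool, allowed))
```

Under the denylist the padded copy gets through because its hash is new. Under the allowlist it is refused for the same reason, which is the sense in which unknown artifacts default to low privileges.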
> No jailbreak, no special prompting. The agent just wanted to finish the task.
Good lord, why do people use LLMs to write on this topic? It destroys credibility.
The adversary can reason now, and our security tools weren't built for that.
Leo di Donato, who helped create Falco, the cloud native runtime security tool, wrote a technical deep dive into how Claude Code bypassed its own denylist and sandbox. He also introduces Veto, a kernel-level enforcement engine built into the Ona platform.
Thank you for this write-up. I am still light-years behind this deep knowledge, but I feel like your post taught me the vocabulary to get started.