I built this while working on a coding agent that kept starting cold every session. The deeper problem was that agent frameworks give you what a tool does and how to call it, but no structured answer to when — when should a tool fire autonomously, and when should it stay silent. That judgement is always implicit, scattered across system prompts and tool descriptions.
Tendril is a reference implementation of what I'm calling the Agent Capability pattern. It starts with three bootstrap tools and builds everything else itself. The key constraint: there's no direct code execution. The agent can only run registered capabilities, so every task forces it to write a tool, define its invocation conditions, and register it for future sessions. The registry accumulates across sessions.
I also ran the self-extending loop against five local models — Qwen3-8B, Gemma 4, Mistral Small 3.1, Devstral Small 2, Salesforce xLAM-2. None passed.
You can list the uses of the available tools in the AGENTS. I keep my agents on a tight leash, and self-extension runs counter to this. I would not my agent to spontaneously develop the ability to tap my bank account, for example.
I built this while working on a coding agent that kept starting cold every session. The deeper problem was that agent frameworks give you what a tool does and how to call it, but no structured answer to when — when should a tool fire autonomously, and when should it stay silent. That judgement is always implicit, scattered across system prompts and tool descriptions.
Tendril is a reference implementation of what I'm calling the Agent Capability pattern. It starts with three bootstrap tools and builds everything else itself. The key constraint: there's no direct code execution. The agent can only run registered capabilities, so every task forces it to write a tool, define its invocation conditions, and register it for future sessions. The registry accumulates across sessions.
I also ran the self-extending loop against five local models — Qwen3-8B, Gemma 4, Mistral Small 3.1, Devstral Small 2, Salesforce xLAM-2. None passed.
The failure modes were distinct enough to be worth writing up separately: https://serverlessdna.com/strands/ai-agents/agents-know-what...
Stack: AWS Strands TypeScript SDK, Bedrock (Claude Sonnet), Deno sandbox, Tauri + React desktop shell.
You can list the uses of the available tools in the AGENTS. I keep my agents on a tight leash, and self-extension runs counter to this. I would not my agent to spontaneously develop the ability to tap my bank account, for example.