We ran 600 agent evals – steering hooks hit 100% accuracy, prompts hit 82%
