I Added a 71-Line Black Box to My Python Agent, Then Queried the $200 Crash With DuckDB
When a Python agent's retry loop was projected to run up $200 in costs, I wrote a 71-line black box recorder that logs each turn as JSONL events, with tool durations, secret sanitization, and a unique run ID per run. DuckDB then queries the log to pinpoint failures such as tool timeouts and guard stops, replacing guesswork with evidence.
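
To make the recorder idea concrete, here is a minimal sketch of what such a black box might look like. It is not the 71-line recorder itself: the `Recorder` class, the `sanitize` helper, the secret pattern, and the event and field names (`tool_call`, `duration_ms`, `status`) are all assumptions chosen for illustration. It uses only the standard library and appends one JSON object per line.

```python
import json
import re
import time
import uuid
from contextlib import contextmanager
from pathlib import Path

# Hypothetical secret shapes: "sk-..." style API keys and Bearer tokens.
# A real agent would need patterns matching its own credentials.
SECRET_PATTERN = re.compile(r"sk-[A-Za-z0-9]{8,}|Bearer\s+[A-Za-z0-9._-]+")


def sanitize(value):
    """Recursively replace anything that looks like a secret with a marker."""
    if isinstance(value, str):
        return SECRET_PATTERN.sub("[REDACTED]", value)
    if isinstance(value, dict):
        return {key: sanitize(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [sanitize(item) for item in value]
    return value


class Recorder:
    """Appends one JSON object per line (JSONL) for every event in a run."""

    def __init__(self, path):
        self.path = Path(path)
        self.run_id = uuid.uuid4().hex  # unique ID shared by all events of this run
        self.turn = 0                   # the agent loop increments this each turn

    def log(self, event, **fields):
        record = {
            "run_id": self.run_id,
            "turn": self.turn,
            "ts": time.time(),
            "event": event,
            **sanitize(fields),
        }
        with self.path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(record, default=str) + "\n")

    @contextmanager
    def tool_call(self, name, **args):
        """Time a tool call and log its outcome, even when it raises."""
        start = time.perf_counter()
        status, error = "ok", None
        try:
            yield
        except Exception as exc:
            status, error = "error", repr(exc)
            raise
        finally:
            duration_ms = round((time.perf_counter() - start) * 1000, 3)
            self.log("tool_call", tool=name, args=args, status=status,
                     error=error, duration_ms=duration_ms)
```

In an agent loop, each tool invocation would be wrapped in `with recorder.tool_call("search", query=q): ...`, and guard stops or retries logged with `recorder.log(...)`. Because every line is a flat JSON object, a log written this way should be readable from DuckDB with a query along the lines of `SELECT tool, status, count(*) FROM read_json_auto('run.jsonl') WHERE event = 'tool_call' GROUP BY tool, status`, where the file name is again a placeholder.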