Real incidents where AI agents burned real money. If your stack can do any of these, the budget gate costs less per month than 5 seconds of one of them.

| Date | Stack | Loss | What happened | Source |
|---|---|---|---|---|
| Jul 2025 | Replit AI (coding agent) | prod DB + 1,206 records | Agent issued DROP DATABASE on prod during active code freeze. Repeated instructions to not edit were ignored. CEO Amjad Masad: "unacceptable and should never be possible." | Fortune, Tom's Hardware, @amasad on X |
| Nov 2025 | LangChain + A2A (multi-agent) | $47,000 | Four agents in A2A coordination. Two got into clarification ping-pong, ran 11 straight days. $127 → $891 → $6,240 → $18,400 → $47,000 weekly. No step cap, no per-conversation budget, no orchestrator. | Medium postmortem, DEV.to writeup |
| 2026 | LangGraph (autonomous refactor) | $4,200 | Solo developer kicked off an autonomous refactor over a long weekend. Three days, $4,200 in API fees, no budget circuit-breaker, workload never validated before launch. | DEV.to cost-blowup walkthrough |
| Apr 2026 | Cursor + Claude (coding agent) | PocketOS DB, 30h outage | Agent given staging credential task. Found a Railway API token in an unrelated file (not scoped). Issued one curl command: wiped production DB + all backups in 9 seconds. Rental business clients lost recent bookings, customer details, transactions. | OECD.AI incident log, DevOps.com, Zenity post |

Rate limits cap request frequency, not budget impact. A single legitimate-looking DROP or POST call can cost more than 10K legitimate GETs. The model that loops is the one you'd ask to self-throttle — same context, same failure.
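
To make the frequency-versus-impact point concrete, here is a minimal sketch that compares a plain call counter with a cost-weighted check. The weights in `IMPACT_UNITS`, the `Call` type, and both helper functions are hypothetical and chosen only for illustration; they are not measurements and not this product's API.

```python
from dataclasses import dataclass

# Hypothetical per-call impact weights in arbitrary "budget units".
# These numbers are illustrative assumptions, not measurements.
IMPACT_UNITS = {
    "GET": 1,
    "POST": 500,
    "DROP": 50_000,
}


@dataclass
class Call:
    verb: str
    target: str


def passes_rate_limit(calls_in_window: int, max_calls: int = 100) -> bool:
    """A rate limiter only counts calls; it never looks at what a call does."""
    return calls_in_window <= max_calls


def impact_of(calls: list[Call]) -> int:
    """Sum the assumed impact of a batch; unknown verbs get the highest weight."""
    worst = max(IMPACT_UNITS.values())
    return sum(IMPACT_UNITS.get(c.verb, worst) for c in calls)


reads = [Call("GET", "/bookings")] * 10_000
one_drop = [Call("DROP", "prod_db")]

print(passes_rate_limit(len(one_drop)))        # True: one call is well under the limit
print(passes_rate_limit(len(reads)))           # False: 10,000 calls trip the limiter
print(impact_of(one_drop) > impact_of(reads))  # True: 50,000 units vs 10,000 units
```

Under these assumed weights, the rate limiter blocks the harmless batch of reads and waves the single destructive call through, which is the failure mode described above.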
Before each tool call the cache MCP returns OK or 402 BUDGET_EXCEEDED with the reason. Your agent gets a deterministic stop signal that can't be ignored by the same model that's looping.
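
A pre-call gate of this kind could look roughly like the sketch below. It assumes a per-conversation dollar budget and a step cap; `BudgetGate`, `Verdict`, `check`, and `run_tool` are hypothetical names for illustration, not the actual MCP interface. What it tries to show is that the counters live outside the model's context, so the looping model cannot talk its way past them.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Verdict:
    ok: bool
    code: str          # "OK" or "402 BUDGET_EXCEEDED"
    reason: str = ""


@dataclass
class BudgetGate:
    """Hypothetical gate; its state is held by the caller, not by the model."""
    max_usd: float
    max_steps: int
    spent_usd: float = 0.0
    steps: int = 0

    def check(self, estimated_cost_usd: float) -> Verdict:
        """Approve and record the call, or refuse with a reason and record nothing."""
        if self.steps + 1 > self.max_steps:
            return Verdict(False, "402 BUDGET_EXCEEDED",
                           f"step cap of {self.max_steps} reached")
        if self.spent_usd + estimated_cost_usd > self.max_usd:
            return Verdict(False, "402 BUDGET_EXCEEDED",
                           f"call would exceed the ${self.max_usd:.2f} budget")
        self.steps += 1
        self.spent_usd += estimated_cost_usd
        return Verdict(True, "OK")


class BudgetExceeded(RuntimeError):
    """Raised by the wrapper so the agent loop stops instead of retrying."""


def run_tool(gate: BudgetGate, tool: Callable[..., Any],
             estimated_cost_usd: float, *args: Any) -> Any:
    """Consult the gate first; the tool runs only if the verdict is OK."""
    verdict = gate.check(estimated_cost_usd)
    if not verdict.ok:
        raise BudgetExceeded(verdict.reason)
    return tool(*args)


gate = BudgetGate(max_usd=5.0, max_steps=10)
codes = [gate.check(2.0).code for _ in range(4)]
print(codes)  # ['OK', 'OK', '402 BUDGET_EXCEEDED', '402 BUDGET_EXCEEDED']
```

Raising an exception in `run_tool` rather than returning the refusal as text is a deliberate choice in this sketch: a refusal handed back as a tool result is something a looping model can ignore, while an exception ends the loop in the host code.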
The first 20 reply-DMs from any HN agent-disaster thread get 90 days of Pro free in exchange for a 15-minute call about how the incident unfolded.
Built by an indie.