Comment by jeffreyq

21 hours ago

https://blog.railway.com/p/incident-report-february-11-2026

"we did not have the monitoring or controls to prevent our anti-fraud from hard killing 3% of workloads, including many instances of pg"

Oof.

  • Needs an anti-anti-fraud service which terminates malfunctioning anti-fraud services.

    • When I've written similar services, there was a (low) hard cap on how many fraud decisions they could action before they quit and paged. If we were getting hit with a wave of something, a human had to temporarily bump that limit.
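For anyone wondering what that kind of cap could look like, here is a minimal sketch in Python. It is not Railway's code or the commenter's actual service; `CappedActioner`, `ActionCapExceeded`, `raise_cap`, and the `act`/`page` callbacks are all hypothetical names made up for illustration. The idea is only that the service stops acting and pages once it has actioned a fixed number of decisions, and stays stopped until a human raises the limit.

```python
from dataclasses import dataclass
from typing import Callable


class ActionCapExceeded(RuntimeError):
    """Raised when the service refuses to act because it hit its cap."""


@dataclass
class CappedActioner:
    # All names here are hypothetical; the cap value is an assumption.
    max_actions: int                  # low hard cap on actioned decisions
    act: Callable[[str], None]        # e.g. terminate a flagged workload
    page: Callable[[str], None]       # e.g. alert whoever is on call
    actioned: int = 0
    halted: bool = False

    def handle(self, workload_id: str) -> None:
        """Action one fraud decision, or halt and page if the cap is hit."""
        if self.halted:
            # Already paged once; stay quiet and keep refusing.
            raise ActionCapExceeded("halted; waiting for a human")
        if self.actioned >= self.max_actions:
            self.halted = True
            self.page(f"fraud action cap of {self.max_actions} reached; halting")
            raise ActionCapExceeded("cap reached")
        self.act(workload_id)
        self.actioned += 1

    def raise_cap(self, new_cap: int) -> None:
        """Human override: bump the limit and resume."""
        if new_cap <= self.max_actions:
            raise ValueError("new cap must be higher than the current cap")
        self.max_actions = new_cap
        self.halted = False
```

The point of the design is that the failure mode of a misfiring detector is bounded: the worst case is `max_actions` wrong kills plus a page, rather than a percentage of the fleet. The tradeoff is that a real fraud wave also stalls until someone looks at it.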