The challenge
Northwave's SaaS platform had grown fast and passed every functional test, but nobody knew how it behaved under real failure. Load testing showed healthy averages, yet the team had a nagging suspicion that the green dashboards were hiding fragile dependencies.
A flagship enterprise customer was about to onboard, roughly tripling sustained traffic. The platform had never run at that level, and a visible outage during the rollout would have jeopardised the contract and the company's reputation in a crowded market.
The engineering team needed evidence — not opinions — about where the system would break, and they needed it before the launch, not during the incident retro afterwards.
Our solution
Korur designed a series of controlled chaos experiments against a production-like environment: killing instances, injecting network latency, throttling the database connection pool and severing third-party dependencies one at a time. Each experiment had a clear hypothesis and a defined blast radius so nothing ran uncontrolled.
The experiments quickly exposed failures the dashboards never showed — a retry storm that amplified a single slow query into a full outage, a cache stampede on cold start, and a queue that silently dropped messages when a downstream service timed out instead of surfacing the error.
We worked alongside Northwave's engineers to remediate each finding — adding circuit breakers, backpressure and timeouts — then re-ran the same experiments to prove the fixes held. Resilience moved from a hope to a measured, repeatable property.
Services used
The results
Failure modes identified
Fixed before production
Launch-day incidents
Sustained traffic increase absorbed
“The chaos experiments found problems our load tests had been hiding for a year. We walked into our biggest launch ever knowing exactly how the system fails — and having already fixed it. That confidence was priceless.”
Sanne de Groot
VP Engineering, Northwave Cloud
Ready for similar results?
No-obligation conversation. Let's map your path to the same outcome.