the.com/chaos engineering

you break your own systems on purpose so the internet doesn't do it for you.

means a practice of deliberately injecting failures into production systems to find weaknesses before they cause real outages.

from born at netflix around 2010 when engineers built chaos monkey, a tool that randomly killed servers during business hours to force resilient design.

for instance

netflix chaos monkeyrandomly terminates instances since 2011, open-sourced 2012

amazon gamedayaws teams simulate outages internally to rehearse response

google dirt exercisesdisaster recovery testing that once knocked out internal tools by mistake

the.com/
what’s happening now · the.com · generated