Chaos Monkey: How Netflix Uses Random Failure to Ensure Success

By Klint Finley

In a post last week about lessons learned using Amazon Web Services, Netflix’s John Ciancutti revealed that the company built something called “Chaos Monkey” to ensure that individual components work independently. Chaos Monkey randomly kills instances and services within Netflix’s AWS infrastructure to help developers make sure each individual component returns something even when system dependencies aren’t responding.

For example, if the recommendation system is down, Netflix will display popular titles instead of personalized picks. The quality of the response is degraded, but at least there is a response. Ciancutti explains it this way: “If we aren’t constantly testing our ability to succeed despite failure, then it isn’t likely to work when it matters most – in the event of an unexpected outage.”

Here are the lessons Ciancutti writes that Netflix has learned:

1. Dorothy, you’re not in Kansas anymore (“You need to be prepared to unlearn a lot of what you know”)
2. Co-tenancy is hard
3. The best way to avoid failure is to fail constantly
4. Learn with real scale, not toy models
5. Commit yourself

Chaos Monkey fits into number three.

For more advice on migrating to the cloud from Netflix, check out our article Netflix’s Advice on Moving to Amazon Web Services.
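
To make the recommendation example concrete, here is a minimal Python sketch of the idea: a caller that falls back to popular titles when its dependency fails, plus a toy "chaos monkey" that randomly disables a service. This is an illustration only, not Netflix's code. Every name in it (`RecommendationService`, `titles_for_home_page`, `chaos_monkey`, `POPULAR_TITLES`) is hypothetical, and the in-memory `alive` flag merely stands in for killing a real instance.

```python
import random


class ServiceUnavailable(Exception):
    """Raised when a dependency does not respond."""


# Placeholder data standing in for a precomputed "popular titles" list.
POPULAR_TITLES = ["Popular Title A", "Popular Title B", "Popular Title C"]


class RecommendationService:
    """Hypothetical stand-in for a personalized recommendation backend."""

    def __init__(self):
        self.alive = True

    def personalized_picks(self, user_id):
        if not self.alive:
            raise ServiceUnavailable("recommendation service is down")
        return [f"Pick for {user_id} #{i}" for i in range(1, 4)]


def titles_for_home_page(user_id, recommendations):
    """Return personalized picks, or a degraded response if the dependency fails."""
    try:
        return recommendations.personalized_picks(user_id)
    except ServiceUnavailable:
        # Degraded, but the caller still gets a response.
        return POPULAR_TITLES


def chaos_monkey(services, rng):
    """Disable one randomly chosen service and return it (toy version)."""
    victim = rng.choice(services)
    victim.alive = False
    return victim


if __name__ == "__main__":
    service = RecommendationService()
    print(titles_for_home_page("user-1", service))  # personalized picks
    chaos_monkey([service], random.Random())
    print(titles_for_home_page("user-1", service))  # falls back to popular titles
```

The point of the sketch is that the fallback path is exercised deliberately and often rather than only during a real outage, which is the reasoning behind lesson three.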
