More people connected to more servers, increased reliance on complex distributed networks, and a proliferation of apps in development mean more opportunities for data leaks and breaches.
Modern problems require modern solutions, as Amazon found out the hard way. Netflix escaped with minor inconvenience by being prepared.
What did they do differently?
Amazon Web Services (AWS), Amazon’s cloud platform, experienced an outage on September 20, 2015, that took servers down for several hours and affected many of the vendors that rely on it. Netflix experienced the issue as a blip because it had already been through similar failures when it changed its service delivery model. That experience led its engineering team to craft a unique approach to testing software in production.
The solution? Chaos as a preventative for calamity. It’s predicated on the idea of failure as the rule rather than the exception, and it led to the development of the first dedicated chaos engineering tools. Known collectively as the Simian Army, they include Chaos Monkey, Chaos Kong, and the newest member of the family, the Chaos Automation Platform (ChAP).
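To make the "failure as the rule" idea concrete, here is a minimal sketch of a chaos experiment in Python. It is a toy simulation, not Netflix's tooling: the `Instance`, `Service`, `terminate_random_instance`, and `run_experiment` names are all hypothetical, and the only assumption carried over from the real tools is that an experiment deliberately kills a randomly chosen instance and then checks whether the service still answers.

```python
import random


class Instance:
    """Hypothetical stand-in for one server behind a service."""

    def __init__(self, name):
        self.name = name
        self.alive = True

    def handle(self, request):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} served {request}"


class Service:
    """Routes a request to the first instance that responds."""

    def __init__(self, instances):
        self.instances = instances

    def handle(self, request):
        # Fail over past dead instances instead of giving up on the first error.
        for instance in self.instances:
            try:
                return instance.handle(request)
            except ConnectionError:
                continue
        raise RuntimeError("no healthy instances")


def terminate_random_instance(service, rng):
    """Inject the failure: mark one randomly chosen live instance as dead."""
    live = [i for i in service.instances if i.alive]
    if not live:
        return None
    victim = rng.choice(live)
    victim.alive = False
    return victim


def run_experiment(replicas=3, seed=0):
    """Kill one instance, then check whether the service still responds."""
    rng = random.Random(seed)  # seeded so the experiment is repeatable
    service = Service([Instance(f"node-{n}") for n in range(replicas)])
    victim = terminate_random_instance(service, rng)
    try:
        service.handle("GET /")
        survived = True
    except RuntimeError:
        survived = False
    return (victim.name if victim else None), survived
```

Running `run_experiment(replicas=3)` reports that the service survived the loss of one node, while `run_experiment(replicas=1)` reports that it did not. Surfacing that kind of single point of failure before a real outage does is the purpose of the experiment.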
What Are the Benefits of Chaos Engineering in DevOps?
The world of chaos engineering is quite large, so this article focuses only on the network environment and its associated security considerations. There, chaos engineering has already proven to be a positive force in a strong cybersecurity market: it improves business risk mitigation, fosters customer confidence, and reduces the workload for IT teams. If you’re a business owner, you can expect happier engineers, a reduced risk of revenue loss, and lower maintenance costs.
Customers, whether B2B or B2C, will enjoy more reliable service that is less prone to disruptions. Tech teams will be able to reduce failure incidents and gain deeper insight into how their apps work. It will also lead to better design, a shorter mean time to respond to SEVs (high-severity incidents), and fewer repeat incidents.
Is There a Downside?
Critics feel that chaos engineering is just another industry buzzword, or a cover-up for apps that were poorly designed in the first place. Some chaos engineering proponents counter that this criticism is the result of an ego-driven mentality. If you’re confident in your capabilities and your work product, there should be nothing to fear in testing their limits.
Chaos engineering is meant to counter the eight fallacies of distributed computing, which trip up many developers and software engineers who are new to distributed networks, while providing a system for more refined testing.
These incorrect assumptions are that:
- Networks are reliable
- Latency is zero
- Bandwidth is infinite
- Networks are secure
- Topology never changes
- Each system has only one admin, who also doesn’t change
- Transport costs nothing
- Networks are homogeneous
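The first two assumptions in the list above can be tested with a small fault-injection sketch. Everything here is hypothetical and simulated: `FlakyNetwork` stands in for a remote call, latency is a number rather than a real delay, and the `failure_rate`, `max_latency_ms`, `timeout_ms`, and `retries` parameters are illustrative values, not recommendations.

```python
import random


class FlakyNetwork:
    """Hypothetical stand-in for a remote call with injected faults."""

    def __init__(self, failure_rate, max_latency_ms, seed=0):
        self.failure_rate = failure_rate      # probability that a call fails
        self.max_latency_ms = max_latency_ms  # upper bound on simulated latency
        self.rng = random.Random(seed)        # seeded for repeatable runs

    def call(self, payload):
        # Latency is simulated as a number so the sketch runs instantly.
        latency_ms = self.rng.uniform(0, self.max_latency_ms)
        if self.rng.random() < self.failure_rate:
            raise ConnectionError("injected network failure")
        return payload.upper(), latency_ms


def naive_client(network, payload):
    """Assumes the network is reliable and latency is zero."""
    result, _ = network.call(payload)
    return result


def resilient_client(network, payload, timeout_ms=200, retries=5):
    """Retries on failure and discards responses slower than the timeout."""
    for _ in range(retries):
        try:
            result, latency_ms = network.call(payload)
        except ConnectionError:
            continue  # the network is not reliable: try again
        if latency_ms <= timeout_ms:
            return result
        # Latency is not zero: treat a slow response as a timeout and retry.
    return None  # degrade gracefully instead of crashing
```

Against a healthy network both clients behave the same. With `failure_rate=1.0`, `naive_client` raises `ConnectionError`, while `resilient_client` returns `None` and leaves the caller free to fall back to a cached or default response.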
A quick look at internet usage statistics around the world demonstrates the need for a focus on innovative network testing at all phases of software development. Achieving that means taking a non-traditional approach to DevOps.