Site reliability engineering (SRE) is both one of the fastest-growing enterprise roles and a set of operational practices for managing services at scale.
During the Site Reliability Engineering SKILup Day on June 18, I asked a group of DevOps Institute Ambassadors why SRE was important to them.
DevOps Institute Ambassador from Kristiansand, Norway
“The term SRE was certainly introduced by Google, but directly or indirectly several companies have been doing SRE-related work for a long time, though I must say that Google gave it a new direction after coining the term ‘SRE.’ I have a clear view on SRE, as I believe it walks hand-in-hand with DevOps. All your infrastructure, operations, monitoring, performance, scalability and reliability factors are accounted for in a nice, lean and (preferably) automated system; however, this is not enough. Culture is an important aspect driving SRE, along with business needs. As the saying ‘to each his own’ goes, SRE is no different. It is easy to get inspired by pioneering companies, but it’s impossible to copy their culture and means to replicate their success, especially with your own ‘anti-patterns’ and ‘traditional’ remedial baggage. Do you have the same infrastructure and business needs as the company showcasing brilliant success with SRE? No. Can it help you? Absolutely. The key is to understand the fundamentals, recognize what is important to your success blueprint and find your own success factors considering your cultural needs. Your strategy and culture need to walk together, as your guiding (strategy) and driving (culture) factors.”
DevOps Institute Ambassador from Warsaw, Poland
“With all of those methodologies and frameworks out there, one can only ponder: ‘Which one should I choose?’ Well, all of those frameworks and methodologies provide something of value, as long as you have an opportunity to actually test some of those concepts in your own local environment. Consider SRE, for example: It worked for Google, sure. If anyone doubts it, there are at least three books on the topic with the materials, proof and philosophy behind SRE. Even if your company is not at the scale of Google, you should still take a look at best practices that worked in 2017, 2018 and 2019 and that are still working now. Considering the crazy times that we live in right now and the plethora of free materials, conferences, blog posts, research whitepapers and videos on the topic, what are we waiting for? The knowledge is right here, ripe for the taking. Advancing the humanity of DevOps is just a matter of experimentation, trust and continuous improvement. And with events such as the SKILup Day: Site Reliability Engineering, where you can network, exchange field stories, brag about your ‘battle scars’ and take in some splendid speeches, it’s only getting easier to jump into the fascinating world of the SRE discipline.”