GOTO Amsterdam 2023

Tuesday Jun 27
11:20 –

Ten Things We've Learned From Running Production Infrastructure at Google

Google’s production infrastructure might be one of the most complex machines that humanity has built so far. It is constantly changing and evolving. Site Reliability Engineers (SREs) are the specialists who manage and improve the architectures, tooling, and operational procedures that enable Google to keep its products reliable, scalable, efficient, and agile. This talk will discuss a number of fundamental organizational principles that Google SRE has learned over the years.