Episode | Date |
---|---|
Foreword | Aug 29, 2023 |
Preface | Aug 29, 2023 |
Part I. Introduction | Aug 29, 2023 |
Chapter 1 - Introduction | Aug 29, 2023 |
Chapter 2 - The Production Environment at Google, from the Viewpoint of an SRE | Aug 29, 2023 |
Part II - Principles | Aug 29, 2023 |
Chapter 3 - Embracing Risk | Aug 29, 2023 |
Chapter 4 - Service Level Objectives | Aug 29, 2023 |
Chapter 5 - Eliminating Toil | Aug 29, 2023 |
Chapter 6 - Monitoring Distributed Systems | Aug 29, 2023 |
Chapter 7 - The Evolution of Automation at Google | Aug 29, 2023 |
Chapter 8 - Release Engineering | Aug 29, 2023 |
Chapter 9 - Simplicity | Aug 29, 2023 |
Part III - Practices | Aug 29, 2023 |
Chapter 10 - Practical Alerting | Aug 29, 2023 |
Chapter 11 - Being On-Call | Aug 29, 2023 |
Chapter 12 - Effective Troubleshooting | Aug 29, 2023 |
Chapter 13 - Emergency Response | Aug 29, 2023 |
Chapter 14 - Managing Incidents | Aug 29, 2023 |
Chapter 15 - Postmortem Culture: Learning from Failure | Aug 29, 2023 |
Chapter 16 - Tracking Outages | Aug 29, 2023 |
Chapter 17 - Testing for Reliability | Aug 29, 2023 |
Chapter 18 - Software Engineering in SRE | Aug 29, 2023 |
Chapter 19 - Load Balancing at the Frontend | Aug 29, 2023 |
Chapter 20 - Load Balancing in the Datacenter | Aug 29, 2023 |
Chapter 21 - Handling Overload | Aug 28, 2023 |
Chapter 22 - Addressing Cascading Failures | Aug 28, 2023 |
Chapter 23 - Managing Critical State: Distributed Consensus for Reliability | Aug 28, 2023 |
Chapter 24 - Distributed Periodic Scheduling with Cron | Aug 28, 2023 |
Chapter 25 - Data Processing Pipelines | Aug 28, 2023 |
Chapter 26 - Data Integrity: What You Read Is What You Wrote | Aug 28, 2023 |
Chapter 27 - Reliable Product Launches at Scale | Aug 28, 2023 |
Part IV. Management | Aug 28, 2023 |
Chapter 28 - Accelerating SREs to On-Call and Beyond | Aug 28, 2023 |
Chapter 29 - Dealing with Interrupts | Aug 28, 2023 |
Chapter 30 - Embedding an SRE to Recover from Operational Overload | Aug 28, 2023 |
Chapter 31 - Communication and Collaboration in SRE | Aug 28, 2023 |
Chapter 32 - The Evolving SRE Engagement Model | Aug 28, 2023 |
Part V - Conclusions | Aug 28, 2023 |
Chapter 33 - Lessons Learned from Other Industries | Aug 28, 2023 |
Chapter 34 - Conclusion | Aug 28, 2023 |
Appendix A - Availability Table | Aug 28, 2023 |
Appendix B - A Collection of Best Practices for Production Services | Aug 28, 2023 |
Appendix C - Example Incident State Document | Aug 28, 2023 |
Appendix D - Example Postmortem | Aug 28, 2023 |
Appendix E - Launch Coordination Checklist | Aug 28, 2023 |
Appendix F - Example Production Meeting Minutes | Aug 28, 2023 |
Bibliography | Aug 28, 2023 |