Sunday, August 22, 2021

Making Your On-Call and Incident Management Program Stick

Part of making incident management "work" at your company is establishing a shared vocabulary and a set of rituals. These cultural artifacts let your team reach the next level of proficiency in the practice, where they can convey complex ideas and status to each other with brevity. Achieving this consistently across teams and functions at your company is very difficult work, so if you’ve made it this far: congratulations!

To get here, you’ve probably presented material like the severity scale, tools, and retro format to a large group. You’ve answered a thousand "what if" questions about severity alone, along with questions about a dozen of the other tools and practices you established. Your organization gets it. When an incident happens, engineers pass issues between teams quickly and easily. They declare a cryptic severity to a crowd of onlookers who know what it means, and they close incidents with confidence. Your efforts have been rewarded! Chaos is contained. But there’s a serious problem on the horizon: new engineers are starting all the time.
