ops, dev Brian Conn

Key Alerting Metrics

Good alerting is critical to operating a SaaS (or any other software) platform. Good alerts are timely, actionable, understandable, and correct. In this context, correct means minimizing false positives (alerting when there is not an issue) and false negatives (not firing when there is an issue).

Key Alerting Metrics are four metrics for monitoring a whole system, subsystem, or microservice. They are based on customer pain, and all production engineers can understand them.
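To make "correct" concrete, here is a minimal sketch that counts false positives and false negatives over a set of observation intervals. The `Interval` record and the `alert_correctness` function are hypothetical names chosen for illustration, and the sketch assumes that each interval can be labeled after the fact as a real issue or not.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    """One observation window (hypothetical record for this sketch)."""
    fired: bool     # did the alert fire during this window?
    incident: bool  # was there a real issue during this window?


def alert_correctness(intervals):
    """Count alert outcomes across the given intervals."""
    true_positives = sum(1 for i in intervals if i.fired and i.incident)
    # False positive: the alert fired when there was no issue.
    false_positives = sum(1 for i in intervals if i.fired and not i.incident)
    # False negative: the alert stayed silent during a real issue.
    false_negatives = sum(1 for i in intervals if not i.fired and i.incident)
    return {
        "true_positives": true_positives,
        "false_positives": false_positives,
        "false_negatives": false_negatives,
    }


if __name__ == "__main__":
    history = [
        Interval(fired=True, incident=True),
        Interval(fired=True, incident=False),
        Interval(fired=False, incident=True),
        Interval(fired=False, incident=False),
    ]
    print(alert_correctness(history))
```

A correct alert, in the sense used above, keeps both the false-positive and false-negative counts low.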

ops Brian Conn

System Impact and Mitigation

The goal of incident response is to minimize the total impact on customers over time through mitigation and root cause resolution. For example, a high-impact, short-duration incident (a five-minute total outage) can hurt a customer as much as a low-impact, long-duration incident (slowness for a full day).

A key component of SaaS incident response is to mitigate the incident, if possible, to lessen the immediate impact on the customer and buy the team time to resolve the issue permanently.
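The idea of total impact over time can be sketched as severity multiplied by duration, summed across the phases of an incident. This is an illustrative model only: the `total_impact` function, the 0-to-1 severity scale, and all the numbers below are assumptions, not measurements from the post.

```python
def total_impact(phases):
    """Sum severity * minutes over incident phases.

    phases: list of (severity, minutes) pairs, where severity is an
    assumed fraction from 0.0 (no customer impact) to 1.0 (total outage).
    """
    return sum(severity * minutes for severity, minutes in phases)


if __name__ == "__main__":
    # Hypothetical incident: a total outage that lasts 60 minutes
    # because the team goes straight to the permanent fix.
    unmitigated = total_impact([(1.0, 60)])

    # Same incident, but mitigated after 10 minutes so customers see
    # only degraded service (assumed severity 0.25) for the remaining 50.
    mitigated = total_impact([(1.0, 10), (0.25, 50)])

    print(unmitigated, mitigated)
```

Under these assumed weights, mitigation cuts the accumulated impact even though the root cause takes just as long to resolve.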

culture, agile Brian Conn

Product Delivery Team

For SaaS companies, the Product Delivery Team comprises everyone involved in building and operating the product. All of its sub-teams share a common goal: continuously deliver customer value.

agile Brian Conn

Jira Ticket Hierarchy

Standardizing a Jira Ticket Hierarchy allows all producers of feature requirements and implementation details (product management, engineering leads, and engineers) and consumers of those requirements (support, SRE, docs, marketing) to collaborate on the right level of detail for their job roles.
