SLO basics

Topic: Monitoring basics

Summary

A Service Level Objective (SLO) is a reliability target for a service, usually expressed as availability or latency, for example 99.9 percent uptime or p99 latency under 500 ms. SLOs drive alerting and capacity planning. Use them when you need to formalize reliability targets.

Intent: How-to

Quick answer

  • Choose an indicator (availability or latency) and set a target, for example 99.9 percent availability or p99 latency under 500 ms. Measure over a rolling window, typically 30 days.
  • Alert on error budget burn rate, that is, when the SLO is at risk. Do not alert on every SLO breach; use budget-based alerting instead.
  • Review the SLO with product and engineering. Adjust targets and the error budget policy as needed, and document both in the runbook.
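To make the example target concrete, here is a minimal sketch that converts a time-based availability target and window into allowed downtime. The function name `allowed_downtime_minutes` is illustrative and not taken from any library.

```python
def allowed_downtime_minutes(target_percent: float, window_days: int = 30) -> float:
    """Return the downtime, in minutes, that an availability target permits."""
    if not 0 < target_percent <= 100:
        raise ValueError("target_percent must be in (0, 100]")
    window_minutes = window_days * 24 * 60
    return window_minutes * (100 - target_percent) / 100


budget = allowed_downtime_minutes(99.9, window_days=30)
print(f"{budget:.1f} minutes")  # 43.2 minutes
```

A 99.9 percent target over 30 days therefore leaves roughly 43 minutes of downtime before the objective is missed.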

Prerequisites

Steps

  1. Choose indicator and target

    Pick availability or latency as the indicator. Set the target and the measurement window. Get agreement from stakeholders.

  2. Measure and budget

    Implement measurement of the indicator. Compute the error budget, which is the share of failures the target allows over the window. Define burn-rate alerting.

  3. Review

    Review the SLO and error budget consumption regularly. Adjust targets or capacity as needed. Document the outcome.

Summary

Define an SLO (availability or latency), measure it and track the error budget, alert on burn rate, and review regularly.

Steps

Step 1: Choose indicator and target

Pick the indicator (availability or latency), set a target and a measurement window, and get stakeholder agreement on both.
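As a rough illustration of this step, the sketch below records an SLO as data and computes the two indicators from raw measurements. The `SLO` class, the function names, and the sample numbers are all hypothetical. The latency indicator treats "p99 under 500 ms" as "at least 99 percent of requests finish within 500 ms", which is an approximation chosen for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SLO:
    name: str
    target: float      # required fraction of good events, e.g. 0.999
    window_days: int   # length of the rolling measurement window


def availability_sli(good_requests: int, total_requests: int) -> float:
    """Fraction of requests that succeeded; 1.0 when there was no traffic."""
    if total_requests == 0:
        return 1.0
    return good_requests / total_requests


def latency_sli(latencies_ms: list[float], threshold_ms: float) -> float:
    """Fraction of requests that completed within threshold_ms."""
    if not latencies_ms:
        return 1.0
    fast = sum(1 for value in latencies_ms if value <= threshold_ms)
    return fast / len(latencies_ms)


# Hypothetical objectives matching the examples in this article.
availability_slo = SLO(name="api-availability", target=0.999, window_days=30)
latency_slo = SLO(name="api-latency-500ms", target=0.99, window_days=30)

sli = availability_sli(good_requests=999_500, total_requests=1_000_000)
print(sli >= availability_slo.target)  # True: 0.9995 meets the 0.999 target
```

Keeping the target and window in one record makes it easier to show stakeholders exactly what they are agreeing to.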

Step 2: Measure and budget

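Implement measurement of the indicator, derive the error budget from the target (a 99.9 percent target leaves a 0.1 percent budget), and set up burn-rate alerts.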
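The following sketch shows one way the budget and burn-rate calculations could look for a request-based SLO. All function names and request counts are hypothetical. The default paging threshold of 14.4 is an assumption: it corresponds to spending 2 percent of a 30-day budget in one hour, a commonly cited starting point that should be tuned per service.

```python
def error_budget_remaining(target: float, good: int, total: int) -> float:
    """Fraction of the error budget still unspent; negative once the SLO is missed."""
    if not 0 < target < 1:
        raise ValueError("target must be strictly between 0 and 1")
    if total == 0:
        return 1.0
    allowed_bad = (1 - target) * total
    return 1 - (total - good) / allowed_bad


def burn_rate(target: float, good: int, total: int) -> float:
    """Observed error rate divided by the error rate the SLO allows.

    A value of 1.0 means the budget would last exactly one SLO window.
    """
    if not 0 < target < 1:
        raise ValueError("target must be strictly between 0 and 1")
    if total == 0:
        return 0.0
    return ((total - good) / total) / (1 - target)


def should_page(long_rate: float, short_rate: float, threshold: float = 14.4) -> bool:
    """Page only when both the long and the short window exceed the threshold."""
    return long_rate >= threshold and short_rate >= threshold


# Hypothetical counts: the window so far, the last hour, the last five minutes.
print(f"{error_budget_remaining(0.999, good=999_600, total=1_000_000):.2f}")  # 0.60
hour_rate = burn_rate(0.999, good=98_000, total=100_000)
five_minute_rate = burn_rate(0.999, good=8_200, total=8_400)
print(should_page(hour_rate, five_minute_rate))  # True
```

Requiring both windows to exceed the threshold keeps a brief spike from paging anyone, while a sustained burn still does.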

Step 3: Review

Review the SLO and its budget consumption, adjust the target or capacity where needed, and document the changes.

Verification

  • The SLO is tracked, alerts fire on error budget burn, and the runbook is updated.

Troubleshooting

  • Always in budget: the target may be too loose. Tighten it or add a latency SLO.
  • Always over budget: improve reliability or relax the target.
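The two cases above can be detected from review history. This is a minimal sketch, assuming you record the fraction of error budget left at the end of each window. The `diagnose` function and its `slack` cutoff of 0.5 are hypothetical choices, not an established rule.

```python
def diagnose(budget_remaining_by_window: list[float], slack: float = 0.5) -> str:
    """Classify recent SLO windows by the budget fraction left at window end.

    1.0 means the budget was untouched; a negative value means the SLO was missed.
    """
    if not budget_remaining_by_window:
        return "no data"
    if all(remaining >= slack for remaining in budget_remaining_by_window):
        return "always in budget: tighten the target or add a latency SLO"
    if all(remaining < 0 for remaining in budget_remaining_by_window):
        return "always over: improve reliability or relax the target"
    return "mixed: keep the current target and keep reviewing"


print(diagnose([0.9, 0.8, 0.95]))
print(diagnose([-0.2, -1.5, -0.1]))
```

Looking at several windows rather than one avoids changing the target in response to a single unusual month.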

Next steps

Continue to