Notification and alert routing

Topic: Monitoring basics

Summary

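Route each alert to the people or channel responsible for it, based on service, severity, or time of day. Combine routing rules with an escalation chain so that unacknowledged alerts still reach someone. This approach is most useful when several teams or services share one alerting system.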

Intent: How-to

Quick answer

  • Route by label or tag, such as service, team, or severity. For example, send critical alerts to PagerDuty and warnings to Slack. Define these as routing rules in your alert manager.
  • Escalate to the next tier or to a manager if an alert is not acknowledged. Use different channels for different severities.
  • Test each route and document who receives what. Avoid routing every alert to everyone.

Prerequisites

Steps

  1. Define routes

    Map alert labels to receivers. For example, service=api goes to api-team, severity=critical goes to PagerDuty, and severity=warning goes to Slack.
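
The label-to-receiver mapping can be sketched as an ordered table in which the first matching rule wins. This is a minimal illustration in Python, not the configuration syntax of any particular alert manager. The names `Route`, `pick_receiver`, and `default-inbox` are hypothetical, and the rule order shown (critical first) is an assumption about how overlapping rules should be resolved.

```python
from dataclasses import dataclass


@dataclass
class Route:
    """One routing rule: every label in `match` must equal the alert's label."""
    match: dict
    receiver: str


# Order matters: an alert goes to the first route it matches.
ROUTES = [
    Route({"severity": "critical"}, "pagerduty"),
    Route({"service": "api"}, "api-team"),
    Route({"severity": "warning"}, "slack"),
]

DEFAULT_RECEIVER = "default-inbox"  # hypothetical catch-all for unmatched alerts


def pick_receiver(labels, routes=ROUTES, default=DEFAULT_RECEIVER):
    """Return the receiver of the first route whose labels all match."""
    for route in routes:
        if all(labels.get(key) == value for key, value in route.match.items()):
            return route.receiver
    return default


print(pick_receiver({"service": "api", "severity": "critical"}))  # pagerduty
print(pick_receiver({"service": "api", "severity": "warning"}))   # api-team
print(pick_receiver({"service": "db", "severity": "warning"}))    # slack
```

Because a critical alert from the api service matches two rules, the order of the table decides who is paged. Placing the severity rule first is one reasonable choice; your alert manager may also let an alert continue to several receivers.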

  2. Escalation

    Set an escalation chain: primary, then secondary, then manager. Give each level its own timeout, after which an unacknowledged alert moves to the next level.
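
An escalation chain with per-level timeouts can be modelled as a list that is walked for as long as the alert stays unacknowledged. The sketch below is illustrative only: the `Level` class, the `notified_so_far` function, the target names, and the 5, 10, and 15 minute timeouts are all assumptions, not values from any specific paging tool.

```python
from dataclasses import dataclass


@dataclass
class Level:
    target: str           # who is notified at this tier
    timeout_minutes: int  # how long to wait for an ack before escalating


# Hypothetical chain; tune targets and timeouts to your own on-call policy.
CHAIN = [
    Level("primary-oncall", 5),
    Level("secondary-oncall", 10),
    Level("engineering-manager", 15),
]


def notified_so_far(minutes_unacked, chain=CHAIN):
    """Return the targets that should have been notified by now."""
    notified = []
    threshold = 0  # minute at which the current level is notified
    for level in chain:
        if minutes_unacked < threshold:
            break
        notified.append(level.target)
        threshold += level.timeout_minutes
    return notified


print(notified_so_far(4))   # ['primary-oncall']
print(notified_so_far(5))   # ['primary-oncall', 'secondary-oncall']
print(notified_so_far(15))  # all three levels
```

In this model the timeouts are cumulative: the secondary is notified after 5 minutes and the manager after 15. The timeout on the last level has no effect here because nothing follows it.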

  3. Test and document

    Fire a test alert for each route. Document the routing table, then review and update it when teams change.
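
One way to keep tests and documentation in step is to derive both from the same routing table. The sketch below assumes a hypothetical table of (labels, receiver) pairs with first-match semantics. It checks one synthetic alert per route and renders the table as plain-text lines. It exercises only the table logic, so a real test should still send an alert through your alerting system end to end.

```python
# Hypothetical routing table: (labels that must match, receiver). First match wins.
ROUTES = [
    ({"severity": "critical"}, "pagerduty"),
    ({"service": "api"}, "api-team"),
    ({"severity": "warning"}, "slack"),
]

# One synthetic alert per route, paired with the receiver it should reach.
TEST_ALERTS = [
    ({"service": "db", "severity": "critical"}, "pagerduty"),
    ({"service": "api", "severity": "info"}, "api-team"),
    ({"service": "db", "severity": "warning"}, "slack"),
]


def resolve(labels):
    """Return the receiver of the first route whose labels all match, else None."""
    for match, receiver in ROUTES:
        if all(labels.get(key) == value for key, value in match.items()):
            return receiver
    return None


def failed_routes():
    """Return (labels, expected, actual) for every test alert that misroutes."""
    return [
        (labels, expected, resolve(labels))
        for labels, expected in TEST_ALERTS
        if resolve(labels) != expected
    ]


def routing_table_lines():
    """Render the routing table as plain-text lines for the team's docs."""
    lines = []
    for match, receiver in ROUTES:
        condition = ", ".join(f"{k}={v}" for k, v in sorted(match.items()))
        lines.append(f"{condition} -> {receiver}")
    return lines


print(failed_routes())  # [] when every test alert reaches its expected receiver
print("\n".join(routing_table_lines()))
```

Regenerating the documented table from the same data whenever teams change avoids the written table drifting away from the rules that are actually in force.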

Verification

  • Each team receives the alerts intended for it, and unacknowledged alerts escalate through the chain as configured.

Troubleshooting

  • Wrong recipient: fix the routing rule or the alert's labels.
  • No escalation: check the timeouts and the escalation chain.
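
When an alert reaches the wrong recipient, it helps to see which rule it matched and why earlier rules were skipped. The sketch below is a hypothetical debugging helper for a first-match routing table. The table contents and the `explain` function are assumptions for illustration, not part of any alert manager's API.

```python
# Hypothetical routing table: (labels that must match, receiver). First match wins.
ROUTES = [
    ({"severity": "critical"}, "pagerduty"),
    ({"service": "api"}, "api-team"),
    ({"severity": "warning"}, "slack"),
]


def explain(labels):
    """For each route in order, report the labels that stopped it from matching.

    Each entry is (receiver, mismatches), where mismatches maps a label name
    to (wanted value, actual value). An empty dict means the route matched.
    """
    report = []
    for match, receiver in ROUTES:
        mismatches = {
            key: (wanted, labels.get(key))
            for key, wanted in match.items()
            if labels.get(key) != wanted
        }
        report.append((receiver, mismatches))
        if not mismatches:
            break  # first match wins; later routes are never consulted
    return report


# An alert whose severity label was misspelled as "sev" skips the pagerduty route.
for receiver, mismatches in explain({"service": "api", "sev": "critical"}):
    print(receiver, mismatches)
```

Here the report shows that the pagerduty route wanted severity=critical but found no severity label, so the alert fell through to api-team. That points at the labels, not the rule, as the thing to fix.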

Next steps

Continue to