Notification and alert routing
Topic: Monitoring basics
Summary
Route alerts to the right people or channels by service, severity, or time. Define routing rules and an escalation chain. This is most useful when several teams or services share one alerting setup.
Intent: How-to
Quick answer
- Route by label or tag, such as service, team, or severity. For example, send critical alerts to PagerDuty and warnings to Slack. Define these as routing rules in your alert manager.
- Escalate to the next tier or to a manager if an alert is not acknowledged. Use different channels for different severities.
- Test the routing and document who gets what. Avoid routing everything to everyone.
Prerequisites
Steps
- Define routes: Map each label to a receiver. For example, service=api goes to api-team, severity=critical goes to PagerDuty, and severity=warning goes to Slack.
- Escalation: Set an escalation chain of primary, then secondary, then manager, with a timeout for each level.
- Test and document: Fire a test alert for each route. Document the routing table. Review and update it when teams change.
Summary
Route by service, team, or severity; set escalation; test and document.
Prerequisites
Steps
Step 1: Define routes
Map labels to receivers, and send critical and warning alerts to separate channels.
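The label-to-receiver mapping can be sketched in a few lines of Python. This is an illustration only, not the configuration syntax of any particular alert manager. The names Route, ROUTES, DEFAULT_RECEIVER, and receivers_for, and the "ops-catchall" fallback receiver, are assumptions made up for the example. It also assumes that an alert is delivered to every route it matches; some tools stop at the first match instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """One routing rule: every matcher must equal the alert's label value."""
    matchers: dict
    receiver: str


# Hypothetical routing table built from the mappings named in this article.
ROUTES = [
    Route({"service": "api"}, "api-team"),
    Route({"severity": "critical"}, "pagerduty"),
    Route({"severity": "warning"}, "slack"),
]

# Assumed fallback so that an unmatched alert is never silently dropped.
DEFAULT_RECEIVER = "ops-catchall"


def receivers_for(labels):
    """Return every receiver whose route matches the alert's labels."""
    matched = [
        route.receiver
        for route in ROUTES
        if all(labels.get(key) == value for key, value in route.matchers.items())
    ]
    return matched or [DEFAULT_RECEIVER]
```

With this table, an alert labeled service=api and severity=critical resolves to both api-team and pagerduty, while an alert with no matching labels falls through to the catch-all receiver.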
Step 2: Escalation
Define the escalation chain and a timeout for each level.
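An escalation chain with per-level timeouts comes down to a small calculation: given how long an alert has gone unacknowledged, which level should currently be notified? The sketch below assumes hypothetical timeouts of 5, 10, and 15 minutes; the names Level, CHAIN, and current_level are made up for the example and do not come from any specific paging tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    name: str
    timeout_min: int  # minutes to wait for an ack before escalating further


# Hypothetical chain: primary, then secondary, then manager.
CHAIN = [Level("primary", 5), Level("secondary", 10), Level("manager", 15)]


def current_level(minutes_unacked, chain=CHAIN):
    """Return the level that should hold the alert after the given wait."""
    deadline = 0
    for level in chain:
        deadline += level.timeout_min
        if minutes_unacked < deadline:
            return level.name
    # Past the last timeout the alert stays with the final level.
    return chain[-1].name
```

Here the primary holds the alert for the first 5 minutes, the secondary from minute 5 up to minute 15, and the manager from then on.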
Step 3: Test and document
Test each route, document the routing table, and review it when teams change.
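One way to keep the tests and the documentation in step is to treat the documented routing table as the test input. The sketch below is a minimal illustration under that assumption. ROUTING_TABLE, check_routes, render_table, and fake_resolve are hypothetical names; in a real setup the resolve function would fire a test alert and report where it actually arrived, and fake_resolve only stands in for that.

```python
# Documented routing table: (labels on the test alert, expected receiver).
ROUTING_TABLE = [
    ({"service": "api"}, "api-team"),
    ({"severity": "critical"}, "pagerduty"),
    ({"severity": "warning"}, "slack"),
]


def check_routes(table, resolve):
    """Return the rows where the observed receiver differs from the documented one."""
    failures = []
    for labels, expected in table:
        actual = resolve(labels)
        if actual != expected:
            failures.append((labels, expected, actual))
    return failures


def render_table(table):
    """Render the routing table as plain text for the team's documentation."""
    lines = ["labels | receiver"]
    for labels, receiver in table:
        label_text = ",".join(f"{key}={value}" for key, value in sorted(labels.items()))
        lines.append(f"{label_text} | {receiver}")
    return "\n".join(lines)


def fake_resolve(labels):
    """Stand-in for firing a real test alert and observing its destination."""
    lookup = {"api": "api-team", "critical": "pagerduty", "warning": "slack"}
    for value in labels.values():
        if value in lookup:
            return lookup[value]
    return "unrouted"
```

An empty list from check_routes means every documented route behaved as written. Any returned row names the labels, the documented receiver, and the receiver that was actually observed.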
Verification
- The right team gets the right alerts, and escalation works.
Troubleshooting
- Wrong recipient: Fix the routing rule or the alert's labels.
- No escalation: Check the timeouts and the escalation chain.