RED and USE metrics
Topic: Monitoring basics
Summary
RED (Rate, Errors, Duration) covers request-driven services. USE (Utilization, Saturation, Errors) covers resources such as CPU, disk, memory, and network. Use the two checklists to decide what to measure and what to alert on.
Intent: How-to
Quick answer
- RED: request Rate, Error rate, and Duration (latency). Use it for HTTP and RPC services. Alert on a drop in rate, a spike in errors, or high latency.
- USE: Utilization, Saturation, and Errors. Use it for CPU, disk, memory, and network. Alert when utilization or saturation is high, or when errors are non-zero.
- Implement RED per service and USE per resource, and build dashboards and alerts from those metrics. This keeps coverage simple to reason about.
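The RED alert conditions above can be sketched as a small check. This is only an illustration: the `RedSnapshot` type, the `red_alerts` function, and every threshold value are assumptions made for the example, not recommended defaults.

```python
from dataclasses import dataclass


@dataclass
class RedSnapshot:
    """Hypothetical RED reading for one service over one time window."""
    rate_per_s: float      # requests per second
    error_ratio: float     # failed requests / total requests, 0.0 to 1.0
    p95_latency_s: float   # 95th percentile duration in seconds


def red_alerts(current, baseline_rate_per_s, *,
               min_rate_fraction=0.5,    # assumed: alert below 50% of baseline
               max_error_ratio=0.05,     # assumed: alert above 5% errors
               max_p95_latency_s=1.0):   # assumed: alert above 1 s at p95
    """Return the names of the RED conditions that are currently breached."""
    alerts = []
    if current.rate_per_s < baseline_rate_per_s * min_rate_fraction:
        alerts.append("rate_drop")
    if current.error_ratio > max_error_ratio:
        alerts.append("error_spike")
    if current.p95_latency_s > max_p95_latency_s:
        alerts.append("latency_high")
    return alerts
```

For example, `red_alerts(RedSnapshot(10, 0.10, 0.2), baseline_rate_per_s=100)` flags the rate drop and the error spike but not latency.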
Prerequisites
Steps
Step 1: RED for services
Instrument rate, errors, and duration per service or per endpoint. Add them to dashboards and alerts.
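As a rough sketch of what the three RED numbers mean, the snippet below derives them from a list of request records for one window. The `Request` record, the choice to count status codes of 500 and above as errors, and the nearest-rank p95 are all assumptions for illustration. A real service would normally emit these through its metrics library instead of computing them by hand.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical record of one handled request."""
    duration_s: float
    status: int  # HTTP-style status code


def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def red_metrics(requests, window_s):
    """Compute rate, error ratio, and p95 duration for one window."""
    total = len(requests)
    if total == 0:
        return {"rate_per_s": 0.0, "error_ratio": 0.0, "p95_duration_s": None}
    # Assumption: only server-side failures (5xx) count as errors.
    errors = sum(1 for r in requests if r.status >= 500)
    durations = [r.duration_s for r in requests]
    return {
        "rate_per_s": total / window_s,
        "error_ratio": errors / total,
        "p95_duration_s": nearest_rank_percentile(durations, 95),
    }
```

Calling `red_metrics` once per service or per endpoint gives the per-service and per-endpoint breakdown described in this step.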
Step 2: USE for resources
Measure utilization, saturation, and errors for CPU, disk, memory, and network. Add each to a dashboard and an alert.
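The USE measurements can be sketched the same way. The snippet assumes each resource is sampled periodically into a hypothetical `ResourceSample` (busy fraction, queue length, error count), treats average queue length as the saturation signal, and uses placeholder thresholds. What counts as saturation differs per resource, so take this as one possible shape, not a definition.

```python
from dataclasses import dataclass


@dataclass
class ResourceSample:
    """Hypothetical periodic sample for one resource (CPU, disk, ...)."""
    busy_fraction: float  # share of the interval the resource was busy, 0.0 to 1.0
    queue_length: int     # work waiting for the resource
    error_count: int      # errors reported during the interval


def use_metrics(samples):
    """Summarize utilization, saturation, and errors over a non-empty window."""
    if not samples:
        raise ValueError("need at least one sample")
    n = len(samples)
    return {
        "utilization": sum(s.busy_fraction for s in samples) / n,
        "saturation": sum(s.queue_length for s in samples) / n,
        "errors": sum(s.error_count for s in samples),
    }


def use_alerts(metrics, *, max_utilization=0.8, max_saturation=1.0):
    """Return breached USE conditions; both thresholds are assumed values."""
    alerts = []
    if metrics["utilization"] > max_utilization:
        alerts.append("utilization_high")
    if metrics["saturation"] > max_saturation:
        alerts.append("saturation_high")
    if metrics["errors"] > 0:
        alerts.append("errors_nonzero")
    return alerts
```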
Step 3: Review coverage
Check that each service has RED metrics and each resource has USE metrics. Fill any gaps.
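A coverage review can be as simple as comparing an inventory against the required metric kinds. The sketch below assumes a hand-maintained inventory that maps each service or resource name to the metric kinds it already exposes. The names `coverage_gaps`, `RED`, and `USE`, and the example inventory, are hypothetical.

```python
RED = {"rate", "errors", "duration"}
USE = {"utilization", "saturation", "errors"}


def coverage_gaps(inventory, required):
    """Map each name to the sorted list of required metric kinds it lacks.

    Entries that already have every required kind are left out of the result.
    """
    gaps = {}
    for name, present in inventory.items():
        missing = required - set(present)
        if missing:
            gaps[name] = sorted(missing)
    return gaps


# Example inventory (made-up service and resource names).
services = {
    "checkout": {"rate", "errors"},
    "search": {"rate", "errors", "duration"},
}
resources = {"db-disk": {"utilization"}}

service_gaps = coverage_gaps(services, RED)
resource_gaps = coverage_gaps(resources, USE)
```

Here `service_gaps` reports that `checkout` lacks `duration`, and `resource_gaps` reports that `db-disk` lacks `errors` and `saturation`.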
Summary
Use RED for services and USE for resources. Instrument both, alert on both, and review coverage for gaps.
Verification
- RED metrics are present for each service and USE metrics for each resource; dashboards and alerts are in place.
Troubleshooting
- Missing metric: add instrumentation, or configure scraping for the metric.
- Noisy alerts: tune thresholds.
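One common way to reduce alert noise, alongside adjusting the threshold value itself, is to require the breach to persist for several consecutive samples before firing. The helper below is a minimal sketch of that idea. The name `sustained_breach` and its parameters are made up for the example, and suitable values depend on the metric and the sampling interval.

```python
def sustained_breach(values, threshold, min_consecutive):
    """Return True if `values` exceeds `threshold` for at least
    `min_consecutive` samples in a row."""
    streak = 0
    for value in values:
        # Extend the streak on a breach, reset it otherwise.
        streak = streak + 1 if value > threshold else 0
        if streak >= min_consecutive:
            return True
    return False
```

With a threshold of 0.8 and `min_consecutive=3`, the series `[0.9, 0.2, 0.9, 0.9, 0.9]` fires, while a single isolated spike does not.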