Incident Workspace
Service Metrics
Environment
Production
Error Rate %
6%
avg · 1h
P95 Latency ms
2.0s
avg · 1h
Inbound Request Volume req/min
1.9k
avg · 1h
Success Rate %
90%
avg · 1h
payment-service
Incident started 18 minutes ago
Open senioreng.dev on your laptop for the full experience.
Incident Workspace
Environment
Production
Error Rate %
6%
avg · 1h
P95 Latency ms
2.0s
avg · 1h
Inbound Request Volume req/min
1.9k
avg · 1h
Success Rate %
90%
avg · 1h
Incident started 18 minutes ago
Production Incident
Incident Commander Update
Checkout failures are increasing rapidly across multiple customer-facing services.
The payment-service is experiencing a severe outage. Error rates have climbed to 28%, latency has increased from milliseconds to several seconds, and order processing is failing across the platform.
Multiple upstream services are reporting degraded health, request queues are backing up, and customers are unable to complete purchases successfully.
You are the primary on-call engineer. Investigate the available telemetry, identify what triggered the incident, determine the true root cause, and restore service stability before the outage spreads further.