Incident Workspace
Service Overview
Environment: Production
18 services simultaneously degraded
All 18 services share the same cache cluster, and all of them report the same error messages: "cache: connection refused", "cache: no primary node available", and "cache: slot assignment error". None of the services has had a deployment in the last 6 hours.
Service                 P99     Errors
delivery-service        4.2s    71%
search-service          3.8s    68%
surge-pricing           5.1s    74%
fulfillment-service     4.0s    69%
notification-service    3.6s    66%
payment-router          4.4s    72%
order-tracking          3.9s    67%
geo-service             3.5s    65%
driver-dispatch         4.7s    73%
marketplace-api         4.1s    70%
catalog-service         3.4s    64%
analytics-writer        3.3s    63%
rate-limiter            5.3s    75%
recommendation-engine   3.7s    66%
route-optimizer         4.0s    69%
session-service         4.3s    71%
event-bus               3.2s    62%
fraud-detection         3.8s    68%
Error pattern — consistent across all 18 services
[ERROR] cache: no primary node available for slot 4821
[ERROR] cache: connection refused — node 10.0.1.4:6379
[ERROR] cache: slot assignment error — cluster topology changed mid-request
[ERROR] cache: no primary node available for slot 9102
[ERROR] cache: MOVED redirect to node that is no longer primary
[ERROR] cache: connection refused — node 10.0.1.7:6379
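
To triage a larger sample of these lines, it can help to group them by error type and collect the nodes and slots they mention. The following is a minimal sketch in Python, assuming the log format shown above. The function name `summarize`, the category labels, and the regular expressions are illustrative and not part of any existing tooling.

```python
import re
from collections import Counter

# The six sample lines from the error pattern above, copied verbatim.
LOG_LINES = [
    "[ERROR] cache: no primary node available for slot 4821",
    "[ERROR] cache: connection refused — node 10.0.1.4:6379",
    "[ERROR] cache: slot assignment error — cluster topology changed mid-request",
    "[ERROR] cache: no primary node available for slot 9102",
    "[ERROR] cache: MOVED redirect to node that is no longer primary",
    "[ERROR] cache: connection refused — node 10.0.1.7:6379",
]

# Hypothetical category labels keyed by the message prefix that follows "cache: ".
PREFIXES = {
    "no primary node available": "no_primary",
    "connection refused": "connection_refused",
    "slot assignment error": "slot_assignment",
    "MOVED redirect": "stale_moved",
}

NODE_RE = re.compile(r"node (\d{1,3}(?:\.\d{1,3}){3}:\d+)")  # e.g. "node 10.0.1.4:6379"
SLOT_RE = re.compile(r"slot (\d+)")                          # e.g. "slot 4821"


def summarize(lines):
    """Count errors per category and collect the nodes and slots mentioned."""
    kinds = Counter()
    nodes = set()
    slots = set()
    for line in lines:
        # Keep only the message after the "cache: " marker; skip unrelated lines.
        _, sep, msg = line.partition("cache: ")
        if not sep:
            continue
        label = next(
            (name for prefix, name in PREFIXES.items() if msg.startswith(prefix)),
            "other",
        )
        kinds[label] += 1
        nodes.update(NODE_RE.findall(msg))
        slots.update(int(s) for s in SLOT_RE.findall(msg))
    return kinds, sorted(nodes), sorted(slots)


if __name__ == "__main__":
    kinds, nodes, slots = summarize(LOG_LINES)
    print(dict(kinds))
    print("nodes refusing connections:", nodes)
    print("slots without a primary:", slots)
```

On the sample above, this reports two refused nodes (10.0.1.4:6379 and 10.0.1.7:6379) and two slots without a primary (4821 and 9102). Running it over the full log stream from all 18 services would show whether the failures stay confined to those nodes or spread across the cluster.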