Based on real outages at GitHub, Knight Capital, CrowdStrike

Debug real production incidents. Under real time pressure.

SeniorEng puts engineers inside real incidents: live dashboards, real metrics, the clock running. Used by engineering teams for hiring signal, onboarding, and keeping skills sharp.

senioreng.dev/incidents/payment-service
Live Incident

Payment Service: Error rate 94% · 6 min ago

P0 · 06:12 elapsed
Error Rate: 94%
P99 Latency: 4,200 ms
DB CPU: 12%
Memory: 54%
Req/sec: 840

Your Move

Hint available · Score tracked

The Problem

Interviews test algorithms. The job requires something else.

LeetCode, take-homes, system design. None of them test what actually matters on the job: debugging a production incident at 2am when three services are down and your manager is pinging you.

What interviews test

  • ✕ Pattern recognition
  • ✕ Weekend CRUD builds
  • ✕ Whiteboard system design

What the job actually requires

  • ✓ Debugging live systems under pressure
  • ✓ Following signals across logs, metrics, traces
  • ✓ Forming and updating hypotheses fast

What SeniorEng tests

  • ✓ Real incidents with real dashboards
  • ✓ Time pressure and manager escalation
  • ✓ Hypothesis → action → result loops

Use Cases

How engineering teams use SeniorEng

⚡

Hiring Signal

See how candidates think under pressure, not just what they know. Use an incident as a final round; the results are unambiguous.

Replaces or augments the take-home. 20-30 min. Scored automatically.

🚀

Onboarding

Get new engineers familiar with how real incidents look and feel before they're on-call. Faster ramp, less anxiety.

New hires do 3-5 problems in week 1. Builds mental models fast.

🔥

Team Practice

Run a real incident as a team exercise. See who leads, who follows, who panics. Great for identifying skill gaps before they matter.

Works as a group exercise or async. Scores make it competitive.

The Problem Library

25 incidents. All based on real outages.

Not toy problems. Real failure modes from real companies, reconstructed with accurate metrics, realistic dashboards, and the same signals the on-call engineer had.

  • GitHub: MySQL replication failure (Hard)
  • Knight Capital: $440M in 45 minutes (Hard)
  • CrowdStrike: 8.5M Windows BSODs (Hard)
  • Cloudflare: 19 datacenters down (Hard)
  • Netflix: Down on Christmas Eve (Medium)
  • Google: YouTube, Gmail, Meet down (Medium)
  • AWS: S3 100% failure (Hard)
  • Meta: Zero external traffic (Hard)

Early Feedback

"I used it as the final round for two candidates. The difference in how they approached the incident told me everything the take-home didn't."

Engineering Manager

Fintech company

"We had a production incident two weeks after I did the disk I/O problem here. Services were degrading, all the obvious metrics looked fine. I checked disk I/O first and found it quickly. Wouldn't have known to look there before."

Rohan M.

Senior Software Engineer, Payments Platform

"I used to dread code reviews. I'd leave comments but never felt confident catching real issues. After doing a few problems here I started actually understanding what to look for. My reviews are much more useful now."

Priya S.

Software Engineer, E-commerce

"The dashboards look exactly like what we use in prod. Other simulations feel like toy problems; this doesn't."

Senior SWE

Series B startup

Get Started

Working with 10 teams for free

In exchange for a weekly 15-minute feedback call. No pitch, no sales pressure. Just show us how your team uses it.

No credit card. No commitment. We'll reach out within 24 hours.