Interview Framework

Back-of-Envelope Estimation

Quick calculations to determine system scale: storage requirements, bandwidth, QPS, server count. These numbers drive design decisions: whether you need sharding, caching, or a CDN, and how many servers are required.

Memory anchor

Back-of-envelope = napkin math at a restaurant. 'We need 100M users x 1KB each = 100GB. That fits on one SSD!' It's not about precision -- it's about knowing if you need a bicycle or a freight train.

Expected depth

Key numbers to memorize: latency (L1 cache 0.5ns, RAM 100ns, SSD 100μs, network same DC 500μs, cross-region 150ms), throughput (1Gbps network, SSD 500MB/s), storage units (1KB text, 100KB image, 4MB video segment). Approach: start with users (100M DAU), derive requests (10% active at peak = 10M users, 10 requests/user/hour = 100M req/hour = 28K QPS), derive storage (100M users × 1KB profile = 100GB), derive bandwidth (28K QPS × 10KB avg response = 280MB/s).
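The derivation chain above (users → requests → storage → bandwidth) can be sketched as napkin math in code; all inputs are the example numbers from the paragraph (100M DAU, 10% peak-active, 10 requests/user/hour, 1KB profile, 10KB average response):

```python
# Traffic: start with users, derive peak request rate.
DAU = 100_000_000
peak_active = DAU * 0.10                 # 10M users active at peak
req_per_hour = peak_active * 10          # 10 requests/user/hour -> 100M req/hour
qps = req_per_hour / 3600                # ~28K QPS

# Storage: entities x size per entity.
profile_bytes = 1_000                    # 1KB per user profile
storage_gb = DAU * profile_bytes / 1e9   # 100 GB

# Bandwidth: request rate x average payload.
avg_response_bytes = 10_000              # 10KB average response
bandwidth_mb_s = qps * avg_response_bytes / 1e6   # ~280 MB/s

print(f"{qps:,.0f} QPS, {storage_gb:,.0f} GB, {bandwidth_mb_s:,.0f} MB/s")
```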

Deep — senior internals

Common estimation mistakes: not distinguishing peak from average (design for peak: typically 3–5x average), forgetting replication factor (3 replicas = 3x storage), ignoring metadata overhead (indexes, logs, backups add 30–50% to raw data storage). Essential formulas: QPS = DAU × requests_per_user_per_day / 86400. Storage growth: data_per_day × days_retention × replication_factor. Read throughput: peak_QPS × avg_response_size. Server count: (QPS × avg_latency) / (cores_per_server × utilization_target). A single server handles ~1,000–10,000 QPS for CPU-bound work, more for I/O-bound work with async handling. Bandwidth: a 1Gbps NIC delivers ~100MB/s of effective TCP throughput (protocol overhead), so a video streaming server at 1Mbps per stream serves roughly 800–1,000 concurrent viewers per Gbps NIC.
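The formulas above can be turned into small helpers. A minimal sketch; the default values (3x peak factor, 3x replication, 40% metadata overhead, 16 cores at 50% target utilization) are illustrative assumptions within the ranges the paragraph gives, not fixed constants:

```python
import math

def peak_qps(dau, req_per_user_per_day, peak_factor=3):
    """QPS = DAU x requests/day / 86400, scaled for peak (3-5x average)."""
    return dau * req_per_user_per_day / 86_400 * peak_factor

def storage_bytes(data_per_day, days_retention, replication=3, overhead=1.4):
    """Storage growth, including replication and 30-50% metadata overhead."""
    return data_per_day * days_retention * replication * overhead

def server_count(qps, avg_latency_s, cores_per_server=16, utilization=0.5):
    """Little's law: concurrent requests = QPS x latency; divide by usable cores."""
    return math.ceil(qps * avg_latency_s / (cores_per_server * utilization))

# Example: 100M DAU, 10 requests/user/day, 50ms average latency.
qps = peak_qps(100e6, 10)          # ~35K peak QPS
servers = server_count(qps, 0.05)
```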

🎤 Interview-ready answer

I structure estimations in three buckets: (1) Traffic: QPS = DAU × actions/day / 86400, scale for 3x peak. (2) Storage: entities × size_per_entity × replication × retention. (3) Bandwidth: peak QPS × average payload size. Then I translate those to component requirements: if I need > 100K QPS, I'll need sharding or a distributed cache. If storage is > 10TB, I need distributed storage. I keep math simple and round aggressively — the goal is an order of magnitude, not an exact number.
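The three-bucket checklist above can be expressed as one function that maps inputs to design signals. The thresholds (100K QPS, 10TB) are the rules of thumb stated in the answer; the function name and defaults are illustrative, not a standard API:

```python
def design_signals(dau, actions_per_day, entity_bytes,
                   replication=3, avg_payload_bytes=10_000, peak_factor=3):
    """Three buckets: (1) traffic, (2) storage, (3) bandwidth -> design decisions."""
    qps = dau * actions_per_day / 86_400 * peak_factor        # (1) traffic
    storage_tb = dau * entity_bytes * replication / 1e12      # (2) storage
    bandwidth_mb_s = qps * avg_payload_bytes / 1e6            # (3) bandwidth
    return {
        "peak_qps": qps,
        "storage_tb": storage_tb,
        "bandwidth_mb_s": bandwidth_mb_s,
        "needs_sharding_or_cache": qps > 100_000,     # rule of thumb from above
        "needs_distributed_storage": storage_tb > 10, # rule of thumb from above
    }

# Example: 100M DAU, 10 actions/day, 1KB per entity.
signals = design_signals(100e6, 10, 1_000)
```

Rounding aggressively is deliberate: the output is an order-of-magnitude signal ("fits on one box" vs "needs sharding"), not a capacity plan.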

Common trap

Skipping estimation and going straight to architecture. Numbers drive decisions: a system serving 100 QPS has completely different constraints than one serving 1M QPS. Without estimation, you might design a distributed system for a workload that fits on one server, or a single-server system that will fall over under real load.