Horizontal vs Vertical Scaling
Vertical scaling means adding more CPU/RAM/disk to a single machine. Horizontal scaling means adding more machines and distributing load across them.
Vertical scaling is like building a taller skyscraper: eventually you hit a height limit. Horizontal scaling is like building more houses across a neighborhood: you can keep expanding, but now you need roads between them.
Vertical scaling has a hard ceiling (the biggest available instance) and creates a single point of failure. Horizontal scaling is theoretically unbounded but requires stateless application design, a load balancer, and distributed coordination. Most web services scale horizontally for the application tier and vertically (then shard) for the database tier.
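To illustrate why horizontal scaling of the application tier depends on stateless servers, here is a minimal sketch. The class names (SessionStore, AppServer, RoundRobinBalancer) and the shopping-cart scenario are hypothetical, and the in-memory dictionary only stands in for an external store such as a cache or database.

```python
import itertools


class SessionStore:
    """In-memory stand-in for an external store (assumption: a real
    deployment would use a networked cache or database instead)."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        # Return a copy so callers never mutate stored state directly.
        return list(self._data.get(session_id, []))

    def put(self, session_id, cart):
        self._data[session_id] = list(cart)


class AppServer:
    """Stateless server: keeps no per-user data between requests."""

    def __init__(self, name, store):
        self.name = name
        self._store = store  # shared external state, not local state

    def add_to_cart(self, session_id, item):
        cart = self._store.get(session_id)
        cart.append(item)
        self._store.put(session_id, cart)
        return {"served_by": self.name, "cart": cart}


class RoundRobinBalancer:
    """Hands each request to the next server in turn."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("need at least one server")
        self._cycle = itertools.cycle(servers)

    def route(self, session_id, item):
        return next(self._cycle).add_to_cart(session_id, item)


store = SessionStore()
balancer = RoundRobinBalancer([AppServer(f"app-{i}", store) for i in (1, 2, 3)])

# Three requests from the same session land on three different servers.
responses = [balancer.route("session-42", item) for item in ("book", "pen", "lamp")]
for response in responses:
    print(response)
```

Each request is served by a different machine, yet the cart keeps growing correctly because no server holds the session itself. If AppServer kept the cart in a local attribute instead, the second request would see an empty cart.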
Vertical scaling is often the right first move: it's operationally simpler, avoids distributed systems complexity, and modern cloud instances (e.g., AWS r6i.32xlarge with 128 vCPUs and 1,024 GiB of RAM) can handle enormous workloads. Horizontal scaling introduces network overhead, distributed consistency problems, and operational complexity. The right architecture usually combines both: scale stateless app servers horizontally, and scale databases vertically until you must shard. For stateful services, horizontal scaling requires solving data locality (routing each request to the node that holds its data, e.g., with consistent hashing or sticky sessions) and coordination (distributed locks, leader election).
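Consistent hashing, mentioned above as one way to handle data locality, can be sketched as follows. This is a simplified illustration rather than a production implementation: the HashRing class, the node names, the choice of MD5 as the hash function, and the default of 100 virtual nodes per server are all assumptions made for the example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # MD5 is used only to spread keys evenly, not for security.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Maps keys to nodes so that adding or removing a node only
    moves the keys adjacent to that node's points on the ring."""

    def __init__(self, nodes, vnodes=100):
        self._vnodes = vnodes  # virtual nodes per server smooth the distribution
        self._hashes = []      # sorted hash points on the ring
        self._owner = {}       # hash point -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self._vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._hashes, point)

    def remove(self, node):
        self._hashes = [h for h in self._hashes if self._owner[h] != node]
        self._owner = {h: n for h, n in self._owner.items() if n != node}

    def lookup(self, key):
        if not self._hashes:
            raise LookupError("ring is empty")
        # First point clockwise from the key's hash, wrapping around at the end.
        idx = bisect.bisect_right(self._hashes, _hash(key)) % len(self._hashes)
        return self._owner[self._hashes[idx]]


ring = HashRing(["db-1", "db-2", "db-3"])
keys = [f"user:{i}" for i in range(1000)]

before = {k: ring.lookup(k) for k in keys}
ring.remove("db-3")
after = {k: ring.lookup(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved after removing db-3")
```

Only keys that lived on the removed node are reassigned; everything else stays where it was. With a naive `hash(key) % node_count` scheme, by contrast, changing the node count would remap most keys.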
I'd start with vertical scaling for simplicity, then move to horizontal when approaching resource limits. For stateless app servers, horizontal scaling behind a load balancer is straightforward. For databases, I'd scale vertically first, then add read replicas for read-heavy workloads, and finally shard writes when single-node write throughput is the bottleneck. The key enabler for horizontal scaling is stateless services: all session and application state must live in external stores, so any server can handle any request.
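The read-replica step in that progression might look like the sketch below. The Node and ReplicatedDatabase classes are hypothetical, and the sketch copies every write to the replicas immediately. Real replication is usually asynchronous, so replica reads can briefly return stale data.

```python
import itertools


class Node:
    """Hypothetical database node holding a simple key-value table."""

    def __init__(self, name):
        self.name = name
        self.rows = {}


class ReplicatedDatabase:
    """Routes writes to the primary and spreads reads across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)
        # Fall back to the primary for reads if there are no replicas.
        self._read_cycle = itertools.cycle(self.replicas or [self.primary])

    def write(self, key, value):
        self.primary.rows[key] = value
        # Simplifying assumption: replication is instantaneous here.
        for replica in self.replicas:
            replica.rows[key] = value
        return self.primary.name

    def read(self, key):
        node = next(self._read_cycle)
        return node.name, node.rows.get(key)


db = ReplicatedDatabase(Node("primary"), [Node("replica-1"), Node("replica-2")])
written_to = db.write("user:1", "Ada")
reads = [db.read("user:1") for _ in range(3)]
print(written_to, reads)
```

Adding replicas raises read capacity, but every write still goes through the single primary. That is why sharding becomes necessary once write throughput, rather than read throughput, is the limit.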
A common mistake is assuming horizontal scaling is always better. It's operationally more complex and introduces distributed systems problems. Many systems run perfectly well on a single well-provisioned machine for years.