Load Balancers: L4 vs L7
Load balancers distribute incoming traffic across multiple backend servers. L4 (transport layer) balances based on TCP/UDP without inspecting payload. L7 (application layer) understands HTTP and can route based on URL, headers, and cookies.
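The distinction shows up directly in an Nginx config sketch — the `stream` block forwards raw TCP without parsing it (L4), while the `http` block routes on request content (L7). Server addresses and ports here are illustrative, not a production setup:

```nginx
# L4: raw TCP forwarding -- Nginx never inspects the bytes
stream {
    upstream postgres_pool {
        server 10.0.1.10:5432;
        server 10.0.1.11:5432;
    }
    server {
        listen 5432;
        proxy_pass postgres_pool;
    }
}

# L7: HTTP-aware routing on the request path
http {
    upstream api_servers    { server 10.0.2.10:8080; server 10.0.2.11:8080; }
    upstream static_servers { server 10.0.3.10:8080; }

    server {
        listen 443 ssl;   # TLS terminates at the balancer
        location /api/    { proxy_pass http://api_servers; }
        location /static/ { proxy_pass http://static_servers; }
    }
}
```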
L4 load balancer = a highway traffic cop waving cars to different lanes (fast, doesn't care what's inside). L7 = a hotel concierge reading your reservation to send you to the right floor (smarter, but takes a moment to read).
L4 load balancers (AWS NLB, HAProxy in TCP mode) operate on IP addresses and ports. They are extremely fast (per-connection overhead measured in microseconds), support any TCP/UDP protocol, and typically pass TLS through to the backends rather than terminating it (though AWS NLB does offer optional TLS termination). L7 load balancers (AWS ALB, Nginx, Envoy) inspect HTTP headers and URLs, enabling path-based routing (/api to service A, /static to a CDN), host-based routing (api.example.com vs www.example.com), cookie-based sticky sessions, TLS termination, WebSocket upgrades, and HTTP/2 support. Common algorithms: round-robin, least connections, IP hash (sticky), and weighted round-robin.
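The four algorithms above can be sketched in a few lines of Python. This is a toy model — server names are made up, and a real balancer tracks live connection counts from its data path rather than a manual counter:

```python
import hashlib
from itertools import cycle


class Balancer:
    """Toy implementations of common backend-selection algorithms."""

    def __init__(self, servers, weights=None):
        self.servers = list(servers)
        self.weights = weights or {s: 1 for s in self.servers}
        self._rr = cycle(self.servers)
        # Weighted round-robin: repeat each server in proportion to its weight.
        self._wrr = cycle([s for s in self.servers
                           for _ in range(self.weights[s])])
        self.active = {s: 0 for s in self.servers}  # open-connection counts

    def round_robin(self):
        return next(self._rr)

    def weighted_round_robin(self):
        return next(self._wrr)

    def least_connections(self):
        # Prefer the backend currently serving the fewest connections.
        return min(self.servers, key=lambda s: self.active[s])

    def ip_hash(self, client_ip):
        # Same client IP always maps to the same server (sticky).
        digest = hashlib.md5(client_ip.encode()).hexdigest()
        return self.servers[int(digest, 16) % len(self.servers)]
```

Note that IP hash gives stickiness without cookies, but a change in the server count remaps most clients — which is exactly the problem consistent hashing addresses.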
L7 load balancers are the standard ingress layer for microservices — they handle service routing, auth middleware, rate limiting, and observability. Envoy Proxy is the sidecar in Istio's service mesh; Linkerd uses its own lightweight Rust proxy (linkerd2-proxy) rather than Envoy. Either way, each pod gets a sidecar that handles all inbound/outbound traffic with circuit breaking, retries, and distributed tracing built in. Health checks: an L4 check verifies that the TCP port accepts connections; an L7 check sends an HTTP request to a /health endpoint and validates the response code, so it catches application-level failures (e.g., an exhausted DB connection pool) that L4 misses. Consistent hashing: for stateful upstreams such as caching tiers, consistent hashing routes the same key to the same upstream server, maximizing cache hit rates while minimizing remapping when servers are added or removed. Connection draining: when a backend is removed from rotation, the load balancer stops sending it new connections but lets existing ones complete — critical for zero-downtime deployments.
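Consistent hashing can be sketched as a hash ring with virtual nodes. This is a minimal illustration — the server names and vnode count are arbitrary, and production implementations (e.g., Envoy's ring-hash or Maglev balancers) are more sophisticated:

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent-hash ring with virtual nodes."""

    def __init__(self, servers, vnodes=100):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (hash, server)
        for s in servers:
            self.add(s)

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add(self, server):
        # Place many virtual nodes per server for even key distribution.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._hash(f"{server}#{i}"), server))

    def remove(self, server):
        self._ring = [(h, s) for h, s in self._ring if s != server]

    def get(self, key):
        # First virtual node clockwise from the key's hash; wrap to the start.
        idx = bisect.bisect(self._ring, (self._hash(key), ""))
        if idx == len(self._ring):
            idx = 0
        return self._ring[idx][1]
```

The key property: removing a server remaps only the keys that lived on it, so the rest of the cache tier keeps its hit rate.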
For a typical web application, I'd use L7 load balancing (ALB or Nginx) at the edge for HTTP routing, TLS termination, and health checking. Behind it, I'd use L4 for any non-HTTP protocols. In a microservices architecture, I'd use a service mesh (Istio + Envoy) for internal service-to-service load balancing — it gives me circuit breaking, retries, and distributed tracing for free. For the cache tier, I'd configure the load balancer with consistent hashing to maximize cache hit rates.
A common pitfall: forgetting that L7 load balancers add latency and are themselves a single point of failure. Always run multiple load balancer instances behind a VIP (Virtual IP) with ECMP routing, and monitor load balancer CPU — TLS handshake load can saturate it.