What is EC2 & Auto Scaling in AWS / Cloud?

Compute · critical

EC2 & Auto Scaling

EC2 provides virtual machines in the cloud with configurable CPU, memory, storage, and networking. Auto Scaling Groups (ASGs) automatically adjust the number of instances based on demand using scaling policies.

Memory anchor

EC2 is a vending machine — pick your snack (instance type), insert coins (On-Demand), or buy a monthly pass (Reserved). Auto Scaling is a store manager who opens more checkouts at rush hour and closes them when it's quiet.

Expected depth

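EC2 instance families: general purpose (M series, plus the burstable T series), compute optimized (C series), memory optimized (R, X series), storage optimized (I, D series), and accelerated computing (P, G series for GPU).

Purchasing options: On-Demand (pay as you go with no commitment, billed per second or per hour depending on the OS), Reserved Instances (1-year or 3-year commitment, up to 72% discount), Spot Instances (spare capacity at up to 90% discount, can be interrupted with a 2-minute warning), and Savings Plans (commit to an hourly spend for 1 or 3 years in exchange for Reserved-like discounts, with more flexibility over which instances the discount covers).

An ASG uses a launch template, min/max/desired capacity, and scaling policies: target tracking (for example, keep average CPU at 50%), step scaling, or scheduled scaling.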
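The target tracking idea can be made concrete with a small Python sketch. AWS does not publish the exact algorithm, so the proportional formula below (scale capacity by the ratio of observed metric to target, then clamp to min/max) is an assumption for illustration only, and the function name `desired_capacity` is hypothetical.

```python
import math


def desired_capacity(current: int, metric: float, target: float,
                     min_size: int, max_size: int) -> int:
    """Estimate a new desired capacity for a target tracking policy.

    Simplified model (an assumption, not AWS's published algorithm):
    total load is roughly current * metric, so the fleet size that
    brings the average back to the target is current * metric / target.
    """
    if target <= 0:
        raise ValueError("target must be positive")
    proposed = math.ceil(current * metric / target)
    # The ASG never goes below min_size or above max_size.
    return max(min_size, min(max_size, proposed))


# 4 instances averaging 80% CPU against a 50% target: scale out to 7.
print(desired_capacity(current=4, metric=80.0, target=50.0, min_size=2, max_size=10))
```

Rounding up errs on the side of extra capacity, and the clamp shows why min and max matter: they bound what any scaling policy can request.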

Deep — senior internals

By default an ASG relies on EC2 status checks; ELB health checks can be enabled on top of them, in which case an instance that fails either check is replaced. Enabling ELB health checks is preferred behind a load balancer because they catch application-level failures, not just instance crashes.

Instance refresh performs rolling updates with a configurable minimum healthy percentage. Warm pools keep pre-initialized instances ready so that scale-out avoids the cold-start delay of a fresh launch. Lifecycle hooks pause an instance as it launches or terminates so you can run custom actions before it enters or leaves service (e.g., deregister from a service registry).

EC2 placement groups: Cluster (instances packed close together in a single AZ for low latency, suited to HPC), Spread (each instance on distinct hardware, for maximum availability), Partition (groups of instances on separate racks, suited to Hadoop/Kafka).

Instance Metadata Service (IMDS) v2 requires a session token on every request, which mitigates SSRF attacks that could otherwise steal credentials from the metadata endpoint.
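The IMDSv2 token flow can be sketched in Python using only the standard library. This is a minimal illustration, not a hardened client: the helper names (`build_token_request`, `build_metadata_request`, `fetch_metadata`) are hypothetical, and `fetch_metadata` only works when run on an EC2 instance.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254"
TOKEN_TTL_HEADER = "X-aws-ec2-metadata-token-ttl-seconds"
TOKEN_HEADER = "X-aws-ec2-metadata-token"


def build_token_request(ttl_seconds: int = 21600) -> urllib.request.Request:
    """Step 1: a PUT that asks IMDS for a session token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/latest/api/token",
        method="PUT",
        headers={TOKEN_TTL_HEADER: str(ttl_seconds)},
    )


def build_metadata_request(path: str, token: str) -> urllib.request.Request:
    """Step 2: a GET that presents the token in a header."""
    return urllib.request.Request(
        f"{IMDS_BASE}/latest/meta-data/{path.lstrip('/')}",
        headers={TOKEN_HEADER: token},
    )


def fetch_metadata(path: str, timeout: float = 2.0) -> str:
    """Run both steps. Only reachable from inside an EC2 instance."""
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as resp:
        token = resp.read().decode()
    with urllib.request.urlopen(build_metadata_request(path, token), timeout=timeout) as resp:
        return resp.read().decode()
```

The SSRF protection comes from the shape of step 1: a typical SSRF bug lets an attacker make the server issue a plain GET to a chosen URL, but not a PUT carrying a custom header, so the attacker cannot obtain a token in the first place.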

🎤 Interview-ready answer

EC2 offers a spectrum of instance types for different workloads: C series for compute, R for memory, P/G for GPU. I choose On-Demand for unpredictable workloads, Reserved for steady-state, and Spot for fault-tolerant batch jobs. Auto Scaling Groups with target tracking keep costs elastic: I set a target of around 60-70% average CPU, which leaves headroom for spikes. I always enable ELB health checks on the ASG rather than relying on EC2 status checks alone, so that application failures are caught.

⚠ Common trap

Using EC2 health checks in an ASG instead of ELB health checks. EC2 checks only detect instance-level failures (hardware, OS crash). If your app hangs or returns 500s, EC2 checks still show healthy and the ASG never replaces the broken instance.
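The difference between the two settings can be modeled in a few lines of Python. This is a simplified sketch that ignores details such as the health check grace period; the function name `is_unhealthy` is hypothetical, while "EC2" and "ELB" mirror the ASG health check type values.

```python
def is_unhealthy(ec2_status_ok: bool, elb_check_ok: bool, health_check_type: str) -> bool:
    """Return True if the ASG would mark the instance for replacement.

    health_check_type mirrors the ASG setting: "EC2" or "ELB".
    With "ELB", a failure of either check counts as unhealthy.
    """
    if health_check_type == "EC2":
        return not ec2_status_ok
    if health_check_type == "ELB":
        return not (ec2_status_ok and elb_check_ok)
    raise ValueError(f"unknown health check type: {health_check_type}")


# App hangs: the VM itself is fine, but the load balancer probe fails.
print(is_unhealthy(True, False, "EC2"))  # False: broken instance stays in service
print(is_unhealthy(True, False, "ELB"))  # True: instance gets replaced
```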

Related concepts

System Design

Horizontal vs Vertical Scaling