Indexes & Query Planscritical

EXPLAIN ANALYZE

EXPLAIN shows the query plan the planner chose without executing the query. EXPLAIN ANALYZE actually executes the query and shows actual row counts, timings, and loop counts alongside the planner's estimates. The difference between estimates and actuals reveals planner mistakes.

Memory anchor

EXPLAIN ANALYZE is a postmortem autopsy — EXPLAIN is the planned procedure, ANALYZE is what actually happened on the operating table. Read the notes bottom-up (innermost first) and compare expected vs actual blood loss (row counts).

Expected depth

Key EXPLAIN nodes: Seq Scan (full table scan), Index Scan (random I/O via index), Index Only Scan (all data in index), Bitmap Heap Scan (index then sorted heap access), Hash Join / Merge Join / Nested Loop (join algorithms), Sort (explicit sort), Hash Aggregate / Group Aggregate (aggregation). Read bottom-up — innermost nodes execute first. 'rows=X' is the estimate; 'actual rows=X' is truth. Huge discrepancy (off by 10x+) indicates stale statistics — run ANALYZE.

Deep — senior internals

EXPLAIN (ANALYZE, BUFFERS) shows buffer hit/miss counts — crucial for I/O analysis. Shared Buffers Hit: served from cache. Shared Read: read from disk. High read count for a simple query suggests a missing index or cold cache. EXPLAIN (ANALYZE, FORMAT JSON) provides machine-readable output for tooling. Planning time vs Execution time: for very fast queries, planning overhead can dominate — use prepared statements to amortize planning. pg_stat_statements extension tracks cumulative statistics (total time, calls, mean time) across all executions — essential for production query tuning.

🎤Interview-ready answer

EXPLAIN ANALYZE is my primary debugging tool. I read it bottom-up, looking for: (1) Seq Scan on large tables where an index should exist. (2) Huge estimate vs actual row discrepancy (stale statistics → run ANALYZE). (3) Nested Loop over large row counts (consider Hash Join). (4) Sort nodes that could be eliminated by index ORDER BY. Adding BUFFERS shows if the query is I/O-bound vs CPU-bound.

Common trap

Running EXPLAIN ANALYZE on INSERT/UPDATE/DELETE to see the plan — it actually executes the DML, modifying your data. Wrap in a transaction and roll back: BEGIN; EXPLAIN ANALYZE UPDATE ...; ROLLBACK;