Query Optimization
Query optimization is the process of reducing the time and resources a database spends executing queries. The first tool to reach for is EXPLAIN ANALYZE, which runs the query and shows the actual execution plan: which indexes were used, how many rows were scanned, and where the time went. Common fixes include adding indexes, rewriting queries to avoid sequential scans, and ensuring the query planner has accurate statistics (run ANALYZE on tables after bulk loads).
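A minimal sketch of this loop, assuming a hypothetical orders table with a customer_id column (the plan output shown is illustrative, not literal):

```sql
-- Hypothetical table: orders(id, customer_id, created_at, ...)
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE customer_id = 42;

-- A plan along these lines signals a missing index:
--   Seq Scan on orders  (actual rows=118 loops=1)
--     Filter: (customer_id = 42)
--     Rows Removed by Filter: 999882

CREATE INDEX idx_orders_customer_id ON orders (customer_id);
ANALYZE orders;  -- refresh planner statistics after bulk changes
```

Re-running the EXPLAIN ANALYZE afterwards should show an Index Scan (or Bitmap Index Scan) replacing the Seq Scan.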
Query optimization is detective work: EXPLAIN ANALYZE is your magnifying glass, pg_stat_statements is your suspect lineup sorted by 'total damage done.' Fix the pickpocket stealing $1 from 100,000 people before the bank robber who hits once a year.
Key EXPLAIN ANALYZE patterns to watch for: Seq Scan on a large table (needs an index or a query rewrite), Nested Loop with an inner Seq Scan (missing index on the join column), Sort with an external merge (work_mem too low, so the sort spills to disk), and Hash Join with a large hash table (consider increasing work_mem).

Statistics are crucial: PostgreSQL's query planner relies on pg_statistic for row count estimates. If estimates are wildly off (estimated 100 rows, actual 1M rows), the planner chooses bad plans. Fix with ANALYZE, or increase default_statistics_target for high-cardinality columns.

Common query antipatterns: SELECT * (reads unnecessary columns, especially bad when wide values are TOASTed), OR conditions that prevent index usage (rewrite as UNION ALL), functions on indexed columns (WHERE UPPER(email) = 'FOO' cannot use a B-tree index on email; use an expression index or citext), and NOT IN with NULLs (use NOT EXISTS instead).
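The three antipattern rewrites can be sketched as follows, assuming hypothetical users and banned tables:

```sql
-- OR across different columns can prevent a single index scan;
-- UNION ALL lets each branch use its own index (add a dedup step
-- if the two branches can overlap):
SELECT id FROM users WHERE email = 'foo@example.com'
UNION ALL
SELECT id FROM users WHERE username = 'foo';

-- A function on an indexed column defeats a plain B-tree index;
-- an expression index matches the query's shape:
CREATE INDEX idx_users_email_upper ON users (UPPER(email));
SELECT * FROM users WHERE UPPER(email) = 'FOO@EXAMPLE.COM';

-- NOT IN returns no rows if the subquery yields any NULL;
-- NOT EXISTS has the intended semantics:
SELECT * FROM users u
WHERE NOT EXISTS (
  SELECT 1 FROM banned b WHERE b.user_id = u.id
);
```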
Advanced optimization: CTEs in PostgreSQL 12+ are no longer optimization fences; the planner can inline them and push predicates down. However, a CTE referenced more than once is still materialized to avoid recomputation (override with NOT MATERIALIZED when inlining is safe). JIT compilation (enabled by default for expensive queries) can speed up aggregations by 20-30% by generating native code for expression evaluation. Parallel query execution (controlled by max_parallel_workers_per_gather) helps with large sequential scans and aggregations but does not help with single-row index lookups. For partition pruning, filter on the partition key in the WHERE clause; with declarative partitioning, pruning is controlled by enable_partition_pruning (on by default), while constraint_exclusion applies to the older inheritance-based partitioning. Use pg_stat_statements to find the top queries by total time (not just per-call time): a query running 10ms but called 1M times/day consumes more resources than a 5s query running 10 times/day.
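The CTE behavior can be made explicit with the MATERIALIZED keywords (PostgreSQL 12+); the events table here is a hypothetical example:

```sql
-- Single-reference CTE: inlined by default, so the planner can
-- push the created_at predicate down into the scan:
WITH recent AS (
  SELECT * FROM events
)
SELECT count(*) FROM recent
WHERE created_at > now() - interval '1 day';

-- Force the old fence behavior when you want an expensive CTE
-- computed exactly once regardless of how it is referenced:
WITH heavy AS MATERIALIZED (
  SELECT user_id, count(*) AS n FROM events GROUP BY user_id
)
SELECT * FROM heavy WHERE n > 100;
```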
My query optimization workflow starts with pg_stat_statements to identify the queries consuming the most total time — the product of call count and average duration. Then I use EXPLAIN ANALYZE on the top offenders to understand the execution plan. I look for sequential scans on large tables (add an index), sort spills to disk (increase work_mem), and row estimate mismatches (run ANALYZE). Common fixes: add partial or composite indexes, rewrite OR as UNION ALL, create expression indexes for function-based filters, and replace SELECT * with specific columns. For large tables, I partition by the most common filter column (usually a timestamp) so partition pruning skips irrelevant data. The goal is to minimize total resource consumption across all queries, not just optimize the single slowest query.
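A sketch of timestamp-based range partitioning, using a hypothetical events table; queries filtering on created_at touch only the matching partitions:

```sql
CREATE TABLE events (
  id         bigint GENERATED ALWAYS AS IDENTITY,
  created_at timestamptz NOT NULL,
  payload    jsonb
) PARTITION BY RANGE (created_at);

CREATE TABLE events_2024_01 PARTITION OF events
  FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE events_2024_02 PARTITION OF events
  FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');

-- EXPLAIN should show only events_2024_02 being scanned:
EXPLAIN SELECT count(*) FROM events
WHERE created_at >= '2024-02-10' AND created_at < '2024-02-11';
```

Note that pruning requires the WHERE clause to constrain the partition key directly; wrapping created_at in a function call defeats it, just as it defeats a B-tree index.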
A common mistake is optimizing individual query execution time without considering frequency. A query taking 500ms that runs 10 times a day is far less impactful than a 5ms query running 100,000 times a day. Always prioritize by total time (calls * mean_time) from pg_stat_statements.
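A sketch of that prioritization query (column names are the PostgreSQL 13+ ones; older versions use total_time and mean_time, and the pg_stat_statements extension must be loaded via shared_preload_libraries):

```sql
SELECT
  calls,
  round(mean_exec_time::numeric, 2)  AS mean_ms,
  round(total_exec_time::numeric, 2) AS total_ms,
  query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```

Sorting by total_exec_time rather than mean_exec_time is what surfaces the high-frequency "pickpocket" queries alongside the rare slow ones.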