Performance Tuningmedium

VACUUM & Autovacuum

VACUUM reclaims storage occupied by dead tuples left by MVCC (after UPDATE and DELETE). VACUUM FULL rewrites the entire table (compacts it, but requires an exclusive lock). Autovacuum runs VACUUM automatically based on configurable thresholds.

Memory anchor

VACUUM is a street sweeper for dead data — it clears debris (dead tuples) from the road (table pages) so future traffic (queries) moves faster. Autovacuum is a scheduled street-cleaning service. XID wraparound is what happens if the street is never cleaned — the city floods.

Expected depth

VACUUM marks dead tuple space as reusable (doesn't return it to OS). VACUUM FULL compacts the table and returns space to OS, but blocks all access. ANALYZE updates the table statistics used by the query planner — outdated statistics cause bad plans. VACUUM ANALYZE does both. Autovacuum triggers when dead tuples exceed: autovacuum_vacuum_threshold (default 50) + autovacuum_vacuum_scale_factor (default 0.2 = 20% of rows) dead tuples. For high-churn tables, lower scale_factor to trigger more frequent vacuuming.

Deep — senior internals

The two critical functions of VACUUM: (1) Reclaim dead tuple space to prevent table bloat. (2) Advance the oldest transaction horizon to prevent XID wraparound. pg_stat_user_tables tracks n_live_tup, n_dead_tup, last_autovacuum, last_analyze — monitor these to ensure autovacuum keeps up. Autovacuum can be overwhelmed by very high write rates — increase autovacuum_max_workers and autovacuum_vacuum_cost_delay. Bloat estimation: use the pgstattuple extension or community queries on pg_class. pg_repack rebuilds bloated tables online (no full lock, unlike CLUSTER or VACUUM FULL).

🎤Interview-ready answer

Autovacuum is critical for PostgreSQL health — it prevents table bloat and XID wraparound. I tune autovacuum_vacuum_scale_factor lower (0.01–0.05) for large, high-churn tables so it runs more frequently before bloat accumulates. I monitor pg_stat_user_tables for n_dead_tup and last_autovacuum. If a table is already badly bloated, I use pg_repack for online repack without service disruption.

Common trap

Disabling autovacuum to improve write performance. This is catastrophically wrong — dead tuples accumulate, queries slow, and eventually XID wraparound causes total data loss. Tune autovacuum instead of disabling it.