MVCC (Multi-Version Concurrency Control)
MVCC maintains multiple versions of each row, allowing readers and writers to operate concurrently without blocking each other. Readers see a consistent snapshot of committed data; writers create new row versions without overwriting old ones.
MVCC is a multi-draft document system — every edit creates a new draft (row version). Readers read the last approved draft (snapshot). Old drafts pile up in the filing cabinet (dead tuples). VACUUM is the intern who shreds old drafts to free up cabinet space.
PostgreSQL MVCC: each row (heap tuple) has xmin (transaction that created it) and xmax (transaction that deleted/updated it). A reader's snapshot determines which tuples are visible based on transaction IDs. UPDATE creates a new tuple with xmin = current transaction, sets xmax = current transaction on the old tuple. Old tuple versions accumulate as 'dead tuples' until VACUUM reclaims them. This is why PostgreSQL needs regular vacuuming; MySQL (InnoDB) stores old versions in a separate undo log segment.
Transaction ID wraparound is an existential PostgreSQL risk: XIDs are 32-bit integers (~2 billion). When wrapping around, old tuples appear 'in the future' — all data becomes invisible. PostgreSQL freezes old tuples (marks them as always visible) via autovacuum to prevent wraparound. pg_database.age(datfrozenxid) tracks how close a database is to wraparound — if it approaches 2 billion, emergency VACUUM FREEZE is needed. Bloat: high update/delete workloads create many dead tuples, inflating table/index size, slowing sequential scans, and wasting disk. pg_repack rebuilds tables online without a full lock.
MVCC enables readers and writers to never block each other by keeping multiple row versions. Readers see a consistent snapshot; writers create new versions. The tradeoff is dead tuple bloat that requires regular vacuuming. I monitor pg_stat_user_tables for n_dead_tup and ensure autovacuum runs frequently on high-churn tables. Transaction ID wraparound is a critical operational concern — I set up monitoring on datfrozenxid age.
Thinking VACUUM is optional. Without regular vacuuming, dead tuples accumulate (table bloat), sequential scans slow, and eventually XID wraparound causes catastrophic data loss. Autovacuum must be tuned for high-churn tables.