OLTP Fundamentalscritical

ACID Transactions

ACID stands for Atomicity (all-or-nothing), Consistency (data always valid per constraints), Isolation (concurrent transactions do not interfere), Durability (committed data survives crashes). Every relational database supports ACID. It is what makes operations like 'transfer $100 from account A to account B' safe — either both the debit and credit happen, or neither does.

Memory anchor

ACID is a bank ATM: Atomic (you get all your cash or none), Consistent (your balance is always correct), Isolated (the person behind you can't see your transaction mid-swipe), Durable (even if the power dies, your withdrawal is recorded).

Expected depth

Isolation levels are the practical knob you tune: READ UNCOMMITTED (dirty reads possible), READ COMMITTED (PostgreSQL default — sees only committed data), REPEATABLE READ (snapshot at transaction start — MySQL InnoDB default), SERIALIZABLE (full isolation, highest overhead). Most production systems use READ COMMITTED because SERIALIZABLE creates significant lock contention. PostgreSQL implements isolation via MVCC (Multi-Version Concurrency Control) — readers never block writers. MySQL InnoDB also uses MVCC but with a different undo log mechanism. In distributed systems, achieving ACID across multiple nodes requires two-phase commit (2PC) or consensus protocols (Raft/Paxos), which add latency.

Deep — senior internals

The subtle trap with isolation levels: REPEATABLE READ in PostgreSQL and MySQL behave differently. PostgreSQL REPEATABLE READ provides true snapshot isolation — it prevents phantom reads. MySQL REPEATABLE READ uses next-key locking for some operations but still allows phantoms in certain edge cases. SERIALIZABLE in PostgreSQL uses Serializable Snapshot Isolation (SSI), which detects dependency cycles and aborts conflicting transactions — this is more performant than traditional lock-based serializability. In distributed databases (Spanner, CockroachDB), achieving external consistency requires synchronized clocks (TrueTime in Spanner) or hybrid logical clocks. The CAP theorem means distributed transactions involve trade-offs: Spanner chooses CP (consistent + partition-tolerant) and accepts higher write latency. DynamoDB chooses AP with eventual consistency by default but offers strongly consistent reads at 2x the cost.

🎤Interview-ready answer

ACID guarantees that transactions are atomic, consistent, isolated, and durable. In practice, the most important knob is the isolation level. PostgreSQL defaults to READ COMMITTED, which is the right choice for most applications — it prevents dirty reads while avoiding the lock contention of SERIALIZABLE. PostgreSQL implements this via MVCC, where each transaction sees a snapshot of the data and readers never block writers. When I need stronger guarantees — for example, preventing lost updates in a booking system — I use SELECT FOR UPDATE or bump to SERIALIZABLE isolation. For distributed transactions across services, I avoid 2PC in favor of saga patterns with compensating transactions, because 2PC is a single point of failure.

Common trap

Assuming SERIALIZABLE is always the safest choice. It is the most correct but also the most expensive — it dramatically reduces throughput under contention. Most production systems run READ COMMITTED and handle edge cases with explicit locking (SELECT FOR UPDATE) or optimistic concurrency control (version columns).

Related concepts