Index Types: B-tree vs LSM-tree
Indexes speed up reads by maintaining a sorted data structure that maps key values to data locations. B-tree indexes are the standard in RDBMS; LSM-tree indexes underpin most NoSQL write-optimized stores.
B-tree = a library card catalog (great for finding any book, but re-sorting cards on every new arrival is slow). LSM-tree = a 'to-file' inbox that stacks up and gets sorted in bulk later: fast intake, slower lookups.
B-tree: balanced tree with O(log N) reads and writes. Pages are updated in place, which suits read-heavy workloads; used by PostgreSQL, MySQL, SQLite. Each write requires random I/O to update pages in place, which limits high write throughput.

LSM-tree (Log-Structured Merge-tree): writes go to an in-memory buffer (the MemTable), which is periodically flushed to sorted, immutable on-disk SSTables. Background compaction merges SSTables. Writes are sequential I/O (fast); reads may need to check multiple SSTables (mitigated by Bloom filters). Used by Cassandra, RocksDB, LevelDB, HBase.
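The LSM write/read path above can be sketched in a few lines of Python. This is a toy model, not a real storage engine: there is no disk, no compaction, and the per-SSTable "Bloom filter" is an ordinary set standing in for the real probabilistic structure. All class and parameter names are illustrative.

```python
import bisect

class SSTable:
    """Immutable sorted run flushed from the memtable."""
    def __init__(self, items):
        self.keys = [k for k, _ in items]       # sorted keys
        self.values = [v for _, v in items]
        self.filter = set(self.keys)            # stand-in for a real Bloom filter

    def get(self, key):
        if key not in self.filter:              # skip the "disk" read entirely
            return None
        i = bisect.bisect_left(self.keys, key)  # binary search within the run
        if i < len(self.keys) and self.keys[i] == key:
            return self.values[i]
        return None

class TinyLSM:
    """Sketch of the LSM path: memtable -> flushed SSTables, read newest-first."""
    def __init__(self, memtable_limit=4):
        self.memtable = {}
        self.sstables = []                      # newest run appended last
        self.limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value              # in-memory write: no random I/O
        if len(self.memtable) >= self.limit:
            self.flush()

    def flush(self):
        # One sequential "disk" write of a whole sorted run.
        self.sstables.append(SSTable(sorted(self.memtable.items())))
        self.memtable = {}

    def get(self, key):
        if key in self.memtable:                # newest data wins
            return self.memtable[key]
        for table in reversed(self.sstables):   # then newest SSTable first
            v = table.get(key)
            if v is not None:
                return v
        return None
```

Note what the sketch makes visible: a `put` never touches existing runs (hence sequential-write speed), while a `get` may fan out across every run, which is exactly the read cost that Bloom filters and compaction exist to contain.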
B-tree write amplification is typically 5–10x (writing one logical byte may update multiple tree nodes). LSM write amplification is often higher in total (compaction rewrites the same data several times), but the I/O is sequential, which SSDs handle far more efficiently than random I/O. LSM space amplification is also worse: deleted and overwritten data persists until compaction reclaims it.

Rule of thumb: B-tree for read-heavy OLTP; LSM for write-heavy time-series, logging, or event storage.

PostgreSQL-specific tools: covering indexes (the INCLUDE clause) store extra column values in the index itself, turning index scans into index-only scans that skip heap fetches entirely. Partial indexes (a WHERE clause in CREATE INDEX) dramatically shrink the index for sparse data.
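The covering-index idea can be illustrated with a small Python sketch: the index entry carries the included column alongside the key, so an equality query is answered from the index alone and never touches the heap. Table and column names (`orders`, `customer_id`, `total`) are hypothetical.

```python
import bisect

# Rough PostgreSQL equivalent (illustrative names):
#   CREATE INDEX orders_by_customer ON orders (customer_id) INCLUDE (total);
# The index is keyed on customer_id; total is carried along so that
# SELECT total FROM orders WHERE customer_id = ? is an index-only scan.

class CoveringIndex:
    def __init__(self, rows):
        # rows: iterable of (customer_id, total) pulled from the table
        entries = sorted(rows)                    # index order: key, then value
        self.keys = [k for k, _ in entries]
        self.included = [t for _, t in entries]   # the INCLUDE'd column

    def totals_for(self, customer_id):
        """Index-only scan: the answer comes entirely from index entries."""
        lo = bisect.bisect_left(self.keys, customer_id)
        hi = bisect.bisect_right(self.keys, customer_id)
        return self.included[lo:hi]
```

The trade-off the sketch hides is the same one the notes flag for over-indexing: every included column makes index entries fatter, so writes and storage pay for the read speedup.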
For a write-heavy system like event logging or time-series data, I'd choose an LSM-tree-based store (Cassandra, RocksDB) because sequential writes are an order of magnitude faster than B-tree random writes at scale. For the core transactional database with complex queries and joins, I'd use PostgreSQL B-tree indexes with careful index design: composite indexes ordered to match query predicates, partial indexes for frequently filtered subsets, and covering indexes to eliminate table heap lookups.
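"Composite indexes ordered to match query predicates" is the leftmost-prefix rule: a sorted composite index can seek on its leading column, but a filter on a trailing column alone degenerates into a full scan. A minimal sketch, with hypothetical columns `(status, created_at)`:

```python
import bisect

# Models an index like: CREATE INDEX ON tasks (status, created_at)
# (table and column names are illustrative).

class CompositeIndex:
    def __init__(self, rows):
        # rows: (status, created_at, row_id); sorted() gives index order
        self.entries = sorted(rows)

    def ids_with_status(self, status):
        """Index seek: binary-search the leading column, scan forward."""
        keys = [(s, c) for s, c, _ in self.entries]
        # (status,) sorts before (status, anything), landing on the first match
        i = bisect.bisect_left(keys, (status,))
        out = []
        while i < len(self.entries) and self.entries[i][0] == status:
            out.append(self.entries[i][2])
            i += 1
        return out

    def ids_created_at(self, ts):
        """created_at is not a leftmost prefix: the index offers no seek,
        so this is a full scan of every entry."""
        return [rid for s, c, rid in self.entries if c == ts]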
Over-indexing in RDBMS: every additional index increases write latency (each write updates all relevant indexes) and storage. Only index columns that appear in WHERE, JOIN, or ORDER BY clauses of actual queries — measure before adding.