Composite Indexes & Column Order
A composite (multi-column) index covers multiple columns in a specified order. The leftmost prefix rule applies: an index on (a, b, c) can be used for queries filtering on a, on (a, b), or on (a, b, c), but NOT on b alone or c alone.
Composite index is a phone book sorted by (last name, first name). You can find 'Smith, John' easily (full match). You can find all 'Smiths' (leading prefix). But you can't find all 'Johns' without scanning every page — first name alone doesn't help because the book isn't sorted by first name.
Column order in composite indexes is critical. Rules of thumb: (1) equality columns before range columns — index on (status, created_at) supports WHERE status = 'active' AND created_at > '2024-01-01' better than (created_at, status). (2) Most selective column first — columns that eliminate the most rows go first for equality conditions. (3) Include ORDER BY columns last to avoid a sort step. A query hitting only the first column of a composite index still uses the index, but less efficiently than a dedicated single-column index.
The index condition pushdown (ICP) optimization evaluates index conditions as the index is scanned, reducing heap access. Skip scanning (MySQL/PostgreSQL 14+): the planner can sometimes use an index even if the leading column isn't in the filter — by scanning distinct values of the leading column, then applying the actual filter. Covering indexes (index-only scans) — when all queried columns are in the index, the planner never touches the heap. Functional indexes on expressions: CREATE INDEX ON lower(email) — enables case-insensitive searches. Index on jsonb path: CREATE INDEX ON users ((data->>'email')).
I design composite indexes by considering query patterns: equality filters first (most selective), then range filters, then ORDER BY columns. An index on (user_id, created_at DESC) perfectly supports 'get last 10 orders for user X'. I verify with EXPLAIN that the index is actually used and check whether an index-only scan is possible (all needed columns in the index).
Putting a range column before equality columns. Index on (created_at, user_id) with WHERE user_id = 5 AND created_at > '2024-01-01' — the planner can't use the index efficiently for user_id because it's not the leading column. Reverse the order.