Advanced SQLmedium

UPSERT & ON CONFLICT

UPSERT inserts a row if it doesn't exist, or updates it if it does — atomically, without a race condition. PostgreSQL implements it with INSERT ... ON CONFLICT.

Memory anchor

UPSERT is a bouncer with a VIP list — if you're already on the list (conflict), the bouncer updates your table (DO UPDATE). If not, you get added (INSERT). No two bouncers argue about who checks first — it's atomic.

Expected depth

Syntax: INSERT INTO users (id, name, updated_at) VALUES (1, 'Alice', NOW()) ON CONFLICT (id) DO UPDATE SET name = EXCLUDED.name, updated_at = EXCLUDED.updated_at. EXCLUDED refers to the row that would have been inserted. ON CONFLICT DO NOTHING ignores conflicts without updating. The conflict target can be a column, columns, or a constraint name (ON CONFLICT ON CONSTRAINT uk_email). UPSERT is atomic — no race condition between checking existence and inserting.

Deep — senior internals

UPSERT uses an internal 'speculative insertion' — it attempts the INSERT, and on conflict, falls back to UPDATE without releasing the row lock. This prevents the read-then-write race condition of application-level check-then-insert. The UPDATE in ON CONFLICT can reference complex expressions: SET count = users.count + 1 for increment patterns. For batch upserts, insert multiple rows in one INSERT ... ON CONFLICT statement — much faster than individual upserts. Conditional UPSERT: ON CONFLICT DO UPDATE WHERE condition — only update if the condition is true (e.g., only update if the new value is higher).

🎤Interview-ready answer

INSERT ... ON CONFLICT is my go-to for idempotent write operations — event processing, data ingestion, syncing. It eliminates the check-then-insert race condition and works atomically. I use ON CONFLICT DO NOTHING for 'insert if not exists' and ON CONFLICT DO UPDATE SET ... = EXCLUDED.column for full upsert semantics. For increment patterns: ON CONFLICT DO UPDATE SET count = table.count + 1.

Common trap

Using an application-level 'check then insert' pattern (SELECT → if not found → INSERT) as an alternative. This has a race condition — two concurrent transactions both see no row and both insert, causing a duplicate key error or data loss.