DynamoDB
DynamoDB is AWS's fully managed NoSQL database, designed for single-digit millisecond performance at any scale. It is a key-value and document database that requires you to design your data access patterns upfront. Every item is accessed by a partition key (hash key) and, optionally, a sort key. DynamoDB scales horizontally by distributing data across partitions based on the hash of the partition key.
DynamoDB is like a vending machine: press exactly the right button (partition key + sort key) and you get your item instantly. But if you ask 'show me everything on shelf 3' (a scan), the machine charges you for every item it looks at, not just the ones you take. Design your snack requests (access patterns) before you build the machine.
Single-table design is the recommended DynamoDB pattern: instead of multiple tables, you put all entity types in one table and use generic attribute names (PK, SK) with prefixes (PK='USER#123', SK='ORDER#456'). This enables fetching related data in a single query. A GSI (Global Secondary Index) provides an alternate partition key + sort key for different access patterns — each GSI is a full copy of the projected attributes. An LSI (Local Secondary Index) provides alternate sort keys within the same partition key. Capacity modes: on-demand (pay per request, auto-scales, more expensive per request) vs provisioned (fixed throughput, cheaper for predictable workloads, risk of throttling). DynamoDB Streams captures item-level changes for event processing — similar to CDC, and commonly used to trigger Lambda functions.
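A minimal sketch of the single-table key pattern described above, using plain Python so the query behavior can be simulated without an AWS connection. The helper names (`user_pk`, `order_sk`) and the sample data are illustrative, not a real API; a real table would be queried with a `KeyConditionExpression` such as `PK = :pk AND begins_with(SK, :prefix)`.

```python
def user_pk(user_id: str) -> str:
    """Partition key shared by all items belonging to one user."""
    return f"USER#{user_id}"

def order_sk(order_date: str) -> str:
    """Sort key: ORDER# prefix plus an ISO date, so orders sort chronologically."""
    return f"ORDER#{order_date}"

# One user's profile and orders share a partition key, so a single
# Query on PK='USER#123' can return all of them together.
items = [
    {"PK": user_pk("123"), "SK": "PROFILE", "name": "Ada"},
    {"PK": user_pk("123"), "SK": order_sk("2024-01-15"), "total": 42},
    {"PK": user_pk("123"), "SK": order_sk("2024-02-03"), "total": 17},
]

# Simulate Query(PK='USER#123' AND begins_with(SK, 'ORDER#')):
orders = [i for i in items if i["PK"] == "USER#123" and i["SK"].startswith("ORDER#")]
print(len(orders))  # 2
```

Because ISO dates sort lexicographically, the sort key doubles as a chronological range: `SK BETWEEN 'ORDER#2024-01-01' AND 'ORDER#2024-12-31'` fetches one year of orders in a single query.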
Partition internals: each partition handles up to 3000 RCU and 1000 WCU. Hot partitions (one partition key receiving disproportionate traffic) cause throttling even when total table capacity is available. Adaptive capacity (enabled by default) mitigates this by borrowing capacity from less-busy partitions, but severe hot keys still throttle. Write sharding (appending a random suffix to partition keys) distributes writes but complicates reads, which must fan out across every shard. Transaction support (TransactWriteItems) provides ACID across up to 100 items, but at 2x the WCU cost. On-demand mode has a hidden gotcha: it scales based on your previous traffic peak — if traffic suddenly spikes well beyond that peak, you can be throttled for a few minutes while capacity catches up. DAX (DynamoDB Accelerator) adds microsecond-latency caching in front of DynamoDB, useful for read-heavy workloads with hot keys. Pricing trap: scanning is extremely expensive because DynamoDB charges for all data read, not just matching items. Always use Query (with a partition key), never Scan.
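The write-sharding trick above can be sketched in a few lines. These helper names and the shard count of 10 are assumptions for illustration; the trade-off to notice is that writes pick one random shard, while reads must enumerate and query every shard, then merge.

```python
import random

SHARDS = 10  # assumed shard count; tune to the hot key's write rate

def sharded_pk(device_id: str) -> str:
    """Write path: spread one hot key across SHARDS partition keys."""
    return f"DEVICE#{device_id}#{random.randrange(SHARDS)}"

def all_shard_pks(device_id: str) -> list:
    """Read path: fan out a query to every shard and merge the results."""
    return [f"DEVICE#{device_id}#{n}" for n in range(SHARDS)]

# A write lands on exactly one of the read-path shards.
assert sharded_pk("sensor-42") in set(all_shard_pks("sensor-42"))
```

This turns one 1000-WCU-limited partition into ten, at the cost of ten parallel queries (and client-side merging) on every read of that key.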
I choose DynamoDB when I need a key-value or narrow-query database that scales infinitely with single-digit millisecond latency — typically for user sessions, gaming leaderboards, IoT device state, or high-throughput event ingestion. The critical requirement is designing access patterns upfront because DynamoDB does not support ad-hoc queries efficiently. I use single-table design with composite keys (PK='USER#123', SK='PROFILE' or SK='ORDER#2024-01-15') to colocate related data. GSIs provide secondary access patterns. The biggest mistakes I have seen: using Scan instead of Query (costs skyrocket), choosing a low-cardinality partition key (causes hot partitions), and not understanding that GSI storage is a full copy of projected attributes (doubles or triples storage cost).
A common failure mode is putting everything in DynamoDB because 'it scales' without understanding single-table design. If you use DynamoDB like a relational database, with multiple tables and frequent scans, you get the worst of both worlds: no joins, no ad-hoc queries, and massive cost from scan operations. DynamoDB is only cost-effective when your access patterns are narrow and well-defined.