DynamoDB
Serverless NoSQL key-value database with single-digit millisecond performance at any scale
Amazon DynamoDB is a fully serverless, key-value and document NoSQL database that delivers single-digit millisecond performance at any scale - from one request per second to millions. AWS manages all the infrastructure: partitioning, replication across 3 AZs, patching, and scaling. DynamoDB is the right choice when you need predictable low latency at massive scale and can model your access patterns upfront.
Data Model: Tables, Items, Attributes, and Keys
DynamoDB organizes data into tables. Each item (row) is uniquely identified by a primary key. Unlike relational databases, DynamoDB has no fixed schema - each item can have different attributes. However, the primary key attributes are mandatory and immutable.
| Concept | Description | Analogy |
|---|---|---|
| Table | Top-level container for items | SQL table |
| Item | A single data record | SQL row |
| Attribute | A name-value pair within an item | SQL column (but schema-less) |
| Partition Key (PK) | Required; hashed to determine storage partition | Part of primary key |
| Sort Key (SK) | Optional; items with same PK sorted by SK | Enables range queries within a partition |
| GSI (Global Secondary Index) | Alternate key (different PK/SK); spans all partitions | Non-primary index |
| LSI (Local Secondary Index) | Alternate SK for same PK; must be defined at table creation | Local partition index |
You cannot change the primary key of an existing table, and LSIs must be defined at table creation time. These are permanent design decisions - model your access patterns carefully before creating the table.
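To make the "decide at creation time" point concrete, here is a minimal sketch of the request that fixes those choices. It only builds the keyword arguments for the CreateTable API and makes no AWS call. The table name `Orders`, the index name `ByCreatedAt`, and the `createdAt` attribute are illustrative assumptions, not part of any real schema.

```python
def build_orders_table_definition(table_name="Orders"):
    """Return keyword arguments shaped for DynamoDB's CreateTable API.

    All names used here (Orders, ByCreatedAt, createdAt) are hypothetical.
    """
    return {
        "TableName": table_name,
        # Only attributes that appear in a key schema are declared;
        # every other attribute stays schema-less.
        "AttributeDefinitions": [
            {"AttributeName": "PK", "AttributeType": "S"},
            {"AttributeName": "SK", "AttributeType": "S"},
            {"AttributeName": "createdAt", "AttributeType": "S"},
        ],
        # The table's primary key: fixed for the life of the table.
        "KeySchema": [
            {"AttributeName": "PK", "KeyType": "HASH"},
            {"AttributeName": "SK", "KeyType": "RANGE"},
        ],
        # An LSI reuses the table's partition key with a different sort key
        # and can only be declared here, at creation time.
        "LocalSecondaryIndexes": [
            {
                "IndexName": "ByCreatedAt",
                "KeySchema": [
                    {"AttributeName": "PK", "KeyType": "HASH"},
                    {"AttributeName": "createdAt", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "ALL"},
            }
        ],
        "BillingMode": "PAY_PER_REQUEST",
    }
```

With the Python SDK, a dictionary like this would typically be passed as `boto3.client("dynamodb").create_table(**build_orders_table_definition())`. GSIs, by contrast, can be added to an existing table later.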
Capacity Modes: On-Demand vs Provisioned with Auto Scaling
DynamoDB charges for read and write capacity. Choosing the wrong mode is the most common source of unexpected DynamoDB bills.
| Attribute | On-Demand | Provisioned + Auto Scaling |
|---|---|---|
| Pricing unit | Per request (RRU/WRU) | Per provisioned RCU/WCU-hour |
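| Throttling | Rare but possible: traffic that exceeds double the previous peak within about 30 minutes, or that hits per-partition or table-level limits, can be throttled | Possible if traffic spikes faster than auto scaling reacts |
| Cost at low traffic | No throughput charge when idle; you pay only for requests made (storage is billed separately) | Provisioned capacity is billed even at 0 requests |
| Cost at sustained high traffic | Several times the per-request cost of fully utilized provisioned capacity (historically quoted as ~6x; the ratio depends on region and current pricing) | Cheaper at predictable high throughput |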
| Best for | Unpredictable or new workloads | Steady-state production workloads |
One RCU (Read Capacity Unit) = one strongly consistent read of up to 4 KB per second, or two eventually consistent reads. One WCU (Write Capacity Unit) = one write of up to 1 KB per second. Large item sizes multiply your RCU/WCU consumption - keep items small.
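The rounding rules above can be sketched as a small calculator. The function names are hypothetical; the 2x transactional multiplier is the one listed in the consistency table later in this guide, and 1 KB is taken as 1,024 bytes.

```python
import math

# Capacity multipliers per 4 KB read block, by read type.
READ_MULTIPLIERS = {"eventual": 0.5, "strong": 1.0, "transactional": 2.0}


def read_capacity_units(item_size_bytes, mode="strong"):
    """Capacity units consumed by reading one item of the given size."""
    blocks = max(1, math.ceil(item_size_bytes / 4096))  # round up to 4 KB blocks
    return blocks * READ_MULTIPLIERS[mode]


def write_capacity_units(item_size_bytes, transactional=False):
    """Capacity units consumed by writing one item of the given size."""
    blocks = max(1, math.ceil(item_size_bytes / 1024))  # round up to 1 KB blocks
    return blocks * (2 if transactional else 1)
```

For example, a 6 KB item spans two 4 KB blocks, so a strongly consistent read costs 2 RCUs and an eventually consistent read costs 1. A 3.5 KB write rounds up to 4 WCUs. Rounding happens per item, which is why trimming an item just under a block boundary can halve its cost.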
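Switching between on-demand and provisioned modes is rate-limited. The long-standing rule was one switch per 24 hours; current AWS documentation allows up to four switches from provisioned to on-demand in a rolling 24-hour window, and a switch from on-demand to provisioned at any time. Check the current limits before relying on them, and plan capacity mode changes for low-traffic periods.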
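One way to reason about which mode is cheaper is a break-even utilization: the fraction of provisioned capacity you must actually use before provisioned beats on-demand. The sketch below is a simplification that assumes each request consumes exactly one capacity unit, and it takes prices as inputs rather than hard-coding them. The function name is hypothetical, and the numbers in the usage note are made up for illustration, not real AWS prices.

```python
def break_even_utilization(on_demand_price_per_million, provisioned_price_per_unit_hour):
    """Utilization above which provisioned capacity is cheaper than on-demand.

    Assumes one request consumes one capacity unit, so a single provisioned
    unit can serve at most 3,600 requests in an hour.
    """
    requests_per_unit_hour = 3600
    on_demand_cost_at_full_use = (
        on_demand_price_per_million * requests_per_unit_hour / 1_000_000
    )
    return provisioned_price_per_unit_hour / on_demand_cost_at_full_use
```

With an illustrative on-demand price of 1.00 per million requests and a provisioned price of 0.0006 per unit-hour, `break_even_utilization(1.00, 0.0006)` is about 0.17: a table that averages less than roughly 17% utilization of its provisioned capacity would be cheaper on-demand under those assumed prices.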
Single-Table Design and Access Pattern Modeling
A Query must name an exact partition key value and can optionally narrow the results with a sort key condition (equality, a range, or begins_with). You cannot run ad-hoc queries across arbitrary attributes without a full table scan. The key skill is designing your PK/SK and GSIs to support all required access patterns upfront.
Single-table design stores multiple entity types in one table using a generic PK/SK pattern. For example, a PK of USER#123 with SK of ORDER#456 lets you query all orders for a user with a single Query operation.
| Access Pattern | Operation | Key Design |
|---|---|---|
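| Get user by ID | GetItem | PK=USER#<id>, SK=USER#<id> (GetItem needs the full primary key; the profile item's sort key value is a convention) |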
| Get all orders for user | Query | PK=USER#<id>, SK begins_with ORDER# |
| Get order by ID | GetItem | PK=USER#<id>, SK=ORDER#<id> |
| Get all orders by status | Query on GSI | GSI PK=STATUS#<status>, SK=createdAt |
| Full table scan | Scan (avoid in prod) | No key - reads every partition |
A Scan operation reads every item in the table and is expensive. Design your table so that all production access patterns can be served by Query or GetItem. If you find yourself frequently scanning, you need more GSIs or a different PK design.
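The "all orders for a user" pattern from the table above can be sketched in Python as follows. The helper names are hypothetical, and `client` is assumed to be any object with a `query` method that accepts the same keyword arguments as the low-level DynamoDB client (for example `boto3.client("dynamodb")`). The loop follows `LastEvaluatedKey`, because a single Query response returns at most 1 MB of data.

```python
def user_pk(user_id):
    """Partition key for everything that belongs to one user."""
    return f"USER#{user_id}"


def order_sk(order_id):
    """Sort key for one order inside a user's partition."""
    return f"ORDER#{order_id}"


def query_user_orders(client, table_name, user_id):
    """Collect every order item for one user, following pagination."""
    kwargs = {
        "TableName": table_name,
        "KeyConditionExpression": "PK = :pk AND begins_with(SK, :sk_prefix)",
        "ExpressionAttributeValues": {
            ":pk": {"S": user_pk(user_id)},
            ":sk_prefix": {"S": "ORDER#"},
        },
    }
    items = []
    while True:
        page = client.query(**kwargs)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return items
        # Resume the next page where the previous one stopped.
        kwargs["ExclusiveStartKey"] = last_key
```

Because the key condition pins a single partition key, the request touches one partition's item collection rather than the whole table, which is what separates it from a Scan.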
Consistency Models and Transactions
DynamoDB offers two consistency models for reads and a Transactions API for multi-item atomic operations.
| Feature | Behaviour | Cost |
|---|---|---|
| Eventually consistent read | May return stale data; replicas usually converge within about a second | 0.5 RCU per 4 KB |
| Strongly consistent read | Always returns latest committed data | 1 RCU per 4 KB |
| Transactional read (TransactGet) | Atomic read of up to 100 items across tables | 2 RCUs per 4 KB |
| Transactional write (TransactWrite) | Atomic all-or-nothing write of up to 100 items | 2 WCUs per 1 KB |
DynamoDB Transactions use a two-phase commit protocol under the hood. They cost 2x the normal read/write capacity. Use them for operations that must be atomic (e.g. transfer money between two balance items) but avoid them for high-frequency single-item writes.
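As a sketch of the money-transfer example, the function below builds a TransactWriteItems request that debits one balance item and credits another. It only constructs the request and makes no AWS call. The key layout (`ACCOUNT#<id>` / `BALANCE`), the `balance` attribute, and the function name are assumptions made for illustration.

```python
def build_transfer_request(table_name, from_account, to_account, amount):
    """Build a TransactWriteItems request moving `amount` between two items."""

    def balance_key(account_id):
        # Hypothetical key layout for a balance item.
        return {"PK": {"S": f"ACCOUNT#{account_id}"}, "SK": {"S": "BALANCE"}}

    names = {"#bal": "balance"}  # alias guards against reserved-word clashes
    values = {":amount": {"N": str(amount)}}
    return {
        "TransactItems": [
            {
                "Update": {
                    "TableName": table_name,
                    "Key": balance_key(from_account),
                    "UpdateExpression": "SET #bal = #bal - :amount",
                    # If the sender cannot cover the amount, the whole
                    # transaction is cancelled and neither item changes.
                    "ConditionExpression": "#bal >= :amount",
                    "ExpressionAttributeNames": names,
                    "ExpressionAttributeValues": values,
                }
            },
            {
                "Update": {
                    "TableName": table_name,
                    "Key": balance_key(to_account),
                    "UpdateExpression": "SET #bal = #bal + :amount",
                    "ExpressionAttributeNames": names,
                    "ExpressionAttributeValues": values,
                }
            },
        ]
    }
```

With the Python SDK the result would typically be passed as `client.transact_write_items(**build_transfer_request(...))`. A failed condition surfaces as a cancelled transaction, so callers should expect to handle that error rather than a partial write.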
DynamoDB Streams, TTL, and DAX
DynamoDB provides three important operational features that come up constantly in architecture discussions.
| Feature | What It Does | Common Pattern |
|---|---|---|
| DynamoDB Streams | Stream of item-level changes (insert/update/delete), ordered per item and retained for 24 hours | Trigger Lambda for real-time processing, replicate to Elasticsearch/OpenSearch |
| TTL (Time to Live) | Automatic item deletion when a timestamp attribute expires | Session management, soft deletes, expiring caches |
| DAX (DynamoDB Accelerator) | In-memory write-through cache in front of DynamoDB; microsecond reads | Read-heavy workloads needing sub-millisecond latency |
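For the Streams row, a Lambda consumer usually receives a batch of change records. The sketch below shows the general shape of such a batch and a hypothetical helper that flattens it. It assumes the stream is configured to include both new and old images; with other view types those fields are absent, which the code tolerates.

```python
def summarize_stream_event(event):
    """Flatten a DynamoDB Streams batch into a list of simple change dicts."""
    changes = []
    for record in event.get("Records", []):
        data = record["dynamodb"]
        changes.append(
            {
                "action": record["eventName"],  # INSERT, MODIFY, or REMOVE
                "keys": data["Keys"],
                "new": data.get("NewImage"),  # missing for REMOVE events
                "old": data.get("OldImage"),  # missing for INSERT events
            }
        )
    return changes
```

A real handler would replace the flattening with its own side effect, such as indexing the new image in a search cluster. Since batches can be redelivered after a failure, that side effect should be idempotent.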
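TTL deletes run in the background and are not immediate. Older AWS documentation cited a delay of up to 48 hours; current documentation says expired items are typically deleted within a few days of the expiry time. Do not use TTL as a hard security or compliance expiry mechanism - items may still be readable after the TTL timestamp passes, so filter on the expiry attribute at read time if stale items matter.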
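Because expired items can linger, readers often compute the expiry on write and check it again on read. A minimal sketch, assuming the TTL attribute is named `expiresAt` (as in the CLI example in this guide) and that items use the low-level attribute format; TTL attributes must be Numbers holding Unix epoch seconds. The function names are hypothetical.

```python
import time


def expires_at(ttl_seconds, now=None):
    """Epoch-seconds value to store in the TTL attribute."""
    now = time.time() if now is None else now
    return int(now) + ttl_seconds


def is_live(item, now=None, ttl_attribute="expiresAt"):
    """True if the item has no TTL attribute or has not yet expired."""
    now = time.time() if now is None else now
    expiry = item.get(ttl_attribute)
    if expiry is None:
        return True  # items without the attribute never expire
    return int(expiry["N"]) > int(now)
```

The same check can be pushed to the server with a filter expression that compares the TTL attribute against the current epoch time. That trims the response, but the filtered items still consume read capacity.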
DAX is a cluster deployed in your VPC. It caches both item-level (GetItem) and query-level (Query/Scan) results. It is not suitable for strongly consistent reads (those bypass the cache) and adds latency for writes. Best for read-heavy, eventually-consistent workloads.
CLI and SDK Operations
```shell
# PutItem
aws dynamodb put-item \
  --table-name Orders \
  --item '{"PK": {"S": "USER#123"}, "SK": {"S": "ORDER#456"}, "status": {"S": "PENDING"}, "total": {"N": "99.99"}}'

# GetItem
aws dynamodb get-item \
  --table-name Orders \
  --key '{"PK": {"S": "USER#123"}, "SK": {"S": "ORDER#456"}}'

# Query all orders for a user
aws dynamodb query \
  --table-name Orders \
  --key-condition-expression "PK = :pk AND begins_with(SK, :sk_prefix)" \
  --expression-attribute-values '{":pk": {"S": "USER#123"}, ":sk_prefix": {"S": "ORDER#"}}'

# Enable TTL on a table
aws dynamodb update-time-to-live \
  --table-name Sessions \
  --time-to-live-specification "Enabled=true,AttributeName=expiresAt"

# Enable Point-in-Time Recovery
aws dynamodb update-continuous-backups \
  --table-name Orders \
  --point-in-time-recovery-specification PointInTimeRecoveryEnabled=true
```
Interview Focus Points
1. What is the difference between a partition key, sort key, GSI, and LSI in DynamoDB?
2. Explain single-table design in DynamoDB. What are the trade-offs versus multi-table design?
3. When would you choose DynamoDB over RDS/Aurora? What are the limitations of DynamoDB?
4. What is a hot partition in DynamoDB and how do you prevent it?
5. Compare on-demand vs provisioned capacity. When is each mode more cost-effective?
6. How do DynamoDB Streams work and what are common use cases for them?
7. Explain DynamoDB Transactions. What is the cost implication?
8. How does DAX work? What types of reads does it accelerate and what are the limitations?
9. A DynamoDB table is getting throttled. What are the possible causes and how do you diagnose and fix it?
10. How does TTL work and what are the consistency guarantees around item expiry?