AWS Database
DocumentDB
Managed MongoDB-compatible document database for JSON data
Amazon DocumentDB is a fully managed document database service that implements the MongoDB 3.6, 4.0, and 5.0 APIs, allowing applications built for MongoDB to run on DocumentDB with minimal code changes. It stores, queries, and indexes JSON documents using a distributed storage architecture similar to Aurora. DocumentDB is the choice for teams that need MongoDB compatibility without managing MongoDB infrastructure themselves.
How DocumentDB Works: Cluster Architecture
DocumentDB uses an Aurora-derived storage architecture: a distributed cluster volume that keeps 6 copies of the data across 3 AZs and grows automatically in 10 GB increments up to 128 TiB. The cluster has one primary instance and up to 15 read replicas, all sharing the same storage volume.
| Component | Description |
|---|---|
| Cluster endpoint | DNS name pointing to the primary - for writes |
| Reader endpoint | DNS name load-balancing across all read replicas |
| Instance endpoint | Direct DNS name per instance - for specific routing |
| Cluster volume | Shared distributed storage; 6 copies across 3 AZs |
| Replica lag | Typically < 100 ms (same storage - no data copying) |
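The endpoint roles in the table can be sketched as a small routing helper. This is only an illustration in plain JavaScript, not part of any AWS SDK: the function name `endpointFor`, the `allowStaleReads` option, and both host names are hypothetical placeholders.

```javascript
// Hypothetical endpoint names; real ones come from the DocumentDB console or API.
const endpoints = {
  cluster: "my-cluster.cluster-abc.us-east-1.docdb.amazonaws.com",
  reader: "my-cluster.cluster-ro-abc.us-east-1.docdb.amazonaws.com",
};

// Writes must reach the primary, so they always use the cluster endpoint.
// Reads that tolerate a little replica lag can use the reader endpoint;
// reads that need the latest data fall back to the cluster endpoint.
function endpointFor(operation, { allowStaleReads = true } = {}) {
  if (operation === "write") return endpoints.cluster;
  if (operation === "read") {
    return allowStaleReads ? endpoints.reader : endpoints.cluster;
  }
  throw new Error(`Unknown operation: ${operation}`);
}

console.log(endpointFor("write"));
console.log(endpointFor("read"));
console.log(endpointFor("read", { allowStaleReads: false }));
```

Keeping the choice in one function makes it easy to send read-after-write traffic to the primary while the rest of the reads spread across replicas.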
DocumentDB is NOT MongoDB. It implements the MongoDB wire protocol and API but is not binary-compatible with MongoDB. Features like full-text search (Atlas Search), MongoDB Charts, Change Streams (limited support), and some aggregation pipeline stages are absent or behave differently. Test your application thoroughly before migrating.
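One way to make "test thoroughly" concrete is a pre-migration check that flags aggregation stages you have not yet verified. The sketch below is an assumption-heavy illustration: `verifiedStages` is an allow-list you would maintain yourself from the DocumentDB supported-API documentation, and `$exampleUnverifiedStage` is a made-up name standing in for any stage you have not checked.

```javascript
// Allow-list of pipeline stages you have verified for your engine version.
// The four stages here are the ones used in this article; extend the set yourself.
const verifiedStages = new Set(["$match", "$group", "$sort", "$limit"]);

// Return the stage operators in a pipeline that are not on the allow-list.
function unverifiedStages(pipeline, allowed = verifiedStages) {
  return pipeline
    .map((stage) => Object.keys(stage)[0])
    .filter((name) => !allowed.has(name));
}

const pipeline = [
  { $match: { status: "completed" } },
  { $exampleUnverifiedStage: {} }, // made-up name, not a real operator
  { $limit: 10 },
];

console.log(unverifiedStages(pipeline)); // [ '$exampleUnverifiedStage' ]
```

A check like this only tells you where to look; behavior differences inside a supported stage still need integration tests against a real cluster.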
DocumentDB vs Self-Managed MongoDB vs MongoDB Atlas
| Factor | DocumentDB | Self-Managed MongoDB on EC2 | MongoDB Atlas |
|---|---|---|---|
| Management overhead | Low - AWS manages infra | High - you manage everything | Low - MongoDB manages infra |
| MongoDB feature parity | Partial (3.6/4.0/5.0 API subset) | Full | Full + extras (Atlas Search, Charts) |
| AWS integration | Native (IAM, VPC, CloudWatch, KMS) | Manual | Limited (cross-cloud) |
| Change Streams | Limited support | Full | Full |
| Full-text search | Limited (basic text indexes on 5.0; no Atlas Search) | Built-in text indexes or an external search engine | Atlas Search (Lucene-based) |
| Cost at scale | Competitive with Atlas | Lower compute; higher ops cost | Higher list price |
| Best for | Existing MongoDB apps moving to AWS | Full MongoDB feature set needed | Full MongoDB + managed + multi-cloud |
Indexing and Query Patterns
DocumentDB supports single field, compound, multi-key (array), sparse, and TTL indexes. Without the right indexes, queries degrade to collection scans which are expensive at scale.
// Basic operations in MongoDB shell syntax; DocumentDB accepts the same commands
// Create a compound index on userId ascending, createdAt descending
db.orders.createIndex({ userId: 1, createdAt: -1 })
// Create a TTL index: with expireAfterSeconds: 0, each document is deleted once
// the time in its expiresAt field passes (set expiresAt to now + 7 days for a 7-day lifetime)
db.sessions.createIndex(
  { expiresAt: 1 },
  { expireAfterSeconds: 0 }
)
// Explain a query to check index usage
db.orders.find({ userId: "user123" }).explain("executionStats")
// Aggregation pipeline example: top 10 users by total spend on completed orders since 2024-01-01
db.orders.aggregate([
  { $match: { status: "completed", createdAt: { $gte: new Date("2024-01-01") } } },
  { $group: { _id: "$userId", totalSpent: { $sum: "$amount" }, count: { $sum: 1 } } },
  { $sort: { totalSpent: -1 } },
  { $limit: 10 }
])
Use explain("executionStats") to verify that queries use an index (look for IXSCAN rather than COLLSCAN in the winningPlan). A COLLSCAN on a large collection is almost always a performance problem.
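The IXSCAN versus COLLSCAN check can be automated with a short helper that walks the plan tree. This is a sketch under assumptions: the helper names are hypothetical, the field names (`queryPlanner`, `winningPlan`, `stage`, `inputStage`, `inputStages`) follow the usual explain layout, and the two sample plans are hand-written rather than captured from a real cluster.

```javascript
// Collect every stage name in a plan tree by following inputStage / inputStages.
function collectStages(plan, found = []) {
  if (!plan || typeof plan !== "object") return found;
  if (plan.stage) found.push(plan.stage);
  if (plan.inputStage) collectStages(plan.inputStage, found);
  for (const child of plan.inputStages || []) collectStages(child, found);
  return found;
}

// True when any stage of the winning plan is a full collection scan.
function usesCollectionScan(explainOutput) {
  const plan = explainOutput?.queryPlanner?.winningPlan;
  return collectStages(plan).includes("COLLSCAN");
}

// Hand-written samples shaped like explain output.
const indexed = {
  queryPlanner: {
    winningPlan: { stage: "FETCH", inputStage: { stage: "IXSCAN" } },
  },
};
const scanned = { queryPlanner: { winningPlan: { stage: "COLLSCAN" } } };

console.log(usesCollectionScan(indexed)); // false
console.log(usesCollectionScan(scanned)); // true
```

In the shell, the same function could be given the result of `db.orders.find({ userId: "user123" }).explain("executionStats")`, for example as part of a test that fails when a critical query loses its index.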
Pricing Model
| Component | Pricing Basis | Tip |
|---|---|---|
| Instance hours | Per hour by instance class | Reserve production instances 1-3 years |
| Storage | Per GB-month (grows automatically) | Same as Aurora - no pre-provisioning needed |
| I/O requests | Per million I/Os | Optimize queries to reduce scan I/O |
| Backup storage | Free up to cluster size; per GB beyond | Reduce retention on dev/staging |
| Data transfer | Cross-AZ and cross-region charges apply | Place app and DB in same AZ |
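The pricing components combine into a simple monthly estimate. The sketch below shows only the arithmetic: every rate is a made-up placeholder, not an AWS price, the function and field names are hypothetical, and data transfer is left out for brevity. Substitute current rates for your region before using the numbers.

```javascript
// All rates are made-up placeholders, NOT AWS prices.
const rates = {
  instanceHour: 0.3,    // USD per instance-hour
  storageGbMonth: 0.1,  // USD per GB-month
  ioPerMillion: 0.2,    // USD per million I/O requests
  backupGbMonth: 0.02,  // USD per GB-month beyond the free allowance
};

function estimateMonthlyCost(usage, r = rates) {
  const { instances, hours, storageGb, ioMillions, backupGb } = usage;
  // Backup storage is free up to the size of the cluster volume.
  const billableBackupGb = Math.max(0, backupGb - storageGb);
  return {
    compute: instances * hours * r.instanceHour,
    storage: storageGb * r.storageGbMonth,
    io: ioMillions * r.ioPerMillion,
    backup: billableBackupGb * r.backupGbMonth,
  };
}

const parts = estimateMonthlyCost({
  instances: 3,     // one primary plus two replicas
  hours: 730,       // roughly one month
  storageGb: 500,
  ioMillions: 1000,
  backupGb: 700,
});
const total = Object.values(parts).reduce((sum, x) => sum + x, 0);

console.log(parts);
console.log(`Estimated total: $${total.toFixed(2)}`);
```

Breaking the estimate into parts shows which lever matters: with these placeholder inputs compute dominates, so reserved instances help more than trimming backup retention.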
Security and Network Configuration
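DocumentDB runs inside a VPC. All connections must come from within the VPC (or a network peered or tunneled into it); there is no public endpoint option. TLS is enabled by default on new clusters. It is controlled by the tls cluster parameter and can be turned off, but leaving it on is strongly recommended.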
| Security Feature | Details |
|---|---|
| Encryption at rest | AES-256 via KMS; must be enabled at cluster creation |
| Encryption in transit | TLS enabled by default (tls cluster parameter); download the CA bundle for your client |
| Authentication | Username/password; Secrets Manager recommended for rotation |
| Network | VPC only; no public endpoint; use SSH tunnel or bastion for admin access |
| IAM authentication | Supported on 5.0 instance-based clusters (IAM users and roles); not available on 3.6 or 4.0 |
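When credentials come from a secret store rather than being typed by hand, the username and password must be percent-encoded before they go into the connection URI. The following is a minimal sketch: `buildDocdbUri` is a hypothetical helper, the secret field names (`username`, `password`, `host`, `port`) are an assumption about how the secret is stored, and the host and credentials are placeholders.

```javascript
// Build a DocumentDB connection URI from a secret-shaped object.
// Field names are assumed; adjust them to match your own secret.
function buildDocdbUri(secret, caFile = "global-bundle.pem") {
  // Characters such as @ / : in credentials would otherwise break URI parsing.
  const user = encodeURIComponent(secret.username);
  const pass = encodeURIComponent(secret.password);
  const params = new URLSearchParams({
    tls: "true",
    tlsCAFile: caFile,
    replicaSet: "rs0",
    readPreference: "secondaryPreferred",
    retryWrites: "false",
  });
  return `mongodb://${user}:${pass}@${secret.host}:${secret.port}/?${params}`;
}

const uri = buildDocdbUri({
  username: "appuser",
  password: "p@ss/word:1", // placeholder containing characters that need escaping
  host: "my-cluster.cluster-abc.us-east-1.docdb.amazonaws.com",
  port: 27017,
});

console.log(uri);
```

The query parameters mirror the mongosh connection string used in this section, so the same URI works for both a driver and the shell.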
# Download the DocumentDB CA certificate bundle
wget https://truststore.pki.rds.amazonaws.com/global/global-bundle.pem
# Connect using mongosh with TLS
mongosh "mongodb://admin:password@my-cluster.cluster-abc.us-east-1.docdb.amazonaws.com:27017/?tls=true&tlsCAFile=global-bundle.pem&replicaSet=rs0&readPreference=secondaryPreferred&retryWrites=false"
Interview Focus Points
1. What is DocumentDB and how does it differ from MongoDB? What are the key incompatibilities?
2. When would you choose DocumentDB over MongoDB Atlas or self-managed MongoDB?
3. Explain the DocumentDB cluster architecture. How is it similar to Aurora?
4. How do you connect to DocumentDB from an application? What are the TLS requirements?
5. What indexing strategies matter most for DocumentDB performance?
6. How does DocumentDB handle failover? What is the typical failover time?
7. What is the replicaSet=rs0 connection string parameter and why is it required?
8. How do you migrate an existing MongoDB database to DocumentDB with minimal downtime?