Ace Cloud Interviews

AWS Database

DocumentDB

Managed MongoDB-compatible document database for JSON data

Amazon DocumentDB is a fully managed document database service that implements the MongoDB 3.6, 4.0, and 5.0 APIs, allowing applications built for MongoDB to run on DocumentDB with minimal code changes. It stores, queries, and indexes JSON documents using a distributed storage architecture similar to Aurora. DocumentDB is the choice for teams that need MongoDB compatibility without managing MongoDB infrastructure themselves.

How DocumentDB Works: Cluster Architecture

DocumentDB uses an Aurora-derived storage architecture: a distributed cluster volume that keeps 6 copies of the data across 3 AZs and grows automatically in 10 GB increments up to 128 TB. A cluster has one primary instance and up to 15 read replicas, all sharing the same storage volume.

| Component | Description |
| --- | --- |
| Cluster endpoint | DNS name pointing to the primary - for writes |
| Reader endpoint | DNS name load-balancing across all read replicas |
| Instance endpoint | Direct DNS name per instance - for specific routing |
| Cluster volume | Shared distributed storage; 6 copies across 3 AZs |
| Replica lag | Typically < 100 ms (same storage - no data copying) |
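As a rough illustration of how the endpoints in the table divide the work, the sketch below routes writes to the cluster endpoint and lag-tolerant reads to the reader endpoint. The host names, the `pickEndpoint` helper, and the list of write operations are all hypothetical; in practice a driver connected in replica set mode with a read preference usually does this routing for you.

```javascript
// Hypothetical endpoint names; replace them with the values shown for your cluster.
const endpoints = {
  cluster: "my-cluster.cluster-abc.us-east-1.docdb.amazonaws.com",
  reader: "my-cluster.cluster-ro-abc.us-east-1.docdb.amazonaws.com",
};

// Operations that modify data must reach the primary via the cluster endpoint.
const WRITE_OPERATIONS = new Set(["insert", "update", "delete", "createIndex"]);

// Return the host an operation should be sent to.
function pickEndpoint(operation, { allowStaleReads = true } = {}) {
  if (WRITE_OPERATIONS.has(operation)) return endpoints.cluster;
  // Replicas can lag slightly, so read-after-write paths should use the primary.
  return allowStaleReads ? endpoints.reader : endpoints.cluster;
}
```

For example, `pickEndpoint("find", { allowStaleReads: false })` returns the cluster endpoint, which is the safer choice when a read must see a write that was just made.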
⚠️

DocumentDB is NOT MongoDB. It implements the MongoDB wire protocol and a subset of the API, but it is a separate engine, so behavior can differ. Atlas-only features such as Atlas Search and MongoDB Charts are unavailable, Change Streams have limited support, and some aggregation pipeline stages are absent or behave differently. Test your application thoroughly before migrating.

DocumentDB vs Self-Managed MongoDB vs MongoDB Atlas

| Factor | DocumentDB | Self-Managed MongoDB on EC2 | MongoDB Atlas |
| --- | --- | --- | --- |
| Management overhead | Low - AWS manages infra | High - you manage everything | Low - MongoDB manages infra |
| MongoDB feature parity | Partial (3.6/4.0/5.0 API subset) | Full | Full + extras (Atlas Search, Charts) |
| AWS integration | Native (IAM, VPC, CloudWatch, KMS) | Manual | Limited (cross-cloud) |
| Change Streams | Limited support | Full | Full |
| Full-text search | Basic text indexes on 5.0 only; no Atlas Search | Built-in $text indexes, or an external search engine | Atlas Search (Lucene-based) |
| Cost at scale | Competitive with Atlas | Lower compute; higher ops cost | Higher list price |
| Best for | Existing MongoDB apps moving to AWS | Full MongoDB feature set needed | Full MongoDB + managed + multi-cloud |

Indexing and Query Patterns

DocumentDB supports single-field, compound, multi-key (array), sparse, and TTL indexes. Without the right indexes, queries degrade to collection scans, which are expensive at scale.

```javascript
// MongoDB shell (mongosh) syntax; DocumentDB accepts the same commands.

// Compound index: userId ascending, createdAt descending
db.orders.createIndex({ userId: 1, createdAt: -1 })

// TTL index: with expireAfterSeconds: 0, each document is deleted once the
// time stored in its expiresAt field has passed. For a 7-day lifetime,
// set expiresAt to the creation time plus 7 days when inserting.
db.sessions.createIndex(
  { expiresAt: 1 },
  { expireAfterSeconds: 0 }
)

// Explain a query to check index usage
db.orders.find({ userId: "user123" }).explain("executionStats")

// Aggregation pipeline: top 10 users by completed-order spend since 2024-01-01
db.orders.aggregate([
  { $match: { status: "completed", createdAt: { $gte: new Date("2024-01-01") } } },
  { $group: { _id: "$userId", totalSpent: { $sum: "$amount" }, count: { $sum: 1 } } },
  { $sort: { totalSpent: -1 } },
  { $limit: 10 }
])
```
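A TTL index with `expireAfterSeconds: 0`, like the one on `sessions` above, removes each document once the time in its `expiresAt` field has passed, so the 7-day lifetime has to be written into the document itself. The sketch below shows one way to do that; the `newSession` helper and its field names other than `expiresAt` are hypothetical.

```javascript
const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

// Build a session document whose expiresAt is a fixed lifetime after `now`.
function newSession(userId, now = new Date(), lifetimeMs = SEVEN_DAYS_MS) {
  return {
    userId,
    createdAt: now,
    expiresAt: new Date(now.getTime() + lifetimeMs),
  };
}

// In mongosh, the result can be inserted directly:
// db.sessions.insertOne(newSession("user123"))
```

The alternative is to index `createdAt` with `expireAfterSeconds: 604800`, which fixes the lifetime in the index rather than per document.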
💡

Use explain("executionStats") to verify that queries use an index (look for IXSCAN vs COLLSCAN in the winningPlan). A COLLSCAN on a large collection is almost always a performance problem.
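Checking for a COLLSCAN can be automated, for example in a test suite. The sketch below walks an explain result and reports whether the winning plan contains a collection scan. It assumes the plan is nested through `inputStage` or `inputStages` fields under `queryPlanner.winningPlan`; the helper names are hypothetical, and the exact explain shape can vary by engine version, so treat it as a starting point.

```javascript
// Collect every stage name in a plan tree, following inputStage / inputStages.
function collectStages(plan, stages = []) {
  if (!plan || typeof plan !== "object") return stages;
  if (plan.stage) stages.push(plan.stage);
  if (plan.inputStage) collectStages(plan.inputStage, stages);
  for (const child of plan.inputStages || []) collectStages(child, stages);
  return stages;
}

// True when the winning plan contains a full collection scan.
function usesCollectionScan(explainOutput) {
  const plan = explainOutput?.queryPlanner?.winningPlan;
  return collectStages(plan).includes("COLLSCAN");
}

// In mongosh:
// usesCollectionScan(db.orders.find({ userId: "user123" }).explain("executionStats"))
```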

Pricing Model

| Component | Pricing Basis | Tip |
| --- | --- | --- |
| Instance hours | Per hour by instance class | Reserve production instances 1-3 years |
| Storage | Per GB-month (grows automatically) | Same as Aurora - no pre-provisioning needed |
| I/O requests | Per million I/Os | Optimize queries to reduce scan I/O |
| Backup storage | Free up to cluster size; per GB beyond | Reduce retention on dev/staging |
| Data transfer | Cross-AZ and cross-region charges apply | Place app and DB in same AZ |
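To see how these components add up, the sketch below combines them into a monthly estimate. Every rate in `exampleRates` is a made-up placeholder, not an AWS price, and the `estimateMonthlyCost` helper is hypothetical; data transfer is left out for brevity.

```javascript
// HYPOTHETICAL rates for illustration only; look up current prices for your Region.
const exampleRates = {
  instanceHour: 0.25,   // USD per instance-hour
  storageGbMonth: 0.10, // USD per GB-month
  ioPerMillion: 0.20,   // USD per million I/O requests
  backupGbMonth: 0.02,  // USD per GB-month beyond the free allowance
};

// Estimate a monthly bill from usage figures (data transfer not included).
function estimateMonthlyCost(usage, rates = exampleRates) {
  const instances = usage.instanceCount * usage.hoursPerMonth * rates.instanceHour;
  const storage = usage.storageGb * rates.storageGbMonth;
  const io = (usage.ioRequests / 1e6) * rates.ioPerMillion;
  // Backup storage is free up to the size of the cluster volume.
  const billableBackupGb = Math.max(0, usage.backupGb - usage.storageGb);
  const backup = billableBackupGb * rates.backupGbMonth;
  return { instances, storage, io, backup, total: instances + storage + io + backup };
}

const estimate = estimateMonthlyCost({
  instanceCount: 3,   // one primary plus two replicas
  hoursPerMonth: 730,
  storageGb: 500,
  ioRequests: 200e6,
  backupGb: 600,
});
console.log(estimate.total.toFixed(2));
```

With these placeholder rates, instance hours dominate the total, which is why instance sizing and replica count are usually the first levers to look at.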

Security and Network Configuration

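DocumentDB runs inside a VPC. All connections must originate from within the VPC or from a network connected to it (for example through VPC peering, a VPN, or an SSH tunnel); there is no public endpoint option. TLS is enabled by default and is controlled by the tls parameter in the cluster parameter group; keep it enabled outside of throwaway test clusters.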

| Security Feature | Details |
| --- | --- |
| Encryption at rest | AES-256 via KMS; must be enabled at cluster creation |
| Encryption in transit | TLS enabled by default; download the CA bundle for your client |
| Authentication | Username/password; Secrets Manager recommended for rotation |
| Network | VPC only; no public endpoint; use SSH tunnel or bastion for admin access |
| IAM authentication | Supported on 5.0 instance-based clusters; not available on earlier versions |
```bash
# Download the Amazon RDS CA certificate bundle, which DocumentDB also uses
wget https://truststore.pki.rds.amazonaws.com/global/global-bundle.pem

# Connect using mongosh with TLS (replace <username> and <password>)
mongosh "mongodb://<username>:<password>@my-cluster.cluster-abc.us-east-1.docdb.amazonaws.com:27017/?tls=true&tlsCAFile=global-bundle.pem&replicaSet=rs0&readPreference=secondaryPreferred&retryWrites=false"
```
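The connection string above embeds the credentials directly, which breaks if the password contains characters such as `@` or `/`. The sketch below assembles the same URI from its parts and percent-encodes the credentials. The `buildDocdbUri` helper and its parameter names are hypothetical; the query options are the ones used in the command above.

```javascript
// Build a DocumentDB connection string with the options used above.
// Credentials are percent-encoded so characters such as "@" or "/" stay valid.
function buildDocdbUri({ host, username, password, port = 27017, caFile = "global-bundle.pem" }) {
  const credentials = `${encodeURIComponent(username)}:${encodeURIComponent(password)}`;
  const options = new URLSearchParams({
    tls: "true",
    tlsCAFile: caFile,
    replicaSet: "rs0",
    readPreference: "secondaryPreferred",
    retryWrites: "false",
  });
  return `mongodb://${credentials}@${host}:${port}/?${options}`;
}
```

In an application, the username and password would typically come from Secrets Manager or environment variables rather than source code, and the resulting string is passed to the MongoDB driver's client constructor.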
🎯

Interview Focus Points

  1. What is DocumentDB and how does it differ from MongoDB? What are the key incompatibilities?
  2. When would you choose DocumentDB over MongoDB Atlas or self-managed MongoDB?
  3. Explain the DocumentDB cluster architecture. How is it similar to Aurora?
  4. How do you connect to DocumentDB from an application? What are the TLS requirements?
  5. What indexing strategies matter most for DocumentDB performance?
  6. How does DocumentDB handle failover? What is the typical failover time?
  7. What is the replicaSet=rs0 connection string parameter and why is it required?
  8. How do you migrate an existing MongoDB database to DocumentDB with minimal downtime?