Ace Cloud Interviews

AWS Migration & Transfer

Database Migration Service

Migrate databases to AWS with minimal downtime using continuous replication

AWS Database Migration Service (DMS) migrates databases to AWS with minimal downtime by replicating data continuously while your source remains operational. It supports homogeneous migrations (Oracle to Oracle) and heterogeneous migrations (Oracle to Aurora PostgreSQL) across dozens of database engines. For cloud engineers, DMS is the go-to tool for live database cutovers that keep production running during migration.

How DMS Replicates Data

DMS uses a three-phase process: full load, CDC (change data capture), and cutover. A replication instance runs the migration engine, connecting to both source and target endpoints.

Phase | What Happens | Duration
Full Load | Bulk copy of existing data from source to target | Hours to days depending on DB size
CDC (Change Data Capture) | Continuously replicates inserts, updates, deletes from source transaction logs | Ongoing until cutover
Cutover | Stop writes to source, wait for CDC lag to reach zero, redirect application | Minutes

CDC relies on the source database's native replication mechanism - binary logs for MySQL, redo logs for Oracle, WAL for PostgreSQL. The replication instance must have network access to both endpoints.
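
The cutover rule in the table (wait for CDC lag to reach zero before redirecting the application) can be sketched as a small check over recent lag readings. This is a minimal illustration, not DMS tooling: `LagSample`, `ready_for_cutover`, and the thresholds are hypothetical names and values, and it assumes you have already collected source and target latency readings in seconds from your monitoring system.

```python
from dataclasses import dataclass


@dataclass
class LagSample:
    """One CDC lag reading in seconds (hypothetical structure)."""
    source_latency_s: float
    target_latency_s: float


def ready_for_cutover(samples, max_lag_s=1.0, stable_readings=3):
    """Return True when the newest `stable_readings` samples are all within max_lag_s.

    Requiring several consecutive low readings avoids cutting over on a
    single lucky sample while the task is still catching up.
    """
    if len(samples) < stable_readings:
        return False
    recent = samples[-stable_readings:]
    return all(
        s.source_latency_s <= max_lag_s and s.target_latency_s <= max_lag_s
        for s in recent
    )


if __name__ == "__main__":
    history = [
        LagSample(42, 55),
        LagSample(3, 4),
        LagSample(0, 1),
        LagSample(0, 0),
        LagSample(0, 0),
    ]
    print(ready_for_cutover(history))  # True: the last three readings are <= 1 s
```

In practice this check only makes sense after writes to the source have been stopped, since lag can climb again while the source is still taking traffic.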

💡

Size your replication instance correctly. As a rule of thumb, use dms.t3.medium for small databases under 1 TB, and dms.r5.xlarge or larger for multi-TB migrations or high-throughput CDC. An undersized instance is a common cause of replication lag.
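
The sizing rule of thumb above can be written down as a tiny helper. This is only a sketch: `suggest_instance_class` is a hypothetical function, the `dms.` prefix reflects how DMS names its instance classes, and treating exactly 1 TB and above as "large" is an assumption, since the tip leaves that range open.

```python
def suggest_instance_class(db_size_tb: float, high_throughput_cdc: bool = False) -> str:
    """Suggest a starting DMS instance class from database size and CDC load.

    Assumption: anything at or above 1 TB, or any high-throughput CDC
    workload, starts at dms.r5.xlarge and is scaled up from there.
    """
    if db_size_tb < 0:
        raise ValueError("db_size_tb must be non-negative")
    if db_size_tb < 1 and not high_throughput_cdc:
        return "dms.t3.medium"
    return "dms.r5.xlarge"


if __name__ == "__main__":
    print(suggest_instance_class(0.2))        # dms.t3.medium
    print(suggest_instance_class(5))          # dms.r5.xlarge
    print(suggest_instance_class(0.2, True))  # dms.r5.xlarge
```

Treat the result as a starting point and adjust it based on observed replication lag and memory usage.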

Supported Sources and Targets

DMS supports a wide range of engines as both sources and targets. Not all combinations support all features - check the DMS docs for CDC support per engine.

Engine | As Source | As Target | CDC Support
MySQL / MariaDB | Yes | Yes | Yes (binlog)
PostgreSQL | Yes | Yes | Yes (WAL / pglogical)
Oracle | Yes | Yes | Yes (LogMiner / Binary Reader)
SQL Server | Yes | Yes | Yes (MS-Replication / MS-CDC)
Amazon Aurora | Yes | Yes | Yes
Amazon DynamoDB | No | Yes | N/A (target only)
Amazon Redshift | No | Yes | N/A (target only)
S3 | Yes (files) | Yes (files) | File-based only (CDC files)
MongoDB | Yes | No | Yes
⚠️

Oracle as a source with LogMiner requires the source DB to be in ARCHIVELOG mode, with supplemental logging enabled at the database level (minimal) and at the table level for every replicated table. If this is missing, DMS can silently miss DML changes.
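
As an illustration of the Oracle prerequisites above, the following sketch generates the SQL a DBA might review and run before starting a task. The helper name `supplemental_logging_sql` and the example schema and tables are hypothetical; the statements use standard Oracle syntax and assume each table has a primary key.

```python
import re

# Query a DBA can run to confirm the database is in ARCHIVELOG mode.
CHECK_ARCHIVELOG_SQL = "SELECT log_mode FROM v$database;"

# Simple unquoted Oracle identifiers only, so names can be embedded safely.
_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_$#]*")


def supplemental_logging_sql(tables):
    """Build statements enabling supplemental logging for the given tables.

    `tables` is a list of (schema, table) pairs. The first statement turns on
    minimal database-level supplemental logging; the rest add primary-key
    supplemental logging per table.
    """
    statements = ["ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;"]
    for schema, table in tables:
        for name in (schema, table):
            if not _IDENTIFIER.fullmatch(name):
                raise ValueError(f"unsupported identifier: {name!r}")
        statements.append(
            f"ALTER TABLE {schema}.{table} "
            "ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS;"
        )
    return statements


if __name__ == "__main__":
    print(CHECK_ARCHIVELOG_SQL)
    for stmt in supplemental_logging_sql([("HR", "EMPLOYEES"), ("HR", "DEPARTMENTS")]):
        print(stmt)
```

Generating the statements rather than executing them keeps the change reviewable, which matters on a production source.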

Task Types and Key Settings

Each DMS replication task defines what to migrate, how to map schemas, and how to handle errors. Task settings are JSON documents that override defaults.

Task Type | Use Case
Full load only | One-time bulk migration, source can go offline during migration
CDC only | Already have data on target, just need to keep in sync
Full load + CDC | Most common - live migration with a near-zero-downtime cutover
bash
# Create a replication task via CLI
aws dms create-replication-task \
  --replication-task-identifier my-migration \
  --source-endpoint-arn arn:aws:dms:us-east-1:123456789012:endpoint:SOURCE \
  --target-endpoint-arn arn:aws:dms:us-east-1:123456789012:endpoint:TARGET \
  --replication-instance-arn arn:aws:dms:us-east-1:123456789012:rep:INSTANCE \
  --migration-type full-load-and-cdc \
  --table-mappings file://table-mappings.json \
  --replication-task-settings file://task-settings.json
💡

Table mappings use inclusion/exclusion rules to select which schemas and tables to migrate. You can also apply transformation rules to rename schemas, tables, or columns during migration - useful when the target uses a different naming convention.
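
To make the selection and transformation rules concrete, here is a minimal sketch that builds a table-mappings document of the kind passed via `--table-mappings`. The helper functions and the `HR` schema, `TMP_%` pattern, and target name `hr` are hypothetical examples; the rule keys follow the DMS table-mapping JSON format as I understand it, so check them against the DMS documentation before relying on them.

```python
import json


def selection_rule(rule_id, schema, table="%", action="include"):
    """Selection rule: include or exclude tables matching schema/table patterns."""
    return {
        "rule-type": "selection",
        "rule-id": str(rule_id),
        "rule-name": f"select-{rule_id}",
        "object-locator": {"schema-name": schema, "table-name": table},
        "rule-action": action,
    }


def rename_schema_rule(rule_id, old_schema, new_schema):
    """Transformation rule: rename a schema on the target."""
    return {
        "rule-type": "transformation",
        "rule-id": str(rule_id),
        "rule-name": f"rename-{rule_id}",
        "rule-target": "schema",
        "object-locator": {"schema-name": old_schema},
        "rule-action": "rename",
        "value": new_schema,
    }


def build_table_mappings(rules):
    """Serialize rules to JSON, rejecting duplicate rule IDs."""
    ids = [rule["rule-id"] for rule in rules]
    if len(ids) != len(set(ids)):
        raise ValueError("rule-id values must be unique")
    return json.dumps({"rules": rules}, indent=2)


if __name__ == "__main__":
    mappings = build_table_mappings([
        selection_rule(1, "HR"),                              # include every HR table
        selection_rule(2, "HR", "TMP_%", action="exclude"),   # skip scratch tables
        rename_schema_rule(3, "HR", "hr"),                    # lowercase name on target
    ])
    print(mappings)
```

The printed JSON could be saved as `table-mappings.json` and referenced from the `create-replication-task` command shown earlier.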

Data Validation and Monitoring

DMS has a built-in data validation feature that compares source and target data row by row. Validation starts once the full load of a table completes and, for CDC tasks, continues to check changes as they are applied. Enable it via task settings.

Excerpt from task-settings.json:

json
{
  "ValidationSettings": {
    "EnableValidation": true,
    "ValidationMode": "ROW_LEVEL",
    "ThreadCount": 5,
    "FailureMaxCount": 100
  },
  "Logging": {
    "EnableLogging": true,
    "LogComponents": [
      {"Id": "SOURCE_UNLOAD", "Severity": "LOGGER_SEVERITY_DEFAULT"},
      {"Id": "TARGET_LOAD", "Severity": "LOGGER_SEVERITY_DEFAULT"},
      {"Id": "TASK_MANAGER", "Severity": "LOGGER_SEVERITY_DEBUG"}
    ]
  }
}

Metric to Watch | CloudWatch Metric | What It Means
CDC lag | CDCLatencySource / CDCLatencyTarget | Seconds behind source - should approach 0 at cutover
Full load progress | FullLoadThroughputRowsSource | Rows read per second during full load
Error count | TaskErrorCount | Non-zero means data loss risk
Memory usage | MemoryUsage | High memory on replication instance causes slowdowns
⚠️

DMS validation adds significant load to both source and target. For large tables, run validation during off-peak hours or validate a subset of critical tables only.

DMS Pricing Model

DMS charges for replication instance hours, the storage attached to the instance, and any applicable data transfer. There is no per-row or per-GB charge for the migration itself.

Cost Component | Rate (us-east-1) | Notes
Replication instance (dms.t3.medium) | ~$0.104/hr | Most small migrations fit this size
Replication instance (dms.r5.xlarge) | ~$0.480/hr | Use for high-throughput CDC or multi-TB
Storage (replication instance) | $0.115/GB/month | Used for log staging, typically 50-100 GB needed
Data transfer out | Standard EC2 rates | Only if source and target are in different regions
💡

DMS is free tier eligible: 750 hours of t2.micro/t3.micro replication instance per month for the first year. For a typical migration that runs for a few days, total cost is often under $20.
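
The "often under $20" figure can be checked with simple arithmetic. The sketch below uses the approximate rates from the pricing table; `estimate_cost` is a hypothetical helper, prorating storage by hours over a 730-hour month is an assumption, and data transfer is left out.

```python
# Approximate us-east-1 rates taken from the pricing table above.
HOURLY_RATE_USD = {"dms.t3.medium": 0.104, "dms.r5.xlarge": 0.480}
STORAGE_RATE_USD_PER_GB_MONTH = 0.115
HOURS_PER_MONTH = 730  # assumption: average month length used for prorating


def estimate_cost(instance_class: str, hours: float, storage_gb: float) -> float:
    """Estimate instance plus prorated storage cost in USD, rounded to cents."""
    instance_cost = HOURLY_RATE_USD[instance_class] * hours
    storage_cost = STORAGE_RATE_USD_PER_GB_MONTH * storage_gb * (hours / HOURS_PER_MONTH)
    return round(instance_cost + storage_cost, 2)


if __name__ == "__main__":
    # Three-day migration on a small instance with 50 GB of storage.
    print(estimate_cost("dms.t3.medium", hours=72, storage_gb=50))
    # Same duration on the larger instance with 100 GB of storage.
    print(estimate_cost("dms.r5.xlarge", hours=72, storage_gb=100))
```

Under these assumptions, a three-day run on the small instance comes to roughly $8, consistent with the estimate above.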

🎯

Interview Focus Points

  1. Explain the difference between full load, CDC, and full load + CDC task types and when you would use each.
  2. What is CDC lag and how do you ensure it reaches zero before cutting over to the new database?
  3. How would you migrate a 5TB Oracle database to Aurora PostgreSQL with less than 30 minutes of downtime?
  4. What supplemental logging changes must be made to an Oracle source database before starting DMS?
  5. How does DMS handle DDL changes (ALTER TABLE, DROP TABLE) during a live migration?
  6. What causes DMS to fall behind and how would you troubleshoot high CDC latency?
  7. Describe the role of the Schema Conversion Tool vs DMS in a heterogeneous database migration.
  8. How do you validate that the data in the target database matches the source after migration?
  9. What networking and IAM permissions does the DMS replication instance need to access source and target endpoints?