AWS Storage
EBS
High-performance block storage volumes attached to EC2 instances
Amazon EBS (Elastic Block Store) provides persistent block storage volumes that attach to EC2 instances like virtual hard drives, surviving instance stops and reboots. EBS volumes are AZ-specific, offer single-digit-millisecond latency (sub-millisecond on io2 Block Express), and come in SSD and HDD variants tuned for different I/O profiles. Understanding EBS volume types, IOPS provisioning, and snapshot mechanics is essential for designing performant and cost-efficient EC2-based workloads.
EBS Volume Types - SSD vs HDD
EBS offers six volume types across four families, split between SSD-backed (latency-sensitive, random I/O) and HDD-backed (throughput-oriented, sequential I/O). The choice has a large effect on both cost and performance.
| Volume Type | API Name | Max IOPS | Max Throughput | Use Case |
|---|---|---|---|---|
| General Purpose SSD | gp3 | 16,000 | 1,000 MB/s | Default for most workloads - boot volumes, dev/test, small DBs |
| General Purpose SSD (legacy) | gp2 | 16,000 | 250 MB/s | Older default - migrate to gp3 for better performance and cost |
| Provisioned IOPS SSD | io2 Block Express | 256,000 | 4,000 MB/s | I/O-intensive DBs: Oracle, SQL Server, SAP HANA |
| Provisioned IOPS SSD | io1 | 64,000 | 1,000 MB/s | Legacy high-performance, prefer io2 |
| Throughput Optimized HDD | st1 | 500 | 500 MB/s | Big data, data warehouses, log processing |
| Cold HDD | sc1 | 250 | 250 MB/s | Infrequently accessed, lowest cost block storage |
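gp3 is now the recommended default. Unlike gp2, gp3 decouples performance from volume size: every gp3 volume, even a 1 GB one, includes a baseline of 3,000 IOPS and 125 MB/s, and you can provision up to 16,000 IOPS and 1,000 MB/s independently for an additional per-unit charge. The maximums are still bounded by ratios (up to 500 IOPS per GB, so 16,000 IOPS needs a volume of at least 32 GB). gp2 ties IOPS to volume size: 3 IOPS/GB with a 100 IOPS minimum and a 16,000 IOPS cap, and volumes under 1,000 GB can burst to 3,000 IOPS.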
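The gp2 sizing rule can be sketched as a small shell function. This is a minimal illustration, assuming the 3 IOPS/GB rule above together with gp2's 100 IOPS floor and 16,000 IOPS cap; the function name `gp2_baseline_iops` and the sample sizes are hypothetical.

```shell
#!/bin/sh
# Sketch: gp2 baseline IOPS as a function of volume size in GiB.
# Assumes 3 IOPS per GiB, a floor of 100 IOPS, and a cap of 16,000 IOPS.
gp2_baseline_iops() (
  size_gib=$1
  iops=$(( size_gib * 3 ))
  [ "$iops" -lt 100 ] && iops=100
  [ "$iops" -gt 16000 ] && iops=16000
  echo "$iops"
)

# Compare against the flat 3,000 IOPS baseline that every gp3 volume includes.
for size in 20 500 1000 8000; do
  echo "gp2 ${size} GiB: $(gp2_baseline_iops "$size") baseline IOPS (gp3 baseline: 3000)"
done
```

Under these assumptions a gp2 volume needs to be about 1,000 GiB before its baseline matches what a gp3 volume of any size includes.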
HDD volumes (st1, sc1) cannot be used as boot volumes. Only SSD volumes (gp2, gp3, io1, io2) can be root volumes.
IOPS, Throughput, and Performance Tuning
IOPS (I/O Operations Per Second) measures how many read/write operations a volume can handle. Throughput measures how much data moves per second. Understanding both is required for database workload sizing.
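The two measures are linked by I/O size: throughput is roughly IOPS multiplied by the size of each operation. The sketch below works through that arithmetic; the function name and the 16 KiB and 64 KiB I/O sizes are illustrative assumptions, not values from a specific workload.

```shell
#!/bin/sh
# Sketch: throughput (MiB/s) implied by an IOPS rate at a given I/O size (KiB).
# Integer arithmetic, so results are rounded down.
throughput_mibs() (
  iops=$1
  io_kib=$2
  echo $(( iops * io_kib / 1024 ))
)

echo "16000 IOPS at 16 KiB: $(throughput_mibs 16000 16) MiB/s"
echo "16000 IOPS at 64 KiB: $(throughput_mibs 16000 64) MiB/s"
echo " 3000 IOPS at 16 KiB: $(throughput_mibs 3000 16) MiB/s"
```

With small I/O a volume tends to hit its IOPS limit long before its throughput limit, and with large sequential I/O the reverse is true, which is why both numbers matter when sizing a database volume.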
| Metric | gp3 Default (included) | gp3 Max | io2 Max |
|---|---|---|---|
| IOPS | 3,000 | 16,000 | 64,000 (256,000 on Block Express) |
| IOPS cost | Included | $0.005 per provisioned IOPS-month above 3,000 | $0.065 per provisioned IOPS-month |
| Throughput | 125 MB/s | 1,000 MB/s | 4,000 MB/s |
| Throughput cost | Included | $0.04 per MB/s-month above 125 | Included |
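The gp3 limits in the table interact with two ratios that are easy to miss when provisioning. The sketch below checks a requested configuration, assuming the 3,000 to 16,000 IOPS and 125 to 1,000 MB/s ranges from the table plus two rules stated here as assumptions from AWS's published gp3 limits: at most 500 provisioned IOPS per GiB, and at most 0.25 MiB/s of throughput per provisioned IOPS. The function name is hypothetical.

```shell
#!/bin/sh
# Sketch: sanity-check a gp3 request (size in GiB, IOPS, throughput in MiB/s).
# Assumed rules: 3000-16000 IOPS, 125-1000 MiB/s, <= 500 IOPS per GiB above
# the included baseline, and <= 0.25 MiB/s per provisioned IOPS.
gp3_config_check() (
  size_gib=$1
  iops=$2
  tput=$3
  if [ "$iops" -lt 3000 ] || [ "$iops" -gt 16000 ]; then
    echo "invalid: IOPS must be between 3000 and 16000"; exit 1
  fi
  if [ "$tput" -lt 125 ] || [ "$tput" -gt 1000 ]; then
    echo "invalid: throughput must be between 125 and 1000 MiB/s"; exit 1
  fi
  # The 3,000 IOPS baseline applies at any size; the ratio limits extra IOPS.
  max_iops=$(( size_gib * 500 ))
  [ "$max_iops" -lt 3000 ] && max_iops=3000
  if [ "$iops" -gt "$max_iops" ]; then
    echo "invalid: ${size_gib} GiB supports at most ${max_iops} IOPS"; exit 1
  fi
  if [ $(( tput * 4 )) -gt "$iops" ]; then
    echo "invalid: ${tput} MiB/s needs at least $(( tput * 4 )) IOPS"; exit 1
  fi
  echo "ok"
)

gp3_config_check 32 16000 1000
gp3_config_check 10 16000 125 || true
gp3_config_check 100 3000 1000 || true
```

The second and third calls fail for different reasons: the volume is too small for the requested IOPS, and the requested throughput needs more IOPS than were provisioned.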
The EC2 instance itself also has EBS bandwidth limits - the volume cannot exceed what the instance supports:
# Check EBS-optimized baseline throughput for an instance type
aws ec2 describe-instance-types \
  --instance-types r6i.2xlarge \
  --query "InstanceTypes[].EbsInfo"
# Check current volume performance (read operations per 60-second period)
aws cloudwatch get-metric-statistics \
  --namespace AWS/EBS \
  --metric-name VolumeReadOps \
  --dimensions Name=VolumeId,Value=vol-xxxxxxxxx \
  --start-time 2024-01-01T00:00:00Z \
  --end-time 2024-01-01T01:00:00Z \
  --period 60 \
  --statistics Sum

EBS-optimized instances have dedicated bandwidth for EBS traffic, separate from the network. Most modern instance types are EBS-optimized by default. Always check instance-level EBS bandwidth limits when provisioning high-IOPS volumes.
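The `Sum` values returned by CloudWatch are operation counts per period, not IOPS. A rough way to turn them into an average rate, and to compare the result against the lower of the volume and instance limits, is sketched below. The function names and sample numbers are hypothetical, and the calculation ignores idle time within the period.

```shell
#!/bin/sh
# Sketch: average IOPS for one CloudWatch period.
# Arguments: VolumeReadOps sum, VolumeWriteOps sum, period length in seconds.
avg_iops() (
  echo $(( ($1 + $2) / $3 ))
)

# The usable ceiling is the lower of the volume limit and the instance limit.
effective_iops_limit() (
  if [ "$1" -lt "$2" ]; then echo "$1"; else echo "$2"; fi
)

# Example: 120,000 reads and 60,000 writes in a 60-second period.
echo "average IOPS: $(avg_iops 120000 60000 60)"
# Example: a 16,000 IOPS volume on an instance assumed to allow 12,000 IOPS.
echo "effective limit: $(effective_iops_limit 16000 12000)"
```

If the average sits close to the effective limit for sustained periods, the bottleneck is likely the volume or the instance's EBS bandwidth rather than the application.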
Snapshots, AMIs, and Data Lifecycle
EBS snapshots are incremental backups stored in S3 (managed by AWS, not visible in your S3 console). The first snapshot is a full copy; subsequent snapshots only store changed blocks.
| Operation | Behavior | Key Detail |
|---|---|---|
| Create snapshot | Incremental, captures changed blocks | Volume can remain in use during snapshot |
| Restore from snapshot | New volume, lazy loading | Data loads on first access - can cause latency spikes |
| Copy snapshot | Copies to same or different region | Used for DR and cross-region AMI sharing |
| Fast Snapshot Restore (FSR) | Pre-warms blocks for immediate performance | Costs extra - charged per AZ per snapshot enabled |
| Snapshot sharing | Share with specific accounts or public | Shared snapshots are not billed to recipient for storage |
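The incremental behavior can be illustrated with a toy model in which each snapshot is a list of the block versions it references. This is only a sketch of the reference-counting idea, not how EBS stores data internally; the block names and the `stored_chunks` function are hypothetical.

```shell
#!/bin/sh
# Toy model: a snapshot is a list of block versions it references.
# "B2" means version 2 of block B. Unchanged blocks are shared, not re-copied.
snap1="A1 B1 C1 D1"   # first snapshot: full copy
snap2="A1 B2 C1 D1"   # block B changed since snap1
snap3="A1 B2 C3 D1"   # block C changed since snap2

# Count distinct block versions that must stay stored for the given snapshots.
stored_chunks() (
  # Unquoted $* splits every snapshot into its individual block versions.
  printf '%s\n' $* | sort -u | wc -l | tr -d ' '
)

echo "all three snapshots:  $(stored_chunks "$snap1" "$snap2" "$snap3") chunks"
echo "after deleting snap2: $(stored_chunks "$snap1" "$snap3") chunks"
echo "after deleting snap1: $(stored_chunks "$snap2" "$snap3") chunks"
```

In this model, deleting the intermediate snapshot frees nothing because the later snapshot still references block `B2`, while deleting the first snapshot frees only `B1`. Every remaining snapshot can still restore a complete volume.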
# Create a snapshot with a description
aws ec2 create-snapshot \
  --volume-id vol-xxxxxxxxx \
  --description "Pre-deployment backup 2024-01-15"
# Copy snapshot to another region for DR
# (run the command against the destination region with --region)
aws ec2 copy-snapshot \
  --region eu-west-1 \
  --source-region us-east-1 \
  --source-snapshot-id snap-xxxxxxxxx \
  --description "DR copy"
# Use DLM (Data Lifecycle Manager) to automate snapshots
aws dlm create-lifecycle-policy \
  --execution-role-arn arn:aws:iam::123456789012:role/AWSDataLifecycleManagerDefaultRole \
  --description "Daily snapshots" \
  --state ENABLED \
  --policy-details file://dlm-policy.json

When you restore a volume from a snapshot, the volume is created immediately but blocks are loaded lazily from S3 in the background. Production databases restored from snapshots can show severely degraded I/O until every block has been read once. Use Fast Snapshot Restore, or pre-warm the volume by reading all blocks before going live (dd if=/dev/xvda of=/dev/null bs=1M, substituting the restored volume's device name).
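The `create-lifecycle-policy` command in this section reads its schedule from `dlm-policy.json`, which is not shown. A minimal sketch of such a file follows, written with a heredoc so it can be created from the same shell session. The tag key and value, schedule name, start time, and retention count are all illustrative assumptions.

```shell
#!/bin/sh
# Sketch: write a minimal DLM policy that snapshots tagged volumes daily.
# Assumed values: tag Backup=daily, 03:00 UTC start, keep the last 7 snapshots.
cat > dlm-policy.json <<'EOF'
{
  "ResourceTypes": ["VOLUME"],
  "TargetTags": [{ "Key": "Backup", "Value": "daily" }],
  "Schedules": [
    {
      "Name": "DailySnapshots",
      "CopyTags": true,
      "CreateRule": { "Interval": 24, "IntervalUnit": "HOURS", "Times": ["03:00"] },
      "RetainRule": { "Count": 7 }
    }
  ]
}
EOF
echo "wrote dlm-policy.json"
```

Only volumes carrying the target tag are picked up, so the policy has no effect until the tag is applied to the volumes you want backed up.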
Encryption and Multi-Attach
EBS encryption uses AWS KMS and encrypts data at rest, in transit between the instance and volume, and in snapshots. Multi-Attach allows one io1/io2 volume to attach to up to 16 Nitro-based EC2 instances simultaneously.
| Feature | Detail |
|---|---|
| Encryption algorithm | AES-256 |
| Key options | AWS managed key (aws/ebs) or customer managed KMS key |
| Encrypted volume from unencrypted snapshot | Possible during copy - specify encryption and key |
| Unencrypted volume from encrypted snapshot | Not possible |
| Multi-Attach support | io1 and io2 only, same AZ, up to 16 instances |
| Multi-Attach file system requirement | Cluster-aware file system required (e.g. GFS2, OCFS2) - ext4/XFS will corrupt |
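The "encrypted volume from unencrypted snapshot" row implies a common workflow for encrypting an existing volume: snapshot it, copy the snapshot with encryption turned on, then create a new volume from the encrypted copy. The function below is a sketch of that sequence, assuming a single region; the function name and its arguments are hypothetical, and it is defined here without being invoked.

```shell
#!/bin/sh
# Sketch: produce an encrypted copy of an existing unencrypted volume.
# Arguments: volume ID, region, availability zone, KMS key ID or alias.
# Prints the ID of the new encrypted volume.
encrypt_volume_copy() (
  volume_id=$1
  region=$2
  az=$3
  kms_key=$4

  # 1. Snapshot the unencrypted volume and wait for it to complete.
  snap_id=$(aws ec2 create-snapshot --region "$region" \
    --volume-id "$volume_id" \
    --description "Pre-encryption snapshot of $volume_id" \
    --query SnapshotId --output text) || exit 1
  aws ec2 wait snapshot-completed --region "$region" --snapshot-ids "$snap_id" || exit 1

  # 2. Copy the snapshot, enabling encryption with the chosen key.
  enc_snap_id=$(aws ec2 copy-snapshot --region "$region" \
    --source-region "$region" --source-snapshot-id "$snap_id" \
    --encrypted --kms-key-id "$kms_key" \
    --query SnapshotId --output text) || exit 1
  aws ec2 wait snapshot-completed --region "$region" --snapshot-ids "$enc_snap_id" || exit 1

  # 3. Create a new volume from the encrypted snapshot.
  aws ec2 create-volume --region "$region" --availability-zone "$az" \
    --snapshot-id "$enc_snap_id" --volume-type gp3 \
    --query VolumeId --output text
)

# Example call (not run here):
#   encrypt_volume_copy vol-xxxxxxxxx us-east-1 us-east-1a alias/my-key
```

Attaching the encrypted volume in place of the original is a separate step not covered by this sketch.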
# Enable encryption by default for new volumes in the account
aws ec2 enable-ebs-encryption-by-default --region us-east-1
# Create an encrypted volume with a custom KMS key
aws ec2 create-volume \
  --availability-zone us-east-1a \
  --size 100 \
  --volume-type gp3 \
  --encrypted \
  --kms-key-id arn:aws:kms:us-east-1:123456789012:key/mrk-xxxxxxxxx

Multi-Attach requires a cluster-aware file system. Using a standard file system like ext4 or XFS with Multi-Attach will result in data corruption because they do not handle concurrent writes from multiple hosts.
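Creating and attaching a Multi-Attach volume might look like the sketch below. It assumes an io2 volume and Nitro-based instances in the same AZ, as the table requires; the function names, the `/dev/sdf` device name, and the example IDs are hypothetical, and the functions are defined without being invoked.

```shell
#!/bin/sh
# Sketch: create an io2 volume with Multi-Attach enabled.
# Arguments: availability zone, size in GiB, provisioned IOPS.
create_multi_attach_volume() (
  aws ec2 create-volume \
    --availability-zone "$1" \
    --size "$2" \
    --volume-type io2 \
    --iops "$3" \
    --multi-attach-enabled \
    --query VolumeId --output text
)

# Attach one volume to several instances (all must be in the volume's AZ).
# Arguments: volume ID, then one or more instance IDs.
attach_to_instances() (
  volume_id=$1
  shift
  for instance_id in "$@"; do
    aws ec2 attach-volume \
      --volume-id "$volume_id" \
      --instance-id "$instance_id" \
      --device /dev/sdf
  done
)

# Example calls (not run here):
#   vol=$(create_multi_attach_volume us-east-1a 100 10000)
#   attach_to_instances "$vol" i-aaaaaaaa i-bbbbbbbb
```

The attachment itself succeeds regardless of file system, so the cluster-aware file system requirement has to be enforced by whoever formats and mounts the volume.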
EBS Pricing and Cost Optimization
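EBS volumes are billed on provisioned capacity, IOPS, and throughput rather than on actual usage, so an over-provisioned volume costs the same whether it is busy or idle. That makes right-sizing critical. Snapshots are the exception: they are billed on the data actually stored.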
| Volume Type | Storage Cost (us-east-1) | IOPS Cost | Throughput Cost |
|---|---|---|---|
| gp3 | $0.08/GB-month | $0.005/IOPS above 3,000 | $0.04/MB/s above 125 |
| gp2 | $0.10/GB-month | Included (3 IOPS/GB) | Included |
| io2 | $0.125/GB-month | $0.065/IOPS up to 32,000; $0.0455 from 32,001 to 64,000; $0.032 above 64,000 | Included |
| st1 | $0.045/GB-month | Included | Included |
| sc1 | $0.015/GB-month | Included | Included |
| Snapshots | $0.05/GB-month | N/A | N/A |
Migrating from gp2 to gp3 typically saves 20% on storage costs while providing better baseline performance. Use AWS Compute Optimizer to identify oversized EBS volumes and volumes where gp2 to gp3 migration is recommended.
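The savings claim can be checked against the prices in the table. The sketch below estimates monthly cost for gp2 and gp3 using those us-east-1 figures; the function names and the 500 GB example are hypothetical, and actual prices vary by region and over time.

```shell
#!/bin/sh
# Sketch: monthly cost estimates using the us-east-1 prices from the table.
# gp3: $0.08/GB, $0.005 per IOPS above 3,000, $0.04 per MB/s above 125.
gp3_monthly_cost() {
  LC_ALL=C awk -v size="$1" -v iops="${2:-3000}" -v tput="${3:-125}" 'BEGIN {
    extra_iops = (iops > 3000) ? iops - 3000 : 0
    extra_tput = (tput > 125) ? tput - 125 : 0
    printf "%.2f\n", size * 0.08 + extra_iops * 0.005 + extra_tput * 0.04
  }'
}

# gp2: $0.10/GB with IOPS and throughput included.
gp2_monthly_cost() {
  LC_ALL=C awk -v size="$1" 'BEGIN { printf "%.2f\n", size * 0.10 }'
}

echo "gp2 500 GB:                      \$$(gp2_monthly_cost 500)"
echo "gp3 500 GB, baseline:            \$$(gp3_monthly_cost 500)"
echo "gp3 500 GB, 6000 IOPS, 250 MB/s: \$$(gp3_monthly_cost 500 6000 250)"
```

At baseline performance the gp3 volume is 20% cheaper, but provisioning extra IOPS and throughput can make it cost more than the gp2 volume it replaces, so the comparison is worth running per volume. An existing volume can be converted in place with `aws ec2 modify-volume --volume-id <id> --volume-type gp3`.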
Interview Focus Points
1. What is the difference between gp2 and gp3, and why should you migrate existing gp2 volumes?
2. A database on EC2 is experiencing high latency. Walk me through how you would diagnose an EBS I/O bottleneck.
3. Explain how EBS snapshots work incrementally and what happens when you delete an intermediate snapshot.
4. What is EBS Multi-Attach and what file system requirements does it impose?
5. How would you ensure an EBS-backed EC2 instance can survive an AZ failure?
6. What is the "lazy loading" behavior when restoring an EBS volume from a snapshot, and how do you handle it for production?
7. How do you encrypt an existing unencrypted EBS volume without downtime?
8. When would you choose io2 over gp3 for a database workload?
9. What is AWS Data Lifecycle Manager and how does it differ from creating snapshots manually?