AWS Storage
S3 Glacier
Low-cost archival storage with retrieval times from milliseconds to hours
S3 Glacier is AWS's purpose-built archival storage service, offering the lowest storage costs on the platform (as low as $0.00099/GB-month for Deep Archive) in exchange for retrieval delays ranging from milliseconds to 48 hours. It is designed for data that must be retained for years but is rarely accessed - compliance records, media archives, scientific data, and long-term backups. Understanding the retrieval tier trade-offs and archive management patterns is important for cost-effective long-term data retention strategies.
Glacier Storage Tiers and Retrieval Options
Glacier is now integrated into S3 as three storage classes, so most workloads never need the original vault-based Glacier API. Each class has different cost and retrieval time trade-offs.
| Tier | Storage Cost | Retrieval Options | Min Storage Duration |
|---|---|---|---|
| Glacier Instant Retrieval | $0.004/GB-month | Milliseconds (same as Standard-IA) | 90 days |
| Glacier Flexible Retrieval | $0.0036/GB-month | Expedited (1-5 min), Standard (3-5 hr), Bulk (5-12 hr) | 90 days |
| Glacier Deep Archive | $0.00099/GB-month | Standard (up to 12 hr), Bulk (up to 48 hr) | 180 days |
Retrieval tiers for Glacier Flexible Retrieval:
| Retrieval Tier | Speed | Cost (per GB + per request) | Use Case |
|---|---|---|---|
| Expedited | 1-5 minutes | $0.03/GB + $0.01/request | Occasional urgent access, small files (<250MB) |
| Standard | 3-5 hours | $0.01/GB + $0.0025/1,000 req | Default for non-urgent restores |
| Bulk | 5-12 hours | $0.0025/GB + $0.0025/1,000 req | Large restores where cost matters more than speed |
Provisioned retrieval capacity ensures that capacity for Expedited retrievals is available when you need it, so they complete in the usual 1-5 minutes even during peak demand. Each unit costs $100/month and provides at least 3 Expedited retrievals every 5 minutes and up to 150MB/s of retrieval throughput.
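To make the retrieval tier trade-off concrete, here is a minimal sketch that estimates the charge for one restore job. The helper name `retrieval_cost` is hypothetical, and the rates are simply the ones quoted in the table above passed in as parameters; actual prices depend on region and storage class.

```shell
#!/usr/bin/env bash
# retrieval_cost GB REQUESTS PER_GB_RATE PER_1000_REQUEST_RATE
# Prints an estimated retrieval charge in USD.
# Hypothetical helper for illustration; not part of the AWS CLI.
retrieval_cost() {
  awk -v gb="$1" -v req="$2" -v per_gb="$3" -v per_k="$4" \
    'BEGIN { printf "%.4f\n", gb * per_gb + (req / 1000) * per_k }'
}

# Restore 500 GB spread over 2,000 objects, using the table's quoted rates.
# Expedited is quoted per request: $0.01/request is $10 per 1,000 requests.
echo "Expedited: \$$(retrieval_cost 500 2000 0.03 10)"
echo "Standard:  \$$(retrieval_cost 500 2000 0.01 0.0025)"
echo "Bulk:      \$$(retrieval_cost 500 2000 0.0025 0.0025)"
```

With these inputs the per-request component dominates the Expedited estimate ($20 of $35), which is why Expedited suits a handful of urgent objects rather than a restore of many small ones.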
Archiving, Restoring, and Vault Operations
When using Glacier through S3 lifecycle policies, you interact with it entirely through S3 APIs. When using the original Glacier service directly, you use Vaults and Archives.
The most common pattern is transitioning S3 objects to Glacier storage classes via lifecycle policies, then restoring individual objects through the S3 API when they are needed:
# Restore an object from Glacier Flexible Retrieval (initiates async restore)
aws s3api restore-object \
--bucket my-archive-bucket \
--key "records/2020/annual-report.pdf" \
--restore-request '{"Days":7,"GlacierJobParameters":{"Tier":"Standard"}}'
# Check restore status
aws s3api head-object \
--bucket my-archive-bucket \
--key "records/2020/annual-report.pdf"
# Look for: "Restore": "ongoing-request=\"false\", expiry-date=\"...\""
# ongoing-request="true" means the restore is still in progress
# Copy restored object to Standard storage if you need permanent access
aws s3 cp \
s3://my-archive-bucket/records/2020/annual-report.pdf \
s3://my-working-bucket/records/2020/annual-report.pdf

Restoring from Glacier does not change the storage class of the object - it creates a temporary copy, billed at the Standard rate, for the number of days you specify. After the expiry date, the temporary copy is deleted and the object remains archived in Glacier. To make the object permanently available, copy the restored object to a new location (or copy it with a different storage class) before the temporary copy expires.
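The lifecycle transition mentioned at the start of this section can be sketched as follows. The rule ID, the `records/` prefix, the day counts, and the two function names are illustrative assumptions; `put-bucket-lifecycle-configuration` is the AWS CLI call that attaches the rules, and it is left commented out here because it needs credentials and a real bucket.

```shell
#!/usr/bin/env bash
# Write a lifecycle configuration that moves objects under records/ to
# Glacier Flexible Retrieval after 90 days and to Deep Archive after 365 days.
# Rule ID, prefix, and day counts are illustrative assumptions.
write_lifecycle_config() {
  cat > "$1" <<'EOF'
{
  "Rules": [
    {
      "ID": "archive-records",
      "Status": "Enabled",
      "Filter": { "Prefix": "records/" },
      "Transitions": [
        { "Days": 90, "StorageClass": "GLACIER" },
        { "Days": 365, "StorageClass": "DEEP_ARCHIVE" }
      ]
    }
  ]
}
EOF
}

# Attach the configuration to a bucket (needs AWS credentials and permissions).
apply_lifecycle_config() {   # usage: apply_lifecycle_config BUCKET FILE
  aws s3api put-bucket-lifecycle-configuration \
    --bucket "$1" \
    --lifecycle-configuration "file://$2"
}

write_lifecycle_config lifecycle.json
# apply_lifecycle_config my-archive-bucket lifecycle.json
```

In the S3 API, `GLACIER` is the storage class name for Glacier Flexible Retrieval, `GLACIER_IR` for Instant Retrieval, and `DEEP_ARCHIVE` for Deep Archive.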
Vault Lock and Compliance Policies
Glacier Vault Lock provides immutable, WORM (Write Once Read Many) compliance policies for regulatory requirements like SEC 17a-4, HIPAA, and financial records retention laws.
| Feature | Vault Lock | S3 Object Lock |
|---|---|---|
| Applies to | Glacier Vaults (legacy direct API) | S3 objects in versioning-enabled buckets |
| Policy type | Resource-based policy, once locked cannot be changed | Retention modes per object or bucket default |
| Lock modes | Compliance mode (immutable) | Governance mode (admin override) or Compliance mode (no override) |
| Legal hold | Not supported on vault lock | Supported - prevents deletion regardless of retention period |
| Audit | Policy locked permanently | Audit via S3 API |
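For new workloads, S3 Object Lock is preferred over Glacier Vault Lock because it uses the standard S3 API and works alongside S3 lifecycle policies:
# Enable S3 Object Lock at bucket creation (this also turns on versioning)
# Outside us-east-1, also pass --create-bucket-configuration LocationConstraint=<region>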
aws s3api create-bucket \
--bucket compliance-archive \
--object-lock-enabled-for-bucket
# Set a default retention policy of 7 years (compliance mode)
aws s3api put-object-lock-configuration \
--bucket compliance-archive \
--object-lock-configuration \
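'{"ObjectLockEnabled":"Enabled","Rule":{"DefaultRetention":{"Mode":"COMPLIANCE","Years":7}}}'

Glacier Vault Lock, by contrast, is applied in two steps: initiating the lock starts a 24-hour window in which the policy is in effect but can still be aborted, and completing the lock within that window makes it permanent. Once a Vault Lock policy is in the "locked" state, it cannot be modified or deleted - even by the AWS root account. Test thoroughly before completing the lock.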
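Beyond a bucket-wide default, S3 Object Lock also supports per-object retention dates and legal holds, as the comparison table notes. The sketch below wraps the two relevant AWS CLI calls, `put-object-retention` and `put-object-legal-hold`; the wrapper function names and the example date are hypothetical, and only the JSON builder runs without credentials.

```shell
#!/usr/bin/env bash
# Build the JSON payload for put-object-retention.
# MODE is GOVERNANCE or COMPLIANCE; UNTIL is an ISO 8601 UTC timestamp.
retention_json() {
  printf '{"Mode":"%s","RetainUntilDate":"%s"}\n' "$1" "$2"
}

# Set an explicit retention date on one object (needs AWS credentials).
set_retention() {    # usage: set_retention BUCKET KEY MODE UNTIL
  aws s3api put-object-retention \
    --bucket "$1" --key "$2" \
    --retention "$(retention_json "$3" "$4")"
}

# Place or release a legal hold, which blocks deletion independently
# of any retention period.
set_legal_hold() {   # usage: set_legal_hold BUCKET KEY ON|OFF
  aws s3api put-object-legal-hold \
    --bucket "$1" --key "$2" \
    --legal-hold "Status=$3"
}

# Show the payload that would be sent for a compliance-mode retention.
retention_json COMPLIANCE 2032-01-01T00:00:00Z
# set_retention compliance-archive records/2024/ledger.pdf COMPLIANCE 2032-01-01T00:00:00Z
# set_legal_hold compliance-archive records/2024/ledger.pdf ON
```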
Glacier Cost Optimization Strategies
Glacier costs are straightforward for storage but retrieval and early deletion fees can add up. Understanding the minimum storage duration charges is critical for cost optimization.
| Cost Trap | How It Works | How to Avoid |
|---|---|---|
| Minimum storage duration | Glacier Flexible charges 90 days even if deleted earlier; Deep Archive charges 180 days | Only archive data you know you won't delete for the minimum period |
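| Per-object overhead | Each object in Glacier Flexible Retrieval or Deep Archive adds about 40KB of billable overhead: 32KB at the Glacier rate for index data and 8KB at the S3 Standard rate for object metadata | Aggregate small files into tar/zip before archiving |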
| Expedited retrieval costs | $0.03/GB + $0.01/request - expensive for large restores | Use Standard or Bulk tier unless urgent |
| Data transfer out | $0.09/GB from Glacier to internet | Keep restored data within AWS where possible |
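The minimum storage duration row can be turned into a rough estimate. This sketch assumes the early deletion charge equals the storage cost for the days remaining in the minimum period, prorated over a 30-day month; the helper name `early_delete_fee` is hypothetical and the rates are the ones quoted earlier in this article.

```shell
#!/usr/bin/env bash
# early_delete_fee GB RATE_PER_GB_MONTH MIN_DAYS DAYS_STORED
# Estimates the prorated charge for deleting before the minimum duration.
# Assumes a 30-day month; hypothetical helper, not an AWS tool.
early_delete_fee() {
  awk -v gb="$1" -v rate="$2" -v min_days="$3" -v stored="$4" 'BEGIN {
    remaining = min_days - stored
    if (remaining < 0) remaining = 0      # past the minimum: no fee
    printf "%.4f\n", gb * rate * remaining / 30
  }'
}

# 1,000 GB deleted after 30 days in each class.
echo "Deep Archive: \$$(early_delete_fee 1000 0.00099 180 30)"
echo "Flexible:     \$$(early_delete_fee 1000 0.0036 90 30)"
```

Under these assumptions, deleting early costs roughly what the remaining months of storage would have cost, so the fee removes any saving from short-lived archives rather than adding a separate penalty.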
For archiving many small files, always aggregate them into larger archives (e.g. tar.gz) before sending to Glacier. The per-archive overhead and index storage make small individual files disproportionately expensive.
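A minimal sketch of that aggregation step, assuming GNU or BSD `tar` is available. The function names, directory, bucket, and key are illustrative; the upload uses `aws s3 cp --storage-class DEEP_ARCHIVE` and is left commented out because it needs credentials.

```shell
#!/usr/bin/env bash
# bundle_dir SRC_DIR OUT_FILE: pack a directory of small files into one tar.gz.
bundle_dir() {
  tar -czf "$2" -C "$(dirname "$1")" "$(basename "$1")"
}

# Upload the bundle straight into Deep Archive (needs AWS credentials).
upload_bundle() {    # usage: upload_bundle FILE BUCKET KEY
  aws s3 cp "$1" "s3://$2/$3" --storage-class DEEP_ARCHIVE
}

# Demo with throwaway files; paths and names are illustrative.
workdir="$(mktemp -d)"
mkdir -p "$workdir/logs-2020"
for i in 1 2 3; do
  echo "entry $i" > "$workdir/logs-2020/log-$i.txt"
done

bundle_dir "$workdir/logs-2020" "$workdir/logs-2020.tar.gz"

# List the bundle contents; saving this listing as a manifest outside
# Glacier lets you find a file later without restoring the archive.
tar -tzf "$workdir/logs-2020.tar.gz"
# upload_bundle "$workdir/logs-2020.tar.gz" my-archive-bucket logs/logs-2020.tar.gz
```

The trade-off is granularity: retrieving one small file now means restoring the whole bundle, so group files that are likely to be needed together.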
Interview Focus Points
1. What are the three Glacier storage classes and when would you choose each?
2. Explain how you restore an object from Glacier and what happens after the restore expiry.
3. What is Glacier Vault Lock and how does it differ from S3 Object Lock?
4. A company needs to retain financial records for 7 years with SEC 17a-4 compliance - how would you architect this?
5. What are the minimum storage duration charges in Glacier and why do they matter for workload design?
6. How do you use S3 Lifecycle policies to automatically transition data through storage classes?
7. When would you use Provisioned Retrieval Capacity for Glacier Expedited retrievals?