Ace Cloud Interviews

AWS Storage

S3 Glacier

Low-cost archival storage with retrieval times from milliseconds to hours

S3 Glacier is AWS's purpose-built archival storage service, offering the lowest storage costs on the platform (as low as $0.00099/GB-month for Deep Archive) in exchange for retrieval delays ranging from milliseconds to 48 hours. It is designed for data that must be retained for years but is rarely accessed - compliance records, media archives, scientific data, and long-term backups. Understanding the retrieval tier trade-offs and archive management patterns is important for cost-effective long-term data retention strategies.

Glacier Storage Tiers and Retrieval Options

Glacier is now integrated into S3 as three S3 storage classes rather than a separate service. Each tier has different cost and retrieval time trade-offs.

| Tier | Storage Cost | Retrieval Options | Min Storage Duration |
| --- | --- | --- | --- |
| Glacier Instant Retrieval | $0.004/GB-month | Milliseconds (same as Standard-IA) | 90 days |
| Glacier Flexible Retrieval | $0.0036/GB-month | Expedited (1-5 min), Standard (3-5 hr), Bulk (5-12 hr) | 90 days |
| Glacier Deep Archive | $0.00099/GB-month | Standard (up to 12 hr), Bulk (up to 48 hr) | 180 days |

| Retrieval Tier | Speed | Cost (per GB + per request) | Use Case |
| --- | --- | --- | --- |
| Expedited | 1-5 minutes | $0.03/GB + $0.01/request | Occasional urgent access, small files (<250 MB) |
| Standard | 3-5 hours | $0.01/GB + $0.0025/1,000 requests | Default for non-urgent restores |
| Bulk | 5-12 hours | $0.0025/GB + $0.0025/1,000 requests | Large restores where cost matters more than speed |
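
To make the tier trade-off concrete, here is a minimal sketch that estimates a retrieval bill from the per-GB and per-request figures in the table above. The function name `retrieval_cost` and the example sizes are assumptions for illustration, and the rates are indicative only since they vary by Region and change over time.

```bash
#!/usr/bin/env bash
# Estimate a retrieval bill from the per-GB and per-request rates in the table above.
# The rates are illustrative; confirm them for your Region before relying on the result.
retrieval_cost() {
  local tier="$1" gb="$2" requests="$3"
  local per_gb per_request
  case "$tier" in
    expedited) per_gb=0.03;   per_request=0.01 ;;        # $0.01 per request
    standard)  per_gb=0.01;   per_request=0.0000025 ;;   # $0.0025 per 1,000 requests
    bulk)      per_gb=0.0025; per_request=0.0000025 ;;   # $0.0025 per 1,000 requests
    *) echo "unknown tier: $tier" >&2; return 1 ;;
  esac
  awk -v gb="$gb" -v req="$requests" -v pg="$per_gb" -v pr="$per_request" \
    'BEGIN { printf "%.2f\n", gb * pg + req * pr }'
}

# Compare all three tiers for a 500 GB restore spread over 4,000 objects.
for tier in expedited standard bulk; do
  printf '%-10s $%s\n' "$tier" "$(retrieval_cost "$tier" 500 4000)"
done
```

With many small objects the per-request component dominates the Expedited bill, which is one reason Expedited suits occasional small restores rather than bulk recovery.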

💡 Provisioned retrieval capacity helps ensure that Expedited retrievals remain available even during peak demand. Each unit costs $100/month and provides at least 3 Expedited retrievals every 5 minutes and up to 150 MB/s of retrieval throughput.

Archiving, Restoring, and Vault Operations

When using Glacier through S3 lifecycle policies, you interact with it entirely through S3 APIs. When using the original Glacier service directly, you use Vaults and Archives.
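
As a rough sketch of the lifecycle route, the snippet below writes a lifecycle configuration and wraps the S3 API call that applies it. The rule ID, the `records/` prefix, the day counts, and the `apply_lifecycle` helper are all illustrative assumptions, not values from this tutorial.

```bash
#!/usr/bin/env bash
# Write a lifecycle configuration that moves objects under records/ to
# Glacier Flexible Retrieval after 90 days and to Deep Archive after 365 days.
# Rule ID, prefix, and day counts are illustrative assumptions.
cat > lifecycle.json <<'EOF'
{
  "Rules": [
    {
      "ID": "archive-records",
      "Status": "Enabled",
      "Filter": { "Prefix": "records/" },
      "Transitions": [
        { "Days": 90,  "StorageClass": "GLACIER" },
        { "Days": 365, "StorageClass": "DEEP_ARCHIVE" }
      ]
    }
  ]
}
EOF

# Apply the configuration to a bucket (requires AWS credentials).
apply_lifecycle() {
  aws s3api put-bucket-lifecycle-configuration \
    --bucket "$1" \
    --lifecycle-configuration file://lifecycle.json
}
```

Calling `apply_lifecycle my-archive-bucket` would replace any existing lifecycle configuration on that bucket, so existing rules need to be merged into the file first.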

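The most common pattern is to transition S3 objects to Glacier storage classes via lifecycle policies. Reading an object back from Flexible Retrieval or Deep Archive then requires an asynchronous restore request: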

bash
# Restore an object from Glacier Flexible Retrieval (initiates async restore)
aws s3api restore-object \
  --bucket my-archive-bucket \
  --key "records/2020/annual-report.pdf" \
  --restore-request '{"Days":7,"GlacierJobParameters":{"Tier":"Standard"}}'

# Check restore status
aws s3api head-object \
  --bucket my-archive-bucket \
  --key "records/2020/annual-report.pdf"
# Look for: "Restore": "ongoing-request=\"false\", expiry-date=\"...\""
# While the restore is still running, ongoing-request is "true"

# Copy restored object to Standard storage if you need permanent access
aws s3 cp \
  s3://my-archive-bucket/records/2020/annual-report.pdf \
  s3://my-working-bucket/records/2020/annual-report.pdf

⚠️ Restoring from Glacier does not change the storage class of the object. It creates a temporary copy in Standard storage for the number of days you specify. After the expiry date, the temporary copy is deleted and the object remains archived in Glacier. To move it back permanently, copy the restored object to a non-archive storage class, either in place or to a new location.
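
Because the restore is asynchronous, scripts usually poll the `Restore` header before reading the object. The following is a minimal sketch of that check; the helper names `restore_state` and `restore_header` are assumptions for illustration.

```bash
#!/usr/bin/env bash
# Classify the Restore header that head-object returns for an archived object.
restore_state() {
  case "$1" in
    *'ongoing-request="true"'*)  echo "in-progress" ;;
    *'ongoing-request="false"'*) echo "ready" ;;
    *)                           echo "not-requested" ;;  # header absent, e.g. "None"
  esac
}

# Fetch the header for one object (requires AWS credentials).
restore_header() {
  aws s3api head-object --bucket "$1" --key "$2" --query Restore --output text
}
```

A caller might run `restore_state "$(restore_header my-archive-bucket records/2020/annual-report.pdf)"` in a loop with a sleep between attempts. The `not-requested` result also covers objects whose temporary copy has already expired.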

Vault Lock and Compliance Policies

Glacier Vault Lock provides immutable, WORM (Write Once Read Many) compliance policies for regulatory requirements like SEC 17a-4, HIPAA, and financial records retention laws.

| Feature | Vault Lock | S3 Object Lock |
| --- | --- | --- |
| Applies to | Glacier Vaults (legacy direct API) | S3 objects in versioning-enabled buckets |
| Policy type | Resource-based policy, once locked cannot be changed | Retention modes per object or bucket default |
| Lock modes | Compliance mode (immutable) | Governance mode (admin override) or Compliance mode (no override) |
| Legal hold | Not supported on vault lock | Supported - prevents deletion regardless of retention period |
| Audit | Policy locked permanently | Audit via S3 API |

For new workloads, S3 Object Lock is preferred over Glacier Vault Lock because it works within S3 lifecycle policies:

bash
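# Create a bucket with S3 Object Lock enabled (this also turns on versioning)
# Outside us-east-1, also pass --create-bucket-configuration LocationConstraint=<region>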
aws s3api create-bucket \
  --bucket compliance-archive \
  --object-lock-enabled-for-bucket

# Set a default retention policy of 7 years (compliance mode)
aws s3api put-object-lock-configuration \
  --bucket compliance-archive \
  --object-lock-configuration \
    '{"ObjectLockEnabled":"Enabled","Rule":{"DefaultRetention":{"Mode":"COMPLIANCE","Years":7}}}'
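
💡 Once a Glacier Vault Lock policy is in the "locked" state (confirmed during the 24-hour in-progress window that follows initiation), it cannot be modified or deleted - even by the AWS root account. An S3 Object Lock retention period in Compliance mode is similarly irreversible. Test thoroughly before locking.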
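
For legacy vaults, the two-step lock process might be scripted along these lines. This is a sketch under assumptions: the vault name, the policy file path, and the `run`, `start_lock`, `confirm_lock`, and `cancel_lock` helpers are hypothetical, and the policy file must already be in the format the CLI expects. Commands are only printed unless `DRY_RUN=0` is set.

```bash
#!/usr/bin/env bash
# Print commands by default; set DRY_RUN=0 to execute them against AWS.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

VAULT="compliance-vault"                # hypothetical vault name
POLICY="file://vault-lock-policy.json"  # hypothetical policy file

# Step 1: attach the policy and enter the 24-hour in-progress state.
# The response includes a lock ID needed for step 2a.
start_lock() {
  run aws glacier initiate-vault-lock --account-id - --vault-name "$VAULT" --policy "$POLICY"
}

# Step 2a: make the policy permanent. This cannot be undone.
confirm_lock() {
  run aws glacier complete-vault-lock --account-id - --vault-name "$VAULT" --lock-id "$1"
}

# Step 2b: discard the policy while the lock is still in progress.
cancel_lock() {
  run aws glacier abort-vault-lock --account-id - --vault-name "$VAULT"
}
```

The in-progress window exists so the policy can be tested and, if necessary, aborted before it becomes immutable.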

Glacier Cost Optimization Strategies

Glacier costs are straightforward for storage but retrieval and early deletion fees can add up. Understanding the minimum storage duration charges is critical for cost optimization.

| Cost Trap | How It Works | How to Avoid |
| --- | --- | --- |
| Minimum storage duration | Glacier Instant Retrieval and Flexible Retrieval charge for 90 days even if the object is deleted earlier; Deep Archive charges for 180 days | Only archive data you know you won't delete for the minimum period |
| Per-archive overhead | Each object in Flexible Retrieval or Deep Archive carries 40 KB of billable overhead: 32 KB of index and metadata at the archive rate plus 8 KB at the S3 Standard rate | Aggregate small files into tar/zip before archiving |
| Expedited retrieval costs | $0.03/GB + $0.01/request - expensive for large restores | Use Standard or Bulk tier unless urgent |
| Data transfer out | $0.09/GB from Glacier to internet | Keep restored data within AWS where possible |
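
The minimum storage duration trap is easier to reason about with numbers. The sketch below prorates the charge for the days still owed when data is deleted early. The function name `early_deletion_fee` and the 30-day billing month are simplifying assumptions, and the storage rates come from the tier table earlier in this section.

```bash
#!/usr/bin/env bash
# Prorated early-deletion charge: the storage cost of the days still owed.
# Arguments: size in GB, storage rate per GB-month, minimum days, days actually stored.
early_deletion_fee() {
  local gb="$1" rate_per_gb_month="$2" min_days="$3" days_stored="$4"
  awk -v gb="$gb" -v rate="$rate_per_gb_month" -v min="$min_days" -v stored="$days_stored" \
    'BEGIN {
       remaining = min - stored
       if (remaining < 0) remaining = 0
       printf "%.2f\n", gb * rate * remaining / 30   # assumes a 30-day billing month
     }'
}

# 10,000 GB in Deep Archive deleted after 60 of the required 180 days
early_deletion_fee 10000 0.00099 180 60
```

Once the minimum period has passed the function returns 0.00, which matches the rule that only the unserved remainder is charged.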

💡 For archiving many small files, always aggregate them into larger archives (e.g. tar.gz) before sending to Glacier. The per-archive overhead and index storage make small individual files disproportionately expensive.
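
A minimal sketch of the aggregation approach follows, together with a quick estimate of how much billable overhead many small objects would add. The helper names (`overhead_gb`, `bundle_dir`, `upload_archive`) and the choice of Deep Archive are assumptions for illustration, and the 40 KB figure is the combined per-object overhead (32 KB + 8 KB) from the table above.

```bash
#!/usr/bin/env bash
# Total per-object overhead in GB for a given object count and overhead in KB.
overhead_gb() {
  awk -v n="$1" -v kb="$2" 'BEGIN { printf "%.2f\n", n * kb / 1024 / 1024 }'
}

# Bundle every file under a directory into one compressed archive.
# The output path must be outside the source directory.
bundle_dir() {
  local src="$1" out="$2"
  tar -czf "$out" -C "$src" .
}

# Upload the bundle straight into an archive storage class (requires AWS credentials).
upload_archive() {
  aws s3 cp "$1" "s3://$2/$3" --storage-class DEEP_ARCHIVE
}

# One million small objects at 40 KB of overhead each, versus a single bundle
echo "overhead for 1,000,000 objects: $(overhead_gb 1000000 40) GB"
echo "overhead for 1 bundle:          $(overhead_gb 1 40) GB"
```

The trade-off is that retrieving a single file later means restoring and unpacking the whole bundle, so bundles are best grouped by how the data is likely to be requested (for example, one per month or per customer).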

🎯 Interview Focus Points

  1. What are the three Glacier storage classes and when would you choose each?
  2. Explain how you restore an object from Glacier and what happens after the restore expiry.
  3. What is Glacier Vault Lock and how does it differ from S3 Object Lock?
  4. A company needs to retain financial records for 7 years with SEC 17a-4 compliance - how would you architect this?
  5. What are the minimum storage duration charges in Glacier and why do they matter for workload design?
  6. How do you use S3 Lifecycle policies to automatically transition data through storage classes?
  7. When would you use Provisioned Retrieval Capacity for Glacier Expedited retrievals?