ElastiCache
Managed in-memory caching with Redis or Memcached to accelerate application performance
Amazon ElastiCache is a fully managed in-memory caching service that supports two engines: Redis and Memcached. It accelerates applications by keeping frequently accessed data in memory, which reduces database load and cuts response times from milliseconds to microseconds. ElastiCache is a common component of high-scale AWS architectures, where it handles session storage, query result caching, leaderboards, pub/sub messaging, and rate limiting.
Redis vs Memcached: Choosing the Right Engine
Redis and Memcached solve different problems. Redis is richer and more durable; Memcached is simpler and horizontally scalable for pure caching. In practice, Redis is chosen for almost all new workloads because of its data structure support and persistence options.
| Feature | Redis | Memcached |
|---|---|---|
| Data structures | Strings, hashes, lists, sets, sorted sets, streams, bitmaps, HyperLogLog | Strings only |
| Persistence | RDB snapshots + AOF (append-only file) | None - data lost on restart |
| Replication | Primary-replica (asynchronous) with automatic failover managed by ElastiCache | None |
| Multi-AZ failover | Yes (automatic promotion of a replica in another AZ; works with Cluster Mode Disabled or Enabled) | No |
| Pub/Sub messaging | Yes | No |
| Lua scripting | Yes | No |
| Max memory per node | Up to 419 GB (r6g.16xlarge) | Up to 419 GB |
| Horizontal scaling (data sharding) | Yes (Cluster Mode Enabled) | Yes (client-side sharding across nodes; each node is multi-threaded) |
| Use case | Sessions, leaderboards, rate limiting, queues, pub/sub, caching | Simple object caching, horizontal scaling simplicity |
If you need any of: replication, failover, persistence, complex data types, or pub/sub - choose Redis. Memcached is only preferable if you need the simplest possible horizontal scaling and do not need any of the above features.
Redis Cluster Architecture: Cluster Mode Disabled vs Enabled
ElastiCache Redis can run in two modes. The mode you choose determines how data is sharded and how you scale.
| Attribute | Cluster Mode Disabled | Cluster Mode Enabled |
|---|---|---|
| Sharding | Single shard (one primary) | Up to 500 shards |
| Max data size | Limited to one node's memory | Total memory across all shards |
| Scaling writes | Scale up (larger instance) | Scale out (add shards) |
| Read replicas | Up to 5 per primary | Up to 5 per shard |
| Multi-AZ | Yes (replica in different AZ) | Yes (replicas distributed across AZs) |
| Replication type | Async to replicas | Async to replicas within each shard |
| Client complexity | Single endpoint | Must use cluster-aware client |
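On older engine versions you cannot change the cluster mode of an existing cluster; you must create a new cluster and migrate data. Newer engine versions (Redis 7 and later) can move an existing cluster from Cluster Mode Disabled to Enabled, but not back again. Plan for cluster mode from the start if you anticipate needing to scale beyond a single node's memory.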
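To make the sharding behavior concrete, here is a minimal sketch of how a cluster-aware client maps a key to a hash slot in Cluster Mode Enabled. It assumes the standard Redis Cluster scheme (CRC16 of the key modulo 16384 slots, with `{...}` hash tags). The `shard_for_slot` helper is hypothetical: it assumes slots are split evenly across shards, whereas a real client reads the actual slot map from the cluster.

```python
import binascii

TOTAL_SLOTS = 16384  # Redis Cluster divides the keyspace into 16384 hash slots


def key_slot(key: str) -> int:
    """Return the Redis Cluster hash slot for a key."""
    data = key.encode("utf-8")
    # Hash tags: if the key contains "{...}" with a non-empty body, only the
    # text between the first "{" and the next "}" is hashed.
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end > start + 1:
            data = data[start + 1:end]
    # crc_hqx with an initial value of 0 is CRC16-CCITT (XMODEM), the CRC16
    # variant that Redis Cluster uses.
    return binascii.crc_hqx(data, 0) % TOTAL_SLOTS


def shard_for_slot(slot: int, shard_count: int) -> int:
    """Hypothetical even split of slots across shards (illustration only)."""
    return slot * shard_count // TOTAL_SLOTS


if __name__ == "__main__":
    for key in ("session:abc123", "{user:42}:cart", "{user:42}:profile"):
        slot = key_slot(key)
        print(f"{key} -> slot {slot}, shard {shard_for_slot(slot, 3)} of 3")
```

Keys that share a hash tag, such as `{user:42}:cart` and `{user:42}:profile`, land in the same slot, which is what allows multi-key operations on them in a sharded cluster.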
Caching Strategies: Lazy Loading, Write-Through, Write-Behind
How you populate and invalidate the cache is as important as the cache itself. The wrong strategy leads to stale data, cache stampedes, or high cache miss rates.
| Strategy | How It Works | Pros | Cons |
|---|---|---|---|
| Lazy Loading (Cache-Aside) | Read cache; on miss, fetch from DB, store in cache | Only caches accessed data; resilient to cache failure | Cache miss adds latency; stale data possible |
| Write-Through | Write to cache and DB simultaneously on every write | Cache always fresh; no stale reads | Write penalty; caches data that may never be read |
| Write-Behind (Write-Back) | Write to cache immediately; flush to DB asynchronously | Lowest write latency | Risk of data loss if cache node fails before flush |
| Refresh-Ahead | Proactively refresh cache before TTL expires | Eliminates cache miss latency for popular keys | Complex to implement; may fetch unused data |
Lazy Loading with a TTL is the most common and safest pattern. Use Write-Through when you cannot tolerate stale reads. Never use Write-Behind for data you cannot afford to lose.
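The two most common strategies from the table can be sketched in a few lines. This is an illustration under stated assumptions, not a production implementation: `TTLCache` is a hypothetical in-memory stand-in for a Redis client, the database is a plain dict, and the function names and the 300-second default TTL are made up for the example.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Hypothetical in-memory stand-in for a Redis client (illustration only)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[str, float]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # an expired entry behaves like a miss
            return None
        return value

    def set(self, key: str, value: str, ttl_seconds: float) -> None:
        self._data[key] = (value, self._clock() + ttl_seconds)


def read_lazy(cache: TTLCache, db: Dict[str, str], key: str,
              ttl_seconds: float = 300.0) -> Optional[str]:
    """Lazy loading: read the cache; on a miss, fetch from the DB and cache it."""
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = db.get(key)
    if value is not None:
        cache.set(key, value, ttl_seconds)
    return value


def write_through(cache: TTLCache, db: Dict[str, str], key: str, value: str,
                  ttl_seconds: float = 300.0) -> None:
    """Write-through: update the DB and the cache on every write."""
    db[key] = value
    cache.set(key, value, ttl_seconds)
```

With lazy loading alone, a direct database update stays invisible to readers until the TTL expires, which is the stale-data trade-off listed in the table. Routing writes through `write_through` keeps the cached copy current at the cost of an extra cache write per update.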
ElastiCache as a Session Store
Storing user sessions in ElastiCache Redis is one of the most common and well-understood patterns. It solves the sticky session problem in horizontally scaled applications.
| Approach | Problem | ElastiCache Solution |
|---|---|---|
| Sticky sessions on ALB | Single instance holds session; loses data on instance failure | All instances share Redis; any instance can serve any user |
| Local in-memory session | Session lost if process restarts | Redis persists session with configurable TTL |
| RDS session storage | Database becomes a bottleneck for session reads | Microsecond reads from Redis; no DB query needed |
# Example: set a session in Redis with 30-minute TTL
# Using redis-cli
SET session:abc123 '{"userId": "user#456", "role": "admin"}' EX 1800
# Get session
GET session:abc123
# Check TTL remaining (in seconds)
TTL session:abc123
# Delete session on logout
DEL session:abc123
Security: Encryption, Auth, and Network Isolation
ElastiCache runs inside your VPC. Security is layered: network isolation via security groups, in-transit encryption via TLS, and authentication via Redis AUTH or IAM (for newer engine versions).
| Security Layer | Mechanism | Notes |
|---|---|---|
| Network | VPC subnet groups + security groups | Never expose ElastiCache endpoints to the internet |
| In-transit encryption | TLS (enable at cluster creation; newer engine versions can also enable it on an existing cluster) | Requires TLS-capable client; small performance overhead |
| At-rest encryption | AES-256 via KMS | Redis only, with Cluster Mode Disabled or Enabled; must be enabled at cluster creation |
| Authentication | Redis AUTH token or IAM auth (Redis 7+) | AUTH token is a shared password; IAM is preferred |
| RBAC | Redis ACLs (users/passwords per command/key pattern) | Available on Redis 6+; replaces single AUTH token |
Enable encryption in transit at cluster creation whenever possible. On older engine versions you cannot enable TLS on an existing cluster without creating a new one and migrating; newer engine versions (Redis 7 and later) allow enabling it in place. Always enable TLS for any cluster handling sensitive data.
Pricing Model
| Component | Pricing Basis | Optimization Tip |
|---|---|---|
| Node hours | Per hour by node type (cache.t4g, cache.r7g, etc.) | Buy reserved nodes (1-year or 3-year terms) for production; savings up to 55% |
| Backup storage | Per GB-month beyond free allowance (1x cluster size free) | Reduce backup retention for dev clusters |
| Data transfer | Intra-AZ free; cross-AZ charged per GB | Place application and cache in same AZ |
| Serverless ElastiCache | Per ECPU (ElastiCache Processing Unit) and GB stored | No capacity planning needed; pay for what you use |
ElastiCache Serverless (Redis and Memcached) launched in 2023 and scales capacity automatically. It charges per ECPU consumed and for data stored, which is useful for unpredictable workloads where you want to avoid over-provisioning.
Interview Focus Points
1. What is the difference between Redis and Memcached in ElastiCache? When would you choose each?
2. Explain lazy loading vs write-through caching. What are the trade-offs?
3. What is the difference between Cluster Mode Enabled and Cluster Mode Disabled in Redis?
4. How would you implement session management using ElastiCache Redis in a horizontally scaled application?
5. How do you handle cache invalidation? What strategies exist and what are the pitfalls?
6. What is a cache stampede and how do you prevent it?
7. How do you secure an ElastiCache cluster? Walk through each layer.
8. When would you use ElastiCache vs DynamoDB DAX vs CloudFront for caching?
9. A Redis cluster is running out of memory. What are your options and how do you decide?