AWS Analytics & Big Data
Kinesis
Real-time streaming data ingestion and analytics for logs, events, and clickstreams
Amazon Kinesis is a family of services for real-time data streaming, covering ingestion (Kinesis Data Streams), processing (Kinesis Data Analytics), delivery (Kinesis Data Firehose), and video ingestion (Kinesis Video Streams). It handles millions of events per second with sub-second latency, making it the foundation for real-time analytics pipelines, log aggregation, and event-driven architectures on AWS. Understanding Kinesis is essential for any data engineering or platform engineering role on AWS.
The Four Kinesis Services Explained
Kinesis is not one service - it is a family of four related but distinct products:
| Service | What It Does | Consumers | Managed? |
|---|---|---|---|
| Kinesis Data Streams (KDS) | Durable, ordered stream of records with shard-based parallelism | Lambda, KCL apps, Analytics | Partially - you provision shards |
| Kinesis Data Firehose | Managed delivery of streams to S3, Redshift, OpenSearch, Splunk | Only Firehose destinations | Fully - no shards to manage |
| Kinesis Data Analytics | Run SQL or Apache Flink against a stream in real time | Outputs to another stream or S3 | Fully managed Flink runtime |
| Kinesis Video Streams | Ingest and process video streams from devices | ML models, Rekognition | Fully managed |
In most interviews, "Kinesis" means Kinesis Data Streams. Make sure to clarify which service is being discussed. Firehose is the right choice when you just need delivery to a destination - KDS is for custom consumer applications that need multiple readers or replay.
Kinesis Data Streams - Shards, Partitions, and Retention
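A Kinesis Data Stream is divided into shards. Each shard accepts up to 1 MB/s or 1,000 records/s of ingest (whichever limit is hit first) and serves 2 MB/s of egress. Producers assign a partition key to each record; Kinesis hashes this key with MD5 to a 128-bit value, and the shard whose hash-key range contains that value receives the record.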
| Concept | Limit | Notes |
|---|---|---|
| Shard ingest | 1 MB/s or 1,000 records/s | ProvisionedThroughputExceededException if exceeded |
| Shard egress (shared throughput) | 2 MB/s | Shared across all GetRecords callers on that shard; up to 5 GetRecords calls/s per shard |
| Enhanced fan-out egress | 2 MB/s per consumer per shard | Dedicated per-consumer HTTP/2 push - extra cost |
| Record retention | 24 hours default; extendable to 7 days, or up to 365 days with long-term retention | Retention beyond 24 hours has additional cost |
| Record size | 1 MB max | Including partition key |
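The per-shard limits in the table translate directly into a sizing calculation. The following is a minimal sketch, assuming a provisioned stream, evenly distributed partition keys, and consumers that all use shared-throughput polling; the function name `required_shards` and its parameters are illustrative and not part of any AWS API.

```python
import math

# Per-shard limits quoted in the table above.
INGEST_MB_PER_SHARD = 1.0
INGEST_RECORDS_PER_SHARD = 1000
SHARED_EGRESS_MB_PER_SHARD = 2.0


def required_shards(ingest_mb_s: float, records_s: float, polling_consumers: int = 1) -> int:
    """Smallest shard count that satisfies all three per-shard limits."""
    by_bytes = math.ceil(ingest_mb_s / INGEST_MB_PER_SHARD)
    by_records = math.ceil(records_s / INGEST_RECORDS_PER_SHARD)
    # Each polling consumer reads the full stream, and they all share 2 MB/s per shard.
    by_egress = math.ceil(ingest_mb_s * polling_consumers / SHARED_EGRESS_MB_PER_SHARD)
    return max(1, by_bytes, by_records, by_egress)


if __name__ == "__main__":
    print(required_shards(ingest_mb_s=3.5, records_s=2000))
    print(required_shards(ingest_mb_s=0.5, records_s=4500))
    print(required_shards(ingest_mb_s=2.0, records_s=1000, polling_consumers=3))
```

The third call shows why several polling consumers can force extra shards even when ingest fits comfortably: the shared 2 MB/s egress becomes the binding limit, which is the situation enhanced fan-out is meant to address.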
Hot shard problem: if most records share the same partition key (e.g., a single user ID or a constant string), all traffic lands on one shard and you hit the 1 MB/s limit. Use high-cardinality partition keys and distribute writes across shards.
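The effect of key cardinality can be illustrated with a short sketch of the key-to-shard mapping. Kinesis maps the MD5 hash of the partition key into a 128-bit hash-key space; the sketch assumes each shard owns an equal, contiguous slice of that space (as in a freshly created stream that has not been split or merged). The helper names `shard_for_key` and `distribution` are hypothetical.

```python
import hashlib
from collections import Counter

HASH_SPACE = 2 ** 128  # MD5 yields a 128-bit hash key


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Index of the shard whose hash-key range contains the key's MD5 value."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    # Assumption: shard i owns an equal, contiguous slice of the hash space.
    return hash_key * shard_count // HASH_SPACE


def distribution(keys, shard_count: int) -> Counter:
    """Count how many records land on each shard index."""
    return Counter(shard_for_key(key, shard_count) for key in keys)


if __name__ == "__main__":
    # A constant key sends every record to a single shard.
    print(distribution(["constant"] * 10_000, shard_count=4))
    # High-cardinality keys spread records across all shards.
    print(distribution((f"user-{i}" for i in range(10_000)), shard_count=4))
```

With the constant key only one shard index appears in the counter, so adding shards does not help; with per-user keys the counts come out roughly even across the four shards.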
Kinesis supports two capacity modes: Provisioned (you specify shard count) and On-Demand (auto-scales, higher cost per GB). On-Demand simplifies capacity planning but costs roughly 3x more per GB than Provisioned at steady high throughput.
Kinesis vs SQS vs MSK - Choosing the Right Stream
Three AWS services handle message streaming but with very different trade-offs:
| Attribute | Kinesis Data Streams | SQS | MSK (Kafka) |
|---|---|---|---|
| Ordering | Per-shard ordering guaranteed | Best-effort for standard queues; strict per message group for FIFO queues | Per-partition ordering |
| Replay | Yes - up to 365 days | No - consumed and deleted | Yes - configurable retention |
| Multiple consumers | Yes - all consumers see all records | No - message delivered to one consumer | Yes - consumer groups |
| Throughput scaling | Shard split/merge | Automatic; nearly unlimited for standard queues | Add brokers/partitions |
| Latency | Milliseconds | Milliseconds | Milliseconds |
| Operational burden | Low - managed | None - fully serverless | Medium - managed but more config |
Rule of thumb: use SQS for work queues where each message is processed once. Use Kinesis for event streams where multiple consumers need to read the same data or you need replay. Use MSK when you need Kafka compatibility or very high throughput with fine-grained partition control.
Consumer Patterns - KCL, Lambda, and Enhanced Fan-Out
Kinesis supports four main consumption patterns:
| Pattern | How It Works | Best For |
|---|---|---|
| GetRecords polling | Consumer polls each shard, at most 5 calls/s per shard (one every 200 ms); 2 MB/s shared across all polling consumers | Low-throughput, low-cost consumers |
| Enhanced Fan-Out | HTTP/2 push, 2 MB/s dedicated per registered consumer per shard | Multiple parallel consumers, low latency |
| Lambda trigger | Lambda polls shards (uses GetRecords internally), invokes on batch | Serverless event processing |
| KCL (Kinesis Client Library) | Java library (other languages such as Python via the MultiLangDaemon); handles checkpointing in DynamoDB | Long-running consumer applications |
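For the Lambda trigger pattern, a handler might look roughly like the sketch below. It assumes the event source mapping has `ReportBatchItemFailures` enabled, so that returning a failed sequence number makes Lambda retry from that record instead of replaying the whole batch. The `process` function and the `event` field it checks are hypothetical stand-ins for real business logic.

```python
import base64
import json


def process(payload: dict) -> None:
    """Hypothetical business logic; raises on records it cannot handle."""
    if "event" not in payload:
        raise ValueError("missing 'event' field")


def handler(event, context):
    """Process a Kinesis batch in order and report the first failing record."""
    failures = []
    for record in event["Records"]:
        kinesis = record["kinesis"]
        try:
            # Kinesis record data arrives base64-encoded in the Lambda event.
            payload = json.loads(base64.b64decode(kinesis["data"]))
            process(payload)
        except Exception:
            # Stop at the first failure so per-shard ordering is preserved on retry.
            failures.append({"itemIdentifier": kinesis["sequenceNumber"]})
            break
    return {"batchItemFailures": failures}
```

Because delivery is at-least-once, records before the reported sequence number are not retried but records after it are, so `process` should be idempotent.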
```bash
# Create a Kinesis stream with 4 shards
aws kinesis create-stream \
  --stream-name my-event-stream \
  --shard-count 4

# Put a record. AWS CLI v2 expects blob parameters such as --data to be
# base64-encoded; printf avoids the trailing newline that echo would add.
aws kinesis put-record \
  --stream-name my-event-stream \
  --partition-key user-123 \
  --data "$(printf '%s' '{"event":"click","userId":"123"}' | base64)"

# Register an enhanced fan-out consumer
aws kinesis register-stream-consumer \
  --stream-arn arn:aws:kinesis:us-east-1:123456789012:stream/my-event-stream \
  --consumer-name analytics-consumer
```
Kinesis Firehose - Managed Delivery to Destinations
Firehose is fully serverless - there are no shards to manage. You configure a delivery stream with a source (KDS, Direct Put, or MSK), optional Lambda transformation, and a destination.
| Destination | Buffering | Format Conversion |
|---|---|---|
| S3 | 1-900s or 1-128 MB buffer | JSON to Parquet/ORC via Glue schema |
| Redshift | COPY command after S3 staging | No - must match table schema |
| OpenSearch | 1-900s or 1-100 MB buffer | No format conversion |
| HTTP endpoint | Custom endpoint | Custom |
| Splunk | 0-60s | No |
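The size and interval values in the buffering column work as "whichever threshold is reached first". The toy model below sketches that rule; it is not Firehose's actual implementation, the class name `DeliveryBuffer` is hypothetical, and unlike the real service it only checks the timer when a new record arrives.

```python
class DeliveryBuffer:
    """Toy model of size-or-interval buffering (illustrative only)."""

    def __init__(self, max_bytes: int, max_seconds: float):
        self.max_bytes = max_bytes
        self.max_seconds = max_seconds
        self.records = []
        self.size = 0
        self.opened_at = None  # time the first record of the current batch arrived

    def add(self, record: bytes, now: float):
        """Buffer a record; return the flushed batch if a threshold was hit, else None."""
        if self.opened_at is None:
            self.opened_at = now
        self.records.append(record)
        self.size += len(record)
        if self.size >= self.max_bytes or now - self.opened_at >= self.max_seconds:
            return self.flush()
        return None

    def flush(self):
        """Hand back the current batch and start a new, empty one."""
        batch = self.records
        self.records, self.size, self.opened_at = [], 0, None
        return batch
```

A small size threshold with a long interval produces many small objects under heavy traffic, while a large size threshold with a short interval produces small objects under light traffic; both settings need tuning together.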
Firehose can convert JSON to Parquet or ORC automatically using a Glue schema. This is a significant cost saver: Athena bills by the amount of data scanned, and compressed columnar formats let it read only the columns a query needs, which often makes queries 3-10x cheaper. Enable this for S3 destinations whenever the source is structured JSON.
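The optional Lambda transformation step in a delivery stream follows a fixed contract: each incoming record carries a `recordId` and base64-encoded `data`, and the function returns every `recordId` with a `result` of `Ok`, `Dropped`, or `ProcessingFailed`. The sketch below assumes JSON payloads; the heartbeat filter is a hypothetical example of a record worth dropping.

```python
import base64
import json


def handler(event, context):
    """Firehose transformation: validate JSON, drop heartbeats, newline-delimit output."""
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            if not isinstance(payload, dict):
                raise ValueError("expected a JSON object")
        except ValueError:
            # Invalid base64, invalid UTF-8, and invalid JSON all raise ValueError subclasses.
            output.append({"recordId": record["recordId"],
                           "result": "ProcessingFailed",
                           "data": record["data"]})
            continue

        if payload.get("event") == "heartbeat":  # hypothetical filter rule
            output.append({"recordId": record["recordId"],
                           "result": "Dropped",
                           "data": record["data"]})
            continue

        # One JSON object per line keeps the delivered objects easy to parse downstream.
        line = json.dumps(payload) + "\n"
        output.append({"recordId": record["recordId"],
                       "result": "Ok",
                       "data": base64.b64encode(line.encode("utf-8")).decode("ascii")})
    return {"records": output}
```

Every input `recordId` must appear exactly once in the response; records that are simply omitted are treated as failures rather than as dropped.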
Interview Focus Points
1. What is the difference between Kinesis Data Streams and Kinesis Firehose - when do you use each?
2. Explain the hot shard problem and how you would fix it.
3. How does Enhanced Fan-Out differ from standard GetRecords polling, and when is the extra cost justified?
4. How does Kinesis Data Streams compare to SQS FIFO queues for ordered event processing?
5. A Lambda consumer is falling behind on a Kinesis stream - what are the possible causes and how do you diagnose them?
6. How does Kinesis handle exactly-once delivery vs at-least-once delivery?
7. Explain how you would build a real-time clickstream pipeline using Kinesis, Lambda, and S3.
8. What is the retention period for Kinesis Data Streams and what are the cost implications of extended retention?
9. How do you scale a Kinesis stream when ingest exceeds current shard capacity?