Kinesis

Real-time streaming data ingestion and analytics for logs, events, and clickstreams

Amazon Kinesis is a family of services for real-time data streaming, covering ingestion (Kinesis Data Streams), processing (Kinesis Data Analytics, since renamed Amazon Managed Service for Apache Flink), delivery (Kinesis Data Firehose, since renamed Amazon Data Firehose), and video ingestion (Kinesis Video Streams). With enough capacity it handles millions of events per second at sub-second latency, making it a common foundation for real-time analytics pipelines, log aggregation, and event-driven architectures on AWS. Understanding Kinesis is valuable for most data engineering or platform engineering roles on AWS.

The Four Kinesis Services Explained

Kinesis is not one service - it is a family of four related but distinct products:

Service	What It Does	Consumers	Managed?
Kinesis Data Streams (KDS)	Durable, ordered stream of records with shard-based parallelism	Lambda, KCL apps, Analytics	Partially - you provision shards
Kinesis Data Firehose	Managed delivery of streams to S3, Redshift, OpenSearch, Splunk	Only Firehose destinations	Fully - no shards to manage
Kinesis Data Analytics	Run SQL or Apache Flink against a stream in real time	Outputs to another stream or S3	Fully managed Flink runtime
Kinesis Video Streams	Ingest and process video streams from devices	ML models, Rekognition	Fully managed

💡

In most interviews, "Kinesis" means Kinesis Data Streams. Make sure to clarify which service is being discussed. Firehose is the right choice when you just need delivery to a destination - KDS is for custom consumer applications that need multiple readers or replay.

Kinesis Data Streams - Shards, Partitions, and Retention

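A Kinesis Data Stream is divided into shards. Each shard accepts writes at up to 1 MB/s or 1,000 records/s (whichever limit is hit first) and serves reads at up to 2 MB/s. Producers assign a partition key to each record; Kinesis computes an MD5 hash of this key and routes the record to the shard whose hash key range contains the result.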
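The key-to-shard mapping can be approximated locally. The sketch below assumes `md5sum` is installed and that the stream's shards split the 128-bit MD5 hash space evenly (as on a freshly created stream that has not been resharded). `shard_index_for_key` is a hypothetical helper, not part of the AWS CLI, and it looks only at the top 32 bits of the hash, so a key that falls right on a range boundary may map differently in the real service.

```bash
# Hypothetical helper: estimate which shard a partition key maps to.
# Assumes shards divide the MD5 hash key space into equal ranges.
shard_index_for_key() {
  key="$1"
  shard_count="$2"
  # First 8 hex digits of the MD5 digest = top 32 bits of the 128-bit hash
  top_hex=$(printf '%s' "$key" | md5sum | cut -c1-8)
  top32=$((0x$top_hex))
  # Scale the position in [0, 2^32) onto [0, shard_count)
  echo $(( (top32 * shard_count) >> 32 ))
}

# Usage: see where a few keys would land on a 4-shard stream
for key in user-123 user-456 user-789; do
  echo "$key -> shard $(shard_index_for_key "$key" 4)"
done
```

Because the mapping depends only on the key, every record with the same partition key lands on the same shard, which is what gives Kinesis its per-key ordering.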

Concept	Limit	Notes
Shard ingest	1 MB/s or 1,000 records/s	ProvisionedThroughputExceededException if exceeded
Shard egress (standard)	2 MB/s	Shared across all GetRecords callers on that shard; at most 5 GetRecords calls/s per shard
Enhanced fan-out egress	2 MB/s per consumer per shard	Dedicated per-consumer HTTP/2 push - extra cost
Record retention	24 hours default; extendable to 7 days, or up to 365 days with long-term retention	Retention beyond 24 hours has additional cost
Record size	1 MB max	Including partition key
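The ingest limits above translate into a simple sizing rule for provisioned streams: take whichever of the byte limit or the record limit requires more shards. The helper below is a rough sketch; `shards_needed` is a hypothetical name, 1 MB is treated as 1024 KB, and no headroom for traffic spikes or uneven key distribution is included.

```bash
# Hypothetical helper: minimum shard count for a given ingest load.
# Each shard accepts 1 MB/s (taken here as 1024 KB/s) or 1,000 records/s.
shards_needed() {
  ingest_kb_per_s="$1"
  records_per_s="$2"
  # Ceiling division for each limit
  by_bytes=$(( (ingest_kb_per_s + 1023) / 1024 ))
  by_records=$(( (records_per_s + 999) / 1000 ))
  # The stricter limit decides the shard count
  if [ "$by_bytes" -gt "$by_records" ]; then
    echo "$by_bytes"
  else
    echo "$by_records"
  fi
}

shards_needed 3000 500    # byte-bound: prints 3
shards_needed 500 4500    # record-bound: prints 5
```

Many small records tend to hit the 1,000 records/s limit first, while fewer large records hit the 1 MB/s limit first.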

⚠️

Hot shard problem: if most records share the same partition key (e.g., a single user ID or a constant string), all traffic lands on one shard and you hit the 1 MB/s limit. Use high-cardinality partition keys and distribute writes across shards.
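One common mitigation is to "salt" a hot key with a random suffix so its records spread over several partition keys. The sketch below is an illustration only: `salted_partition_key` is a hypothetical helper, and it assumes `/dev/urandom` and `od` are available. The trade-off is that ordering is no longer guaranteed across the salted variants of a key, and consumers must re-aggregate them.

```bash
# Hypothetical helper: spread one hot key over N partition keys.
# Prints e.g. "user-123-5" for base key "user-123" and 8 buckets.
salted_partition_key() {
  base_key="$1"
  buckets="$2"
  # Two random bytes as an unsigned integer (0..65535)
  rand=$(od -An -N2 -tu2 /dev/urandom)
  echo "${base_key}-$(( rand % buckets ))"
}

# Usage: each call may yield a different suffix in 0..7
salted_partition_key user-123 8

# The result would be passed to put-record, for example:
#   aws kinesis put-record --stream-name my-event-stream \
#     --partition-key "$(salted_partition_key user-123 8)" \
#     --data "$payload_base64"
```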

Kinesis supports two capacity modes: Provisioned (you specify shard count) and On-Demand (auto-scales, higher cost per GB). On-Demand simplifies capacity planning but costs roughly 3x more per GB than Provisioned at steady high throughput.
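For reference, the two modes look like this from the CLI. This is a sketch wrapped in hypothetical helper functions (`create_on_demand_stream`, `resize_provisioned_stream`) so that nothing runs until you call them; the stream name is a placeholder and valid AWS credentials are assumed.

```bash
# Create a stream in on-demand mode (no shard count to specify)
create_on_demand_stream() {
  aws kinesis create-stream \
    --stream-name "$1" \
    --stream-mode-details StreamMode=ON_DEMAND
}

# Resize a provisioned stream by changing its shard count;
# UNIFORM_SCALING splits or merges shards into equal hash ranges
resize_provisioned_stream() {
  aws kinesis update-shard-count \
    --stream-name "$1" \
    --target-shard-count "$2" \
    --scaling-type UNIFORM_SCALING
}

# Example calls (require AWS credentials):
#   create_on_demand_stream my-on-demand-stream
#   resize_provisioned_stream my-event-stream 8
```

In provisioned mode, resizing is the usual response when ingest approaches the per-shard limits and the partition keys are already well distributed.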

Kinesis vs SQS vs MSK - Choosing the Right Stream

Three AWS services handle message streaming but with very different trade-offs:

Attribute	Kinesis Data Streams	SQS	MSK (Kafka)
Ordering	Per-shard ordering guaranteed	Best-effort (standard queues); strict per message group (FIFO queues)	Per-partition ordering
Replay	Yes - up to 365 days	No - consumed and deleted	Yes - configurable retention
Multiple consumers	Yes - all consumers see all records	No - message delivered to one consumer	Yes - consumer groups
Throughput scaling	Shard split/merge	Automatic; nearly unlimited for standard queues	Add brokers/partitions
Latency	Milliseconds	Milliseconds	Milliseconds
Operational burden	Low - managed	None - fully serverless	Medium - managed but more config

💡

Rule of thumb: use SQS for work queues where each message is processed once. Use Kinesis for event streams where multiple consumers need to read the same data or you need replay. Use MSK when you need Kafka compatibility or very high throughput with fine-grained partition control.

Consumer Patterns - KCL, Lambda, and Enhanced Fan-Out

Kinesis supports four common consumption patterns. They overlap: Lambda and KCL are built on top of GetRecords polling or Enhanced Fan-Out.

Pattern	How It Works	Best For
GetRecords polling	Consumer polls each shard (at most 5 GetRecords calls/s per shard, i.e. about one every 200 ms), sharing 2 MB/s with other pollers	Low-throughput, low-cost consumers
Enhanced Fan-Out	HTTP/2 push, 2 MB/s dedicated per registered consumer per shard	Multiple parallel consumers, low latency
Lambda trigger	Lambda polls shards (GetRecords by default, or a registered enhanced fan-out consumer), invokes on batch	Serverless event processing
KCL (Kinesis Client Library)	Java library (other languages such as Python via the MultiLangDaemon), handles checkpointing in DynamoDB	Long-running consumer applications

bash

# Create a Kinesis stream with 4 shards
aws kinesis create-stream \
  --stream-name my-event-stream \
  --shard-count 4

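# Put a record. AWS CLI v2 expects blob parameters as base64 by default;
# printf avoids the trailing newline that echo would add to the payload.
aws kinesis put-record \
  --stream-name my-event-stream \
  --partition-key user-123 \
  --data "$(printf '%s' '{"event":"click","userId":"123"}' | base64)"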

# Register enhanced fan-out consumer
aws kinesis register-stream-consumer \
  --stream-arn arn:aws:kinesis:us-east-1:123456789012:stream/my-event-stream \
  --consumer-name analytics-consumer
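To complement the commands above, standard GetRecords polling can be tried from the CLI as well. The sketch below wraps the two calls in a hypothetical helper, `read_shard_once`, so nothing runs until it is invoked; the shard ID shown in the usage comment is an assumption (list real ones with `aws kinesis list-shards`).

```bash
# Hypothetical helper: read one batch from a shard, oldest record first.
read_shard_once() {
  stream="$1"
  shard_id="$2"
  # TRIM_HORIZON starts at the oldest record still within retention
  iterator=$(aws kinesis get-shard-iterator \
    --stream-name "$stream" \
    --shard-id "$shard_id" \
    --shard-iterator-type TRIM_HORIZON \
    --query 'ShardIterator' --output text)
  # Each record's Data field comes back base64-encoded; the response
  # also carries a NextShardIterator for the following call
  aws kinesis get-records --shard-iterator "$iterator" --limit 100
}

# Example call (requires AWS credentials):
#   read_shard_once my-event-stream shardId-000000000000
```

A real consumer would loop on NextShardIterator and persist its position, which is the bookkeeping that KCL and the Lambda trigger handle for you.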

Kinesis Firehose - Managed Delivery to Destinations

Firehose is fully serverless - there are no shards to manage. You configure a delivery stream with a source (KDS, Direct Put, or MSK), optional Lambda transformation, and a destination.
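A Direct Put source can be exercised from the CLI. This is a minimal sketch assuming AWS CLI v2 (which expects blob values as base64) and an existing delivery stream; `firehose_put_json` and the stream name in the usage comment are hypothetical. The trailing newline is added on the assumption that the stream is not configured to append a delimiter itself, so that records remain line-delimited after Firehose concatenates them into one object.

```bash
# Hypothetical helper: send one JSON event to Firehose via Direct Put.
firehose_put_json() {
  delivery_stream="$1"
  json="$2"
  # Append "\n" to the event, then base64-encode on a single line
  data=$(printf '%s\n' "$json" | base64 | tr -d '\n')
  aws firehose put-record \
    --delivery-stream-name "$delivery_stream" \
    --record "{\"Data\":\"$data\"}"
}

# Example call (requires AWS credentials):
#   firehose_put_json my-delivery-stream '{"event":"click","userId":"123"}'
```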

Destination	Buffering	Format Conversion
S3	0-900s or 1-128 MB buffer, whichever is reached first	JSON to Parquet/ORC via Glue schema
Redshift	COPY command after S3 staging	No - must match table schema
OpenSearch	0-900s or 1-100 MB buffer, whichever is reached first	No format conversion
HTTP endpoint	Custom endpoint	Custom
Splunk	0-60s	No

💡

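Firehose can convert JSON to Parquet or ORC automatically using a Glue schema - this can be a significant cost saver because Athena bills by data scanned, and compressed columnar formats let a query read only the columns it needs. The saving depends on the query, but several-fold reductions are common. Consider enabling this for S3 destinations whenever the source is structured JSON.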

🎯

Interview Focus Points

1. What is the difference between Kinesis Data Streams and Kinesis Firehose - when do you use each?
2. Explain the hot shard problem and how you would fix it.
3. How does Enhanced Fan-Out differ from standard GetRecords polling, and when is the extra cost justified?
4. How does Kinesis Data Streams compare to SQS FIFO queues for ordered event processing?
5. A Lambda consumer is falling behind on a Kinesis stream - what are the possible causes and how do you diagnose them?
6. How does Kinesis handle exactly-once delivery vs at-least-once delivery?
7. Explain how you would build a real-time clickstream pipeline using Kinesis, Lambda, and S3.
8. What is the retention period for Kinesis Data Streams and what are the cost implications of extended retention?
9. How do you scale a Kinesis stream when ingest exceeds current shard capacity?