Ace Cloud Interviews

AWS Database

Timestream

Fast, scalable, serverless time-series database for IoT and operational applications

Amazon Timestream is a serverless time-series database designed for naturally timestamped data such as IoT telemetry, DevOps metrics, and operational events. It tiers data automatically between an in-memory store for recent data and a cost-optimized magnetic store for historical data, so hot and cold storage are separated without any application logic. AWS positions Timestream as able to store and analyze trillions of time-series data points per day at as little as one-tenth the cost of a relational database holding the same data.

Timestream Architecture: Memory Store and Magnetic Store

Timestream automatically manages two storage tiers. Recent data lives in the in-memory store for high-speed ingest and query. As data ages past the memory store retention period, Timestream automatically moves it to the magnetic store. Queries transparently span both tiers.

| Attribute | Memory Store | Magnetic Store |
| --- | --- | --- |
| Purpose | Recent data; high-speed writes and queries | Historical data; cost-optimized |
| Latency | Microseconds to milliseconds | Milliseconds to seconds |
| Retention | Configurable (1 hour up to about 12 months; typically hours to days) | Configurable (1 day up to 73,000 days, about 200 years) |
| Cost | Higher (per GB-hour) | Lower (per GB-month) |
| Data movement | Automatic when memory retention expires | Transparent to queries |
💡

A common configuration for IoT telemetry is a 24-hour memory store with a 1-year magnetic store. Recent dashboard queries hit the memory store and are fast; historical analytics queries go to the magnetic store at lower cost.
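The 24-hour / 1-year configuration above could be expressed roughly as follows. This is a minimal sketch: `retention_properties` and `create_table` are hypothetical helper names, the range limits are assumptions to verify against the current Timestream quotas, and the boto3 call is only defined, not executed, because it needs AWS credentials.

```python
# Assumed limits; check the Timestream quotas page before relying on them.
MEMORY_HOURS_RANGE = (1, 8766)     # about 1 hour to 12 months
MAGNETIC_DAYS_RANGE = (1, 73000)   # about 1 day to 200 years


def retention_properties(memory_hours: int, magnetic_days: int) -> dict:
    """Build the RetentionProperties structure passed when creating a table."""
    lo, hi = MEMORY_HOURS_RANGE
    if not lo <= memory_hours <= hi:
        raise ValueError(f"memory_hours must be between {lo} and {hi}")
    lo, hi = MAGNETIC_DAYS_RANGE
    if not lo <= magnetic_days <= hi:
        raise ValueError(f"magnetic_days must be between {lo} and {hi}")
    return {
        "MemoryStoreRetentionPeriodInHours": memory_hours,
        "MagneticStoreRetentionPeriodInDays": magnetic_days,
    }


def create_table(database: str, table: str, props: dict) -> None:
    """Create the table with boto3 (needs AWS credentials; not called below)."""
    import boto3  # imported lazily so the helper above runs without boto3

    client = boto3.client("timestream-write")
    client.create_table(
        DatabaseName=database,
        TableName=table,
        RetentionProperties=props,
    )


# 24 hours in the memory store, then 365 days in the magnetic store.
props = retention_properties(memory_hours=24, magnetic_days=365)
print(props)
```

With credentials configured, `create_table("iot-sensors", "temperature-readings", props)` would apply these settings to a new table.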

Data Model: Databases, Tables, Dimensions, Measures, and Time

Timestream organizes data differently from relational databases. Every record has a timestamp, dimensions (metadata that describes the series), and measures (the actual measured values).

| Concept | Description | Example |
| --- | --- | --- |
| Database | Logical container for tables | 'iot-sensors' |
| Table | Container for time-series records | 'temperature-readings' |
| Dimension | Metadata attributes that identify the time series | device_id, location, sensor_type |
| Measure name | What is being measured | 'cpu_utilization', 'temperature' |
| Measure value | The actual reading | 72.5, 98 |
| Time | Timestamp of the measurement (nanosecond precision) | 2024-01-15 10:00:00.000000000 |
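To make the model concrete, here is a hedged sketch of how one reading might be shaped for the Timestream write API. `make_record` is a hypothetical helper, and the dimension values (`sensor-001`, `warehouse-a`) are invented for illustration; the dictionary keys follow the record structure that boto3's `write_records` accepts.

```python
from datetime import datetime, timezone


def make_record(dimensions: dict, measure_name: str, value: float, when: datetime) -> dict:
    """Shape one reading as a single-measure record (all values sent as strings)."""
    time_ms = int(when.timestamp() * 1000)
    return {
        # Dimensions identify the series this point belongs to.
        "Dimensions": [{"Name": k, "Value": str(v)} for k, v in dimensions.items()],
        # The measure name and value carry the actual reading.
        "MeasureName": measure_name,
        "MeasureValue": str(value),
        "MeasureValueType": "DOUBLE",
        # Time is an epoch offset, interpreted according to TimeUnit.
        "Time": str(time_ms),
        "TimeUnit": "MILLISECONDS",
    }


record = make_record(
    dimensions={"device_id": "sensor-001", "location": "warehouse-a"},  # hypothetical values
    measure_name="temperature",
    value=72.5,
    when=datetime(2024, 1, 15, 10, 0, tzinfo=timezone.utc),
)
print(record)
```

Millisecond precision is used here for simplicity; a finer `TimeUnit` would be needed to preserve nanosecond timestamps.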
sql
-- Timestream uses a SQL-like query language
-- Query average temperature per device in the last hour
SELECT device_id,
       AVG(measure_value::double) AS avg_temp,
       bin(time, 5m) AS time_bucket
FROM "iot-sensors"."temperature-readings"
WHERE measure_name = 'temperature'
  AND time BETWEEN ago(1h) AND now()
GROUP BY device_id, bin(time, 5m)
ORDER BY time_bucket DESC

-- Use built-in time-series functions
SELECT device_id,
       INTERPOLATE_LINEAR(
         CREATE_TIME_SERIES(time, measure_value::double),
         SEQUENCE(min(time), max(time), 1m)
       ) AS interpolated_readings
FROM "iot-sensors"."temperature-readings"
WHERE measure_name = 'temperature'
  AND time BETWEEN ago(6h) AND now()
GROUP BY device_id
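For readers unfamiliar with `bin(time, 5m)`, the following sketch emulates the first query's grouping in plain Python. It is an illustration of the semantics (round each timestamp down to a 5-minute boundary, then average per device and bucket), not how Timestream executes the query; the function names and sample readings are made up.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def bin_time(ts: datetime, width: timedelta) -> datetime:
    """Round ts down to the start of its bucket, like bin(time, 5m)."""
    return ts - ((ts - EPOCH) % width)


def average_per_bucket(readings, width: timedelta) -> dict:
    """Average values grouped by (device_id, bucket start)."""
    groups = defaultdict(list)
    for device_id, ts, value in readings:
        groups[(device_id, bin_time(ts, width))].append(value)
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


t0 = datetime(2024, 1, 15, 10, 0, tzinfo=timezone.utc)
readings = [
    ("sensor-001", t0 + timedelta(minutes=1), 70.0),  # falls in the 10:00 bucket
    ("sensor-001", t0 + timedelta(minutes=4), 74.0),  # falls in the 10:00 bucket
    ("sensor-001", t0 + timedelta(minutes=6), 80.0),  # falls in the 10:05 bucket
]
averages = average_per_bucket(readings, timedelta(minutes=5))
for (device_id, bucket), avg in sorted(averages.items()):
    print(device_id, bucket.isoformat(), avg)
```

The first two readings share a bucket and average to 72.0, while the third lands alone in the next bucket.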

Integrations: IoT Core, Kinesis, Grafana, and SageMaker

| Integration | How It Works | Use Case |
| --- | --- | --- |
| AWS IoT Core | IoT Core rules can route MQTT messages directly to Timestream | Ingest IoT device telemetry without custom code |
| Amazon Kinesis Data Streams | Amazon Managed Service for Apache Flink (formerly Kinesis Data Analytics) can read the stream and write to Timestream | High-throughput streaming telemetry |
| Amazon Managed Grafana | Native Timestream data source plugin | Real-time operational dashboards |
| AWS Lambda | Write records via the Timestream SDK in Lambda | Serverless telemetry pipeline |
| Amazon SageMaker | Export Timestream data to S3 for ML training | Anomaly detection, forecasting models |
💡

The native Grafana integration is one of Timestream's strongest selling points for DevOps and IoT teams. You can stand up a real-time operational dashboard in minutes using Amazon Managed Grafana with a Timestream data source, without building any custom query layer.
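For the AWS Lambda path in the table above, a write pipeline might look roughly like this. It is a sketch under assumptions: `handler` is a hypothetical Lambda entry point, the event is assumed to already carry records in the write API's shape under a `records` key, and the database and table names are the examples used earlier. Batches are capped at 100 records, the per-request limit for `write_records`.

```python
from typing import Iterator

MAX_RECORDS_PER_REQUEST = 100  # per-request record limit for WriteRecords


def chunked(records: list, size: int = MAX_RECORDS_PER_REQUEST) -> Iterator[list]:
    """Yield consecutive slices of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def handler(event: dict, context: object) -> dict:
    """Hypothetical Lambda handler; assumes event['records'] is write-ready."""
    import boto3  # imported lazily so chunked() can be exercised without boto3

    client = boto3.client("timestream-write")
    written = 0
    for batch in chunked(event["records"]):
        client.write_records(
            DatabaseName="iot-sensors",
            TableName="temperature-readings",
            Records=batch,
        )
        written += len(batch)
    return {"written": written}


# Local check of the batching logic only; no AWS call is made here.
batch_sizes = [len(batch) for batch in chunked(list(range(250)))]
print(batch_sizes)  # [100, 100, 50]
```

Sending full batches rather than one record per request reduces the number of write requests, which is also the main write-cost lever.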

Timestream vs InfluxDB vs TimescaleDB vs DynamoDB for Time-Series

| Database | Model | Strengths | Weaknesses |
| --- | --- | --- | --- |
| Timestream | Serverless, managed, AWS-native | No ops, auto-tiering, AWS integrations, SQL-like | AWS lock-in, limited query flexibility vs full SQL |
| InfluxDB (Cloud) | Purpose-built time-series, open source + managed | Flux query language, rich ecosystem, multi-cloud | Flux learning curve, OSS version self-managed |
| TimescaleDB | PostgreSQL extension for time-series | Full SQL, ACID, rich ecosystem, familiar | Requires PostgreSQL management (unless Timescale Cloud) |
| DynamoDB | Key-value NoSQL | Sub-millisecond reads, serverless, global tables | No time-series functions, expensive for high-frequency writes |
💡

Choose Timestream when you are already on AWS, want zero infrastructure management, and need native IoT Core or Kinesis integration. Choose TimescaleDB when you need full SQL and your team is comfortable with PostgreSQL. Choose InfluxDB when you need the Flux ecosystem or multi-cloud portability.

Pricing Model

| Component | Pricing Basis | Tip |
| --- | --- | --- |
| Writes | Per million writes, metered in 1 KB units | Batch writes (up to 100 records per request) to reduce per-request cost |
| Memory store | Per GB-hour stored | Keep retention short; move to magnetic quickly |
| Magnetic store | Per GB-month stored | Enable magnetic store writes for late-arriving data |
| Queries | Per GB of data scanned | Use time predicates to minimize data scanned |
| Scheduled queries | Per scheduled query execution + data scanned | Materialize aggregates to reduce ad-hoc scan costs |
⚠️

Timestream query costs are based on data scanned, similar to Athena. A query without a time range predicate can scan the entire table and produce an unexpectedly large bill. Always include WHERE time BETWEEN ... AND ... in production queries.
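One way to enforce this rule is to generate queries through a helper that always adds the time predicate. The sketch below assumes a hypothetical `bounded_query` function and reuses the example database and table names; the inputs are interpolated directly into the SQL text, so they should come from trusted configuration rather than user input.

```python
def bounded_query(database: str, table: str, measure_name: str, hours: int) -> str:
    """Build an average-per-device query that always carries a time range."""
    if hours <= 0:
        raise ValueError("hours must be a positive integer")
    return (
        f"SELECT device_id, AVG(measure_value::double) AS avg_value\n"
        f'FROM "{database}"."{table}"\n'
        f"WHERE measure_name = '{measure_name}'\n"
        f"  AND time BETWEEN ago({hours}h) AND now()\n"
        f"GROUP BY device_id"
    )


query = bounded_query("iot-sensors", "temperature-readings", "temperature", hours=1)
print(query)
```

The resulting string could then be passed as `QueryString` to the `query` operation of boto3's `timestream-query` client, assuming credentials are configured.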

🎯

Interview Focus Points

  1. What is Timestream and what types of workloads is it optimized for?
  2. Explain the memory store and magnetic store architecture. How does automatic tiering work?
  3. How does Timestream pricing work? What are the main cost levers?
  4. Compare Timestream to storing time-series data in DynamoDB or RDS. When does each make sense?
  5. How do you ingest IoT telemetry from AWS IoT Core into Timestream?
  6. What are Timestream scheduled queries and why would you use them?
  7. What are dimensions and measures in the Timestream data model?
  8. What time-series specific query functions does Timestream provide?