Ace Cloud Interviews
AWS AI & Machine Learning

Forecast

Time-series forecasting service using the same algorithms used at Amazon.com

Amazon Forecast is a fully managed time-series forecasting service built on the machine learning algorithms Amazon.com developed for retail demand forecasting. It requires no ML expertise: it ships with built-in algorithms (DeepAR+, CNN-QR, Prophet, ETS, NPTS, and ARIMA) and can automatically select the best model using AutoML. For cloud engineers, Forecast is valuable for predicting resource capacity needs, demand planning, and financial projections in data pipeline architectures.

Forecast Data Model - Target, Related, and Item Metadata

Forecast works with three types of time-series data that are combined into a Dataset Group. Understanding this structure is key to getting accurate predictions.

| Dataset Type | Required? | Content | Example |
|---|---|---|---|
| Target Time Series (TTS) | Yes | The metric you want to forecast over time | item_id=SKU001, timestamp=2024-01-01, demand=150 |
| Related Time Series (RTS) | Optional | External variables that affect the target (known in future) | price, promotions, holidays, weather |
| Item Metadata (IM) | Optional | Static attributes of each item | category=electronics, brand=Sony, weight=0.5kg |

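Adding context (RTS and IM) generally improves accuracy, especially for algorithms such as DeepAR+ and CNN-QR that use related data as covariates. Statistical algorithms such as ETS, ARIMA, and NPTS do not use related time series, so extra datasets only help when the chosen algorithm can consume them.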
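As a rough illustration of how the three dataset types differ, the sketch below builds the keyword arguments that the `create_dataset` API expects for each type in the RETAIL domain. The helper name `dataset_request` and the optional attributes (`price`, `category`, `brand`) are assumptions chosen to match the table examples, not a required layout.

```python
# Hypothetical helper: builds keyword arguments for forecast.create_dataset().
# The attribute order must match the column order of the CSV you import.

SCHEMAS = {
    "TARGET_TIME_SERIES": [
        ("item_id", "string"),
        ("timestamp", "timestamp"),
        ("demand", "float"),      # target field name in the RETAIL domain
    ],
    "RELATED_TIME_SERIES": [
        ("item_id", "string"),
        ("timestamp", "timestamp"),
        ("price", "float"),       # assumed covariate, known into the future
    ],
    "ITEM_METADATA": [
        ("item_id", "string"),
        ("category", "string"),   # assumed static attributes
        ("brand", "string"),
    ],
}


def dataset_request(name, dataset_type, frequency="D"):
    """Return create_dataset kwargs for one dataset type."""
    attributes = [
        {"AttributeName": attr, "AttributeType": attr_type}
        for attr, attr_type in SCHEMAS[dataset_type]
    ]
    request = {
        "DatasetName": name,
        "Domain": "RETAIL",
        "DatasetType": dataset_type,
        "Schema": {"Attributes": attributes},
    }
    # Item metadata is static, so it carries no data frequency.
    if dataset_type != "ITEM_METADATA":
        request["DataFrequency"] = frequency
    return request


tts = dataset_request("retail_demand_tts", "TARGET_TIME_SERIES")
```

With a boto3 client in hand, the result would be passed as `forecast.create_dataset(**tts)`.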

python
# Target Time Series CSV format (RETAIL domain: the target column is named "demand")
# item_id,timestamp,demand
# ITEM_001,2024-01-01 00:00:00,120
# ITEM_001,2024-01-02 00:00:00,145
# ITEM_002,2024-01-01 00:00:00,89

# Create an empty dataset group; datasets are attached to it later
import boto3

forecast = boto3.client('forecast', region_name='us-east-1')

# Create the dataset group and keep its ARN for the following steps
response = forecast.create_dataset_group(
    DatasetGroupName='retail_demand_group',
    Domain='RETAIL',
    DatasetArns=[]
)
dataset_group_arn = response['DatasetGroupArn']
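
Loading the CSV itself happens through a dataset import job. The following is a minimal sketch of that request, assuming the file already sits in S3 and an IAM role grants Forecast read access. The bucket path, account ID, and role name are placeholders, and the helper names are hypothetical.

```python
def import_job_request(job_name, dataset_arn, s3_path, role_arn):
    """Build kwargs for forecast.create_dataset_import_job()."""
    return {
        "DatasetImportJobName": job_name,
        "DatasetArn": dataset_arn,
        "DataSource": {"S3Config": {"Path": s3_path, "RoleArn": role_arn}},
        # Matches timestamps such as "2024-01-01 00:00:00" in the CSV.
        "TimestampFormat": "yyyy-MM-dd HH:mm:ss",
    }


def start_import(request, region="us-east-1"):
    """Submit the import job and return its ARN (needs AWS credentials)."""
    import boto3  # imported lazily so the builder above works offline

    client = boto3.client("forecast", region_name=region)
    return client.create_dataset_import_job(**request)["DatasetImportJobArn"]


# Placeholder identifiers, for illustration only.
request = import_job_request(
    job_name="retail_demand_import",
    dataset_arn="arn:aws:forecast:us-east-1:123456789012:dataset/retail_demand_tts",
    s3_path="s3://example-bucket/forecast/tts.csv",
    role_arn="arn:aws:iam::123456789012:role/ExampleForecastRole",
)
```

Calling `start_import(request)` would submit the job. Import jobs run asynchronously, so the returned ARN has to be polled until the job becomes active.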

Built-in Algorithms and AutoML

Forecast includes several algorithms optimized for different time-series patterns. AutoML trains all applicable algorithms and selects the best based on your accuracy metric.

| Algorithm | Type | Best For | Handles Cold Start? |
|---|---|---|---|
| DeepAR+ | Deep learning (LSTM) | Large datasets, many related series, complex patterns | Yes - uses item metadata |
| CNN-QR | Deep learning (CNN) | High-dimensional data, quantile forecasting | Yes |
| Prophet | Additive regression | Strong seasonal patterns, holidays, missing data | No |
| ETS (Exponential Smoothing) | Statistical | Simple trend and seasonality, small datasets | No |
| NPTS | Non-parametric | Sparse data, intermittent demand | No |
| ARIMA | Statistical | Stationary series, limited external covariates | No |
| AutoML | Ensemble selection | Best default choice when unsure | Depends on chosen model |

💡 DeepAR+ is Amazon's signature algorithm and generally outperforms others on large retail datasets. However, for small datasets (<100 time series, <365 data points), statistical algorithms like ETS often perform better and train much faster.
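
The choice between AutoML and a single algorithm shows up directly in the training request. The sketch below encodes the rule of thumb from the tip as a hypothetical `suggest_algorithm` helper and builds the matching `create_predictor` arguments. It assumes both thresholds must hold before preferring ETS. The thresholds are illustrative rather than official AWS guidance.

```python
# Built-in algorithm ARNs used by the create_predictor API.
ALGORITHM_ARNS = {
    "DeepAR+": "arn:aws:forecast:::algorithm/Deep_AR_Plus",
    "ETS": "arn:aws:forecast:::algorithm/ETS",
}


def suggest_algorithm(num_series, points_per_series):
    """Rule of thumb only: small datasets favor ETS, larger ones DeepAR+."""
    if num_series < 100 and points_per_series < 365:
        return "ETS"
    return "DeepAR+"


def predictor_request(name, dataset_group_arn, horizon, frequency="D", algorithm=None):
    """Build kwargs for forecast.create_predictor()."""
    request = {
        "PredictorName": name,
        "ForecastHorizon": horizon,
        "InputDataConfig": {"DatasetGroupArn": dataset_group_arn},
        "FeaturizationConfig": {"ForecastFrequency": frequency},
    }
    if algorithm is None:
        request["PerformAutoML"] = True  # let Forecast compare algorithms
    else:
        request["AlgorithmArn"] = ALGORITHM_ARNS[algorithm]
    return request


# Placeholder dataset group ARN, for illustration only.
GROUP_ARN = "arn:aws:forecast:us-east-1:123456789012:dataset-group/retail_demand_group"
small = predictor_request(
    "retail_small", GROUP_ARN, horizon=14, algorithm=suggest_algorithm(40, 200)
)
automl = predictor_request("retail_automl", GROUP_ARN, horizon=14)
```

A request sets either `AlgorithmArn` or `PerformAutoML`, never both, which is why the builder branches on `algorithm`.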

End-to-End Forecast Workflow

A Forecast project follows a fixed sequence of steps. Each step creates a resource, identified by an ARN, that feeds into the next.

| Step | Resource Created | Action |
|---|---|---|
| 1. Import data | DatasetImportJob | Upload CSV to S3, create import job to load into Forecast |
| 2. Train predictor | Predictor (AutoPredictor) | Train one or more algorithms on your dataset group |
| 3. Generate forecast | Forecast | Apply the predictor to generate predictions for the horizon |
| 4. Query forecast | ForecastExportJob or QueryForecast API | Export to S3 or query specific item forecasts via API |

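Every step in this sequence is asynchronous, so a pipeline has to wait for each resource to reach `ACTIVE` before starting the next one. The following is a minimal polling sketch under that assumption. The helper name `wait_until_active` and its defaults are hypothetical, and `describe` stands for any zero-argument callable that returns a describe-style response containing a `Status` field.

```python
import time


def wait_until_active(describe, poll_seconds=30, max_polls=120, sleep=time.sleep):
    """Poll a describe call until the resource is ACTIVE.

    Raises RuntimeError if the resource reports a *_FAILED status and
    TimeoutError if it is still not ACTIVE after max_polls attempts.
    """
    for _ in range(max_polls):
        status = describe()["Status"]
        if status == "ACTIVE":
            return status
        if status.endswith("FAILED"):
            raise RuntimeError(f"resource ended in status {status}")
        sleep(poll_seconds)
    raise TimeoutError("resource did not become ACTIVE in time")
```

With a boto3 client this could be called as `wait_until_active(lambda: forecast.describe_predictor(PredictorArn=predictor_arn))`, and likewise for `describe_dataset_import_job` and `describe_forecast`.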
⚠️ Forecast resources are not automatically deleted. A trained predictor or generated forecast that sits idle still incurs storage costs. Always delete unused predictors and forecasts, or set up lifecycle automation with EventBridge.
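
One way to approach the cleanup is a scheduled job that deletes forecasts older than a cutoff. The sketch below assumes a 30-day retention period and reads only the first page of `list_forecasts` results. The function names and the retention value are illustrative, and a real job would also need pagination and error handling.

```python
from datetime import datetime, timedelta, timezone


def stale_arns(resources, arn_key, now, max_age_days=30):
    """Return ARNs of ACTIVE resources created more than max_age_days ago."""
    cutoff = now - timedelta(days=max_age_days)
    return [
        resource[arn_key]
        for resource in resources
        if resource["Status"] == "ACTIVE" and resource["CreationTime"] < cutoff
    ]


def delete_stale_forecasts(client, max_age_days=30):
    """Delete old forecasts using a boto3 'forecast' client; returns deleted ARNs."""
    now = datetime.now(timezone.utc)
    forecasts = client.list_forecasts()["Forecasts"]  # first page only
    arns = stale_arns(forecasts, "ForecastArn", now, max_age_days)
    for arn in arns:
        client.delete_forecast(ForecastArn=arn)
    return arns
```

The same `stale_arns` filter could be applied to the `Predictors` list returned by `list_predictors`, using `PredictorArn` as the key.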

Forecast Pricing

| Component | Pricing |
|---|---|
| Data storage | $0.088 per GB-month stored in Forecast |
| Training hours (standard) | $0.24 per hour per algorithm trained |
| AutoML training | $0.24 per hour × number of algorithms trained |
| Forecast generation | $1.00 per 1,000 time series forecast per forecast horizon step |
| Forecast query | $0.008 per query (QueryForecast API) |
| Export to S3 | Free (only pay for S3 storage) |

💡 AutoML trains up to 6 algorithms, which multiplies training cost by up to 6x. Start with AutoML during development to identify the best algorithm, then switch to training that single algorithm in production to reduce cost.
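
The multiplier is easy to check with a little arithmetic. The sketch below uses the $0.24 per hour training rate from the pricing table and an assumed two hours of training per algorithm. The two-hour figure is purely illustrative.

```python
TRAINING_RATE_PER_HOUR = 0.24  # USD per hour per algorithm, from the pricing table


def training_cost(hours_per_algorithm, algorithms_trained=1, rate=TRAINING_RATE_PER_HOUR):
    """Estimated training cost in USD, rounded to cents."""
    return round(hours_per_algorithm * algorithms_trained * rate, 2)


automl_cost = training_cost(2.0, algorithms_trained=6)  # AutoML, six algorithms
single_cost = training_cost(2.0)                        # one chosen algorithm
```

Under these assumptions the AutoML run costs six times as much as the single-algorithm run, which is the gap the tip suggests closing once the best algorithm is known.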

🎯 Interview Focus Points

  1. What are the three dataset types in Amazon Forecast and what role does each play?
  2. When would you choose DeepAR+ over Prophet or ETS?
  3. How does Amazon Forecast handle cold-start items (new products with no historical data)?
  4. What is AutoML in Forecast and what are its cost implications?
  5. How would you design a weekly demand forecasting pipeline for a retail company using Forecast and Redshift?
  6. What accuracy metrics does Forecast use and what is WAPE?
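
For the last question, it helps to be able to write WAPE down. Weighted absolute percentage error is the sum of absolute errors divided by the sum of absolute actual values. The sketch below is a plain-Python illustration of that definition, not Forecast's own implementation, and the sample numbers are made up.

```python
def wape(actuals, forecasts):
    """Weighted absolute percentage error: sum|a - f| / sum|a|."""
    if len(actuals) != len(forecasts):
        raise ValueError("actuals and forecasts must have the same length")
    total_actual = sum(abs(a) for a in actuals)
    if total_actual == 0:
        raise ValueError("WAPE is undefined when all actual values are zero")
    total_error = sum(abs(a - f) for a, f in zip(actuals, forecasts))
    return total_error / total_actual


# Made-up demand values: absolute errors are 10 + 10 + 30 = 50 against 600 units.
score = wape([100, 200, 300], [110, 190, 330])
```

Because the errors are weighted by total demand, a miss on a high-volume item moves the score more than the same percentage miss on a slow seller.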