Personalize

Real-time recommendation engine without requiring ML expertise

Amazon Personalize is a fully managed machine learning service for building real-time recommendation systems, based on the same technology that powers Amazon.com product recommendations. It trains models on your user interaction data and serves personalized recommendations through a low-latency API. For cloud engineers, Personalize is often the fastest path to production recommendations, because there is no training infrastructure to manage and no ML expertise is required.

Personalize Data Model - Interactions, Users, and Items

Personalize requires interaction history (minimum 1,000 interactions from 25+ unique users) and optionally user and item metadata.

Dataset Type	Required?	Schema Fields	Example
Interactions	Yes - minimum 1,000 records	USER_ID, ITEM_ID, TIMESTAMP + optional EVENT_TYPE, EVENT_VALUE	user123, article-456, 1704067200, click
Users	Optional	USER_ID + metadata fields	user123, age_group=25-34, location=us-west
Items	Optional	ITEM_ID + metadata fields	article-456, category=cloud, topic=kubernetes

Interactions data is the most important input, and more historical data generally yields better recommendations. AWS recommends at least 50 interactions per user for high-quality personalization.
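As a rough sketch of the data model above, the interactions schema can be written as an Avro-style JSON document, and a CSV export can be checked against the stated minimums before import. The helper name `check_minimums` and the sample row are illustrative assumptions, not part of Personalize itself.

```python
import csv
import io
import json

# Avro-style schema for the interactions dataset, using fields from the table above.
INTERACTIONS_SCHEMA = {
    "type": "record",
    "name": "Interactions",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [
        {"name": "USER_ID", "type": "string"},
        {"name": "ITEM_ID", "type": "string"},
        {"name": "TIMESTAMP", "type": "long"},
        {"name": "EVENT_TYPE", "type": "string"},
    ],
    "version": "1.0",
}


def check_minimums(csv_text, min_interactions=1000, min_users=25):
    """Hypothetical pre-import check against the minimums quoted above."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    users = {row["USER_ID"] for row in rows}
    return {
        "interactions": len(rows),
        "unique_users": len(users),
        "meets_minimums": len(rows) >= min_interactions and len(users) >= min_users,
    }


# One sample row taken from the table above; a real export would have many more.
sample = "USER_ID,ITEM_ID,TIMESTAMP,EVENT_TYPE\nuser123,article-456,1704067200,click\n"
print(json.dumps(INTERACTIONS_SCHEMA, indent=2))
print(check_minimums(sample))
```

Running a check like this locally is cheaper than discovering a too-small dataset after an import job has already been started.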

Recipes - Choosing the Right Algorithm

Personalize calls its algorithms "recipes." Different recipes solve different recommendation problems. Choosing the right recipe for your use case is a key interview topic.

Recipe	Use Case	Input	Output
USER_PERSONALIZATION (aws-user-personalization)	Homepage personalized feed	User ID	Ranked list of recommended items for that user
RELATED_ITEMS (aws-similar-items)	"More like this" widget	Item ID	Similar items based on co-interaction patterns
PERSONALIZED_RANKING (aws-personalized-ranking)	Rerank search results for a user	User ID + list of item IDs	Input list reranked by user preference
USER_SEGMENTATION (aws-item-affinity)	Marketing segment creation	Item ID	Users most likely to interact with an item
TRENDING_NOW (aws-trending-now)	Trending content section	None required	Currently trending items across all users
POPULARITY_COUNT (aws-popularity-count)	Fallback for new users (cold start)	None required	Most popular items overall
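The recipe table above can be turned into a small lookup, sketched here under some assumptions: the use-case keys are invented for illustration, and `client` is expected to be a `boto3.client('personalize')` instance created elsewhere. The recipe names come from the table; treat the rest as a starting point rather than a reference implementation.

```python
# Recipe names are taken from the table above; the use-case keys are made up for this sketch.
RECIPE_BY_USE_CASE = {
    "homepage_feed": "aws-user-personalization",
    "more_like_this": "aws-similar-items",
    "rerank_search": "aws-personalized-ranking",
    "marketing_segment": "aws-item-affinity",
    "trending": "aws-trending-now",
    "cold_start_fallback": "aws-popularity-count",
}


def recipe_arn(use_case):
    """Build the ARN of a built-in recipe (region and account fields are left empty)."""
    return f"arn:aws:personalize:::recipe/{RECIPE_BY_USE_CASE[use_case]}"


def create_solution(client, name, dataset_group_arn, use_case):
    """Create a solution with the chosen recipe and return its ARN.

    `client` is assumed to be boto3.client('personalize').
    """
    response = client.create_solution(
        name=name,
        datasetGroupArn=dataset_group_arn,
        recipeArn=recipe_arn(use_case),
    )
    return response["solutionArn"]


print(recipe_arn("rerank_search"))
```

Keeping the use-case-to-recipe mapping in one place makes it easier to explain, in an interview or a design review, why each widget uses the recipe it does.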

💡

For new users with no interaction history (cold start), Personalize falls back to POPULARITY_COUNT behavior. You can customize this behavior with contextual metadata passed at recommendation time.

Real-time Campaigns vs Batch Recommendations

Personalize serves recommendations via two mechanisms: real-time campaigns (always-on endpoints) and batch inference jobs (offline scoring).

Aspect	Real-time Campaign	Batch Inference Job
Latency	<100ms API response	Minutes to hours for entire user base
Cost	Per TPS capacity reserved (always on)	Per user processed (ephemeral)
Fresh interactions	PutEvents API streams new interactions so recommendations adapt in near real time	Uses static data snapshot
Use case	Website homepage, real-time API	Email campaigns, pre-computed feeds, nightly batch
Scale	Up to 500 TPS per campaign	Millions of users per job

python

# Get real-time recommendations
import boto3

personalize_runtime = boto3.client('personalize-runtime')

response = personalize_runtime.get_recommendations(
    campaignArn='arn:aws:personalize:us-east-1:123456789012:campaign/article-recommender',
    userId='user-789',
    numResults=10,
    context={'DEVICE': 'mobile', 'TIME_OF_DAY': 'evening'}
)

for item in response['itemList']:
    print(f"item: {item['itemId']}, score: {item['score']:.4f}")
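
The real-time path also depends on feeding fresh interactions back through PutEvents. The following is a minimal sketch, assuming an event tracker already exists and that `events_client` is a `boto3.client('personalize-events')` instance; the helper names, tracking ID, and session ID are hypothetical.

```python
from datetime import datetime, timezone


def build_click_event(item_id, event_type="click", sent_at=None):
    """Build one entry for the eventList parameter of PutEvents."""
    return {
        "eventType": event_type,
        "itemId": item_id,
        "sentAt": sent_at or datetime.now(timezone.utc),
    }


def record_interaction(events_client, tracking_id, user_id, session_id, item_id):
    """Stream a single interaction to Personalize.

    `events_client` is assumed to be boto3.client('personalize-events');
    `tracking_id` comes from an event tracker created beforehand.
    """
    events_client.put_events(
        trackingId=tracking_id,
        userId=user_id,
        sessionId=session_id,
        eventList=[build_click_event(item_id)],
    )


# Example call (placeholder values, requires AWS credentials):
# record_interaction(boto3.client("personalize-events"),
#                    "my-tracking-id", "user-789", "session-1", "article-456")
```

Sending events as they happen is what lets a campaign reflect a user's current session rather than only the last training snapshot.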

⚠️

Campaigns have a minimum provisioned TPS that you pay for even at zero traffic. The minimum is 1 TPS; at $0.20 per TPS-hour, that is about $144 for a 30-day month (1 TPS × 720 hours × $0.20). If you have many low-traffic models, use batch inference instead of always-on campaigns.
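
For the batch alternative mentioned above, a job reads a JSON Lines file of user IDs from S3 and writes results back to S3. This is a hedged sketch: `client` is assumed to be a `boto3.client('personalize')` instance, and the helper names, S3 paths, and role ARN are placeholders you would supply.

```python
import json


def build_batch_input(user_ids):
    """Return JSON Lines text with one {"userId": ...} object per line, for upload to S3."""
    return "\n".join(json.dumps({"userId": uid}) for uid in user_ids) + "\n"


def start_batch_job(client, job_name, solution_version_arn,
                    input_s3, output_s3, role_arn, num_results=10):
    """Start a batch inference job and return its ARN.

    `client` is assumed to be boto3.client('personalize'); the S3 paths and
    IAM role are placeholders for resources that already exist.
    """
    response = client.create_batch_inference_job(
        jobName=job_name,
        solutionVersionArn=solution_version_arn,
        numResults=num_results,
        jobInput={"s3DataSource": {"path": input_s3}},
        jobOutput={"s3DataDestination": {"path": output_s3}},
        roleArn=role_arn,
    )
    return response["batchInferenceJobArn"]


print(build_batch_input(["user-789", "user-790"]), end="")
```

Because the job runs against a solution version rather than a campaign, no always-on endpoint is needed for this path.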

Personalize Pricing

Component	Pricing
Data ingestion	$0.05 per GB ingested into Personalize
Training	$0.24 per training hour
Real-time inference (campaign)	$0.20 per TPS-hour (minimum 1 TPS)
Batch inference	$0.067 per 1,000 users processed
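
To make the campaign-versus-batch trade-off concrete, the rates in the table above can be plugged into a back-of-the-envelope calculation. The sketch below assumes a 30-day month and uses the table's rates as-is; the function names and the 50,000-user scenario are illustrative, and current AWS pricing should be checked before relying on the numbers.

```python
HOURS_PER_MONTH = 24 * 30  # assumes a 30-day month

# Rates copied from the pricing table above.
CAMPAIGN_RATE_PER_TPS_HOUR = 0.20
BATCH_RATE_PER_1000_USERS = 0.067


def campaign_monthly_cost(provisioned_tps=1):
    """Cost of keeping a campaign provisioned for a whole month."""
    return provisioned_tps * HOURS_PER_MONTH * CAMPAIGN_RATE_PER_TPS_HOUR


def batch_monthly_cost(users, runs_per_month):
    """Cost of scoring `users` users in each of `runs_per_month` batch jobs."""
    return users / 1000 * BATCH_RATE_PER_1000_USERS * runs_per_month


print(f"1 TPS campaign: ${campaign_monthly_cost(1):.2f}/month")
print(f"Nightly batch, 50,000 users: ${batch_monthly_cost(50_000, 30):.2f}/month")
```

Under these assumptions the batch path is cheaper for small or infrequently refreshed audiences, while a campaign becomes easier to justify once recommendations must react to live traffic.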

🎯

Interview Focus Points

1. What are the different Personalize recipes and when would you use USER_PERSONALIZATION vs PERSONALIZED_RANKING?
2. How does Personalize handle new users with no interaction history (cold start problem)?
3. What is the PutEvents API and why is it important for real-time personalization?
4. When would you use batch inference instead of a real-time campaign in Personalize?
5. What is the minimum data requirement to train a Personalize model?
6. How would you A/B test two different Personalize recommendation strategies in production?