AWS AI & Machine Learning
Personalize
Real-time recommendation engine without requiring ML expertise
Amazon Personalize is a fully managed machine learning service for building real-time recommendation systems with the same technology that powers Amazon.com product recommendations. It trains models on your user interaction data and serves personalized recommendations through a low-latency API. For cloud engineers, Personalize is the fastest path to production recommendations, with no training infrastructure to manage and no ML expertise required.
Personalize Data Model - Interactions, Users, and Items
Personalize requires interaction history (minimum 1,000 interactions from 25+ unique users) and optionally user and item metadata.
| Dataset Type | Required? | Schema Fields | Example |
|---|---|---|---|
| Interactions | Yes - minimum 1000 records | USER_ID, ITEM_ID, TIMESTAMP + optional EVENT_TYPE, EVENT_VALUE | user123, article-456, 1704067200, click |
| Users | Optional | USER_ID + metadata fields | user123, age_group=25-34, location=us-west |
| Items | Optional | ITEM_ID + metadata fields | article-456, category=cloud, topic=kubernetes |
Interactions data is the most important input: more interaction history generally produces better recommendations. AWS recommends at least 50 interactions per user for high-quality personalization.
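As a rough illustration of the data model above, the sketch below defines an Avro-style schema for the Interactions dataset and checks a set of rows against the minimums quoted in this section. The helper name `meets_minimums` and the row layout (tuples of user ID, item ID, timestamp, event type) are assumptions made for this example, not part of the Personalize API.

```python
import json

# Avro schema for an Interactions dataset: the three required fields
# plus the optional EVENT_TYPE column from the table above.
INTERACTIONS_SCHEMA = {
    "type": "record",
    "name": "Interactions",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [
        {"name": "USER_ID", "type": "string"},
        {"name": "ITEM_ID", "type": "string"},
        {"name": "TIMESTAMP", "type": "long"},
        {"name": "EVENT_TYPE", "type": "string"},
    ],
    "version": "1.0",
}

# Personalize expects the schema as a JSON string.
schema_json = json.dumps(INTERACTIONS_SCHEMA)


def meets_minimums(rows, min_interactions=1000, min_users=25):
    """Hypothetical pre-flight check against the minimums quoted above.

    rows: iterable of (user_id, item_id, timestamp, event_type) tuples.
    """
    rows = list(rows)
    unique_users = {row[0] for row in rows}
    return len(rows) >= min_interactions and len(unique_users) >= min_users
```

In a real setup the JSON string would typically be registered with the `create_schema` call on the boto3 `personalize` client before importing data; the check here only covers the two counts mentioned in this section.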
Recipes - Choosing the Right Algorithm
Personalize calls its algorithms "recipes." Different recipes solve different recommendation problems. Choosing the right recipe for your use case is a key interview topic.
| Recipe | Use Case | Input | Output |
|---|---|---|---|
| USER_PERSONALIZATION (aws-user-personalization) | Homepage personalized feed | User ID | Ranked list of recommended items for that user |
| RELATED_ITEMS (aws-similar-items) | "More like this" widget | Item ID | Similar items based on co-interaction patterns |
| PERSONALIZED_RANKING (aws-personalized-ranking) | Rerank search results for a user | User ID + list of item IDs | Input list reranked by user preference |
| USER_SEGMENTATION (aws-item-affinity) | Marketing segment creation | Item ID | Users most likely to interact with an item |
| TRENDING_NOW (aws-trending-now) | Trending content section | None required | Currently trending items across all users |
| POPULARITY_COUNT (aws-popularity-count) | Fallback for new users (cold start) | None required | Most popular items overall |
For new users with no interaction history (cold start), Personalize falls back to POPULARITY_COUNT behavior. You can customize this behavior with contextual metadata passed at recommendation time.
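One way to keep the recipe table close to the code is a small lookup from use case to recipe ARN, plus a thin wrapper around the ranking call. This is a sketch under stated assumptions: the use-case keys and the helper names `recipe_for` and `rerank` are invented for illustration, while `get_personalized_ranking` is the boto3 `personalize-runtime` operation for PERSONALIZED_RANKING campaigns.

```python
# Recipe ARNs follow the pattern arn:aws:personalize:::recipe/<recipe-name>.
# The dictionary keys are hypothetical labels for the use cases in the table.
RECIPE_ARNS = {
    "homepage_feed": "arn:aws:personalize:::recipe/aws-user-personalization",
    "more_like_this": "arn:aws:personalize:::recipe/aws-similar-items",
    "rerank_results": "arn:aws:personalize:::recipe/aws-personalized-ranking",
    "marketing_segment": "arn:aws:personalize:::recipe/aws-item-affinity",
    "trending": "arn:aws:personalize:::recipe/aws-trending-now",
    "cold_start_fallback": "arn:aws:personalize:::recipe/aws-popularity-count",
}


def recipe_for(use_case):
    """Return the recipe ARN mapped to a use case label."""
    try:
        return RECIPE_ARNS[use_case]
    except KeyError:
        raise ValueError(f"no recipe mapped for use case {use_case!r}") from None


def rerank(runtime_client, campaign_arn, user_id, item_ids):
    """Rerank candidate items for one user via a PERSONALIZED_RANKING campaign.

    runtime_client is expected to be boto3.client('personalize-runtime').
    """
    response = runtime_client.get_personalized_ranking(
        campaignArn=campaign_arn,
        userId=user_id,
        inputList=list(item_ids),
    )
    return [entry["itemId"] for entry in response["personalizedRanking"]]
```

The contrast with USER_PERSONALIZATION is visible in the signature: ranking needs a candidate list as input and returns the same items in a new order, whereas user personalization picks items from the whole catalog.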
Real-time Campaigns vs Batch Recommendations
Personalize serves recommendations via two mechanisms: real-time campaigns (always-on endpoints) and batch inference jobs (offline scoring).
| Aspect | Real-time Campaign | Batch Inference Job |
|---|---|---|
| Latency | <100ms API response | Minutes to hours for entire user base |
| Cost | Per TPS capacity reserved (always on) | Per user processed (ephemeral) |
| Fresh interactions | PutEvents API streams new events that influence recommendations in near real time, without a full retrain | Uses static data snapshot |
| Use case | Website homepage, real-time API | Email campaigns, pre-computed feeds, nightly batch |
| Scale | Up to 500 TPS per campaign | Millions of users per job |
```python
# Get real-time recommendations from a deployed campaign
import boto3

personalize_runtime = boto3.client('personalize-runtime')

response = personalize_runtime.get_recommendations(
    campaignArn='arn:aws:personalize:us-east-1:123456789012:campaign/article-recommender',
    userId='user-789',
    numResults=10,
    # Contextual metadata supplied at request time
    context={'DEVICE': 'mobile', 'TIME_OF_DAY': 'evening'}
)

# Print each recommended item with its relevance score
for item in response['itemList']:
    print(f"item: {item['itemId']}, score: {item['score']:.4f}")
```
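The "Fresh interactions" row depends on streaming events as they happen. A minimal sketch of that path is shown below, assuming an event tracker already exists and its tracking ID is known. The helper names `build_put_events_request` and `record_event` are hypothetical; `put_events` is the operation on the boto3 `personalize-events` client.

```python
from datetime import datetime, timezone


def build_put_events_request(tracking_id, user_id, session_id, item_id,
                             event_type="click", sent_at=None):
    """Build the keyword arguments for a single-event PutEvents call."""
    if sent_at is None:
        sent_at = datetime.now(timezone.utc)
    return {
        "trackingId": tracking_id,   # ID of an existing event tracker
        "userId": user_id,
        "sessionId": session_id,     # groups events from one visit
        "eventList": [
            {"eventType": event_type, "itemId": item_id, "sentAt": sent_at},
        ],
    }


def record_event(events_client, **event_fields):
    """Send one interaction event.

    events_client is expected to be boto3.client('personalize-events').
    """
    request = build_put_events_request(**event_fields)
    events_client.put_events(**request)
    return request
```

Separating the request builder from the network call keeps the payload easy to inspect and unit test without AWS credentials.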
Campaigns have a minimum provisioned TPS that you pay for even at zero traffic. The minimum is 1 TPS; at the $0.20 per TPS-hour rate listed under Personalize Pricing, that is roughly $146 for a 730-hour month. If you have many low-traffic models, use batch inference instead of always-on campaigns.
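For the batch alternative, the sketch below prepares a user-based input file (one JSON object per line) and starts a job with `create_batch_inference_job` on the boto3 `personalize` client. The helper names, the S3 paths, and the IAM role passed in are placeholders you would supply; nothing here is specific to the setup described in this guide.

```python
import json


def batch_input_lines(user_ids):
    """Render the batch input file body: one {"userId": ...} object per line."""
    return "\n".join(json.dumps({"userId": str(user_id)}) for user_id in user_ids)


def start_batch_job(personalize_client, job_name, solution_version_arn,
                    input_s3_path, output_s3_path, role_arn, num_results=10):
    """Start a batch inference job and return its ARN.

    personalize_client is expected to be boto3.client('personalize').
    The input file must already be uploaded to input_s3_path.
    """
    response = personalize_client.create_batch_inference_job(
        jobName=job_name,
        solutionVersionArn=solution_version_arn,
        numResults=num_results,
        jobInput={"s3DataSource": {"path": input_s3_path}},
        jobOutput={"s3DataDestination": {"path": output_s3_path}},
        roleArn=role_arn,
    )
    return response["batchInferenceJobArn"]
```

Because a batch job runs against a solution version directly, no campaign (and no provisioned TPS) is needed for this path.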
Personalize Pricing
| Component | Pricing |
|---|---|
| Data ingestion | $0.05 per GB ingested into Personalize |
| Training | $0.24 per training hour |
| Real-time inference (campaign) | $0.20 per TPS-hour (minimum 1 TPS) |
| Batch inference | $0.067 per 1,000 users processed |
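To compare the options, the rates in the table can be turned into a small estimator. This is a back-of-the-envelope sketch: the prices are copied from the table above and may not match current AWS pricing, the 730-hour month is an assumption, and the function names are made up for this example.

```python
# Rates copied from the pricing table above (USD); verify before relying on them.
PRICE_PER_GB_INGESTED = 0.05
PRICE_PER_TRAINING_HOUR = 0.24
PRICE_PER_TPS_HOUR = 0.20
PRICE_PER_1000_BATCH_USERS = 0.067

HOURS_PER_MONTH = 730  # assumed average month length in hours


def campaign_monthly_cost(provisioned_tps, hours=HOURS_PER_MONTH):
    """Cost of an always-on campaign; billing never drops below 1 TPS."""
    billable_tps = max(provisioned_tps, 1)
    return billable_tps * PRICE_PER_TPS_HOUR * hours


def batch_job_cost(users_processed):
    """Cost of one batch inference job over the given number of users."""
    return users_processed / 1000 * PRICE_PER_1000_BATCH_USERS


def training_cost(training_hours, gb_ingested=0.0):
    """Cost of ingesting data and training one solution version."""
    return training_hours * PRICE_PER_TRAINING_HOUR + gb_ingested * PRICE_PER_GB_INGESTED
```

Comparing `campaign_monthly_cost(1)` with `batch_job_cost(n)` multiplied by the number of runs per month gives a first estimate of where the break-even between a campaign and batch jobs lies for a given user base.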
Interview Focus Points
1. What are the different Personalize recipes and when would you use USER_PERSONALIZATION vs PERSONALIZED_RANKING?
2. How does Personalize handle new users with no interaction history (cold start problem)?
3. What is the PutEvents API and why is it important for real-time personalization?
4. When would you use batch inference instead of a real-time campaign in Personalize?
5. What is the minimum data requirement to train a Personalize model?
6. How would you A/B test two different Personalize recommendation strategies in production?
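For the A/B testing question, one common approach (not something Personalize prescribes) is to deploy each strategy as its own campaign and split users deterministically by hashing their ID, so a given user always sees the same variant. The sketch below assumes two campaign ARNs keyed "A" and "B"; the experiment name and function names are hypothetical.

```python
import hashlib


def assign_variant(user_id, experiment="recs-ab-1", treatment_share=0.5):
    """Deterministically map a user to variant "A" or "B".

    Hashing experiment + user ID gives a stable, roughly uniform bucket
    in [0, 1); users below treatment_share get variant "B".
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "B" if bucket < treatment_share else "A"


def campaign_for(user_id, campaign_arns, experiment="recs-ab-1", treatment_share=0.5):
    """Pick the campaign ARN to query for this user.

    campaign_arns: dict with keys "A" and "B" mapping to campaign ARNs.
    """
    variant = assign_variant(user_id, experiment, treatment_share)
    return variant, campaign_arns[variant]
```

The returned variant label can be logged alongside click or conversion events so the two campaigns can be compared afterwards; changing the experiment name reshuffles the assignment for a new test.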