AWS AI & Machine Learning
Rekognition
Image and video analysis - object detection, facial recognition, content moderation
Amazon Rekognition is a fully managed computer vision service that analyzes images and videos to detect objects, scenes, faces, text, and unsafe content, with no ML expertise required. It runs on deep learning models that AWS trains and continuously updates. For cloud engineers, Rekognition is often the fastest way to add vision capabilities to an application without managing GPU infrastructure or ML pipelines.
Image Analysis vs Video Analysis - Architecture Differences
Rekognition exposes two distinct sets of API operations, depending on whether you are analyzing a single image or a video (a stored file or a live stream).
| Aspect | Image API | Video API |
|---|---|---|
| Input | S3 object (max 15 MB) or raw bytes (max 5 MB) | S3 video file or Kinesis Video Stream |
| Response | Synchronous - immediate JSON response | Asynchronous - JobId returned, poll or SNS notify |
| Use case | Per-image processing, real-time photo analysis | Surveillance, video cataloging, temporal analysis |
| Face tracking | Detect faces in a single frame | Track same face across frames over time |
| Stored video limits | N/A | Up to 10 GB file size and 6 hours duration |
| Streaming | N/A | Rekognition Video Streams for live Kinesis feeds |
# Detect labels in an S3 image (synchronous)
import boto3
rekognition = boto3.client('rekognition', region_name='us-east-1')
response = rekognition.detect_labels(
    Image={'S3Object': {'Bucket': 'my-bucket', 'Name': 'photo.jpg'}},
    MaxLabels=10,
    MinConfidence=75.0
)
for label in response['Labels']:
    print(f"{label['Name']}: {label['Confidence']:.1f}%")
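The stored-video operations follow a start-then-fetch pattern instead of returning results inline. The following is a minimal sketch of that flow, assuming a boto3 Rekognition client is passed in as `client`. The helper name `collect_video_labels`, the polling interval, and the poll limit are illustrative choices, not part of the service. Polling is used only to keep the example short; a production system would more likely rely on the SNS notification mentioned in the table.

```python
import time


def collect_video_labels(client, bucket, key, min_confidence=75.0,
                         poll_seconds=5, max_polls=120, sleep=time.sleep):
    """Start a label detection job on a stored video and return all labels.

    `client` is expected to be a boto3 Rekognition client, for example
    boto3.client("rekognition"). All other names here are illustrative.
    """
    job = client.start_label_detection(
        Video={"S3Object": {"Bucket": bucket, "Name": key}},
        MinConfidence=min_confidence,
    )
    job_id = job["JobId"]

    # Poll until the job leaves the IN_PROGRESS state.
    for _ in range(max_polls):
        page = client.get_label_detection(JobId=job_id)
        if page["JobStatus"] != "IN_PROGRESS":
            break
        sleep(poll_seconds)
    else:
        raise TimeoutError(f"Job {job_id} still running after {max_polls} polls")

    if page["JobStatus"] != "SUCCEEDED":
        raise RuntimeError(f"Job {job_id} ended with status {page['JobStatus']}")

    # Results are paginated; follow NextToken until it is absent.
    labels = list(page.get("Labels", []))
    while page.get("NextToken"):
        page = client.get_label_detection(JobId=job_id,
                                          NextToken=page["NextToken"])
        labels.extend(page.get("Labels", []))
    return labels
```

Each returned entry carries a `Timestamp` (milliseconds from the start of the video) alongside the `Label`, which is what makes temporal analysis possible.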
Full Feature Matrix
Rekognition covers a wide range of vision tasks. Knowing which API to call for each task is a common interview question.
| Feature | API Call | What It Returns |
|---|---|---|
| Object and scene detection | DetectLabels | Labels with confidence scores, bounding boxes, parent categories |
| Facial analysis | DetectFaces | Face landmarks, age range, emotions, attributes (glasses, beard, etc.) |
| Face comparison | CompareFaces | Similarity score between two face images |
| Face search | SearchFacesByImage / SearchFaces | Matches from a stored face collection, searched by input image (SearchFacesByImage) or by an existing FaceId (SearchFaces) |
| Celebrity recognition | RecognizeCelebrities | Named celebrities with match confidence and URLs to additional information |
| Content moderation | DetectModerationLabels | Unsafe/explicit content with confidence and taxonomy |
| Text detection | DetectText | OCR - text in images, bounding boxes, LINE vs WORD |
| Custom labels | DetectCustomLabels | Your own trained categories using Rekognition Custom Labels |
| PPE detection | DetectProtectiveEquipment | Face cover, hand cover, and head cover (for example a hard hat) presence on persons |
| Segment detection (video) | StartSegmentDetection | Shot changes, black frames, end credits in video |
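As one illustration of the calls in the table above, here is a hedged sketch of how a `DetectModerationLabels` response might be turned into an approve/review/reject decision. It assumes a boto3 Rekognition client is passed in as `client`. The function name `moderate_image`, the two thresholds, and the three decision values are assumptions made for this example, not service defaults.

```python
def moderate_image(client, bucket, key,
                   review_threshold=50.0, block_threshold=90.0):
    """Classify an S3 image as 'approve', 'review', or 'reject'.

    `client` is expected to be a boto3 Rekognition client. The thresholds
    are illustrative and should be tuned per application.
    """
    response = client.detect_moderation_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MinConfidence=review_threshold,  # ignore labels below this score
    )
    labels = response.get("ModerationLabels", [])

    # Highest confidence among the returned labels (0.0 if none came back).
    top_confidence = max((label["Confidence"] for label in labels), default=0.0)

    if top_confidence >= block_threshold:
        decision = "reject"
    elif labels:
        decision = "review"  # uncertain range: route to a human reviewer
    else:
        decision = "approve"

    return {
        "decision": decision,
        "labels": [(label["Name"], label.get("ParentName", ""))
                   for label in labels],
    }
```

Keeping a middle "review" band is a design choice: it avoids auto-rejecting borderline content while still blocking high-confidence detections immediately.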
Rekognition Custom Labels lets you train your own image classification or object detection model using transfer learning on top of Rekognition's base models. You provide labeled images in S3 and Rekognition handles training, with no ML code required.
Face Collections - Building Facial Recognition Systems
A face collection is an indexed database of face vectors stored by Rekognition. You add faces to a collection and then search for matching faces in new images. This is the basis for identity verification and access control systems.
# Create a collection and index faces
import boto3
rek = boto3.client('rekognition')
# Create collection (raises ResourceAlreadyExistsException if it already exists)
rek.create_collection(CollectionId='employee-faces')
# Index a face from S3
response = rek.index_faces(
    CollectionId='employee-faces',
    Image={'S3Object': {'Bucket': 'hr-photos', 'Name': 'john_smith.jpg'}},
    ExternalImageId='employee-john-smith-001',
    MaxFaces=1,
    QualityFilter='AUTO'
)
# FaceRecords is empty when no face passes the quality filter
if not response['FaceRecords']:
    raise ValueError('No usable face found in john_smith.jpg')
face_id = response['FaceRecords'][0]['Face']['FaceId']
# Search for the face in a new image
search_response = rek.search_faces_by_image(
    CollectionId='employee-faces',
    Image={'S3Object': {'Bucket': 'door-camera', 'Name': 'visitor.jpg'}},
    FaceMatchThreshold=98.0,
    MaxFaces=1
)
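The search call above returns its candidates under a `FaceMatches` key, each with a `Similarity` score and the stored `Face` metadata. A minimal sketch of reading that response follows. The helper name `best_face_match` and the returned dictionary shape are illustrative, and the 98.0 default simply mirrors the threshold used above.

```python
def best_face_match(search_response, min_similarity=98.0):
    """Return the strongest match from a SearchFacesByImage response.

    Returns None when no candidate reaches `min_similarity`. The helper
    name and return shape are illustrative, not part of the boto3 API.
    """
    candidates = [
        match for match in search_response.get("FaceMatches", [])
        if match["Similarity"] >= min_similarity
    ]
    if not candidates:
        return None

    best = max(candidates, key=lambda match: match["Similarity"])
    face = best["Face"]
    return {
        "face_id": face["FaceId"],
        # ExternalImageId is the label supplied at IndexFaces time, if any.
        "external_image_id": face.get("ExternalImageId"),
        "similarity": best["Similarity"],
    }
```

In an access control setting, a `None` result would typically mean "deny and log" rather than "retry with a lower threshold".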
Facial recognition has legal and ethical implications. Several US states (Illinois BIPA, Texas CUBI) and the EU (GDPR) have strict regulations around biometric data. Always get legal review before building facial recognition systems involving employees or members of the public.
Rekognition Pricing
| Feature | Pricing Tier | Notes |
|---|---|---|
| Image analysis (labels, faces, text, moderation) | Tiered, starting at $0.001/image for the first 1M images/month; a limited free tier applies to new accounts | Each API call = one image unit |
| Custom Labels training | $1.00 per compute hour | Typically 1-8 hours to train |
| Custom Labels inference | $4.00 per inference unit per hour while the model is running | Must start/stop inference endpoint |
| Video stored (labels, faces) | First 1000 min/month free; $0.10/min after | Per minute of video processed |
| Video streaming (Kinesis) | Billed per minute of streamed video processed; rate depends on the feature and region | Charged while stream processor is running |
| Face collection storage | $0.01 per 1000 faces stored per month | Ongoing cost for large collections |
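To make the per-unit rates concrete, here is a rough back-of-the-envelope estimator. It is only a sketch: the default rates are the first-tier figures quoted in the table ($0.001 per image, $0.10 per stored video minute, $0.01 per 1,000 stored faces), and it deliberately ignores free tiers, volume discounts, and regional differences. The function name and parameters are illustrative.

```python
def estimate_monthly_cost(images=0, video_minutes=0, stored_faces=0,
                          image_rate=0.001, video_rate=0.10,
                          face_rate=0.01 / 1000):
    """Rough monthly Rekognition estimate in USD.

    Default rates are assumptions taken from the pricing table above;
    free tiers and volume tiers are not modeled.
    """
    costs = {
        "images": images * image_rate,
        "video": video_minutes * video_rate,
        "face_storage": stored_faces * face_rate,
    }
    total = sum(costs.values())
    costs["total"] = total
    # Round only for presentation, after summing the unrounded values.
    return {name: round(value, 2) for name, value in costs.items()}
```

For example, 200,000 images, 500 minutes of stored video, and 50,000 indexed faces come to roughly $200 + $50 + $0.50, so image volume dominates the bill while face storage is close to negligible.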
Rekognition Custom Labels inference endpoints must be explicitly started and stopped. Forgetting to stop a running endpoint costs $4.00/hour continuously. Always add automatic shutdown logic in your application.
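One way to approximate that shutdown logic is a context manager that always stops the model when the work finishes or fails. This is a sketch under stated assumptions: `client` is a boto3 Rekognition client, the ARNs and version name are placeholders supplied by the caller, and the helper name `running_model` is hypothetical. It uses the `start_project_version` and `stop_project_version` calls plus the `project_version_running` waiter.

```python
from contextlib import contextmanager


@contextmanager
def running_model(client, project_arn, version_name, version_arn,
                  min_inference_units=1):
    """Start a Custom Labels model, yield its ARN, and always stop it.

    `client` is expected to be a boto3 Rekognition client. The ARNs and
    version name are caller-supplied placeholders.
    """
    client.start_project_version(
        ProjectVersionArn=version_arn,
        MinInferenceUnits=min_inference_units,
    )
    try:
        # Block until the model reports RUNNING before handing it out.
        waiter = client.get_waiter("project_version_running")
        waiter.wait(ProjectArn=project_arn, VersionNames=[version_name])
        yield version_arn
    finally:
        # Runs on success, on exceptions, and if the waiter itself fails.
        client.stop_project_version(ProjectVersionArn=version_arn)
```

Inside the `with` block you would call `client.detect_custom_labels(ProjectVersionArn=arn, Image=...)` for each image. Note that this only protects against failures inside the process; if the process is killed outright the model keeps running, so a scheduled check that stops idle models is a sensible second layer.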
Interview Focus Points
1. What is the difference between Rekognition image analysis and video analysis APIs? Why is video asynchronous?
2. How do face collections work and what are the use cases for SearchFacesByImage vs CompareFaces?
3. What is Rekognition Custom Labels and how does it differ from the standard label detection?
4. How would you build a content moderation pipeline for user-uploaded images using Rekognition and S3?
5. What are the privacy and legal concerns around using Rekognition facial recognition in production?
6. How does Rekognition Video handle live streaming vs stored video files?
7. What confidence threshold would you set for a security access control system vs a content recommendation system and why?