Rekognition

Image and video analysis - object detection, facial recognition, content moderation

Amazon Rekognition is a fully managed computer vision service that analyzes images and videos to detect objects, scenes, faces, text, and unsafe content - without requiring any ML expertise. It is powered by deep learning models that AWS trains and continuously updates. For cloud engineers, Rekognition is often the fastest way to add vision intelligence to applications without managing GPU infrastructure or ML pipelines.

Image Analysis vs Video Analysis - Architecture Differences

Rekognition exposes two distinct sets of API operations, depending on whether you are analyzing a single image or a video (a stored file or a live stream).

Aspect	Image API	Video API
Input	S3 object (max 15 MB) or raw image bytes (max 5 MB)	S3 video file or Kinesis Video Stream
Response	Synchronous - immediate JSON response	Asynchronous - JobId returned; poll the matching Get* operation or subscribe to an SNS notification
Use case	Per-image processing, real-time photo analysis	Surveillance, video cataloging, temporal analysis
Face tracking	Detect faces in a single frame	Track same face across frames over time
Max video size	N/A	10 GB per stored video file
Streaming	N/A	Rekognition Video Streams for live Kinesis feeds

python

# Detect labels in an S3 image (synchronous)
import boto3

rekognition = boto3.client('rekognition', region_name='us-east-1')

response = rekognition.detect_labels(
    Image={'S3Object': {'Bucket': 'my-bucket', 'Name': 'photo.jpg'}},
    MaxLabels=10,
    MinConfidence=75.0
)

for label in response['Labels']:
    print(f"{label['Name']}: {label['Confidence']:.1f}%")
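
For contrast, the asynchronous Video flow from the table above can be sketched as a polling loop. This is a minimal sketch, not a production pattern: `wait_for_label_job`, its `poll_seconds` and `max_polls` parameters, and the bucket and file names in the comments are hypothetical. It assumes the boto3 `start_label_detection` and `get_label_detection` operations and takes the client as an argument so it can be exercised without AWS credentials.

```python
import time


def wait_for_label_job(client, job_id, poll_seconds=5.0, max_polls=60):
    """Poll GetLabelDetection until the job finishes, then collect every result page.

    Returns a list of (timestamp_ms, label_name, confidence) tuples.
    """
    for _ in range(max_polls):
        page = client.get_label_detection(JobId=job_id, SortBy="TIMESTAMP")
        status = page["JobStatus"]
        if status == "SUCCEEDED":
            break
        if status == "FAILED":
            raise RuntimeError(page.get("StatusMessage", "label detection failed"))
        time.sleep(poll_seconds)  # still IN_PROGRESS, wait before the next poll
    else:
        raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")

    # Results are paginated; follow NextToken until it is absent.
    items = list(page.get("Labels", []))
    while page.get("NextToken"):
        page = client.get_label_detection(
            JobId=job_id, SortBy="TIMESTAMP", NextToken=page["NextToken"]
        )
        items.extend(page.get("Labels", []))

    return [
        (item["Timestamp"], item["Label"]["Name"], item["Label"]["Confidence"])
        for item in items
    ]


# Hypothetical usage with a real client (requires boto3 and AWS credentials):
#   client = boto3.client("rekognition", region_name="us-east-1")
#   job = client.start_label_detection(
#       Video={"S3Object": {"Bucket": "my-bucket", "Name": "clip.mp4"}},
#       MinConfidence=75.0,
#   )
#   timeline = wait_for_label_job(client, job["JobId"])
```

Polling keeps the sketch short. For long videos, an SNS notification avoids holding a process open while the job runs.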

Full Feature Matrix

Rekognition covers a wide range of vision tasks. Knowing which API to call for each task is a common interview question.

Feature	API Call	What It Returns
Object and scene detection	DetectLabels	Labels with confidence scores, bounding boxes, parent categories
Facial analysis	DetectFaces	Face landmarks, age range, emotions, attributes (glasses, beard, etc.)
Face comparison	CompareFaces	Similarity score between two face images
Face search	SearchFacesByImage, SearchFaces	Matches from a stored face collection, searched by a new image or by an existing FaceId
Celebrity recognition	RecognizeCelebrities	Named celebrities with Wikipedia URLs
Content moderation	DetectModerationLabels	Unsafe/explicit content with confidence and taxonomy
Text detection	DetectText	OCR - text in images, bounding boxes, LINE vs WORD
Custom labels	DetectCustomLabels	Your own trained categories using Rekognition Custom Labels
PPE detection	DetectProtectiveEquipment	Head cover, face cover, and hand cover presence on detected persons
Segment detection (video)	StartSegmentDetection	Shot changes, black frames, end credits in video
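
To make one row of the matrix concrete, here is a minimal sketch of turning a DetectModerationLabels response into a decision for user-uploaded images. The function name `moderation_decision` and the `block_at` and `review_at` thresholds are hypothetical choices for illustration, not recommended values. The response shape (`ModerationLabels` entries with `Name` and `Confidence`) follows the boto3 `detect_moderation_labels` call.

```python
def moderation_decision(response, block_at=90.0, review_at=60.0):
    """Map a DetectModerationLabels response to 'block', 'review', or 'allow'.

    Returns (decision, flagged_names), where flagged_names lists every
    moderation label at or above the review threshold.
    """
    labels = response.get("ModerationLabels", [])
    flagged = [l["Name"] for l in labels if l["Confidence"] >= review_at]
    top = max((l["Confidence"] for l in labels), default=0.0)

    if top >= block_at:
        return "block", flagged
    if top >= review_at:
        return "review", flagged  # send to a human moderation queue
    return "allow", flagged


# Hypothetical usage with a real client (requires boto3 and AWS credentials):
#   response = client.detect_moderation_labels(
#       Image={"S3Object": {"Bucket": "uploads", "Name": "avatar.jpg"}},
#       MinConfidence=50.0,
#   )
#   decision, flagged = moderation_decision(response)
```

Two thresholds rather than one leave a middle band for human review, which is usually where borderline content ends up.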

💡

Rekognition Custom Labels lets you train your own image classifier using transfer learning on top of Rekognition's base models. You provide labeled images in S3 and Rekognition handles training - no ML code required.

Face Collections - Building Facial Recognition Systems

A face collection is an indexed database of face vectors stored by Rekognition. You add faces to a collection and then search for matching faces in new images. This is the basis for identity verification and access control systems.

python

# Create a collection and index faces
import boto3

rek = boto3.client('rekognition')

# Create collection
rek.create_collection(CollectionId='employee-faces')

# Index a face from S3
response = rek.index_faces(
    CollectionId='employee-faces',
    Image={'S3Object': {'Bucket': 'hr-photos', 'Name': 'john_smith.jpg'}},
    ExternalImageId='employee-john-smith-001',
    MaxFaces=1,
    QualityFilter='AUTO'
)
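# FaceRecords is empty when no face in the image passes the quality filter
if not response['FaceRecords']:
    raise ValueError('No usable face found in john_smith.jpg')
face_id = response['FaceRecords'][0]['Face']['FaceId']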

# Search for the face in a new image
search_response = rek.search_faces_by_image(
    CollectionId='employee-faces',
    Image={'S3Object': {'Bucket': 'door-camera', 'Name': 'visitor.jpg'}},
    FaceMatchThreshold=98.0,
    MaxFaces=1
)
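
The search call above returns a `FaceMatches` list rather than a yes/no answer, so the application still has to decide what counts as a match. A minimal sketch of that step follows, assuming the response shape returned by boto3's `search_faces_by_image`. The helper name `best_match` and its `min_similarity` default are hypothetical.

```python
def best_match(search_response, min_similarity=98.0):
    """Return (external_image_id, similarity) for the strongest match, or None.

    Expects the dict returned by search_faces_by_image. Matches below
    min_similarity are ignored even if the API returned them.
    """
    candidates = [
        match
        for match in search_response.get("FaceMatches", [])
        if match["Similarity"] >= min_similarity
    ]
    if not candidates:
        return None

    top = max(candidates, key=lambda match: match["Similarity"])
    # ExternalImageId is whatever was passed to index_faces, if anything.
    return top["Face"].get("ExternalImageId"), top["Similarity"]


# Hypothetical usage, continuing from the search above:
#   result = best_match(search_response)
#   if result is None:
#       print("No enrolled face matched; deny access")
#   else:
#       print(f"Matched {result[0]} at {result[1]:.1f}% similarity")
```

Re-checking the threshold in application code is deliberate: it keeps the access decision explicit even if `FaceMatchThreshold` in the API call is later loosened for debugging.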

⚠️

Facial recognition has legal and ethical implications. Several US states (Illinois BIPA, Texas CUBI) and the EU (GDPR) have strict regulations around biometric data. Always get legal review before building facial recognition systems involving employees or members of the public.

Rekognition Pricing

Feature	Pricing Tier	Notes
Image analysis (labels, faces, text, moderation)	About $0.001/image for the first 1M images/month, with lower per-image rates at higher volume; a limited free tier applies to new accounts	Each API call = one image unit
Custom Labels training	$1.00 per compute hour	Typically 1-8 hours to train
Custom Labels inference	$4.00 per compute hour (dedicated) or $0.01/image	Must start/stop inference endpoint
Video stored (labels, faces)	First 1000 min/month free; $0.10/min after	Per minute of video processed
Video streaming (Kinesis)	$0.10 per 1000 stream processor hours	Charged while stream processor is running
Face collection storage	$0.01 per 1000 faces stored per month	Ongoing cost for large collections
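
As a worked example of the arithmetic, the sketch below multiplies usage by the per-unit rates quoted in the table ($0.001 per image, $0.10 per video minute, $0.01 per 1000 stored faces, $4.00 per Custom Labels inference hour). These defaults are assumptions copied from the table, and the function name `estimate_monthly_cost` is hypothetical. Free-tier allowances and volume discounts are deliberately ignored, so check current AWS pricing before relying on any figure.

```python
def estimate_monthly_cost(
    images=0,
    video_minutes=0,
    faces_stored=0,
    inference_hours=0.0,
    per_image=0.001,           # assumed rate from the table
    per_video_minute=0.10,     # assumed rate from the table
    per_1000_faces=0.01,       # assumed rate from the table
    per_inference_hour=4.00,   # assumed rate from the table
):
    """Return a cost breakdown in USD, ignoring free tier and volume discounts."""
    breakdown = {
        "images": images * per_image,
        "video": video_minutes * per_video_minute,
        "face_storage": faces_stored / 1000 * per_1000_faces,
        "custom_labels_inference": inference_hours * per_inference_hour,
    }
    breakdown = {name: round(cost, 2) for name, cost in breakdown.items()}
    breakdown["total"] = round(sum(breakdown.values()), 2)
    return breakdown


cost = estimate_monthly_cost(
    images=200_000, video_minutes=500, faces_stored=50_000, inference_hours=24
)
for name, value in cost.items():
    print(f"{name}: ${value:.2f}")
```

With these inputs the breakdown is $200.00 for images, $50.00 for video, $0.50 for face storage, and $96.00 for 24 hours of Custom Labels inference, for a total of $346.50. A single day of inference costs far more than storing 50,000 faces for a month.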

💡

Rekognition Custom Labels inference endpoints must be explicitly started and stopped. Forgetting to stop a running endpoint costs $4.00/hour continuously. Always add automatic shutdown logic in your application.
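
One way to make the shutdown automatic is to wrap the model's lifetime in a context manager, so the stop call runs even if inference code raises. This is a minimal sketch under assumptions: `running_model` and its parameters are hypothetical, and it relies on the boto3 `start_project_version` and `stop_project_version` operations and the `project_version_running` waiter. The client is passed in so the logic can be tested with a stub.

```python
from contextlib import contextmanager


@contextmanager
def running_model(client, project_arn, version_name, version_arn, min_units=1):
    """Start a Custom Labels model, yield its ARN, and always stop it afterwards."""
    client.start_project_version(
        ProjectVersionArn=version_arn, MinInferenceUnits=min_units
    )
    try:
        # Starting is asynchronous; block until the model reports RUNNING.
        client.get_waiter("project_version_running").wait(
            ProjectArn=project_arn, VersionNames=[version_name]
        )
        yield version_arn
    finally:
        # Runs on success, on exceptions, and if the waiter times out.
        client.stop_project_version(ProjectVersionArn=version_arn)


# Hypothetical usage with a real client (requires boto3 and AWS credentials):
#   with running_model(client, project_arn, "v1", version_arn) as arn:
#       result = client.detect_custom_labels(
#           ProjectVersionArn=arn,
#           Image={"S3Object": {"Bucket": "my-bucket", "Name": "part.jpg"}},
#           MinConfidence=80.0,
#       )
```

This suits batch jobs that start the model, process a backlog, and exit. It does not protect against the process being killed outright, so a scheduled check that stops long-running models is a reasonable second safeguard.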

🎯

Interview Focus Points

1. What is the difference between Rekognition image analysis and video analysis APIs? Why is video asynchronous?
2. How do face collections work and what are the use cases for SearchFacesByImage vs CompareFaces?
3. What is Rekognition Custom Labels and how does it differ from the standard label detection?
4. How would you build a content moderation pipeline for user-uploaded images using Rekognition and S3?
5. What are the privacy and legal concerns around using Rekognition facial recognition in production?
6. How does Rekognition Video handle live streaming vs stored video files?
7. What confidence threshold would you set for a security access control system vs a content recommendation system and why?