Ace Cloud Interviews

AWS Monitoring & Management

Systems Manager

Operational hub for patching, configuring, and managing EC2 and on-premises servers

AWS Systems Manager (SSM) is the operational management hub for AWS infrastructure, providing a unified interface to patch, configure, run commands, and manage EC2 instances and on-premises servers without SSH or RDP. It eliminates the need for bastion hosts and enables compliance-driven automation at scale across thousands of instances.

SSM Agent, IAM Roles, and Managed Instance Registration

Every instance-level capability in Systems Manager (Session Manager, Run Command, Patch Manager, State Manager) depends on three things: the SSM Agent running on the target instance, an IAM instance profile whose role includes the AmazonSSMManagedInstanceCore policy, and network access to the SSM service endpoints. The agent establishes an outbound HTTPS (port 443) connection to those endpoints - no inbound ports need to be open.

| Requirement | EC2 Instances | On-Premises Servers |
|---|---|---|
| Agent | Pre-installed on Amazon Linux 2 and Windows Server AMIs; manual install for others | Manual install required |
| IAM | EC2 instance profile with SSM permissions | Hybrid activation (creates managed instance ID) |
| Network | SSM, EC2Messages, SSMMessages endpoints (public or VPC endpoints) | Same - outbound HTTPS to endpoints |
| Operating Systems | Linux, Windows, macOS | Linux, Windows |

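
The IAM requirement for EC2 can be sketched as a small shell helper that creates a role EC2 can assume, attaches AmazonSSMManagedInstanceCore, and wraps the role in an instance profile. The function name and the role name are hypothetical, and reusing the role name for the instance profile is just a convention assumed here.

```bash
# Hypothetical helper: build an instance profile that lets SSM manage an EC2 instance.
# Usage: create_ssm_instance_profile ROLE_NAME
create_ssm_instance_profile() {
  local role_name="$1"

  # Trust policy: allow the EC2 service to assume the role.
  local trust_policy='{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"ec2.amazonaws.com"},"Action":"sts:AssumeRole"}]}'

  aws iam create-role \
    --role-name "$role_name" \
    --assume-role-policy-document "$trust_policy" &&
  aws iam attach-role-policy \
    --role-name "$role_name" \
    --policy-arn "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore" &&
  aws iam create-instance-profile \
    --instance-profile-name "$role_name" &&
  aws iam add-role-to-instance-profile \
    --instance-profile-name "$role_name" \
    --role-name "$role_name"
}

# Example (placeholder name):
#   create_ssm_instance_profile "my-ssm-role"
```

The profile still has to be attached to the instance, either at launch or afterwards, before the agent can register.
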
💡

VPC endpoints for SSM (com.amazonaws.region.ssm, com.amazonaws.region.ec2messages, com.amazonaws.region.ssmmessages) are required if your instances are in private subnets with no NAT Gateway. This is a common interview question about how Session Manager works in locked-down environments.
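
As a rough sketch of the tip above, the three interface endpoints could be created in a loop. The function name and all IDs are placeholders; it assumes a single subnet and one security group that allows inbound HTTPS (443) from the instances.

```bash
# Hypothetical helper: create the three interface endpoints SSM needs in a private subnet.
# Usage: create_ssm_vpc_endpoints REGION VPC_ID SUBNET_ID SECURITY_GROUP_ID
create_ssm_vpc_endpoints() {
  local region="$1" vpc_id="$2" subnet_id="$3" sg_id="$4" service

  for service in ssm ec2messages ssmmessages; do
    aws ec2 create-vpc-endpoint \
      --region "$region" \
      --vpc-id "$vpc_id" \
      --vpc-endpoint-type Interface \
      --service-name "com.amazonaws.${region}.${service}" \
      --subnet-ids "$subnet_id" \
      --security-group-ids "$sg_id" \
      --private-dns-enabled || return 1
  done
}

# Example (placeholder IDs):
#   create_ssm_vpc_endpoints us-east-1 vpc-0abc subnet-0abc sg-0abc
```

Private DNS on the endpoints lets the agent keep using the standard service hostnames; it requires DNS support and DNS hostnames to be enabled on the VPC.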

bash
# Verify SSM agent status on Linux
sudo systemctl status amazon-ssm-agent

# Check if instance is managed by SSM
aws ssm describe-instance-information \
  --filters "Key=InstanceIds,Values=i-0123456789abcdef0"

# Create a hybrid activation for an on-premises server (returns the activation ID and code used to register the agent)
aws ssm create-activation \
  --default-instance-name "on-prem-web-01" \
  --iam-role "AmazonEC2RunCommandRoleForManagedInstances" \
  --registration-limit 1 \
  --region us-east-1
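
Creating the activation is only half of hybrid registration: the returned activation ID and code then have to be passed to the agent on the server itself. A minimal sketch of that hand-off, assuming the agent is already installed on the server; the function name is hypothetical.

```bash
# Hypothetical helper: create a hybrid activation and print the command
# to run on the on-premises server to register its agent.
# Usage: print_register_command INSTANCE_NAME IAM_ROLE REGION
print_register_command() {
  local name="$1" role="$2" region="$3" activation id code

  # With --output text, the two queried values come back tab-separated on one line.
  activation=$(aws ssm create-activation \
    --default-instance-name "$name" \
    --iam-role "$role" \
    --registration-limit 1 \
    --region "$region" \
    --query "[ActivationId,ActivationCode]" \
    --output text) || return 1

  id=${activation%%[[:space:]]*}    # text before the first whitespace
  code=${activation##*[[:space:]]}  # text after the last whitespace

  printf 'sudo amazon-ssm-agent -register -code "%s" -id "%s" -region "%s"\n' \
    "$code" "$id" "$region"
}
```

The activation code is a credential, so in practice it should not end up in shell history or shared logs.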

Session Manager vs Run Command - Interactive vs Batch Operations

Session Manager and Run Command are the two most commonly used SSM features. Session Manager provides an interactive shell (or port forwarding) without SSH keys or open ports. Run Command executes scripts or documents across fleets of instances.

| Feature | Session Manager | Run Command |
|---|---|---|
| Interaction | Interactive shell session | One-shot script/command execution |
| Use case | Debugging, troubleshooting, port forwarding | Fleet-wide patching, config changes, app deployments |
| Output | Session logs to S3 or CloudWatch Logs | Command output per instance to S3 or console |
| Targeting | One instance at a time | Tags, resource groups, instance IDs - thousands at once |
| Concurrency | N/A | Configurable rate control (MaxConcurrency, MaxErrors) |
| Audit trail | Session history + full session log | Command invocation history |

Session Manager port forwarding is a powerful but underused feature. It tunnels a TCP port on a private instance, or on a remote host that the instance can reach, to your local machine through the SSM channel - useful for accessing RDS, Elasticsearch, or internal web apps from your laptop without a VPN.

bash
# Start a shell session (no SSH, no bastion host needed)
aws ssm start-session --target i-0123456789abcdef0

# Port forward RDS port to localhost:5432
aws ssm start-session \
  --target i-0123456789abcdef0 \
  --document-name AWS-StartPortForwardingSessionToRemoteHost \
  --parameters "host=my-db.cluster-xyz.us-east-1.rds.amazonaws.com,portNumber=5432,localPortNumber=5432"

# Run command across all instances tagged env=prod
aws ssm send-command \
  --targets "Key=tag:env,Values=prod" \
  --document-name "AWS-RunShellScript" \
  --parameters "commands=['systemctl restart nginx']" \
  --max-concurrency "20%" \
  --max-errors "5%"
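
Run Command is asynchronous, so the per-instance results have to be fetched separately using the command ID. A minimal sketch of that pattern, with a hypothetical function name; it assumes the shell command contains no single quotes, since it is embedded in the CLI shorthand syntax.

```bash
# Hypothetical helper: send a shell command to tagged instances and print per-instance status.
# Usage: run_on_tag TAG_KEY TAG_VALUE SHELL_COMMAND
run_on_tag() {
  local tag_key="$1" tag_value="$2" shell_command="$3" command_id

  # Capture only the command ID from the send-command response.
  command_id=$(aws ssm send-command \
    --targets "Key=tag:${tag_key},Values=${tag_value}" \
    --document-name "AWS-RunShellScript" \
    --parameters "commands=['${shell_command}']" \
    --max-concurrency "20%" \
    --max-errors "5%" \
    --query "Command.CommandId" \
    --output text) || return 1

  # One line per instance: instance ID and invocation status.
  aws ssm list-command-invocations \
    --command-id "$command_id" \
    --query "CommandInvocations[].[InstanceId,Status]" \
    --output text
}

# Example:
#   run_on_tag env prod "systemctl restart nginx"
```

Invocations can take a moment to appear and progress, so the listing may need to be repeated until every instance reaches a final status.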

Patch Manager, Patch Baselines, and Maintenance Windows

Patch Manager automates patching of managed instances with security-related updates and other types of updates. It uses Patch Baselines to define which patches are approved for installation and Maintenance Windows to define when patching occurs.

| Component | What It Does |
|---|---|
| Patch Baseline | Defines approved patches by severity, classification, CVE IDs, or individual patches. Default baselines exist per OS; you create custom ones for stricter control. |
| Patch Group | Tag-based grouping of instances (tag: Patch Group = production). Associates instances with a specific baseline. |
| Maintenance Window | Scheduled time window with max concurrency and error thresholds. Targets resources (instances) and runs tasks (patch, run command, Lambda, Step Functions). |
| Scan vs Install | Scan reports compliance without installing. Install mode applies approved patches and can reboot instances. |

⚠️

Most default AWS-managed patch baselines auto-approve critical and security patches 7 days after release. For production databases or stateful workloads, create a custom baseline with a longer approval delay or explicit patch approvals to avoid surprise reboots.
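
A custom baseline with a longer approval delay might look roughly like the sketch below. The function name, the choice of Amazon Linux 2, and the severity and classification filters are assumptions for illustration; other operating systems use different filter keys and values.

```bash
# Hypothetical helper: create an Amazon Linux 2 baseline with a custom approval delay
# and attach it to a patch group.
# Usage: create_delayed_baseline BASELINE_NAME APPROVE_AFTER_DAYS PATCH_GROUP
create_delayed_baseline() {
  local name="$1" delay_days="$2" patch_group="$3" baseline_id

  # Approve Critical/Important security patches only after the given number of days.
  baseline_id=$(aws ssm create-patch-baseline \
    --name "$name" \
    --operating-system "AMAZON_LINUX_2" \
    --approval-rules "PatchRules=[{PatchFilterGroup={PatchFilters=[{Key=SEVERITY,Values=[Critical,Important]},{Key=CLASSIFICATION,Values=[Security]}]},ApproveAfterDays=${delay_days}}]" \
    --query "BaselineId" \
    --output text) || return 1

  # Instances tagged with this patch group will use the new baseline.
  aws ssm register-patch-baseline-for-patch-group \
    --baseline-id "$baseline_id" \
    --patch-group "$patch_group"
}

# Example:
#   create_delayed_baseline "prod-db-baseline" 14 "production"
```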

Patch compliance state is reported per instance and visible in the Systems Manager Compliance dashboard and in AWS Config. You can create CloudWatch alarms on non-compliant instance counts and integrate this with your security reporting.
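
For a quick look at patch state from the CLI, something like the following sketch could be used; the function name is hypothetical and the instance IDs are placeholders.

```bash
# Hypothetical helper: print missing and failed patch counts for the given instances.
# Usage: patch_state_summary INSTANCE_ID [INSTANCE_ID ...]
patch_state_summary() {
  # Output columns: instance ID, missing patch count, failed patch count.
  aws ssm describe-instance-patch-states \
    --instance-ids "$@" \
    --query "InstancePatchStates[].[InstanceId,MissingCount,FailedCount]" \
    --output text
}

# Example:
#   patch_state_summary i-0123456789abcdef0
```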

Parameter Store - Configuration and Secret Management

Parameter Store provides secure, hierarchical storage for configuration data and secrets. It integrates natively with IAM for access control, CloudTrail for audit, and most AWS services (Lambda, ECS, EC2 UserData, CodeBuild) for secret injection.

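| | Standard Parameter | Advanced Parameter |
|---|---|---|
| Max value size | 4 KB | 8 KB |
| Cost | Free | $0.05/parameter/month |
| Parameter policies | No | Yes (expiration, notification) |
| Higher throughput | 40 TPS | 100 TPS |

Note: the throughput limit is a Parameter Store setting applied per account and Region (40 TPS by default) rather than a property of an individual parameter's tier.

| | Parameter Store (SecureString) | Secrets Manager |
|---|---|---|
| Cost | Free (standard params) | $0.40/secret/month |
| Automatic rotation | No (manual Lambda possible) | Yes - built-in for RDS, Redshift, DocumentDB |
| Cross-account access | Possible with CMK | Native cross-account support |
| Versioning | Labeled versions | Full version staging (AWSCURRENT, AWSPENDING) |
| Best for | App config, non-rotating secrets | Credentials that must rotate automatically |
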
bash
# Store a secret string
aws ssm put-parameter \
  --name "/myapp/prod/db-password" \
  --value "supersecretpassword" \
  --type SecureString \
  --key-id "alias/myapp-key"

# Retrieve all parameters under a path
aws ssm get-parameters-by-path \
  --path "/myapp/prod/" \
  --with-decryption \
  --recursive
💡

Use a hierarchical naming convention like /app-name/environment/parameter-name. This lets you use path-based IAM policies to grant a Lambda function access to all parameters under /myapp/prod/ without listing each one.
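
A path-based policy of the kind described in the tip could be generated as in the sketch below. The function name, Region, and account ID are placeholders. Listing both the path itself and everything beneath it is a cautious assumption, and SecureString parameters encrypted with a customer managed key would additionally need kms:Decrypt on that key.

```bash
# Hypothetical helper: print an IAM policy granting read access to one parameter path.
# Usage: parameter_path_policy REGION ACCOUNT_ID PATH   (PATH without a trailing slash)
parameter_path_policy() {
  local region="$1" account_id="$2" path="$3"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ssm:GetParameter",
        "ssm:GetParameters",
        "ssm:GetParametersByPath"
      ],
      "Resource": [
        "arn:aws:ssm:${region}:${account_id}:parameter${path}",
        "arn:aws:ssm:${region}:${account_id}:parameter${path}/*"
      ]
    }
  ]
}
EOF
}

# Example:
#   parameter_path_policy us-east-1 123456789012 /myapp/prod
```

The printed JSON can be attached to the Lambda function's execution role as an inline or managed policy.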

Automation Documents (Runbooks) and State Manager

SSM documents (often just called "documents"; Automation documents are also known as runbooks) are the automation primitives in Systems Manager. There are several document types for different use cases:

| Document Type | Use Case | Example |
|---|---|---|
| Command | Run scripts on instances | AWS-RunShellScript, AWS-RunPowerShellScript |
| Automation | Multi-step orchestration of AWS API calls + instance commands | AMI creation, EC2 restart with approval step, cross-account operations |
| Session | Define Session Manager preferences (logging, shell prefs) | AWS-StartSSHSession, AWS-StartPortForwardingSession |
| Policy | Enforce configuration compliance (used by State Manager) | Enforce CloudWatch agent config, install software |
| Package | Distribute and install software packages via Distributor | Install custom agents, security tools |

State Manager ensures instances maintain a defined configuration over time. Associations link a document to target instances and run on a schedule. If an instance drifts from the desired configuration (e.g., CloudWatch agent gets uninstalled), the next association run will re-apply it.
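
The CloudWatch agent example could be expressed as an association along these lines. This is a sketch only: the function name, the association name, and the daily schedule are assumptions.

```bash
# Hypothetical helper: keep the CloudWatch agent installed on tagged instances.
# Usage: ensure_cloudwatch_agent TAG_KEY TAG_VALUE
ensure_cloudwatch_agent() {
  local tag_key="$1" tag_value="$2"

  # The association re-runs the package install on a schedule, which re-applies
  # the agent if it has been removed since the last run.
  aws ssm create-association \
    --association-name "ensure-cloudwatch-agent" \
    --name "AWS-ConfigureAWSPackage" \
    --targets "Key=tag:${tag_key},Values=${tag_value}" \
    --parameters "action=Install,name=AmazonCloudWatchAgent" \
    --schedule-expression "rate(1 day)"
}

# Example:
#   ensure_cloudwatch_agent env prod
```

Unlike a one-off Run Command invocation, the association keeps applying on its schedule and also covers instances that match the tag later.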

Automation documents support approval steps, multi-account execution, and rate control. They're the right tool for operational runbooks that used to be manual SOPs - things like "rotate RDS credentials", "create golden AMI", or "respond to a security finding".
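
Starting a runbook from the CLI follows a start-then-poll pattern. A minimal sketch using the AWS-managed AWS-RestartEC2Instance runbook; the function name is hypothetical, and it assumes the caller's credentials are allowed to run the automation without a separate assume role.

```bash
# Hypothetical helper: restart an instance through a runbook and print the execution status.
# Usage: restart_with_runbook INSTANCE_ID
restart_with_runbook() {
  local instance_id="$1" execution_id

  # Start the automation and keep only the execution ID.
  execution_id=$(aws ssm start-automation-execution \
    --document-name "AWS-RestartEC2Instance" \
    --parameters "InstanceId=${instance_id}" \
    --query "AutomationExecutionId" \
    --output text) || return 1

  # Report the current status; repeat this call to follow progress.
  aws ssm get-automation-execution \
    --automation-execution-id "$execution_id" \
    --query "AutomationExecution.AutomationExecutionStatus" \
    --output text
}

# Example:
#   restart_with_runbook i-0123456789abcdef0
```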

🎯

Interview Focus Points

  1. How does Session Manager work and why is it preferred over SSH + bastion hosts in modern AWS environments?
  2. What three things does an EC2 instance need to be managed by Systems Manager?
  3. What VPC endpoints are required for SSM to work in a private subnet with no NAT Gateway?
  4. Explain the difference between Parameter Store and Secrets Manager - when would you use each?
  5. How does Patch Manager handle patching without causing downtime - what are Maintenance Windows?
  6. What is an SSM Association (State Manager) and how does it differ from Run Command?
  7. How would you use SSM Automation to create a self-healing runbook that restarts a service when a CloudWatch alarm fires?
  8. How do you audit who ran Session Manager sessions and what commands were executed?
  9. How would you use SSM Session Manager port forwarding to access an RDS database in a private subnet?