The SSH Key Management Nightmare
Every DevOps team has experienced these scenarios:
- An engineer leaves the company. How many servers still have their SSH key?
- An SSH key gets committed to Git. Now what?
- Compliance asks: "How do you audit SSH access?" Silence.
- A bastion host goes down. Production access: blocked.
Traditional SSH access creates cascading security and operational problems.
Security Risks of Traditional SSH
- Key lifecycle management: SSH keys are difficult to rotate and revoke at scale
- Persistent access: Stolen keys provide ongoing access until discovered
- No built-in MFA: Multi-factor authentication is not enforced by default and requires extra PAM or tooling configuration on every host
- Limited audit trail: Difficult to know who accessed what and when
Operational Overhead
- Bastion maintenance: Patching, monitoring, high availability
- Attack surface: Open port 22 invites brute-force attacks
- Key distribution: Getting keys to the right people at the right time
- Cost: $100-500/month per bastion host
TL;DR
SSH keys create cascading security problems: they get committed to Git, are difficult to rotate, lack built-in MFA, and leave limited audit trails. Bastion hosts add operational overhead and cost $100-500/month. AWS Systems Manager Session Manager eliminates both by providing IAM-based, zero-trust access without SSH keys, open ports, or bastion infrastructure.
The Transformation: Replace SSH key management and bastion hosts with SSM's IAM-controlled access. Users authenticate via AWS credentials with MFA, IAM evaluates tag-based policies, and sessions are logged to CloudTrail and CloudWatch.
Key Takeaways:
- Zero SSH keys, zero open ports - Access controlled through IAM policies. No port 22/3389 security group rules. Instant revocation by removing IAM permissions.
- IAM-based access with fine-grained control - Tag-based conditions restrict access by environment. Time-based policies for scheduled access. MFA requirements for production.
- Full audit trail for compliance - CloudTrail records every session start and termination along with the IAM principal. Commands and output are captured in CloudWatch Logs or S3 when session logging is enabled. Together these provide evidence for SOC 2, PCI-DSS, HIPAA.
- Cost savings: $1,109-$7,289 annually - Eliminate bastion hosts ($15-2,000/month), VPN solutions ($50-500/month), SSH key tools ($20-100/month). SSM is free.
Real Results: A mid-market SaaS company with 200 instances saved $18,000/year, reached 100% MFA enforcement, gained a full audit trail for SOC 2 Type II, and recovered 5 hours/week on access management.
Core Principle: Least privilege with context. Grant the minimum necessary access based on role, environment tags, time, and MFA status. Design for auditability from day one.
AWS Systems Manager Session Manager
AWS SSM Session Manager provides secure, auditable, zero-trust access to EC2 instances without SSH keys, open ports, or bastion hosts.
How It Works
The access flow is fundamentally different from SSH:
- User authenticates to AWS via IAM credentials (supports MFA, temporary credentials)
- IAM evaluates permissions based on policies, conditions, and tags
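- Session Manager brokers the session over an outbound HTTPS connection opened by the SSM Agent, so no inbound port is needed (traffic stays on the AWS network when VPC endpoints are used)
- SSM Agent on instance receives session and provides shell access
- CloudTrail records the session API calls, and session logging (when enabled) captures commands and output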
The Benefits
Zero SSH Keys
- No keys to generate, distribute, rotate, or revoke
- Access controlled entirely through IAM policies
- Instant revocation by removing IAM permissions
Zero Open Ports
- No security group rules for port 22 or 3389
- Instances have no inbound access from the internet
- Eliminates entire class of network-based attacks
Full Audit Trail
- CloudTrail logs every session start and termination with the calling IAM principal
- Commands and session output logged to CloudWatch Logs or S3 when session logging is enabled
- Complete compliance evidence for auditors
IAM-Based Access Control
- Use existing IAM roles, policies, and groups
- Tag-based conditions (e.g., only access instances tagged "dev")
- Require MFA for production access
Cost Savings
- Eliminate bastion hosts: $500-2000/month
- No SSH key management tools needed
- SSM Session Manager itself carries no additional charge for EC2 instances
IAM Policy Design Principles
The Principle: Least Privilege with Context
Design policies that grant minimum necessary access based on role and context. Use conditions to restrict access by environment, time, or source.
Role-Based Access Patterns
Developer Role: Development/Staging Only
Developers should access dev and staging environments but not production. Use tag-based conditions to allow ssm:StartSession on instances tagged Environment: dev or Environment: staging, deny access to production instances, and ensure users manage only their own sessions.
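A minimal sketch of the developer policy described above, built as a Python dictionary and serialized to JSON. The function name, the production tag value, and the use of ${aws:username} to scope session ownership (which assumes IAM users rather than federated roles) are illustrative assumptions, not settings from a specific deployment.

```python
import json

# Hypothetical tag value for production; adjust to your own tagging scheme.
PRODUCTION_TAG = "production"
INSTANCE_ARN = "arn:aws:ec2:*:*:instance/*"


def developer_policy(allowed_envs=("dev", "staging")):
    """Build an illustrative IAM policy document for developer access."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Allow sessions only on instances tagged with an allowed environment.
                "Sid": "StartSessionOnNonProd",
                "Effect": "Allow",
                "Action": "ssm:StartSession",
                "Resource": INSTANCE_ARN,
                "Condition": {
                    "StringEquals": {
                        "ssm:resourceTag/Environment": list(allowed_envs)
                    }
                },
            },
            {
                # An explicit Deny wins over any Allow granted elsewhere.
                "Sid": "DenyProduction",
                "Effect": "Deny",
                "Action": "ssm:StartSession",
                "Resource": INSTANCE_ARN,
                "Condition": {
                    "StringEquals": {
                        "ssm:resourceTag/Environment": [PRODUCTION_TAG]
                    }
                },
            },
            {
                # Session IDs start with the caller's name, so this limits
                # users to resuming or terminating their own sessions.
                "Sid": "ManageOwnSessions",
                "Effect": "Allow",
                "Action": ["ssm:TerminateSession", "ssm:ResumeSession"],
                "Resource": "arn:aws:ssm:*:*:session/${aws:username}-*",
            },
        ],
    }


print(json.dumps(developer_policy(), indent=2))
```

The printed JSON can be pasted into a customer-managed IAM policy. Federated or assumed-role setups may need a different policy variable in the session ARN.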
DevOps Role: Full Access with MFA
Operations teams need broader access but with additional controls. Allow access to all instances, require MFA for session start (use aws:MultiFactorAuthPresent condition), and allow viewing and managing all active sessions.
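One way the DevOps policy might look, again as a Python dictionary. The statement IDs and the exact set of session-management actions are assumptions chosen for illustration; the MFA check uses the aws:MultiFactorAuthPresent key named above.

```python
import json


def devops_policy():
    """Build an illustrative IAM policy: all instances, MFA required to start."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # The key is absent for long-term credentials, so the Allow
                # only matches callers who authenticated with MFA.
                "Sid": "StartSessionWithMfa",
                "Effect": "Allow",
                "Action": "ssm:StartSession",
                "Resource": "arn:aws:ec2:*:*:instance/*",
                "Condition": {
                    "Bool": {"aws:MultiFactorAuthPresent": "true"}
                },
            },
            {
                # View and manage every active session, not just one's own.
                "Sid": "ManageAllSessions",
                "Effect": "Allow",
                "Action": [
                    "ssm:DescribeSessions",
                    "ssm:GetConnectionStatus",
                    "ssm:ResumeSession",
                    "ssm:TerminateSession",
                ],
                "Resource": "*",
            },
        ],
    }


print(json.dumps(devops_policy(), indent=2))
```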
Security/Audit Role: Read-Only
Security teams need visibility without shell access. Allow DescribeSessions, GetConnectionStatus, DescribeInstanceInformation, explicitly deny StartSession and TerminateSession, and provide audit-only access for compliance.
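The read-only audit role could be sketched as follows. The action names come from the paragraph above; the statement IDs and the wildcard resource are assumptions for illustration.

```python
import json


def audit_policy():
    """Build an illustrative read-only IAM policy for security and audit users."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Visibility into sessions and managed instances.
                "Sid": "ReadOnlyVisibility",
                "Effect": "Allow",
                "Action": [
                    "ssm:DescribeSessions",
                    "ssm:GetConnectionStatus",
                    "ssm:DescribeInstanceInformation",
                ],
                "Resource": "*",
            },
            {
                # Explicit Deny keeps shell access off even if another
                # attached policy would allow it.
                "Sid": "NoShellAccess",
                "Effect": "Deny",
                "Action": ["ssm:StartSession", "ssm:TerminateSession"],
                "Resource": "*",
            },
        ],
    }


print(json.dumps(audit_policy(), indent=2))
```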
Condition Keys for Fine-Grained Control
Use IAM condition keys to add context to access decisions:
- Tag-based: ssm:resourceTag/Environment
- Time-based: aws:CurrentTime
- MFA-based: aws:MultiFactorAuthPresent
- Source IP: aws:SourceIp
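To show how these keys combine, here is a small sketch that assembles a single Condition block. Conditions in one block are ANDed, so all of them must hold. The helper name, CIDR range, tag value, and dates are placeholders, and aws:CurrentTime is compared against an absolute window here.

```python
import json


def build_condition(environment, source_cidr, not_before, not_after):
    """Combine tag, MFA, source IP, and time checks into one Condition block."""
    return {
        "StringEquals": {"ssm:resourceTag/Environment": [environment]},
        "Bool": {"aws:MultiFactorAuthPresent": "true"},
        "IpAddress": {"aws:SourceIp": [source_cidr]},
        # ISO 8601 timestamps bounding when access is allowed.
        "DateGreaterThan": {"aws:CurrentTime": not_before},
        "DateLessThan": {"aws:CurrentTime": not_after},
    }


# Placeholder values: a documentation CIDR range and an arbitrary window.
condition = build_condition(
    environment="production",
    source_cidr="203.0.113.0/24",
    not_before="2025-01-01T00:00:00Z",
    not_after="2025-01-31T23:59:59Z",
)
print(json.dumps(condition, indent=2))
```

The resulting block would sit under the Condition key of an Allow statement for ssm:StartSession.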
SSM Agent Prerequisites
The Principle: Agent-Based Security
SSM requires an agent running on each instance. This is your security foundation.
Pre-installed AMIs
- Amazon Linux 2 and 2023
- Windows Server 2016+
- Most recent AWS-published AMIs
Manual Installation
- Ubuntu/Debian: Install from AWS S3
- CentOS/RHEL: Install from AWS S3
- Verify status with systemctl (e.g., systemctl status amazon-ssm-agent)
IAM Instance Profile
- Instances need the AmazonSSMManagedInstanceCore managed policy
- This allows the agent to communicate with SSM endpoints
- Without this, instances won't appear in Session Manager
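As a rough illustration, the instance role behind the profile needs two pieces: a trust policy that lets EC2 assume the role, and the managed policy attached to it. The constant and function names below are hypothetical; the policy ARN follows the standard format for AWS managed policies.

```python
import json

# ARN of the AWS managed policy named above.
SSM_CORE_POLICY_ARN = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"


def ec2_trust_policy():
    """Trust policy allowing EC2 instances to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


print(json.dumps(ec2_trust_policy(), indent=2))
print("Attach:", SSM_CORE_POLICY_ARN)
```

The trust policy is what you would supply when creating the role; the managed policy is then attached to that role, and the role is wrapped in an instance profile.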
Network Requirements
- Instances need outbound HTTPS (443) to SSM endpoints
- Options: NAT Gateway, VPC Endpoints, or public IP
- VPC Endpoints recommended for private subnets (no internet required)
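For private subnets, the interface endpoints typically created for Session Manager follow a predictable naming scheme. The helper below is a sketch with a hypothetical function name. Treat the list as a starting point: newer agent versions may not need ec2messages, and logging to CloudWatch, S3, or KMS needs endpoints for those services too.

```python
# Service suffixes for the interface endpoints commonly used by SSM.
SSM_ENDPOINT_SERVICES = ("ssm", "ssmmessages", "ec2messages")


def ssm_endpoint_service_names(region):
    """Return VPC endpoint service names for the given region."""
    return [f"com.amazonaws.{region}.{svc}" for svc in SSM_ENDPOINT_SERVICES]


for name in ssm_endpoint_service_names("us-east-1"):
    print(name)
```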
Session Logging Configuration
The Principle: Complete Audit Trail
Every session should be logged for security and compliance. Configure logging to CloudWatch Logs or S3.
Logging Options:
- CloudWatch Logs: Real-time streaming, easy querying, integrates with alarms
- S3: Long-term storage, cost-effective, integrates with Athena for analysis
What Gets Logged:
- Session start/stop timestamps
- User identity (IAM principal)
- Source IP address
- All commands executed (if enabled)
- Command output (if enabled)
Enable KMS Encryption
Encrypt session logs with customer-managed KMS keys for sensitive environments.
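Session logging and encryption are configured through the Session Manager preferences document (SSM-SessionManagerRunShell). The sketch below builds that document's content. The log group, bucket, and key ARN are placeholders, and the function name and 20-minute idle timeout are assumptions for illustration.

```python
import json


def session_preferences(log_group, bucket, kms_key_id, idle_timeout_minutes=20):
    """Build illustrative content for the SSM-SessionManagerRunShell document."""
    return {
        "schemaVersion": "1.0",
        "description": "Session Manager preferences with logging and KMS",
        "sessionType": "Standard_Stream",
        "inputs": {
            # Long-term storage of session output.
            "s3BucketName": bucket,
            "s3KeyPrefix": "session-logs/",
            "s3EncryptionEnabled": True,
            # Near-real-time streaming of session output.
            "cloudWatchLogGroupName": log_group,
            "cloudWatchEncryptionEnabled": True,
            "cloudWatchStreamingEnabled": True,
            # Customer-managed key used to encrypt session data.
            "kmsKeyId": kms_key_id,
            # The document expects the timeout as a string of minutes.
            "idleSessionTimeout": str(idle_timeout_minutes),
        },
    }


# Placeholder names; substitute your own resources.
prefs = session_preferences(
    log_group="/ssm/session-logs",
    bucket="example-session-logs-bucket",
    kms_key_id="arn:aws:kms:us-east-1:111122223333:key/EXAMPLE",
)
print(json.dumps(prefs, indent=2))
```

The instance role also needs permission to write to the chosen log group and bucket and to use the KMS key; those grants are not shown here.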
Migration Strategy: SSH to SSM
Phase 1: Parallel Run (Weeks 1-2)
Goal: Run SSM alongside existing SSH access
Tasks:
- Install SSM agent on all instances
- Grant IAM policies to DevOps team
- Train team on SSM access methods
- Monitor usage: SSM vs. SSH
Success Metrics:
- 100% of instances managed by SSM
- 50%+ of access via SSM
Phase 2: Gradual Cutover (Weeks 3-4)
Goal: Shift majority of access to SSM
Tasks:
- Remove SSH keys from non-emergency accounts
- Keep break-glass SSH access for emergencies only
- Migrate automation scripts to use SSM
- Update runbooks and documentation
Success Metrics:
- 90%+ of access via SSM
- Zero SSH-related incidents
Phase 3: Complete Migration (Weeks 5-6)
Goal: Eliminate SSH access entirely
Tasks:
- Remove security group rules for port 22
- Decommission bastion hosts
- Revoke all SSH keys
- Enable comprehensive session logging
Success Metrics:
- 100% SSM adoption
- Bastion costs eliminated
- Full audit trail in place
Cost Savings Analysis
Before: Traditional SSH + Bastion
| Item | Monthly Cost |
|---|---|
| Bastion host (t3.small) | $15 |
| Network Load Balancer for high availability | $16 |
| Elastic IP | $3.65 |
| VPN solution | $50-500 |
| SSH key management tools | $20-100 |
| Total | $104.65-634.65 |
After: SSM Session Manager
| Item | Monthly Cost |
|---|---|
| SSM Session Manager | FREE |
| VPC Endpoints (optional) | $7.20 |
| CloudWatch Logs (optional) | $5-20 |
| Total | $12.20-27.20 |
Annual Savings: $1,109 - $7,289
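The annual figure follows directly from the two tables: subtract the "after" monthly total from the "before" total at each end of the range and multiply by 12. A short sketch of that arithmetic, with item labels abbreviated and amounts copied from the tables:

```python
from decimal import Decimal as D

# (low, high) monthly cost per item, copied from the tables above.
BEFORE = {
    "Bastion host": (D("15"), D("15")),
    "Load balancer": (D("16"), D("16")),
    "Elastic IP": (D("3.65"), D("3.65")),
    "VPN solution": (D("50"), D("500")),
    "SSH key management tools": (D("20"), D("100")),
}
AFTER = {
    "SSM Session Manager": (D("0"), D("0")),
    "VPC Endpoints": (D("7.20"), D("7.20")),
    "CloudWatch Logs": (D("5"), D("20")),
}


def monthly_range(items):
    """Sum the low and high ends of each item's monthly cost."""
    low = sum(lo for lo, _ in items.values())
    high = sum(hi for _, hi in items.values())
    return low, high


def annual_savings(before, after):
    """Annual savings at the low and high ends of the range."""
    b_low, b_high = monthly_range(before)
    a_low, a_high = monthly_range(after)
    return (b_low - a_low) * 12, (b_high - a_high) * 12


low, high = annual_savings(BEFORE, AFTER)
print(f"Annual savings: ${low:,.2f} - ${high:,.2f}")
```

Note that this pairs the low ends together and the high ends together, which is how the range in the text is derived.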
Real-World Success Story
Company: Mid-market SaaS, 200 EC2 instances
Challenge: SSH key sprawl, compliance gaps, costly bastion infrastructure
Solution: Complete migration to SSM Session Manager
Results:
- Security: 100% MFA enforcement, zero SSH keys
- Compliance: Full audit trail for SOC 2 Type II
- Cost: $18,000/year saved (bastion + VPN)
- Productivity: 5 hours/week saved on access management
Advanced Use Cases
Port Forwarding for Database Access
SSM supports port forwarding to access private resources:
- Forward RDS port through an EC2 instance
- Access databases without exposing them to the internet
- CloudTrail records who opened each tunnel and through which instance
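As an illustration, port forwarding to a private database goes through the AWS-StartPortForwardingSessionToRemoteHost document. The helper below only assembles the AWS CLI arguments. The function name, instance ID, and database hostname are placeholders, and running the command requires the AWS CLI with the Session Manager plugin installed locally.

```python
import json
import shlex


def port_forward_command(instance_id, remote_host, remote_port, local_port):
    """Build the AWS CLI argument list for a remote-host port forward."""
    parameters = {
        "host": [remote_host],
        "portNumber": [str(remote_port)],
        "localPortNumber": [str(local_port)],
    }
    return [
        "aws", "ssm", "start-session",
        "--target", instance_id,
        "--document-name", "AWS-StartPortForwardingSessionToRemoteHost",
        "--parameters", json.dumps(parameters),
    ]


# Placeholder instance ID and RDS hostname.
cmd = port_forward_command(
    "i-0123456789abcdef0",
    "mydb.example.us-east-1.rds.amazonaws.com",
    5432,
    15432,
)
print(shlex.join(cmd))
```

With the tunnel open, a local client would connect to localhost on the chosen local port. The list form can also be passed to subprocess.run without shell quoting concerns.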
Fleet-Wide Command Execution
Run commands across multiple instances simultaneously:
- Patch all web servers in one command
- Collect diagnostics from an entire fleet
- Enforce configuration changes immediately
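Fleet-wide execution is handled by SSM Run Command rather than an interactive session. The sketch below builds the request arguments one might pass to boto3's ssm.send_command. The function name, tag key and value, shell command, and concurrency limits are assumptions for illustration.

```python
def send_command_request(tag_key, tag_value, commands):
    """Build keyword arguments for an SSM SendCommand call targeting a tag."""
    return {
        "DocumentName": "AWS-RunShellScript",
        # Target every managed instance carrying the given tag.
        "Targets": [{"Key": f"tag:{tag_key}", "Values": [tag_value]}],
        "Parameters": {"commands": list(commands)},
        # Roll out gradually and stop after the first failure.
        "MaxConcurrency": "10%",
        "MaxErrors": "1",
    }


# Hypothetical tag and command for patching web servers.
request = send_command_request("Role", "web", ["sudo yum update -y"])
print(request)
```

With boto3 installed and credentials configured, the call would look like boto3.client("ssm").send_command(**request).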
IDE Integration
Configure your IDE to use SSM for remote development:
- VS Code Remote SSH with SSM
- JetBrains Gateway through SSM tunnels
- Same IAM-controlled, no-inbound-port access as interactive sessions (session content logging does not cover SSH or port forwarding sessions)
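IDE tools that speak SSH are usually pointed at SSM through a ProxyCommand entry in the local SSH config, using the AWS-StartSSHSession document. The generator below is a sketch: the function name and default user are assumptions, and this approach still relies on an SSH key or other SSH authentication on the instance itself.

```python
def ssm_ssh_config(user="ec2-user"):
    """Return an SSH config stanza that tunnels SSH through Session Manager."""
    proxy = (
        'sh -c "aws ssm start-session --target %h '
        "--document-name AWS-StartSSHSession "
        "--parameters 'portNumber=%p'\""
    )
    lines = [
        "# Route SSH to instance IDs through Session Manager",
        "Host i-* mi-*",
        f"    User {user}",
        f"    ProxyCommand {proxy}",
    ]
    return "\n".join(lines) + "\n"


print(ssm_ssh_config())
```

Appending the output to ~/.ssh/config lets an IDE connect with a host name such as an instance ID, provided the AWS CLI and Session Manager plugin are installed locally.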
Security Best Practices
- Always Enable MFA for production access
- Use VPC Endpoints for fully private communication
- Enable Session Logging to S3 and CloudWatch
- Implement Least Privilege with tag-based conditions
- Regular Audits of session access patterns
- Session Timeout for idle sessions
- Encryption with customer-managed KMS keys
Troubleshooting Principles
Instance Not Appearing in Session Manager
Check in order:
- SSM Agent running? (systemctl status amazon-ssm-agent)
- IAM Instance Profile attached with correct policy?
- Network connectivity to SSM endpoints?
- Instance in correct region/account?
Access Denied Errors
Check in order:
- User has ssm:StartSession permission?
- Resource ARN matches policy?
- Conditions satisfied (tags, MFA)?
- Session document permissions?
Next Steps
Ready to eliminate SSH keys and bastion hosts from your infrastructure?
At ZSoftly, we help businesses implement secure, zero-trust access:
- Migration planning and execution
- IAM policy design and implementation
- Session logging and monitoring setup
- Team training and documentation
Get started today:
- Email: info@zsoftly.com
- Phone: +1 (343) 503-0513
- Website: zsoftly.com
Next in series: "Bypassing VPN Failure with AWS SSM Port Forwarding"
