AWS re:Invent 2016: Evolving an Enterprise-Level Compliance Framework with Amazon CloudWatch Events...
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Chris McCurdy, AWS Solutions Architect Specialist
Alan Nihill, Johnson & Johnson DevOps Engineer
December 2, 2016
Evolving an Enterprise-Level Compliance Framework with Amazon CloudWatch Events and AWS Lambda
SAC311
What to Expect from the Session
• Why the need for guardrails and compliance frameworks?
• What are some patterns customers are trying?
• What has Johnson & Johnson learned from the compliance engine that they discussed at last year’s re:Invent?
• Where is Johnson & Johnson evolving their engine?
Why guardrails and compliance frameworks?
Shared Responsibility Model
Customers: Security in the Cloud
• Customer content
• Platform, Applications, Identity & Access Management
• Operating System, Network, & Firewall
• Client-side Encryption Implementation, Server-side Encryption, Network Traffic Protection
AWS: Security of the Cloud
• AWS Foundation Services: Compute, Storage, Database, Networking
• AWS Global Infrastructure: Regions, Availability Zones, Edge Locations
• Certifications, Assurances and Attestations
Observed Compliance Framework Patterns
API Sandwich
Periodic Describe
Event-Driven Workflow
Predefined Resource
API Sandwich Workflow
[Diagram: Compute Services (Amazon EC2, AWS Lambda), API, Elastic Load Balancing, Proxy Services, Proxies (Amazon EC2), Auto Scaling, Amazon SQS, Audit Trail (Amazon S3), Rules Store (Amazon DynamoDB), AWS Services (Amazon EMR)]
API Sandwich Workflow
Use cases:
• Possible regulation
Advantages:
• Control over each API call
• Simple architecture
Disadvantages:
• Need to update API proxy as AWS adds new services
• Cost of maintaining fleet of API proxy servers
• Indirect service access causes variability
• Upgrading rules is challenging
Periodic Describing Workflow
[Diagram: Compute Services (EC2, Lambda), AWS Services (Amazon EMR), Amazon SQS, Compliance Enforcement Layer: Resource Describer (Amazon EC2), Amazon SQS, Enforcement Engine, Audit Trail (Amazon S3), Rules Store (Amazon DynamoDB)]
Periodic Describing
Use cases:
• Resources outside of AWS
• Application configuration compliance
Advantages:
• Direct service calls
• Easy to add new rules
Disadvantages:
• API limits: Number of Resources * Period = Account API Call Overhead
• Cost of compliance instances
• Out-of-compliance activity possible until describe runs
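The API call overhead named in the disadvantages can be made concrete with a little arithmetic. The sketch below reads the slide's "Period" as the number of describe runs per day and assumes one API call per resource per run. Both are assumptions, and the function name and parameters are illustrative only.

```python
def describe_calls_per_day(num_resources, period_minutes, calls_per_resource=1):
    """Estimate daily describe calls for a polling compliance checker.

    Assumption: every resource is described once per run, with no batching.
    """
    runs_per_day = (24 * 60) // period_minutes  # whole runs that fit in one day
    return num_resources * calls_per_resource * runs_per_day


if __name__ == "__main__":
    # Compare a 5-minute and a 60-minute describe period for 2,000 resources.
    for period in (5, 60):
        print(period, describe_calls_per_day(2000, period))
```

Under these assumptions, 2,000 resources polled every 5 minutes cost 576,000 calls per day, against 48,000 when polled hourly. A shorter period narrows the out-of-compliance window but eats into the account's API limits.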
Event-Driven Workflow
[Diagram: Compute Services (EC2, Lambda), AWS Services (Amazon EMR), Amazon SQS, Amazon CloudWatch Events, Event Enrichment (AWS Lambda), Amazon SQS, Compliance Enforcement Layer: Enforcement Engine, Amazon SQS, Audit Trail (Amazon S3), Rules Store (Amazon DynamoDB)]
Event-Driven Workflow
Use cases:
• Resources inside of AWS
• Applications that generate CloudWatch Events events
Advantages:
• Direct service calls
• Action taken in milliseconds instead of minutes
• Lambda costs substantially less than similar alternatives
Disadvantages:
• More complicated architecture
• If using CloudWatch Events, describes are still required
Predefined Resource Workflow
[Diagram. Actors and services: Security/Compliance Admin, Developers, AWS CloudFormation template, AWS Service Catalog, AWS CloudFormation stack, AWS CodeCommit, AWS CloudTrail (logs all API calls), Amazon S3, AWS Config (track changes), Amazon CloudWatch alarm. Numbered steps: 1 Define, 2 Publish, 3 Git push, 4 Browse and launch, 5 Provisions, 8 Monitors, 10 Initiates, 11 Monitors, 12 Notifies.]
Predefined Resource Workflow
Use cases:
• Validated systems
Advantages:
• Environment conformity
• High RI utilization and management
• Only defined activity possible
Disadvantages:
• Less development freedom
http://amzn.to/2cHDDuN
Johnson & Johnson
A Global Health Care Leader
• 250 Operating Companies
• 60 Countries
• 126,900 Employees
• $70B Sales
Big Company, Big Challenges
Complex IT Operations
Regulated Environment
Demand Forecasting
Virtual Private Cloud
Virtual Private Cloud Vision
Enable Agility
Enforce Policy
Accelerate Best Practices
Self-Service
Enterprise Control
Core Principles
Least Privilege
Account Isolation
J&J Network
J&J Identities
Verbose Logging
Preventative Controls
Detective Controls
Approved VPCs
Logging Enabled
Encryption
Segregation of Duties
Networking
AD Integration
Backups & Monitoring
IAM Whitelist Policies
Architecture
xbot
Policy Enforcement
Administration
Database
Console
Billing
Active Directory
Ticketing
Previous Design
Tests
Queue Tests
Metadata
xbot
App. Account 1
App. Account 2
App. Account n
EC2 Amazon RDS S3
EMR S3
EC2 RDS S3
Design Considerations
Distributed
Centralized
Current Design – App. Account
S3
AWS Identity and Access Management
EMR
EC2
RDS
Amazon SNS
CloudWatch
Rules
AWS CloudFormation Stack
Sample CloudWatch Rule
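The rule shown on this slide did not survive in the transcript. As a rough illustration only, the sketch below builds a CloudWatch Events event pattern that would select EC2 `RunInstances` calls recorded by CloudTrail. The pattern contents are an assumption, not the speakers' actual rule, and `matches` is a simplified local stand-in for the service's matching logic, written only to show how a pattern relates to an event.

```python
import json

# Assumed pattern: match EC2 RunInstances API calls delivered via CloudTrail.
RUN_INSTANCES_PATTERN = {
    "source": ["aws.ec2"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {"eventName": ["RunInstances"]},
}


def matches(pattern, event):
    """Simplified matcher: lists hold allowed values, dicts are matched recursively."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


if __name__ == "__main__":
    # Print the pattern as the JSON document a rule definition would carry.
    print(json.dumps(RUN_INSTANCES_PATTERN, indent=2))
```

A pattern like this would typically be placed in the rule definition deployed to each application account, with the rule's target set to that account's SNS topic.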
Current Design
Account 1
SNS Topic
Audit
Queue
Tests
Queue
Events
Queue Events
Tests
Audit
Elastic Load Balancing
/project
/user
/<service>
Account 2
Account n
SNS Topic
SNS Topic
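The diagram shows per-account SNS topics feeding central queues. If those queues are Amazon SQS queues subscribed to the topics without raw message delivery (an assumption; the slides do not say), each queue message body is an SNS JSON envelope whose "Message" field carries the original event as a string. A minimal sketch of unwrapping it, with a hypothetical helper name:

```python
import json


def unwrap_sns_message(sqs_body):
    """Return the CloudWatch event dict carried inside an SQS message body.

    Handles both the SNS notification envelope and raw delivery, where the
    body is already the event JSON.
    """
    envelope = json.loads(sqs_body)
    if envelope.get("Type") == "Notification" and "Message" in envelope:
        # The original event is a JSON string nested inside the envelope.
        return json.loads(envelope["Message"])
    return envelope


if __name__ == "__main__":
    event = {"source": "aws.ec2", "region": "us-east-1"}
    body = json.dumps({"Type": "Notification", "Message": json.dumps(event)})
    print(unwrap_sns_message(body))
```

Handling both shapes in one place keeps the downstream event handlers independent of how the subscription is configured.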
Sample Event
{
  "account":"111122223333",
  "region":"us-east-1",
  "detail-type":"AWS API Call via CloudTrail",
  "source":"aws.ec2",
  "time":"2016-06-21T18:22:18Z",
  "id":"f1cbb72b-cc0d-4eec-a521-dc3cdd088446",
  "detail":{
    "eventVersion":"1.03",
    "eventID":"91b4db10-999d-4ba3-a008-d7c485c8bd60",
    "eventTime":"2016-06-21T18:22:18Z",
    "awsRegion":"us-east-1",
    "eventName":"RunInstances",
    "responseElements":{
      "reservationId":"r-59a8908c",
      "instancesSet":{
        "items":[
          {
            "vpcId":"vpc-62d83407",
            "interfaceId":"interface-b23aacf0",
            "instanceId":"i-722e9c37",
            "imageId":"ami-a4827dc9",
            "subnetId":"subnet-d1c4c2a5",
Sample Event Code
import logging

logger = logging.getLogger(__name__)


def handle_ec2_event(project_id, event):
    region = event.get('region')
    try:
        # Instances affected by the call, plus the API action that raised the event.
        instances = event['detail']['responseElements']['instancesSet']['items']
        event_type = event['detail']['eventName']
    except Exception:
        # The event carries no instance set, so there is nothing to test.
        return
    # project_region_server_tests is defined elsewhere in the speakers' code base.
    project_region_server_tests(project_id=project_id, region=region,
                                instances=instances, event_type=event_type)


def handle_event(project_id, event):
    try:
        source = event.get('source')  # e.g. 'aws.ec2'
        service = source.split('.')[1]
        if service == 'ec2':
            handle_ec2_event(project_id=project_id, event=event)
    except Exception as error:
        logger.error('Failed to handle an event for {}. Error: {}. Event: '
                     '{}'.format(project_id, error, event.get('id')))
Sample Test Code
class ProjectRegionServerEnforcementTest(ProjectRegionServerTest):

    @terminate
    def test_in_valid_vpc(self):
        self.assertIn(self.instance.vpc_id, self.valid_vpc_ids)

    @terminate
    def test_uses_valid_ami(self):
        self.assertIn(self.ami_id, self.valid_images)
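The slides do not show how `@terminate` is implemented. One plausible reading is that the decorator attaches an enforcement action to a test, so that a failed assertion triggers termination of the offending instance. The sketch below illustrates that idea under stated assumptions: the decorator body, `FakeServerTest`, `run_check`, and the recorded action tuple are all hypothetical, and the real engine would call an AWS API rather than append to a list.

```python
import functools


def terminate(test_func):
    """Hypothetical decorator: record a terminate action when the check fails."""
    @functools.wraps(test_func)
    def wrapper(self, *args, **kwargs):
        try:
            return test_func(self, *args, **kwargs)
        except AssertionError:
            # Stand-in for a real enforcement call against the instance.
            self.actions.append(('terminate', self.instance_id))
            raise
    return wrapper


class FakeServerTest(object):
    """Minimal stand-in for a per-instance compliance test case."""
    valid_vpc_ids = ['vpc-62d83407']

    def __init__(self, instance_id, vpc_id):
        self.instance_id = instance_id
        self.vpc_id = vpc_id
        self.actions = []

    @terminate
    def test_in_valid_vpc(self):
        assert self.vpc_id in self.valid_vpc_ids, 'instance is outside approved VPCs'


def run_check(test):
    """Run the check and report whether the instance is compliant."""
    try:
        test.test_in_valid_vpc()
        return True
    except AssertionError:
        return False
```

Keeping the action in a decorator separates what is checked from what happens on failure, so the same assertion could be paired with a milder action such as a notification.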
Lessons Learned
• Zero, one, or infinity rule
• Keep your code and application modular
• Use PaaS, avoid technical debt
• Differentiate between test frequencies
Demo
Thank you!
Remember to complete your evaluations!