(ENT211) Migrating the US Government to the Cloud | AWS re:Invent 2014
DESCRIPTION
The US government has built hundreds of applications that must be refactored to take advantage of modern distributed systems. This session discusses EzBake, an open-source, secure big data platform deployed on top of Amazon EC2 and using Amazon S3 and Amazon RDS. This solution has helped speed the US government to the cloud and make big data easy. Furthermore, this session discusses critical architecture design decisions made during the creation of the platform in order to add additional security, leverage future AWS offerings, and cut total operations and maintenance costs. Sponsored by CSC
TRANSCRIPT
November 13, 2014 | Las Vegas
Matt Carroll, CTO, Defense & Intelligence
CSC
The problem
• Apps: over 400 apps within its enterprise
• Data: over 1,000 active data sources consuming data on the order of TBs daily
• Users: the network supports over 230,000 daily users with mission and business needs
• Network: multiple networks deployed worldwide on multiple continents
• Security: every capability runs through a lengthy certification and accreditation process (4–6 months)
• Metrics: disparate activities across apps and data have left little quantitative data

We faced a highly complex environment for a US Government customer that
had a large dependency on legacy systems with a need to modernize quickly
Customer challenges
Budget
• Not enough money to transition every app to take
advantage of Big Data or a distributed system
• Outsourced IaaS needs to be monitored for accounting,
security, scale, etc., without complex software
• Application elasticity is critical to understanding the true
costs of operations and maintenance
• Storage (data) is a much bigger cost than expected
• Need to consolidate systems engineering support
While we faced many challenges it became clear early on that budget and
ease of integration for apps must be our two driving forces
App migration is not simple
• Most apps are CRUD-based: write a report, find a report
• Security business logic is baked into each app
• Number one question: Why can’t I choose the technology that
best fits my app?
• Cannot disrupt operations by any means!
• Applications must reside on multiple networks and work
together
• It takes too long to get started: laying down databases, web
tiers, etc.
Security is the ultimate killer of time
The process around security became complicated, burdensome, and still insufficient to counter threats at scale
Our mission
Our mission is to facilitate Big Data analytics across the enterprise by
providing the tools necessary to align the work of the application engineer,
analytic developer, and data scientist — freeing them to focus on end
products, not infrastructure; we provide this through EzBake
• Big Data should be easy
• Big Data should drive insight
• Big Data should be ubiquitous
• Big Data should be secure
EzBake
It’s all about making application transition easier! Rather than assembling your
own big data stack, EzBake provides an integrated way to compose the different
elements of your application: collecting, processing, storing, and querying data.
Ease of application development
• Time to market of apps and reuse
• Autodeployment and high-availability scaling
• Integrated analytics and audit trails for logs,
metrics, data access, and security events
Built-in security layer
• Role-based access and complex policies
• Down to the object / cell-level controls
• Encryption in transit
Data layer
• Ubiquitous data access (no stovepipes!)
• Simplified streaming / batch analytics
• Tailorable and technology agnostic
• Abstracted index patterns
[Architecture diagram: custom applications sit on an execution layer (stream, batch, query, events, and more), a data layer over physical databases (MongoDB, Accumulo, PostgreSQL (RDS), Redis, HBase, Elasticsearch, Titan, plus custom stores), and a security layer.]
Key features
• Streaming ingest (Frack): an interface for building data flow topologies which abstract physical stream processors
• Common services: scaled and commonly used Thrift services, typically used during streaming ingest
• Data persistence: indexing patterns exposed as Thrift services, abstracting the physical database
• Distributed query: both direct access to indices and aggregate query across the various data sets
• Batch analytics: Amazon Elastic MapReduce (EMR) abstractions that enable complex, multidimensional discovery
• Security: both at the data persistence and user access layers
• Deployment: automated elasticity through a GUI-based deployment
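The streaming-ingest feature above describes an interface for building data flow topologies that abstract physical stream processors. A minimal sketch of that idea follows; the class and method names are hypothetical, and a real implementation would bind each stage to an actual stream processor rather than a Python list.

```python
# Illustrative sketch of a data-flow topology interface: stages are chained
# and applied to each record, hiding the physical stream processor from the
# application. All names here are hypothetical, not EzBake's actual API.

class Topology:
    def __init__(self):
        self.stages = []

    def add_stage(self, fn):
        """Append a processing stage; returns self to allow chaining."""
        self.stages.append(fn)
        return self

    def run(self, records):
        """Push records through every stage in order."""
        for fn in self.stages:
            records = [fn(r) for r in records]
        return records

# Build a two-stage topology: trim whitespace, then normalize case.
t = Topology().add_stage(str.strip).add_stage(str.upper)
print(t.run(["  hello ", "world"]))  # ['HELLO', 'WORLD']
```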
Technology agnostic
• Instead of a jack-of-all-trades index for
free text search, geospatial search, etc., use
mission-specific indices for specific
application logic needs
• Focus on storage patterns rather than database-specific
operations, thereby enforcing data
access standards across the enterprise
• Allow for new cartridges for web frameworks
including Node.js, Python, Ruby, etc.
Each app has its own needs, and it is not the platform builder's place to force the
team into a particular technology, but rather to offer a solution that meets the use case
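The storage-pattern idea (apps program against an abstract pattern, and a cartridge binds it to a physical store) might be sketched as follows. The interface, the in-memory backend, and every name here are hypothetical stand-ins, not EzBake's actual API.

```python
# Hedged sketch of a "storage pattern": application code depends only on
# the abstract pattern, so the backing database can be plugged and played.
from abc import ABC, abstractmethod


class KeyValuePattern(ABC):
    """A storage pattern: apps see put/get, never a database driver."""

    @abstractmethod
    def put(self, key, value): ...

    @abstractmethod
    def get(self, key): ...


class InMemoryCartridge(KeyValuePattern):
    """Stand-in backend; a real cartridge would wrap MongoDB, Redis, etc."""

    def __init__(self):
        self._store = {}

    def put(self, key, value):
        self._store[key] = value

    def get(self, key):
        return self._store.get(key)


def save_report(store, report_id, body):
    # App logic uses the pattern only; swapping the database
    # does not change this function.
    store.put(report_id, body)


store = InMemoryCartridge()
save_report(store, "r-1", {"title": "status"})
print(store.get("r-1"))  # {'title': 'status'}
```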
Easy to deploy and secure
The platform provisions and scales, like classic PaaS, and embeds
data layer connections and security on Amazon EC2
• Developers pull down a sandbox from the collaboration
environment to develop on their local box
• App / service is output as a WAR and YML file
(buildpack)
• The app registration page allows engineers to deploy
and register apps, data feeds, and services on the
platform
• EzDeployer supports dynamic resource management for
all hosted capabilities and provisions through Amazon
Elastic Compute Cloud (EC2)
App registration
• Applications carry role-based access controls with human-
inserted deployment authorization
• Registration to include data feeds, services, batch jobs, and
intents.
• Ability to assign other users as admin controllers through
AWS Identity and Access Management (IAM) controls or
other IdAM
• Cuts down time to deploy and removes the need for app
developers to write Puppet scripts
• Built-in account management policies for financial tracking
of PaaS and IaaS costs
Deploy with buildpacks securely through the application registration page
and provide elasticity as a service by abstracting Amazon EC2 services
Lab76: Collaborative development
• Speed start of development from weeks to hours by
enabling a truly agile development environment
• GitLab was exposed for source control, promoting
the sharing of code across the enterprise through
governance and oversight
• Customized RedMine was exposed for task
management and to allow task oversight and alignment
• DevOps could clone an Amazon Virtual Private Cloud
and stand up new environments in a day vs. months of
setting up for each app or system
The key to speeding transition was to remove redundancy; by providing a one-stop
shop for dev tools (Git, RedMine, Jenkins), a means to share code and common
development environments, we gained months back from each development team
Leveraging a data layer on SQL and NoSQL the platform abstracts physical
data stores and promotes storage patterns to enable ease of sharing, force
object-level security, and provide the ability to plug and play databases
Breaking-down disparate data stores
• That’s not to say we implement Big Data SQL
• Instead, we have the model that binds app development,
Big Data, and security
• Focus developers towards database abstractions extensible
to any database
So what?
• Move to production with Big Data without impacting the existing
SQL-based production architecture (think PostgreSQL to RDS)
• Brings data together across the enterprise helping customers
with disparate engineering teams build to a standard
Distributed query
We distribute object-specific queries across disparate data sets exposed through
the data layer while controlling access through the service and at the data level
• Migrate off legacy data stores without disrupting production
instances
• Focus on object-based queries across many data sets as
well as across Amazon VPCs within an environment
• Worked with Cloudera to modify Impala to run against multiple
data stores
• Common access controls across multiple data sets
So what?
• Common method to discover data across many apps, great
for BI tools and third-party apps like Palantir, Tableau, etc.
• Decreases the duplication of storage across the enterprise
through common indexing patterns
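The distributed query above (one object-level query fanned out across disparate stores, with common access controls applied before anything is returned) can be sketched roughly as follows. The store layout, the tag-based access check, and all names are illustrative assumptions, not the actual implementation.

```python
# Sketch of a fan-out "distributed query": the same object query runs
# against several backing stores, results are merged, and a common
# access-control filter is applied at the data level.

def query_store(store, predicate):
    """Stand-in for querying one physical data store."""
    return [obj for obj in store if predicate(obj)]


def distributed_query(stores, predicate, user_auths):
    results = []
    for store in stores:
        for obj in query_store(store, predicate):
            # Common access control: the object's tag must be held by the user.
            if obj["tag"] in user_auths:
                results.append(obj)
    return results


# Two toy "data stores" standing in for, e.g., MongoDB and Elasticsearch.
mongo_like = [{"id": 1, "type": "report", "tag": "X"}]
es_like = [{"id": 2, "type": "report", "tag": "R"},
           {"id": 3, "type": "image", "tag": "X"}]

hits = distributed_query([mongo_like, es_like],
                         lambda o: o["type"] == "report",
                         user_auths={"X", "Y"})
print([h["id"] for h in hits])  # [1]; the R-tagged report is filtered out
```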
Security becomes an API
• All data is encrypted in transit
• All transactions are authorized by the
security service
• All data is secured at the object level
• Robust security service that scales
horizontally and generates authorization
tokens based on external IdAM properties
• Internal group management service scales
to trillions of groups and beyond
• Compressed bitvector representation of data
visibility and access authorizations speeds
security computations
Following several zero-day attacks, the enterprise is waking up to security but
has no understanding of how to secure its Big Data platforms, a major
reason many are not in production.

Query example: Bob has authorizations X, Y, and Z. The data is tagged as X, Y,
and R. Sorry Bob! Only X and Y for you!
Object-level security across all data stores through a
common API will provide dramatic efficiencies as it
decreases time to model data across multiple data
stores
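The compressed-bitvector approach described above can be sketched as: map each visibility label to a bit, fold a user's authorizations and a datum's tags into integers, and a single bitwise AND yields what the user may see. The label names and this tiny scheme are illustrative, not the actual representation.

```python
# Sketch of bitvector-based visibility filtering. Each label gets one bit;
# checking access is a single AND, which is what makes the compressed
# representation fast at scale. Labels here match the Bob example only.

LABELS = {"X": 0b0001, "Y": 0b0010, "Z": 0b0100, "R": 0b1000}


def to_bits(labels):
    """Fold a set of label names into one integer bitvector."""
    bits = 0
    for label in labels:
        bits |= LABELS[label]
    return bits


def visible_labels(user_auths, data_tags):
    """Return the tags on the data that the user is authorized to see."""
    mask = to_bits(user_auths) & to_bits(data_tags)  # one AND per check
    return {name for name, bit in LABELS.items() if mask & bit}


# Bob holds X, Y, and Z; the data is tagged X, Y, and R.
print(sorted(visible_labels({"X", "Y", "Z"}, {"X", "Y", "R"})))  # ['X', 'Y']
```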
Metering and monitoring
• JavaScript API for web apps, Thrift API for
services, and REST for others
• Improve application usability and usefulness
by examining analytics on usage patterns
• Diagnose issues with system, services, and
apps
• Determine cost allocation based on what
agencies and organizations are using the
system
To bring the focus back to understanding the environment, we needed the platform to
provide comprehensive visualization to monitor users, data, and services on AWS
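A minimal sketch of the cost-allocation idea above, under the assumption that each recorded event carries the calling organization so usage can be rolled up per agency; the event fields and function names are hypothetical.

```python
# Hedged sketch of metering for cost allocation: every usage event is
# tagged with the calling organization, so totals can be attributed to
# the agencies actually using the system.
from collections import Counter

events = []


def record_usage(org, service, units=1):
    """Record one metered call; fields are illustrative."""
    events.append({"org": org, "service": service, "units": units})


def usage_by_org():
    """Roll up metered units per organization for billing."""
    totals = Counter()
    for e in events:
        totals[e["org"]] += e["units"]
    return dict(totals)


record_usage("agency-a", "distributed-query", 3)
record_usage("agency-b", "streaming-ingest", 1)
record_usage("agency-a", "streaming-ingest", 2)
print(usage_by_org())  # {'agency-a': 5, 'agency-b': 1}
```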
Batch (Amino)
• Removes complexity of Amazon EMR for the average
engineer
• Crowdsource microanalytics through analysts and engineers
• Data agnostic
• Not a black box
• Fully scalable
• Inherent cross-data source linked indexes
• Encourages sharing of knowledge, discovery
• Index built to support machine learning
• Security considered up front — index is in Accumulo
• Utilized AWS to enable rapid load balancing to support
demand based on data and usage
Developers can write Amazon Elastic MapReduce (EMR) code to analyze data, but don’t know
what to look for; the analysts know what to look for, but don’t know how to write code. Technology
is not the problem. It’s enabling the analyst to effectively leverage technology and reuse it.
The impact
So what? What were the overall accomplishments to date? Well…
Time: The platform and the development
model decreased the development time from
6–8 months to production to 3–4 weeks.
Lean and Mean: Application teams went
from being heavy on DevOps, security, and
testing to smaller, more agile teams focused
on specific mission use cases
Most importantly… we revectored teams back to their users, providing more capabilities in less time,
thereby saving lives and protecting our country
Data Shared: Legacy REST/SOAP interfaces
have begun to die and time spent on sharing data
is down significantly without impacting operations
and more apps have more access to data
Money: Removal of redundant code and systems,
faster app deployment, cuts in total storage costs,
and decrease in team sizes led to a significant
cost savings up front for the customer
http://bit.ly/awsevals