AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. ENT313: FINRA in the Cloud: The Big Data Enterprise. Tigran Khrimian, VP of Data Platforms; Mark Ryland, Chief Architect, AWS Worldwide Public Sector. December 1, 2016.

Upload: amazon-web-services

Post on 16-Apr-2017

TRANSCRIPT

Page 1: AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Tigran Khrimian, VP of Data Platforms

Mark Ryland, Chief Architect, AWS Worldwide Public Sector

December 1, 2016

ENT313

FINRA in the Cloud

The Big Data Enterprise

Page 2: AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)

What to Expect from the Session

• FINRA’s Enterprise Class cloud architecture

• Business Value FINRA has realized from cloud migration

• Technology skillsets required

• Tools (data management) and processes required

• Other (unexpected) benefits from cloud migration

• View from AWS: partnership and platform evolution

Page 3: AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)

Page 4: AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)

Data Is Central to Our Mission

Reconstruct the market from trillions of events

• Data from broker-dealers and exchanges

• Equities, Options, Fixed Income

• Build a graph of market order events

Analyze the data looking for financial fraud

• Insider trading, layering, cross-product manipulation, front running & many more

• Looking for a needle in a haystack

Page 5: AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)

Volume Challenges

Market volumes are volatile and steadily increasing

Exchanges are dynamically evolving

Regulatory landscape is changing

Market manipulators innovate

Page 6: AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)

Legacy Architecture

[Diagram: legacy architecture. Labels: Tier 1, Tier 2, Tier 3; SAN, NAS]

Page 7: AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)

Pain Points

Does not scale well as volumes and workloads increase

Duplication of effort in data management (data lifecycle, retention, versioning, etc.)

Data sync issues – manual effort to keep data in sync

Challenges to run analytics across fragmented data

Costly system maintenance and upgrades

Page 8: AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)

Summary of Cloud Drivers: The Problems

• Fast-growing data volumes YoY

• High cost of pre-building for peak

• Escalating costs of in-house technology infrastructure

• Appliance platforms were facing obsolescence and end-of-life as a result of new Big Data technologies

Keep spending on infrastructure or redirect dollars to core business (financial regulation)?

Page 9: AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)

AWS Architecture

Page 10: AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)

Where Is My Data?

One location of master data, security, versioning, availability, cross-region data replication, etc.

Page 11: AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)

How Do I Access the Data?

Page 12: AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)

FINRA’s AWS Architecture

[Architecture diagram. On-premises data center: Incoming Files, FTP, NAS. Processing stages: Validation, Normalization, Linkage, Data Management, Data Analytics. AWS components shown: VPC, Amazon EC2, Amazon S3, Amazon Glacier, Amazon Redshift, Amazon EMR, Amazon RDS, Machine Learning, AWS KMS.]

Page 13: AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)

FINRA Usage Statistics on AWS

30k+ EC2 nodes per day

93%+ of EC2 usage is EMR-based (mostly Spot)

20 PB+ storage (Amazon S3, Amazon Glacier)

60% PROD, 25% QC/UAT, 15% DEV

Node lifecycle:

o 50%: under 2h

o 35%: 2h to 5h

o 15%: over 5h

[Chart: Node Distribution for June 19-25 (~32k/day). Y-axis: 0 to 40,000 nodes. Series: Redshift; Web, App & RDS; Hadoop/Spark. Data labels: 31,044; 35,444; 32,919; 36,916; 29,330; 25,935; 20,523.]
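As a rough cross-check of the "~32k/day" figure, the seven values recovered from the chart can be summed and averaged. This is only a sketch: it assumes the seven values are per-day node totals for June 19-25, which the slide text does not confirm, and the variable and function names are illustrative.

```python
# Values recovered from the chart on this slide. Treating them as per-day
# node totals for June 19-25 is an assumption, not something the slide states.
daily_nodes = [31_044, 35_444, 32_919, 36_916, 29_330, 25_935, 20_523]


def summarize(values):
    """Return (total, mean, peak) for a list of daily node counts."""
    total = sum(values)
    return total, total / len(values), max(values)


total, mean, peak = summarize(daily_nodes)
print(f"total={total:,} mean={mean:,.0f} peak={peak:,}")
```

Under that assumption the mean works out to roughly 30.3k nodes per day, in the same range as the slide's ~32k/day but not identical, so the labels may not be exact daily totals.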

Page 14: AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)

Information Security

Page 15: AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)

FINRA’s Use of VPC is Highly Secure and Auditable

• Network security even more tightly controlled than traditional data centers (i.e., “micro-segmentation”)

• Encrypt non-public data both in motion and at rest

• AWS IAM function with fine-grained entitlements and separation of duties (SoD), integrated with FINRA’s existing IAM processes

• Comprehensive audit trail – AWS CloudTrail & Amazon CloudWatch

• Custom AWS compliance reporting system to ensure “identity perimeter”
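One common way to enforce the "encrypt in motion and at rest" rule on S3 is a bucket policy that denies unencrypted uploads and non-TLS requests. The sketch below only builds such a policy document. It is not FINRA's actual policy, and the bucket name and function name are hypothetical.

```python
import json


def deny_unencrypted_policy(bucket_name):
    """Build an S3 bucket policy (as a dict) that denies PutObject requests
    not asking for SSE-KMS encryption, and denies any request that is not
    made over TLS. bucket_name is a placeholder chosen by the caller."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    objects_arn = f"{bucket_arn}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # At rest: reject uploads that do not request SSE-KMS.
                "Sid": "DenyUnencryptedObjectUploads",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": objects_arn,
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption": "aws:kms"
                    }
                },
            },
            {
                # In motion: reject requests sent without TLS.
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [bucket_arn, objects_arn],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
        ],
    }


# "example-bucket" is a hypothetical name used only for illustration.
policy = deny_unencrypted_policy("example-bucket")
print(json.dumps(policy, indent=2))
```

The resulting JSON could be attached to a bucket through the console, the CLI, or an SDK. Key management and IAM entitlements would still need to be configured separately.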

Page 16: AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)

AWS Compliance & Certifications

[Diagram. AWS Foundation Services: Compute, Storage, Database, Networking. AWS Global Infrastructure: Regions, Availability Zones, Edge Locations. Certifications shown include GxP, ISO 13485, AS9100, ISO/TS 16949.]

Source: Amazon Web Services

Page 17: AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)

Benefits

Improved performance (from minutes to seconds)

Ability to expand and contract (up to 40K EC2 instances provisioned daily)

No more tech refreshes, patching, etc.

Lower cost of DR & Reg SCI testing

Superior data protection compared to in-house solution

Redirect focus and dollars to core business

Page 18: AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)

Other (Unexpected) Benefits

Easier Data Access – no silos

• All data in one place

• Faster data discovery

• New forms of data exploration

Innovation & Engaged Staff

• Transformation from infrastructure ops to DevOps

• New technologies, new skills, challenging yet very clear goals

• Easier to try new things and innovate

Page 19: AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)

Technology Skillsets Required

Fail fast, fail cheap

Innovation

Automation

Curiosity

Page 20: AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)

FINRA’s Future Plans

• Migrate the remaining applications to the Cloud by 2018

• Hundreds of relational databases

• Hundreds of applications

• High degree of inter-application connectivity (messaging, workflow, data replication)

• Shut down data center operation by end of 2018

Page 21: AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)

Key Takeaways

• Develop a compelling business case: sell to your stakeholders; sell to your team

• Make sure to get security right

• Focus on your data strategy

• Pay attention to variable infrastructure cost

• Partner with Cloud/Big Data vendors for staffing needs

• Innovate and transform as part of Cloud journeys

Page 22: AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)

Summary

• FINRA’s original promise (cost & performance) of Cloud realized

• Other unplanned benefits

• superior data protection

• democratization of data

• catalyst for innovation

• Migrating the remainder of portfolio by end of 2018

Page 23: AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)

AWS Perspective – Enterprise Account Management

Page 24: AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)

Enterprise Account Engagement Model

• AWS Account Team Role:

o Assist FINRA in architecting AWS services for Cloud

o Support Proof of Concepts (POCs) to accelerate migration

o Help FINRA understand and influence product roadmaps

• AWS Teams Engaged:

o Account Management

o Solutions Architecture

o Support and Technical Account Management (TAMs)

o Technical Delivery Management (TDM)

o Professional Services

o AWS Service Teams / Engineers

Page 25: AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)

AWS Services That FINRA Has Requested

• Broad impact across multiple services

• Identity and access management

o Long-lived federation tokens

• Cross-region data replication (CRR) for S3:

o Copy important data to another region for catastrophic DR

o FINRA requested Data Encryption, other enhancements

• Database Migration Service (DMS):

o Input on DMS roadmap / features

o Early adopter for Oracle-Postgres migration (session DAT302)
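The cross-region replication item above, including replication of encrypted data, can be illustrated with a replication configuration in the shape accepted by the S3 PutBucketReplication API (for example via boto3's `put_bucket_replication`). This is a sketch under assumptions: every ARN, the rule ID, and the function name are hypothetical placeholders, and it does not represent FINRA's actual setup.

```python
def replication_config(role_arn, dest_bucket_arn, replica_kms_key_arn):
    """Build a replication configuration dict that copies all objects,
    including SSE-KMS encrypted ones, to a bucket in another region.
    All arguments are caller-supplied placeholders."""
    return {
        "Role": role_arn,  # IAM role S3 assumes to replicate objects
        "Rules": [
            {
                "ID": "replicate-all-for-dr",  # hypothetical rule name
                "Prefix": "",                  # empty prefix: every object
                "Status": "Enabled",
                # Opt in to replicating objects encrypted with SSE-KMS.
                "SourceSelectionCriteria": {
                    "SseKmsEncryptedObjects": {"Status": "Enabled"}
                },
                "Destination": {
                    "Bucket": dest_bucket_arn,
                    # Key in the destination region used for the replicas.
                    "EncryptionConfiguration": {
                        "ReplicaKmsKeyID": replica_kms_key_arn
                    },
                },
            }
        ],
    }


# Placeholder ARNs for illustration only.
config = replication_config(
    "arn:aws:iam::111122223333:role/example-replication-role",
    "arn:aws:s3:::example-dr-bucket",
    "arn:aws:kms:us-west-2:111122223333:key/example-key-id",
)
print(config["Rules"][0]["Destination"]["Bucket"])
```

Versioning must be enabled on both the source and destination buckets before S3 accepts a replication configuration.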

Page 26: AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)

Biggest Impact: EMR Enhancements

• Enhanced Hive / EMRFS support

• Presto performance improvements within EMR

• HBase on S3 (STG308 session):

o Separate storage & compute – data in S3 vs. persistent HS1 cluster

o Improved resiliency (RTO for cluster restart, S3 backup/replication)

o Improved cost performance (run less expensive nodes, no longer storage constrained)

o Scale cluster up and down with demand
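To make the "separate storage & compute" point concrete, the sketch below builds the EMR configuration classifications that switch HBase to S3 storage mode and point its root directory at an S3 path. The bucket path and function name are hypothetical. The list would be passed as the `Configurations` argument when creating a cluster, and this is not FINRA's actual configuration.

```python
def hbase_on_s3_configurations(root_dir):
    """Return EMR configuration classifications that store HBase data in S3
    instead of on cluster-local HDFS. root_dir is a caller-supplied S3 URI."""
    if not root_dir.startswith("s3://"):
        raise ValueError("root_dir must be an s3:// URI")
    return [
        # Tell EMR's HBase to use S3 as its storage layer.
        {
            "Classification": "hbase",
            "Properties": {"hbase.emr.storageMode": "s3"},
        },
        # Point the HBase root directory at the S3 location.
        {
            "Classification": "hbase-site",
            "Properties": {"hbase.rootdir": root_dir},
        },
    ]


# Hypothetical bucket and prefix, for illustration only.
configs = hbase_on_s3_configurations("s3://example-bucket/hbase-root")
for c in configs:
    print(c["Classification"], c["Properties"])
```

Because the table data lives in S3, a cluster can be terminated and a new one started against the same root directory, which is what allows cheaper nodes and scaling with demand.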

Page 27: AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)

Thank you!

Page 28: AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)

Remember to complete your evaluations!

Page 29: AWS re:Invent 2016: FINRA in the Cloud: the Big Data Enterprise (ENT313)

Related Sessions

FINRA Sessions:

• BDM203 – Building a Secure Data Science Platform

• DAT302 – Best Practices for Migrating to RDS / Aurora

• CMP316 – Aligning Billions of Time Ordered Events with Spark

• SVR202 – What’s new with AWS Lambda

• STG308 – FINRA’s Scalable Big Data Architecture on S3
