Poster 1271-2017: SAS® In-Memory Analytics for Hadoop

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.


Author: Venkateswarlu Toluchuri
Keywords: SAS In-Memory


Paper 1271-2017


SAS® In-Memory Analytics for Hadoop
Venkateswarlu Toluchuri, UnitedHealth Group (Optum), Hyderabad, India

Author: Venkat T., SAS Administrator/Developer
Presenter: James Chris, SAS Administrator/Manager
Company: Optum, a UnitedHealth Group business (www.optum.com)


SAS Fraud Framework

The Optum solution uses SAS’s Fraud Framework and Optum’s deep health care expertise and extensive health care claims and fraud case datasets to identify and prevent instances of fraud, waste and abuse for payers. The solution delivers broad detection capabilities including rules, flags, predictive modeling, text mining and SAS Visual Analytics to identify possible instances of provider and consumer fraud, including multi-party fraud schemes and organized crime.

Challenges

Types of SAS® In-Memory Analytics Products

LASR and Hadoop

The LASR Analytic Server integrates with Hadoop by reading and writing SAS data in SASHDAT format in the Hadoop Distributed File System (HDFS).

Parallel (Asymmetric) Data Load

The data is not co-located.

LASR table blocks exist on dedicated hardware while the asymmetric provider table blocks exist on separate hardware.

The blocks are pushed from the data provider into LASR just as with co-located data, except that they travel across a dedicated network.

The number of provider nodes does not have to equal the number of LASR nodes (hence the term "asymmetric").

Data does not pass through the LASR head node for distribution. The blocks are pushed straight from the provider into the LASR worker nodes.

The mapping algorithm that maps blocks to worker nodes is extremely simple and tries to distribute the blocks as evenly as possible.

The SAS Embedded Process (EP) must be installed on the parallel data provider.
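
The poster says only that the block-to-worker mapping in a parallel (asymmetric) load is very simple and aims for an even spread; it does not name the algorithm. The following Python sketch illustrates one scheme that would have those properties, a plain round-robin assignment. The function name `map_blocks_to_workers`, the integer block and worker ids, and the round-robin rule itself are assumptions made for illustration, not the documented behavior of the LASR Analytic Server.

```python
from typing import Dict, List


def map_blocks_to_workers(num_blocks: int, num_workers: int) -> Dict[int, List[int]]:
    """Assign block ids to worker ids round-robin.

    Hypothetical illustration only: the real LASR mapping is not
    described in the poster beyond "simple" and "as even as possible".
    """
    if num_workers <= 0:
        raise ValueError("need at least one worker node")
    if num_blocks < 0:
        raise ValueError("number of blocks cannot be negative")

    # One (initially empty) block list per worker node.
    assignment: Dict[int, List[int]] = {w: [] for w in range(num_workers)}

    # Block i goes to worker i mod num_workers, so the per-worker
    # counts can never differ by more than one block.
    for block in range(num_blocks):
        assignment[block % num_workers].append(block)
    return assignment


if __name__ == "__main__":
    # The count of provider-side blocks is independent of the number of
    # worker nodes, which mirrors the "asymmetric" setup in the text.
    for worker, blocks in map_blocks_to_workers(10, 4).items():
        print(f"worker {worker}: blocks {blocks}")
```

With 10 blocks and 4 workers, two workers receive three blocks and two receive two. Note that nothing in the sketch routes data through a central node, which matches the statement that blocks bypass the LASR head node.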

Understand LASR Server architecture

Data flow into LASR Server

Different techniques of loading data into SAS LASR

Understand the analytics life cycle process in SAS In-Memory

Different statements in PROC IMSTAT


Advanced LASR Loading Techniques Comparison

PROC IMSTAT Statements

Analytics Life Cycle support SAS In-Memory

Data Load Engine Techniques

Partitioned Table – In Memory

PROC IMSTAT
