SAS Data Management



To make smart business moves, you not only need sound operations, you need insight. Having clarity into market trends, transactions, customer habits or even your own business practices positions you to make better decisions and operate more efficiently. And the good news? You don’t have to look far to find that clarity. It’s often available right there in your existing data.

The trick, however, is knowing what data you can trust – especially when you feel overwhelmed by information flooding your business. Spreadsheets, emails, reports, customer information, vendor records, operational events and other types of data can too easily become background noise. But when managed well, that data can be transformed into one of your most important assets.

SAS Data Management makes sense of all the noise, turning big data into big value. Data access, data quality, data integration and data governance are all managed in a single platform, so you spend less time maintaining your information – and more time running your business.

Key Benefits

• Always access the data you need. Regardless of where your data is stored – from legacy systems to Hadoop – it can be natively accessed, cleansed and processed quickly and efficiently.

• Reduce time spent on development and maintenance. Centralized storage, management and reuse help you keep data updated and ready for action. Plus, secured authorization gives you control over who can access your information.

• Improve productivity. With an intuitive GUI environment, you can build, document and collaborate on work – and bring new team members up to speed – without a significant learning curve.

• Control administration and reinforce security. Reusable templates make it easy to provide role-based authorizations and administrative privileges at any level.

• Work faster and meet time constraints. Take advantage of grid-enabled load balancing and multithreaded parallel processing to rapidly transform and move data, or update with SQL pass-through for popular MPP databases.

• Deliver trusted information. Data quality auditing tools monitor processing and source systems, giving you visibility into all transformations since origination.

• Eliminate overlapping, redundant tools. A unified, complete enterprise data management platform enables you to forgo piecemeal management of alternate technologies, keeping costs under control and reducing risk.

Product Overview

With SAS Data Management, you can handle a wide variety of data challenges – from efficient processing of big data to accessing and integrating legacy sources – all in a single platform. It’s the fastest, easiest and most comprehensive way to get data under control, with in-memory and in-database performance improvements helping to deliver trusted information.

SAS® Data Management

Transform raw data into a valued business asset with one enterprise solution

FACT SHEET

What does SAS® Data Management do?

SAS Data Management helps you manage and govern data as a single, unified asset so you can transform, integrate and secure data as well as improve data quality for your organization.

Why is SAS® Data Management important?

Due to rapidly increasing volumes and diversity of data in today’s business environment, enterprise data management technology is no longer a convenience – it’s a necessity. SAS Data Management is a powerful, centralized and secure answer for solving even the most difficult challenges that IT organizations are struggling with.

For whom is SAS® Data Management designed?

SAS Data Management is designed for IT organizations that need to address performance and functional improvements in their data management infrastructure. It can also help with managing big data.


Interactive Data Integration Development Environment

SAS Data Management encodes highly technical capabilities in an intuitive, point-and-click graphical user interface. From a role-based console, collaborative efforts are centralized under administrative settings, with wizards for data integration and data management project definition that are controlled by secured permissions and audit traceability.

Integrated Process Designer

A visual, end-to-end job designer gives full control over coordinated execution processing with advanced specifications for parallel, triggered and conditional jobs based on internal or external events.

Connectivity and Data Access

With SAS Data Management, you can natively connect to virtually any data, big data source or streaming data, irrespective of hardware environment. You’ll benefit from unparalleled connectivity in batch or through message queues for real time, with specialized table loaders to provide optimized bulk-loading of Oracle, Teradata and DB2.

Metadata Management

Technical, business, process and administrative metadata is stored and managed to facilitate reuse of existing table definitions, business rules and more. Sophisticated mapping technologies make it easy to propagate column definitions from sources to targets and create automated, intelligent table joins.

Data Cleansing and Enrichment

SAS Data Management embeds data quality into batch, near-time and real-time processes – callable through message queues and Web services so you don’t have to question the integrity of your business information. Out-of-the-box standardization rules conform data to your corporate standards, or you can customize rules for specific situations.

Extraction, Transformation and Load (ETL) and Extraction, Load and Transformation (ELT)

You can simplify collaboration and reuse with more than 300 out-of-the-box SQL-based transforms for creating tables – or to join, insert, delete, update and merge. Use analytic transformations for unique insights into data, extending transformation power while metadata is documented throughout the integration and transformation process.

Migration and Synchronization

Whether from mergers, acquisitions or business growth, disparate systems are common and can wreak havoc on your data consistency. SAS Data Management alleviates migration and synchronization issues with embedded, reusable data quality rules and real-time data services for database structures, enterprise applications, mainframe legacy files, XML, message queues and other sources.

Simplify job building by arranging pre-built icons into a self-documenting diagram flow. Use the extensive library of out-of-the-box transformations and draw arrows to connect object icons. Yellow sticky notes help clarify job details.

Data Federation

Provide consistent business views across all data sources with optimized query processing that provides instant access to information. With SAS Data Management, you can query and use data across multiple systems without physically reconciling or moving source data. It’s a quick, cost-effective way to provide access for your business users.

Key Features

Interactive data integration development environment

• An easy-to-use, point-and-click, role-based GUI with an intuitive set of configurable windows for managing authorized processes. Drag-and-drop functionality eliminates the need for programming.

• Wizards for accessing source systems, creating target structures, importing and exporting metadata, and building and executing ETL and ELT process flows.

• Customizable metadata tree views let you display, visualize and understand metadata.

• Dedicated GUI for profiling data makes it easy to repair source system issues while retaining the business rules for use in other data management processes.

• Interactive debugging and testing of jobs during development and full access to logs are supported.

• Audit history and check-in/check-out allow designers to see which jobs or tables were changed, when and by whom.

• Ability to distribute data integration tasks across any platform and to connect to virtually any source or target data store.

• Integration with the third-party tools Subversion and CVS provides enhanced version and source control features such as archiving, differencing and rollback.

• Enhanced SAS code import capabilities give current SAS users an easy way to import their SAS jobs and code.

• Command-line job deployment options for deploying single and multiple jobs.

Integrated process designer

• Build and edit data management processes with a visual, end-to-end event designer.

• Control the execution of data integration, SAS Stored Processes and data quality jobs.

• Conditionally execute jobs based on IF-THEN logic and parameterization.

• Fork “jobs” and processes to execute in parallel.

• Publish job inputs and outputs for parameterized jobs.

• Listen for internal and external events as well as conditionally raise events.

• Execute external OS-level commands, such as calls to shell scripts.

• Call REST and SOAP Web services.

• List and open old versions of jobs (in read-only mode) and make historic versions current with built-in versioning.

• Provide full support for promotion/migration of jobs in support of DEV/TEST/PROD.

• Use common scripting languages to deploy data integration batch jobs in an automated manner with automated job deployment.
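
To make the forking and conditional-execution ideas above concrete, here is a minimal Python sketch. It is not SAS code and does not use any SAS API; the job functions, the `region` and `load` parameters and the `run_flow` helper are all hypothetical names chosen for illustration. It simply runs two independent jobs in parallel and then runs a third job only if a flow parameter says so.

```python
from concurrent.futures import ThreadPoolExecutor


def extract_customers(params):
    # Hypothetical job: stands in for an extract step for one region.
    return f"customers:{params['region']}"


def extract_orders(params):
    # Hypothetical job: a second, independent extract step.
    return f"orders:{params['region']}"


def load_warehouse(results):
    # Hypothetical job: consumes the outputs of the forked jobs.
    return sorted(results)


def run_flow(params):
    # Fork: submit the two independent jobs so they can run in parallel.
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(job, params)
                   for job in (extract_customers, extract_orders)]
        results = [future.result() for future in futures]

    # IF-THEN: the load job runs only when the flow parameter asks for it.
    if params.get("load", False):
        return load_warehouse(results)
    return None
```

Calling `run_flow({"region": "EMEA", "load": True})` runs all three jobs, while leaving out `load` skips the final step. A real process designer adds scheduling, events and auditing on top of this basic pattern.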

Master Data Management Support

By using features such as semantic data descriptions and sophisticated fuzzy-matching, you can check and control data integrity within a Web-based reference management interface. You can also identify, standardize and correct master data by each transaction, in hundreds of transactions at a time or in a single pass of the source data.

Data Governance

SAS Data Management helps you implement business rules and policies across the enterprise to ensure compliance. These actions can be tracked and monitored across the entire governed environment.

Message Queuing

Asynchronous business processes and optimized access for each message queue can reduce the cost of disruptions with minimal administrative effort. With interfaces to the leading message-queuing products as well as event-based application integration, automated action triggers across applications are provided.

Enhanced Administration and Monitoring

Manage and monitor your complete integration environment, including data integration jobs, data jobs, federation cache jobs, process flows, process jobs and SAS Stored Processes.


Connectivity and data access

• Provides connectivity in batch or in real time to more data sources on more platforms than most other solutions.

• Data access engines are available for enterprise applications, non-relational databases, RDBMSs, data warehouse appliances, PC file formats and more.

• Specialized table loaders provide optimized bulk loading of Oracle, Teradata and DB2.

• File reader/writer available for Hadoop file system (HDFS) and support for Hadoop’s MapReduce, Pig and Hive within flows as well as Hortonworks.

• A complete and shared metadata environment provides consistent data definition across all data sources.

• Native access methods deliver top performance, reduce data movement and reduce the need for custom coding.

• Support for message-oriented middleware, including WebSphere MQ from IBM, MSMQ from Microsoft, Java Message Service (JMS) and TIBCO Rendezvous.

• Support for unstructured and semistructured data to parse and process files.

• Access to static and streaming data for sending and receiving via Web services.

• Expanded support for MPP databases: Aster Data nCluster, Pivotal Greenplum and Sybase IQ, enabling more ELT pushdown and support for bulk-load utilities.

• Native support for SQL-based processing.

• Enhanced connectivity to Aster Data, Pivotal Greenplum, Hadoop and Sybase IQ databases with the ability to push down more processing to the databases.

Metadata management

• Metadata is captured and documented throughout transformations and data integration processes, and is available for immediate reuse.

• Sophisticated metadata mapping technologies quickly propagate column definitions from sources to targets, and create automated, intelligent table joins.

• Metadata search enables quick location of desired components.

• Impact analysis for assessing the scope and impact of making changes to existing objects such as columns, tables and process jobs before they occur.

• Ability to determine the path, processes and transformations taken to produce the resulting information.

• Data lineage (reverse impact analysis), critical for validating dependencies, helps build user confidence in data.

• Change analysis for metadata change discovery, comparison, analysis and selective propagation.

• Multiple-user collaboration support includes object check-in and check-out.

• Promotion and replication of metadata across development, test and production environments.

• Wizard-driven metadata import and export as well as column standardization.

• Metadata-driven deployment flexibility so process jobs can be deployed for batch execution, as reusable stored processes or as Web services.

Successful job completion is noted by green check marks for each stage of the job. Metadata statistics are saved for each applied transformation and can be viewed in graphical or tabular (lower right) format to help fine-tune performance.


Data cleansing and enrichment

• Data quality is embedded into batch, near-time and real-time processes.

• Data cleansing is provided in native languages with specific language awareness and localizations for more than 38 regions worldwide.

• Data quality functions are available in both operational and reporting (transaction and batch) environments.

• An interactive GUI enables you to profile operational data to identify incomplete, inaccurate or ambiguous data.

• Customizable and reusable data quality business rules can be accessed directly within process job flows.

• Out-of-the-box standardization rules conform data to corporate standards, or you can build customized rules for special situations.

• Metadata built and shared across the entire process provides an accurate trail of actions applied to the cleansed data.

• Value can be added to existing data by generating and appending postal addresses, geocoding, demographic data or facts from other sources of information.

• Data stewards can profile operational data and monitor ongoing data activities with an interactive GUI designed specifically for their needs.

• Simple process for institutionalizing data quality business rules. Apply basic or complex rules to validate data according to the specific business requirements of a particular process, project or organization. Rules may be applied in batch mode or as a real-time transaction cleansing process.

• Data quality monitoring enables you to continuously examine data in real time and over time to discover when quality falls below acceptable limits.

• Alerts can be issued when there is a need for corrective action.

Extraction, transformation and load (ETL) and extraction, load and transformation (ELT)

• A powerful, easy-to-use transformation user interface that supports collaboration, reuse of processes and common metadata.

• Out-of-the-box SQL-based transforms deliver ELT capabilities, including create tables, join, insert rows, delete rows, update rows, merge, SQL set, extract and SQL execute.

• Single or multiple-source data acquisition, transformation, cleansing and loading enable the easy creation of data warehouses, data marts, or BI and analytic data stores.

• Metadata is captured and documented throughout the data integration and transformation processes and is available for immediate reuse.

• Transformations can run on any platform with any data source.

• More than 300 predefined table and column-level transformations.

• Ready-to-use analytical transformations, including correlations and frequencies, distribution analysis and summary statistics.

• Transformation wizard or Java plug-in design templates let you easily generate reusable and repeatable transformations that are tracked and registered in metadata.

• Transformation processes, callable through custom exits, message queues and Web services, are reusable in different projects and environments.

• Transformations can be executed interactively and scheduled to run in batch at set times or based on events that trigger execution.

• Framework environment for publishing information to archives, a publishing channel, email or various message-queuing middleware.

• Easily refresh, append and update during loading.

• Optimize loading techniques with user-selectable options.

• Database-aware loading techniques include bulk-load facilities, index and key creation, and dropping and truncating of tables.

• Transformations automatically generate high-performance SAS code that is designed for rapid and efficient processing.

• Transformations include: Type 1 SCD support for merge and hash techniques, table differencing and enhancements for Type 2 SCD loaders.

• The Compare Tables transformation compares two data sources and detects changes in data.

• Provides the ability to call REST or SOAP Web services.
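
For readers unfamiliar with the Type 2 slowly changing dimension (SCD) technique mentioned above, the Python sketch below shows the general idea: when a tracked attribute changes, the current row is closed and a new row is added, so history is preserved. This is a generic illustration, not the SAS loader; the `start_date` and `end_date` column names and the `apply_scd2` function are assumptions.

```python
from datetime import date


def apply_scd2(dimension, incoming, key, tracked, today):
    """Return a new dimension list with Type 2 history applied.

    Assumes each dimension row has `start_date` and `end_date` columns,
    and that `end_date` is None for the current version of a key.
    """
    result = [dict(row) for row in dimension]  # copy so the input is untouched
    current = {row[key]: row for row in result if row["end_date"] is None}

    for record in incoming:
        existing = current.get(record[key])
        if existing is not None and all(existing[c] == record[c] for c in tracked):
            continue  # no tracked column changed: keep the current row
        if existing is not None:
            existing["end_date"] = today  # close the old version
        new_row = {key: record[key], "start_date": today, "end_date": None}
        new_row.update({c: record[c] for c in tracked})
        result.append(new_row)  # open the new version
    return result
```

Given a dimension with one current row for customer 1 in one city, an incoming record with a different city closes that row and appends a new current row, while an incoming record for a new customer is simply appended. A Type 1 load, by contrast, would overwrite the city in place and keep no history.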

Migration and synchronization

• Ability to migrate or synchronize data between database structures, enterprise applications, mainframe legacy files, text, XML, message queues and a host of other sources.

• Metadata-driven access to sources and targets.

• Extensive library of predefined transformations can be extended and shared with other integration processes.

• Embedded, reusable data quality business rules clean data as it is moved, synchronized or replicated.

• Recognizes changes to key fields and replicates or synchronizes changes across multiple databases.

• Optional, integrated scheduler allows changes made in one or more systems to be propagated to other systems on a scheduled basis.

• Delivers real-time data services for synchronization and migration projects.

Data federation

• Virtual access to database structures, enterprise applications, mainframe legacy files, text, XML, message queues and a host of other sources.

• Ability to join data across data sources for real-time access and analysis.

• Instant access to a real-time view of the data using the built-in data viewer.

• Query optimization is provided both automatically as part of DBMS requests, and manually within the advanced SQL editor, and can be used for both homogeneous and heterogeneous data sources.
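
The core idea of federation, querying across separate data stores through one view without copying the data into a new store, can be sketched on a small scale with Python's built-in `sqlite3` module. This is only an analogy: it uses two in-memory SQLite databases named `crm` and `erp` (hypothetical names) rather than the heterogeneous enterprise sources a federation server would reach.

```python
import sqlite3


def federated_totals():
    # One connection with two separate databases attached side by side.
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS crm")
    conn.execute("ATTACH DATABASE ':memory:' AS erp")

    # Hypothetical source tables, each living in its own database.
    conn.execute("CREATE TABLE crm.customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE erp.orders (customer_id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO crm.customers VALUES (?, ?)",
                     [(1, "Acme"), (2, "Globex")])
    conn.executemany("INSERT INTO erp.orders VALUES (?, ?)",
                     [(1, 100.0), (1, 50.0), (2, 75.0)])

    # A temporary view joins across both databases; no rows are copied.
    conn.execute("""
        CREATE TEMP VIEW customer_totals AS
        SELECT c.name AS name, SUM(o.amount) AS total
        FROM crm.customers AS c
        JOIN erp.orders AS o ON o.customer_id = c.id
        GROUP BY c.name
    """)
    rows = conn.execute(
        "SELECT name, total FROM customer_totals ORDER BY name").fetchall()
    conn.close()
    return rows
```

Business users would query `customer_totals` as if it were a single table, while the underlying rows stay in their original databases.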


Master data management

• Enhanced metadata search features enable you to search by type, name, date or other keywords, subset by folders or other options, and save searches for future use.

• Support for semantic data descriptions of input and output data sources, which uniquely identify each instance of a business element (customer, product, account, etc.).

• Powerful transformation tools and embedded data quality processes improve master data quality.

• Sophisticated fuzzy-matching technology and clustering methodologies enable you to validate and consolidate master records into identifiable data groups.

• Real-time data monitoring, dashboards and scorecards let you check and control data integrity over time.

• Can be used as a basis for transitioning to a full-fledged master data management offering.

• Data feeds can arrive in a single transaction or in hundreds of transactions at the same time.

• Data sets can be processed in a single pass of the source data.
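
Fuzzy matching and clustering of master records can be approximated very simply with Python's standard `difflib` module. The sketch below is a deliberately naive stand-in, assuming a plain string-similarity ratio and a hypothetical threshold of 0.85; the matching technology in the product is far more sophisticated and is not shown here.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    # Case-insensitive similarity ratio between 0.0 and 1.0.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cluster(names, threshold=0.85):
    # Greedy single pass: put each name in the first group whose
    # representative (first member) is similar enough, else start a new group.
    clusters = []
    for name in names:
        for group in clusters:
            if similarity(name, group[0]) >= threshold:
                group.append(name)
                break
        else:
            clusters.append([name])
    return clusters
```

Run on `["Jon Smith", "John Smith", "Acme Corp"]`, the two spellings of the person's name land in one group and the company stays in its own. Each group is a candidate set of duplicates that a data steward could review and consolidate into a single master record.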

Data governance

• Enhanced, Web-based reference data management and business data environment to ease governance and semantic reference, respectively.

• Integrated business data glossary allows business terms to be organized hierarchically and related to term owners as well as technical metadata such as tables and data management processes.

• Extensive data stewardship capabilities including Web-based dashboarding and business rule exception monitoring for reporting and remediation.

Message queuing

• Integration of asynchronous business processes via message-based connectivity.

• Interfaces to the leading message-queuing products, including Microsoft MSMQ, IBM WebSphere, TIBCO Rendezvous and Java Message Service (JMS).

• Guaranteed message/transaction delivery reduces the cost of disruptions.

• Optimized access for each message-queue manager that is designed for minimal administrative effort.

• Event-based application integration so activities in one application automatically trigger actions in other applications.

• Dynamic, event-driven run streams and alerts.

• Ability to send and receive messages between distributed and disparate systems.

Enhanced administration and monitoring

• Job status and performance reports and trending information provide the ability to track metrics such as CPU use, memory, I/O, etc. and deliver updates on how recent process runs perform relative to previous runs.

• Enables users to manage and monitor their complete integration environments, including the following types of jobs and activities:

o Data Integration Jobs.

o Data Quality Jobs.

o Federation Cache Jobs – scheduled queries to update the federation cache.

o Process Flows.

o Access log files from a central, Web-based panel for faster, easier troubleshooting.

o SAS® Stored Processes.

For More Information

To learn more about SAS Data Management, download white papers, view screenshots and see other related material, please visit sas.com/sasdm.

SAS Institute Inc. World Headquarters +1 919 677 8000

To contact your local SAS office, please visit: sas.com/offices

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies. Copyright © 2012, SAS Institute Inc. All rights reserved. 102451_S109573.0913