Anexinet Big Data Solutions
Big Data Solutions offered by Anexinet
![Page 1: Anexinet Big Data Solutions](https://reader036.vdocuments.mx/reader036/viewer/2022062617/54bb3fd44a795944118b458c/html5/thumbnails/1.jpg)
Anexinet Big Data
Solutions for Big Data Analytics
![Page 2: Anexinet Big Data Solutions](https://reader036.vdocuments.mx/reader036/viewer/2022062617/54bb3fd44a795944118b458c/html5/thumbnails/2.jpg)
Big Data Defined
Volume
• Datasets that grow too large to easily manage in a traditional RDBMS
• TBs, PBs, ZBs
Velocity
• Large-volume streaming data that can overwhelm traditional BI & ETL processes
Variety
• Data sources extraneous to traditional business systems that can be unstructured and require text analytics
Value
• Big Data can have a transformational effect on business when the proper systems and processes are put in place
![Page 3: Anexinet Big Data Solutions](https://reader036.vdocuments.mx/reader036/viewer/2022062617/54bb3fd44a795944118b458c/html5/thumbnails/3.jpg)
Big Data vs. Classic BI
What is different between classic DW/BI and Big Data Analytics?
• Businesses today treat data warehouse & business intelligence as a must-have reporting and operational capability
• Businesses that are not fully mature in the BI lifecycle may struggle with Big Data
• Big Data projects look for untapped analytics, not BI dashboards
SCALE: Think Volume, Variety and Velocity
• Yahoo! uses Microsoft SQL Server & Analysis Services, with Hadoop, Oracle & Tableau
• 38,000 machines distributed across 20 different clusters
• 2-petabyte Hadoop cluster that feeds 1.2 terabytes of raw data each day into Oracle RAC
• Data is compressed, and 135 gigabytes of data per day is sent to a SQL Server 2008 R2 Analysis Services cube
• Cube produces 24 terabytes of data each quarter
http://www.microsoft.com/casestudies/Case_Study_Detail.aspx?CaseStudyID=710000001707
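To put the Yahoo! figures in proportion, the short sketch below works through the arithmetic. It uses only the numbers quoted on the slide; the decimal unit conversion (1 TB = 1000 GB) and the 91-day quarter are assumptions made for illustration, and the reduction ratio may reflect aggregation as well as compression.

```python
# Figures quoted on the slide (assumption: decimal units, 1 TB = 1000 GB).
RAW_GB_PER_DAY = 1.2 * 1000    # raw data fed from Hadoop into Oracle RAC each day
CUBE_FEED_GB_PER_DAY = 135     # reduced data sent on to the SSAS cube each day
DAYS_PER_QUARTER = 91          # assumption: roughly 365 / 4

# How much smaller the daily cube feed is than the daily raw feed.
reduction_ratio = RAW_GB_PER_DAY / CUBE_FEED_GB_PER_DAY

# Total volume loaded into the cube over one quarter, in terabytes.
cube_feed_tb_per_quarter = CUBE_FEED_GB_PER_DAY * DAYS_PER_QUARTER / 1000

if __name__ == "__main__":
    print(f"Raw-to-cube reduction: about {reduction_ratio:.1f}x")
    print(f"Cube input per quarter: about {cube_feed_tb_per_quarter:.1f} TB")
```

Under these assumptions the daily feed shrinks by a factor of roughly nine before it reaches the cube, and about 12 TB is loaded per quarter against the 24 TB the cube is reported to produce.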
![Page 4: Anexinet Big Data Solutions](https://reader036.vdocuments.mx/reader036/viewer/2022062617/54bb3fd44a795944118b458c/html5/thumbnails/4.jpg)
Scalable Big Data Platform Architecture
© Copyright 2013 Anexinet Corp.
Hadoop
• HDFS Cluster
• MapReduce Framework
Data Warehouse
• MPP Database
• Star Schemas
Analytics
• In-memory cubes
• Analytical Columnstore Tables
End User Reporting
• Advanced in-memory analytics
• Ad-hoc data discovery
![Page 5: Anexinet Big Data Solutions](https://reader036.vdocuments.mx/reader036/viewer/2022062617/54bb3fd44a795944118b458c/html5/thumbnails/5.jpg)
Go Beyond Dashboards. Provide Advanced Analytics.
Large number of data points adds new business value
Big Data advanced analytics requires a tool that can sample complex data sources
Must provide quick aggregations of large data sets that are easily consumed by the human eye
Must provide “data discovery” for ad-hoc analysis
Tableau
Microsoft Power View
QlikView
![Page 6: Anexinet Big Data Solutions](https://reader036.vdocuments.mx/reader036/viewer/2022062617/54bb3fd44a795944118b458c/html5/thumbnails/6.jpg)
Marketing Samples
Enhance marketing campaigns with Big Data
Social analytics, customer analytics, targeted marketing, brand sentiment
Big Data has proven transformational for marketing organizations (Razorfish, Yahoo!, NBC, [x+1])
Web Analytics from Google Analytics
![Page 7: Anexinet Big Data Solutions](https://reader036.vdocuments.mx/reader036/viewer/2022062617/54bb3fd44a795944118b458c/html5/thumbnails/7.jpg)
Anexinet Big Data Offerings
Strategy Engagement
• Customer stakeholder interviews & interactive sessions
• Define Big Data Requirements
• Design Big Data Strategy
• Deliver Strategy & Roadmap Documents
Starter Solution
• Let Anexinet handle the hardest parts of a Big Data solution
  * Getting started
  * Collecting & processing data
  * Uncovering business value from Big Data
Big Data Project Engagement
• End-to-end Big Data project
  * Big Data Discovery
  * Big Data Platform
  * Big Data Analytics
  * Big Data Visualizations
![Page 8: Anexinet Big Data Solutions](https://reader036.vdocuments.mx/reader036/viewer/2022062617/54bb3fd44a795944118b458c/html5/thumbnails/8.jpg)
Partnerships
Big Data Platforms
• EMC Greenplum
• Hortonworks (OSS, MSFT, HP)
• Cloudera (OSS, Oracle, HP)
Big Data Databases
• HP Vertica
• EMC Greenplum
• Microsoft PDW
• Oracle Exalytics
• Oracle Big Data Appliance
Big Data Visualizations
• QlikView
• Tableau
• Microsoft PowerPivot
• Microsoft Power View
![Page 9: Anexinet Big Data Solutions](https://reader036.vdocuments.mx/reader036/viewer/2022062617/54bb3fd44a795944118b458c/html5/thumbnails/9.jpg)
A Credible Partner to Deploy Big Data Solutions
Security
• Ensure privacy of PII
• Conform Big Data solution to your enterprise security standards
Integration
• ETL / ELT
• Integrate Hadoop into your DW & Analytics environments
• Integrate Big Data into your IT investments
Configuration
• Configure the Big Data environment to maximize throughput, performance and analytics to meet your stated SLA goals
Governance
• Ensure Data Quality
• MDM
• Process Governance
![Page 10: Anexinet Big Data Solutions](https://reader036.vdocuments.mx/reader036/viewer/2022062617/54bb3fd44a795944118b458c/html5/thumbnails/10.jpg)
Top Impediments to Successful Big Data Analytics
![Page 11: Anexinet Big Data Solutions](https://reader036.vdocuments.mx/reader036/viewer/2022062617/54bb3fd44a795944118b458c/html5/thumbnails/11.jpg)
Big Data Buzzword Glossary
Big Data: Think of the 3 Vs (volume, velocity, variety), unstructured data, and data that is not currently managed in the DW. This is the data that companies need for game-changing analytics.
Big Data Analytics: Business insights gained from mining Big Data to transform business processes
Columnar: Column-oriented databases that are used in Big Data scenarios because of their speed and compression capabilities, e.g. HP Vertica, HBase
Hadoop: Apache open-source framework for Big Data processing. Made up of multiple components. The leading Big Data platform. Commercial distributions are marketed by Cloudera & Hortonworks.
In-memory DB: A database that resides fully in memory, eliminating IO bottlenecks. Very important in Big Data Analytics systems, e.g. Microsoft PowerPivot, SSAS 2012, SAP HANA
MapReduce: Distributed data programming and processing framework. A key aspect of processing Big Data is using a MapReduce framework across distributed clusters of commodity servers. Available as open source in the Hadoop framework and in various Hadoop distribution flavors.
MPP: Massively Parallel Processing database engine, mostly used for data warehouse & BI workloads, e.g. SQL Server PDW, IBM Netezza, Teradata
NoSQL: Non-relational data stores (key-value, document, wide-column) built for fast, schemaless writes, typically offering eventual consistency rather than full ACID guarantees. Big Data systems use these to store data coming in from sources that dump large amounts of data quickly, e.g. Cassandra, MongoDB.
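The MapReduce entry above can be made concrete with a small word-count sketch. This is a single-process illustration of the map, shuffle and reduce phases in plain Python, not the Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are hypothetical and chosen only for this example. In a real cluster, each phase would run in parallel across many commodity servers.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key, as the framework would."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values for one key into a single total."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    sample = ["big data big value", "data velocity"]  # illustrative input
    print(word_count(sample))
```

Because each map call sees only one line and each reduce call sees only one key, the work can be split across machines without coordination beyond the shuffle step, which is what lets the model scale out.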