TECHNOLOGY TRANSFER PRESENTS
RESIDENZA DI RIPETTA - VIA DI RIPETTA, 231, ROME (ITALY)
Big Data and Analytics
Data Virtualization in Practice
From Strategy to Implementation
MIKE FERGUSON
JUNE 3-4, 2015 JUNE 5, 2015
Big Data and Analytics
This new three-day workshop is aimed at getting Data Scientists, Data Warehousing and BI Professionals up to
speed on Big Data. What is this new phenomenon? How can you make use of it? How does it fit within your
traditional analytical environment? What skills do you need to develop for Big Data? All of these questions
are addressed in this knowledge-packed workshop.
Audience
• IT Directors
• CIOs
• IT Managers
• BI Managers
• Data Warehousing Professionals
• Data Scientists
• Enterprise Architects
• Data Architects
LEARNING OBJECTIVES
Attendees of this seminar will learn:
• What Big Data is
• Big Data analytical workloads
• The different Big Data technology platforms
• Big Data analytical techniques and front-end tools
• How to analyse un-modelled, multi-structured data using Hadoop and MapReduce
• How to integrate Big Data with traditional Data Warehouses and BI systems
• Business use cases for different Big Data technologies
• How to set up and organise Big Data projects
• How to make use of Big Data to deliver business value
ABOUT THIS SEMINAR
1. An Introduction to Big Data
This session defines Big Data and looks at the business reasons for wanting to make use of this new area of technology. It looks at Big Data use cases and at how traditional BI and Data Warehousing differ from Big Data.
• What is Big Data?
• Types of Big Data
• Why analyse Big Data?
• Industry use cases – popular Big Data analytic applications
• What is Data Science?
• Data Warehousing and BI versus Big Data
• Popular patterns for Big Data technologies
2. An Introduction to Big Data Analytics
This session looks at Big Data Analytics, the tools and techniques involved in it, and how you can integrate this new environment into an existing DW/BI environment to enrich business insight. It also looks at how to get more value out of existing Data Management tools and BI tools across DW and Big Data platforms.
• Traditional Data Warehousing and BI in the Enterprise
• The need to analyse new, more complex data sources
• Types of Big Data analytical workloads
• Streaming data at high velocity
• Structured data analysis
• Multi-structured data analysis
• Challenges when managing and analysing Big Data
• Key components in a Big Data Analytics environment
• The Big Data Extended Analytical Ecosystem
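One of the workload types listed above, streaming data arriving at high velocity, is usually handled by grouping events into fixed time windows and aggregating per window. The sketch below illustrates that idea in plain Python with an invented event stream and window size; a real deployment would use a stream processing engine rather than code like this:

```python
from collections import Counter

def tumbling_window_counts(events, window_size):
    """Group a stream of (timestamp, key) events into fixed-size
    time windows and count occurrences of each key per window."""
    windows = {}
    for ts, key in events:
        # Align each event to the start of its window
        window_start = (ts // window_size) * window_size
        windows.setdefault(window_start, Counter())[key] += 1
    return windows

# Hypothetical sensor readings: (epoch second, sensor id)
events = [(0, "s1"), (3, "s1"), (4, "s2"), (11, "s1"), (12, "s2")]
result = tumbling_window_counts(events, window_size=10)
# The window starting at 0 holds the first three events,
# the window starting at 10 holds the last two.
```

The same tumbling-window pattern underlies the windowed operators of most streaming engines; only the scale and the fault-tolerance machinery differ.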
3. Big Data Platforms and Storage Options
This session looks at data storage options for Big Data Analytics.
• The new Multi-Platform Analytical Ecosystem
• Beyond the Data Warehouse: Hadoop, NoSQL, analytical RDBMSs and NewSQL DBMSs
• NoSQL DBMSs
• An introduction to Hadoop and the Hadoop Stack
• HDFS, MapReduce, Pig & Hive
• Hadoop 2.0 and the Spark framework
• SQL on Hadoop options
- Impala, Hive, Stinger, SparkSQL, HawQ, IBM BigSQL, CitusDB, JethroData, Splice Machine, Actian Vortex
• The Big Data Marketplace
- Hadoop distributions – Cloudera, Hortonworks, MapR, IBM BigInsights, Microsoft HDInsight, Pivotal HD
- Big Data appliances – Oracle Big Data Appliance, IBM PureData System for Hadoop, HP HAVEn, Teradata Aster Discovery Server
- NoSQL databases, e.g. DataStax, Neo4j, Cassandra, MongoDB, Riak
- Analytical databases and DW appliances, e.g. Teradata, Exasol, IBM PureData, Oracle Exadata, SAP HANA, Kognitio, Actian Matrix
- Analytical appliances – SAS LASR, MicroStrategy
PRIME
• The Cloud deployment option – Microsoft Windows
Azure, IBM, Amazon Elastic MapReduce, Altiscale
Data Cloud, Qubole
• Creating a Multi-Platform analytical ecosystem
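The MapReduce model referred to throughout this session (and which Pig and Hive queries are compiled down to) can be illustrated with a toy word count in plain Python. This is a sketch of the map, shuffle and reduce stages only, not a Hadoop program; the input lines are invented:

```python
from itertools import groupby
from operator import itemgetter

def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield (word, 1)

def shuffle(pairs):
    """Shuffle: group intermediate pairs by key, as the framework would."""
    return groupby(sorted(pairs, key=itemgetter(0)), key=itemgetter(0))

def reduce_phase(grouped):
    """Reduce: sum the counts for each word."""
    return {word: sum(count for _, count in vals) for word, vals in grouped}

lines = ["big data and analytics", "big data platforms"]
counts = reduce_phase(shuffle(map_phase(lines)))
# counts["big"] == 2, counts["data"] == 2, counts["platforms"] == 1
```

In Hadoop the map and reduce functions run in parallel across the cluster and the shuffle moves data between nodes, but the logical flow is the same as above.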
4. Big Data Integration and Governance in a Multi-Platform Analytical Environment
This session will look at the challenge of integrating Big Data and the unique issues it raises. How do you deal with very large data volumes and different varieties of data? How does loading data into Hadoop differ from loading data into analytical relational databases? What about NoSQL databases? How should low-latency data be handled?
• Types of Big Data
• Connecting to Big Data sources, e.g. Web logs, clickstream, sensor data, unstructured and semi-structured content
• Supplying consistent data to multiple analytical platforms
• Loading Big Data – what’s different about loading HDFS, Hive & NoSQL vs analytical relational databases
• Change data capture – what’s possible
• Data Warehouse offload
• Tools for ELT processing on Hadoop – The Enterprise Data Refinery
• ETL tools vs Pig vs self-service DI/DQ
• Parsing unstructured data
• Governing data in a Data Science environment
• Joined-up analytical processing from ETL to analytical workflows
• The impact of data scientist and end-user self-service DQ/DI – Paxata, Trifacta, MS Excel, MicroStrategy
• Mapping discovered data of value into your DW and
business vocabulary
• Big Data audit, protection and security – Dataguise, IBM Guardium, Protegrity
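Parsing unstructured data, one of the topics above, often starts with extracting structure from raw web server logs before loading them into HDFS or a warehouse. A minimal sketch using Python's standard regex library is shown below; the layout matches the common Apache log format, and the sample line is invented:

```python
import re

# Common Log Format: host ident user [timestamp] "request" status bytes
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+)'
)

def parse_log_line(line):
    """Turn one raw log line into a dict of named fields, or None."""
    m = LOG_PATTERN.match(line)
    return m.groupdict() if m else None

# Invented sample line in Apache Common Log Format
sample = '192.168.1.9 - - [03/Jun/2015:09:30:00 +0200] "GET /index.html HTTP/1.1" 200 5120'
record = parse_log_line(sample)
# record["status"] == "200", record["path"] == "/index.html"
```

Commercial DI/DQ tools and Hadoop-based parsers do the same job at scale, but the essence is this kind of pattern-driven field extraction.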
5. Tools and Techniques for Analysing Big Data
This session looks at tools available for both Data Scientists and traditional DW/BI Professionals. It looks at how both types of developers can exploit Big Data platforms such as Hadoop and NoSQL databases using programming techniques and traditional BI tools, as well as how vendors are making it easier to gain access to both the NoSQL/Hadoop world and the Analytical RDBMS world by using data virtualisation.
• Data Science projects
• Creating Sandboxes for Data Science projects
• Options for analysing unstructured content – text analytics, custom MapReduce code and MapReduce developer tools
• Using R as an analytical language for Big Data
• Text analysis and visualisation, Sentiment analysis
and visualisation
• Clickstream analysis and visualisation
• Analysing big data using MapReduce and Spark
• BI tools and applications for Hadoop, e.g. Datameer, FICO Karmasphere, Platfora, IBM Customer Insight
• Exploratory graph analysis and visualisations
• Using search to analyse multi-structured data
- Creating search indexes on multi-structured data
- Building dashboards and reports on top of search
engine indexed content
- The integration of search with traditional BI platforms
- Guided analysis using multi-faceted search
- The marketplace: Apache Solr, Attivio, Cloudera Search, Connexica, DataRPM, HP IDOL, IBI WebFOCUS Magnify, IBM Watson Explorer, LucidWorks, Microsoft, Oracle Endeca, Quid, Splunk
• Analysing Big Data using Self-Service BI Tools, e.g.
Tableau, QlikView, Spotfire, SAS Visual Analytics,
MicroStrategy, SAP Lumira
• Big data analytics – query performance enablers
• Managing stream computing in a Big Data environment
• Tools and techniques for streaming analytics
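The search-based analysis bullets above all rest on one core data structure: the inverted index, which maps each term to the documents that contain it. The sketch below shows the idea in plain Python with invented documents; engines such as Apache Solr build far richer versions with ranking, facets and tokenisation:

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, *terms):
    """Return ids of documents containing ALL the given terms (AND query)."""
    sets = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*sets) if sets else set()

# Invented multi-structured content, e.g. call-centre notes
docs = {
    1: "customer complaint about late delivery",
    2: "positive customer review",
    3: "late shipment complaint",
}
index = build_inverted_index(docs)
hits = search(index, "customer", "complaint")
# hits == {1}
```

Dashboards and guided, multi-faceted navigation are then layers of filtering and aggregation built on top of exactly this kind of index.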
6. Integrating Big Data Analytics into the Enterprise
This session looks at how new Big Data platforms can be integrated with traditional Data Warehouses and Data Marts. It looks at Stream Processing, Hadoop, NoSQL databases and Data Warehouse appliances, and shows how to put them together to maximise business value from Big Data Analytics.
• Integrating Big Data platforms with traditional DW/BI environments – what’s involved
• Integrating Stream Processing with Hadoop and Analytical DW Appliances
• Integrating Hadoop with DW Appliances and Enterprise Data Warehouses
• Tying together front-end tools
• Options for implementing Multi-Platform Analytics
• Cross-Platform Analytical workflows
• The role of Data Virtualisation in a Big Data environment
• Multi-Platform optimisation
Data Virtualization in Practice
As more and more data becomes available to people and applications inside the enterprise, the challenge of integrating that data is becoming more complex. Data is now distributed across multiple on-premises and Cloud-based applications, while new high-value data is becoming available in external data sources. This makes data harder to access and integrate. This new 1-day seminar addresses this issue by introducing Data Virtualization and showing why this technology is now a key component of any information architecture. It discusses what Data Virtualization is, how you can use this technology to simplify access to data, and how it can be used to increase agility and reduce time to value.
LEARNING OBJECTIVES
After attending this seminar you should understand what Data Virtualization is, how it works, which tools are in the marketplace, how Data Virtualization changes your Data Architecture and how it can simplify access to data.
Audience
• Data Architects
• DBAs
• BI Professionals
• Business Analysts
• IT Managers
ABOUT THIS SEMINAR
1. An Introduction to Data Virtualization
• This part of the course introduces Data Virtualization: how it works, the business benefit that can be realized from using it, and the products that are available on the market.
• What is Data Virtualization?
• Why is it needed? What business problems does it
solve?
• How Data Virtualization works
• Data Virtualization product types
• The Data Virtualization marketplace – Cisco, Denodo, IBM, Informatica Data Services, RedHat, SAS, StoneBond…
2. Implementing Data Virtualization
• This session continues with a look at how you can implement Data Virtualization, including the importance of common data definitions, building virtual layers, and how to manage performance and security.
• Business Glossaries and Data Virtualization
• Why is common metadata important in virtual data
views?
• Virtual data models and importing data models from
other tools
• Implementing virtual tables and nested table layering
• Mapping physical data to virtual data
• Managing performance
- Using data caching and snapshots in a Data Virtualization deployment
- Using Data Virtualization with in-memory computing
- Using multiple instances of Data Virtualization
servers
- Distributed Data Virtualization configurations
• Managing data security using Data Virtualization
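The core ideas of this session — a virtual table that maps onto physical sources, plus caching to manage performance — can be sketched in a few lines of plain Python. The two "sources" below are in-memory stand-ins invented for illustration; a real Data Virtualization server federates live databases with full SQL support:

```python
class VirtualTable:
    """A virtual view that federates rows from several physical
    sources on demand, with an optional cached snapshot."""

    def __init__(self, *sources):
        self._sources = sources   # callables returning lists of rows
        self._cache = None

    def rows(self, use_cache=False):
        if use_cache and self._cache is not None:
            return self._cache    # serve the snapshot, skip the sources
        data = [row for source in self._sources for row in source()]
        if use_cache:
            self._cache = data    # materialise the snapshot
        return data

# Hypothetical stand-ins for an on-premises DW and a cloud application
dw_customers = lambda: [{"id": 1, "name": "ACME"}]
cloud_customers = lambda: [{"id": 2, "name": "Globex"}]

customers = VirtualTable(dw_customers, cloud_customers)
all_rows = customers.rows()
# all_rows contains one row from each source
```

Consumers query only the virtual table; where each row physically lives, and whether it was served from cache or fetched live, stays hidden behind the view — which is exactly the decoupling that makes the use cases in the next session possible.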
3. Popular Data Virtualization Use Cases to Maximise Business Value
• Using Data Virtualization in Data warehousing
- Virtual ODSs
- Virtual Data Marts
- Virtual Data Sources
• Data Virtualization Patterns for integrating BI and
CPM
• Using Data Virtualization in MDM
• Building an information services layer using Data
Virtualization
• Using Data Virtualization with operational systems
• Integrating cloud and on-premises data for use in Cloud Computing
• Using Data Virtualization in a Big Data Architecture
• Using Data Virtualization with other data management tools and in data governance
MIKE FERGUSON
BIG DATA AND ANALYTICS
Rome, June 3-4, 2015 - Residenza di Ripetta, Via di Ripetta, 231 - Registration fee: € 1200
DATA VIRTUALIZATION IN PRACTICE
Rome, June 5, 2015 - Residenza di Ripetta, Via di Ripetta, 231 - Registration fee: € 700
BOTH SEMINARS
Special price for the delegates who attend both seminars: € 1800
If anyone registered is unable to attend, or in case of cancellation of the seminar, the general conditions mentioned before are applicable.
first name ...............................................................
surname .................................................................
job title ...................................................................
organisation ...........................................................
address ..................................................................
postcode ................................................................
city .........................................................................
country ...................................................................
telephone ...............................................................
fax ..........................................................................
e-mail .....................................................................
✂
Send your registration form with the receipt of the payment to:
Technology Transfer S.r.l.
Piazza Cavour, 3 - 00193 Rome (Italy)
Tel. +39-06-6832227 - Fax +39-06-6871102
info@technologytransfer.it
www.technologytransfer.it
Stamp and signature
INFORMATION
PARTICIPATION FEE
Big Data and Analytics
€ 1200
Data Virtualization in Practice
€ 700
Special price for the delegates who attend both seminars:
€ 1800
The fee includes all seminar documentation, luncheon and coffee breaks.
VENUE
Residenza di Ripetta
Via di Ripetta, 231
Rome (Italy)
SEMINAR TIMETABLE
9.30 am - 1.00 pm
2.00 pm - 5.00 pm
HOW TO REGISTER
You must send the registration form with the receipt of the payment to:
TECHNOLOGY TRANSFER S.r.l.
Piazza Cavour, 3 - 00193 Rome (Italy)
Fax +39-06-6871102
within May 19, 2015
PAYMENT
Wire transfer to:
Technology Transfer S.r.l.
Banca: Cariparma
Agenzia 1 di Roma
IBAN Code: IT 03 W 06230 03202 000057031348
BIC/SWIFT: CRPPIT2P546
GENERAL CONDITIONS
DISCOUNT
The participants who register 30 days before the seminar are entitled to a 5% discount.
If a company registers 5 participants for the same seminar, it will pay for only 4.
Those who benefit from this discount are not entitled to other discounts for the same seminar.
CANCELLATION POLICY
A full refund is given for any cancellation received more than 15 days before the seminar starts. Cancellations less than 15 days prior to the event are liable for 50% of the fee. Cancellations less than one week prior to the event date will be liable for the full fee.
CANCELLATION LIABILITY
In the case of cancellation of an event for any reason, Technology Transfer’s liability is limited to the return of the registration fee only.
Mike Ferguson is the Managing Director of Intelligent Business Strategies Ltd. As an independent analyst and consultant he specialises in Business Intelligence, Enterprise Business Integration and Enterprise Data Management. With over 30 years of IT experience, Mr. Ferguson has consulted for dozens of companies, spoken at events all over the world and written numerous articles. He is also an expert on the B-EYE-Network. Prior to founding Intelligent Business Strategies, he was a member of NCR’s worldwide product strategy and architecture team, working as a Chief Architect on the Teradata DBMS. He spent four years as a principal and co-founder of Codd and Date Europe Limited – the inventors of the Relational Model – specialising in IBM’s DB2 product, and was a partner and European Managing Director at DataBase Associates.
SPEAKER