Data Warehouses - kkolac
TRANSCRIPT
![Page 1: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/1.jpg)
Data Warehousing
with Oracle
Međimurje IPC d.d. Krešimir Kolac, [email protected]
24.07.2007
![Page 2: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/2.jpg)
Overview
Why data warehouses?
What are data warehouses?
Differences between OLTP and OLAP
Basic concepts (OLAP, BI, data marts, etc.)
Oracle and data warehouses
![Page 3: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/3.jpg)
Transactional systems (relational databases)
Optimized for fast transactional work
Fast storing of transactions (update, insert)
Normalized models of complex processes, spread over many tables
Redundancy reduced to a minimum
Large volumes of detailed (transaction-level) data
High reliability and availability
![Page 4: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/4.jpg)
Problems
Normalized data is not suited to fast reporting or to analyzing data across several years
Queries that join many tables often take a very long time
To keep the system fast, historical data is often separated from the working data
![Page 5: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/5.jpg)
Problems – heterogeneous environment
"Heterogeneities are everywhere"
[Figure labels: Personal Databases, Digital Libraries, Scientific Databases, World Wide Web]
• Systems from different vendors
• Different data models
• Redundant, inconsistent data
![Page 6: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/6.jpg)
Goal – unified access to data
Integration System
Collecting and combining information
A uniform, integrated interface for data analysis
[Figure labels: World Wide Web, Digital Libraries, Scientific Databases, Personal Databases]
![Page 7: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/7.jpg)
Solutions through history
EIS – Executive information systems
Predate relational databases; systems written in structured programming languages, whose output was limited to assorted tables and totals.
DSS – Decision support systems
The traditional approach: extracting information from relational databases with SQL queries
Data warehousing (DW) and business intelligence (BI)
![Page 8: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/8.jpg)
Data warehouses (DW)
A data warehouse is a relational database responsible for collecting historical data from various transactional systems.
It is designed for fast query execution and for analyzing the data it holds.
Data warehouses can store both detailed and summary data at different levels of granularity. The data is denormalized. Redundancy is welcome.
Analyzing and using the warehouse is separated in time from the periods when the warehouse is being loaded.
Unlike plain relational databases, data warehouses include:
• solutions for extracting, transforming and loading data (the ETL process)
• online analytical processing (OLAP) and data mining capabilities
• client tools for analysis and reporting
![Page 9: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/9.jpg)
Differences between transactional systems and DW
Standard DB (OLTP)
Mostly update, insert, delete
Many small transactions
MB to GB of data
Normalized data, minimal redundancy
Current snapshot
Raw data
Thousands of users (operators)
Warehouse (OLAP)
Mostly reads; during loading, mostly inserts
Long and complex queries
GB to TB of data
Denormalized data
History: many snapshots of the database over time
Summarized, cleansed data
Hundreds of users (decision makers, analysts)
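The contrast between the two columns can be illustrated with a small sketch. It uses Python's built-in `sqlite3` module rather than Oracle, and every table and column name (`customer`, `order_line`, `sales_dw`, and so on) is a hypothetical example, not something taken from the presentation. The same yearly report is answered once from normalized OLTP-style tables, which needs joins, and once from a single denormalized warehouse-style table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized OLTP-style tables (hypothetical names)
CREATE TABLE customer   (customer_id INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE product    (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE orders     (order_id INTEGER PRIMARY KEY, customer_id INTEGER, order_year INTEGER);
CREATE TABLE order_line (order_id INTEGER, product_id INTEGER, amount REAL);

INSERT INTO customer   VALUES (1, 'Zagreb'), (2, 'Čakovec');
INSERT INTO product    VALUES (10, 'Tools'), (11, 'Paint');
INSERT INTO orders     VALUES (100, 1, 2006), (101, 2, 2006), (102, 1, 2007);
INSERT INTO order_line VALUES (100, 10, 50.0), (100, 11, 20.0), (101, 10, 30.0), (102, 11, 40.0);

-- Denormalized warehouse-style table: one wide, redundant row per order line
CREATE TABLE sales_dw AS
SELECT o.order_year, c.city, p.category, l.amount
FROM order_line l
JOIN orders o   ON o.order_id = l.order_id
JOIN customer c ON c.customer_id = o.customer_id
JOIN product p  ON p.product_id = l.product_id;
""")

# Report on the normalized model: every question repeats the joins.
oltp_rows = conn.execute("""
SELECT o.order_year, p.category, SUM(l.amount)
FROM order_line l
JOIN orders o  ON o.order_id = l.order_id
JOIN product p ON p.product_id = l.product_id
GROUP BY o.order_year, p.category
ORDER BY o.order_year, p.category
""").fetchall()

# Same report on the denormalized table: a single-table aggregate.
dw_rows = conn.execute("""
SELECT order_year, category, SUM(amount)
FROM sales_dw
GROUP BY order_year, category
ORDER BY order_year, category
""").fetchall()

print(oltp_rows == dw_rows, dw_rows)
```

Both queries return the same totals; the warehouse version trades storage and redundancy for a simpler, join-free read path.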
![Page 10: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/10.jpg)
Basic approach
![Page 11: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/11.jpg)
Data Warehouse Architecture (with a Staging Area)
![Page 12: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/12.jpg)
Data Warehouse Architecture (with a Staging Area and Data Marts)
![Page 13: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/13.jpg)
Data warehouse architecture
[Figure: three-tier architecture. Information sources (operational DBs, semistructured sources) are extracted, transformed, loaded, refreshed, etc. into the data warehouse server (Tier 1: data warehouse and data marts), which serves the OLAP servers (Tier 2: e.g., MOLAP, ROLAP), which in turn serve the clients (Tier 3: analysis, query/reporting, data mining).]
![Page 14: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/14.jpg)
Data Mart
A smaller part of the data warehouse that serves a specific business function:
• Finance
• Sales
• Procurement
There are two kinds of data marts:
Dependent data mart
• Loaded from the data warehouse
• Simple ETL
Independent data mart
• Sits directly on top of the transactional systems
• Complicated ETL
![Page 15: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/15.jpg)
Data Warehouse vs. Data Marts
Enterprise warehouse: collects all information about subjects (customers, products, sales, assets, personnel) that span the entire organization
Requires extensive business modeling (may take years to design and build)
Data marts: departmental subsets that focus on selected subjects
Marketing data mart: customer, product, sales
Faster rollout (a few months to build), but complex integration in the long run
Virtual warehouse: views over operational databases
Materialize selected summary views for efficient query processing
Easy to build, but requires excess capacity on the operational database servers
![Page 16: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/16.jpg)
Business, logical and dimensional modeling of the DW
Star schema model
Snowflake schema model
Dimensions (SCD), hierarchies, levels
Fact tables (factless fact table)
Source model
![Page 17: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/17.jpg)
Star schema
Fully denormalized dimensions
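A minimal sketch of a star schema, assuming a hypothetical sales subject area (the names `fact_sales`, `dim_product` and `dim_time` are illustrative and not from the presentation). It runs on Python's built-in `sqlite3` rather than Oracle. The point to notice is that the product dimension stores the category name inline, so every dimension is exactly one join away from the fact table.

```python
import sqlite3

star = sqlite3.connect(":memory:")
star.executescript("""
CREATE TABLE dim_product (
    product_key   INTEGER PRIMARY KEY,
    product_name  TEXT,
    category_name TEXT           -- category kept inline: fully denormalized
);
CREATE TABLE dim_time (
    time_key      INTEGER PRIMARY KEY,
    calendar_date TEXT,
    month         INTEGER,
    year          INTEGER
);
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    time_key    INTEGER REFERENCES dim_time(time_key),
    quantity    INTEGER,
    net_amount  REAL
);
INSERT INTO dim_product VALUES (1, 'Hammer', 'Tools'), (2, 'Drill', 'Tools'), (3, 'Varnish', 'Paint');
INSERT INTO dim_time VALUES (20070101, '2007-01-01', 1, 2007), (20070201, '2007-02-01', 2, 2007);
INSERT INTO fact_sales VALUES
    (1, 20070101, 2, 20.0), (2, 20070101, 1, 80.0),
    (3, 20070201, 5, 50.0), (1, 20070201, 1, 10.0);
""")

# A typical star query: the fact table joined to each dimension once.
star_rows = star.execute("""
SELECT p.category_name, t.month, SUM(f.net_amount)
FROM fact_sales f
JOIN dim_product p ON p.product_key = f.product_key
JOIN dim_time t    ON t.time_key = f.time_key
GROUP BY p.category_name, t.month
ORDER BY p.category_name, t.month
""").fetchall()
print(star_rows)
```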
![Page 18: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/18.jpg)
Star schema
![Page 19: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/19.jpg)
Snowflake schema
Dimensions are not fully denormalized
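For comparison, a snowflake variant can be sketched as follows, again with hypothetical table names and Python's built-in `sqlite3` standing in for Oracle. Here the category is split out of the product dimension into its own table, so reaching it from the fact table takes one extra join.

```python
import sqlite3

snow = sqlite3.connect(":memory:")
snow.executescript("""
-- The category level is normalized out of the product dimension.
CREATE TABLE dim_category (
    category_key  INTEGER PRIMARY KEY,
    category_name TEXT
);
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    category_key INTEGER REFERENCES dim_category(category_key)
);
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    net_amount  REAL
);
INSERT INTO dim_category VALUES (1, 'Tools'), (2, 'Paint');
INSERT INTO dim_product  VALUES (1, 'Hammer', 1), (2, 'Drill', 1), (3, 'Varnish', 2);
INSERT INTO fact_sales   VALUES (1, 20.0), (2, 80.0), (3, 50.0);
""")

# Reaching the category now needs two joins instead of one.
snow_rows = snow.execute("""
SELECT c.category_name, SUM(f.net_amount)
FROM fact_sales f
JOIN dim_product p  ON p.product_key = f.product_key
JOIN dim_category c ON c.category_key = p.category_key
GROUP BY c.category_name
ORDER BY c.category_name
""").fetchall()
print(snow_rows)
```

The design trade-off is less redundancy in the dimension data in exchange for more joins at query time.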
![Page 20: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/20.jpg)
Snowflake schema
![Page 21: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/21.jpg)
Dimension modeling
Hierarchies
Levels
Granularity
Slowly Changing Dimensions
Time dimension
Critical for a data warehouse – it is important to choose a suitable level of granularity
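As an illustration of a time dimension with a day-level grain and a day, month, quarter, year hierarchy, here is a small generator sketch in Python. The function name `build_time_dimension`, the column names and the `YYYYMMDD` integer key are all assumptions made for this example; a real warehouse would choose the grain to match its fact tables.

```python
from datetime import date, timedelta

def build_time_dimension(start, end):
    """Return one row per calendar day between start and end (inclusive).

    Each row carries the higher hierarchy levels (month, quarter, year)
    so that facts stored per day can be rolled up without date arithmetic.
    """
    rows = []
    current = start
    while current <= end:
        rows.append({
            "time_key": int(current.strftime("%Y%m%d")),  # e.g. 20070724
            "calendar_date": current.isoformat(),
            "month": current.month,
            "quarter": (current.month - 1) // 3 + 1,
            "year": current.year,
        })
        current += timedelta(days=1)
    return rows

time_dim = build_time_dimension(date(2007, 1, 1), date(2007, 12, 31))
print(len(time_dim), time_dim[0])
```

A coarser grain (for example one row per month) would make the dimension smaller but would rule out daily analysis later, which is why the choice matters.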
![Page 22: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/22.jpg)
Dimension modeling
![Page 23: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/23.jpg)
Slowly Changing Dimensions
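The slide's original figure is not available in this transcript. As a stand-in, here is a sketch of one common way to handle a slowly changing dimension, usually called Type 2: instead of overwriting an attribute, the current row is closed and a new version is appended. The function `apply_scd2`, the column names and the open-ended date `9999-12-31` are assumptions for this example, not the author's design.

```python
from datetime import date

OPEN_END = date(9999, 12, 31)  # conventional "still valid" marker (assumption)

def apply_scd2(dimension, natural_key, new_attrs, change_date):
    """Type 2 change: close the current version and append a new one."""
    current = next(
        (r for r in dimension
         if r["customer_id"] == natural_key and r["valid_to"] == OPEN_END),
        None,
    )
    if current is not None:
        if all(current[k] == v for k, v in new_attrs.items()):
            return dimension  # nothing changed, keep the current version
        current["valid_to"] = change_date  # close the old version
    next_key = max((r["customer_key"] for r in dimension), default=0) + 1
    new_row = dict(current) if current is not None else {"customer_id": natural_key}
    new_row.update(new_attrs)
    new_row.update({"customer_key": next_key,
                    "valid_from": change_date,
                    "valid_to": OPEN_END})
    dimension.append(new_row)
    return dimension

customers = [{"customer_key": 1, "customer_id": "C1", "city": "Zagreb",
              "valid_from": date(2005, 1, 1), "valid_to": OPEN_END}]
apply_scd2(customers, "C1", {"city": "Čakovec"}, date(2007, 7, 24))
for row in customers:
    print(row)
```

Facts loaded before the change keep pointing at surrogate key 1 and facts loaded afterwards use key 2, so history stays reportable as it was at the time.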
![Page 24: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/24.jpg)
Fact table modeling
Measures
Quantity
Price
Gross/net amount
Tax
Factless fact table
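A short sketch of the two kinds of fact table named on this slide, using Python's built-in `sqlite3`. The table names, the promotion scenario and the 25% tax rate are hypothetical choices for illustration. `fact_sales` carries the listed measures, while `fact_promotion` is factless: it holds only dimension keys and records that an event occurred.

```python
import sqlite3

facts = sqlite3.connect(":memory:")
facts.executescript("""
-- Fact table with numeric measures
CREATE TABLE fact_sales (
    product_key  INTEGER,
    time_key     INTEGER,
    quantity     INTEGER,
    unit_price   REAL,
    net_amount   REAL,
    tax          REAL,
    gross_amount REAL
);
-- Factless fact table: keys only, one row per "product was on promotion that day"
CREATE TABLE fact_promotion (
    product_key INTEGER,
    time_key    INTEGER
);
INSERT INTO fact_sales VALUES
    (1, 20070101, 2, 10.0, 20.0, 5.0, 25.0),
    (2, 20070101, 1, 80.0, 80.0, 20.0, 100.0);
INSERT INTO fact_promotion VALUES (1, 20070101), (2, 20070101), (3, 20070101);
""")

# Measures are meant to be aggregated.
total_quantity, total_gross = facts.execute(
    "SELECT SUM(quantity), SUM(gross_amount) FROM fact_sales"
).fetchone()

# The factless table answers coverage questions, e.g. promoted but not sold.
promoted_not_sold = facts.execute("""
SELECT pr.product_key
FROM fact_promotion pr
LEFT JOIN fact_sales s
       ON s.product_key = pr.product_key AND s.time_key = pr.time_key
WHERE s.product_key IS NULL
""").fetchall()
print(total_quantity, total_gross, promoted_not_sold)
```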
![Page 25: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/25.jpg)
ETL process (extract, transform, load)
Extracting data from transactional systems and flat files
Cleansing, denormalizing and reshaping the data to fit business needs
Loading the data warehouse
Most of the work in building a data warehouse goes into the ETL processes
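The three ETL steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the flat file is an inline CSV string, the target is an in-memory SQLite table named `sales_dw`, and the cleansing rules (trim whitespace, normalize city capitalization, reject rows without an amount) are invented for the example. In the toolset described later in the deck this work would be modeled in Oracle Warehouse Builder instead.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file extract with typical quality problems.
SOURCE_CSV = """order_id,customer,city,amount
1, Ana ,zagreb,100.50
2,Ivan,ČAKOVEC,
3,Ana,Zagreb,49.50
"""

def extract(text):
    """Extract: read raw rows from the flat file."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: cleanse and reshape rows for the warehouse."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # reject rows with a missing measure
        cleaned.append((
            int(row["order_id"]),
            row["customer"].strip(),
            row["city"].strip().title(),  # 'zagreb' -> 'Zagreb'
            float(amount),
        ))
    return cleaned

def load(conn, rows):
    """Load: insert the cleansed rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_dw "
        "(order_id INTEGER, customer TEXT, city TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales_dw VALUES (?, ?, ?, ?)", rows)
    conn.commit()

warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract(SOURCE_CSV)))
loaded = warehouse.execute(
    "SELECT customer, city, amount FROM sales_dw ORDER BY order_id"
).fetchall()
print(loaded)
```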
![Page 26: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/26.jpg)
OLAP – Online analytical processing
"The technique for synthesizing data is OLAP (Online Analytical Processing). Data stored in the DW is optimized primarily for storage, so any work with it takes considerable time. But users of a BI system are in a hurry and cannot wait for data to be fetched and analyzed. That is where OLAP comes in: it 'sits' between the DW and the user and enables fast data analysis. How is this achieved? Simply: OLAP anticipates all the required analyses, computes them in advance, stores them and hands them to the user on request." INFOTREND
An information system for fast, consistent and interactive access to, and manipulation of, multidimensional data that comes from different sources and is stored in a data warehouse.
OLAP functionality is delivered through multidimensional analysis of consolidated corporate data, which includes:
• modeling with dimensions and data hierarchies
• trend analysis over given time periods
• projecting data through what-if scenarios
• working with subsets of data
• drilling down to lower levels of data detail
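The idea of computing the analyses in advance can be sketched in plain Python: for a small fact set, every combination of dimensions is aggregated once, and later questions become dictionary lookups. The function `precompute_cube`, the sample rows and the dimension names are assumptions for this example; a real OLAP engine is far more selective about what it precomputes and how it stores it.

```python
from collections import defaultdict
from itertools import combinations

DIMENSIONS = ("year", "city", "category")

# Hypothetical detail rows.
sales_rows = [
    {"year": 2006, "city": "Zagreb",  "category": "Tools", "amount": 50},
    {"year": 2006, "city": "Zagreb",  "category": "Paint", "amount": 20},
    {"year": 2006, "city": "Čakovec", "category": "Tools", "amount": 30},
    {"year": 2007, "city": "Zagreb",  "category": "Paint", "amount": 40},
]

def precompute_cube(rows, dimensions, measure):
    """Aggregate the measure for every subset of the dimensions."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            totals = defaultdict(int)
            for row in rows:
                totals[tuple(row[d] for d in dims)] += row[measure]
            cube[dims] = dict(totals)
    return cube

cube = precompute_cube(sales_rows, DIMENSIONS, "amount")

# Answering a question is now a lookup, not a scan of the detail rows.
print(cube[("year",)])
print(cube[("city", "category")][("Zagreb", "Paint")])
```

With n dimensions there are 2^n such aggregates (8 here), which is why the storage cost of full precomputation grows quickly.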
![Page 27: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/27.jpg)
Implementation approaches
MOLAP (Multidimensional OLAP)
• Data is stored in a multidimensional database called an Analytic Workspace (AW)
• In Oracle 10g these are system tables holding binary data, unreadable without a dedicated tool
ROLAP (Relational OLAP)
• Data is stored in a relational database
• Multidimensionality and precomputed totals are implemented with materialized views
• Less efficient, but cheaper to implement
HOLAP (Hybrid OLAP)
• Rarely updated data: MDD (multidimensional database)
• Frequently updated data: RDB (relational database)
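The ROLAP approach of precomputed totals can be imitated with a summary table. Note the substitution: SQLite, used here through Python's built-in `sqlite3`, has no materialized views, so this sketch emulates one with `CREATE TABLE ... AS SELECT` plus an explicit refresh function. In Oracle the equivalent object would be created with `CREATE MATERIALIZED VIEW`. All table and function names are hypothetical.

```python
import sqlite3

rolap = sqlite3.connect(":memory:")
rolap.executescript("""
CREATE TABLE fact_sales (year INTEGER, category TEXT, amount REAL);
INSERT INTO fact_sales VALUES
    (2006, 'Tools', 50.0), (2006, 'Tools', 30.0),
    (2006, 'Paint', 20.0), (2007, 'Paint', 40.0);
""")

def refresh_summary(conn):
    """Rebuild the summary table (stand-in for a complete MV refresh)."""
    conn.executescript("""
    DROP TABLE IF EXISTS mv_sales_by_year_category;
    CREATE TABLE mv_sales_by_year_category AS
    SELECT year, category, SUM(amount) AS total_amount, COUNT(*) AS row_count
    FROM fact_sales
    GROUP BY year, category;
    """)

def paint_2007(conn):
    """Read one precomputed total from the summary table."""
    return conn.execute(
        "SELECT total_amount FROM mv_sales_by_year_category "
        "WHERE year = 2007 AND category = 'Paint'"
    ).fetchone()[0]

refresh_summary(rolap)
before = paint_2007(rolap)

# New detail rows do not reach the summary until it is refreshed.
rolap.execute("INSERT INTO fact_sales VALUES (2007, 'Paint', 10.0)")
stale = paint_2007(rolap)
refresh_summary(rolap)
fresh = paint_2007(rolap)
print(before, stale, fresh)
```

The stale read between load and refresh is the practical reason warehouse loading and warehouse querying are usually scheduled at different times.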
![Page 28: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/28.jpg)
From OLTP to OLAP
![Page 29: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/29.jpg)
OLAP "cube"
![Page 30: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/30.jpg)
OLAP operations
[Figure labels: Single Cell, Multiple Cells, Slice, Dice, Roll Up, Drill Down]
![Page 31: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/31.jpg)
OLAP operations
Simple query – a single cell in the cube
Slice – look at a subcube to get more specific information
Dice – rotate the cube to look at another dimension (in most OLAP literature this rotation is called pivot, and dice means cutting a subcube on two or more dimensions)
Roll Up – dimension reduction; aggregation
Drill Down
Visualization: these operations allow OLAP users to actually "see" the results of an operation.
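The operations listed on this slide can be sketched on a tiny in-memory cube. This is an illustrative Python example with invented data and function names. Dice is implemented here in its common sense of restricting several dimensions at once, and drill down is simply a roll up that keeps one more hierarchy level.

```python
from collections import defaultdict

# Hypothetical cube cells: one row per (year, quarter, city) combination.
cells = [
    {"year": 2006, "quarter": "Q1", "city": "Zagreb",  "amount": 50},
    {"year": 2006, "quarter": "Q2", "city": "Zagreb",  "amount": 20},
    {"year": 2006, "quarter": "Q1", "city": "Čakovec", "amount": 30},
    {"year": 2007, "quarter": "Q1", "city": "Zagreb",  "amount": 40},
]

def slice_cube(rows, dim, value):
    """Slice: fix one dimension to a single value."""
    return [r for r in rows if r[dim] == value]

def dice_cube(rows, **allowed):
    """Dice: keep cells whose values fall in the allowed set for each dimension."""
    return [r for r in rows if all(r[d] in vals for d, vals in allowed.items())]

def roll_up(rows, keep_dims, measure="amount"):
    """Roll up: aggregate the measure, keeping only the listed dimensions."""
    totals = defaultdict(int)
    for r in rows:
        totals[tuple(r[d] for d in keep_dims)] += r[measure]
    return dict(totals)

zagreb = slice_cube(cells, "city", "Zagreb")
q1_2006 = dice_cube(cells, year={2006}, quarter={"Q1"})
by_year = roll_up(cells, ("year",))
by_year_quarter = roll_up(cells, ("year", "quarter"))  # drill down from year

print(len(zagreb), len(q1_2006))
print(by_year)
print(by_year_quarter)
```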
![Page 32: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/32.jpg)
OLAP in Oracle
Oracle OLAP 10g components:
• The OLAP analytic engine
• Analytic workspaces (AWs)
• Analytic Workspace Manager (AWM)
• OLAP Worksheet
• OLAP Catalog
• Interfaces for developing OLAP applications in SQL and Java
Oracle tools for AW administration and querying:
• Analytic Workspace Manager (AWM)
• Oracle Warehouse Builder (OWB)
• Oracle Discoverer Plus for OLAP
• Oracle BI Spreadsheet Add-In for MS Excel
![Page 33: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/33.jpg)
Data mining
From Wikipedia
Data mining has been defined as "the nontrivial extraction of implicit, previously unknown, and potentially useful information from data" [1] and "the science of extracting useful information from large data sets or databases".
Data mining involves sorting through large amounts of data and picking out relevant information. It is usually used by Business intelligence organizations, and financial analysts, but is increasingly used in the sciences to extract information from the enormous data sets generated by modern experimental and observational methods.
![Page 34: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/34.jpg)
Oracle Data Mining
Oracle Data Mining (ODM)—an option to Oracle Database 10g Enterprise Edition—enables you to produce actionable predictive information and build integrated business intelligence applications. Using data mining functionality embedded in Oracle Database 10g, you can find patterns and insights hidden in your data. Application developers and integrators can quickly automate the distribution of new business intelligence—predictions, patterns and discoveries—throughout your organization.
Oracle Data Mining enables business decision makers, data analysts, integrators, and IT to extract greater value from corporate data resulting in better informed business decisions that address a wide range of business problems.
GUI: Oracle Data Miner
![Page 35: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/35.jpg)
Oracle & Data Warehousing
By acquiring Siebel Systems, Oracle gained an alternative to its own BI (Business Intelligence) tools. Oracle's offering now contains two BI tool suites:
Oracle Business Intelligence Enterprise Edition (the Siebel tools)
Oracle Business Intelligence Standard Edition (Oracle's own tools)
![Page 36: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/36.jpg)
Oracle Business Intelligence Standard Edition (SE)
Oracle BI Discoverer (relational and OLAP) – an OLAP tool for browsing and analyzing data in the warehouse. It can connect to both relational OLAP (ROLAP) and multidimensional OLAP (MOLAP)
Oracle BI Warehouse Builder – design, creation and loading of the data warehouse
Oracle BI Spreadsheet Add-in – an Excel add-in that lets Excel connect to OLAP cubes
Oracle BI Beans – for developing BI applications
Oracle Reports Services – reporting tool
![Page 37: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/37.jpg)
Oracle Business Intelligence Enterprise Edition (EE)
Oracle BI Server
Oracle BI Answers – data browsing and analysis
Oracle BI Interactive Dashboards – portal
Oracle BI Delivers – monitoring and alerts
Oracle BI Disconnected Analytics – for offline access
Oracle BI Publisher (XML Publisher) – reporting
Oracle BI Briefing Books – for sharing dashboard documents offline
![Page 38: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/38.jpg)
We use
Oracle Database 10gR2 with the OLAP option
Oracle Warehouse Builder 10gR2
Oracle BI Application Server 10gR2 (Discoverer Plus, Discoverer Viewer)
Oracle BI Spreadsheet Add-In 10g
Oracle Workflow
![Page 39: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/39.jpg)
Međimurje IPC d.d.
system architecture
![Page 40: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/40.jpg)
Installations
Oracle Warehouse Builder 10gR2: \\kolac\programi\OWB10gR2
Oracle BI Excel Add-In: \\kolac\programi\Oracle AS&BI\OBISpAddinInst_10.1.2.2.10.exe
Other: \\kolac\programi\Oracle AS&BI\
Oracle BI App Server (Discoverer Plus & Viewer): http://hermes.ipcck.hr:7779/
Examples: http://www.oracle.com/technology/obe/obe_bi/bi.html
Documentation: http://www.oracle.com/pls/db102/portal.portal_db?selected=6
![Page 41: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/41.jpg)
Schemas
OWB repository owner: wbgazda/wbgazda@zeus
OWB repository user: wbkor/wbkor@zeus
Oracle Workflow: OWF_MGR/OWF_MGR@zeus
DW (ROLAP): maris_wh/maris_wh@zeus
AW (MOLAP): maris_aw/maris_aw@zeus
EUL for ROLAP Discoverer: disco/disco@zeus
Logging in to OWB
![Page 42: Skladišta podataka - kkolac](https://reader031.vdocuments.mx/reader031/viewer/2022012400/54773cbf5906b553068b45f0/html5/thumbnails/42.jpg)
This presentation is located at:
\\ipcmaris\SHARE\maris\Projekti\Nove tehnologije\DataWarehose\Škola