OLAP Mining: Mining Multidimensional Data

Post on 29-Nov-2014

TRANSCRIPT

<ul><li> 1. Introduction | Mining for Blocks | Multiple-Level Multidimensional Sequential Patterns | Conclusion and perspectives. OLAP Mining: Mining Multidimensional Data. EXPEDO: LIRMM, Université Montpellier II, France; ETIS, Université Cergy-Pontoise, France; HELP UC, Kuala Lumpur, Malaysia. Feb. 20-21, 2007. EXPEDO | LIRMM-ETIS-HELP UC | OLAP Mining </li>
<li> 2. Outline. 1 Introduction: OLAP and Data Mining; Research Topics on OLAP Mining (EXPEDO). 2 Mining for Blocks: Fuzzy and Crisp Blocks; Generating Blocks; Managing Hierarchies; Visualizing Blocks. 3 Multiple-Level Multidimensional Sequential Patterns: Multidimensional Sequential Patterns; Multiple Level MSP; Implementation. 4 Conclusion and perspectives. </li>
<li> 3. Outline (as on slide 2). </li>
<li> 4. OLAP and KDD. OLTP vs. OLAP. OLAP users: decision makers; complex queries. Current uses: the OLAP framework mainly provides navigation and reporting tools (pull). Need for data mining (push). </li>
<li> 5. 
OLAP and KDD. OLAP Mining: first introduced in 1997 by Jiawei Han as a mechanism which integrates OLAP with data mining so that mining can be performed in different portions of databases or data warehouses and at different levels of abstraction, at the user's fingertips. </li>
<li> 6. OLAP and KDD. Specificities of the OLAP framework: on-line analysis; measures described by means of dimensions; aggregated measure values; hierarchies; displaying data: the order matters (switch, pivot). </li>
<li> 7. OLAP and KDD. Motivating example: the same cube displayed with two different orderings of rows and columns.
<table>
<tr><th></th><th>Beer</th><th>Water</th><th>Soda</th><th>Wine</th><th>Milk</th></tr>
<tr><th>Europe</th><td>4</td><td>4</td><td>7</td><td>6</td><td>5</td></tr>
<tr><th>America</th><td>4</td><td>5</td><td>7</td><td>7</td><td>6</td></tr>
<tr><th>Asia</th><td>3</td><td>3</td><td>6</td><td>5</td><td>5</td></tr>
<tr><th>Africa</th><td>2</td><td>2</td><td>6</td><td>5</td><td>4</td></tr>
</table>
<table>
<tr><th></th><th>Beer</th><th>Water</th><th>Milk</th><th>Wine</th><th>Soda</th></tr>
<tr><th>America</th><td>4</td><td>5</td><td>6</td><td>7</td><td>7</td></tr>
<tr><th>Europe</th><td>4</td><td>4</td><td>5</td><td>6</td><td>7</td></tr>
<tr><th>Asia</th><td>3</td><td>3</td><td>5</td><td>5</td><td>6</td></tr>
<tr><th>Africa</th><td>2</td><td>2</td><td>4</td><td>5</td><td>6</td></tr>
</table>
</li>
<li> 8. OLAP and KDD. Representing cubes: there are several ways to represent the same data. Finding the best representation is known to be NP-hard. </li>
<li> 9. 
Hierarchies. Using hierarchies: representativity of extracted knowledge. When it is high, nothing can be extracted (or only trivial knowledge); when it is low, too many patterns are extracted, which is of no use to decision makers. It is difficult to choose the level of granularity that yields relevant knowledge. Taking hierarchies into account: rules are extracted at different levels of the hierarchies; subrules are discovered automatically (thanks to anti-monotonicity). </li>
<li> 10. Research Topics on OLAP Mining (EXPEDO). Topics addressed by the project: mining for rules (e.g. association rules, gradual rules, sequential patterns); mining for homogeneous parts and compressing (e.g. blocks); navigating by means of intelligent queries. To be addressed in this talk: mining for blocks; mining for multidimensional sequential patterns. </li>
<li> 11. Outline (as on slide 2). </li>
<li> 12. Why blocks? It is impossible to mine for the best representation. There are different kinds of relevant representations. Other criteria may be considered: pointing out homogeneous parts. What are blocks? 
Blocks are subcubes defined over all dimensions; some dimensions may appear completely (ALL level). Blocks must be large enough (support). Blocks must be homogeneous enough (confidence). </li>
<li> 13. [Figure: a cube over PRODUCT (P1 to P4) and CITY (C1 to C6) with shaded regions marking blocks; legend: Block, Value.] The number of distinct measure values may be large, which prevents blocks from being discovered. </li>
<li> 14. Fuzzy and Crisp Blocks. Partitioning the measure: crisp blocks. [Figure: cell values between 0 and 10 (e.g. 5.9, 6.1, 7.8, 8.1, 2.2, 1.9) shown against a 0 to 10 axis marked at 0, 2, 5, 8 and 10.] </li>...</ul>
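
The motivating example on slide 7 shows the same cube under two orderings of its rows and columns, and slide 8 notes that finding the best representation is NP-hard. As a rough illustration only, the sketch below applies a simple greedy heuristic: columns are sorted by ascending total and rows by descending total. The function name `reorder_by_totals` and the heuristic itself are assumptions of this example, not the method used by the authors; on this small cube it happens to reproduce the second layout of slide 7.

```python
# Cube from the motivating example (slide 7), first layout.
HEADER = ["Beer", "Water", "Soda", "Wine", "Milk"]
ROWS = {
    "Europe":  [4, 4, 7, 6, 5],
    "America": [4, 5, 7, 7, 6],
    "Asia":    [3, 3, 6, 5, 5],
    "Africa":  [2, 2, 6, 5, 4],
}


def reorder_by_totals(header, rows):
    """Permute columns (ascending total) and rows (descending total).

    Hypothetical helper: a greedy heuristic, not an optimal arrangement.
    """
    # Column indices sorted by the sum of each column over all rows.
    col_order = sorted(
        range(len(header)),
        key=lambda j: sum(values[j] for values in rows.values()),
    )
    # Row names sorted by the sum of each row, largest first.
    row_order = sorted(rows, key=lambda name: sum(rows[name]), reverse=True)

    new_header = [header[j] for j in col_order]
    new_rows = {name: [rows[name][j] for j in col_order] for name in row_order}
    return new_header, new_rows


new_header, new_rows = reorder_by_totals(HEADER, ROWS)
print(new_header)
for name, values in new_rows.items():
    print(name, values)
```

Sorting by totals costs only a few passes over the data, which is why it is a convenient baseline; it gives no guarantee of finding a good layout on cubes where totals do not reflect local structure.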

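Slides 12 to 14 describe blocks as subcubes that must be large enough (support) and homogeneous enough (confidence), after the measure has been partitioned into crisp intervals. The slides do not give formal definitions, so the sketch below makes its own assumptions: the interval bounds 0, 2, 5, 8, 10 are read off the axis of slide 14 and treated as half-open bins; support is taken as the fraction of cube cells covered by the block; confidence is taken as the fraction of block cells that fall in the block's most frequent interval. The names `crisp_label` and `block_measures` are hypothetical, and the data is the cube from slide 7.

```python
from bisect import bisect_right
from collections import Counter

# Assumed crisp partition of the measure: [0,2), [2,5), [5,8), [8,10].
BOUNDS = [0, 2, 5, 8, 10]

# Cube from the motivating example (slide 7).
CUBE = {
    "Europe":  {"Beer": 4, "Water": 4, "Soda": 7, "Wine": 6, "Milk": 5},
    "America": {"Beer": 4, "Water": 5, "Soda": 7, "Wine": 7, "Milk": 6},
    "Asia":    {"Beer": 3, "Water": 3, "Soda": 6, "Wine": 5, "Milk": 5},
    "Africa":  {"Beer": 2, "Water": 2, "Soda": 6, "Wine": 5, "Milk": 4},
}


def crisp_label(value):
    """Return the index of the interval that contains value."""
    if not BOUNDS[0] <= value <= BOUNDS[-1]:
        raise ValueError(f"value {value} is outside the measure range")
    # bisect_right - 1 is the index of the last bound <= value;
    # the clamp puts the upper end point (10) into the last interval.
    return min(bisect_right(BOUNDS, value) - 1, len(BOUNDS) - 2)


def block_measures(cube, rows, cols):
    """Return (support, label, confidence) for the block rows x cols.

    Assumed definitions, not taken from the slides:
    support    = block cells / all cube cells
    label      = most frequent interval index inside the block
    confidence = block cells carrying that label / block cells
    """
    total_cells = sum(len(row) for row in cube.values())
    labels = [crisp_label(cube[r][c]) for r in rows for c in cols]
    label, count = Counter(labels).most_common(1)[0]
    return len(labels) / total_cells, label, count / len(labels)


# A small, fully homogeneous block.
print(block_measures(CUBE, ["Europe", "America"], ["Soda", "Wine"]))
# A larger block that is slightly less homogeneous.
print(block_measures(CUBE, list(CUBE), ["Beer", "Water"]))
```

The two calls illustrate the trade-off the slides point at: enlarging a block raises its support but tends to lower its confidence, so a miner would keep only blocks that clear user-chosen thresholds on both.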