data modeling principles

Upload: nag-dhall

Post on 01-Jun-2018

228 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/9/2019 Data Modeling Principles

    1/21

     

    Data Modeling

    Agnivesh Kumar

  • 8/9/2019 Data Modeling Principles

    2/21

     

    Define the data model

    A data model documents and organizes data, how it is stored and accessed, and therelationships among different types of data. The model may be abstract or concrete.

    ● Identify the different data components consider raw and processed data, as well as associatedmetadata !these are called entities"

    ● Identify the relationships between the different data components !these are called

    associations"

    ● Identify anticipated uses of the data !these are called re#uirements", with recognition that datamay be most valuable in the future for unanticipated uses

    ● Identify the strengths and constraints of the technology !hardware and software" that you planto use during your pro$ect !this is called a technology assessment phase"

    ● %uild a draft model of the entities and their relations, attempting to &eep the modelindependent from any specific uses or technology constraints.

  • 8/9/2019 Data Modeling Principles

    3/21

     

    Data Modeling Types

    ● 'onceptual Data Modeling

    ● (nterprise Data Modeling

    ● )ogical Data Modeling

    ● *hysical Data Modeling

    ● +elational Data Modeling

    ● Dimensional Data Modeling

  • 8/9/2019 Data Modeling Principles

    4/21

     

    'onceptual Data Modeling

    ● A conceptual data model identifies the highestlevel relationships between the differententities. eatures of conceptual data model include-

    ● Includes the important entities and the relationships among them.

    ● o attribute is specified.

    o primary &ey is specified.

  • 8/9/2019 Data Modeling Principles

    5/21

     

    (nterprise Data Model

    ● Development of a common consistent view and understanding of data elements and theirrelationships across the enterprise is referred to as enterprise data modeling.

    ● This type of data modeling provides access to information scattered throughout an enterpriseunder the control of different divisions or departments with different databases and datamodels.

    (nterprise data modeling is sometimes called as global business model and the entireinformation about the enterprise would be captured in the forms of entities.

    ● /hen a enterprise logical data model is transformed to a physical data model, 01*(+T2*(0and 01%T*(0 may not be as is. I.e the logical and physical structure of super types andsubtypes may be entirely different.!Means names of tables and columns changes and tablescan brea& for understand the model.

  • 8/9/2019 Data Modeling Principles

    6/21

     

  • 8/9/2019 Data Modeling Principles

    7/21

     

    )ogical Data Modeling

    A logical data model describes the data in as much detail as possible, without regard to how

    they will be physical implemented in the database. eatures of a logical data model include-● Includes all entities and relationships among them.

    ● All attributes for each entity are specified.

    ● *rimary &ey for each entity is specified.

    oreign &eys are specified.● ormalization occurs at this level.

  • 8/9/2019 Data Modeling Principles

    8/21

     

    *hysical Data Modeling

    *hysical data model represents how the model will be built in the database. A physicaldatabase model shows all table structures, including column name, column data type, columnconstraints, primary &ey, foreign &ey, and relationships between tables. eatures of a physicaldata model include-

    ● 0pecification all tables and columns.

    oreign &eys are used to identify relationships between tables.● Denormalization may occur based on user re#uirements.

    ● *hysical considerations may cause the physical data model to be #uite different from thelogical data model.

    ● *hysical data model will be different for different +D%M0. or e3ample, data type for a

    column may be different between My04), 04) 0erver,5racle,*ostgres etc.

  • 8/9/2019 Data Modeling Principles

    9/21

     

  • 8/9/2019 Data Modeling Principles

    10/21

     

    +elational Data Modeling

    ● +elational Data Model is a data model that views the real world as entities and relationships.

    ● (ntities are concepts, real or abstract about which information is collected.

    ● The goal of relational data model is to normalize data and present it in a good normal form.

    ● ollowing are some of #uestions that arise during development of relational data model,

    /hat will be the future scope of the data model6

    7ow to normalize data 6

    7ow to group attribute and entities6

    7ow to connect one entity to other6

    7ow to validate data6

    7ow to present report6

  • 8/9/2019 Data Modeling Principles

    11/21

     

    Dimensional Data Modeling

    ● DM is a logical design techni#ue that see&s to present the data in a standard, intuitiveframewor& that allows for highperformance access. It is inherently dimensional, and itadheres to a discipline that uses the relational model with some important restrictions. (verydimensional model is composed of one table with a multipart &ey, called the fact table, and aset of smaller tables called dimension tables. (ach dimension table has a singlepart primary&ey that corresponds e3actly to one of the components of the multipart &ey in the fact table.This characteristic 8starli&e9 structure is often called a star $oin.

    ● A fact table, because it has a multipart primary &ey made up of two or more foreign &eys,always e3presses a manytomany relationship. The most useful fact tables also contain oneor more numerical measures, or 8facts,9 that occur for the combination of &eys that defineeach record.

    ● Dimension tables, by contrast, most often contain descriptive te3tual information. Dimension

    attributes are used as the source of most of the interesting constraints in data warehouse#ueries, and they are virtually always the source of the row headers in the 04) answer set.

  • 8/9/2019 Data Modeling Principles

    12/21

     

  • 8/9/2019 Data Modeling Principles

    13/21

     

    *hysical vs )ogical

  • 8/9/2019 Data Modeling Principles

    14/21

     

    +elational :s Dimensional

  • 8/9/2019 Data Modeling Principles

    15/21

  • 8/9/2019 Data Modeling Principles

    16/21

     

    Data /arehouse Architecture

    All data warehouse systems have the following layers-

    ● Data 0ource )ayer

    ● Data (3traction )ayer

    ● 0taging Area

    ● (T) )ayer

    ● Data 0torage )ayer

    ● Data )ogic )ayer

    ● Data *resentation )ayer

    ● Metadata )ayer

    ● 0ystem 5perations )ayer

  • 8/9/2019 Data Modeling Principles

    17/21

     

  • 8/9/2019 Data Modeling Principles

    18/21

     

    Data 0ource )ayer

    ● This represents the different data sources that feed data into the data warehouse. The datasource can be of any format plain te3t file, relational database, other types of database,(3cel file, etc., can all act as a data source.

    ● Many different types of data can be a data source-

    ● 5perations such as sales data, 7+ data, product data, inventory data, mar&eting data,

    systems data.● /eb server logs with user browsing data.

    ● Internal mar&et research data.

    ● Thirdparty data, such as census data, demographics data, or survey data.

  • 8/9/2019 Data Modeling Principles

    19/21

     

    Data (3traction )ayer

    ● Data gets pulled from the data source into the data warehouse system. There is li&ely someminimal data cleansing, but there is unli&ely any ma$or data transformation.

    0taging Area

    ● This is where data sits prior to being scrubbed and transformed into a data warehouse datamart. 7aving one common area ma&es it easier for subse#uent data processing integration.

    (T) )ayer

    ● This is where data gains its

  • 8/9/2019 Data Modeling Principles

    20/21

     

    Data )ogic )ayer

    ● This is where business rules are stored. %usiness rules stored here do not affect the underlyingdata transformation rules, but do affect what the report loo&s li&e.

    Data *resentation )ayer

    ● This refers to the information that reaches the users. This can be in a form of a tabular graphical report in a browser, an emailed report that gets automatically generated and senteveryday, or an alert that warns users of e3ceptions, among others. 1sually an 5)A* toolandor a reporting tool is used in this layer.

    Metadata )ayer

    ● This is where information about the data stored in the data warehouse system is stored. Alogical data model would be an e3ample of something that;s in the metadata layer. Ametadata tool is often used to manage metadata.

    0ystem 5perations )ayer

    ● This layer includes information on how the data warehouse system operates, such as (T) $ob

    status, system performance, and user access history.

  • 8/9/2019 Data Modeling Principles

    21/21

     

    Database :s Data /arehouseDatabase-

    1sed for 5nline Transactional *rocessing !5)T*". This records the data from the user for history.

    ● The tables and $oins are comple3 since they are normalized. This is done to reduce redundant data and tosave storage space.

    ● (ntity B +elational modeling techni#ues are used for database design.

    ● 5ptimized for write operation.

    ● *erformance is low for analysis #ueries.

    Data /arehouse-

    ● 1sed for 5nline Analytical *rocessing !5)A*". This reads the historical data for the 1sers for businessdecisions.

    ●  The Tables and $oins are simple since they are denormalized. This is done to reduce the response time

    for analytical #ueries.● Data B Modeling techni#ues are used for the Data /arehouse design.

    ● 5ptimized for read operations.

    ● 7igh performance for analytical #ueries.

    ● Ceneral Data low B !(3- 5nline Insurance +egistration"