what is data warehousing

Upload: jackron

Post on 06-Apr-2018

Category:

Documents


  • 8/2/2019 What is Data Warehousing

    1/21

    What is Data warehousing?

    Answer -A data warehouse can be considered as a storage area where interest specific or relevant

    data........

    What are fact tables and dimension tables?

    Answer -As mentioned, data in a warehouse comes from the transactions. Fact table in a data warehouse

    consists........

    What is ETL process in data warehousing?

    Answer - ETL is Extract Transform Load. It is a process of fetching.........

    Explain the difference between data mining and data warehousing.

    Answer - Data warehousing is merely extracting data from different sources, cleaning the........

    What is an OLTP system and OLAP system?

    Answer - OLTP: Online Transaction and Processing helps and manages applications based........

    What are cubes?

    Answer -A data cube stores data in a summarized version which helps in a faster analysis of data..........

    What is snowflake schema design in a database?

    Answer -A snowflake Schema in its simplest form is an arrangement of fact tables.........

    What is analysis service?

    Answer -Analysis service provides a combined view of the data used in OLAP.........

    Explain sequence clustering algorithm.

    Answer - Sequence clustering algorithm collects similar or related paths, sequences of data.........

    Explain discrete and continuous data in data mining.

    Answer - Discrete data can be considered as defined or finite data..........

    Explain time series algorithm in data mining.

    Answer - Time series algorithm can be used to predict continuous values of data.......

    What is XMLA?

    Answer - XMLA is XML for Analysis which can be considered as a standard for accessing data in OLAP.......

    Explain the difference between Data warehousing and Business Intelligence.

    Dependent data warehouses are built........

    What is data modeling and data mining? What is this used for?

    Designing a model for data or database is called data modelling........

    Difference between ER Modeling and Dimensional Modeling.

    Dimensional modelling is very flexible for the user perspective........

    What is snapshot with reference to data warehouse?

    A snapshot of data warehouse is a persisted report from the catalogue.........

    What is degenerate dimension table?

    A dimension that is persisted in the fact table is called a degenerate dimension.........

    What is Data Mart?

    Data Mart is a data repository which is served to a community of people.......

    What is the difference between metadata and data dictionary?

    Metadata describes data; it is data about data. It has information about how and when......

    Describe the various methods of loading Dimension tables.

    The following are the methods of loading dimension tables.......

    What is the difference between OLAP and data warehouse?

    The following are the differences between OLAP and data warehousing......

    Describe the foreign key columns in fact table and dimension table.

    The primary keys of entity tables are the foreign keys of dimension tables.......

    What is cube grouping?

    A transformer built set of similar cubes is known as cube grouping. A single level in one dimension......

    Define the term slowly changing dimensions (SCD).

    Slowly changing dimension target operator is one of the SQL warehousing operators......

    What is a Star Schema?

    The simplest data warehousing schema is star schema.......

    Differences between star and snowflake schema.

    Star Schema: A de-normalized technique in which one fact table is associated with several dimension tables.......

    Explain the use of lookup tables and Aggregate tables.

    At the time of updating the data warehouse, a lookup table is used.......

    What is real time data-warehousing?

    The combination of real-time activity and data warehousing is called real time warehousing.......

    What is conformed fact? What is conformed dimensions use for?

    Conformed facts allow the same names to be used in different tables.......

    Define non-additive facts.

    The facts that can not be summed up for the dimensions present in the fact table are called non-additive facts.......

    Define BUS Schema.

    A BUS schema is to identify the common dimensions across business processes......

    List out difference between SAS tool and other tools.

    The differences between SAS and other tools are......

    Why is SAS so popular?

    Statistical Analysis System is an integration of various software products which allows the developers to perform.......

    What is data cleaning? How can we do that?

    Data cleaning is also known as data scrubbing. Data cleaning is a process which ensures the set of data is correct

    and accurate......

    Explain in brief about critical column.

    A column (usually granular) whose values change over a period of time is called a critical column.......

    What is data cube technology used for?

    Data cube is a multi-dimensional structure. Data cube is a data abstraction to view aggregated data from a number of

    perspectives.........

    What is Data Scheme?

    Data Scheme is a diagrammatic representation that illustrates data structures and data relationships to each other in

    the relational database within the data warehouse................

    What is Bit Mapped Index?

    Bitmap indexes make use of bit arrays (bitmaps) to answer queries by performing bitwise logical

    operations..................
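    The idea of answering queries with bitwise operations can be illustrated with a small sketch. This is not how any particular database implements bitmap indexes; it assumes Python integers as bit arrays, and the column names and sample rows are hypothetical.

```python
from collections import defaultdict


def build_bitmap_index(values):
    """Map each distinct value to an int whose bit i is set when row i holds it."""
    index = defaultdict(int)
    for row_id, value in enumerate(values):
        index[value] |= 1 << row_id
    return dict(index)


def matching_rows(bitmap):
    """Return the row ids whose bits are set in the bitmap."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# Hypothetical sample columns; position i in each list is the same record.
region = ["east", "west", "east", "north", "west"]
status = ["open", "open", "closed", "open", "closed"]

region_index = build_bitmap_index(region)
status_index = build_bitmap_index(status)

# WHERE region = 'east' AND status = 'open' becomes one bitwise AND.
east_and_open = region_index["east"] & status_index["open"]
# WHERE region = 'west' OR region = 'north' becomes one bitwise OR.
west_or_north = region_index["west"] | region_index["north"]

print(matching_rows(east_and_open))
print(matching_rows(west_or_north))
```

    Each distinct value gets its own bitmap, so this layout suits columns with few distinct values.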

    What is Bi-directional Extract?

    In hierarchical, networked or relational databases, the data can be extracted, cleansed and transferred in two directions. The ability of a system to do this is referred to as bidirectional extracts................

    What is Data Collection Frequency?

    Data collection frequency is the rate at which data is collected. However, the data is not just collected and stored; it goes through various stages of processing like extracting from various sources, cleansing, transforming and then storing in useful patterns................

    What is Data Cardinality?

    Cardinality is the term used in database relations to denote the occurrences of data on either side of the

    relation................

    What is Chained Data Replication?

    In Chain Data Replication, the non-official data set distributed among many disks provides for load balancing among

    the servers within the data warehouse...............

    What are Critical Success Factors?

    Key areas of activity in which favorable results are necessary for a company to reach its goal. There are four basic

    types of CSFs which are: ...............

    What is Data Warehousing?

    A data warehouse can be considered as a storage area where interest specific or relevant data is stored irrespective

    of the source. What actually is required to create a data warehouse can be............

    What is Virtual Data Warehousing?

    A virtual data warehouse provides a collective view of the completed data. A virtual data warehouse has...........

    Explain in brief various fundamental stages of Data Warehousing.

    The stages of a data warehouse help to find and understand how the data in the warehouse changes. At the initial stage of data warehousing, transaction data is merely copied to another server. Here, even if the copied data is processed for reporting, the source data's performance won't be affected..............

    What is active data warehousing?

    An active data warehouse represents a single state of the business. Active data warehousing considers the analytic

    perspectives of customers................

    What is data modeling and data mining? What is this used for?

    Data modeling is a technique used to define and analyze the requirements of data that supports an organization's business processes. In simple terms, it is used for the analysis of data objects in order to...........

    Difference between ER Modeling and Dimensional Modeling

    The entity-relationship model is a method used to represent the logical flow of entities/objects graphically, which in turn is used to create a database. It has both a logical and a physical model, and it is good for reporting and point queries..............

    What is the difference between data warehousing and business intelligence?

    Data warehousing relates to all aspects of data management starting from the development, implementation and operation of the data sets. It is a backup of all data relevant to the business context, i.e. a way of storing data.............

    Describe dimensional Modeling.

    Dimensional modeling is a method in which data is stored in two types of tables, namely fact tables and dimension tables. A fact table comprises information to measure business successes and the............

    What is snapshot with reference to data warehouse?

    Snapshot refers to a complete visualization of data at the time of extraction. It occupies less space and can be used

    to back up and restore data quickly...............

    What is SQL Server 2005 Analysis Services (SSAS)?

    SSAS gives the business data an integrated view. This integrated view is provided by combining online analytical

    processing (OLAP) and data mining functionality................

    What are the new features with SQL Server 2005 Analysis Services (SSAS)?

    It offers interoperability with Microsoft Office 2007. It eases data mining by offering better data mining algorithms and enables better predictive analysis...................

    What are SQL Server Analysis Services cubes?

    In analysis services, cube is the basic unit of storage. A cube has data collected from various sources that enables

    faster execution of queries. Cubes have dimensions and measures................

    Explain the purpose of synchronization feature provided in Analysis Services 2005.

    Synchronization feature is used to copy database from one source server to a destination server. While the

    synchronization is in progress, users can still browse the cubes.................

    MDX in SQL Server 2005 Analysis Services brings exciting improvements, including query support and an expression/calculation language. Explain.

    MDX in SQL Server 2005 Analysis Services offers CASE and SCOPE statements. CASE returns specific values based upon its comparison of an expression to a set of simple expressions. It can perform.............

    Can you explain how to deploy an SSIS package?

    An SSIS package can be deployed using the Deploy SSIS Packages page. The package and its dependencies can be deployed either in a specified folder in the file system or in an instance of SQL Server...............

    Can you explain the difference between the INTERSECT and EXCEPT operators?

    INTERSECT returns the data values common to BOTH queries (the queries on the left and right side of the operator). On the other hand, EXCEPT returns the distinct data values from the left query (query on left side of the .................
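    The two operators can be tried out with a short sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made only so the example is self-contained), and the table names `left_q` and `right_q` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE left_q (val INTEGER);
    CREATE TABLE right_q (val INTEGER);
    INSERT INTO left_q VALUES (1), (2), (2), (3);
    INSERT INTO right_q VALUES (2), (3), (4);
""")

# Values present in BOTH queries (duplicates are removed).
common = [row[0] for row in conn.execute(
    "SELECT val FROM left_q INTERSECT SELECT val FROM right_q ORDER BY val")]

# Distinct values from the left query that the right query does not return.
left_only = [row[0] for row in conn.execute(
    "SELECT val FROM left_q EXCEPT SELECT val FROM right_q ORDER BY val")]

print(common)     # values shared by both sides
print(left_only)  # values only on the left side
```

    Swapping the two SELECT statements around EXCEPT changes the result, whereas INTERSECT gives the same rows in either order.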

    What is the new error handling technique in SQL Server 2005?

    Previously, error handling was done using @@ERROR or by checking @@ROWCOUNT, which didn't turn out to be a very feasible option for fatal errors. The new error handling technique in SQL Server 2005 provides a TRY ...............

    What exactly is SQL Server 2005 Service Broker?

    Service brokers allow you to build applications in which independent components work together to accomplish a task. They help build scalable, secure database applications. The brokers provide a message based ..................

    Explain the Service Broker components.

    Service broker components help build applications in which independent components work together to accomplish a

    task. These applications are independent, asynchronous and the components work together.............

    What is a breakpoint in SSIS? How is it setup? How do you disable it?

    Breakpoints allow the execution to be paused in order to review the status of the data, variables and the overall status of the SSIS package. Breakpoints in SSIS are set up through the BIDS wizard. In this wizard,..............

    Explain the concepts and capabilities of Business Intelligence.

    Business Intelligence helps to manage data by applying different skills, technologies, security and quality risks. This

    also helps in achieving a better understanding of data. Business intelligence can be considered..........

    Name some of the standard Business Intelligence tools in the market.

    Business intelligence tools are used to report, analyze and present data. A few of the tools available in the market are:..............

    Explain the Dashboard in the business intelligence.

    A dashboard in business intelligence allows huge data and reports to be read in a single graphical interface. They help in making faster decisions by relying on measurable data seen at a glance. They can..............

    SAS Business Intelligence.

    SAS business intelligence has analytical capabilities like statistics, reporting, data mining, predictions, forecasting and

    optimization..............

    Explain the SQL Server 2005 Business Intelligence components.

    SQL Server Integration Services:- Used for data transformation and creation. Used in data acquisition from a source system.................

    What is broad cast agent?

    A broadcast agent allows automation of emails to be distributed. It allows reports to be sent to different business objects. It also allows users to choose the report format and send via SMS, fax, pagers etc. broadcast.............

    Explain the functional differences between BO and COGNOS.

    Business objects in business intelligence are entities of the business. COGNOS makes BI and performance planning

    software...............

    What is a universe? Explain the types of universes in business objects.

    A universe connects the client to the data warehouse. It is a file defining relationships amongst the tables in the

    warehouse, classes and objects, database connection details...............

    What is security domain in Business Objects?

    Security domain in business objects is a domain containing all security information like login credentials etc. It checks

    for users and their privileges..................

    What is batch processing in Business Objects?

    Batch processing can be used to schedule reports. Objects can also be used for batch processing. Batch processing can also be used to select the objects to be processed.................

    What are the functional & architectural differences between business objects and Web Intelligence Reports?

    Functional differences - Business Objects, for building or accessing reports, needs to be installed on every PC. On the other hand, Web Intelligence reports need a browser and a URL of the server from where Business...................

    What is slicing and dicing in business objects?

    Slicing and dicing of business objects is used for a detailed analysis of the data. It allows changing the position of data by interchanging rows and columns.............

    What is the security level used in BO?

    Security level used in BO:- Row Level, Column Level...............

    What is Object qualification?

    Object qualification is an attribute of an object that helps to determine how it can be used in multidimensional

    analysis.............

    What is BOMain.Key?

    A BOMain.key file contains all relevant information about the repository. It contains the address of the repository

    security domain..................

    OLAP database objects

    The following are the OLAP database objects:

    Cubes: Data in cubes are persisted in a summarized version that helps to analyze data quickly. The data is persisted,

    through which reporting can be done easily..................

    Cubes

    A data cube stores data in a summarized version which helps in a faster analysis of data. The data is stored in such a

    way that it allows reporting easily................

    Data Sources

    A data source is where the data comes from in data warehousing. The data is collected from various sources and is cleaned..............

    Fact Tables

    Data in a warehouse comes from the transactions. Fact table in a data warehouse consists of facts and/or measures.

    The nature of data in a fact table is usually numerical. e.g. If I want to know the number..............

    Database roles

    Database level roles are used to manage the security of the database. The role can be either fixed or

    flexible...............

    Explain the concepts and capabilities of OLAP.

    Online analytical processing performs analysis of business data and provides the ability to perform complex calculations on usually low volumes of data. OLAP helps the user gain an insight on the data..............

    Explain the functionality of OLAP.

    Multidimensional analysis:- OLAP helps the user gain an insight on the data coming from different sources. OLAP

    helps faster execution of complex analytical and ad-hoc queries...............

    What are MOLAP and ROLAP?

    Multidimensional Online Analytical Processing and Relational Online Analytical Processing are tools used in analysis

    of data which is multidimensional..................

    Explain the role of bitmap indexes to solve aggregation problems.

    Bitmap indexes are useful in connecting smaller databases to larger databases. Bit map indexes can be very useful

    in performing repetitive indexes..............

    Explain the encoding technique used in bitmaps indexes.

    For each distinct value, one bitmap is used. The number of bitmaps can be reduced by using log(C) bitmaps to represent the values in each bin................

    What is Binning?

    Binning can be used to hold multiple values in one bin. Bitmaps are then used to represent the values in.............

    What is candidate check?

    When the binning process creates the binned indexes, it answers only some queries. The base data is not checked. The process of checking the base data is called a candidate check. Candidate check at times.............

    What is Hybrid OLAP?

    In a Hybrid OLAP, the database gets divided into relational and specialized storage. Specialized data storage is for

    data with fewer details while relational storage can be used for large amount of data..............

    Explain the shared features of OLAP.

    OLAP product by default is read only. If multiple access rights are required, admin needs to make necessary

    changes..................

    Compare Data Warehouse database and OLTP database.

    A data warehouse is used for business measures, cannot be used to cater to the real-time business needs of the organization, and is optimized for lots of data and unpredictable queries. On the other hand, an OLTP database is for.................

    What is the difference between ETL tool and OLAP tool?

    ETL is the process of extracting, transforming and loading data into a meaningful form. This data can be used by the OLAP tool to visualize data in different forms. ETL tools also perform some cleaning of data..................

    What is the difference between OLAP and DSS?

    A data-driven decision support system (DSS) is used to access and manipulate data. Data-driven DSS in conjunction with Online Analytical Processing............

    What is a data warehouse?

    Answer1: A data warehouse is a repository of integrated information, available for queries and analysis. Data and information are extracted from heterogeneous sources as they are generated. This makes it much easier and more efficient to run queries over data that originally came from different sources. Another definition of a data warehouse is: "A data warehouse is a logical collection of information gathered from many different operational databases used to create business intelligence that supports business analysis activities and decision-making tasks, primarily, a record of an enterprise's past transactional and operational information, stored in a database designed to favour efficient data analysis and reporting (especially OLAP)". Generally, data warehousing is not meant for current "live" data, although 'virtual' or 'point-to-point' data warehouses can access operational data. A 'real' data warehouse is generally preferred to a virtual DW because stored data has been validated and is set up to provide reliable results to common types of queries used in a business.

    Answer2: A data warehouse is a repository of integrated information, available for queries and analysis. Data and information are extracted from heterogeneous sources as they are generated. This makes it much easier and more efficient to run queries over data that originally came from different sources.

    Typical relational databases are designed for on-line transactional processing (OLTP) and do not meet the requirements for effective on-line analytical processing (OLAP). As a result, data warehouses are designed differently than traditional relational databases.

    What is ODS?

    1. ODS means Operational Data Store.

    2. A collection of operational or base data that is extracted from operational databases and standardized, cleansed, consolidated, transformed, and loaded into an enterprise data architecture. An ODS is used to support data mining of operational data, or as the store for base data that is summarized for a data warehouse. The ODS may also be used to audit the data warehouse to assure summarized and derived data is calculated properly. The ODS may further become the enterprise shared operational database, allowing operational systems that are being reengineered to use the ODS as their operational databases.

    What is a dimension table?

    A dimension table is a collection of hierarchies and categories along which the user can drill down and drill up. It contains only the textual attributes.

    What is a lookup table?

    A lookup table is one that is used when updating a warehouse. When the lookup is placed on the target table (fact table / warehouse) based upon the primary key of the target, it updates the table by allowing only new records or updated records based on the lookup condition.
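    A minimal sketch of that lookup behaviour follows, assuming the target table is held in memory as a dictionary keyed by primary key. The function name `apply_lookup` and the sample rows are hypothetical; real ETL tools implement this with their own lookup components.

```python
def apply_lookup(target, incoming):
    """Merge incoming rows into target, keyed by primary key.

    Returns the keys that were inserted, updated, and skipped.
    """
    inserted, updated, skipped = [], [], []
    for key, row in incoming.items():
        if key not in target:        # lookup miss: a new record
            target[key] = row
            inserted.append(key)
        elif target[key] != row:     # lookup hit with changed attributes
            target[key] = row
            updated.append(key)
        else:                        # lookup hit, identical row: nothing to do
            skipped.append(key)
    return inserted, updated, skipped


warehouse = {
    1: {"name": "Alice", "city": "Pune"},
    2: {"name": "Bob", "city": "Delhi"},
}
batch = {
    2: {"name": "Bob", "city": "Mumbai"},   # changed row
    1: {"name": "Alice", "city": "Pune"},   # unchanged row
    3: {"name": "Carol", "city": "Goa"},    # new row
}

result = apply_lookup(warehouse, batch)
print(result)
```

    Only the new and the changed record touch the target; the unchanged row is filtered out by the lookup condition.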

    Why should you put your data warehouse on a different system than your OLTP system?

    Answer1:

    An OLTP system is basically "data oriented" (ER model) and not "subject oriented" (dimensional model). That is why we design a separate system that will have a subject-oriented OLAP system...

    Moreover, a complex query fired on an OLTP system causes heavy overhead on the OLTP server, which directly affects day-to-day business.

    Answer2:

    The loading of a warehouse will likely consume a lot of machine resources. Additionally, users may create queries or reports that are very resource intensive because of the potentially large amount of data available. Such loads and resource needs will conflict with the needs of the OLTP systems for resources and will negatively impact those production systems.

    What are Aggregate tables?

    An aggregate table contains a summary of existing warehouse data, grouped to certain levels of dimensions. Retrieving the required data from the actual table, which may have millions of records, takes more time and also affects server performance. To avoid this, we can aggregate the table to the required level and use it. These tables reduce the load on the database server, improve query performance, and return results very quickly.
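    As a rough illustration, the sketch below builds a monthly aggregate table from a detailed sales table. It uses Python's built-in sqlite3 module; the table names (`sales_fact`, `sales_monthly_agg`), columns and sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_fact (sale_date TEXT, product TEXT, amount REAL);
    INSERT INTO sales_fact VALUES
        ('2024-01-05', 'widget', 10.0),
        ('2024-01-20', 'widget', 15.0),
        ('2024-01-11', 'gadget', 7.5),
        ('2024-02-02', 'widget', 20.0);

    -- Pre-compute the per-month, per-product summary once.
    CREATE TABLE sales_monthly_agg AS
        SELECT substr(sale_date, 1, 7) AS month,
               product,
               SUM(amount) AS total_amount,
               COUNT(*) AS sale_count
        FROM sales_fact
        GROUP BY substr(sale_date, 1, 7), product;
""")

# Reports read the small aggregate table instead of scanning the detail rows.
rows = conn.execute(
    "SELECT month, product, total_amount, sale_count "
    "FROM sales_monthly_agg ORDER BY month, product"
).fetchall()
print(rows)
```

    The aggregate has to be refreshed whenever the detail table is loaded, which is the usual price paid for the faster queries.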

    What is Dimensional Modelling? Why is it important ?

    Dimensional modelling is a design concept used by many data warehouse designers to build their data warehouse. In this design model all the data is stored in two types of tables: fact tables and dimension tables. The fact table contains the facts/measurements of the business, and the dimension table contains the context of the measurements, i.e. the dimensions on which the facts are calculated.
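    A small sketch of such a model, with one fact table and two dimension tables, is shown here using Python's built-in sqlite3 module. The schema (`fact_sales`, `dim_product`, `dim_date`) and the data are hypothetical and kept deliberately tiny.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables hold the descriptive context.
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);

    -- The fact table holds measurements plus foreign keys to the dimensions.
    CREATE TABLE fact_sales (
        product_key INTEGER REFERENCES dim_product(product_key),
        date_key    INTEGER REFERENCES dim_date(date_key),
        units_sold  INTEGER,
        revenue     REAL
    );

    INSERT INTO dim_product VALUES (1, 'widget', 'hardware'), (2, 'manual', 'books');
    INSERT INTO dim_date VALUES (20240115, 2024, 1), (20240210, 2024, 2);
    INSERT INTO fact_sales VALUES
        (1, 20240115, 3, 30.0),
        (2, 20240115, 1, 12.0),
        (1, 20240210, 2, 20.0);
""")

# Facts are summed; dimensions supply the grouping context.
revenue_by_category = conn.execute("""
    SELECT p.category, d.month, SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_product p ON p.product_key = f.product_key
    JOIN dim_date d ON d.date_key = f.date_key
    GROUP BY p.category, d.month
    ORDER BY p.category, d.month
""").fetchall()
print(revenue_by_category)
```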

    Why is Data Modeling Important?

    Data modeling is probably the most labor intensive and time consuming part of the development process. Why bother, especially if you are pressed for time? A common response by practitioners who write on the subject is that you should no more build a database without a model than you should build a house without blueprints.

    The goal of the data model is to make sure that all data objects required by the database are completely and accurately represented. Because the data model uses easily understood notations and natural language, it can be reviewed and verified as correct by the end-users.

    The data model is also detailed enough to be used by the database developers as a "blueprint" for building the physical database. The information contained in the data model will be used to define the relational tables, primary and foreign keys, stored procedures, and triggers. A poorly designed database will require more time in the long term. Without careful planning you may create a database that omits data required to create critical reports, produces results that are incorrect or inconsistent, and is unable to accommodate changes in the user's requirements.

    What is data mining?

    Data mining is a process of extracting hidden trends within a data warehouse. For example, an insurance data warehouse can be used to mine data for the highest-risk people to insure in a certain geographical area.

    What is ETL?

    ETL stands for extraction, transformation and loading.

    ETL tools provide developers with an interface for designing source-to-target mappings, transformations and job control parameters.

    Extraction: Take data from an external source and move it to the warehouse pre-processor database.

    Transformation: The transform data task allows point-to-point generating, modifying and transforming of data.

    Loading: The load data task adds records to a database table in a warehouse.
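    The three steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source is an inline CSV string, the warehouse is an in-memory SQLite database, and the names `extract`, `transform`, `load` and the `orders` table are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a source system.
RAW = "id,name,amount\n1, alice ,10.50\n2,BOB,n/a\n3,carol,7\n"


def extract(text):
    """Extraction: read the source rows as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transformation: clean names, cast types, reject non-numeric amounts."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # drop rows whose amount cannot be parsed
        clean.append((int(row["id"]), row["name"].strip().title(), amount))
    return clean


def load(conn, rows):
    """Loading: add the cleaned records to the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW)))
loaded = conn.execute("SELECT id, name, amount FROM orders ORDER BY id").fetchall()
print(loaded)
```

    Keeping the three stages as separate functions makes it easy to test the cleaning rules without touching the source or the target.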

    What does level of Granularity of a fact table signify?

    The first step in designing a fact table is to determine its granularity. By granularity, we mean the lowest level of information that will be stored in the fact table. This constitutes two steps:

    1. Determine which dimensions will be included.

    2. Determine where along the hierarchy of each dimension the information will be kept.

    The determining factors usually go back to the requirements.
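    The effect of choosing a level along the time hierarchy can be sketched as follows. The transactions, the `roll_up` helper and the three grains are hypothetical; the point is only that a coarser grain yields fewer, more summarized fact rows.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical transaction-level data: (timestamp, product, quantity).
transactions = [
    ("2024-03-01 09:15:00", "widget", 2),
    ("2024-03-01 09:45:00", "widget", 1),
    ("2024-03-01 14:05:00", "widget", 4),
    ("2024-03-02 10:00:00", "widget", 3),
]

# Each grain is a level along the time dimension's hierarchy.
GRAIN_FORMATS = {"hour": "%Y-%m-%d %H", "day": "%Y-%m-%d", "month": "%Y-%m"}


def roll_up(rows, grain):
    """Aggregate quantities to the chosen grain of the time dimension."""
    totals = defaultdict(int)
    for timestamp, product, quantity in rows:
        parsed = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
        bucket = parsed.strftime(GRAIN_FORMATS[grain])
        totals[(bucket, product)] += quantity
    return dict(totals)


by_hour = roll_up(transactions, "hour")
by_day = roll_up(transactions, "day")
by_month = roll_up(transactions, "month")
print(len(by_hour), len(by_day), len(by_month))
```

    Once data is stored at the day grain, hourly questions can no longer be answered, which is why the grain follows from the reporting requirements.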

    What is the Difference between OLTP and OLAP?

    Main Differences between OLTP and OLAP are:-

    1. User and System Orientation

    OLTP: customer-oriented, used for transaction and query processing by clerks, clients and IT professionals.

    OLAP: market-oriented, used for data analysis by knowledge workers (managers, executives, analysts).

    2. Data Contents

    OLTP: manages current data, very detail-oriented.

    OLAP: manages large amounts of historical data, provides facilities for summarization and aggregation, stores information at different levels of granularity to support the decision making process.

    3. Database Design

    OLTP: adopts an entity relationship(ER) model and an application-oriented database design.

    OLAP: adopts star, snowflake or fact constellation model and a subject-oriented database design.

    4. View

    OLTP: focuses on the current data within an enterprise or department.

    OLAP: spans multiple versions of a database schema due to the evolutionary process of an organization; integrates

    information from many organizational locations and data stores

    What is SCD1 , SCD2 , SCD3?

    SCD Stands for Slowly changing dimensions.

    SCD1: maintains only the updated values.

    Ex: when a customer's address is modified, we update the existing record with the new address.

    SCD2: maintains historical and current information by using

    A) Effective date

    B) Versions

    C) Flags

    or a combination of these.

    SCD3: maintains historical and current information by adding new columns to the target table.
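    A sketch of the SCD2 approach, combining an effective date, a version number and a current-row flag, might look like this. The `apply_scd2` function and the record layout are hypothetical; the dimension is modeled as a plain list of dictionaries rather than a database table.

```python
from datetime import date


def apply_scd2(history, customer_id, new_address, change_date):
    """Record an address change as a new version instead of overwriting it."""
    current = next(
        (row for row in history
         if row["customer_id"] == customer_id and row["is_current"]),
        None,
    )
    if current is not None and current["address"] == new_address:
        return history  # nothing changed, keep the current version

    version = 1
    if current is not None:
        # Close the old version: set its end date and clear the flag.
        current["end_date"] = change_date
        current["is_current"] = False
        version = current["version"] + 1

    history.append({
        "customer_id": customer_id,
        "address": new_address,
        "version": version,
        "effective_date": change_date,
        "end_date": None,
        "is_current": True,
    })
    return history


history = []
apply_scd2(history, 42, "1 Old Road", date(2023, 1, 1))
apply_scd2(history, 42, "9 New Street", date(2024, 6, 15))
for row in history:
    print(row["version"], row["address"], row["is_current"])
```

    Under SCD1 the same change would simply overwrite the address, and under SCD3 it would move the old value into an extra column such as a previous-address field.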

    Why are OLTP database designs not generally a good idea for a Data Warehouse?

    In OLTP, tables are normalised, so query response will be slow for the end user. An OLTP database also does not contain years of data, so historical analysis is not possible.

    What is BUS Schema?

    A BUS schema is composed of a master suite of conformed dimensions and standardized definitions of facts.

    What are the various Reporting tools in the Market?

    1. MS-Excel
    2. Business Objects (Crystal Reports)
    3. Cognos (Impromptu, Power Play)
    4. Microstrategy
    5. MS reporting services
    6. Informatica Power Analyzer
    7. Actuate
    8. Hyperion (BRIO)
    9. Oracle Express OLAP
    10. Proclarity

    What is Normalization, First Normal Form, Second Normal Form , Third Normal Form?

    1. Normalization is a process for assigning attributes to entities. It reduces data redundancies, helps eliminate data anomalies, and produces controlled redundancies to link tables.

    2. Normalization is the analysis of functional dependency between attributes / data items of user views. It reduces a complex user view to a set of small and stable subgroups of fields / relations.

    1NF: Repeating groups must be eliminated, dependencies can be identified, all key attributes are defined, and there are no repeating groups in the table.

    2NF: The table is already in 1NF and includes no partial dependencies (no attribute is dependent on a portion of the primary key). It is still possible to exhibit transitive dependency; attributes may be functionally dependent on non-key attributes.

    3NF: The table is already in 2NF and contains no transitive dependencies.

    What is Fact table?

    The fact table contains the measurements, metrics or facts of a business process. If your business process is "Sales", then a measurement of this business process such as "monthly sales number" is captured in the fact table. The fact table also contains the foreign keys for the dimension tables.

    What are conformed dimensions?

    Answer1: Conformed dimensions mean the exact same thing with every possible fact table to which they are joined. Example: the Date dimension is connected to all facts, such as Sales facts, Inventory facts, etc.

    Answer2: Conformed dimensions are dimensions which are common to the cubes (cubes are the schemas that contain fact and dimension tables). Consider Cube-1, which contains F1, D1, D2, D3, and Cube-2, which contains F2, D1, D2, D4, where F denotes a fact and D a dimension. Here D1 and D2 are the conformed dimensions.
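
    The Cube-1 / Cube-2 example from Answer2 can be written out as a small Python sketch. It treats each cube as a set of dimension names, which is a simplification: sharing a name is assumed here to mean the dimension really has the same keys and attributes in both cubes.

```python
# Dimensions used by each cube, as given in the example above.
cube_dimensions = {
    "Cube-1": {"D1", "D2", "D3"},
    "Cube-2": {"D1", "D2", "D4"},
}

# Dimensions common to every cube are the conformed ones.
conformed = set.intersection(*cube_dimensions.values())
print(sorted(conformed))
```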

    What are the Different methods of loading Dimension tables?

    Conventional Load:

    Before loading the data, all the table constraints will be checked against the data.

    Direct load (faster loading):

    All the constraints will be disabled and the data will be loaded directly. Later the data will be checked against the table constraints, and the bad data won't be indexed.

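    What is a conformed fact?

    A conformed fact is a measure that has the same definition and units in every fact table and Data Mart in which it appears, so it can be compared and combined across them. Conformed dimensions, by comparison, are the dimensions which can be used across multiple Data Marts in combination with multiple fact tables.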

    What are Data Marts?

    Data Marts are designed to help managers make strategic decisions about their business. Data Marts are subsets of the corporate-wide data that are of value to a specific group of users.

    There are two types of Data Marts:

    1. Independent data marts are sourced from data captured from OLTP systems, external providers, or from data generated locally within a particular department or geographic area.

    2. Dependent data marts are sourced directly from enterprise data warehouses.

    What is the level of Granularity of a fact table?

    Level of granularity means the level of detail that you put into the fact table in a data warehouse. For example, based on the design you can decide to store the sales data for each transaction. The level of granularity then determines how much detail you keep for each transactional fact: you might record every individual product sale, or you might aggregate product sales up to the minute and store only that aggregated data.
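
    To make the two grains concrete, here is a small Python sketch that rolls transaction-level rows up to minute level. The sample rows and the helper name aggregate_to_minute are invented for illustration only.

```python
from collections import defaultdict
from datetime import datetime

# Transaction grain: one row per individual sale (timestamp, product, amount).
transactions = [
    ("2012-02-08 10:15:05", "Widget", 20.0),
    ("2012-02-08 10:15:40", "Widget", 30.0),
    ("2012-02-08 10:16:10", "Widget", 10.0),
]


def aggregate_to_minute(rows):
    """Roll transaction-level rows up to one row per (minute, product)."""
    totals = defaultdict(float)
    for timestamp, product, amount in rows:
        minute = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S").replace(second=0)
        totals[(minute, product)] += amount
    return dict(totals)


minute_grain = aggregate_to_minute(transactions)
print(len(transactions), "rows at transaction grain ->", len(minute_grain), "rows at minute grain")
```

    The coarser grain needs fewer rows, but questions about individual transactions can no longer be answered from it.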

    How are the Dimension tables designed?

    Most dimension tables are designed using normalization principles up to 2NF. In some instances they are further normalized to 3NF.

    Find where data for this dimension are located.

    Figure out how to extract this data.

    Determine how to maintain changes to this dimension (see more on this in the next section).

    What are non-additive facts?

    Non-Additive: Non-additive facts are facts that cannot be summed up for any of the dimensions present in the fact table.
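
    A ratio such as profit margin is a common example of a non-additive fact. The short Python sketch below uses made-up store figures to show why summing such a fact gives a meaningless number, and how the correct overall value is derived from the additive facts instead.

```python
# Made-up figures: sales and profit are additive, margin (a ratio) is not.
rows = [
    {"store": "A", "sales": 100.0, "profit": 10.0},
    {"store": "B", "sales": 300.0, "profit": 90.0},
]
for row in rows:
    row["margin"] = row["profit"] / row["sales"]

# Summing the ratio across stores does not give a meaningful margin.
summed_margin = sum(row["margin"] for row in rows)

# The overall margin has to be recomputed from the additive facts.
true_margin = sum(row["profit"] for row in rows) / sum(row["sales"] for row in rows)

print(summed_margin, true_margin)
```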

    What type of indexing mechanism do we need to use for a typical data warehouse?

    On the fact table it is best to use bitmap indexes. Dimension tables can use bitmap and/or the other types of clustered/non-clustered, unique/non-unique indexes.

    To my knowledge, SQL Server does not support bitmap indexes; Oracle is the best-known database that does.

    What is Snow Flake Schema?

    In a Snowflake Schema, each dimension has a primary dimension table, to which one or more additional dimension tables can join. The primary dimension table is the only table that can join to the fact table.

    What is data warehouse?

    A data warehouse is an electronic store of an organization's historical data for the purpose of analysis and reporting. According to Bill Inmon's widely quoted definition, a data warehouse should be subject-oriented, non-volatile, integrated, and time-variant.

    Explanatory Note

    Non-volatile means that the data, once loaded in the warehouse, will not get deleted later. Time-variant means that the data is recorded with respect to time, so that changes over time can be tracked.

    What are the benefits of a data warehouse?

    Historical data stored in a data warehouse helps to analyze different aspects of the business, including performance analysis, trend analysis, and trend prediction, which ultimately increases the efficiency of business processes.

    Why is a Data Warehouse used?

    A data warehouse facilitates reporting on key business processes through measures known as Key Performance Indicators (KPIs). A data warehouse can further be used for data mining, which helps with trend prediction, forecasts, pattern recognition, etc.

    What is the difference between OLTP and OLAP?

    OLTP is the transaction system that collects business data, whereas OLAP is the reporting and analysis system on that data.

    OLTP systems are optimized for INSERT and UPDATE operations and are therefore highly normalized. OLAP systems, on the other hand, are deliberately denormalized for fast data retrieval through SELECT operations.

    Explanatory Note:

    In a department store, when we pay at the check-out counter, the sales person at the counter keys all the data into a "Point-Of-Sales" machine. That data is transaction data and the related system is an OLTP system. On the other hand, the manager of the store might want to view a report on out-of-stock materials, so that he can place a purchase order for them. Such a report will come out of an OLAP system.

    What is data mart?

    Data marts are generally designed for a single subject area. An organization may have data pertaining to different departments like Finance, HR, Marketing, etc. stored in the data warehouse, and each department may have a separate data mart. These data marts can be built on top of the data warehouse.

    What is ER model?

    ER model is entity-relationship model which is designed with a goal of normalizing the data.

    Q1. What is SQL Server Reporting Services (SSRS)?
    SQL Server Reporting Services is a server-based reporting platform that you can use to create and manage tabular, matrix, graphical, and free-form reports that contain data from relational and multidimensional data sources. The reports that you create can be viewed and managed over a World Wide Web-based connection.

    Q2. Architecture of SSRS:

    -Admin

    Q3. What are the three stages of the Enterprise Reporting Life Cycle?
    a. Authoring
    b. Management
    c. Access and Delivery

    Q4. What are the components included in SSRS?
    1. A complete set of tools that can be used to create, manage and view reports.
    2. A Report Server component that hosts and processes reports in a variety of formats. Output formats include HTML, PDF, TIFF, Excel, CSV, and more.
    3. An API that allows developers to integrate or extend data and report processing in custom applications, or create custom tools to build and manage reports.

    Q5. What is the benefit of using embedded code in a report?
    1. Reusability of code: a function created in embedded code to perform some logic can then be used in multiple expressions.
    2. Centralized code: helps in better manageability of code.

    Q6. Which programming language can be used to code embedded functions in SSRS?
    Visual Basic .NET code.

    Q7. Important terms used in the reporting services?

    1. Report definition: The blueprint for a report before the report is processed or rendered. A report definition contains information about the query and layout for the report.

    2. Report snapshot: A report that contains data captured at a specific point in time. A report snapshot is actually a report definition that contains a dataset instead of query instructions.

    3. Rendered report: A fully processed report that contains both data and layout information, in a format suitable for viewing (such as HTML).

    4. Parameterized report: A published report that accepts input values through parameters.

    5. Shared data source: A predefined, standalone item that contains data source connection information.

    6. Shared schedule: A predefined, standalone item that contains schedule information.

    7. Report-specific data source: Data source information that is defined within a report definition.

    8. Report model: A semantic description of business data, used for ad hoc reports created in Report Builder.

    9. Linked report: A report that derives its definition through a link to another report.

    10. Report server administrator: This term is used in the documentation to describe a user with elevated privileges who can access all settings and content of a report server. If you are using the default roles, a report server administrator is typically a user who is assigned to both the Content Manager role and the System Administrator role. Local administrators can have elevated permissions even if role assignments are not defined for them.

    11. Folder hierarchy: A bounded namespace that uniquely identifies all reports, folders, report

    models, shared data source items, and resources that are stored in and managed by a report server.

    12. Report Server: Describes the Report Server component, which provides data and report processing, and report delivery. The Report Server component includes several subcomponents that perform specific functions.

    13. Report Manager: Describes the Web application tool used to access and manage the contents of a report server database.

    14. Report Builder: Report authoring tool used to create ad hoc reports.

    15. Report Designer: Report creation tool included with Reporting Services.

    16. Model Designer: Report model creation tool used to build models for ad hoc reporting.

    17. Report Server Command Prompt Utilities: Command line utilities that you can use to administer a report server: a) RsConfig.exe, b) RsKeymgmt.exe, c) Rs.exe

    Q8. What are the Command Line Utilities available in Reporting Services?
    Rsconfig Utility (Rsconfig.exe): encrypts and stores connection and account values in the RSReportServer.config file. Encrypted values include report server database connection information and account values used for unattended report processing.
    RsKeymgmt Utility: extracts, restores, creates, and deletes the symmetric key used to protect sensitive report server data against unauthorized access.
    RS Utility: this utility is mainly used to automate report server deployment and administration tasks. It processes a script that you provide in an input file.

    Q. How to know Report Execution History?
    The ExecutionLog table in the ReportServer database stores all the logs from the last two months.
    SELECT * FROM ReportServer.dbo.ExecutionLog

    -Development

    Q. What is the difference between Tabular and Matrix reports? OR What are the different styles of reports?

    Tabular report: A tabular report is the most basic type of report. Each column corresponds to a column selected from the database.

    Matrix report: A matrix (cross-product) report is a cross-tabulation of four groups of data:
    a. One group of data is displayed across the page.
    b. One group of data is displayed down the page.
    c. One group of data is the cross-product, which determines all possible locations where the across and down data relate and places a cell in those locations.
    d. One group of data is displayed as the "filler" of the cells.
    Matrix reports can be thought of as pivot tables.

    Q. How to create Drill-through reports?
    Use the Navigation property of a cell and set the child report and its parameters in it.

    Q. How to create Drill-Down reports?
    To cut the story short:
    - Group the data on the required fields.
    - Then toggle visibility based on the grouped field.

    Q1 Explain architecture of SSIS?

    SSIS architecture consists of four key parts:

    a) Integration Services service: monitors running Integration Services packages and manages the storage of packages.

    b) Integration Services object model: includes managed API for accessing Integration Services tools, command-line utilities, and custom applications.

    c) Integration Services runtime and run-time executables: it saves the layout of packages, runs packages, and provides support for logging, breakpoints, configuration, connections, and transactions. The Integration Services run-time executables are the package, containers, tasks, and event handlers that Integration Services includes, and custom tasks.

    d) Data flow engine: provides the in-memory buffers that move data from source to destination.

    Q2 How would you do Logging in SSIS?
    Logging configuration is a built-in feature which can log the details of various events, such as OnError and OnWarning, to various destinations, such as a flat file, a SQL Server table, XML, or SQL Profiler.

    Q3 How would you do Error Handling?
    A SSIS package could mainly have two types of errors:

    a) Procedure Error: can be handled in the Control Flow through precedence constraints and redirecting the execution flow.
    b) Data Error: is handled in the Data Flow Task by redirecting the data flow using the Error Output of a component.
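
    The idea behind a Data Error being redirected to an Error Output can be illustrated with a plain Python analogy. This is not the SSIS API: the function convert_rows and the id/amount columns are hypothetical, and the sketch only shows good rows continuing down the main path while failing rows are diverted together with the reason.

```python
def convert_rows(rows):
    """Convert each row; send failures to a separate error output instead of aborting."""
    good, errors = [], []
    for row in rows:
        try:
            good.append({"id": int(row["id"]), "amount": float(row["amount"])})
        except (KeyError, ValueError) as exc:
            # Redirected row: keep the original data plus an error description.
            errors.append({"row": row, "error": str(exc)})
    return good, errors


good_rows, error_rows = convert_rows(
    [{"id": "1", "amount": "9.5"}, {"id": "x", "amount": "3"}]
)
print(good_rows)
print(len(error_rows), "row(s) redirected")
```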

    Q4 How to pass a property value at run time? How do you implement Package Configuration?
    A property value, like the connection string for a Connection Manager, can be passed to the package using package configurations. Package Configuration provides different options like XML File, Environment Variables, SQL Server Table, Registry Value, or Parent package variable.

    Q5 How would you deploy a SSIS Package on production?
    A) Through Manifest
    1. Create the deployment utility by setting its property to true.
    2. It will be created in the bin folder of the solution as soon as the package is built.
    3. Copy all the files in the utility and use the manifest file to deploy it on Prod.
    B) Using the DTExec.exe utility
    C) Import the package directly into MSDB from SSMS by logging in to Integration Services.

    Q6 Difference between DTS and SSIS?
    Everything, except that both are products of Microsoft :-).

    Q7 What are the new features in SSIS 2008?
    Explained in another post: http://sqlserversolutions.blogspot.com/2009/01/new-improvementfeatures-in-ssis-2008.html

    Q8 How would you pass a variable value to a Child Package?
    Too big to fit here, so I had to write another post: http://sqlserversolutions.blogspot.com/2009/02/passing-variable-to-child-package-from.html

    Q9 What is an Execution Tree?
    Execution trees demonstrate how a package uses buffers and threads. At run time, the data flow engine breaks down Data Flow task operations into execution trees. These execution trees specify how buffers and threads are allocated in the package. Each tree creates a new buffer and may execute on a different thread. When a new buffer is created, such as when a partially blocking or blocking transformation is added to the pipeline, additional memory is required to handle the data transformation, and each new tree may also give you an additional worker thread.

    Q10 What are the points to keep in mind for performance improvement of the package?
    http://technet.microsoft.com/en-us/library/cc966529.aspx

    Q11 You may get a question stating a scenario and then asking how you would create a package for it, e.g. how would you configure a data flow task so that it can transfer data to different tables based on the city name in a source table column?
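
    In SSIS this kind of routing is usually done with a Conditional Split transformation inside the data flow. The routing logic itself can be sketched in plain Python; the function split_by_city, the city-to-table mapping, and the table names below are all hypothetical and stand in for the split conditions and destinations you would configure.

```python
from collections import defaultdict


def split_by_city(rows, city_to_table, default_table="other_cities"):
    """Route each row to a destination table chosen by its city column."""
    outputs = defaultdict(list)
    for row in rows:
        # Rows whose city has no mapping fall through to the default output.
        outputs[city_to_table.get(row["city"], default_table)].append(row)
    return dict(outputs)


routed = split_by_city(
    [
        {"name": "Asha", "city": "Pune"},
        {"name": "Ravi", "city": "Delhi"},
        {"name": "Meera", "city": "Kochi"},
    ],
    {"Pune": "customers_pune", "Delhi": "customers_delhi"},
)
print({table: len(rows) for table, rows in routed.items()})
```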

    Q13 Difference between Union All and Merge?
    a) The Merge transformation can accept only two inputs, whereas Union All can take more than two inputs.

    b) Data has to be sorted before the Merge transformation, whereas Union All doesn't have any condition like that.
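
    The two differences can be mimicked in plain Python to make them concrete. This is an analogy rather than SSIS code: union_all and merge_sorted are hypothetical helper names, and the explicit sorted-input check stands in for the requirement that Merge inputs be sorted.

```python
from itertools import chain


def union_all(*inputs):
    """Concatenate any number of inputs; no ordering is required or produced."""
    return list(chain(*inputs))


def merge_sorted(left, right):
    """Combine exactly two already-sorted inputs into one sorted output."""
    if list(left) != sorted(left) or list(right) != sorted(right):
        raise ValueError("both inputs must be sorted before merging")
    result, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])   # whatever remains is already in order
    result.extend(right[j:])
    return result


print(union_all([3, 1], [2], [5, 4]))
print(merge_sorted([1, 4, 6], [2, 3, 7]))
```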

    Q14 You may get a question regarding what X transformation does, e.g. Lookup, fuzzy lookup, fuzzy
