what is data warehousing
TRANSCRIPT
-
8/2/2019 What is Data Warehousing
1/21
What is Data warehousing?
Answer -A data warehouse can be considered as a storage area where interest specific or relevant
data........
What are fact tables and dimension tables?
Answer -As mentioned, data in a warehouse comes from the transactions. Fact table in a data warehouse
consists........
What is ETL process in data warehousing?
Answer - ETL is Extract Transform Load. It is a process of fetching.........
Explain the difference between data mining and data warehousing.
Answer - Data warehousing is merely extracting data from different sources, cleaning the........
What is an OLTP system and OLAP system?
Answer - OLTP: Online Transaction Processing helps and manages applications based........
What are cubes?
Answer -A data cube stores data in a summarized version which helps in a faster analysis of data..........
What is snowflake schema design in database?
Answer -A snowflake schema in its simplest form is an arrangement of fact tables.........
What is analysis service?
Answer -Analysis service provides a combined view of the data used in OLAP.........
Explain sequence clustering algorithm.
Answer - Sequence clustering algorithm collects similar or related paths, sequences of data.........
Explain discrete and continuous data in data mining.
Answer - Discrete data can be considered as defined or finite data..........
Explain time series algorithm in data mining.
Answer - Time series algorithm can be used to predict continuous values of data.......
What is XMLA?
Answer - XMLA is XML for Analysis which can be considered as a standard for accessing data in OLAP.......
Explain the difference between Data warehousing and Business Intelligence.
Dependent data warehouses are built........
What is data modeling and data mining? What is this used for?
Designing a model for data or database is called data modelling........
Difference between ER Modeling and Dimensional Modeling.
Dimensional modelling is very flexible for the user perspective........
What is snapshot with reference to data warehouse?
A snapshot of data warehouse is a persisted report from the catalogue.........
What is degenerate dimension table?
Dimensions that are persisted in the fact table are called degenerate dimensions.........
What is Data Mart?
Data Mart is a data repository which is served to a community of people.......
What is the difference between metadata and data dictionary?
Metadata describes about data. It is data about data. It has information about how and when......
Describe the various methods of loading Dimension tables.
The following are the methods of loading dimension tables.......
What is the difference between OLAP and data warehouse?
The following are the differences between OLAP and data warehousing......
Describe the foreign key columns in fact table and dimension table.
The primary keys of entity tables are the foreign keys of dimension tables.......
What is cube grouping?
A transformer built set of similar cubes is known as cube grouping. A single level in one dimension......
Define the term slowly changing dimensions (SCD).
Slowly changing dimension target operator is one of the SQL warehousing operators......
What is a Star Schema?
The simplest data warehousing schema is star schema.......
Differences between star and snowflake schema.
Star Schema: A de-normalized technique in which one fact table is associated with several dimension tables.......
Explain the use of lookup tables and Aggregate tables.
At the time of updating the data warehouse, a lookup table is used.......
What is real time data-warehousing?
The combination of real-time activity and data warehousing is called real time warehousing.......
What is conformed fact? What is conformed dimensions use for?
Conformed facts allow the same names to be used in different tables.......
Define non-additive facts.
The facts that can not be summed up for the dimensions present in the fact table are called non-additive facts.......
Define BUS Schema.
A BUS schema is to identify the common dimensions across business processes......
List out difference between SAS tool and other tools.
The differences between SAS and other tools are......
Why is SAS so popular?
Statistical Analysis System is an integration of various software products which allows the developers to perform.......
What is data cleaning? How can we do that?
Data cleaning is also known as data scrubbing. Data cleaning is a process which ensures the set of data is correct
and accurate......
Explain in brief about critical column.
A column (usually granular) is called as critical column which changes the values over a period of time.......
What is data cube technology used for?
Data cube is a multi-dimensional structure. Data cube is a data abstraction to view aggregated data from a number of
perspectives.........
What is Data Scheme?
Data Scheme is a diagrammatic representation that illustrates data structures and data relationships to each other in
the relational database within the data warehouse................
What is Bit Mapped Index?
Bitmap indexes make use of bit arrays (bitmaps) to answer queries by performing bitwise logical
operations..................
What is Bi-directional Extract?
In hierarchical, networked or relational databases, the data can be extracted, cleansed and transferred in two directions. The ability of a system to do this is referred to as bidirectional extracts................
What is Data Collection Frequency?
Data collection frequency is the rate at which data is collected. However, the data is not just collected and stored. It
goes through various stages of processing like extracting from various sources, cleansing, transforming and then
storing in useful patterns................
What is Data Cardinality?
Cardinality is the term used in database relations to denote the occurrences of data on either side of the
relation................
What is Chained Data Replication?
In Chain Data Replication, the non-official data set distributed among many disks provides for load balancing among
the servers within the data warehouse...............
What are Critical Success Factors?
Key areas of activity in which favorable results are necessary for a company to reach its goal. There are four basic
types of CSFs which are: ...............
What is Data Warehousing?
A data warehouse can be considered as a storage area where interest specific or relevant data is stored irrespective
of the source. What actually is required to create a data warehouse can be............
What is Virtual Data Warehousing?
A virtual data warehouse provides a collective view of the completed data. A virtual data warehouse has...........
Explain in brief various fundamental stages of Data Warehousing.
Stages of a data warehouse help to find and understand how the data in the warehouse changes. At an initial stage
of data warehousing, data of the transactions is merely copied to another server. Here, even if the copied data is
processed for reporting, the source data's performance won't be affected..............
What is active data warehousing?
An active data warehouse represents a single state of the business. Active data warehousing considers the analytic
perspectives of customers................
What is data modeling and data mining? What is this used for?
Data Modeling is a technique used to define and analyze the requirements of data that supports an organization's business process. In simple terms, it is used for the analysis of data objects in order to...........
Difference between ER Modeling and Dimensional Modeling
The entity-relationship model is a method used to represent the logical flow of entities/objects graphically that in turn
create a database. It has both logical and physical model. And it is good for reporting and point queries..............
What is the difference between data warehousing and business intelligence?
Data warehousing relates to all aspects of data management starting from the development, implementation and
operation of the data sets. It is a back up of all data relevant to business context i.e. a way of storing data.............
Describe dimensional Modeling.
Dimensional model is a method in which the data is stored in two types of tables namely facts table and dimension
table. Fact table comprises of information to measure business successes and the............
What is snapshot with reference to data warehouse?
Snapshot refers to a complete visualization of data at the time of extraction. It occupies less space and can be used
to back up and restore data quickly...............
What is SQL Server 2005 Analysis Services (SSAS)?
SSAS gives the business data an integrated view. This integrated view is provided by combining online analytical
processing (OLAP) and data mining functionality................
What are the new features with SQL Server 2005 Analysis Services (SSAS)?
It offers interoperability with Microsoft office 2007. It eases data mining by offering better data mining algorithms and
enables better predictive analysis...................
What are SQL Server Analysis Services cubes?
In analysis services, cube is the basic unit of storage. A cube has data collected from various sources that enables
faster execution of queries. Cubes have dimensions and measures................
Explain the purpose of synchronization feature provided in Analysis Services 2005.
Synchronization feature is used to copy database from one source server to a destination server. While the
synchronization is in progress, users can still browse the cubes.................
MDX in SQL Server 2005 Analysis Services brings exciting improvements including query support and
expression/calculation language, Explain
MDX in SQL server 2005 Analysis services offers CASE and SCOPE statements. CASE returns specific values
based upon its comparison of an expression to a set of simple expressions. It can perform.............
Can you explain how to deploy an SSIS package?
An SSIS package can be deployed using the Deploy SSIS Packages page. The package and its dependencies can be
either deployed in a specified folder in the file system or in an instance of SQL server...............
Can you explain the difference between the INTERSECT and EXCEPT operators?
INTERSECT returns data values common to BOTH queries (queries on the left and right side of the operator). On the other hand, EXCEPT returns the distinct data values from the left query (query on left side of the .................
What is the new error handling technique in SQL Server 2005?
Previously, error handling was done using @@ERROR or by checking @@ROWCOUNT, which didn't turn out to be a very
feasible option for fatal errors. The new error handling technique in SQL Server 2005 provides a TRY ...............
What exactly is SQL Server 2005 Service Broker?
Service brokers allow you to build applications in which independent components work together to accomplish a task. They help build scalable, secure database applications. The brokers provide a message based ..................
Explain the Service Broker components.
Service broker components help build applications in which independent components work together to accomplish a
task. These applications are independent, asynchronous and the components work together.............
What is a breakpoint in SSIS? How is it setup? How do you disable it?
Breakpoints allow the execution to be paused in order to review the status of the data, variables and the overall status
of the SSIS package. Breakpoints in SSIS are set up through the BIDS wizard. In this wizard,..............
Explain the concepts and capabilities of Business Intelligence.
Business Intelligence helps to manage data by applying different skills, technologies, security and quality risks. This
also helps in achieving a better understanding of data. Business intelligence can be considered..........
Name some of the standard Business Intelligence tools in the market.
Business intelligence tools are to report, analyze and present data. Few of the tools available in the market
are:..............
Explain the Dashboard in the business intelligence.
A dashboard in business intelligence allows huge data and reports to be read in a single graphical interface. They help in making faster decisions by relying on measurable data seen at a glance. They can..............
SAS Business Intelligence.
SAS business intelligence has analytical capabilities like statistics, reporting, data mining, predictions, forecasting and
optimization..............
Explain the SQL Server 2005 Business Intelligence components.
SQL Server Integration Services:- Used for data transformation and creation. Used in data acquisition from a source
system.................
What is broad cast agent?
A broadcast agent allows automation of emails to be distributed. It allows reports to be sent to different business
objects. It also allows users to choose the report format and send via SMS, fax, pagers etc. broadcast.............
Explain the functional differences between BO and COGNOS.
Business objects in business intelligence are entities of the business. COGNOS makes BI and performance planning
software...............
What is a universe? Explain the types of universes in business objects.
A universe connects the client to the data warehouse. It is a file defining relationships amongst the tables in the
warehouse, classes and objects, database connection details...............
What is security domain in Business Objects?
Security domain in business objects is a domain containing all security information like login credentials etc. It checks
for users and their privileges..................
What is batch processing in Business Objects?
Batch processing can be used to schedule reports. Objects can also be used for batch processing. Batch
processing can also be used to select the objects to be processed.................
What are the functional & architectural differences between business objects and Web Intelligence Reports?
Functional differences - Business objects, for building or accessing reports, needs to be installed on every pc. On the
other hand, Web intelligence reports need a browser and a URL of the server from where Business...................
What is slicing and dicing in business objects?
Slicing and dicing of business objects is used for a detailed analysis of the data. It allows changing the position of data by interchanging rows and columns.............
What is the security level used in BO?
Security level used in BO:- Row Level, Column Level...............
What is Object qualification?
Object qualification is an attribute of an object that helps to determine how it can be used in multidimensional
analysis.............
What is BOMain.Key?
A BOMain.key file contains all relevant information about the repository. It contains the address of the repository
security domain..................
OLAP database objects
The following are the OLAP database objects:
Cubes: Data in cubes are persisted in a summarized version that helps to analyze data quickly. The data is persisted,
through which reporting can be done easily..................
Cubes
A data cube stores data in a summarized version which helps in a faster analysis of data. The data is stored in such a
way that it allows reporting easily................
Data Sources
Data source is where the data comes from in data warehousing. The data is collected from various sources and is
cleaned..............
Fact Tables
Data in a warehouse comes from the transactions. Fact table in a data warehouse consists of facts and/or measures.
The nature of data in a fact table is usually numerical. e.g. If I want to know the number..............
Database roles
Database level roles are used to manage the security of the database. The role can be either fixed or
flexible...............
Explain the concepts and capabilities of OLAP.
Online analytical processing performs analysis of business data and provides the ability to perform complex calculations on usually low volumes of data. OLAP helps the user gain an insight on the data..............
Explain the functionality of OLAP.
Multidimensional analysis:- OLAP helps the user gain an insight on the data coming from different sources. OLAP
helps faster execution of complex analytical and ad-hoc queries...............
What are MOLAP and ROLAP?
Multidimensional Online Analytical Processing and Relational Online Analytical Processing are tools used in analysis
of data which is multidimensional..................
Explain the role of bitmap indexes to solve aggregation problems.
Bitmap indexes are useful in connecting smaller databases to larger databases. Bit map indexes can be very useful
in performing repetitive indexes..............
Explain the encoding technique used in bitmaps indexes.
For each distinct value, one bitmap is used. The number of bitmaps can be reduced by using log(C) bitmaps to
represent the values in each bin................
What is Binning?
Binning can be used to hold multiple values in one bin. Bitmaps are then used to represent the values in.............
What is candidate check?
When the binning process creates the binned indexes, they answer only some queries. The base data is not checked. The
process of checking the base data is called a candidate check. Candidate check at times.............
What is Hybrid OLAP?
In a Hybrid OLAP, the database gets divided into relational and specialized storage. Specialized data storage is for
data with fewer details while relational storage can be used for large amount of data..............
Explain the shared features of OLAP.
OLAP product by default is read only. If multiple access rights are required, admin needs to make necessary
changes..................
Compare Data Warehouse database and OLTP database.
Data Warehouse is used for business measures, cannot be used to cater to real time business needs of the organization and is optimized for lots of data and unpredictable queries. On the other hand, OLTP database is for.................
What is the difference between ETL tool and OLAP tool?
ETL is the process of extracting, transforming and loading data into meaningful form. This data can be used by the
OLAP tool to visualize data in different forms. ETL tools also perform some cleaning of data..................
What is the difference between OLAP and DSS?
Data driven Decision support system is used to access and manipulate data. Data Driven DSS in conjunction with On
line Analytical Processing............
What's a Data Warehouse?
Answer1: A data warehouse is a repository of integrated information, available for queries and analysis. Data and information are extracted from heterogeneous sources as they are generated. This makes it much easier and more efficient to run queries over data that originally came from different sources. Another definition for data warehouse is: "A data warehouse is a logical collection of information gathered from many different operational databases used to create business intelligence that supports business analysis activities and decision-making tasks, primarily, a record of an enterprise's past transactional and operational information, stored in a database designed to favour efficient data analysis and reporting (especially OLAP)". Generally, data warehousing is not meant for current "live" data, although 'virtual' or 'point-to-point' data warehouses can access operational data. A 'real' data warehouse is generally preferred to a virtual DW because stored data has been validated and is set up to provide reliable results to common types of queries used in a business.
Answer2: Data Warehouse is a repository of integrated information, available for queries and analysis. Data and information are extracted from heterogeneous sources as they are generated. This makes it much easier and more efficient to run queries over data that originally came from different sources.
Typical relational databases are designed for on-line transactional processing (OLTP) and do not meet the requirements for effective on-line analytical processing (OLAP). As a result, data warehouses are designed differently than traditional relational databases.
What is ODS?
1. ODS means Operational Data Store.
2. A collection of operational or base data that is extracted from operational databases and standardized, cleansed, consolidated, transformed, and loaded into an enterprise data architecture. An ODS is used to support data mining of operational data, or as the store for base data that is summarized for a data warehouse. The ODS may also be used to audit the data warehouse to assure summarized and derived data is calculated properly. The ODS may further become the enterprise shared operational database, allowing operational systems that are being reengineered to use the ODS as their operational databases.
What is a dimension table?
A dimension table is a collection of hierarchies and categories along which the user can drill down and drill up. It contains
only the textual attributes.
What is a lookup table?
A lookup table is one that is used when updating a warehouse. When the lookup is placed on the target table (fact table / warehouse) based upon the primary key of the target, it updates the table by allowing only new records or
updated records based on the lookup condition.
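The lookup-then-update behaviour described above can be sketched in Python. This is only an illustration, not the implementation of any particular ETL tool: the `apply_lookup` function, the dictionary standing in for the target table and the `customer_id` key are all assumptions made for the example.

```python
def apply_lookup(target, incoming, key="customer_id"):
    """Insert new rows and update changed rows; skip rows that are unchanged.

    `target` is a dict keyed by primary key, standing in for the target table.
    Returns counts of inserted, updated and skipped rows.
    """
    counts = {"inserted": 0, "updated": 0, "skipped": 0}
    for row in incoming:
        pk = row[key]
        existing = target.get(pk)      # the lookup on the target's primary key
        if existing is None:
            target[pk] = dict(row)     # new record
            counts["inserted"] += 1
        elif existing != row:
            target[pk] = dict(row)     # changed record
            counts["updated"] += 1
        else:
            counts["skipped"] += 1     # identical record, nothing to do
    return counts


# Hypothetical data: one existing customer, one changed row and one new row.
target_table = {1: {"customer_id": 1, "city": "Pune"}}
incoming_rows = [
    {"customer_id": 1, "city": "Mumbai"},
    {"customer_id": 2, "city": "Delhi"},
]
lookup_counts = apply_lookup(target_table, incoming_rows)
```

In a real warehouse the lookup would usually be a keyed read against the target table or a cached copy of its keys, rather than an in-memory dictionary.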
Why should you put your data warehouse on a different system than your OLTP system?
Answer1:
An OLTP system is basically "data oriented" (ER model) and not "subject oriented" (dimensional model). That is why we design a separate system that will have a subject oriented OLAP system...
Moreover, a complex query fired on an OLTP system will cause a heavy overhead on the OLTP server that will affect the day-to-day business directly.
Answer2:
The loading of a warehouse will likely consume a lot of machine resources. Additionally, users may create queries or reports that are very resource intensive because of the potentially large amount of data available. Such loads and resource
needs will conflict with the needs of the OLTP systems for resources and will negatively impact those production systems.
What are Aggregate tables?
An aggregate table contains a summary of existing warehouse data, grouped to certain levels of dimensions. Retrieving the required data from the actual table, which may have millions of records, takes more time and also affects server performance. To avoid this, we can aggregate the table to the required level and query the aggregate instead. These tables reduce the load on the database server, improve query performance and return results very quickly.
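As a rough sketch of the idea, the following uses Python's built-in `sqlite3` module to build a monthly aggregate from a detailed sales table. The table and column names (`sales_fact`, `sales_monthly_agg`, `amount`) and the sample values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_fact (sale_date TEXT, product TEXT, amount REAL);
    INSERT INTO sales_fact VALUES
        ('2019-01-05', 'widget', 10.0),
        ('2019-01-20', 'widget', 15.0),
        ('2019-02-03', 'widget', 20.0),
        ('2019-02-10', 'gadget', 5.0);

    -- Aggregate table: one row per (month, product) instead of one per sale.
    CREATE TABLE sales_monthly_agg AS
        SELECT substr(sale_date, 1, 7) AS month,
               product,
               SUM(amount) AS total_amount,
               COUNT(*)    AS sale_count
        FROM sales_fact
        GROUP BY substr(sale_date, 1, 7), product;
""")

# Reports at month level can now read the small aggregate table.
agg_rows = conn.execute(
    "SELECT month, product, total_amount, sale_count "
    "FROM sales_monthly_agg ORDER BY month, product"
).fetchall()
```

The trade-off is that the aggregate has to be refreshed whenever the detailed table is loaded, and it can only answer questions at or above the level it was grouped to.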
What is Dimensional Modelling? Why is it important ?
Dimensional Modelling is a design concept used by many data warehouse designers to build their data warehouse. In this
design model all the data is stored in two types of tables: fact tables and dimension tables. A fact table contains the facts/measurements of the business and a dimension table contains the context of the measurements, i.e. the dimensions on
which the facts are calculated.
Why is Data Modeling Important?
Data modeling is probably the most labor intensive and time consuming part of the development process. Why bother, especially if you are pressed for time? A common response by practitioners who write on the subject is that you should no more build a database without a model than you should build a house without blueprints.
The goal of the data model is to make sure that all data objects required by the database are completely and accurately represented. Because the data model uses easily understood notations and natural language, it can be reviewed and verified as correct by the end-users.
The data model is also detailed enough to be used by the database developers as a "blueprint" for building the physical database. The information contained in the data model will be used to define the relational tables, primary and foreign keys, stored procedures, and triggers. A poorly designed database will require more time in the long-term. Without careful planning you may create a database that omits data required to create critical reports, produces results that are incorrect or inconsistent, and is unable to accommodate changes in the user's requirements.
What is data mining?
Data mining is a process of extracting hidden trends within a data warehouse. For example, an insurance data warehouse
can be used to mine data for the highest-risk people to insure in a certain geographical area.
What is ETL?
ETL stands for extraction, transformation and loading.
ETL tools provide developers with an interface for designing source-to-target mappings, transformations and job control parameters.
Extraction: Take data from an external source and move it to the warehouse pre-processor database.
Transformation: The transform data task allows point-to-point generating, modifying and transforming of data.
Loading: The load data task adds records to a database table in a warehouse.
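The three steps can be sketched end to end in a few lines of Python. This is a toy pipeline under stated assumptions, not how any specific ETL product works: the CSV text, the `orders_fact` table and the `extract`, `transform` and `load` function names are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical source extract; in practice this would come from a file or source system.
SOURCE_CSV = "order_id,customer,amount\n1, alice ,10.50\n2,BOB,not-a-number\n3,carol,7\n"


def extract(text):
    """Extraction: read raw rows from the external source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transformation: standardise names, convert types, drop rows that fail validation."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # reject rows whose amount is not numeric
        clean.append((int(row["order_id"]), row["customer"].strip().title(), amount))
    return clean


def load(conn, rows):
    """Loading: add the cleaned records to a warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders_fact "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders_fact VALUES (?, ?, ?)", rows)
    conn.commit()


warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract(SOURCE_CSV)))
loaded = warehouse.execute("SELECT * FROM orders_fact ORDER BY order_id").fetchall()
```

Keeping the three stages as separate functions mirrors the separation described above: the source can change without touching the cleaning rules, and rejected rows never reach the warehouse table.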
What does level of Granularity of a fact table signify?
Granularity
The first step in designing a fact table is to determine the granularity of the fact table. By granularity, we mean the lowest level of information that will be stored in the fact table. This constitutes two steps:
1. Determine which dimensions will be included.
2. Determine where along the hierarchy of each dimension the information will be kept.
The determining factors usually go back to the requirements.
What is the Difference between OLTP and OLAP?
Main Differences between OLTP and OLAP are:-
1. User and System Orientation
OLTP: customer-oriented, used for transaction and query processing by clerks, clients and IT professionals.
OLAP: market-oriented, used for data analysis by knowledge workers (managers, executives, analysts).
2. Data Contents
OLTP: manages current data, very detail-oriented.
OLAP: manages large amounts of historical data, provides facilities for summarization and aggregation, stores information at different levels of granularity to support decision making process.
3. Database Design
OLTP: adopts an entity relationship(ER) model and an application-oriented database design.
OLAP: adopts star, snowflake or fact constellation model and a subject-oriented database design.
4. View
OLTP: focuses on the current data within an enterprise or department.
OLAP: spans multiple versions of a database schema due to the evolutionary process of an organization; integrates
information from many organizational locations and data stores
What is SCD1 , SCD2 , SCD3?
SCD Stands for Slowly changing dimensions.
SCD1: only the updated values are maintained.
Ex: when a customer's address is modified, we update the existing record with the new address.
SCD2: historical and current information are both maintained by using
A) Effective Date
B) Versions
C) Flags
or a combination of these.
SCD3: historical and current information are both maintained by adding new columns to the target table.
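The three approaches can be compared side by side in a small Python sketch. The row layout (`version`, `effective_date`, `end_date`, `is_current`, `previous_address`) and the function names are assumptions chosen for the example; real warehouses differ in which of these columns they keep.

```python
from datetime import date


def scd1_update(row, new_address):
    """SCD1: overwrite the value; the old address is lost."""
    row["address"] = new_address


def scd2_update(rows, customer_id, new_address, effective_date):
    """SCD2: close the current row and append a new version with dates and a flag."""
    versions = [r for r in rows if r["customer_id"] == customer_id]
    for r in versions:
        if r["is_current"]:
            r["is_current"] = False
            r["end_date"] = effective_date
    rows.append({
        "customer_id": customer_id,
        "address": new_address,
        "version": len(versions) + 1,
        "effective_date": effective_date,
        "end_date": None,
        "is_current": True,
    })


def scd3_update(row, new_address):
    """SCD3: keep the prior value in an extra column on the same row."""
    row["previous_address"] = row["address"]
    row["address"] = new_address


type1 = {"customer_id": 7, "address": "12 Old Road"}
scd1_update(type1, "34 New Street")

type2 = [{"customer_id": 7, "address": "12 Old Road", "version": 1,
          "effective_date": date(2018, 1, 1), "end_date": None, "is_current": True}]
scd2_update(type2, 7, "34 New Street", date(2019, 8, 2))

type3 = {"customer_id": 7, "address": "12 Old Road", "previous_address": None}
scd3_update(type3, "34 New Street")
```

After the same address change, the SCD1 row holds only the new address, the SCD2 table holds two rows (one expired, one current), and the SCD3 row holds the new address plus exactly one previous value.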
Why are OLTP database designs not generally a good idea for a Data Warehouse?
In OLTP, tables are normalised, so query response will be slow for the end user. OLTP also does not contain years of
data and hence cannot be used for historical analysis.
What is BUS Schema?
BUS Schema is composed of a master suite of conformed dimensions and standardized definitions of facts.
What are the various Reporting tools in the Market?
1. MS-Excel
2. Business Objects (Crystal Reports)
3. Cognos (Impromptu, Power Play)
4. Microstrategy
5. MS reporting services
6. Informatica Power Analyzer
7. Actuate
8. Hyperion (BRIO)
9. Oracle Express OLAP
10. Proclarity
What is Normalization, First Normal Form, Second Normal Form , Third Normal Form?
1. Normalization is a process for assigning attributes to entities. It reduces data redundancies, helps eliminate data anomalies and produces controlled redundancies to link tables.
2. Normalization is the analysis of functional dependency between attributes / data items of user views. It reduces a complex user view to a set of small and stable subgroups of fields / relations.
1NF: Repeating groups must be eliminated, dependencies can be identified, all key attributes are defined, no repeating groups in the table.
2NF: The table is already in 1NF and includes no partial dependencies (no attribute is dependent on a portion of the primary key). It is still possible to exhibit transitive dependency: attributes may be functionally dependent on non-key attributes.
3NF: The table is already in 2NF and contains no transitive dependencies.
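To make the partial-dependency rule concrete, here is a small Python sketch that removes one. The order-line data and the names `flat_order_lines`, `products` and `order_lines` are invented for illustration. In the flat table the key is (order_id, product_id), but product_name depends on product_id alone, which violates 2NF.

```python
# Flat table with composite key (order_id, product_id).
# product_name depends only on product_id: a partial dependency.
flat_order_lines = [
    {"order_id": 1, "product_id": "P1", "product_name": "widget", "qty": 2},
    {"order_id": 1, "product_id": "P2", "product_name": "gadget", "qty": 1},
    {"order_id": 2, "product_id": "P1", "product_name": "widget", "qty": 5},
]

# Decompose: product attributes move to their own table keyed by product_id,
# and the order-line table keeps only attributes that depend on the whole key.
products = {}
order_lines = []
for row in flat_order_lines:
    products[row["product_id"]] = row["product_name"]
    order_lines.append(
        {"order_id": row["order_id"], "product_id": row["product_id"], "qty": row["qty"]}
    )
```

After the split, the name "widget" is stored once instead of once per order line, so renaming a product touches a single row.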
What is Fact table?
Fact Table contains the measurements or metrics or facts of business process. If your business process is "Sales" , then a
measurement of this business process such as "monthly sales number" is captured in the Fact table. Fact table also
contains the foreign keys for the dimension tables.
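A minimal sketch of such a fact table, using Python's `sqlite3` module, might look like the following. The table names (`fact_sales`, `dim_product`, `dim_date`), the keys and the figures are assumptions made for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, month TEXT);

    -- The fact table holds the measurement plus foreign keys to each dimension.
    CREATE TABLE fact_sales (
        product_key  INTEGER REFERENCES dim_product(product_key),
        date_key     INTEGER REFERENCES dim_date(date_key),
        sales_amount REAL
    );

    INSERT INTO dim_product VALUES (1, 'widget'), (2, 'gadget');
    INSERT INTO dim_date    VALUES (20190101, '2019-01'), (20190201, '2019-02');
    INSERT INTO fact_sales  VALUES
        (1, 20190101, 100.0), (1, 20190201, 150.0), (2, 20190101, 40.0);
""")

# Monthly sales per product: join the fact to its dimensions and sum the measure.
monthly_sales = db.execute("""
    SELECT d.month, p.product_name, SUM(f.sales_amount)
    FROM fact_sales f
    JOIN dim_product p ON p.product_key = f.product_key
    JOIN dim_date d    ON d.date_key = f.date_key
    GROUP BY d.month, p.product_name
    ORDER BY d.month, p.product_name
""").fetchall()
```

The fact rows carry only keys and numbers; every descriptive label in the result comes from a dimension table through a join.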
What are conformed dimensions?
Answer1: Conformed dimensions mean the exact same thing with every possible fact table to which they are joined. Ex: the Date
dimension is connected to all facts, such as Sales facts, Inventory facts, etc.
Answer2: Conformed dimensions are dimensions which are common to the cubes (cubes are the schemas that contain fact and
dimension tables). Consider Cube-1 containing F1, D1, D2, D3 and Cube-2 containing F2, D1, D2, D4, where F1 and F2 are the facts and D1 to D4 are the dimensions. Here D1 and D2 are
the conformed dimensions.
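The cube example in Answer2 amounts to a set intersection, which can be written out in Python as below. The dictionary layout is only an illustrative assumption, and sharing a name is a necessary condition rather than a sufficient one: a dimension is truly conformed only if its keys and attributes also mean the same thing in both cubes.

```python
# The two cubes from the example above, each with one fact and three dimensions.
cube_1 = {"fact": "F1", "dimensions": {"D1", "D2", "D3"}}
cube_2 = {"fact": "F2", "dimensions": {"D1", "D2", "D4"}}

# Dimensions shared by both cubes are the candidates for conformed dimensions.
conformed = sorted(cube_1["dimensions"] & cube_2["dimensions"])

# Dimensions used by only one of the cubes.
cube_specific = sorted(cube_1["dimensions"] ^ cube_2["dimensions"])
```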
What are the Different methods of loading Dimension tables?
Conventional Load:
Before loading the data, all the table constraints will be checked against the data.
Direct load: (Faster Loading)
All the constraints will be disabled. Data will be loaded directly. Later the data will be checked against the table constraints and the bad data won't be indexed.
What is conformed fact?
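A conformed fact is a measure that has the same definition and units in every fact table and data mart in which it appears, so its values can be compared across them. Conformed dimensions, by contrast, are the dimensions which can be used across multiple Data Marts in combination with multiple fact
tables accordingly.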
What are Data Marts?
Data Marts are designed to help managers make strategic decisions about their business. Data Marts are subsets of the corporate-wide data that are of value to a specific group of users.
There are two types of Data Marts:
1. Independent data marts are sourced from data captured from OLTP systems, external providers or from data generated locally within a particular department or geographic area.
2. Dependent data marts are sourced directly from enterprise data warehouses.
What is a level of Granularity of a fact table?
Level of granularity means the level of detail that you put into the fact table in a data warehouse. For example, based on
the design you can decide to store the sales data for each transaction. The level of granularity then determines what detail you are willing to store for each transactional fact: product sales for each individual transaction, or sales aggregated up to the
minute with only that aggregated data stored.
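The effect of choosing a coarser grain can be shown with a short Python sketch. The transactions, the `roll_up` helper and the timestamp format are assumptions made for the example.

```python
from collections import defaultdict

# Transaction-grain facts: (timestamp, product, amount). Values are made up.
transactions = [
    ("2019-08-02 09:00:15", "widget", 4.0),
    ("2019-08-02 09:00:40", "widget", 6.0),
    ("2019-08-02 09:01:05", "gadget", 3.0),
    ("2019-08-02 09:01:30", "widget", 2.0),
]


def roll_up(rows, prefix_len):
    """Aggregate to a coarser grain by truncating the timestamp to prefix_len characters."""
    totals = defaultdict(float)
    for timestamp, product, amount in rows:
        totals[(timestamp[:prefix_len], product)] += amount
    return dict(totals)


per_minute = roll_up(transactions, 16)  # grain: 'YYYY-MM-DD HH:MM' per product
per_day = roll_up(transactions, 10)     # grain: 'YYYY-MM-DD' per product
```

Four transaction rows become three rows at minute grain and two at day grain. Storage shrinks as the grain gets coarser, but a question such as "how many sales happened at 09:00?" can no longer be answered from the day-level table.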
How are the Dimension tables designed?
Most dimension tables are designed using normalization principles up to 2NF. In some instances they are further
normalized to 3NF.
Find where data for this dimension are located.
Figure out how to extract this data.
Determine how to maintain changes to this dimension (see more on this in the next section).
What are non-additive facts?
Non-Additive: Non-additive facts are facts that cannot be summed up for any of the dimensions present in the fact table.
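A ratio such as a margin percentage is a typical non-additive fact. The sketch below uses invented store figures to show why: summing the percentages gives a meaningless number, so the overall value has to be recomputed from the additive facts (revenue and cost).

```python
# Hypothetical fact rows: revenue and cost are additive, margin_pct is not.
rows = [
    {"store": "A", "revenue": 100.0, "cost": 50.0},
    {"store": "B", "revenue": 300.0, "cost": 270.0},
]


def margin_pct(revenue, cost):
    """Margin as a percentage of revenue."""
    return 100.0 * (revenue - cost) / revenue


# Wrong: adding up a non-additive fact across stores.
summed_pct = sum(margin_pct(r["revenue"], r["cost"]) for r in rows)

# Right: add the additive facts first, then derive the ratio.
total_revenue = sum(r["revenue"] for r in rows)
total_cost = sum(r["cost"] for r in rows)
overall_pct = margin_pct(total_revenue, total_cost)

print("sum of per-store margins:", summed_pct)
print("margin computed from totals:", overall_pct)
```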
What type of Indexing mechanism do we need to use for a typical datawarehouse?
On the fact table it is best to use bitmap indexes. Dimension tables can use bitmap and/or other types of indexes: clustered/non-clustered, unique/non-unique.
To my knowledge, SQL Server does not support bitmap indexes; Oracle is the best-known database that does.
What is a Snowflake Schema?
In a snowflake schema, each dimension has a primary dimension table, to which one or more additional dimension tables can join.
The primary dimension table is the only table that can join to the fact table.
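A small sketch of this layout, using Python's built-in sqlite3 module. All table and column names here (fact_sales, dim_product, dim_category) are hypothetical. The fact table joins only to the primary dimension table dim_product, and dim_category is reached through dim_product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Additional dimension table: joins only to dim_product.
CREATE TABLE dim_category (
    category_id INTEGER PRIMARY KEY,
    category_name TEXT
);
-- Primary dimension table: the only one the fact table joins to.
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY,
    product_name TEXT,
    category_id INTEGER REFERENCES dim_category(category_id)
);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    amount REAL
);
INSERT INTO dim_category VALUES (1, 'Stationery'), (2, 'Snacks');
INSERT INTO dim_product VALUES (10, 'Pen', 1), (11, 'Notebook', 1), (20, 'Chips', 2);
INSERT INTO fact_sales VALUES (10, 2.0), (11, 5.0), (20, 3.0), (10, 2.0);
""")

# Sales by category: fact -> primary dimension -> additional dimension.
rows = conn.execute("""
    SELECT c.category_name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_category AS c ON c.category_id = p.category_id
    GROUP BY c.category_name
    ORDER BY c.category_name
""").fetchall()
print(rows)
```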
What is data warehouse?
A data warehouse is an electronic store of an organization's historical data for the purpose of analysis and
reporting. According to Bill Inmon's classic definition, a data warehouse should be subject-oriented, non-volatile, integrated and time-variant.
Explanatory Note
Non-volatile means that data, once loaded into the warehouse, will not be deleted later. Time-variant
means that the data is tracked with respect to time, so history is kept as values change.
What are the benefits of a data warehouse?
The historical data stored in a data warehouse helps to analyze different aspects of the business, including performance
analysis, trend analysis, trend prediction etc., which ultimately increases the efficiency of business processes.
Why is a data warehouse used?
A data warehouse facilitates reporting on the measures of key business processes, known as KPIs (key performance indicators). A data warehouse can be
further used for data mining, which helps with trend prediction, forecasts, pattern recognition etc.
What is the difference between OLTP and OLAP?
OLTP is the transaction system that collects business data. Whereas OLAP is the reporting and analysis system on
that data.
OLTP systems are optimized for INSERT, UPDATE operations and therefore highly normalized. On the other hand,
OLAP systems are deliberately denormalized for fast data retrieval through SELECT operations.
Explanatory Note:
In a department store, when we pay at the check-out counter, the salesperson at the
counter keys all the data into a "Point-Of-Sales" machine. That data is transaction data and the
related system is an OLTP system. On the other hand, the manager of the store might want to view a
report on out-of-stock materials, so that he can place purchase orders for them. Such a report will come out of an OLAP system.
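The store example can be mimicked with Python's built-in sqlite3 module. This is a toy sketch: the table names and the record_sale helper are assumptions, and both workloads hit the same in-memory database here, whereas in practice the report would normally run against a separate warehouse or OLAP store.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stock (item TEXT PRIMARY KEY, quantity INTEGER NOT NULL);
CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, item TEXT, units INTEGER);
INSERT INTO stock VALUES ('soap', 2), ('rice', 5);
""")


def record_sale(conn, item, units):
    """OLTP-style work: one short transaction of small INSERT/UPDATE statements."""
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO sales (item, units) VALUES (?, ?)", (item, units))
        conn.execute(
            "UPDATE stock SET quantity = quantity - ? WHERE item = ?", (units, item)
        )


# The check-out counter keys in individual sales.
record_sale(conn, "soap", 1)
record_sale(conn, "soap", 1)
record_sale(conn, "rice", 2)

# Reporting-style work: a read-only SELECT for the manager's out-of-stock report.
out_of_stock = [
    row[0]
    for row in conn.execute(
        "SELECT item FROM stock WHERE quantity <= 0 ORDER BY item"
    )
]
print(out_of_stock)  # ['soap']
```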
What is data mart?
Data marts are generally designed for a single subject area. An organization may have data pertaining to different
departments like Finance, HR, Marketing etc. stored in the data warehouse, and each department may have a separate
data mart. These data marts can be built on top of the data warehouse.
What is ER model?
ER model is entity-relationship model which is designed with a goal of normalizing the data.
Q1. What is SQL Server Reporting Services (SSRS)?
SQL Server Reporting Services is a server-based reporting platform that you can use to create and manage tabular, matrix, graphical, and free-form reports that contain data from relational and multidimensional data sources. The reports that you create can be viewed and managed over a World Wide Web-based connection.
Q2. Architecture of SSRS:
-Admin
Q3. What are the three stages of the Enterprise Reporting Life Cycle?
a. Authoring
b. Management
c. Access and Delivery
Q4. What are the components included in SSRS?
1. A complete set of tools that can be used to create, manage and view reports.
2. A Report Server component that hosts and processes reports in a variety of formats. Output formats include HTML, PDF, TIFF, Excel, CSV, and more.
3. An API that allows developers to integrate or extend data and report processing in custom applications, or create custom tools to build and manage reports.
Q5. What is the benefit of using embedded code in a report?
1. Reusability of code: a function created in embedded code to perform some logic can then be used in multiple expressions.
2. Centralized code: helps in better manageability of code.
Q6. Which programming language can be used to code embedded functions in SSRS?
Visual Basic .NET.
Q7. Important terms used in the reporting services?
1. Report definition: The blueprint for a report before the report is processed or rendered. A report definition contains information about the query and layout for the report.
2. Report snapshot: A report that contains data captured at a specific point in time. A report snapshot is actually a report definition that contains a dataset instead of query instructions.
3. Rendered report: A fully processed report that contains both data and layout information, in a format suitable for viewing (such as HTML).
4. Parameterized report: A published report that accepts input values through parameters.
5. Shared data source: A predefined, standalone item that contains data source connection information.
6. Shared schedule: A predefined, standalone item that contains schedule information.
7. Report-specific data source: Data source information that is defined within a report definition.
8. Report model: A semantic description of business data, used for ad hoc reports created in Report Builder.
9. Linked report: A report that derives its definition through a link to another report.
10. Report server administrator: This term is used in the documentation to describe a user with elevated privileges who can access all settings and content of a report server. If you are using the default roles, a report server administrator is typically a user who is assigned to both the Content Manager role and the System Administrator role. Local administrators can have elevated permissions even if role assignments are not defined for them.
11. Folder hierarchy: A bounded namespace that uniquely identifies all reports, folders, report models, shared data source items, and resources that are stored in and managed by a report server.
12. Report Server: Describes the Report Server component, which provides data and report processing, and report delivery. The Report Server component includes several subcomponents that perform specific functions.
13. Report Manager: Describes the Web application tool used to access and manage the contents of a report server database.
14. Report Builder: Report authoring tool used to create ad hoc reports.
15. Report Designer: Report creation tool included with Reporting Services.
16. Model Designer: Report model creation tool used to build models for ad hoc reporting.
17. Report Server Command Prompt Utilities: Command line utilities that you can use to administer a report server: a) RsConfig.exe, b) RsKeymgmt.exe, c) Rs.exe
Q8. What are the command line utilities available in Reporting Services?
Rsconfig utility (Rsconfig.exe): encrypts and stores connection and account values in the RSReportServer.config file. Encrypted values include report server database connection information and account values used for unattended report processing.
RsKeymgmt utility: extracts, restores, creates, and deletes the symmetric key used to protect sensitive report server data against unauthorized access.
RS utility: mainly used to automate report server deployment and administration tasks. It processes a script that you provide in an input file.
Q. How to know the report execution history?
The ExecutionLog table in the ReportServer database stores all the logs from the last two months.
SELECT * FROM ReportServer.dbo.ExecutionLog
-Development
Q. What is the difference between a Tabular and a Matrix report? OR What are the different styles of reports?
Tabular report: A tabular report is the most basic type of report. Each column corresponds to a column selected from the database.
Matrix report: A matrix (cross-product) report is a cross-tabulation of four groups of data:
a. One group of data is displayed across the page.
b. One group of data is displayed down the page.
c. One group of data is the cross-product, which determines all possible locations where the across and down data relate and places a cell in those locations.
d. One group of data is displayed as the "filler" of the cells.
A matrix report can be thought of as a pivot table.
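The pivot-table idea can be sketched in plain Python. This is an illustration with made-up sales rows, not SSRS code: the region is the "down" group, the quarter is the "across" group, and the summed amount is the cell filler.

```python
from collections import defaultdict

# Hypothetical tabular rows: (region, quarter, amount).
tabular_rows = [
    ("North", "Q1", 100),
    ("North", "Q2", 120),
    ("South", "Q1", 80),
    ("South", "Q2", 90),
    ("North", "Q1", 50),
]


def to_matrix(rows):
    """Cross-tabulate rows into (down labels, across labels, grid of cell totals)."""
    cells = defaultdict(int)
    for down, across, value in rows:
        cells[(down, across)] += value
    down_labels = sorted({down for down, _, _ in rows})
    across_labels = sorted({across for _, across, _ in rows})
    grid = [
        [cells.get((down, across), 0) for across in across_labels]
        for down in down_labels
    ]
    return down_labels, across_labels, grid


down_labels, across_labels, grid = to_matrix(tabular_rows)
print("      ", across_labels)
for label, row in zip(down_labels, grid):
    print(label, row)
```

The five tabular rows collapse into a 2 x 2 grid, with the two North/Q1 rows summed into one cell.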
Q. How to create Drill-through reports?
Use the Navigation property of a cell and set the child report and its parameters in it.
Q. How to create Drill-Down reports?
To cut a long story short:
- Group the data on the required fields.
- Then toggle visibility based on the grouped field.
Q1 Explain architecture of SSIS?
SSIS architecture consists of four key parts:
a) Integration Services service: monitors running Integration Services packages and manages the storage of packages.
b) Integration Services object model: includes the managed API for accessing Integration Services tools, command-line utilities, and custom applications.
c) Integration Services runtime and run-time executables: the runtime saves the layout of packages, runs packages, and provides support for logging, breakpoints, configuration, connections, and transactions. The Integration Services run-time executables are the package, containers, tasks, and event handlers that Integration Services includes, and custom tasks.
d) Data flow engine: provides the in-memory buffers that move data from source to destination.
Q2 How would you do Logging in SSIS?
Logging configuration is an inbuilt feature that can log the details of various events, such as OnError and OnWarning, to several destinations: a flat file, a SQL Server table, an XML file, or SQL Server Profiler.
Q3 How would you do Error Handling?
An SSIS package can mainly have two types of errors:
a) Procedure error: can be handled in the control flow through precedence constraints, redirecting the execution flow.
b) Data error: is handled in the Data Flow task by redirecting the data flow using the Error Output of a component.
Q4 How to pass a property value at run time? How do you implement Package Configuration?
A property value, such as the connection string for a Connection Manager, can be passed to the package using package configurations. Package Configuration provides different options: XML file, environment variables, SQL Server table, registry value, or parent package variable.
Q5 How would you deploy an SSIS package on production?
A) Through a manifest:
1. Create the deployment utility by setting its property to true.
2. It will be created in the bin folder of the solution as soon as the package is built.
3. Copy all the files in the utility and use the manifest file to deploy it on Prod.
B) Using a command-line utility: dtutil.exe copies a package to a server, and dtexec.exe runs it.
C) Import the package directly into MSDB from SSMS by logging in to Integration Services.
Q6 Difference between DTS and SSIS?
Everything, except that both are Microsoft products :-).
Q7 What are the new features in SSIS 2008?
Explained in another post: http://sqlserversolutions.blogspot.com/2009/01/new-improvementfeatures-in-ssis-2008.html
Q8 How would you pass a variable value to a Child Package?
Too big to fit here, so I had to write another post: http://sqlserversolutions.blogspot.com/2009/02/passing-variable-to-child-package-from.html
Q9 What is an Execution Tree?
Execution trees demonstrate how a package uses buffers and threads. At run time, the data flow engine breaks down Data Flow task operations into execution trees. These execution trees specify how buffers and threads are allocated in the package. Each tree creates a new buffer and may execute on a different thread. When a new buffer is created, such as when a partially blocking or blocking transformation is added to the pipeline, additional memory is required to handle the data transformation, and each new tree may also give you an additional worker thread.
Q10 What are the points to keep in mind for performance improvement of the package?
http://technet.microsoft.com/en-us/library/cc966529.aspx
Q11 You may get a question stating a scenario and then asking how you would create a
package for it, e.g. how would you configure a data flow task so that it can transfer data to different tables based on the city name in a source table column?
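In SSIS this kind of row routing is commonly handled with a Conditional Split transformation inside the data flow task. The Python sketch below only mimics that routing logic. The city names, destination table names, and the split_by_city helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical source rows with a city column.
source_rows = [
    {"customer": "Asha", "city": "Pune"},
    {"customer": "Ben", "city": "Delhi"},
    {"customer": "Chen", "city": "Pune"},
    {"customer": "Dee", "city": "Oslo"},
]

# Hypothetical mapping from city name to destination table.
routes = {"Pune": "customers_pune", "Delhi": "customers_delhi"}
DEFAULT_OUTPUT = "customers_other"  # plays the role of a default output


def split_by_city(rows, routes, default):
    """Send each row to the destination chosen by its city value."""
    outputs = defaultdict(list)
    for row in rows:
        outputs[routes.get(row["city"], default)].append(row)
    return dict(outputs)


outputs = split_by_city(source_rows, routes, DEFAULT_OUTPUT)
for table, rows in sorted(outputs.items()):
    print(table, [row["customer"] for row in rows])
```

Rows whose city matches no route fall through to the default output instead of being dropped.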
Q13 Difference between Union All and Merge?
a) The Merge transformation accepts only two inputs, whereas Union All can take more than two.
b) Data has to be sorted before the Merge transformation, whereas Union All has no such requirement.
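The two behaviours can be imitated with Python's standard library. This is an analogy, not SSIS code: itertools.chain appends any number of inputs in any order, while heapq.merge interleaves inputs that are already sorted and keeps the output sorted. The function names union_all and merge_sorted are made up for the sketch.

```python
import heapq
from itertools import chain


def union_all(*inputs):
    """Append any number of inputs one after another; no ordering is required."""
    return list(chain(*inputs))


def merge_sorted(left, right):
    """Combine exactly two inputs that are already sorted into one sorted output."""
    return list(heapq.merge(left, right))


a = [1, 4, 7]
b = [2, 3, 9]
c = [5, 0]  # unsorted: acceptable for union_all, not for merge_sorted

print(union_all(a, b, c))  # [1, 4, 7, 2, 3, 9, 5, 0]
print(merge_sorted(a, b))  # [1, 2, 3, 4, 7, 9]
```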
Q14 You may get a question about what a given transformation X does.
Lookup, fuzzy lookup, fuzzy