csci12 report aug18

GROUP 5 Master Jenniferson Napallatan Neri Openiano Ostil

Upload: karenostil

Post on 14-Jul-2015


Category:

Technology



TRANSCRIPT


TOPICS

• Index

• Indexed Sequential File

• Properties of Indexed Sequential File

• Data Warehousing

Index

• Indexes provide fast searching of a table based on one or more key columns. Indexes on foreign keys can also greatly improve the performance of joins.
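As a rough illustration of an index on a foreign-key column, the sketch below uses Python's built-in sqlite3 module. The table names, column names, index name, and sample rows are all made up for this example; the point is only that the query planner reports using the index for a lookup on the key column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, "
    "customer_id INTEGER REFERENCES customers(id), "
    "total REAL)"
)
# Index on the foreign-key column used for lookups and joins (hypothetical name).
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ana"), (2, "Ben")])
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(1, 10.0), (2, 25.5), (1, 4.5)],
)

def query_plan(sql, params=()):
    """Return the human-readable steps SQLite plans to take for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # last column holds the description

lookup = "SELECT total FROM orders WHERE customer_id = ?"
plan = query_plan(lookup, (1,))
totals = sorted(row[0] for row in conn.execute(lookup, (1,)))
```

Printing `plan` should show a search step that names `idx_orders_customer` rather than a full scan of `orders`; the exact wording of the plan text varies between SQLite versions.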

Indexed Sequential File

A file combining properties of random-access files and sequential files

• Records in indexed sequential files are stored in the order that they are written to the disk.

• Records may be retrieved in sequential order, or in random order using a numeric index that represents the record number in the file.
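The two access modes can be sketched with fixed-length records, where a record number maps directly to a byte offset. This is a minimal sketch, not a real ISAM implementation: the record layout (a 4-byte id plus a 16-byte name) and the helper names are assumptions made for the example, and an in-memory buffer stands in for a file on disk.

```python
import io
import struct

# Hypothetical fixed-length record layout: 4-byte integer id + 16-byte name.
RECORD = struct.Struct("<i16s")

def write_record(f, rec_id, name):
    """Append one record at the end of the file (storage order = write order)."""
    f.seek(0, io.SEEK_END)
    f.write(RECORD.pack(rec_id, name.encode("ascii")))

def decode(chunk):
    """Turn one packed record back into an (id, name) tuple."""
    rec_id, raw = RECORD.unpack(chunk)
    return rec_id, raw.rstrip(b"\x00").decode("ascii")

def read_record(f, number):
    """Random access: jump straight to record `number` (0-based)."""
    f.seek(number * RECORD.size)
    return decode(f.read(RECORD.size))

def read_all(f):
    """Sequential access: read every record from the beginning."""
    f.seek(0)
    records = []
    while True:
        chunk = f.read(RECORD.size)
        if len(chunk) < RECORD.size:
            break
        records.append(decode(chunk))
    return records

f = io.BytesIO()  # stands in for a file on disk
for rec_id, name in [(30, "carol"), (10, "alice"), (20, "bob")]:
    write_record(f, rec_id, name)
```

Here `read_record(f, 1)` seeks directly to the second record, while `read_all(f)` returns the records in the order they were written.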

Properties

1. Primary Storage Area: Records in indexed sequential files are stored in the order that they are written to the disk. Records may be retrieved in sequential order or in random order using a numeric index to represent the record number in the file.

Properties

Records are stored sequentially, originally to speed access on a tape system. In an ISAM file the application chooses which index to use; in contrast, a relational database uses a query optimizer which automatically selects indexes. The record size, specified when the file is created, may range from 1 to 8000 bytes.

Properties

2. Separate Indexes: The Indexed Access method of reading or writing data only provides the desired outcome if in fact the file is organized as an ISAM file with the appropriate, previously defined keys. Access to data via the previously defined key(s) is extremely fast.

Properties

Multiple keys, overlapping keys and key compression within the hash tables are supported. A utility to define/redefine keys in existing files is provided. Records can be deleted, although "garbage collection" is done via a separate utility.

Properties

3. Overflow Area: When an ISAM file is created, index nodes are fixed, and their pointers do not change during inserts and deletes that occur later (only the content of leaf nodes changes afterwards).

Properties

If inserts to a leaf node exceed the node's capacity, new records are stored in overflow chains. If there are more inserts than deletions from a table, these overflow chains can gradually become very large, and this affects the time required for retrieval of a record.
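The effect of overflow chains on retrieval time can be seen in a toy model. The class below is only a sketch under simplifying assumptions: keys are plain integers, each primary page holds a fixed number of keys, and the names `IsamFile`, `page_capacity`, and `search` are invented for this example rather than taken from any real ISAM product.

```python
import bisect

class IsamFile:
    """Toy ISAM structure: a fixed index over primary pages plus overflow chains."""

    def __init__(self, sorted_keys, page_capacity=2):
        self.capacity = page_capacity
        # Primary pages are built once from the initial sorted keys.
        self.pages = [sorted_keys[i:i + page_capacity]
                      for i in range(0, len(sorted_keys), page_capacity)]
        # Static index: lowest key of each page; never changed after creation.
        self.index = [page[0] for page in self.pages]
        self.overflow = [[] for _ in self.pages]

    def _page_for(self, key):
        # Last page whose lowest key is <= key (page 0 for smaller keys).
        return max(bisect.bisect_right(self.index, key) - 1, 0)

    def insert(self, key):
        p = self._page_for(key)
        if len(self.pages[p]) < self.capacity:
            bisect.insort(self.pages[p], key)
        else:
            self.overflow[p].append(key)  # page is full: the chain grows

    def search(self, key):
        """Return (found, number of overflow entries examined)."""
        p = self._page_for(key)
        if key in self.pages[p]:
            return True, 0
        for steps, candidate in enumerate(self.overflow[p], start=1):
            if candidate == key:
                return True, steps
        return False, len(self.overflow[p])

isam = IsamFile([10, 20, 30, 40])
for key in (21, 22, 23):
    isam.insert(key)
```

After the three inserts the index is still `[10, 30]`, but finding key 23 means walking the whole overflow chain of the first page, which is why long chains slow down retrieval.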

Properties

Indexed sequential files are commonly used for transaction files because they take less disk space than keyed files, and are faster to read from beginning to end than a keyed file.

Data Warehousing

What is a Data Warehouse?

A data warehouse (DW) is a subject-oriented, integrated, time-variant, and nonvolatile collection of data intended to support management decision making.

Data Warehousing

DATABASE vs DATA WAREHOUSE

Database: transactional (relational, object-oriented, network, hierarchical)

Data Warehouse: mainly INTENDED for decision support applications

**optimized for retrieval not routine transactional processing**

Data Warehousing

What is Data Warehousing?

Combining data from multiple and usually varied sources into one comprehensive and easily manipulated database. (wiseGEEK.com)

Data Warehousing

Properties:

1. Organized around major subject areas of an organization (e.g. sales, suppliers, products)

2. Integrated from multiple operational OLTP data sources (OLTP = OnLine Transaction Processing database)

Data Warehousing

Properties:

3. Periodic updates (based on schedules)

There is a trend toward more frequent updates that support near real-time reporting of business analytics.
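The properties above can be pictured with a very small sketch: two operational sources that describe sales with different field names are mapped into one common sales record during a scheduled load. Everything here is hypothetical (the source lists, field names, the `load_batch` helper, and the load date); a real warehouse load would involve far more cleaning and validation.

```python
from datetime import date

# Two hypothetical OLTP sources with different field names for the same facts.
store_orders = [{"sku": "A1", "qty": 2, "unit_price": 5.0}]
web_orders = [
    {"product_code": "A1", "quantity": 1, "price": 5.0},
    {"product_code": "B2", "quantity": 3, "price": 2.0},
]

def load_batch(warehouse, load_date):
    """Scheduled load: map each source to one common sales schema and append."""
    for row in store_orders:
        warehouse.append({
            "product": row["sku"],
            "amount": row["qty"] * row["unit_price"],
            "source": "store",
            "load_date": load_date,  # keeps the data time-variant
        })
    for row in web_orders:
        warehouse.append({
            "product": row["product_code"],
            "amount": row["quantity"] * row["price"],
            "source": "web",
            "load_date": load_date,
        })

def sales_by_product(warehouse):
    """Decision-support style query: total sales amount per product."""
    totals = {}
    for row in warehouse:
        totals[row["product"]] = totals.get(row["product"], 0.0) + row["amount"]
    return totals

warehouse = []  # rows are only appended, never updated in place
load_batch(warehouse, date(2015, 1, 1))  # hypothetical load date
```

Calling `sales_by_product(warehouse)` then answers a subject-area question (sales per product) without touching either operational source.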

Data Warehousing

Advantages:

1. Competitive advantage

2. Increased productivity of corporate decision makers

3. Potential high return on investment as the organization finds the best way to improve efficiency and/or profitability

Data Warehousing

Encountered Problems:

1. Underestimation of resources required to load the data

2. Hidden data integrity problems in source data

3. Omitting data later found to be required

Data Warehousing

Encountered Problems:

4. Ever increasing end user demands

5. Consolidating data from disparate data sources

6. High resource demands (huge amount of storage; queries that process millions of rows)

7. Ownership of data

Data Warehousing

Encountered Problems:

8. Difficulty in determining what the business really wants or needs to analyze

9. “Big Bang” projects that seem never-ending