meljun cortes computer information processing chapter 9 with_notes

CCT101: Chapter 9 Files

Upload: meljun-cortes

Post on 02-Jun-2015



Page 1: MELJUN CORTES Computer information processing chapter 9 with_notes

CCT101: Chapter 9

Files

OBJECTIVES

• Describe the types of data processing files

• Describe the types of file organization

• Data validation

FILE, RECORD & FIELD

- Field
• Data item
• e.g. student name

- Record
• A group of related data items or fields
• e.g. student record

- File
• A collection of related records
• e.g. student file

ENTITY SET, ENTITY & ATTRIBUTES

- Attributes
• Describe the properties of the entity (i.e. fields)

- Entity
• A thing about which we store facts (i.e. a record)

- Entity set
• A collection of logically related entities (i.e. a file)

Logical Files & Physical Files

1. Physical file:

– Refers to how the data is stored, i.e. the actual arrangement of data on the storage device

2. Logical file:

– What a file contains & how the data should be processed

Key Field

• A field within the record which is used for locating & processing the record

• e.g. student number

FILE LENGTH

• Fixed-length records
– Each record has the same length
– Advantage: Easier to design
– Disadvantage: Wasted storage space

• Variable-length records
– Records do not all have the same length
– Advantage: Saves storage space
– Disadvantage: More difficult to design
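The trade-off between the two record formats can be sketched in Python. This is only an illustration: the record layout (a 4-byte student number plus a name, padded to 20 bytes in the fixed format and length-prefixed in the variable format) and all function names are assumptions, not part of the course material.

```python
import struct

# Hypothetical fixed layout: 4-byte student number + 20-byte name field.
FIXED = struct.Struct("<I20s")

def pack_fixed(number, name):
    # Short names are padded with spaces, which is the wasted storage space.
    return FIXED.pack(number, name.encode("ascii").ljust(20))

def unpack_fixed(buf, index):
    # Record i always starts at i * record size, so it is easy to locate.
    number, name = FIXED.unpack_from(buf, index * FIXED.size)
    return number, name.decode("ascii").rstrip()

def pack_variable(number, name):
    data = name.encode("ascii")
    # 4-byte number, 1-byte name length, then only the bytes actually needed.
    return struct.pack("<IB", number, len(data)) + data

def unpack_variable(buf):
    # Records must be walked one after another; there is no direct offset.
    records, pos = [], 0
    while pos < len(buf):
        number, length = struct.unpack_from("<IB", buf, pos)
        pos += 5
        records.append((number, buf[pos:pos + length].decode("ascii")))
        pos += length
    return records

students = [(1001, "Ann"), (1002, "Bartholomew"), (1003, "Cy")]
fixed_file = b"".join(pack_fixed(n, s) for n, s in students)
variable_file = b"".join(pack_variable(n, s) for n, s in students)
print(len(fixed_file), len(variable_file))
```

The fixed-length file is larger but any record can be read by position; the variable-length file is smaller but has to be scanned from the start.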

INFORMATION RETRIEVAL

1. Writing:

– The act of transferring a record from main memory to secondary storage.

2. Insertion:

– Adding a new record to an existing file.

3. Deleting:

– Removing a record from a file.

INFORMATION RETRIEVAL

4. Updating:

– Making changes to the contents of a record to show the new status of information.

5. Sorting:

– Rearranging the records in a file for the purpose of producing ordered reports.

6. Merging:

– Combining 2 or more files to produce a single output file.
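Sorting and merging can be illustrated with a short Python sketch. The two small in-memory "files" and their (key, name) record layout are made up for the example; real files would be read from secondary storage.

```python
import heapq
from operator import itemgetter

by_key = itemgetter(0)  # records are (key, name) tuples in this sketch

file_a = [(1003, "Cy"), (1001, "Ann")]
file_b = [(1004, "Dee"), (1002, "Bob")]

# Sorting: rearrange each file into key order.
sorted_a = sorted(file_a, key=by_key)
sorted_b = sorted(file_b, key=by_key)

# Merging: combine two sorted files into a single sorted output file.
merged = list(heapq.merge(sorted_a, sorted_b, key=by_key))
print(merged)
```

`heapq.merge` assumes its inputs are already sorted, which is why the sort step comes first.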

INFORMATION RETRIEVAL

7. Matching:

– Where 2 or more input files are compared record against record to ensure there is a complete set of records for each key. Mismatched records are highlighted for action.

8. Searching:

– Looking for a record with a certain key value.

9. Appending:

– Adding a record at the last available space of an existing file.
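A minimal Python sketch of matching, searching and appending follows. The order and invoice files, their record layout and the function names are hypothetical; matching is simplified here to reporting keys that appear in only one of the two files.

```python
def match(file_a, file_b):
    """Return the keys that appear in only one of the two files."""
    keys_a = {key for key, _ in file_a}
    keys_b = {key for key, _ in file_b}
    return sorted(keys_a ^ keys_b)

def search(file, wanted_key):
    """Return the first record with the given key, or None if absent."""
    for record in file:
        if record[0] == wanted_key:
            return record
    return None

def append(file, record):
    """Add a record after the last record of an existing file."""
    file.append(record)

orders = [(1, "order A"), (2, "order B"), (3, "order C")]
invoices = [(1, "invoice A"), (3, "invoice C")]

mismatched = match(orders, invoices)   # keys needing action
found = search(orders, 2)
append(invoices, (2, "invoice B"))
```

After the append, both files hold a record for every key, so a second match reports nothing.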

ACTIVITY RATIO (HIT RATE)

• The number of records that are changed as a result of updating, compared to the total number of records in the file.

– HIT RATE = (number of records affected) / (total records on file)

• Volatility:
– Measuring the number of additions and deletions in a file.

• File growth:
– Number of record additions - number of record deletions
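As a worked example, the two formulas can be written as small Python functions. The figures used (150 records updated out of 1,000; 40 additions and 25 deletions) are invented for illustration.

```python
def hit_rate(records_affected, total_records):
    """Activity ratio: share of the file's records changed by an update run."""
    return records_affected / total_records

def file_growth(additions, deletions):
    """Net change in the number of records on file."""
    return additions - deletions

print(hit_rate(150, 1000))   # 0.15, i.e. 15% of the file was affected
print(file_growth(40, 25))   # 15 more records than before
```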

TYPES OF DP FILES

1. Master file

– Permanent or semi-permanent data

– Used for reference and updating

– Shows the current status of data

– Never empty except at its time of creation

– E.g. stock master file

TYPES OF DP FILES

2. Transaction file

– Contains source or transaction data

– Used for updating the master file

– E.g. sales transaction file

3. Work file

– Temporary file

– Used for storing intermediate data for further processing

– E.g. file used by a sort utility

TYPES OF DP FILES

4. Transition file

– Temporary file for specific use

– E.g. meter readings, customer details for printout

5. Security & backup file

– Extra copy of a file kept against damage or loss

6. Audit file

– Enables the auditor to check the correct functioning of computer-based procedures

– Keeps a copy of all transactions

FILE ORGANISATIONS

• 4 Types

1. Serial

2. Sequential

3. Indexed-sequential

4. Random

SERIAL ORGANISATION

• Simplest organisation; records are not in any order

• Each record is placed in the next available space

• Suitable for
– Unsorted transaction files
– Print files
– Dump files
– Temporary data files

• Records are accessed in the order in which they were placed

SERIAL ORGANISATION

• Advantages :

– File design is simple

– Efficient for high-activity files

– Makes effective use of low-cost file media; suitable for batch processing

• Disadvantage :

– Files have to be processed from beginning to end

SEQUENTIAL ORGANISATION

• Records are held in a predefined order

• A designated field within the record is selected as the basis for ordering records

• This field is known as the record key, or simply the key

• Suitable for master files

• Not suitable for fast-response online enquiry systems

• E.g. payroll transaction file

SEQUENTIAL ORGANISATION

• Advantages :

– File design is simple

– Efficient for high-activity files

– Makes effective use of low-cost file media; suitable for batched transactions

• Disadvantage :

– Entire file must be processed even if activity is low

– Transactions require sorting
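Both disadvantages show up in a sketch of a sequential master file update. The Python below is an illustration under assumptions: the stock master and sales transaction data, the (key, quantity) layout and the function name are hypothetical, and real systems would stream records from tape or disk rather than hold them in lists.

```python
def sequential_update(master, transactions):
    """Apply transactions to a key-ordered master file in one pass."""
    new_master, unmatched = [], []
    pending = iter(sorted(transactions))      # transactions require sorting
    txn = next(pending, None)
    for key, quantity in master:              # every master record is read
        # Transactions with a lower key have no master record to update.
        while txn is not None and txn[0] < key:
            unmatched.append(txn)
            txn = next(pending, None)
        # Apply all transactions that carry the current master key.
        while txn is not None and txn[0] == key:
            quantity += txn[1]
            txn = next(pending, None)
        new_master.append((key, quantity))
    # Anything left over has a key higher than the last master record.
    while txn is not None:
        unmatched.append(txn)
        txn = next(pending, None)
    return new_master, unmatched

stock_master = [(101, 50), (102, 20), (103, 0)]
sales = [(103, 5), (101, -10), (101, -5), (999, 1)]
new_master, unmatched = sequential_update(stock_master, sales)
```

Record 102 is copied to the new master even though no transaction touches it, which is the cost of sequential processing when the hit rate is low.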

INDEXED SEQUENTIAL ORGANISATION

• Records are stored in physical sequence by primary key

• An index is built separately from the data records

• Can be accessed randomly and sequentially

• 3 main parts
– Prime (home) area
– Overflow area
– Index area

INDEXED SEQUENTIAL ORGANISATION

• When there is insufficient space in the home (prime) area, the overflow area is used

• Overflow areas are created at cylinder & track level

• Access is controlled by means of pointers

• File reorganisation has to be done

• During reorganisation, overflow records are recovered & indexes rebuilt

INDEXED-SEQUENTIAL FILES

- Support three types of processing:

1. Sequential processing

2. Selective sequential processing / Random access

3. Block is searched record by record until record is found / Direct access / Dynamic access
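One way to picture access through the index is the Python sketch below. It is a simplified model, not a description of any particular access method: the prime area is a list of small blocks, the index holds the highest key of each block, the overflow area is left out, and all names and data are invented.

```python
from bisect import bisect_left

# Prime area: blocks of records held in key order (hypothetical data).
blocks = [
    [(10, "A"), (20, "B"), (30, "C")],
    [(40, "D"), (50, "E"), (60, "F")],
    [(70, "G"), (80, "H"), (90, "I")],
]

# Index area: the highest key stored in each block.
index = [block[-1][0] for block in blocks]

def find(key):
    """Random access: use the index to pick a block, then scan that block."""
    block_no = bisect_left(index, key)
    if block_no == len(blocks):
        return None                       # key is above every indexed key
    for record in blocks[block_no]:       # searched record by record
        if record[0] == key:
            return record
    return None

def read_all():
    """Sequential processing: read every block in physical order."""
    for block in blocks:
        yield from block
```

`find(50)` consults the index once and reads only the second block, while `read_all()` ignores the index and walks the file in key order.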

RANDOM ORGANISATION

• Predictable relationship between record key & record’s location on disc

• Records are not in physical sequence; they are scattered at random

• Direct addressing

• Key is used as the physical address of the record

• Device dependent

INDEXED-SEQUENTIAL ORGANISATION

• Advantages :

– Transactions may be sorted or unsorted

– Only the affected master records are processed during updating

– Response time is reasonably fast

– Facilitates file enquiry

– Can be processed sequentially and randomly

• Disadvantage :

– Each master file access requires index file access

– Requires direct access storage devices (still costly)

– Storage space required for indexes

RANDOM ORGANIZATION

• Predictable relationship between record key and record location on disc

• Records may be scattered at random

• Direct addressing

RANDOM ORGANIZATION

• Key transformation techniques used

1. Division remainder method

– Divide the key value by an appropriate number

– The remainder of the division is used as the address of the record

– The number used to divide is a prime number

2. Mid-square hashing

– The key is squared and specified digits are extracted from the middle of the result to yield the address of the record
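The first two key transformation techniques can be sketched in Python as follows. The divisor 97 and the choice of two middle digits are arbitrary values picked for the example, as are the function names.

```python
def division_remainder(key, divisor=97):
    """Address = remainder after dividing the key by a prime number."""
    return key % divisor

def mid_square(key, digits=2):
    """Address = the given number of digits taken from the middle of key squared."""
    squared = str(key * key)
    start = (len(squared) - digits) // 2
    return int(squared[start:start + digits])

print(division_remainder(12345))  # 12345 = 97 * 127 + 26, so the address is 26
print(mid_square(1234))           # 1234 squared is 1522756; middle digits "22"
```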

RANDOM ORGANIZATION

3. Hashing By Folding

– Key is divided into 2 or more parts which are then added together

– Truncation to bring result into required range of numbers
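Folding can be sketched the same way. In this illustration the key is split into 3-digit parts and the sum is truncated by keeping its low-order 3 digits; both sizes, and the function name, are assumptions made for the example.

```python
def fold(key, part_size=3, address_digits=3):
    """Split the key into parts, add them, and truncate to the address range."""
    digits = str(key)
    parts = [int(digits[i:i + part_size])
             for i in range(0, len(digits), part_size)]
    total = sum(parts)
    # Truncation: keep only the low-order digits of the sum.
    return total % 10 ** address_digits

print(fold(123456789))  # 123 + 456 + 789 = 1368, truncated to 368
```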

RANDOM ORGANISATION

• Advantages :

– As indexes are not required, space and searching time are saved

– Insertion and deletion of records can take place

• Disadvantage :

– Variable-length records are difficult to handle

– Gaps in keys can cause wasted space

– Synonyms (different keys that transform to the same address) can occur

– Allocation of efficient overflow areas is difficult
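The synonym problem can be demonstrated with a small Python sketch. It assumes division-remainder addressing with a divisor of 7 and a single shared overflow area searched serially; these choices and all names are hypothetical simplifications.

```python
def store(records, divisor=7):
    """Place each record at key % divisor; synonyms go to an overflow area."""
    home = [None] * divisor
    overflow = []
    for key, value in records:
        address = key % divisor
        if home[address] is None:
            home[address] = (key, value)
        else:
            overflow.append((key, value))   # address already taken by a synonym
    return home, overflow

def fetch(key, home, overflow):
    """Try the home address first, then search the overflow area."""
    record = home[key % len(home)]
    if record is not None and record[0] == key:
        return record
    for record in overflow:
        if record[0] == key:
            return record
    return None

# Keys 15 and 22 both give remainder 1, so they are synonyms.
home, overflow = store([(15, "A"), (22, "B"), (10, "C")])
```

Four of the seven home addresses stay empty here, which also illustrates how gaps waste space.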

DATA VERIFICATION

• Double punching method

• Sight verification

DATA VALIDATION

• Presence

• Size

• Range

• Character check

• Format

• Reasonableness

• Check digits
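Several of these checks can be sketched in Python. Everything specific here is an assumption for illustration: the field names, the 6-character student number, the 0 to 100 mark range, the DD/MM/YYYY date format and the use of a modulus-11 check digit. The reasonableness check is not shown because it depends on the application.

```python
import re

def check_digit_ok(number):
    """Modulus-11 check: the weighted digit sum must be divisible by 11."""
    weights = range(len(number), 0, -1)
    return sum(w * int(d) for w, d in zip(weights, number)) % 11 == 0

def validate(record):
    """Return a list of validation errors for one input record."""
    errors = []
    number = record.get("student_no", "")
    if not number:
        errors.append("presence: student_no is missing")
    elif len(number) != 6:
        errors.append("size: student_no must have 6 characters")
    elif not re.fullmatch(r"[0-9]+", number):
        errors.append("character: student_no must contain digits only")
    elif not check_digit_ok(number):
        errors.append("check digit: student_no fails the modulus-11 test")
    mark = record.get("mark")
    if mark is None or not 0 <= mark <= 100:
        errors.append("range: mark must be between 0 and 100")
    if not re.fullmatch(r"\d{2}/\d{2}/\d{4}", record.get("enrolled", "")):
        errors.append("format: enrolled must look like DD/MM/YYYY")
    return errors

good = {"student_no": "123455", "mark": 72, "enrolled": "02/06/2015"}
bad = {"student_no": "123456", "mark": 120, "enrolled": "2015-06-02"}
print(validate(good))
print(validate(bad))
```

For "123455" the weighted sum is 6 + 10 + 12 + 12 + 10 + 5 = 55, a multiple of 11, so the number passes; changing the last digit to 6 gives 56 and the check fails.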

ERROR RECOVERY

• Adequate program checkpoint/restart facilities

• File dumps

• Generations of backup files