28=16341_sad ppt27
TRANSCRIPT
7/27/2019 28=16341_sad ppt27
http://slidepdf.com/reader/full/2816341sad-ppt27 1/29
Introduction
• Information systems in business are file- and database-oriented. Data are accumulated into files that are processed and maintained by the system.
• Databases draw on data accumulated in transaction and other kinds of files and are designed to share data among different applications.
• The systems analyst is responsible for designing the files, determining their contents, and selecting a method for organizing the data.
• At the same time, if the proposed applications will draw on database resources, the analyst must develop the means of interacting with the database.
Basic File Terminology
• Data Item:
Individual elements of data are called data items. Each data item is identified by name and has a specific value associated with it.
The association of a value with a field creates one instance of the data item.
Data items can comprise subitems or subfields.
E.g., Date is often used as a single data item consisting of three subitems: month, day, and year.
Basic File Terminology
• Record:
The complete set of related data pertaining to an entry is a record.
Each field has a defined length and type.
When the number and size of data items in a record are constant for every record, the record is called a fixed-length record.
Variable-length records are less common in most business applications than fixed-length designs.
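A minimal sketch of a fixed-length record, using Python's standard struct module. The field widths, the sample values, and the names CHECK_RECORD, pack_check, and unpack_check are assumptions chosen for illustration; they do not come from the slides.

```python
import struct

# Hypothetical layout: every field has a fixed width, so every record is 48 bytes.
# 10s = account number, 6s = check number, 8s = date (YYYYMMDD),
# 20s = payee, I = amount in cents (4-byte unsigned integer).
CHECK_RECORD = struct.Struct("<10s6s8s20sI")


def pack_check(account, check_no, date, payee, amount_cents):
    """Encode one check as a fixed-length record; short text fields are NUL-padded."""
    return CHECK_RECORD.pack(
        account.encode("ascii"),
        check_no.encode("ascii"),
        date.encode("ascii"),
        payee.encode("ascii"),
        amount_cents,
    )


def unpack_check(record):
    """Decode a fixed-length record back into its data items."""
    *text_fields, amount_cents = CHECK_RECORD.unpack(record)
    decoded = [field.rstrip(b"\x00").decode("ascii") for field in text_fields]
    return (*decoded, amount_cents)


first = pack_check("1234567890", "000101", "20190727", "A. Jones", 12500)
second = pack_check("1234567890", "000102", "20190728", "Riverside Hardware", 990)

# Both records occupy the same number of bytes despite different payee lengths.
print(len(first), len(second), CHECK_RECORD.size)
print(unpack_check(first))
```

Because each field is padded to its declared width, the two records have the same size even though their payee names differ in length.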
Basic File Terminology
• Record Key:
To distinguish one specific record from another, systems analysts select one data item in the record that is likely to be unique in all records of a file and use it for identification purposes.
This item, called the record key, key attribute, or simply key, is already part of the record, not additional data added to it just for the purpose of identification.
Basic File Terminology
• Entity:
An entity is any person, place, thing, or event of interest to the organization and about which data are captured, stored, or processed.
Patients and tests are entities of interest in hospitals, while banking entities include customers and checks.
Basic File Terminology
• File:
A file is a collection of related records. Each record in a file is included because it pertains to the same entity.
Figure: A file of checks
The number of records in the file determines the file size. If each record is fixed-length and uses 200 characters of storage, a file of six records uses 6 times 200, or 1,200, characters of storage.
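The file-size arithmetic can be sketched as follows. The function names are illustrative, and record_offset is an added assumption showing a consequence of fixed-length records: the start of any record can be computed from its position in the file.

```python
RECORD_LENGTH = 200  # characters per fixed-length record, as in the text
RECORD_COUNT = 6     # number of records in the example file


def file_size(record_count, record_length):
    """Total storage used by a file of fixed-length records."""
    return record_count * record_length


def record_offset(record_number, record_length):
    """Starting character position of the n-th record, counting records from 1."""
    return (record_number - 1) * record_length


print(file_size(RECORD_COUNT, RECORD_LENGTH))  # 1200
print(record_offset(4, RECORD_LENGTH))         # 600
```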
Basic File Terminology
• Databases:
A database is an integrated collection of data stored in different types of records, in a way that makes them accessible to multiple applications.
The interrelation of the records derives from the relationships in the data, not from their physical storage location.
Records for different entities are typically stored in a database (whereas a file stores records for a single entity). In a university database, for example, records for students, courses, and faculty are interrelated in the same database.
Databases do not eliminate the need for files in an information system. Different types of files are still needed to capture the details of events and business activities, to prepare reports, or to store data that are not in the database.
Data Structure Diagrams
Purpose:
Data structure diagrams are graphic tools that show the logical data structure requirements of an information system application.
They serve four purposes:
1. Verify information requirements.
2. Describe data associated with entities.
3. Show the relationship between entities.
4. Communicate the data requirements to a file designer or database administrator.
Each data item either identifies the entity or describes an important attribute. Data structure diagrams organize the data.
Data Structure Diagrams
Notation:
A common notation is used in preparing data structure diagrams.
Entities are represented by rectangles, with the entity name at the top and a list of attributes (data items or fields) describing the entity.
Each entity is identifiable by a key attribute, which by convention is the first data item listed.
Data Structure Diagrams
Use in file design:
The use of data structure diagrams requires the analyst to address important questions about the entity being described:
• What data items will uniquely identify an occurrence of the entity?
• By what means will information about the entity be accessed?
• What other data items describe the attributes of the entity?
Data Structure Diagrams
Entity name: Check
Key: Account number
Data items: Check number, Date, Payee, Amount of transaction
Figure: Data structure diagram for checking example
Data Structure Diagrams
• The figure shows a simple data structure diagram for the checking example introduced earlier.
• As the illustration shows, the record key, which in this case is the account number, uniquely identifies the account. Other details, including check number, date, payee, and amount of the transaction, are attributes.
• Analyzing the use of the checking information through the data structure diagram shows that the actual check number must also be used for identification purposes. Since the account number uniquely identifies the account but does not describe transactions involving it, a combined key of account number and check number must be used to trace individual transactions.
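The combined key can be sketched with a Python dictionary keyed by an (account number, check number) pair. The account numbers, payees, and amounts are made-up sample data, and add_check is a hypothetical helper.

```python
# Each check is identified by the pair (account number, check number).
checks = {}


def add_check(account_number, check_number, date, payee, amount_cents):
    """Store a check under its combined key, rejecting duplicates."""
    key = (account_number, check_number)
    if key in checks:
        raise KeyError(f"duplicate key {key}")
    checks[key] = {"date": date, "payee": payee, "amount_cents": amount_cents}


add_check("1001", 101, "2019-07-01", "Grocer", 4250)
add_check("1001", 102, "2019-07-03", "Landlord", 90000)
# The same check number may appear under a different account.
add_check("2002", 101, "2019-07-02", "Bookshop", 1899)

print(checks[("1001", 102)]["payee"])  # Landlord
print(checks[("2002", 101)]["payee"])  # Bookshop
```

The account number alone would match two checks for account 1001, and the check number alone would match checks from two accounts. Only the pair singles out one transaction.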
Types of Files
• There are mainly four types of files:
Master file
Transaction file
Table file
Report file
Master File
• A master file is a collection of records about an important aspect of an organization’s activities.
• It may contain data describing the current status of specific events or business indicators.
• E.g., the master file in an accounts payable system shows the balance owed to every vendor and supplier from whom the organization purchases supplies or services.
• A second type of master file reflects the history of events affecting a particular entity.
Transaction File
• A transaction file is a temporary file with two purposes:
1. Accumulating data about events as they occur.
2. Updating master files to reflect the results of current transactions.
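A minimal sketch of both purposes, assuming an accounts payable master file that holds the balance owed per vendor. The vendor codes, the transaction kinds "purchase" and "payment", and the function apply_transactions are illustrative assumptions.

```python
# Master file: current balance owed to each vendor, in cents.
master = {"V001": 50000, "V002": 0}

# Transaction file: events accumulated as they occur.
transactions = [
    ("V001", "purchase", 20000),
    ("V001", "payment", 50000),
    ("V002", "purchase", 7500),
]


def apply_transactions(master, transactions):
    """Return an updated copy of the master file reflecting every transaction."""
    updated = dict(master)
    for vendor_id, kind, amount in transactions:
        if kind == "purchase":
            updated[vendor_id] = updated.get(vendor_id, 0) + amount
        elif kind == "payment":
            updated[vendor_id] = updated.get(vendor_id, 0) - amount
        else:
            raise ValueError(f"unknown transaction kind: {kind}")
    return updated


new_master = apply_transactions(master, transactions)
print(new_master)  # {'V001': 20000, 'V002': 7500}
```

Once the master file has been updated, the transaction file has served its purpose, which is why the slide calls it temporary.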
Table File
• Table files contain reference data used in processing transactions, updating master files, or producing output.
• Table files conserve storage space and ease program maintenance by storing in a file data that otherwise would be included in programs or master file records.
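A small sketch of the idea, assuming a hypothetical table of department codes. The master records store only the short code, and the full name is looked up in the table when output is produced.

```python
# Table file: reference data kept in one place (hypothetical department codes).
DEPARTMENT_TABLE = {"10": "Accounting", "20": "Purchasing", "30": "Shipping"}

# Master records carry only the short code, not the full department name.
employees = [
    {"name": "Adams", "dept_code": "20"},
    {"name": "Baker", "dept_code": "10"},
]


def department_name(dept_code):
    """Look up the reference data for a code stored in a master record."""
    return DEPARTMENT_TABLE[dept_code]


report_lines = [f"{e['name']}: {department_name(e['dept_code'])}" for e in employees]
print(report_lines)  # ['Adams: Purchasing', 'Baker: Accounting']
```

Renaming a department then means changing one table entry rather than editing every master record or program.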
Report File
• Report files are temporary files used when printing time is not available for all the reports produced, a situation that frequently arises in overlapped processing (the capability of a computer to carry out input, processing, and output operations simultaneously, which increases throughput considerably).
• The computer writes the report or document to a file on magnetic disk or tape, where it remains until it can be printed.
• This process is known as spooling, i.e., output that cannot be printed when it is produced is spooled into a report file.
Other Files
• Other kinds of files play a role in information systems.
• A backup file is a copy of a master, transaction, or table file made to ensure that a duplicate is available if anything happens to the original.
• Archival files, copies made for long-term storage of data, usually are stored away from the computer center to prevent their being inadvertently accessed or retrieved for use, thus ensuring their preservation.
Methods of file organization
• Sequential organization
• Direct-Access organization
• Indexed organization
Sequential organization
• Sequential organization is the simplest way to store and retrieve records in a file.
• In a sequential file, records are stored one after another without concern for the actual value of data in the records.
• The first record is stored at the beginning of the file.
• The second is stored right after the first, the third after the second, and so on.
• This order never changes in sequential file organization.
Sequential organization 1
• Reading sequential files: To read a sequential file, the system always starts at the beginning of the file and reads its way up to the wanted record, one record at a time.
• Searching for records: Sequential files do not use physical record keys; records are accessed in order of their appearance in the file.
• Evaluation of sequential files: When every record in a file must be accessed, sequential organization is a good method. If, on average, about one-half of the records in the file are to be used, it is also acceptable. On the other hand, where the requirement is to find one particular record in a very large file, sequential file organization becomes a disadvantage.
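The cost of finding one record in a sequential file can be sketched as follows. A Python list stands in for the file, the sample checks are invented, and sequential_find is a hypothetical helper that also counts how many records had to be read.

```python
def sequential_find(records, wanted_check_number):
    """Read records from the start until the wanted one appears.

    Returns the record (or None) together with the number of records read.
    """
    reads = 0
    for record in records:
        reads += 1
        if record["check_number"] == wanted_check_number:
            return record, reads
    return None, reads


# Records sit in arrival order, not in key order.
check_file = [
    {"check_number": 104, "payee": "Grocer"},
    {"check_number": 101, "payee": "Landlord"},
    {"check_number": 103, "payee": "Utility"},
    {"check_number": 102, "payee": "Bookshop"},
]

found, reads = sequential_find(check_file, 103)
missing, reads_for_missing = sequential_find(check_file, 999)
print(found["payee"], reads)       # Utility 3
print(missing, reads_for_missing)  # None 4
```

A record that is absent, or stored near the end, forces a read of nearly the whole file, which is the disadvantage noted for very large files.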
Direct-Access organization
• Direct-access files are keyed files. They associate a record
with a specific key value and a particular storage location.
• All records are stored by key at addresses rather than by
position.
• If the program knows the record key, it can determine the location address of a record and retrieve it independently of every other record in a file.
• In general, if fewer than 10 percent of the records in a file
will be needed during a typical processing run, the file
should not be established as a sequential file.
• On the other hand, if more than 40 percent of the records
will be accessed, the analyst should select the sequential
organization.
Direct-Access organization 2
• Using the record key as the storage address is called direct addressing. When this method can be used, it is simple and quick.
• However, the requirements of this method often prevent its use. Direct addressing requires a data set with the following characteristics:
1. The key set (i.e., the range of key values assigned) is in a dense, ascending order with few unused values (unused values mean wasted storage space), so there should be few open gaps in key values.
2. The record keys correspond to the numbers of the storage addresses: there is a storage address for each actual or possible key value in the file, and there are no duplicate key values.
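A minimal sketch of direct addressing, assuming small integer keys from 0 to MAX_KEY, with a Python list standing in for the storage addresses. The names storage, store, and fetch and the sample records are illustrative.

```python
MAX_KEY = 9
storage = [None] * (MAX_KEY + 1)  # one storage address per possible key value


def store(key, record):
    """Place a record at the address equal to its key."""
    if not 0 <= key <= MAX_KEY:
        raise KeyError(f"key {key} has no storage address")
    if storage[key] is not None:
        raise ValueError(f"duplicate key {key}")
    storage[key] = record


def fetch(key):
    """Retrieve a record directly, without reading any other record."""
    if not 0 <= key <= MAX_KEY:
        raise KeyError(f"key {key} has no storage address")
    return storage[key]


for key, name in [(1, "Adams"), (2, "Baker"), (3, "Chen"), (5, "Diaz")]:
    store(key, name)

unused = storage.count(None)
print(fetch(3), unused)  # Chen 6
```

Only four of the ten addresses hold records here. The six unused addresses show the wasted space that a sparse key set causes.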
Direct-Access organization 3
• Hash addressing: When direct addressing is not possible but direct access is necessary, the analyst specifies the alternative access method of hashing.
• Hashing (also called key transformation or randomizing) refers to the process of deriving a storage address from a record key.
• An algorithm (an arithmetic procedure) is devised to change a key value into another value that serves as a storage address. (The data value in the record itself does not change.)
Direct-Access organization 4
• Types of hashing algorithms: A simple hashing algorithm for changing the social security number into a suitable storage address follows:
1. Strip off the first three digits of the number. 456821455 becomes 821455.
2. Divide the new key by a prime number. Here we are using 41.
3. Modular division is used: the remainder of the division, not the quotient, is kept.
4. The remainder is the storage address: 821455 = 41 × 20035 + 20, so the record is stored at location 20.
• Folding: Split the key into pieces and process them further (add, subtract, divide, etc.).
  821
+ 455
 1276   storage location
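The division and folding methods above can be sketched as follows. The function names and the default parameters are illustrative assumptions, and a real design would also need a rule for keys that hash to the same address.

```python
def strip_leading_digits(number, count=3):
    """Drop the first `count` digits of the number (step 1 of the division method)."""
    return int(str(number)[count:])


def division_remainder(key, prime=41):
    """Modular division: keep the remainder, not the quotient, as the address."""
    return key % prime


def fold(key, piece_length=3):
    """Split the key into pieces of `piece_length` digits and add the pieces."""
    digits = str(key)
    pieces = [
        int(digits[i:i + piece_length])
        for i in range(0, len(digits), piece_length)
    ]
    return sum(pieces)


key = strip_leading_digits(456821455)
print(key)                      # the stripped key
print(division_remainder(key))  # address from the division method
print(fold(key))                # address from folding
```

For the stripped key 821455, fold adds 821 and 455 and reproduces the 1276 shown in the folding example.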
Direct-Access organization 5
• Extraction: Select specific digits from the key and process them with the remaining digits.
  814   (1st, 3rd, 4th digits)
- 255   (2nd, 5th, 6th digits)
  559   storage location
• Squaring: Multiply the number by itself and then apply other hashing methods.
821,455 × 821,455 = 674,788,317,025; the first eight digits, 6747 8831, are carried forward.
Fold the first half with the second half:
   6747
+  8831
 15,578   storage location
Extract the 1st and 2nd digits and add them to the other digits:
  578
+  15
  593   storage location
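A sketch of the extraction and squaring methods for a six-digit key. The function names are illustrative, and keeping the first eight digits of the square is an assumption read off the worked example on the slide.

```python
def extract_and_subtract(key):
    """Form one number from the 1st, 3rd, 4th digits and subtract the number
    formed from the 2nd, 5th, 6th digits (expects a six-digit key)."""
    d = str(key)
    first = int(d[0] + d[2] + d[3])
    second = int(d[1] + d[4] + d[5])
    return first - second


def square_then_fold(key, keep=8):
    """Square the key, keep the first `keep` digits, and add the two halves."""
    digits = str(key * key)[:keep]
    half = keep // 2
    return int(digits[:half]) + int(digits[half:])


def fold_leading_digits(value, lead=2):
    """Add the first `lead` digits of a value to its remaining digits."""
    d = str(value)
    return int(d[lead:]) + int(d[:lead])


key = 821455
folded = square_then_fold(key)
print(extract_and_subtract(key))
print(folded, fold_leading_digits(folded))
```

For the key 821455, square_then_fold gives 15578 and fold_leading_digits(15578) gives 593, matching the two storage locations in the squaring example.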
Indexed organization
• A third way of accessing records is through an index.
• The basic form of an index includes a record key and a storage address for a record.
• To find a record when its storage address is unknown (as with direct access and hashing structures), it is necessary to scan the records.
• However, the search will be faster if an index is used, since it takes less time to search an index than an entire file of data.
Characteristics of an Index
• An index is a separate file from the master file to which it pertains.
• Each record in the index contains only two items of data: a record key and a storage address.
• To find a specific record when the file is stored under an indexed organization, the index is first searched to find the key of the record wanted.
• When it is found, the corresponding storage address is noted and then the program accesses the record directly.
• This method uses a sequential scan of the index, followed by direct access to the appropriate record.
• The index helps speed the search compared with a sequential file, but it is slower than direct addressing.
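The two-step lookup can be sketched as follows. A Python list stands in for the data file, with list positions playing the role of storage addresses. The account records and the name find_via_index are illustrative assumptions.

```python
# The data file: a record's position in this list stands in for its storage address.
data_file = [
    {"account": "3007", "balance": 120},
    {"account": "1001", "balance": 560},
    {"account": "2044", "balance": 75},
]

# The index: a separate structure whose entries hold only a key and an address.
index = [("1001", 1), ("2044", 2), ("3007", 0)]


def find_via_index(index, data_file, wanted_key):
    """Scan the index for the key, then go straight to the record's address."""
    for key, address in index:         # sequential scan of the index
        if key == wanted_key:
            return data_file[address]  # direct access to the record
    return None


print(find_via_index(index, data_file, "2044"))  # {'account': '2044', 'balance': 75}
print(find_via_index(index, data_file, "9999"))  # None
```

The scan touches only the small key and address pairs rather than full records, which is why it is faster than scanning the data file itself.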