04.01 file organization

DBMS

File Organization for Conventional DBMS

Bishal [email protected]

Storage Devices and Their Characteristics

Storage Hierarchy

• Primary Storage is the top level and is made up of CPU registers, CPU cache and memory, which are the only components that are directly accessible to the system's CPU. The CPU can continuously read data stored in these areas and execute instructions quickly and uniformly.

• Secondary Storage differs from primary storage in that it is not directly accessible by the CPU. A system uses input/output (I/O) channels to connect to secondary storage; these channels control the data flow on request.

• Secondary storage is non-volatile, so it does not lose data when it is powered down. It is also cheaper per byte than primary storage, so modern computer systems tend to have far more secondary storage than primary storage. Secondary storage today consists mainly of hard disk drives (HDDs), usually set up in a RAID configuration; older installations also included removable media such as magneto-optical (MO) disks.

• Tertiary Storage is mainly used for backup and archival of data. Although it is based on the slowest devices, it can be classed as the most important tier in terms of protecting data against the variety of disasters that can affect an IT infrastructure. Most devices in this segment are automated via robotics and software to reduce management costs and the risk of human error, and consist primarily of disk- and tape-based backup devices.

• Offline Storage is the final category and is where removable storage media sit, such as tape cartridges and optical discs (CD and DVD). Offline storage can be used to transfer data between systems, and it also allows data to be secured offsite so that companies always have a copy of valuable data in the event of a disaster.

Memory Hierarchy

Digital data devices

Hard disk internal structure

Checksum

• Checksums are used to ensure the integrity of data portions for data transmission or storage. A checksum is basically a calculated summary of such a data portion.

•Network data transmissions often produce errors, such as toggled, missing or duplicated bits.

• Some checksum algorithms can also correct (simple) errors by calculating where the error must be and repairing it.
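The detection half of the points above can be sketched in a few lines of Python. This is only an illustration: it assumes CRC-32 (via the standard `zlib` module) as the checksum, and the helper names `make_frame` and `verify_frame` are hypothetical. It detects corruption but does not attempt the error repair mentioned in the last bullet.

```python
import zlib


def make_frame(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 checksum to the payload (hypothetical framing)."""
    checksum = zlib.crc32(payload)
    return payload + checksum.to_bytes(4, "big")


def verify_frame(frame: bytes) -> bool:
    """Recompute the checksum over the payload and compare with the stored one."""
    payload, stored = frame[:-4], int.from_bytes(frame[-4:], "big")
    return zlib.crc32(payload) == stored


frame = make_frame(b"block of data")

# Simulate a transmission error by toggling a single bit of the payload.
corrupted = bytearray(frame)
corrupted[0] ^= 0x01

print(verify_frame(frame))             # intact frame passes the check
print(verify_frame(bytes(corrupted)))  # toggled bit is detected
```

The receiver only needs the frame itself: it recomputes the summary and compares, so no second copy of the data has to be sent.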

Parity Bits
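As a rough illustration of the idea named in this slide title, the sketch below computes an even parity bit over a byte string. The function names and the choice of even (rather than odd) parity are assumptions made for the example, not something the slide specifies.

```python
def count_ones(data: bytes) -> int:
    """Count the 1 bits in the data."""
    return sum(bin(byte).count("1") for byte in data)


def even_parity_bit(data: bytes) -> int:
    """Return the bit that makes the total number of 1 bits even."""
    return count_ones(data) % 2


def check_even_parity(data: bytes, parity: int) -> bool:
    """True if data plus parity bit still contain an even number of 1 bits."""
    return (count_ones(data) + parity) % 2 == 0


original = b"\x07"                   # three 1 bits
parity = even_parity_bit(original)   # parity bit is 1, total becomes four

one_bit_error = b"\x06"              # one bit toggled
two_bit_error = b"\x04"              # two bits toggled

print(check_even_parity(original, parity))       # consistent
print(check_even_parity(one_bit_error, parity))  # error detected
print(check_even_parity(two_bit_error, parity))  # error goes unnoticed
```

A single parity bit detects any odd number of toggled bits but misses an even number, which is why stronger checksums are used when several bits may be damaged.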

Disk Subsystem

• Multiple disks connected to a computer system through a controller
▫ Controller functionality (checksum, bad sector remapping) is often carried out by the individual disks, which reduces load on the controller

• Disk interface standards families
▫ ATA (AT Attachment) range of standards
▫ SATA (Serial ATA)
▫ SCSI (Small Computer System Interface) range of standards
▫ SAS (Serial Attached SCSI)
▫ Several variants of each standard (different speeds and capabilities)

• Disks are usually connected directly to the computer system

• In Storage Area Networks (SAN), a large number of disks are connected by a high-speed network to a number of servers

• In Network Attached Storage (NAS), networked storage provides a file system interface using a networked file system protocol, instead of providing a disk system interface

RAID - redundant array of independent disks

• RAID is short for redundant array of independent (or inexpensive) disks. It is a category of disk storage that employs two or more drives in combination for fault tolerance and performance. RAID is used frequently on servers but is generally not necessary for personal computers. RAID lets you store the same data redundantly (in multiple places) in a balanced way to improve overall storage performance.

• Level 0: Striped Disk Array without Fault Tolerance. Provides data striping (spreading out blocks of each file across multiple disk drives) but no redundancy. This improves performance but does not deliver fault tolerance. If one drive fails, then all data in the array is lost.

• Level 1: Mirroring and Duplexing. Provides disk mirroring. Level 1 provides twice the read transaction rate of single disks and the same write transaction rate as single disks.

• Level 2: Error-Correcting Coding. Not a typical implementation and rarely used, Level 2 stripes data at the bit level rather than the block level.

• Level 3: Bit-Interleaved Parity. Provides byte-level striping with a dedicated parity disk. Level 3, which cannot service simultaneous multiple requests, is also rarely used.

• Level 4: Dedicated Parity Drive. Level 4 provides block-level striping (like Level 0) with a dedicated parity disk. If a data disk fails, the parity data is used to create a replacement disk. A disadvantage of Level 4 is that the parity disk can become a write bottleneck, which is why it is less common in practice than Level 5.

• Level 5: Block Interleaved Distributed Parity. Provides data striping at the block level and also stripes the parity (error correction) information across all disks. This results in excellent performance and good fault tolerance. Level 5 is one of the most popular implementations of RAID.
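How parity lets Levels 4 and 5 rebuild a failed disk can be sketched with XOR. The example below is a toy model, not a RAID implementation: each "disk" is a short byte string, and the names `xor_blocks` and `parity_disk_for_stripe` are hypothetical. The rotation rule in `parity_disk_for_stripe` is just one possible Level 5 layout, assumed for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-sized blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_disk_for_stripe(stripe: int, n_disks: int) -> int:
    """Assumed Level 5 layout: the parity block rotates across the disks."""
    return (n_disks - 1 - stripe) % n_disks


# One stripe: three data blocks plus the parity block computed from them.
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_blocks)

# Suppose the disk holding the second block fails. XOR-ing the surviving
# data blocks with the parity block recreates the missing block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data_blocks[1])
print([parity_disk_for_stripe(s, 4) for s in range(5)])
```

With a dedicated parity disk (Level 4) every write touches the same parity disk; rotating the parity block per stripe, as in the assumed layout above, spreads that load across all disks.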

Performance Measures of Disks

• Access time: the time from when a read or write request is issued to when data transfer begins. To access data on a given sector of a disk, the arm must first move so that it is positioned over the correct track, and then must wait for the sector to appear under it as the disk rotates.

• The time for repositioning the arm is called seek time, and it increases with the distance the arm must move. Typical seek times range from 2 to 30 milliseconds. Average seek time is the average of the seek times measured over a sequence of (uniformly distributed) random requests, and it is about one third of the worst-case seek time.

• Once the seek has occurred, the time spent waiting for the sector to be accessed to appear under the head is called rotational latency time. Average rotational latency time is about half of the time for a full rotation of the disk. (Typical rotational speeds of disks range from 60 to 120 rotations per second.)

• The access time is then the sum of the seek time and the latency, and ranges from 10 to 40 milliseconds.
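The arithmetic behind these figures can be written out as a small calculation. The function names below are hypothetical; the formulas are the rules of thumb stated above (average seek is about one third of the worst case, average latency is half a rotation).

```python
def average_seek_ms(worst_case_seek_ms: float) -> float:
    """Average seek time is about one third of the worst-case seek time."""
    return worst_case_seek_ms / 3


def average_rotational_latency_ms(rotations_per_second: float) -> float:
    """Average latency is about half the time of one full rotation."""
    full_rotation_ms = 1000.0 / rotations_per_second
    return full_rotation_ms / 2


def average_access_ms(worst_case_seek_ms: float, rotations_per_second: float) -> float:
    """Access time is the sum of seek time and rotational latency."""
    return (average_seek_ms(worst_case_seek_ms)
            + average_rotational_latency_ms(rotations_per_second))


# A disk with a 30 ms worst-case seek, spinning at 60 or 120 rotations per second.
for rps in (60, 120):
    print(rps, round(average_access_ms(30, rps), 2))
```

Doubling the rotational speed halves the latency term only; the seek term is unchanged, so the overall access time improves by less than a factor of two.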

• Data transfer rate: the rate at which data can be retrieved from or stored to the disk. The figure given on this slide, 1 to 5 megabytes per second, describes older disk systems; modern drives sustain considerably higher rates.

• Reliability: measured by the mean time to failure. The typical mean time to failure of disks ranges from 30,000 to 800,000 hours (about 3.4 to 91 years).
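The two measures above reduce to simple unit conversions. The sketch below uses hypothetical function names, assumes 1 megabyte = 1,000,000 bytes and a 365-day year, and takes a 4096-byte block as an example size.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: non-leap year
BYTES_PER_MEGABYTE = 1_000_000  # assumption: decimal megabytes


def mttf_years(mttf_hours: float) -> float:
    """Convert a mean time to failure from hours to years."""
    return mttf_hours / HOURS_PER_YEAR


def transfer_time_ms(block_bytes: int, rate_mb_per_s: float) -> float:
    """Time to transfer one block at the given sustained rate, in milliseconds."""
    return block_bytes / (rate_mb_per_s * BYTES_PER_MEGABYTE) * 1000


print(round(mttf_years(30_000), 1), round(mttf_years(800_000)))

# Transfer time for one 4096-byte block at the slide's 1 and 5 MB/s figures.
for rate in (1, 5):
    print(rate, round(transfer_time_ms(4096, rate), 3))
```

Even at the slower rate the transfer of a single block takes only a few milliseconds, which is small next to the seek and rotational latency that precede it.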

Optimization of Disk-Block Access

• Data is transferred between disk and main memory in units called blocks.

• A block is a contiguous sequence of bytes from a single track of one platter.

• Block sizes range from 512 bytes to several thousand bytes.

• The lower levels of the file system manager convert block addresses into the hardware-level cylinder, surface, and sector number.
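The address conversion in the last point can be sketched as follows. This is a simplified model under stated assumptions: one block occupies exactly one sector, every track has the same number of sectors, and all numbers are zero-based. The function name `block_to_chs` and the example geometry are hypothetical.

```python
def block_to_chs(block_number: int, surfaces: int, sectors_per_track: int):
    """Map a linear block number to (cylinder, surface, sector).

    Blocks fill one track, then the next surface of the same cylinder,
    and only then move on to the next cylinder.
    """
    blocks_per_cylinder = surfaces * sectors_per_track
    cylinder, rest = divmod(block_number, blocks_per_cylinder)
    surface, sector = divmod(rest, sectors_per_track)
    return cylinder, surface, sector


# Example geometry (assumed): 4 surfaces, 63 sectors per track.
for block in (0, 62, 63, 252, 253):
    print(block, block_to_chs(block, surfaces=4, sectors_per_track=63))
```

Numbering blocks so that a whole cylinder is filled before the arm moves keeps consecutive blocks reachable without a seek.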

• Access to data on disk is several orders of magnitude slower than access to data in main memory. Several optimization techniques are used besides buffering of blocks in main memory.

• Scheduling. If several blocks from a cylinder need to be transferred, we may save time by requesting them in the order in which they pass under the heads. If the blocks are on different cylinders, it helps to request them in an order that minimizes disk-arm movement. A commonly used disk-arm scheduling algorithm is the elevator algorithm.

• File organization. Organize blocks on disk in a way that corresponds closely to the manner in which we expect data to be accessed. For example, store related information on the same track, on physically close tracks, or on adjacent cylinders in order to minimize seek time. IBM mainframe operating systems give programmers fine control over the placement of files, but this increases the programmer's burden.

• Nonvolatile write buffers. Use nonvolatile RAM (such as battery-backed RAM) to speed up disk writes drastically: first write to the nonvolatile RAM buffer, then inform the OS that the write has completed.

•Log disk. Another approach to reducing write latency is to use a log disk, a disk devoted to writing a sequential log. All access to the log disk is sequential, essentially eliminating seek time, and several consecutive blocks can be written at once, making writes to log disk several times faster than random writes.
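The elevator algorithm named under scheduling can be sketched as below. This is a minimal model, assuming requests are plain cylinder numbers and all are known up front; the function names are hypothetical. The arm keeps moving in one direction, serving every request on the way, then reverses.

```python
def elevator_order(head: int, requests, moving_up: bool = True):
    """Order cylinder requests the way an elevator-style arm would serve them."""
    lower = sorted(r for r in requests if r < head)
    upper = sorted(r for r in requests if r >= head)
    if moving_up:
        return upper + lower[::-1]   # sweep up, then come back down
    return lower[::-1] + upper       # sweep down, then come back up


def total_movement(head: int, order) -> int:
    """Total number of cylinders the arm crosses when serving requests in order."""
    distance = 0
    for cylinder in order:
        distance += abs(cylinder - head)
        head = cylinder
    return distance


pending = [10, 95, 52, 30, 70]
sweep = elevator_order(head=50, requests=pending)

print(sweep)
print(total_movement(50, sweep), total_movement(50, pending))
```

For this request set the sweep order moves the arm across fewer cylinders than serving the requests in arrival order, which is the saving in seek time that scheduling aims for.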

Optical Disks

http://www.doc.ic.ac.uk/~nd/surprise_97/journal/vol2/lpl/
