Distributed DBMSs - Concept and Design

Jing Luo

CS 157B, Dr. Lee

Fall, 2003



DBMSs

Centralized DBMS

• It allows users to access only a single logical database located at one site under its control.

Distributed DBMS

• It allows users to access not only the data at their own site but also data stored at remote sites.

Definitions

• Distributed database: A logically interrelated collection of shared data (and a description of this data) physically distributed over a computer network.

• Distributed DBMS: The software system that permits the management of the distributed database and makes the distribution transparent to users.

Users access the distributed database via applications

• Local applications: applications that do not require data from other sites.

• Global applications: applications that do require data from other sites.


Characteristics of DDBMS

• A collection of logically related shared data;

• The data is split into a number of fragments;

• Fragments may be replicated;

• Fragments/replicas are allocated to sites;

• The sites are linked by a communications network;

• The data at each site is under the control of a DBMS;

• The DBMS at each site can handle local applications autonomously;

• Each DBMS participates in at least one global application.
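
The characteristics listed above can be sketched as a small data model. This is only an illustration under stated assumptions: the relation name `Staff`, the fragment names, the site names, and the helpers `is_local` and `kind` are hypothetical and do not come from the slides. The sketch records which sites hold a replica of each fragment, then classifies an application as local or global by checking whether all the fragments it needs are available at its own site.

```python
# Hypothetical allocation: each fragment of a relation maps to the set of
# sites that hold a replica of it.
allocation = {
    "Staff_F1": {"Site1", "Site2"},   # fragment replicated at two sites
    "Staff_F2": {"Site3"},            # fragment stored at one site only
}


def is_local(fragments_needed, site):
    """Return True if every fragment needed has a replica at `site`."""
    return all(site in allocation[f] for f in fragments_needed)


def kind(fragments_needed, site):
    """Classify an application running at `site` as local or global."""
    return "local" if is_local(fragments_needed, site) else "global"


print(kind(["Staff_F1"], "Site1"))              # needs only data held at Site1
print(kind(["Staff_F1", "Staff_F2"], "Site1"))  # Staff_F2 is held only at Site3
```

A real DDBMS keeps this kind of allocation information in its catalog; the dictionary here merely stands in for it.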

A DDBMS is required to have at least one global application. It is not necessary for every site in the system to have its own local database.

[Figure: DDBMS. Site 1, Site 2, Site 3, and Site 4 are connected by a computer network; three databases (DB) are shown.]


Distributed processing

A centralized database that can be accessed over a computer network.

Distributed Processing (cont’d)

[Figure: Distributed processing. Site 1, Site 2, Site 3, and Site 4 are connected by a computer network; a single database (DB) is shown.]

Distributed DBMS vs. Distributed Processing

Distributed DBMS

• System consists of data that is physically distributed across a number of sites in the network.

Distributed processing

• Data is centralized, even though other users may be accessing the data over the network.


Parallel DBMSs

A DBMS running across multiple processors and disks that is designed to execute operations in parallel, whenever possible, in order to improve performance

Three Main Architectures for Parallel DBMSs

To provide multiple processors with common access to a single database, a parallel DBMS must provide for shared resource management.

• Shared memory

• Shared disk

• Shared nothing

Shared memory is a tightly coupled architecture in which multiple processors within a single system share system memory.

• Also known as symmetric multiprocessing (SMP), this approach has become popular on platforms ranging from personal workstations that support a few microprocessors in parallel, to RISC (Reduced Instruction Set Computer) based machines, all the way up to the largest mainframes.

• The architecture provides high-speed data access for a limited number of processors, but it does not scale beyond about 64 processors, at which point the interconnection network becomes a bottleneck.

Shared Memory (cont’d)

[Figure: Shared memory. Four CPUs are connected through an interconnection network to a shared memory and three databases (DB).]


Shared disk is a loosely-coupled architecture optimized for applications that are inherently centralized and require high availability and performance.

• Each processor can access all disks directly, but each has its own private memory.

• Shared disk architecture eliminates the shared memory performance bottleneck without introducing the overhead associated with physically partitioned data.

Shared Disk (cont’d)

[Figure: Shared disk. Four CPUs, each with its own memory, are connected through an interconnection network to three databases (DB).]

Shared nothing, often known as massively parallel processing (MPP), is a multiple-processor architecture in which each processor is part of a complete system, with its own memory and disk storage.

• The database is partitioned among all the disks on each system associated with the database, and data is transparently available to users on all systems.

• This architecture can easily support a large number of processors.
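
The partitioning described above can be sketched in a few lines. The slides do not say how the database is partitioned, so hash partitioning is an assumption made here for illustration, and the names `NUM_NODES`, `node_for`, `put`, and `get` are hypothetical. Each node owns a private store, standing in for its own memory and disk, and callers never name a node directly, which is one way to picture the transparency the slide mentions.

```python
import zlib

NUM_NODES = 4  # assumed number of shared-nothing nodes

# Each node owns a private store (standing in for its own memory and disk).
nodes = [dict() for _ in range(NUM_NODES)]


def node_for(key):
    """Pick the owning node from a stable hash of the key."""
    return zlib.crc32(str(key).encode("utf-8")) % NUM_NODES


def put(key, row):
    """Store a row on the single node that owns its key."""
    nodes[node_for(key)][key] = row


def get(key):
    """Fetch a row; the caller never names a node."""
    return nodes[node_for(key)].get(key)


for staff_no in range(1, 101):
    put(staff_no, {"staff_no": staff_no})

print([len(n) for n in nodes])  # number of rows held by each node
print(get(42))
```

`zlib.crc32` is used rather than Python's built-in `hash` because the built-in is randomized between runs for strings, which would make the routing unstable.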

Shared nothing (cont’d)

[Figure: Shared nothing (SN). Four CPUs, each with its own memory and database (DB), are connected by an interconnection network.]

Homogeneous & Heterogeneous DDBMSs

Homogeneous system

• All sites use the same DBMS product.

Heterogeneous system

• Sites may run different DBMS products, which need not be based on the same underlying data model, and so the system may be composed of relational, network, hierarchical, and object-oriented DBMSs.

Heterogeneous system problems

In a heterogeneous system, translations are required to allow communication between different DBMSs. The system has the task of locating the data and performing any necessary translation.

Data required from another site may have:

• Different hardware

• Different DBMS products

• Different hardware and different DBMS products

If the hardware is different but the DBMS products are the same, the translation involves changing codes and word lengths. If the DBMS products are different, the translation involves mapping data structures in one data model to the equivalent data structures in another data model.
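
As a rough illustration of the hardware-level translation (change of codes and word length), the sketch below converts text from EBCDIC to ASCII and widens a 16-bit big-endian integer to a 32-bit little-endian one. The specific encodings, word sizes, and the helper names `translate_text` and `translate_int` are assumptions chosen for the example; the slides do not name any particular hardware.

```python
import struct

# Assumption: the remote site stores text in EBCDIC (code page 037) and
# integers as 16-bit big-endian words; the local site wants ASCII text and
# 32-bit little-endian words.
remote_name = "SMITH".encode("cp037")     # bytes as the remote site holds them
remote_salary = struct.pack(">h", 12000)  # 16-bit big-endian integer


def translate_text(ebcdic_bytes):
    """Change of codes: EBCDIC bytes -> ASCII bytes."""
    return ebcdic_bytes.decode("cp037").encode("ascii")


def translate_int(word16_be):
    """Change of word length: 16-bit big-endian -> 32-bit little-endian."""
    (value,) = struct.unpack(">h", word16_be)
    return struct.pack("<i", value)


local_name = translate_text(remote_name)
local_salary = translate_int(remote_salary)
print(local_name, len(remote_salary), len(local_salary))
```

Mapping between data models (the second case on the slide) is a much larger task and is not covered by this sketch.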

Heterogeneous system problems (cont’d)

An additional complexity is the provision of a common conceptual schema. The integration of data models can be very difficult owing to semantic heterogeneity.

For example, attributes with the same name in two schemas may represent different things. Equally well, attributes with different names may model the same thing.
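
Both naming problems can be shown with a toy mapping. Everything in this sketch is hypothetical: the two local rows, the attribute names, and the `to_global` table are invented for illustration. Here `name` means a staff member at one site and a department at the other (same name, different things), while `salary` and `pay` mean the same thing under different names.

```python
# Hypothetical local rows from two sites.
site_a_row = {"name": "Ann Smith", "salary": 30000}   # name = staff member
site_b_row = {"name": "Sales", "pay": 28000}          # name = department

# Hand-written mappings from (site, local attribute) to a global attribute.
to_global = {
    ("A", "name"): "staff_name",
    ("A", "salary"): "salary",
    ("B", "name"): "dept_name",
    ("B", "pay"): "salary",
}


def integrate(site, row):
    """Rename a local row's attributes to the global schema's names."""
    return {to_global[(site, attr)]: value for attr, value in row.items()}


print(integrate("A", site_a_row))
print(integrate("B", site_b_row))
```

The hard part in practice is discovering these correspondences, which requires knowing what each attribute means; the code only applies a mapping that someone has already worked out.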

Solution

Gateways, which convert the language and model of each different DBMS into the language and model of the relational system.

Limitation

• It may not support transaction management. The gateway between two systems may be only a query translator. For example, a system may not coordinate concurrency control and recovery of transactions that involve updates to the pair of databases.

• The gateway approach is concerned only with the problem of translating a query expressed in one language into an equivalent expression in another language. As such, generally it does not address the issues of homogenizing the structural and representational differences between different schemas.


A multidatabase system (MDBS) is a distributed DBMS in which each site maintains complete autonomy. An MDBS resides transparently on top of existing database and file systems, and presents a single database to its users. It maintains a global schema against which users issue queries and updates; an MDBS maintains only the global schema and the local DBMSs themselves maintain all user data.
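
A minimal sketch of this idea, under several assumptions: two in-memory SQLite databases stand in for autonomous local DBMSs, the table and column names are invented, and `global_schema` and `query_global` are hypothetical names for the MDBS layer. The layer stores only the description of the global relation and forwards reads to the local databases, which keep all user data. Updates and transaction coordination are left out.

```python
import sqlite3

# Two independent in-memory databases stand in for autonomous local DBMSs.
site1 = sqlite3.connect(":memory:")
site1.execute("CREATE TABLE employee (ename TEXT, salary INTEGER)")
site1.execute("INSERT INTO employee VALUES ('Ann', 30000)")

site2 = sqlite3.connect(":memory:")
site2.execute("CREATE TABLE worker (full_name TEXT, pay INTEGER)")
site2.execute("INSERT INTO worker VALUES ('Bob', 28000)")

# The MDBS layer holds only the global schema: for the global relation
# staff(name, salary), one local query per site. It stores no user data.
global_schema = {
    "staff": [
        (site1, "SELECT ename, salary FROM employee"),
        (site2, "SELECT full_name, pay FROM worker"),
    ]
}


def query_global(relation):
    """Answer a read on a global relation by querying every local DBMS."""
    rows = []
    for conn, local_sql in global_schema[relation]:
        rows.extend(conn.execute(local_sql).fetchall())
    return sorted(rows)


print(query_global("staff"))
```

Each local database can still be used directly by its own applications, which is the sense in which the sites keep their autonomy.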

Concepts of Networking

Network

An interconnected collection of autonomous computers that are capable of exchanging information. For our purposes, the DDBMS is built on top of a network in such a way that the network is hidden from the user.


Classification of network

LAN: a local area network is intended for connecting computers at the same site.

WAN: a wide area network is used when computers or LANs need to be connected over long distances.

A special case of the WAN is a metropolitan area network (MAN), which generally covers a city or suburb.

Summary of WAN and LAN characteristics

WAN

• Distances up to thousands of kilometers

• Link autonomous computers

• Network managed by independent organization (using telephone or satellite links)

• Data rate up to 33.6 kbit/s (dial-up via modem), 45 Mbit/s (T3 circuit)

• Complex protocol

• Use point-to-point routing

• Use irregular topology

• Error rate about 1:10^5

LAN

• Distances up to a few kilometers

• Link computers that cooperate in distributed applications

• Network managed by users (using privately owned cables)

• Data rate up to 2500 Mbit/s (ATM)

• Simpler protocol

• Use broadcast routing

• Use bus or ring topology

• Error rate about 1:10^9
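
To get a feel for these figures, the sketch below works out how long one message would take to send at each quoted data rate and how many bit errors to expect. Two assumptions are made here that are not on the slide: the message size of 1 MB (10^6 bytes), and the reading of "1:10^5" as one bit in error per 10^5 bits sent. Latency and protocol overhead are ignored.

```python
# Data rates quoted on the slide, in bits per second.
RATES_BPS = {
    "WAN dial-up modem": 33.6e3,
    "WAN T3 circuit": 45e6,
    "LAN ATM": 2500e6,
}

# Error rates quoted on the slide, read here as errors per bit sent.
ERROR_RATE = {"WAN": 1e-5, "LAN": 1e-9}

MESSAGE_BITS = 1_000_000 * 8  # assumed message size: 1 MB (10**6 bytes)


def transfer_seconds(bits, rate_bps):
    """Transmission time only; latency and protocol overhead are ignored."""
    return bits / rate_bps


def expected_bit_errors(bits, error_rate):
    """Expected number of corrupted bits for a message of `bits` bits."""
    return bits * error_rate


for link, rate in RATES_BPS.items():
    print(f"{link}: {transfer_seconds(MESSAGE_BITS, rate):.4f} s")

for net, rate in ERROR_RATE.items():
    print(f"{net}: {expected_bit_errors(MESSAGE_BITS, rate):.3f} expected bit errors")
```

Under these assumptions the same message takes roughly 238 seconds over the modem link, about 0.18 seconds over T3, and about 0.003 seconds on the LAN, with around 80 expected bit errors on the WAN against 0.008 on the LAN.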

Network protocols

A set of rules that determines how messages between computers are sent, interpreted, and processed.

• TCP/IP (Transmission Control Protocol/Internet Protocol)

• SPX/IPX (Sequenced Packet Exchange/Internetwork Packet Exchange)

• NetBIOS (Network Basic Input/Output System)

• APPC (Advanced Program-to-Program Communications)

Network protocols (cont’d)

• DECnet

• AppleTalk

• WAP (Wireless Application Protocol)
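
To make "a set of rules for how messages are sent and interpreted" concrete, here is a toy protocol. It is not any of the protocols listed on these slides: the framing rule and the function names `encode_message` and `decode_messages` are invented for illustration. The single rule is that every message is a 4-byte big-endian length followed by that many bytes of UTF-8 text, so a receiver that follows the same rule can split a byte stream back into the original messages.

```python
import struct

# Toy protocol rule: every message is a 4-byte big-endian length followed by
# that many bytes of UTF-8 text.


def encode_message(text):
    """Sender side: frame one message for transmission."""
    payload = text.encode("utf-8")
    return struct.pack(">I", len(payload)) + payload


def decode_messages(stream):
    """Receiver side: split a byte stream back into the messages sent."""
    messages = []
    offset = 0
    while offset < len(stream):
        (length,) = struct.unpack_from(">I", stream, offset)
        offset += 4
        messages.append(stream[offset:offset + length].decode("utf-8"))
        offset += length
    return messages


wire = encode_message("SELECT 1") + encode_message("COMMIT")
print(decode_messages(wire))
```

Real protocols add many more rules (addressing, ordering, retransmission, error detection), but sender and receiver agreeing on a shared format is the core idea.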