01 whirlwind tour

© Jim Gray, Andreas Reuter Transaction Processing - Concepts and Techniques WICS August 2 - 6, 1999

The Whirlwind Tour

Chapter 1a

        Aug. 2               Aug. 3               Aug. 4                   Aug. 5               Aug. 6
 9:00   Intro & terminology  TP mons & ORBs       Logging & res. Mgr.      Files & Buffer Mgr.  Structured files
11:00   Reliability          Locking theory       Res. Mgr. & Trans. Mgr.  COM+                 Access paths
13:30   Fault tolerance      Locking techniques   CICS & TP & Internet     CORBA/EJB + TP       Groupware
15:30   Transaction models   Queueing             Advanced Trans. Mgr.     Replication          Performance & TPC
18:00   Reception            Workflow             Cyberbricks              Party                FREE

Transactions: Where It All Started

[Cuneiform] documents now number about half a million, three-quarters of them more or less directly related to the history of law - dealing, as they do, with contracts, acknowledgment of debts, receipts, inventories, and accounts, as well as containing records and minutes of judgments rendered in courts, business letters, administrative and diplomatic correspondence, laws, international treaties, and other official transactions. The total evidence enables the historian to reach back as far as the beginnings of writing, to the dawn of history. [...] Moreover, because of the inconvenience of writing in stone or clay, Mesopotamians wrote only when economic or political necessity demanded it.

(Encyclopaedia Britannica, 1974 edition)

From Transactions to Transaction Processing Systems - I

The Sumerian way of doing business involved two components:

Database. An abstract system state, represented as marks on clay tablets, was maintained. Today, we would call this the database.

Transactions. Scribes recorded state changes with new records (clay tablets) in the database. Today, we would call these state changes transactions.

From Transactions to Transaction Processing Systems - II

[Figure: a change in reality is mirrored in the abstraction by a transaction that turns database DB into DB'; a query against the database returns an answer.]

The real state is represented by an abstraction, called the database, and the transformation of the real state is mirrored by the execution of a program, called a transaction, that transforms the database.

Transactions Are In ...

Communications:

Each time you make a phone call, there is a call setup transaction that allocates some resources to your conversation; the call teardown is a second transaction, freeing those resources. The call setup increasingly involves complex algorithms to find the callee (800 numbers could be anywhere in the world) and to decide who is to be billed (800 and 900 numbers have complex billing). The system must deal with features like call forwarding, call waiting, and voice mail. After the call teardown, billing may involve many phone companies.

Transactions Are In ...

Finance:

Each time you purchase gas using a credit card, the point-of-sale terminal connects to the credit card company's computer. In case that fails, it may alternatively try to debit the amount to your account by connecting to your bank.

This generalizes to all kinds of point-of-sale terminals such as cash registers, ATMs, etc.

When banks balance their accounts with each other (electronic fund transfer), they use transactions for reliability and recoverability.

Transactions Are In ...

Travel:

Making reservations for a trip requires many related bookings and ticket purchases from airlines, hotels, rental car companies, and so on.

From the perspective of the customer, the whole trip package is one purchase. From the perspective of the multiple systems involved, many transactions are executed: one per airline reservation (at least), one for each hotel reservation, one for each car rental, one for each ticket to be printed, one for setting up the bill, etc.

Along the way, each inquiry that may not have resulted in a reservation is a transaction, too.

Transactions Are In ...

Manufacturing:

Order entry, job and inventory planning and scheduling, accounting, and so on are classical application areas of transaction processing. Computer integrated manufacturing (CIM) is a key technique for improving industrial productivity and efficiency. Just-in-time inventory control, automated warehouses, and robotic assembly lines each require a reliable data storage system to represent the factory state.

Transactions Are In ...

Real-Time Systems:

This application area includes all kinds of physical machinery that needs to interact with the real world, either as a sensor, or as an actor. Traditionally, such systems were custom made for each individual plant, starting from the hardware. The usual reason for that was that 20 years ago off-the-shelf systems could not guarantee real-time behavior that is critical in these applications. This has changed, and so has the feasibility of building entire systems from scratch. Standard software is now used to ensure that the application will be portable.

A Transaction Processing System

A transaction processing system (TP-system) provides tools to ease or automate application programming, execution, and administration of complex, distributed applications.

Transaction processing applications typically support a network of devices that submit queries and updates to the application.

Based on these inputs, the application maintains a database representing some real-world state.

Application responses and outputs typically drive real-world actuators and transducers that alter or control the state.

The applications, database, and network tend to evolve over several decades.

Increasingly, the systems are geographically distributed, heterogeneous (they involve equipment and software from many different vendors), continuously available (there is no scheduled downtime), and have stringent response time requirements.

ACID Properties: First Definition

Atomicity: A transaction’s changes to the state are atomic: either all happen or none happen. These changes include database changes, messages, and actions on transducers.

Consistency: A transaction is a correct transformation of the state. The actions taken as a group do not violate any of the integrity constraints associated with the state. This requires that the transaction be a correct program.

Isolation: Even though transactions execute concurrently, it appears to each transaction T, that others executed either before T or after T, but not both.

Durability: Once a transaction completes successfully (commits), its changes to the state survive failures.

Structure of a Transaction Program

The application program declares the start of a new transaction by invoking BEGIN_WORK().

All subsequent operations will be covered by the transaction. Eventually, the application program will call COMMIT_WORK(), if a new consistent state has been reached. This makes sure the new state becomes durable.

If the application program cannot complete properly (violation of consistency constraints), it will invoke ROLLBACK_WORK(), which appeals to the atomicity of the transaction, thus removing all effects the program might have had so far.

If for some reason the application fails to call either commit or rollback (there could be an endless loop, a crash, a forced process termination), the transaction system will automatically invoke ROLLBACK_WORK() for that transaction.
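The bracket structure described above can be tried out with any transactional store. The following is a minimal sketch, assuming SQLite (through Python's standard sqlite3 module) as the resource manager; the `account` table, the `transfer` function, and the "no negative balance" constraint are hypothetical illustrations, and SQL's BEGIN / COMMIT / ROLLBACK stand in for BEGIN_WORK(), COMMIT_WORK(), and ROLLBACK_WORK().

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction; return True on commit."""
    cur = conn.cursor()
    cur.execute("BEGIN")  # plays the role of BEGIN_WORK()
    try:
        cur.execute("UPDATE account SET balance = balance - ? WHERE id = ?",
                    (amount, src))
        cur.execute("UPDATE account SET balance = balance + ? WHERE id = ?",
                    (amount, dst))
        # Hypothetical consistency constraint: no balance may go negative.
        (lowest,) = cur.execute("SELECT MIN(balance) FROM account").fetchone()
        if lowest < 0:
            raise ValueError("consistency constraint violated")
        cur.execute("COMMIT")  # plays the role of COMMIT_WORK()
        return True
    except Exception:
        cur.execute("ROLLBACK")  # plays the role of ROLLBACK_WORK()
        return False


def demo():
    # isolation_level=None puts sqlite3 in autocommit mode, so the explicit
    # BEGIN / COMMIT / ROLLBACK statements above control the transaction.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("a", 100), ("b", 0)])
    first = transfer(conn, "a", "b", 60)   # fits: commits
    second = transfer(conn, "a", "b", 60)  # would overdraw: rolls back
    balances = dict(conn.execute("SELECT id, balance FROM account"))
    return first, second, balances
```

In `demo()`, the second transfer is rolled back as a whole, so neither of its two updates is visible afterwards. The automatic rollback after a crash or forced termination is the job of the transaction system and is not modeled here.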

The End User’s View of a Transaction Processing System

[Figure: the mail system as the end user sees it. Left: mailboxes and mail (mailboxes Andreas, Jim, Bruce, Chris, Betty). Right: operations on mail and mailboxes, shown as screens: Logon (Name, Password), Headers (From/Subject list: Jim "hi", Chris "it's raining", Betty "more bugs"), Read Message (from: Jim, subject: hi, text), Send Message (to: Jim, subject: dinner, text/sound/image), Delete Message, Cancel Message.]

The Administrator's/Operator’s View of a TP System

[Figure: system components as seen by the administrator and operator. Labels: Application, Data Base, Data Comm, Repository, Mail Gateway, Other Mail Systems; sites Hong Kong, Berlin, New York.]

Performance Measures of Interactive Transactions

Performance/Transaction   Small/Simple   Medium        Complex
----------------------------------------------------------------
Instr./transaction        100k           1M            100M
Disk I/O / TA             1              10            1000
Local msgs. (B)           10 (5KB)       100 (50KB)    1000 (1MB)
Remote msgs. (B)          2 (300B)       2 (4KB)       100 (1MB)
Cost/TA/second            10k$/tps       100k$/tps     1M$/tps
Peak tps/site             1000           100           1

Client-Server Computing: The Classical Idea

[Figure: the workstation client handles presentation (Logon, Headers, Read, Send, Delete) and reaches the host server(s) over data communications using transactional remote procedure calls. On the host, a TP monitor routes the calls to the services (Logon, Headers, Read, Send, Delete), which use the database.]

Client-Server Computing: The CORBA Idea

[Figure: a client on the workstation (presentation services etc.) issues the request "Delete" through an IDL stub; the Object Request Broker delivers it to the IDL skeleton of the object implementation, Jim's mailbox.]

Client-Server Computing: The WWW Idea

[Figure: a WWW browser loads a Java applet plus Java Database Connection (JDBC) driver code from an HTTP server. Three paths from the applet to the database server are shown: a JDBC driver using the proprietary protocol; a JDBC-ODBC bridge with an ODBC driver using the proprietary protocol; and a JDBC network driver using a public protocol (e.g. TCP/IP).]

Using Transactional Remote Procedure Calls (TRPCs)

[Figure: time-sequence diagram with the lanes User, Client, TP Monitor, and Service (server). Labels: Screen, Network, Network, Another TP-Monitor and Server, Database; the vertical axis is Time.]

Terms We Have Introduced So Far

Resource manager: The system comes with an array of transactional resource managers that provide ACID operations on the objects they implement. Database systems, persistent programming languages, and queue managers are typical examples.

Durable state: Application state represented as durable data stored by the resource managers.

TRPC: Transactional remote procedure calls allow the application to invoke local and remote resource managers as though they were local. They also allow the application designer to decompose the application into client and server processes on different computers.

Transaction program: Inquiries and state transformations are written as programs in conventional or specialized programming languages. The programmer brackets the successful execution of the program with a Begin-Commit pair and brackets a failed execution with a Begin-Rollback pair.

Terms We Have Introduced So Far

Atomicity: At any point before the commit, the application or the system may abort the transaction, invoking rollback. If the transaction is aborted, all of its changes to durable objects will be undone (reversed), and it will be as though the transaction never ran.

Consistency: The work within a Begin-Commit pair must be a correct transformation.

Isolation: While the transaction is executing, the resource managers ensure that all objects the transaction reads are isolated from the updates of concurrent transactions.

Durability: Once the commit has been successfully executed, all the state transformations of that transaction are made durable and public.

The World According to the Resource Manager

[Figure: labels Application, Application Servers, Resource Managers, Transaction, Transaction Manager.]

Where To Split Client/Server?

[Figure: the layers Presentation, Flow Control, Application Logic (= business objects), and Data Access. Depending on where the split is made, the client ranges from thin to fat and the server from fat to thin.]

Client/Server Infrastructure

[Figure: three columns, Client, Middleware, and Server. Labels: GUI, OOUI, System Mgmt., OS; SQL, ORB, TRPC, Security, Transport, Mail, WWW, Files, etc.; Objects, Groupware, TP-Mon., DBMS, OS.]

Transactional Core Services

[Figure: components Application, Resource Manager (Normal Functions, Transaction Recovery Functions), Lock Manager, Log Manager, Recovery Manager. Labels on the arrows: Begin_Work(), Commit_Work(), Join_Work, transid, Work Requests, Lock Requests, Log Records, Commit Phase 1? Yes/No, Commit Phase 2 ack, Write Commit Log Record & Force Log.]

The X/Open TP-Model

[Figure: the Application sends Begin, Commit, Abort to the Transaction Manager (TM) and Requests to the Resource Manager (RM). The RM sends Join to the TM; the TM sends Prepare, Commit, Abort to the RM.]
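The message flows named in this model (Begin, Join, Prepare, Commit, Abort) can be played through in a few lines. The following is a toy sketch, not the X/Open interface itself: the class and method names are hypothetical, Prepare is read as the phase-1 vote and Commit/Abort as the phase-2 decision, and logging, timeouts, and failures are left out entirely.

```python
class ResourceManager:
    """Hypothetical RM: joins a transaction, votes in phase 1, obeys in phase 2."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare  # whether this RM will vote "yes"
        self.state = "idle"

    def work(self, tm):
        tm.join(self)  # the RM announces itself to the TM (Join)
        self.state = "active"

    def prepare(self):
        self.state = "prepared" if self.can_prepare else "aborted"
        return self.can_prepare

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


class TransactionManager:
    """Hypothetical TM: collects joined RMs and drives the commit decision."""

    def __init__(self):
        self.participants = []

    def begin(self):
        self.participants = []

    def join(self, rm):
        if rm not in self.participants:
            self.participants.append(rm)

    def commit(self):
        # Phase 1: ask every joined RM to prepare and collect the votes.
        votes = [rm.prepare() for rm in self.participants]
        decision = all(votes)
        # Phase 2: tell every RM the common outcome.
        for rm in self.participants:
            if decision:
                rm.commit()
            else:
                rm.abort()
        return "committed" if decision else "aborted"


def run(can_prepare_flags):
    """Application side: Begin, send work to each RM, then Commit."""
    tm = TransactionManager()
    tm.begin()
    rms = [ResourceManager(f"rm{i}", ok) for i, ok in enumerate(can_prepare_flags)]
    for rm in rms:
        rm.work(tm)
    outcome = tm.commit()
    return outcome, [rm.state for rm in rms]
```

A single "no" vote is enough to abort everybody: `run([True, False])` ends with both resource managers in the aborted state.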

The X/Open Distributed Transaction Processing Model

[Figure: two nodes, each with a Transaction Manager (TM), a Resource Manager (RM), and a Communications Manager (CM). On the first node the Application sends Begin, Commit, Abort to its TM, Requests to its RM, and Remote Requests through the CM (Outgoing). On the second node the CM (Incoming) sends Start to the TM, and a Server sends Requests to its RM. Each TM sends Prepare, Commit, Abort to its RM.]

The OTS Model

[Figure: components transaction originator, recoverable server, and Transaction service. The TA-context is transmitted with each request. Labels on the links to the Transaction service: creation, termination, invocation, commit coordination.]

Transaction Processing System Feature List

Application development features

Application generators; graphical programming interfaces; screen painters; compilers; CASE tools; test data generators; starter system with a complete set of administrative and operations functions, security, and accounting.

Repository features

Description of all components of the system, both hardware and software. Description of the dependencies among components (bill-of-material). Description of all changes to all components to keep track of different versions. The repository is a database. Its role in the system must be complete, extensible, active and allow for local autonomy.

TP-Monitor Features

Process management; server classes; transactional remote procedure calls; request-based authentication and authorization; support for applications and resource managers in implementing ACID operations on durable objects.

Transaction Processing System Feature List

Data communications features

Uniform I/O interfaces; device independence; virtual terminal; screen painter support; support for RPC and TRPC; support for context-oriented communication (peer-to-peer).

Database features

Data independence; data definition; data manipulation; data control; data display; database operations.

Operations features

Archiving; reorganization; diagnosis; recovery; disaster recovery; change control; security; system extension.

Education and testing features

Imbedded education; online documentation; training systems; national language features; test database generators; test drivers.

Data Communications Protocols

[Figure: Applications sit on a Standard Interface To All Networks, which in turn sits on the protocols SNA LU0, SNA LU6.2 PU2.1, OSI X.25, TCP/IP, and Named Pipes.]

The standard interface adds: transactions, RPC, naming, security, reliable messages, and a uniform interface.

Presentation Management

[Figure: the presentation manager (PM) sits between the Application and the terminal and uses the Form Description and Device Description kept in the Repository.]

Form description:

  1 LOGON
    2 NAME PIC X(20)
    2 PIN  PIC 9(4)

Application:

  READ TERMINAL
  CHECK PIN
  DISPLAY HELLO OR NO

Screen:

  OUR BANK
  NAME_____
  PASSWORD_

SQL Data Definition

TABLE (= file)

TUPLE (= record)

COLUMN (= field)

DOMAIN (= type)

VIEW

[Figure: table employee with columns name, dept, loc; view emp_view with columns dept, loc.]

DEFINE VIEW emp_view AS
SELECT dept, loc FROM employee WHERE loc = 7;
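The view definition on this slide can be tried out directly. The following is a small sketch, assuming SQLite through Python's standard sqlite3 module; SQLite, like standard SQL, spells the statement CREATE VIEW rather than DEFINE VIEW, and the three sample rows are made up for illustration.

```python
import sqlite3


def emp_view_rows():
    """Create the employee table and emp_view, then read the view."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employee (name TEXT, dept TEXT, loc INTEGER)")
    # Hypothetical sample data: two employees at location 7, one elsewhere.
    conn.executemany(
        "INSERT INTO employee VALUES (?, ?, ?)",
        [("Jim", "research", 7), ("Andreas", "teaching", 3), ("Betty", "sales", 7)],
    )
    conn.execute(
        "CREATE VIEW emp_view AS SELECT dept, loc FROM employee WHERE loc = 7"
    )
    return conn.execute("SELECT dept, loc FROM emp_view ORDER BY dept").fetchall()
```

The view stores no rows of its own; each query against `emp_view` is answered from the current contents of `employee`.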

SQL Data Manipulation

PROJECT (column subset)

SELECT (row subset)

JOIN (matching values)

[Figure: the three operators applied to the table employee (columns name, dept, loc). Project keeps a subset of the columns, select keeps a subset of the rows, and join combines employee with a second table (columns dept, mgr) on matching values.]
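The three operators can be written down in a few lines each. The following is a minimal sketch in plain Python, with tables as lists of dictionaries; the function names and the sample rows are hypothetical, and a real database system would of course use indexes and smarter join algorithms than the nested loop shown here.

```python
def select(rows, predicate):
    """SELECT: keep the subset of rows for which the predicate holds."""
    return [row for row in rows if predicate(row)]


def project(rows, columns):
    """PROJECT: keep only the named columns of every row."""
    return [{col: row[col] for col in columns} for row in rows]


def join(left, right, column):
    """JOIN: pair up rows of both tables whose values in `column` match."""
    return [
        {**l_row, **r_row}
        for l_row in left
        for r_row in right
        if l_row[column] == r_row[column]
    ]


# Hypothetical sample tables with the column names from the slide.
EMPLOYEE = [
    {"name": "Jim", "dept": "research", "loc": 7},
    {"name": "Betty", "dept": "sales", "loc": 3},
]
DEPARTMENT = [
    {"dept": "research", "mgr": "Andreas"},
    {"dept": "sales", "mgr": "Chris"},
]
```

For example, `project(select(EMPLOYEE, lambda r: r["loc"] == 7), ["name"])` first takes the row subset and then the column subset, while `join(EMPLOYEE, DEPARTMENT, "dept")` attaches the manager to each employee.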

Summary of Chapter 1

A transaction processing system is a large web of application generators, system design and operation tools, and the more mundane language, database, network, and operations software.

The repository and the applications that maintain it are the mechanisms needed to manage the TP system. The repository is a transaction processing application.

It represents the system configuration as a database and supplies change control by transactions that manipulate the configuration and the repository.

The transaction concept, like contract law, is intended to resolve the situation when exceptions arise. The first order of business in designing a system is, therefore, to have a clear model of system failure modes. What breaks? How often do things break?

Chapter 1b

Basic Terminology

A Word About Words (Chapter 2)

Humpty Dumpty: “When I use a word, it means exactly what I chose it to mean; nothing more nor less.”

Alice: “The question is, whether you can make words mean so many different things.”

Humpty Dumpty: “The question is, which is to be master, that’s all.”

Lewis Carroll

Basic Computer Terms

To avoid the confusion caused by the many synonyms in our field, let us adopt the following conventions for the rest of this class:

domain = data type = ...
field = column = attribute = ...
record = tuple = object = entity = ...
block = page = frame = slot = ...
file = data set = table = ...
process = task = thread = actor = ...
function = request = method = ...

All the other terms and definitions we need will be briefly introduced and explained during the session.

Basic Hardware Architecture I

In Bell and Newell’s classic taxonomy, hardware consists of three types of modules: Processors, memory, and communications (switches or wires).

Processors execute instructions from a program, read and write memory, and send data via communication lines.

Computers are generally classified as supercomputers, mainframes, minicomputers, workstations, and personal computers. However, these distinctions are becoming fuzzy with current shifts in technology.

Basic Hardware Architecture II

Today’s workstation has the power of yesterday’s mainframe. Similarly, today’s WAN (wide area network) has the communications bandwidth of yesterday’s LAN (local area network). In addition, electronic memories are growing in size to include much of the data formerly stored on magnetic disk.

These technology trends have deep implications for transaction processing.

Basic Hardware Architecture III

Distributed processing: Processing is moving closer to the producers and consumers of the data (workstations, intelligent sensors, robots, and so on).

Client-server: These computers interact with each other via request-reply protocols. One machine, called the client, makes requests to another, called the server. Of course, the server may in turn be a client to other machines.

Clusters: Powerful servers consist of clusters of many processors and memories, cooperating in parallel to perform common tasks.

Basic Hardware Architecture IV

[Figure: several processors, some with their own memory, connected to each other through The Network.]

Memories - The Economic Perspective I

The processor executes instructions from virtual memory, and it reads and alters bytes from the virtual memory. The mapping between virtual memory and real memory includes electronic memory, which is close to the processor, volatile, fast, and expensive, and magnetic memory, which is "far away" from the processor, non-volatile, slow, and cheap. The mapping process is handled by the operating system with some hardware assistance.

Memory performance is measured by its access time: Given an address, the memory presents the data at some later time. The delay is called the memory access time. Access time is a combination of latency (the time to deliver the first byte), and transfer time (the time to move the data). Transfer time, in turn, is determined by the transfer size and the transfer rate. This produces the following overall equation:

memory access time = latency + ( transfer size / transfer rate )
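The access time equation is simple enough to check by hand, but a small function makes it easy to play with the parameters. This is a minimal sketch; the function name, the units (seconds, bytes, bytes per second), and the example numbers are assumptions chosen for illustration.

```python
def memory_access_time(latency_s, transfer_size_bytes, transfer_rate_bytes_per_s):
    """Access time in seconds: latency plus transfer size over transfer rate."""
    return latency_s + transfer_size_bytes / transfer_rate_bytes_per_s


# Hypothetical device: 15 ms latency, 10 MB/s transfer rate.
SMALL_READ = memory_access_time(0.015, 1_000, 10_000_000)      # latency dominates
LARGE_READ = memory_access_time(0.015, 1_000_000, 10_000_000)  # transfer dominates
```

For the small read almost all of the time is latency; for the large read the transfer term is several times larger than the latency.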

Memories - The Economic Perspective II

Memory price-performance is measured in one of two ways:

Cost/byte. The cost of storing a byte of data in that medium.

Cost/access. The cost of reading a block of data from that medium. This is computed by dividing the device cost by the number of accesses per second that the device can perform. The actual units are cost/access/second, but the time unit is implicit in the metric’s name.

These two cost measures reflect the two different views of a memory’s purpose: it stores data, and it receives and retrieves data.

Memories - The Economic Perspective III

[Figure: Size vs Speed. Horizontal axis: access time (seconds), 10^-9 to 10^3. Vertical axis: typical large system capacity, 10^3 to 10^15 bytes (Kilo Byte to Peta Byte). Plotted in order of increasing access time: cache, electronic main, electronic secondary (RAM disc), magnetic and optical discs, online tape, nearline tape and optical disc, offline tape.]

Memories - The Economic Perspective IV

[Figure: Price vs Speed. Horizontal axis: access time (seconds), 10^-9 to 10^3. Vertical axis: $/MB, 10^-4 to 10^6. Plotted: electronic main, electronic secondary, magnetic and optical discs, online tape, nearline tape and optical disc, offline tape.]

Magnetic Memory

There are two types of magnetic storage media: disk and tape. Disks rotate, passing the data in the cylinder by the electronic read-write heads every few milliseconds. This gives low access latency. The disk arm can move among cylinders in tens of milliseconds. Tapes have approximately the same storage density and transfer rate, but they must move long distances if random access is desired. Consequently, tapes have large random access latencies—on the order of seconds.

Disk Access Time = Seek_Time + Rotational_Latency + (Transfer_Size / Transfer_Rate)

Magnetic Memory

Compare the times required for two access patterns to 1MB stored in 1000 blocks on disk:

Sequential access: Read or write sectors [x, x + 1, ..., x + 999] in ascending order. This requires one seek (10 ms) and half a rotation (5 ms) before the data in the cylinder begins transferring the megabyte at 10 MBps (the transfer takes 100 ms, ignoring one-cylinder seeks).

The total access time is 115ms.

Random access: Read the 1000 sectors [x, ..., x + 999] in random order. In this case, each read requires a seek (10 ms), half a rotation (5 ms), and then the 1 KB transfer (0.1 ms). Since there are 1000 of these events, the total access time is 15.1 seconds.
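The two totals can be recomputed from the disk access time formula (seek time plus rotational latency plus transfer size over transfer rate). This is a small sketch using the numbers from the slide; the constant and function names are hypothetical, and 1 KB is taken as 1000 bytes so that 1000 blocks make up the megabyte.

```python
SEEK_S = 0.010               # one seek: 10 ms
HALF_ROTATION_S = 0.005      # average rotational latency: 5 ms
TRANSFER_RATE_BPS = 10e6     # 10 MB/s, in bytes per second
BLOCK_BYTES = 1_000          # 1 KB block (assumed to be 1000 bytes)
BLOCK_COUNT = 1_000


def disk_access_time(seek_s, rotation_s, size_bytes, rate_bytes_per_s):
    """Seek time + rotational latency + transfer time, in seconds."""
    return seek_s + rotation_s + size_bytes / rate_bytes_per_s


# Sequential: one seek and half a rotation, then the whole megabyte streams.
sequential_s = disk_access_time(
    SEEK_S, HALF_ROTATION_S, BLOCK_COUNT * BLOCK_BYTES, TRANSFER_RATE_BPS
)

# Random: every one of the 1000 blocks pays its own seek and rotation.
random_s = BLOCK_COUNT * disk_access_time(
    SEEK_S, HALF_ROTATION_S, BLOCK_BYTES, TRANSFER_RATE_BPS
)
```

The ratio `random_s / sequential_s` comes out at roughly 130, which is why access pattern matters far more than raw transfer rate for this kind of device.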

Memory Hierarchies

[Figure: the memory hierarchy. From the top down, memory capacity grows and the data becomes less current: processor registers; cache; main memory (electronic storage); online external storage (block addressed, non-volatile, electronic or magnetic); near line (archive) storage (tape or disc robots); off line storage.]

Memory Hierarchies

The hierarchy uses small, fast, expensive cache memories to cache some data present in larger, slower, cheaper memories.

If hit ratios are good, the overall memory speed approximates the speed of the cache.

At any level of the memory hierarchy, the hit ratio is defined as:

hit ratio = references satisfied by cache / all references to cache

Suppose a cache memory with access time C has hit rate H, and suppose that on a miss the secondary memory access time is S. Further, suppose that C = .01 • S. The effective access time of the cache will be as follows:

Effective memory access time = H • C + (1 - H) • S
                             = H • (.01 • S) + (1 - H) • S
                             = (1 - .99 • H) • S
                             ≈ (1 - H) • S
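The effective access time formula and its approximation can be compared numerically. This is a minimal sketch; the function names are hypothetical, and S is set to 1 so that all times are expressed as fractions of the secondary memory access time.

```python
def effective_access_time(hit_ratio, cache_time, secondary_time):
    """H * C + (1 - H) * S: hits cost C, misses cost S."""
    return hit_ratio * cache_time + (1 - hit_ratio) * secondary_time


def approximate_access_time(hit_ratio, secondary_time):
    """(1 - H) * S: the approximation that ignores the cost of hits."""
    return (1 - hit_ratio) * secondary_time


S = 1.0        # secondary memory access time (unit of measure)
C = 0.01 * S   # cache access time, as assumed on the slide

exact_at_90 = effective_access_time(0.90, C, S)    # (1 - .99 * .90) * S
approx_at_90 = approximate_access_time(0.90, S)    # (1 - .90) * S
exact_at_99 = effective_access_time(0.99, C, S)
```

At a 90% hit ratio the approximation is off by less than a tenth; at 99% the cost of the hits themselves is as large as the cost of the misses, so the approximation is only good while misses dominate.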

The Five Minute Rule

Assume there are no special response time (real-time) requirements; the decision to keep something in cache is, therefore, purely economic.

To make things simple, suppose that data blocks are 10 KB. At 1995 prices, 10 KB of main memory cost about $1. Thus, we could keep the data in main memory forever if we were willing to spend a dollar.

With 10 KB of disk costing only $.10, we could save $.90 if we kept the 10 KB on disk. In reality, the savings are not so great; if the disk data is accessed, it must be moved to main memory, and that costs something. How much, then, does a disk access cost?

A disk, along with all its supporting hardware, costs about $3,000 (in 1995) and delivers about 30 accesses per second; the cost, therefore, is about $100 per access per second. At this rate, if the data is accessed once a second, it costs $100.10 to store it on disk (disk storage and disk access costs). That is considerably more than the $1 to store it in main memory.

The break-even point is about one access per 100 seconds. At that rate, the main memory cost is about the same as the disk storage cost plus the disk access costs. At a more frequent access rate, disk storage is more expensive. At a less frequent rate, disk storage is cheaper. Anticipating the cheaper main memory that will result from technology changes, this observation is called the five-minute rule rather than the two-minute rule.

The Five Minute Rule

Keep a data item in electronic memory if it is accessed at least once every five minutes; otherwise keep it in magnetic memory.

Similar arguments apply to objects stored on tape and cached on disk. Given the object size, the cost of cache, the cost of secondary memory, and the cost of accessing the object in secondary memory once per second, the frequency at the break-even point in units of accesses per second (a/s) is given by the following formula:

Frequency ≈ ((Cache_Cost/Byte - Secondary_Cost/Byte) × Object_Bytes) / Object_Access_Per_Second_Cost  a/s
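Plugging the 1995 prices quoted for the five-minute rule into the break-even formula reproduces the "one access per 100 seconds" figure. This is a small sketch; the function and constant names are hypothetical, and 10 KB is taken as 10,000 bytes.

```python
def break_even_frequency(cache_cost_per_byte, secondary_cost_per_byte,
                         object_bytes, access_per_second_cost):
    """Break-even access frequency in accesses per second (a/s)."""
    saving = (cache_cost_per_byte - secondary_cost_per_byte) * object_bytes
    return saving / access_per_second_cost


OBJECT_BYTES = 10_000                        # one 10 KB block
MAIN_MEMORY_COST_PER_BYTE = 1.00 / 10_000    # $1 per 10 KB of main memory
DISK_COST_PER_BYTE = 0.10 / 10_000           # $.10 per 10 KB of disk
DISK_ACCESS_COST = 3_000 / 30                # $3,000 disk at 30 accesses/second

frequency = break_even_frequency(
    MAIN_MEMORY_COST_PER_BYTE, DISK_COST_PER_BYTE, OBJECT_BYTES, DISK_ACCESS_COST
)
seconds_between_accesses = 1 / frequency
```

The result is about 0.009 accesses per second, or one access every 111 seconds or so, which the rule rounds to "about 100 seconds".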

The Rules of Exponential Growth

Electronic memory (Moore’s Law):

MemoryChipCapacity(year) = 4^((year-1970)/3) Kb/chip, for year in [1970...2000]

Magnetic memory (Hoagland’s Law):

MagneticAreaDensity(year) = 10^((year-1970)/10) Mb/inch^2, for year in [1970...2000]

Processors (Joy’s Law):

SunMips(year) = 2^(year-1984) MIPS, for year in [1984...2000]
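The three growth rules are easy to tabulate. The sketch below reads the exponents on this slide as 4^((year-1970)/3) Kb/chip for Moore's Law, 10^((year-1970)/10) Mb/inch^2 for Hoagland's Law, and 2^(year-1984) MIPS for Joy's Law; that reading, the function names, and the range checks are assumptions made for illustration.

```python
def _check_range(year, first, last):
    """Reject years outside the interval for which the rule is stated."""
    if not first <= year <= last:
        raise ValueError(f"rule is only stated for {first}...{last}")


def memory_chip_capacity_kb(year):
    """Moore's Law as read here: capacity quadruples every three years."""
    _check_range(year, 1970, 2000)
    return 4 ** ((year - 1970) / 3)


def magnetic_area_density_mb_per_inch2(year):
    """Hoagland's Law as read here: density grows tenfold every ten years."""
    _check_range(year, 1970, 2000)
    return 10 ** ((year - 1970) / 10)


def sun_mips(year):
    """Joy's Law as read here: processor speed doubles every year."""
    _check_range(year, 1984, 2000)
    return 2 ** (year - 1984)
```

Under this reading a memory chip grows from 1 Kb in 1970 to 4 Kb in 1973, and a 1994 processor delivers 2^10 = 1024 MIPS.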

Communication Hardware

The four kinds of networks are defined by their diameters. These diameters imply certain latencies (based on the speed of light). In 1990, Ethernet (at 10 Mbps) was the dominant LAN. Metropolitan networks typically are based on 1 Mbps public lines. Such lines are too expensive for transcontinental links at present; most long-distance lines are therefore 50 Kbps or less. As the news shows, these things are changing fast.

The early 90s

Type of Network             Diameter     Latency   Bandwidth   Send 1 KB
Cluster                     100 m        .5 µs     1 Gbps      10 µs
LAN (local area network)    1 km         5. µs     10 Mbps     1 ms
MAN (metro area network)    100 km       .5 ms     1 Mbps      10 ms
WAN (wide area network)     10,000 km    50. ms    50 Kbps     210 ms

Communication Hardware

Scenario 2000: point-to-point bandwidth likely to be common among computers by the year 2000.

Type of Network | Diameter | Latency | Bandwidth | Send 1 KB
Cluster | 100 m | 0.5 µs | 1 Gbps | 5 µs
LAN (local area network) | 1 km | 5 µs | 1 Gbps | 10 µs
MAN (metro area network) | 100 km | 0.5 ms | 100 Mbps | 0.6 ms
WAN (wide area network) | 10,000 km | 50 ms | 100 Mbps | 50 ms


Processor Architectures

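[Figure: three processor architectures. Private memories: each processor has its own memory, and processors are connected by the network. Global memory: processors with private memory share global memory (shared disks / tapes). Shared memory: several processors access one common memory.]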


Processor Architectures

Shared nothing: In a shared-nothing design, each memory is dedicated to a single processor. All accesses to that data must pass through that processor. Processors communicate by sending messages to each other via the communications network.

Shared global: In a shared-global design, each processor has some private memory not accessible to other processors. There is, however, a pool of global memory shared by the collection of processors. This global memory is usually addressed in blocks (units of a few kilobytes or more) and is RAM disk or disk.

Shared memory: In a shared-memory design, each processor has transparent access to all memory. If multiple processors access the data concurrently, the underlying hardware regulates the access to the shared data and provides each processor a current view of the data.


Address Spaces

[Figure: three processes, each with its own address space made up of segments. Shared code segments and shared data segments appear in more than one address space.]


Address Spaces

Memory segmentation and sharing: A process executes in an address space—a paged, segmented array of bytes. Some segments may be shared with other address spaces. The sharing may be execute-only, read-only, or read-write. Most of the segment slots are empty (lightly shaded boxes), and most of the occupied segments are only partially full of programs or data.

To simplify memory addressing, the virtual address space is divided into fixed-size segment slots, and each segment partially fills a slot.

Typical slot sizes range from 2**24 to 2**32 bytes. This gives a two-dimensional address space, where addresses are {segment_number, byte}. Segments, in turn, are often partitioned into virtual memory pages, which are the unit of transfer between main and secondary memory. If an object is bigger than a segment, it can be mapped into consecutive segments of the address space.
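The two-dimensional {segment_number, byte} address can be illustrated by splitting a flat virtual address at the slot boundary. This is a sketch only: it assumes the smallest slot size mentioned, 2**24 bytes, and the function names are hypothetical.

```python
SLOT_BITS = 24                 # assumed slot size of 2**24 bytes
SLOT_SIZE = 1 << SLOT_BITS


def split_address(virtual_address):
    """Split a flat virtual address into (segment_number, byte)."""
    return virtual_address >> SLOT_BITS, virtual_address & (SLOT_SIZE - 1)


def join_address(segment_number, byte):
    """Combine (segment_number, byte) back into a flat virtual address."""
    if not 0 <= byte < SLOT_SIZE:
        raise ValueError("byte offset does not fit in one segment slot")
    return (segment_number << SLOT_BITS) | byte


segment_number, byte = split_address(0x03000010)
print(segment_number, hex(byte))
```

An object larger than one slot would simply continue at byte 0 of the next segment number, which is why consecutive segments give a contiguous range of flat addresses.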


Processes

A process is a virtual processor. It has an address space that contains the program the process is executing and the memory the process reads and writes. One can imagine a process executing Java programs statement by statement, with each statement reading and writing bytes in the address space or sending messages to other processes.

Processes provide an ability to execute programs in parallel; they provide a protection entity; and they provide a way of structuring computations into independent execution streams. So they provide a form of fault containment in case a program fails.

Processes are building blocks for transactions, but the two concepts are orthogonal. A process can execute many different transactions over time, and parts of a single transaction may be executed by many processes.

Each process executes on behalf of some user, or authority, and with some priority. The authority determines what the process can do: which other processes, devices, and files the process can address and communicate with. The process priority determines how quickly the process’s demand for resources will be serviced if other processes make competing demands. Short tasks typically run with high priority, while large tasks are given lower priority.


Protection Domains

There are two ways to provide protection:

Process = protection domain: Each subsystem executes as a separate process with its own private address space. Applications execute subsystem requests by switching processes, that is, by sending a message to a process.

Address space = protection domain: A process has many address spaces: one for each protected subsystem and one for the application. Applications execute subsystem requests by switching address spaces. The address space protection domain of a subsystem is just an address space that contains some of the caller’s segments; in addition, it contains program and data segments belonging to the called subsystem. A process connects to the domain by asking the subsystem or OS kernel to add the segment to the address space. Once connected, the domain is callable from other domains in the process by using a special instruction or kernel call.


Protection Domains

[Figure: a single process containing four protection domains: Application, Database, Network, and OS Kernel.]

A process may have many protection domains.


Threads

There is a need for multiple processes per address space:

For example, to scan through a data stream, one process is appointed the producer, which reads the data from an external source, while the second process processes the data. Further examples of cooperating processes are file read-ahead, asynchronous buffer flushing, and other housekeeping chores in the system.

Processes can share the same address space simply by having all their address spaces point to the same segments. Most operating systems do not make a clean distinction between address spaces and processes. Thus a new concept, called a thread or a task, is introduced.
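The producer/consumer arrangement described above can be sketched with two threads that share one address space and hand data over through a bounded buffer. The function names and the doubling step are placeholders for real input and real processing.

```python
import queue
import threading


def producer(source, buffer):
    """Read items from an external source and pass them to the consumer."""
    for item in source:
        buffer.put(item)
    buffer.put(None)  # sentinel: no more data


def consumer(buffer, results):
    """Process items until the sentinel arrives."""
    while True:
        item = buffer.get()
        if item is None:
            break
        results.append(item * 2)  # stand-in for real processing


def run_pipeline(source):
    buffer = queue.Queue(maxsize=4)  # bounded buffer shared by both threads
    results = []                     # shared memory in the common address space
    threads = [
        threading.Thread(target=producer, args=(source, buffer)),
        threading.Thread(target=consumer, args=(buffer, results)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


print(run_pipeline([1, 2, 3]))
```

Because both threads live in the same address space, the `results` list and the buffer need no copying between them; only access to the buffer is synchronized.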

But note: Several operating systems do not use the term process at all. For example, in the Mach operating system, thread means process, and task means address space; in MVS, task means process, and so on.


Threads

The term thread often implies a second property: inexpensive to create and dispatch. Threads are commonly provided by some software that found the operating system processes to be too expensive to create or dispatch. The thread software multiplexes one big operating system process among many threads, which can be created and dispatched hundreds of times faster than a process.

The term thread is used in the following to connote these light-weight processes. Unless this light-weight property is intended, “process” is used. Several threads usually share a common address space. Typically, all the threads have the same authorization identifier, since they are part of the same address space domain, but they may have different scheduling priorities.


Messages and Sessions

There are two styles of communication among processes:

Datagrams: The sender of a message determines the recipient's address (e.g. the process name) and constructs an envelope consisting of the sender's name and address, the recipient's name and address, and the message text. This envelope is delivered to the capable hands of the communication system. It is analogous to sending letters by mail.

Sessions: Before any messages are sent, a fixed connection is established between sender and receiver, a so-called session. Once it has been established, both parties can send and receive messages via this session. This symmetry is often referred to as "peer-to-peer". Establishing a session requires a datagram. A session must at some point be closed down explicitly. It is analogous to a phone conversation.


Advantages of Sessions

Shared state: A session represents shared state between the client and the server. A datagram might go to any process with the designated name, but a session goes to a particular instance of that name.

Authorization: Processes do not always trust each other. The server often checks the client’s credentials to see that the client is authorized to perform the requested function. The authentication protocols require multi-message exchanges. Once the session key is established, it is shared state.

Error correction: Messages flowing in each session direction are numbered sequentially. These sequence numbers can detect lost messages and duplicate messages.

Performance: The operations described are fairly costly. Each of the steps often involves several messages. By establishing a session, this information is cached.
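The error-correction point can be made concrete: a receiver that remembers the next expected sequence number can tell duplicates from gaps. The class below is a minimal sketch with hypothetical names; a real session layer would also request retransmission, which is omitted here.

```python
class SessionReceiver:
    """Receiving end of one session direction (illustrative only)."""

    def __init__(self):
        self.next_expected = 0   # session state shared by convention with the sender
        self.delivered = []

    def receive(self, sequence_number, text):
        if sequence_number < self.next_expected:
            return "duplicate"   # already delivered once, so drop it
        if sequence_number > self.next_expected:
            return "gap"         # at least one earlier message was lost
        self.delivered.append(text)
        self.next_expected += 1
        return "ok"


receiver = SessionReceiver()
print(receiver.receive(0, "a"))  # first message in order
print(receiver.receive(0, "a"))  # the same message again
print(receiver.receive(2, "c"))  # message 1 is missing
```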


Clients and Servers

The question of how computations consisting of many interacting processes should be structured has no simple answer. Currently, two styles are particularly popular: peer-to-peer and client-server.

The debate about which style is "better" often creates the impression that they are radically different. But in reality, peer-to-peer is more general and more complex, and it subsumes client-server. Here is a brief characterization:

Peer-to-peer: The two processes are independent peers, each executing its computation and occasionally exchanging data with the other.

Client-server: The two processes interact via request-reply exchanges in which one process, the client, makes a request to a second process, the server, which performs this request and replies to the client.


Clients and Servers

The limitation of the client-server model lies in the fact that it implies a synchronous pattern of one request/one response.

There are, however, cases in which one request generates thousands of replies, or where thousands of requests generate one reply. Operations that have this property include transferring a file between the client and server or bulk reading and writing of databases. In other situations, a client request generates a request to a second server, which, in turn, replies to the client. Parallelism is a third area where simple RPC is inappropriate. Because the client-server model postulates synchronous remote procedure calls, the computation uses one processor at a time. However, there is growing interest in schemes that allow many processes to work on problems in parallel. The RPC model in its simplest form does not allow any parallelism.


Remote Procedure Calls (RPCs)

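[Figure: local procedure call versus remote procedure call. Local: the caller executes z = add(x,y), which directly invokes add(int x,y) { return x + y } and receives z. Remote: the client executes z = add(x,y); a stub packs and sends (add, x, y) to the server, which unpacks the request, calls add(int x,y) { return x + y }, then packs and sends back x + y; the client unpacks the reply and returns z.]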
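The pack, send, unpack, and call steps of a remote procedure call can be sketched in a single program by treating the "network" as a function that takes request bytes and returns reply bytes. JSON is used for packing purely as an assumption, and all names other than `add` are hypothetical.

```python
import json


def add(x, y):
    return x + y


PROCEDURES = {"add": add}  # procedures the server is willing to execute


def server_handle(request_bytes):
    """Server side: unpack the request, call the procedure, pack the reply."""
    request = json.loads(request_bytes.decode("utf-8"))
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")


def remote_call(proc, *args, transport=server_handle):
    """Client stub: pack and send the call, then unpack and return the result."""
    request_bytes = json.dumps({"proc": proc, "args": list(args)}).encode("utf-8")
    reply_bytes = transport(request_bytes)  # stands in for the network round trip
    return json.loads(reply_bytes.decode("utf-8"))["result"]


z = remote_call("add", 2, 3)
print(z)
```

To the caller, `remote_call("add", 2, 3)` looks much like the local `add(2, 3)`; the difference is hidden in the stub and the transport.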


Naming

Naming has to do with the problem of how a client denotes a server it wants to invoke. Typical naming schemes distinguish between an object's name, its address, and its location. The name is an abstract identifier for the object, the address is the path to the object, and the location is where the object is.

An object can have several names. Some of these names may be synonyms, called aliases. Let us say that Bruce and Lindsay are two aliases for Bruce Lindsay. For this to be explicit, all names, addresses, and locations must be interpreted in some context, called a directory. For example, in our RPC context, Bruce means Bruce Nelson, and in our publishing context, Bruce means Bruce Spatz. Within the 408 telephone area, Bruce Lindsay’s address is 927-1747, and outside the United States it is +1-408-927-1747.


Name Servers

Names are grouped into a hierarchy called the name space. An international commission has defined a universal name space standard, X.500, for computer systems. The commission administers the root of that name space. Each interior node of the hierarchy is a directory. A sequence of names delimited by a period (.) gives a path name from the directory to the object.

No one stores the entire name space—it is too big, and it is changing too rapidly. Certain processes, called name servers, store parts of the name space local to their neighborhood; in addition, they store a directory of more global name servers.
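Path-name resolution through a hierarchy of directories can be sketched with nested dictionaries. The directory names `rpc` and `publishing` are illustrative labels for the two contexts mentioned in the naming example; a real name server would hold only part of the tree and forward the rest of the path to other name servers.

```python
# Each interior node is a directory (a dict); leaves are the named objects.
NAME_SPACE = {
    "rpc": {"Bruce": "Bruce Nelson"},
    "publishing": {"Bruce": "Bruce Spatz"},
}


def resolve(path_name, directory):
    """Follow a period-delimited path name from a directory to an object."""
    node = directory
    for name in path_name.split("."):
        if not isinstance(node, dict) or name not in node:
            raise KeyError(f"cannot resolve {name!r} in path {path_name!r}")
        node = node[name]
    return node


print(resolve("rpc.Bruce", NAME_SPACE))
print(resolve("publishing.Bruce", NAME_SPACE))
```

The same name, Bruce, resolves to different objects depending on the directory in which it is interpreted.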


Authentication Techniques

Passwords are the simplest technique. The client has a secret password, a string of bytes known only to it and the server. The client sends its password to the server to prove the client’s identity. A second password is then needed to authenticate the server to the client. Thus, two passwords are required, and they must be sent across the wire.

Challenge-response uses only one password or key. In this scheme, the client and the server share a secret encryption key. The server picks a random number, N, and encrypts it with the key as EN. The server sends EN to the client and challenges the client to decrypt it using the secret key. If the client responds with N, the server believes the client knows the secret encryption key. The client can also authenticate the server by challenging it to decrypt a second random number. The shared secret is stored at both ends, but random numbers are sent across the wire.
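A challenge-response exchange can be sketched with the Python standard library. Note the substitution: the standard library has no symmetric cipher, so instead of decrypting an encrypted random number, the client in this sketch proves knowledge of the shared key by returning a keyed hash (HMAC) of the server's random challenge. The idea is the same (only random values and derived values cross the wire, never the key), but this is a variant, not the exact scheme described above. All function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def make_challenge():
    """Server: pick a fresh random challenge."""
    return secrets.token_bytes(16)


def respond(shared_key, challenge):
    """Client: prove knowledge of the key without sending it."""
    return hmac.new(shared_key, challenge, hashlib.sha256).digest()


def verify(shared_key, challenge, response):
    """Server: recompute the expected response and compare in constant time."""
    expected = hmac.new(shared_key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)


key = b"shared secret key"          # stored at both ends, never transmitted
challenge = make_challenge()
print(verify(key, challenge, respond(key, challenge)))
print(verify(key, challenge, respond(b"wrong key", challenge)))
```

The client can authenticate the server the same way by issuing its own challenge.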


Authentication Techniques

Public key system: Each authid has a pair of keys—a public encryption key, EK, and a private decryption key, DK. The keys are chosen so that DK(EK(X)) = X, but knowing only EK and EK(X) it is hard to compute X. Thus, a process’s ability to compute X from EK(X) is proof that the process knows the secret DK. Each authid publishes its public key to the world. Anyone wanting to authenticate the process as that authid goes through the challenge protocol: The challenger picks a random number X, encrypts it with the authid’s public key EK, and challenges the process to compute X from EK(X). Secrets are stored in one place only, and they do not go across the wire.
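The property DK(EK(X)) = X and the resulting challenge protocol can be illustrated with textbook RSA on tiny numbers. The slide does not name a particular cryptosystem, so RSA is an assumption here, and keys this small offer no security whatsoever; the point is only to show the roles of EK and DK. It requires Python 3.8 or later for the modular inverse via `pow`.

```python
import secrets

# Toy key pair (assumed values, far too small for real use).
P, Q = 61, 53
N = P * Q                            # public modulus
E = 17                               # public exponent, part of EK
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent, part of DK


def encrypt_public(x):
    """EK(X): anyone can compute this from the published key."""
    return pow(x, E, N)


def decrypt_private(c):
    """DK(C): only the holder of the private key can compute this."""
    return pow(c, D, N)


def authenticate(prover):
    """Challenger: pick random X, send EK(X), and check that X comes back."""
    x = secrets.randbelow(N - 2) + 2
    return prover(encrypt_public(x)) == x


print(authenticate(decrypt_private))   # prover that knows DK
print(authenticate(lambda c: 0))       # prover that does not
```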


Scheduling

The purpose of scheduling is to make sure all requests get processed, i.e. are assigned to a specific server process. There are basically two additional constraints:

Short response times: The requests should not wait longer than necessary before they get serviced.

Economic usage of resources: The required throughput should be achieved with the minimum number of resources (processors, nodes, links, etc.).

Throughput and response time at resource utilization r are related by the following formula:

Average_Response_Time(r) = (1/ (1 - r)) • Service_Time
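Evaluating the formula at a few utilizations shows how sharply response time grows as a resource approaches saturation. This is a direct transcription of the formula; the function name and the range check are additions.

```python
def average_response_time(utilization, service_time):
    """Average_Response_Time(r) = (1 / (1 - r)) * Service_Time, for 0 <= r < 1."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in the range [0, 1)")
    return service_time / (1 - utilization)


for r in (0.1, 0.5, 0.8, 0.9, 0.95):
    multiple = average_response_time(r, service_time=1.0)
    print(f"utilization {r:.2f}: {multiple:.1f} x service time")
```

At 50% utilization the average response time is twice the service time; at 90% it is about ten times, and at 95% about twenty times.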


The Scheduling Problem

[Figure: Response Time vs Utilization. Horizontal axis: utilization, from 0 to 1. Vertical axis: response time in multiples of service time, from 0 to 30.]


File Organizations

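[Figure: taxonomy of file organizations. A file is either unstructured or structured. The structured organizations shown are entry sequenced, relative, key sequenced, and hash, grouped under the labels "direct" and "associative".]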


SQL in a Distributed Environment

[Figure: a client connected to several SQL servers. Client layers: application program; SQL (set-oriented logic); file system (record logic); network (message transport). Each SQL server has the layers: SQL (set-oriented logic); file server (records and files); network (message transport).]


Software Performance

[Figure: cost of common operations on logarithmic scales, in instructions and in microseconds (with 10 MIPS and Ethernet), ranging from .1 to 1,000,000. Operations shown: procedure call, domain switch, process dispatch, process create, local rpc, LAN rpc, WAN rpc, 1KB memory copy, 1KB on Ethernet, WAN transmit delay, disc access, random read memory record, random write memory record, sequential read record, sequential write record, random read/write disc record, null transaction, main memory transaction, simple database transaction.]


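Protocol Standards

[Figure: two aspects of standards. Porting and installation steps: a portable program written against an API passes through the compiler and linker/loader to become a "local" compiled program on an operating system such as Unix or VMS. Operation and inter-operation: a client process on the client machine and a server on the server machine, each with its operating system and protocol machine, exchange message formats according to a FAP.]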


Relevant FAP-Standards

CSMA/CD, Token Ring, etc.: Low-level protocols that specify how bits are physically transmitted across a shared medium.

IP/TCP, NetBIOS, HTTP: Transport level protocols.

LU6.2: SNA's peer-to-peer protocol that allows both session oriented and client-server-style communication under transaction protection.

OSI-TP: ISO's rendering of a protocol that provides a functionality very similar to LU6.2.

ASN.1: Protocol for exchanging data formatting and structuring information. Required for RPCs in a heterogeneous environment.

DRDA: Interoperability standard for IBM SQL-systems.

ODBC, JDBC: Interoperability standards for general SQL-systems.


Relevant API-Standards

SQL: Portability standard for accessing relational databases (lots of proprietary extensions).

APPC, CPI-C: Two of IBM's APIs for the LU6.2 protocol.

X/Open-XA, X/Open-XA+, etc.: APIs by the X/Open consortium on ISO's OSI-TP protocols.

IDL: OMG's interface definition language to let objects be integrated through an object request broker.

STDL: Language for programming TP-applications; based on the ACMS TP-monitor.

Java: The web's favorite programming language; comes with its own FAP-component.


OSI Standards and X/Open APIs

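[Figure: two nodes, each with a communications manager (CM), a transaction manager (TM), and a resource manager (RM). On the first node the application sends requests to the RM and calls begin, commit, and abort on the TM; on the second node a server sends requests to its RM. Each TM sends prepare, commit, and abort to its RM. The CM on the first node tells its TM that a transid is leaving this node; the CM on the second node tells its TM that a new transid is arriving. Remote requests flow between the two CMs. The two TMs communicate via the OSI/TP and CCR protocols: start; prepare, commit, abort; +ack, -ack, restart.]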


A Last Glance at TP-Standards

PARTICIPANTS | PROTOCOL/API | DEFINER
application : TM | TX | X/Open DTP
application : RM | RM specific (e.g. SQL, Queues) | various
application : server | RPC or ROSE | OSI + application
TM : RM | XA | X/Open DTP
TM : CM | XA+ | X/Open DTP
TM : TM | OSI-TP + CCR | OSI

Each resource manager (RM) registers with its local transaction manager (TM). Applications start and commit transactions by calling their local TM. At commit, the TM invokes every participating RM. If the transaction is distributed, the communications manager informs the local and remote TM about the incoming or outgoing transaction, so that the two TMs can use the OSI-TP protocol to commit the transaction.
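The local part of this interplay (RMs register with the TM, the application begins and commits through the TM, and the TM invokes every RM at commit) can be sketched as follows. The class and method names are hypothetical and are not the TX or XA interfaces; the prepare-then-commit-or-abort sequence is assumed from the prepare, commit, and abort calls the TM issues to its RMs, and the distributed part, logging, and recovery are all omitted.

```python
class ResourceManager:
    """Hypothetical resource manager that records the calls it receives."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.calls = []

    def prepare(self, transid):
        self.calls.append(("prepare", transid))
        return self.can_commit          # vote: yes or no

    def commit(self, transid):
        self.calls.append(("commit", transid))

    def abort(self, transid):
        self.calls.append(("abort", transid))


class TransactionManager:
    """Hypothetical local transaction manager."""

    def __init__(self):
        self.resource_managers = []
        self.next_transid = 1

    def register(self, resource_manager):
        self.resource_managers.append(resource_manager)

    def begin(self):
        transid = self.next_transid
        self.next_transid += 1
        return transid

    def commit(self, transid):
        # Ask every registered RM to prepare, then apply one common decision.
        votes = [rm.prepare(transid) for rm in self.resource_managers]
        decision = all(votes)
        for rm in self.resource_managers:
            if decision:
                rm.commit(transid)
            else:
                rm.abort(transid)
        return decision


tm = TransactionManager()
database, queue_rm = ResourceManager("database"), ResourceManager("queue")
tm.register(database)
tm.register(queue_rm)
transid = tm.begin()
print(tm.commit(transid), database.calls)
```

If any registered RM votes no in `prepare`, every RM receives `abort` instead of `commit`, so all participants reach the same outcome.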


Summary

Transaction processing systems comprise all parts of a system, software and hardware.

Building such a system requires considering end-to-end arguments at all levels of abstraction.

The performance of distributed TP systems is influenced by the hardware architecture (what is shared), by software issues (which protocols are used), and by configuration aspects (what limits scalability).

The multitude of those influences gives rise to a constant dilemma: Should one restrict the variety to a few (proprietary) components for better tuning and performance, or should one embrace all the standards for openness, at the risk of poor scalability and performance?