Compiling High-level Access Interfaces for Multi-site Software, Stanford University


Page 1: Compiling High-level Access Interfaces for Multi-site Software, Stanford University

November 1999

CHAIMS 1

CHAIMS: Mega-Programming Research

Compiling High-level Access Interfaces for Multi-site Software
Stanford University

Objective: Investigate revolutionary approaches to large-scale software composition.

Approach: Develop & validate a composition-only language.

Contributions and plans:
• Hardware and software platform independence.
• Asynchrony by splitting up the CALL statement.
• Performance optimization by invocation scheduling.
• Potential for multi-site dataflow optimization.

www-db.stanford.edu/CHAIMS

Page 2

Presentation
• Motivation and Objectives
  – changes in software production
  – basis for new visions and education
• Concepts of CHAIMS
  – CHAIMS language
  – CHAIMS architecture and composition process
  – Scheduling
  – Dataflow optimization
• Status, Plans, Conclusions

Page 3

Shift in Programming Tasks

[Figure: programming effort split between Coding and Integration, over a time axis marked 1970, 1990, 2010.]

Page 4

Hypotheses
• After the Y2K effort no large software applications will be written from the ground up. They will always be composed using existing legacy code.
• Composition requires functionalities not available in current mainstream programming languages.
• Large-scale systems enable and require different optimizations.
• Composition programmers will use different tools from base programmers (type A versus type B [Belady]).

Page 5

Languages & Interfaces
• Large languages intended to support coding and composition have not been successful
  – Algol 68
  – PL/1
  – Ada
  – CLOS
• Databases are being successfully composed, using client-server and mediator architectures
  – distribution -- exploit network capabilities
  – heterogeneity -- autonomy creates heterogeneity
  – simple schemas -- some human interpretation
  – service model -- public and commercial sources

Languages in use: C, C++, Fortran, Java

Page 6

Typical Scenario: Logistics
A general has to ship troops and/or equipment from San Diego NOSC to Washington DC:
– at different times, ship different kinds of materiel:
  » criteria for suitable means of transport differ
– not every airport is equally suited
– congestion, prices
– actual weather
– certain due or ready dates

Today: call different companies, look up information on the web, make reservations one by one
Tomorrow: system proposes shipping methods that take many conditions into account
  » hand-coded systems
  » composition of processes

Page 7

CHAIMS

[Figure: CHAIMS overview, with three labeled parts:]
• Megaprogram for composition, written by domain programmer
• CHAIMS system automates generation of client for distributed system
• Megamodules, provided by various megamodule providers

Page 8

Megamodules - Definition
Megamodules are large, autonomous, distributed, heterogeneous services or processes.
• large: computation intensive, data intensive, ongoing processes (monitoring of the real world, simulation services)
• distributed: remote, available to more than one client
• heterogeneous: a variety of languages and systems, accessible by various distribution protocols
• autonomous: maintenance and control over resources remain with the provider, differing ontologies ( ==> SKC)

Examples:
– logistics: “find best transportation from A to B”, reservation systems
– genomics: compose various analysis tools (now manual control)

Page 9

Architecture for today: Fat Clients

[Figure labels: Domain expert; Client computer; Control & Computation Services; I/O; Wrappers to resolve differences; Data Resources (a, b, c, d, e).]

Page 10

Service Architecture: Thin Clients

[Figure labels: Domain expert; Client workstation; IO module; Computation Services; MEGA modules (a, b, c, d, e); Data Resources; Sites, with site labels R, S, T, U, C.]

Page 11

Issues in Heavy-weight Services
Services are not free for a client:
• execution time of a service
• transfer time for data
• fees for services?

What the client applications need:
==> monitoring progress of a service
==> allow choice among equivalent services based on estimated waiting time and fees
==> high performance due to parallelism among distributed remote services
==> preliminary overview results, information to select level of accuracy / results size
==> effective optimization techniques

Page 12

Challenge in the new world: Empower Non-technical Domain Experts

Company providing services:
• domain experts in the domain of the service (e.g. weather)
• technical experts for programming for distribution protocols, setting up servers in a middleware system
• marketing experts

“Megaprogrammer”:
• is a domain expert in the domain that uses these services
• is not a technical expert in the middleware system or an experienced programmer
• wants to focus on the problem at hand (= results of using the megaprogram)
• e.g. scientist, logistics officer

Page 13

A purely compositional language?
Which languages did succeed?
– Algol, Ada: integrated composition and computation
– C, C++: focus on computation

Why a new language?
– complexity: not all facilities of a common language (compare to approach of Java)
– inhibiting traditional computational programming (compare C++ and Smalltalk concerning object-oriented programming)
– focus on issue of composition, parallelism by natural asynchrony, and novel optimizations

Page 14

CHAIMS “Logical” Architecture

[Figure: layered diagram with the following labels:]
• Customer
• Megaprogram clients (in CHAIMS)
• Network/Transport (DCE, CORBA, ...)
• Megamodules (wrapped or native)

Page 15

CHAIMS Physical Architecture

[Figure labels:]
• Megaprogram clients in CHAIMS
• Network: CORBA, Java RMI, DCE, DCOM, ...
• Megamodules (wrapped, native), each supporting setup, estimate, invoke, examine, extract, and terminate.

Page 16

CALL statements - growth & split

CALL gained functionality (progress in scale of computing):
• Copying
• Code sharing
• Parameterized computation
• Objects with overloaded method names
• Remote procedure calls to distributed modules
• Constrained (black box) access to encapsulated data

CHAIMS decomposes CALL functions: Setup, Estimate, Invoke, Examine, Extract

Page 17

CHAIMS Primitives
Pre-invocation:
  SETUP: set up the connection to a megamodule
  SET-, GETATTRIBUTES: set global parameters in a megamodule
  ESTIMATE: get estimate of execution time for optimization

Invocation and result gathering:
  INVOKE: start a specific method
  EXAMINE: test status of an invoked method
  EXTRACT: extract results from an invoked method

Termination:
  TERMINATE: terminate a method invocation or a connection to a megamodule

Control: WHILE, IF
Utility: GETPARAM: get default parameters

Page 18

Megaprogram Example: Overview

Megamodules and their methods:
• InputOutput: Input, Output
• RouteInfo: AllRoutes, CityPairList, ...
• AirGround: CostForGround, CostForAir, ...
• Routing: BestRoute, ...
• RouteOptimizer: Optimum, ...

General I/O-megamodule
  » Input function takes as parameter a default data structure containing names, types and default values for expected input

Travel information:
  » Computing all possible routes between two cities
  » Computing the air and ground cost for each leg given a list of city-pairs and data about the goods to be transported

Two megamodules that offer equivalent functions for calculating optimal routes
  » Optimum and BestRoute both calculate the optimum route given routes and costs
  » Global variables: Optimization can be done for cost or for time

Page 19

Megaprogram Example: Code

// Setup connections to megamodules.
io_mmh = SETUP ("InputOutput")
route_mmh = SETUP ("RouteInfo")
...

// Set global variables valid for all invocations
// of this client.
best2_mmh.SETATTRIBUTES (criterion = "cost")

// Get information from the megaprogram user
// about the goods to be transported and about
// the two desired cities.
cities_default = route_mmh.GETPARAM(Pair_of_Cities)
input_cities_ih = io_mmh.INVOKE ("input", cities_default)
WHILE (input_cities_ih.EXAMINE() != DONE) {}
cities = input_cities_ih.EXTRACT()
...

// Get all routes between the two cities.
route_ih = route_mmh.INVOKE ("AllRoutes", Pair_of_Cities = cities)
WHILE (route_ih.EXAMINE() != DONE) {}
routes = route_ih.EXTRACT()

// Get all city pairs in these routes.
// Calculate the costs of all the routes.
…

// Figure out the optimal megamodule for
// picking the best route.
IF (best1_mmh.ESTIMATE("Best_Route") < best2_mmh.ESTIMATE("Optimum") )
THEN {best_ih = best1_mmh.INVOKE ("Best_Route", Goods = info_goods, Pair_of_Cities = cities, List_of_Routes = routes, Cost_Ground = cost_list_ground, Cost_Air = cost_list_air)}
ELSE {best_ih = best2_mmh.INVOKE ("Optimum", Goods = info_goods, …

// Pick the best route and display the result.
...

// Terminate all invocations
best2_mmh.TERMINATE()
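The IF statement in the megaprogram compares two ESTIMATE results to choose between equivalent services. A minimal Python sketch of that selection follows. The `Estimate` record, the `fee_weight` parameter, and the numbers in the usage note are assumptions for illustration; the slides only say that the choice may rest on estimated waiting time and fees.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Estimate:
    """Hypothetical result of an ESTIMATE call."""
    seconds: float      # estimated execution time
    fee: float = 0.0    # estimated fee for the service


def pick_service(estimates, fee_weight=0.0):
    """Return the name of the service with the lowest weighted cost.

    estimates: dict mapping a service name to its Estimate.
    fee_weight: how many seconds one unit of fee is worth (assumed trade-off).
    """
    def cost(name):
        e = estimates[name]
        return e.seconds + fee_weight * e.fee

    return min(estimates, key=cost)
```

With estimates of 40 s / fee 5 for "Best_Route" and 30 s / fee 20 for "Optimum", the faster "Optimum" wins when fees are ignored, while a `fee_weight` of 1.0 tips the choice to "Best_Route".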

Page 20

Operation of one Megamodule

M handle = megamodule handle, I handle = invocation handle

• SETUP (returns M handle)
• SETATTRIBUTES provides context (M handle)
• ESTIMATE serves scheduling (M handle)
• INVOKE initiates remote computation (M handle, returns I handle)
• EXAMINE checks for completion (I handle)
• EXTRACT obtains results (I handle)
• TERMINATE I / ALL (I handle / M handle)

Page 21

CHAIMS Megaprogramming Language
Purely compositional:
– only variety of CALLs and control flow
– no primitives for input/output ==> instead use general and problem-specific I/O megamodules
– no primitives for arithmetic ==> use math megamodules

Splitting up CALL-statement:
– parallelism by asynchrony in sequential program
– novel possibilities for optimizations
– reduction of complexity of integrated invoke statements

• higher-level language (assembler => HLLs, HLLs => composition/megamodule paradigm)

Page 22

Architecture: Creation Process

[Figure labels: Megamodule Provider; writes native programs or wraps non-CHAIMS-compliant megamodules; Wrapper Templates; MEGA modules (a, b, c, d, e); adds information to the CHAIMS Repository.]

Page 23

Architecture: Composition Process

[Figure: the Megaprogrammer writes the Megaprogram (in CHAIMS language); the CHAIMS Compiler generates the CSRT (compiled megaprogram); two arrows labeled "information" connect to the CHAIMS Repository.]

Page 24 (June 1998)

Runtime Architecture

[Figure labels: CSRT (compiled megaprogram); Distribution System (CORBA, RMI…); MEGA modules (a, b, c, d, e); IO module(s).]

Page 25

Architecture: All
Active at different times

[Figure: combined view with the following labels:]
• Megamodule Provider wraps non-CHAIMS-compliant megamodules; Wrapper Templates; adds information to the CHAIMS Repository
• Megaprogrammer writes the Megaprogram (in CHAIMS language); "information" arrows from the CHAIMS Repository
• CHAIMS Compiler generates the CSRT (compiled megaprogram)
• Distribution System (CORBA, RMI…)
• MEGA modules (a, b, c, d, e)

Page 26

Multiple Transport Protocols

[Figure labels: Megaprogrammer; CHAIMS language; Megaprogram; CHAIMS protocols (CORBA-idl, DCE-idl, Java-class); Megamodules.]

The CHAIMS API defines the interface between megaprogrammer and megaprogram; the megaprogram is written in the CHAIMS language.

The CHAIMS protocols define the calls the megamodules have to understand. These protocols are slightly different for the different distribution protocols, and are defined by an IDL for CORBA, another IDL for DCE, and a Java class for RMI.

Page 27

Data objects: Blobs
Minimal typing within CHAIMS:
Integer, boolean only for control
All else is placed into Binary Large OBjects (Blobs), transparent to compiler.
Alternatives:
• ASN.1, with conversion routines
• XML

Example: Person_Information
  Name of Person     complex
    First Name       string    Joe
    Last Name        string    Smith
  Personal Data      complex
    Address
    Date of Birth    date      6/21/54
    Soc.Sec.No       string    345-34-345
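To make the idea of an opaque blob concrete, the following Python sketch encodes the Person_Information example as XML bytes and decodes it again. It uses the standard `xml.etree.ElementTree` module. The element names (with underscores) and the `type` attribute are assumptions chosen to mirror the example; the actual CHAIMS encoding is not shown on the slides. A megaprogram would pass the resulting bytes along without looking inside, and only the megamodules would call the decoder.

```python
import xml.etree.ElementTree as ET


def encode_person(first, last, birth, ssn) -> bytes:
    """Build the nested record and serialize it to opaque bytes (the blob)."""
    root = ET.Element("Person_Information")
    name = ET.SubElement(root, "Name_of_Person", {"type": "complex"})
    ET.SubElement(name, "First_Name", {"type": "string"}).text = first
    ET.SubElement(name, "Last_Name", {"type": "string"}).text = last
    data = ET.SubElement(root, "Personal_Data", {"type": "complex"})
    ET.SubElement(data, "Date_of_Birth", {"type": "date"}).text = birth
    ET.SubElement(data, "Soc_Sec_No", {"type": "string"}).text = ssn
    return ET.tostring(root, encoding="utf-8")


def decode_person(blob: bytes) -> dict:
    """Megamodule-side decoding: collect the leaf elements into a flat dict."""
    root = ET.fromstring(blob)
    return {el.tag: el.text for el in root.iter() if len(el) == 0}
```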

Page 28

Wrapper: CHAIMS Compliance
CHAIMS protocol - support all CHAIMS primitives
– if not native, achieved by wrapping legacy codes

• State management and asynchrony:
  » clientId (megamodule handle in CHAIMS language)
  » callId (invocation handle in CHAIMS language)
  » results must be stored for possible extraction(s) until termination of the invocation

• Data transformation:
  » BLOBs must be converted into the megamodule-specific data types (coding/decoding routines)
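The two wrapper duties named above, state management and data transformation, can be sketched in Python as follows. This is an illustration under stated assumptions, not the CHAIMS wrapper templates: the `Wrapper` class, the use of a thread pool for asynchrony, JSON as the blob encoding, and the `legacy_cost` function are all hypothetical.

```python
import itertools
import json
from concurrent.futures import ThreadPoolExecutor


class Wrapper:
    """Hypothetical wrapper that makes a synchronous legacy function asynchronous."""

    def __init__(self, legacy_fn):
        self._legacy_fn = legacy_fn
        self._pool = ThreadPoolExecutor(max_workers=4)
        self._next_id = itertools.count(1)
        self._calls = {}  # (clientId, callId) -> Future holding the stored result

    def setup(self) -> int:
        """Return a fresh clientId (the megamodule handle)."""
        return next(self._next_id)

    def invoke(self, client_id: int, blob: bytes) -> int:
        """Decode the blob, start the legacy code, return a callId at once."""
        kwargs = json.loads(blob.decode("utf-8"))
        call_id = next(self._next_id)
        self._calls[(client_id, call_id)] = self._pool.submit(self._legacy_fn, **kwargs)
        return call_id

    def examine(self, client_id: int, call_id: int) -> str:
        done = self._calls[(client_id, call_id)].done()
        return "DONE" if done else "NOT_DONE"

    def extract(self, client_id: int, call_id: int) -> bytes:
        """Encode the stored result as a blob; may be called repeatedly.

        In this sketch extract blocks until the result exists, whereas a
        megaprogram would normally poll examine first.
        """
        result = self._calls[(client_id, call_id)].result()
        return json.dumps(result).encode("utf-8")

    def terminate(self, client_id: int, call_id: int) -> None:
        """Forget the invocation; its result can no longer be extracted."""
        self._calls.pop((client_id, call_id)).cancel()


def legacy_cost(distance_km, rate_per_km):
    """Stand-in for wrapped legacy code."""
    return distance_km * rate_per_km
```

Keeping the `Future` in the dictionary until `terminate` is what allows more than one extraction of the same result.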

Page 29

Architecture: Three Views

Composition View (megaprogram)
- composition of megamodules
- directing of opaque data blobs

Transport View
- moving around data blobs and CHAIMS messages

Data View
- exchange of data
- interpretation of data
- in/between megamodules

Layers shown: CHAIMS Layer, Distribution Layer

Objective: Clear separation between composition of services, computation of data, and transport

Page 30

Scheduler: Decomposed Execution

[Figure: timelines for the execution of a remote method in three modes: synchronous; decomposed (no benefit for one module); asynchronous, with the waiting time marked "available for other methods". Legend: s = setup / set attributes, i = invoke a method, e = extract results.]

Page 31

Optimized Execution of Modules

[Figure: two timelines for modules M1, M2, M3 (>M1+M2), M4 (<M1+M2), and M5 with invocations i1 to i5 and extractions e1 to e5: non-optimized, and optimized by scheduler according to estimates. Legend: i = invoke a method, e = extract results; arrows = data dependencies; bars = execution of a module.]
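The gain from overlapping module executions can be estimated with a few lines of Python. The sketch compares running all modules one after another with starting each module as soon as its inputs are available. The durations and the dependency graph used in the usage note are assumptions: the slide only states that M3 runs longer than M1 plus M2 and M4 runs shorter, and the exact dependencies cannot be recovered from the figure.

```python
def sequential_time(durations):
    """Total time when each module is invoked only after the previous one finished."""
    return sum(durations.values())


def parallel_time(durations, deps):
    """Finish time when every module starts as soon as its dependencies are done.

    durations: dict module -> execution time
    deps: dict module -> iterable of modules whose results it needs
    """
    finish = {}

    def finish_time(module):
        if module not in finish:
            start = max((finish_time(d) for d in deps.get(module, ())), default=0)
            finish[module] = start + durations[module]
        return finish[module]

    return max(finish_time(m) for m in durations)
```

For example, with assumed durations M1=2, M2=3, M3=6, M4=4, M5=1, and assuming M2 needs M1 and M5 needs M2, M3, and M4, the sequential order takes 16 time units while the overlapped order takes 7, bounded by the long-running M3.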

Page 32

Decomposed Parallel Execution

[Figure: timeline for modules M1, M2, M3 (<M1+M2), M4 (<M1+M2), and M5, optimized by scheduler according to estimates. Legend: set up / set attributes; invoke a method; extract results.]

Long setup times occur, for instance, when a subset of a large database has to be loaded for a simple search, say Transatlantic flights for an optimal arrival.

Page 33

Decomposed Optimized Execution

[Figure: two timelines for modules M1, M2, M3 (>M1+M2), M4 (<M1+M2), and M5: the prior time, and the time when optimized by scheduler according to estimates. Legend: set up / set attributes; invoke a method; extract results.]

Page 34

Scheduling: Simple Example

The first number gives the order in the unscheduled megaprogram, the second the order in the automatically prescheduled megaprogram.

1  1  cost_ground_ih = cost_mmh.INVOKE ("Cost_for_Ground", List_of_City_Pairs = city_pairs, Goods = info_goods)
2  3  WHILE (cost_ground_ih.EXAMINE() != DONE) {}
      cost_list_ground = cost_ground_ih.EXTRACT()
3  2  cost_air_ih = cost_mmh.INVOKE ("Cost_for_Air", List_of_City_Pairs = city_pairs, Goods = info_goods)
4  4  WHILE (cost_air_ih.EXAMINE() != DONE) {}
      cost_list_air = cost_air_ih.EXTRACT()
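The effect of the two statement orders can be checked with a small simulated clock. In the Python sketch below, an INVOKE returns immediately and records when the remote method started, while an EXTRACT (standing for the WHILE ... EXAMINE loop plus EXTRACT) waits until that method has finished. The function name `elapsed` and the durations in the usage note are assumptions for illustration.

```python
def elapsed(program, durations):
    """Simulated completion time of a list of (operation, method) steps.

    program: sequence of ("INVOKE", name) and ("EXTRACT", name) pairs
    durations: dict name -> remote execution time of that method
    """
    clock = 0
    started = {}
    for op, name in program:
        if op == "INVOKE":
            started[name] = clock          # returns at once, work runs remotely
        elif op == "EXTRACT":
            # wait (if necessary) until the remote method has finished
            clock = max(clock, started[name] + durations[name])
        else:
            raise ValueError(f"unknown operation: {op}")
    return clock


UNSCHEDULED = [("INVOKE", "ground"), ("EXTRACT", "ground"),
               ("INVOKE", "air"), ("EXTRACT", "air")]
PRESCHEDULED = [("INVOKE", "ground"), ("INVOKE", "air"),
                ("EXTRACT", "ground"), ("EXTRACT", "air")]
```

With assumed durations of 5 for the ground cost and 3 for the air cost, the unscheduled order takes 8 time units and the prescheduled order 5, because the two remote computations overlap.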

Page 35 (June 1998)

Iterated Invocations

[Figure: iterated invocations M6.1 to M6.5 on two timelines: the prior time, and the time when setups are shared. Legend: set up / set attributes; invoke a method; extract results.]

Avoid repeated setups

Page 36 (June 1998)

& Repeated Extractions

[Figure: iterated invocations M6.1 to M6.5 on three timelines: prior time, distinct invocations; time, shared setup; time, shared setup & partial extract. Legend: set up / set attributes; invoke a method; extract results (partial for iterating, full for presentation).]

Avoid large extracts until satisfied

Page 37

Scheduling: Heuristics
INVOKE: call INVOKEs as soon as possible
  » may depend on other data
  » moving it outside of an if-block: depends on a cost function (ESTIMATE of this and following functions concerning execution time, dataflow and fees (resources))

EXTRACT: move EXTRACTs to where the result is actually needed
  » no sense in checking/waiting for results before they are needed
  » instead of waiting, poll all invocations and issue the next possible invocation as soon as data can be extracted

TERMINATE: terminate invocations that are no longer needed (save resources)
  » not every method invocation has an extract (e.g. print-like functions)
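The first heuristic, issuing INVOKEs as early as their data dependencies allow, can be sketched as a simple reordering pass. The Python code below is an illustration under assumptions, not the CHAIMS scheduler: the `Stmt` record and `hoist_invokes` function are hypothetical, every statement is assumed to write exactly one variable, and INVOKEs are never moved past each other or out of control structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stmt:
    op: str                          # "INVOKE" or "EXTRACT"
    target: str                      # variable this statement writes
    reads: frozenset = frozenset()   # variables this statement needs


def hoist_invokes(program):
    """Move each INVOKE up past preceding EXTRACTs it does not depend on."""
    result = []
    for stmt in program:
        pos = len(result)
        if stmt.op == "INVOKE":
            # Walk upward while the statement above is not an INVOKE
            # and does not produce data this INVOKE reads.
            while (pos > 0
                   and result[pos - 1].op != "INVOKE"
                   and result[pos - 1].target not in stmt.reads):
                pos -= 1
        result.insert(pos, stmt)
    return result
```

Applied to the four statements of the cost example (invoke ground, extract ground, invoke air, extract air), the pass yields invoke ground, invoke air, extract ground, extract air, while an INVOKE that reads an extracted cost list stays behind the EXTRACT that produces it.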

Page 38

Compiling into a Network

[Figure: two diagrams of a Mega Program and Modules A to F: the current CHAIMS system, and the system with distribution dataflow optimization. Legend: control flow, data flow.]

Page 39

CHAIMS Implementation
• Specify minimal language
  – minimal functions: CALLs, While, If *
  – minimal typing {boolean, integer, string, handles, object}
    » objects encapsulated using ASN.1 standard
  – type conversion in wrappers, service modules *
• Compiler for multiple protocols (one-at-a-time, mixed *)
• Wrapper generation for multiple protocols
• Native modules for I/O, simple mathematics *, other
• Implement API for CORBA, Java RMI, DCE usage
• Wrap / construct several programs for simple demos
• Schedule optimization *
• Demonstrate use in heterogeneous setting
• Define full-scale demonstration

* in process

Page 40

Conclusion: Research Questions

• Is a Megaprogramming language focusing only on composition feasible?

• Can it exploit on-going progress in client-server models and be protocol independent?

• Can natural parallelism for distributed services be effectively scheduled?

• Can high-level dataflow among distributed modules be optimized?

• Can CHAIMS express clearly a high-level distributed SW architecture?

• Can the approach affect SW process concepts and practice?

Page 41

Conclusion: Questions not addressed

• Will one Client/Server protocol subsume all others?
  – distributed optimization remains an issue
• Synchronization / Concurrency Control
  – autonomy of sources negates current concepts
  – if modules share databases, then database locks may span setup/terminate all for a megaprogram handle.
• Will software vendors consider moving to a service paradigm?
  – need CHAIMS demonstration for evaluation

Page 42

Integration Science

[Figure: Integration Science shown together with three fields:]
• Artificial Intelligence: knowledge mgmt, models, uncertainty
• Systems Engineering: analysis, documentation, costing
• Databases: access, storage, algebras

Page 43

CHAIMS