Source: uu.diva-portal.org/smash/get/diva2:913025/FULLTEXT01.pdf (2016-03-18)

IT 16 007
Examensarbete 30 hp, February 2016

Parallel Combinators for the Encore Programming Language

Daniel Sean McCain

Institutionen för informationsteknologi
Department of Information Technology


Faculty of Science and Technology, UTH unit
Visiting address: Ångströmlaboratoriet, Lägerhyddsvägen 1, Hus 4, Plan 0
Postal address: Box 536, 751 21 Uppsala
Phone: 018 – 471 30 03
Fax: 018 – 471 30 00
Web page: http://www.teknat.uu.se/student

Abstract

Parallel Combinators for the Encore Programming Language

Daniel Sean McCain

With the advent of the many-core architecture era, it will become increasingly important for programmers to utilize all of the computational power provided by the hardware in order to improve the performance of their programs. Traditionally, programmers had to rely on low-level, and possibly error-prone, constructs to ensure that parallel computations would be as efficient as possible. Since the parallel programming paradigm is still a maturing discipline, researchers have the opportunity to explore innovative solutions to build tools and languages that can easily exploit the computational cores in many-core architectures.

Encore is an object-oriented programming language aimed at many-core computing, developed as part of the EU FP7 UpScale project. The inclusion of parallel combinators, a powerful high-level abstraction that provides implicit parallelism, would further help Encore programmers parallelize their computations while minimizing errors. This thesis presents the theoretical framework that was built to provide Encore with parallel combinators, including the formalization of the core language and the implicit parallel tasks, as well as a proof of the soundness of this language extension and multiple suggestions for extending the core language.

The work presented in this document shows that parallel combinators can be expressed in a simple, yet powerful, core language. Although this work is theoretical in nature, it is a useful starting point for the implementation of parallel combinators not only in Encore, but also in any language with similar features.

Printed by: Reprocentralen ITC
IT 16 007
Examiner: Edith Ngai
Subject reviewer: Tobias Wrigstad
Supervisor: Dave Clarke


Contents

1 Introduction 1
  1.1 Using Abstraction to Increase Parallel Performance 1
    1.1.1 MapReduce 2
    1.1.2 Dataflow-Based Models 3
    1.1.3 Parallelism and Common Data Structures 3
    1.1.4 Orc 4
  1.2 Futures 5
  1.3 Encore 5
  1.4 Combinators 6
  1.5 Thesis Contribution 7

2 Extending Encore with Parallel Combinators 8
  2.1 Grammar 8
  2.2 Type System 9
  2.3 Operational Semantics 10
    2.3.1 Monoids in Encore 10
    2.3.2 Configurations 11
    2.3.3 Transition System 12
  2.4 Parallelism 13
  2.5 Type Soundness 13
  2.6 Deadlock-freedom 27

3 Additional Constructs 34
  3.1 Core Language Extensions 34
    3.1.1 Let Bindings 34
    3.1.2 Wildcards 34
    3.1.3 Sequencing 35
  3.2 Combining Combinators 35
    3.2.1 Prune Combinator 35
    3.2.2 Bind Combinator 36
    3.2.3 Imitating Maybe Types 36
  3.3 Syntactic Sugar for Parallel Combinators 36
  3.4 Exceptions 37

4 Conclusions and Future Work 39
  4.1 Formalizing Exceptions 39
  4.2 Avoiding Intermediate Parallel Structures 39
  4.3 Dealing with Unneeded Tasks and Futures 40
  4.4 (Efficiently) Implementing Parallel Combinators 40
  4.5 Creating Benchmarks for the Parallel Combinators 40
  4.6 Exploring More Combinators 41
  4.7 Integrating the Semantics with Active Objects 41

Bibliography 42


Chapter 1

Introduction

In the past, many performance optimizations focused on the CPU, such as increasing clock speed or cache size. Often, a CPU upgrade would improve the performance of previously written programs. As the speed of current CPUs stagnates, the focus of improving processing power has shifted to adding computational hardware, such as increasing the number of processor cores. This architectural change is making parallelism an important factor in high-performance software [1].

Many-core processors, which contain several tens, hundreds, or even thousands of processor cores [2], are starting to gain importance both because of their increasing computational power and their energy efficiency. In addition to these new processors, cloud computing allows us to have access to huge amounts of processing power by giving users access to multiple (virtual) computers. With computational hardware, including that in mainstream cellphones and computers, becoming more parallel and distributed, programmers must learn to take advantage of these capabilities if they wish to fully exploit these systems.

Developing parallel software is no easy task. Traditionally, the programmer had to be aware of the hardware details of the platform that would run their program in order to improve performance as much as possible. This forced them to manage both program correctness and communication issues, which may lead to subtle bugs that are hard to find and appear only rarely. While sequential programming is mature and well studied, the same cannot be said for the parallel programming paradigm. This gives researchers ample opportunities to explore innovative solutions for building tools and languages for many-core systems.

This chapter serves as an introduction to various central concepts and related work, which help motivate the work presented in this thesis.

1.1 Using Abstraction to Increase Parallel Performance

The term parallelism is often misunderstood, as it is not always accurately described. In part, this is because it can be used interchangeably with the term concurrency in some contexts. In order to differentiate these two words, I will quote the excellent definitions given by Simon Marlow [3]:

A parallel program is one that uses a multiplicity of computational hardware (e.g. multiple processor cores) in order to perform computation more quickly. Different parts of the computation are delegated to different processors that execute at the same time (in parallel), so that results may be delivered earlier than if the computation had been performed sequentially.

In contrast, concurrency is a program-structuring technique in which there are multiple threads of control. Notionally the threads of control execute “at the same time”; that is, the user sees their effects interleaved. Whether they actually execute at the same time or not is an implementation detail; a concurrent program can execute on a single processor through interleaved execution, or on multiple physical processors.

While parallel programming is concerned only with efficiency, concurrent programming is concerned with structuring a program that needs to interact with multiple independent external agents (for example the user, a database server, and some external clients). Concurrency allows such programs to be modular; the thread that interacts with the user is distinct from the thread that talks to the database. In the absence of concurrency, such programs have to be written with event loops and callbacks – indeed, event loops and callbacks are often used even when concurrency is available, because in many languages concurrency is either too expensive, or too difficult, to use.

It is also important to differentiate between data parallelism and task parallelism. The former refers to the distribution of data across several nodes, normally to perform the same computation on the data. The latter focuses on the distribution of tasks across several nodes, each of which may be performing different computations. These tasks may be performed by threads, for example. In order to achieve the best performance, it may be necessary to have a mix of both of these forms of parallelization.
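As a concrete illustration (using Python's standard concurrent.futures module; the module choice and the function names are mine, not the thesis's), the same thread pool can express both forms: data parallelism maps one function over many items, while task parallelism submits different computations to run at the same time.

```python
from concurrent.futures import ThreadPoolExecutor

def square(n):
    return n * n

def count_words(text):
    return len(text.split())

with ThreadPoolExecutor() as pool:
    # Data parallelism: the same function applied across many data items.
    squares = list(pool.map(square, [1, 2, 3, 4]))

    # Task parallelism: different computations running concurrently.
    f1 = pool.submit(square, 10)
    f2 = pool.submit(count_words, "parallel combinators for Encore")
    results = (f1.result(), f2.result())

print(squares)   # [1, 4, 9, 16]
print(results)   # (100, 4)
```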

Nvidia CUDA [4] and MPI [5] are two well-known parallel-computing platforms. In order to achieve efficient computations, these platforms use low-level constructs to send, coordinate, and synchronize data. Our interest is in maintaining good run times by increasing program parallelism while decreasing the use of low-level constructs, which may force the developer to understand the hardware details of a specific architecture. High levels of abstraction would help provide semantic guarantees, as well as simplify the understanding, debugging, and testing of a parallel program.

1.1.1 MapReduce

Google’s MapReduce [6] is a well-known programming model that uses a simple interface to automatically parallelize and distribute computations to various clusters. MapReduce creates an abstraction that gives its users the power to concentrate on their computation without having to manage many of the low-level tasks. It was inspired by the map and reduce primitives from functional programming languages. Users mostly only have to concentrate on writing both of these functions to express the computation. This takes away the concern about distributing and coordinating the parallel work, as well as dealing with machine, network, or data failures.
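A minimal, purely sequential sketch of this interface (the names map_fn and reduce_fn are illustrative, not Google's API) might look as follows in Python; a real deployment would distribute the two phases across a cluster:

```python
from collections import defaultdict
from itertools import chain

# User-supplied functions, in the spirit of the MapReduce interface:
# map_fn emits key/value pairs, reduce_fn combines values per key.
def map_fn(document):
    for word in document.split():
        yield (word, 1)

def reduce_fn(word, counts):
    return (word, sum(counts))

def map_reduce(documents):
    # The framework would parallelize and distribute these phases;
    # here they run sequentially for illustration.
    groups = defaultdict(list)
    for key, value in chain.from_iterable(map(map_fn, documents)):
        groups[key].append(value)
    return dict(reduce_fn(k, v) for k, v in groups.items())

print(map_reduce(["a b a", "b c"]))  # {'a': 2, 'b': 2, 'c': 1}
```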

The abstractions provided by MapReduce minimize the complexity of writing parallel programs, yet it may become difficult to manage the combination of multiple MapReduce operations. The FlumeJava [7] Java library tries to solve this by deferring the evaluation of the computation with the use of classes that represent immutable parallel collections. It constructs a dataflow graph of the execution plan, optimizes it, and executes it using the underlying parallel primitives, which include MapReduce, when the results are needed. The user needs no knowledge of the details of the parallel collections nor the operations’ implementation strategies to achieve efficient computations, as FlumeJava abstracts these details away.

1.1.2 Dataflow-Based Models

Another way to increase parallelism is through the use of dataflow models, which allow programmers to build parallel computations by only specifying the data and control dependencies. These models tend to be more explicit about data dependencies, and can be seen as graphs, where the nodes represent data and the edges represent the computations that occur between nodes. Concurrent Collections (CnC) [8] combines this model with the use of a specification language to facilitate the implementation of parallel programs. CnC provides a separation between domain experts, who might be non-professional programmers, and tuning experts, programmers who worry about efficiently implementing parallel programs. The domain expert provides the implementation of a computation in a separate programming language, and the tuning expert maps the CnC specification to a specific target architecture. Thus, CnC helps with the communication between domain and tuning experts, and has been implemented in a variety of programming languages, including C++, .NET, Java, and Haskell.

Another option is FlowPools [9], a deterministic dataflow collection abstraction for Scala. FlowPools can be seen as concurrent data structures which, through the use of a functional interface, easily compose to create more complex computations. These structures can also be generated dynamically at run time. Users may think about their computations as sequential programs, while the task of performing and optimizing the parallelization falls to the framework, which uses efficient non-blocking data structures to build complex parallel dataflow programs.

1.1.3 Parallelism and Common Data Structures

A problem that may appear with the previously mentioned solutions is that they excel at flat data parallelism, but not at the use of sparse data. Data Parallel Haskell (DPH) [10], influenced by the NESL programming language [11], extends the Glasgow Haskell Compiler (GHC) with the capability to perform efficient computations on irregular data. A new data type is provided to the user, the parallel array, as well as multiple functions that operate on this data type. The provided functions are named after their sequential counterparts, differing not only in their implementation but also in their name (the parallel versions have a suffixed ‘P’ in their function name). These familiar function names, in conjunction with the provided syntactic sugar, provide users of DPH with the means to easily write parallel programs that use sparse data. A similar solution was proposed by Bergstrom et al. [12], where an implementation of Parallel ML was extended to include data-only flattening, which tries to take advantage of modern Multiple-Instruction-Multiple-Data architectures.

Aleksandar Prokopec et al. extended the Scala collection library with a wide variety of parallel structures and operators [13], in the hope that implicit parallelism will facilitate writing reliable parallel code. Similar to DPH, this framework allows users to easily switch from sequential collections to their parallel counterparts. This is done by splitting a collection of data into subcollections which are sent to different processors, applying some operation on these subcollections, and recombining the data into a final collection. Adaptive scheduling, done by work stealing, is applied to utilize all available computation cores as efficiently as possible. Each processor maintains a task queue, and when a processor has no more elements in its queue, it can steal the oldest tasks from other queues. Unlike DPH, the function names do not differ from their sequential counterparts, nor does the framework specialize in sparse data.
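The split/apply/recombine strategy described above can be sketched in Python (a simplified illustration with a fixed chunking scheme; it does not model Scala's adaptive work stealing):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_map(fn, data, workers=4):
    # Split the collection into contiguous subcollections, one per worker.
    size = max(1, -(-len(data) // workers))  # ceiling division
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Apply the operation to each subcollection in parallel ...
        mapped = list(pool.map(lambda chunk: [fn(x) for x in chunk], chunks))
    # ... and recombine the partial results into a final collection.
    return [y for chunk in mapped for y in chunk]

print(parallel_map(lambda x: x * 2, [1, 2, 3, 4, 5]))  # [2, 4, 6, 8, 10]
```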

1.1.4 Orc

The Orc programming language [14] was the primary source of inspiration for the parallel combinators in this thesis. It expresses the control flow in concurrent programs with high-level primitive constructs, called combinators. The power of Orc comes from its implicitly distributed combinators, which are used to compose expressions. Expressions in Orc call external services, called sites, which may publish any number of results. If a site explicitly reports that it will not respond, it is considered halted. The following are the four combinators found in Orc, using F and G as expressions that contain site calls:

• The parallel combinator, F | G, executes two independent expressions in parallel and publishes the results of both.

• The sequential combinator, F >x> G, executes the first expression, F, where each of its published values is bound to the variable x and given to a second expression, G, which runs in parallel to, and independently from, the first one.

• The pruning combinator, F <x< G, executes two expressions, F and G, in parallel. The first expression suspends if it depends on the value of the variable x. Once the second expression publishes a value, its execution is terminated, x is assigned that value, and the first expression proceeds with its execution. This is Orc’s only mechanism to terminate parts of a computation.

• The otherwise combinator, F ; G, only executes the second expression if the first one halts and has published no values.
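A loose Python analogy of the four combinators may make their flavor concrete (heavily simplified by me: Orc sites, publications, and termination semantics are richer than plain function calls, and all names here are mine, not Orc's):

```python
import time
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait

pool = ThreadPoolExecutor(max_workers=4)

def parallel(f, g):
    # F | G: execute both expressions and publish the results of both.
    ff, gf = pool.submit(f), pool.submit(g)
    return [ff.result(), gf.result()]

def sequential(f, g):
    # F >x> G: each value published by F is bound to x and handed to a
    # fresh instance of G; the instances run independently.
    return [fut.result() for fut in [pool.submit(g, x) for x in f()]]

def pruning(f, gs):
    # F <x< G: run the alternatives of G in parallel, bind x to the first
    # published value, and drop the rest (real Orc terminates them).
    futs = [pool.submit(g) for g in gs]
    done, rest = wait(futs, return_when=FIRST_COMPLETED)
    for fut in rest:
        fut.cancel()
    return f(done.pop().result())

def otherwise(f, g):
    # F ; G: execute G only if F halts without publishing any values.
    vals = f()
    return vals if vals else g()

print(parallel(lambda: 1, lambda: 2))                    # [1, 2]
print(sequential(lambda: [1, 2, 3], lambda x: x * 10))   # [10, 20, 30]
print(pruning(lambda x: x + 1, [lambda: (time.sleep(0.2), 99)[1],
                                lambda: 41]))
print(otherwise(lambda: [], lambda: ["fallback"]))       # ['fallback']
```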

Although Orc is a minimal language that was designed only to test and experiment with its combinators [15], its approach to parallelism and its expressive power have served as a source of inspiration for library implementors in other programming languages [16].

1.2 Futures

Futures, first introduced by Baker and Hewitt [17], are computations that are evaluated asynchronously from the current computation flow. They provide programmers with the capability to start a computation in advance, usually in parallel, and request its value later.

Future variables are variables that will eventually contain the result of the future. These variables can be seen as containers that will at some point contain the value of a future computation. Once this computation writes the result to the future, it is said to be fulfilled. Depending on the semantics of the language that implements them, one might be able to pass these variables around, wait for their result, or use operators on them to create other futures.

A computation involving futures should require no more work than a sequential version of the same computation. Thus, futures are an attractive addition to a language that can run computations in parallel.

Futures were first featured in the MultiLisp programming language [18], which used the future keyword to explicitly annotate expressions that could be evaluated in parallel with the rest of the program. These expressions would immediately return a future, which could be used in other expressions. If the value of the future is needed in an operation, the computation is suspended until the value of the future is determined. These futures are considered implicit, as the programmer does not have to worry about whether a variable is a future or not. If futures are explicit, it is necessary to state when to retrieve the value of a future variable. A few examples of modern languages that support futures are C++ (starting from C++11 [19]), Racket [20], and Java [21].
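Python's concurrent.futures offers explicit futures in exactly this start-early, retrieve-later style; the blocking result() call plays the role of explicitly forcing the future's value (the helper name expensive is mine):

```python
from concurrent.futures import ThreadPoolExecutor

def expensive(n):
    # Stands in for a long-running computation.
    return sum(range(n))

with ThreadPoolExecutor() as pool:
    # Start the computation in advance; `fut` is an explicit future.
    fut = pool.submit(expensive, 1000)

    # ... other work can proceed here while the future runs ...

    # Explicitly retrieve the value: blocks until the future is fulfilled.
    print(fut.result())  # 499500
```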

Speculations are similar to futures in that they represent values which are retrieved at a later time. They are, unlike futures, delayed and not always evaluated. This makes speculations useful in lazy languages, as their values may not always be needed.

The values of futures, as opposed to those of speculations, are generally needed. In this thesis, we consider that this may not be the case.

1.3 Encore

Encore [22] is a programming language that is being developed as part of the EU FP7 UpScale project.¹ Encore is a general-purpose parallel object-oriented programming language that supports scalable performance. This is done by a combination of actor-based concurrency, asynchronous communication, and guaranteed race freedom. While the language contains many features, this section will focus on a few core language elements.

¹ The project is partly funded by the EU project FP7-612985 UpScale: From Inherent Concurrency to Massive Parallelism through Type-based Optimisations.

Parallelism in Encore is currently achieved by using active objects, which are objects that run in parallel. They have encapsulated fields and methods, which are accessed as in other object-oriented languages. Active objects always immediately return a future as the result of their method calls, which will eventually contain the result of the executed method. However, there is no guarantee that the method is run immediately. Even though active objects run in parallel, they each have a single thread of execution. This means that all method calls are internally serialized in an active object, and its methods are run in a sequential order.

Futures are first-class elements of Encore, which can be passed around as variables, blocked on in multiple ways, and chained to functions, similar to how callbacks work in other languages, such as JavaScript. They are, however, computationally expensive to block on, as doing so blocks the active object’s thread of control.

Encore also has passive objects, which return results synchronously, i.e. the calling objects block until the passive object returns its result. These correspond to the “regular” classes in other languages. These objects are used to pass data between active classes and help represent state.

The language also includes anonymous functions, which can be assigned to variables, used as arguments, and returned from methods. With the use of polymorphism and type inference, high-level functions can be easily written and used in Encore.

As Encore is still a young language (it has been worked on for about two years at the time of the writing of this thesis), its syntax and some of its semantics may change. Many of the core features of the language have already been chosen and may be represented as a core language [22]. This thesis uses and extends some of these core elements, while using little of the current Encore syntax.

1.4 Combinators

A combinator is, as described by John Hughes, “a function which builds program fragments from program fragments” [23]. In essence, combinators provide the means to build powerful functions by using and combining other, smaller functions. This gives developers the possibility to build tools without needing to specify how the program works in great detail. This is done by focusing on high-level operations such as higher-order functions and previously defined combinators, features that are commonly found in functional programming languages.

A few examples of common combinators are the map function (which applies a function to a collection of elements), the filter function (which, using a predicate function on a collection of elements, returns the collection of elements that satisfy the predicate), and the fold function (which reduces a collection of objects into a single object by using a supplied function). The use and composition of combinators have given rise to powerful libraries, such as QuickCheck [24], or parser combinators [25–28] such as Parsec [29]. Both of these examples, while originally written in Haskell, have been ported to, and inspired the creation of tools in, multiple other languages.
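The three combinators just named exist directly in Python (map, filter, and functools.reduce for fold), and compose as described:

```python
from functools import reduce

# map: apply a function to every element of a collection.
doubled = list(map(lambda x: x * 2, [1, 2, 3, 4]))

# filter: keep the elements that satisfy a predicate.
evens = list(filter(lambda x: x % 2 == 0, [1, 2, 3, 4]))

# fold: reduce a collection to a single value with a supplied function.
total = reduce(lambda acc, x: acc + x, [1, 2, 3, 4], 0)

# Combinators compose: the sum of the squares of the even numbers.
sum_sq_evens = reduce(
    lambda acc, x: acc + x,
    map(lambda x: x * x, filter(lambda x: x % 2 == 0, range(10))),
    0,
)

print(doubled, evens, total, sum_sq_evens)  # [2, 4, 6, 8] [2, 4] 10 120
```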

1.5 Thesis Contribution

Given the advantages of high-level parallel programming abstractions and combinators, such as those provided by the Orc programming language, it would seem beneficial to combine the expressive power of these features with futures. Including such combinators in Encore would give the language a simple, yet powerful tool which could be used to elegantly solve problems with implicit parallelization.

Implementing such an extension in a programming language may seem like a fairly simple task, yet there are multiple subtleties which must be taken into account in order to avoid issues during run time. Thus, it is useful to have a theoretical framework on which to base the implementation of such an extension, as it gives the possibility to prove important properties, such as deadlock-freedom.

This thesis presents the theoretical framework that was built to provide Encore with such an extension. This includes the formalization of the core language and the implicit parallel tasks, as well as a proof of the soundness of this language extension and multiple suggestions to extend the core language.

Although this thesis focuses on Encore, its content is applicable to any programming language that uses futures and wishes to combine their use with parallel combinators.


Chapter 2

Extending Encore with Parallel Combinators

Formalizing the semantics of the parallel combinator extension before adding it to Encore helps not only to prove properties about this new extension, but also to accurately describe how its different components should work and interact once implemented.

The formalization of Encore has already been expressed in the µEncore calculus [22], yet it is unnecessarily complex for expressing parallel combinators in the language. Although Encore is an object-oriented programming language, a paradigm for which there are already multiple simple formalizations [30, 31], there was no need to use any object-orientedness to express the combinators, due to their functional nature. The semantics of the combinators from the Orc programming language [32, 33] were not expressive enough to use in this thesis, as they do not clearly show where published values come from. Other formalizations for languages with futures already exist [34–36], yet they either have to be changed or can be simplified in order to obtain an elegant calculus that is closer to expressing the traits found in this extension and its implementation in Encore.

Thus, a small calculus was created in order to express the core language of the subset of Encore necessary to include the parallel combinator extension. This chapter introduces this core language, as well as its properties and their proofs.

2.1 Grammar

The following syntax represents a small subset of the Encore programming language,including the new syntax for the parallel combinators:

e ::=                      expressions
    | v                    values
    | e e                  function application
    | async e              asynchronous expression
    | get e                get expression
    | e ↝ e                future chain
    | [e]                  container singleton
    | e · e                container structure
    | {e}                  parallel singleton
    | e | e                parallel combinator
    | e ≫ e                sequence combinator
    | e >< e               otherwise combinator
    | liftf e              lift future combinator
    | extract e            extract combinator
    | peek e               peek combinator
    | join e               join combinator

v ::=                      values
    | c                    basic language constants
    | f                    future variables
    | x                    variables
    | λx.e                 lambda-abstractions
    | []                   empty container structure
    | [v]                  container value singleton
    | v · v                container structure
    | {}                   empty parallel structure
    | {v}                  parallel value singleton
    | v | v                parallel structure

It is not necessary to include the full syntax of Encore to express the extension; the expressions already included in Encore that are of particular interest for the parallel combinators are the asynchronous expression, the get expression, and future chaining. Asynchronous expressions are those that should be run in parallel, and return futures as a result. Get expressions block on futures, waiting for their results to be available. Future chains run an expression which uses the value of a future, as soon as it is fulfilled.

Containers are not an integral part of Encore, yet they are an important structure used in the parallel combinator extension, as they represent an abstraction for data structures in Encore. Examples of containers are lists, sets, and multisets.

The parallel combinator extension is composed of the parallel singleton, a data structure that contains parallel expressions, as well as the different functions that operate on these expressions. The elements that form part of this new language extension are all of the expressions from and below the parallel singleton in the expression grammar, and all of the values from and below the empty parallel structure in the value grammar.
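As an informal model of the parallel-structure values ({}, {v}, and v | v), one can sketch them as an algebraic datatype. This Python sketch (all names mine) captures only the shape, a map-like sequencing over it, and a collapse into a list; it does not model Encore's implicit parallelism or evaluation rules:

```python
from dataclasses import dataclass
from typing import Any

# Values of the parallel structure: {} , {v} , v | v
@dataclass
class Empty:          # {} : the empty parallel structure
    pass

@dataclass
class Singleton:      # {v} : the parallel value singleton
    value: Any

@dataclass
class Par:            # v | v : two structures composed in parallel
    left: Any
    right: Any

def seq(p, fn):
    # Sequence-combinator shape: apply fn to every value in the structure.
    if isinstance(p, Empty):
        return Empty()
    if isinstance(p, Singleton):
        return Singleton(fn(p.value))
    return Par(seq(p.left, fn), seq(p.right, fn))

def extract(p):
    # Extract-combinator shape: collapse the structure into a plain list.
    if isinstance(p, Empty):
        return []
    if isinstance(p, Singleton):
        return [p.value]
    return extract(p.left) + extract(p.right)

p = Par(Singleton(1), Par(Singleton(2), Empty()))
print(extract(seq(p, lambda x: x * 10)))  # [10, 20]
```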

2.2 Type System

Type systems provide guarantees regarding the prevention of execution errors during a program’s run time. The language used for the subset of Encore with parallel expressions is sufficiently expressed with a first-order type system. While these systems lack the power of type parametrization and type abstraction, they provide the higher-order functions needed in this language extension, thus significantly reducing the complexity of type expressions in the parallel combinators.

The set of included types is generated by the following grammar:

τ ::= K | Fut τ | Ct τ | Par τ | τ → τ

K represents basic types (such as integers or booleans). Fut τ represents the type of future variables of type τ. Ct τ represents the type of containers, which are structures composed of elements of type τ. Par τ represents the type of parallel structures with elements of type τ. τ → τ represents the function type, and is right-associative (i.e. the expression τ1 → τ2 → τ3 stands for τ1 → (τ2 → τ3)).

The typing context Γ, a sequence of variables and their types, will also include future variables and their types. If the context is empty (written as ∅), it will be omitted by writing ⊢ e : τ, representing the expression e with type τ under the empty set of assumptions. The typing context has the following grammar:

Γ ::= ∅ | Γ, x:τ | Γ, f:Fut τ

Judgments are assertions about the expressions in the language. The judgments that are used are Γ ⊢ ◊, which states that Γ is a well-formed environment, and Γ ⊢ e : τ, which states that e is a well-formed term of type τ in Γ. The validity of these judgments is defined by the type rules in figure 2.1.

2.3 Operational Semantics

Formal semantics can be used to rigorously specify the meaning of a programming language, helping to form the basis for analyzing and implementing it by leaving no ambiguities in its interpretation. Operational semantics is concerned with how computations take effect, concisely describing the evaluation steps that take place. This chapter introduces the Encore language semantics with the parallel combinators extension, as well as the use of the monoid and task abstraction.

2.3.1 Monoids in Encore

Containers and parallel combinators in this thesis are monoids, an abstraction that significantly simplifies their syntax and representation. The use of monoids in the Encore syntax helps to further disconnect the language specification from the language implementation; the implementation should follow the specification, not the other way around.

A monoid is an algebraic structure (S, ◦, eS) that has an associative binary operation, ◦, and an identity element eS, and which satisfies the following properties:

Closure: ∀a, b ∈ S : a ◦ b ∈ S

Associativity: ∀a, b, c ∈ S : (a ◦ b) ◦ c = a ◦ (b ◦ c)

Identity element: ∃e ∈ S : ∀a ∈ S : e ◦ a = a ◦ e = a

If the order of the operations on the monoid is not important, the monoid is considered a commutative monoid. Such monoids also satisfy the commutativity axiom:

Commutativity: ∀a, b ∈ S : a ◦ b = b ◦ a

A monoid homomorphism φ is a mapping from one monoid (S, ◦, eS) to another (T, ∗, eT) such that for all elements a and b in S, φ(a ◦ b) = φ(a) ∗ φ(b) and φ(eS) = eT.
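These definitions can be checked mechanically on small examples. The sketch below, which is ours and not part of the thesis artefacts, takes Python lists under concatenation as (S, ◦, eS), the natural numbers under addition as (T, ∗, eT), and φ = len as the homomorphism between them.

```python
# Check the monoid axioms and the homomorphism property on finite samples.

def check_monoid(op, e, samples):
    """True if op and e satisfy associativity and identity on the samples."""
    assoc = all(op(op(a, b), c) == op(a, op(b, c))
                for a in samples for b in samples for c in samples)
    ident = all(op(e, a) == a == op(a, e) for a in samples)
    return assoc and ident

concat = lambda a, b: a + b   # list concatenation
plus = lambda a, b: a + b     # natural-number addition

lists = [[], [1], [1, 2], [3]]
nats = [0, 1, 2, 3]

assert check_monoid(concat, [], lists)   # (lists, ++, []) is a monoid
assert check_monoid(plus, 0, nats)       # (N, +, 0) is a monoid

# φ = len is a monoid homomorphism: len(a ++ b) = len(a) + len(b), len([]) = 0
assert all(len(a + b) == len(a) + len(b) for a in lists for b in lists)
assert len([]) == 0
```

Checking on samples does not prove the laws in general, but it catches violations quickly, which is the same role the monoid axioms play in keeping the container and parallel-structure syntax well-behaved.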

The expression [e1] ++ · · · ++ [en] represents a (possibly empty) container, as described in section 2.1. The symbol ++ is the associative binary operation of the monoid, and the identity element is [].

The expression {e1} | · · · | {en} represents a (possibly empty) structure of (possibly repeating) parallel expressions e and values v. This expression is also a monoid, with | as the associative binary operation and {} as the identity element.

2.3.2 Configurations

A configuration config is a multiset of futures, tasks and dependent tasks (chains). The union operator on configurations is denoted by whitespace, and the empty configuration by ε. The configuration grammar is the following:

config ::= ε | (fut f) | (fut f v) | (task e f) | (chain f1 e f2) | config config

A future variable f refers to a memory location where the return value of an asynchronous execution can be retrieved.

The following objects are used in the configuration:

• (fut f): the future f contains no value yet.

• (fut f v): the future f contains the value v.

• (task e f): the value of the expression e will be stored in the future f .

• (chain f1 e f2): the expression e depends on the future variable f1; once f1 is fulfilled, the result of e will be deposited in the future f2.

Tasks are the main configurations used to produce calculations. In order for the first transitions in the system to occur, there must be a “main” task, which is expressed as (task e0 fmain). The future variable fmain will contain the final value of the execution of the initial expression e0. Thus, the initial environment and configuration of the system is fmain ⊢ (task e0 fmain) (fut fmain).

The term complete system configuration refers to the collection of configurations that exists in each execution step starting from the initial configuration. In a complete system configuration, each subconfiguration (fut f) either has a corresponding (task e f) or a corresponding (chain f′ e f), and vice versa.
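The complete-system property above is a simple correspondence, and can be sketched as a check over a configuration multiset. The dataclasses and the `is_complete` function below are an invented encoding for illustration, not Encore's runtime representation.

```python
from dataclasses import dataclass

# Configurations as plain records; a config is just a list (multiset) of them.

@dataclass(frozen=True)
class Fut:            # (fut f): future with no value yet
    f: str

@dataclass(frozen=True)
class FutVal:         # (fut f v): fulfilled future
    f: str
    v: object

@dataclass(frozen=True)
class Task:           # (task e f): the value of e will be stored in f
    e: str
    f: str

@dataclass(frozen=True)
class Chain:          # (chain f1 e f2): e waits on f1, its result goes to f2
    f1: str
    e: str
    f2: str

def is_complete(config):
    """Each unfulfilled (fut f) has a producer (task or chain), and vice versa."""
    unfulfilled = {c.f for c in config if isinstance(c, Fut)}
    producers = ({c.f for c in config if isinstance(c, Task)} |
                 {c.f2 for c in config if isinstance(c, Chain)})
    return unfulfilled == producers

# The initial configuration (task e0 fmain) (fut fmain) is complete:
assert is_complete([Task("e0", "fmain"), Fut("fmain")])
# A dangling future with no producer is not:
assert not is_complete([Fut("f1")])
```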


Lemma 2.3.1 (Complete system configuration reduction rule property). If the configuration config has the property of being a complete system configuration, this property will be preserved by all the reduction rules.

Proof. By induction on a derivation of config.

Terminal configurations are configurations in their terminal (i.e. final) form. They are composed of future configurations with attached values.

2.3.3 Transition System

This transition system uses reduction semantics [37], a variation of structural operational semantics [38] that separates an evaluation step from the position in an expression where it takes place. This is done by decomposing an expression into two components: the redex, which is where the evaluation takes place, and a reduction context C, which contains a hole, [], where the evaluated redex is placed. Once the expression is decomposed, the redex is reduced to a contractum using the corresponding contraction rule, which is then “plugged into” the hole of the context. C[e] represents the expression obtained by replacing the hole of C with the expression e. Thus, if the expression e in C[e] reduces to e′, the resulting expression after this process is C[e′]. The evaluation and task contexts of the semantics of this language extension are shown in figure 2.4.

Structural congruence equivalences allow us to compare configurations when their order is not important. The congruence equivalences, shown in figure 2.5, were inspired by those of the π-calculus [39, 40].

The equivalence rules for the parallel combinators, shown in figure 2.6, permit the system to optimize the operations that will be performed on parallel structures. PeekM expresses that a peek operation on a parallel structure can be seen as the result of performing a peek on the elements of the parallel structure, and another peek on these “subpeeks”. PeekS states that performing two peek operations on an expression can be simplified into a single peek operation. LiftFutGet shows a resemblance between a lifted future and a singleton parallel expression where a get operation is performed on the future. This last rule is of particular use to the extract combinator.

Some of the parallel combinators can be defined recursively. Those that have this specification are defined in figure 2.7. The sequence combinator applies a function to the elements of a parallel structure. Although it is equivalent to the map function in other programming languages, this particular syntax has been chosen due to its resemblance to both the sequential combinator in Orc and to the chain operator. The extract combinator transforms a parallel structure into a container. The join combinator flattens a nested parallel structure.
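The behaviour of these three combinators can be sketched by modelling a fully evaluated parallel structure {v1} | · · · | {vn} as a plain Python list of values. This encoding is ours, purely for illustration; the names mirror the combinators of figure 2.7 but elide futures and evaluation order.

```python
# Sequence, extract and join over a list model of parallel structures.

def sequence(p, fn):
    """Sequence combinator: apply fn to every element (a map)."""
    return [fn(v) for v in p]

def extract(p):
    """Extract combinator: turn a parallel structure into a container."""
    return list(p)

def join(p):
    """Join combinator: flatten a parallel structure of parallel structures."""
    return [v for inner in p for v in inner]

assert sequence([1, 2, 3], lambda x: x * 2) == [2, 4, 6]
assert extract([1, 2]) == [1, 2]
assert join([[1], [], [2, 3]]) == [1, 2, 3]
```

In the real semantics these combinators recurse over the {} / {e} / e1 | e2 structure, so the list comprehensions above correspond to the empty, singleton and compound cases of the recursive definitions.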

The full list of the operational semantic rules is shown in figure 2.8. Some of these rules use the fresh x predicate, which asserts that the variable x is globally unique. The rules can be divided into three categories: Encore-related, future-handling and parallel combinators.


The Encore-related rules are those that are already found, in one form or another, in the Encore programming language. β-red represents a β-reduction, where all occurrences of the variable x in the expression e are substituted by the value v. The Async rule creates a new future computation, where the expression e is evaluated in a task and a future variable f is immediately returned. The Get rule retrieves the value of a future variable once it has been fulfilled. The Chain rule supplies a future variable and an expression, and immediately returns a new future variable which, once fulfilled, will contain the result of computing the provided expression with the value of the supplied future variable.

Future-handling rules are not performed in contexts, but on configurations. The ChainVal rule starts the computation of an expression once the future that it depends on has been fulfilled. The FutVal rule creates a future-with-value configuration once its related task has fully evaluated its expression.

Parallel combinator rules describe the combinators in this new language extension. The SeqLF rule is a special case of the sequence operator, where a chain is applied to a lifted future. The OtherV rule eliminates the second expression of an otherwise combinator once the first expression contains a value. The OtherH rule eliminates the first expression of an otherwise combinator once it has been evaluated to an empty parallel structure. PeekV non-deterministically chooses a value from a parallel structure as the value of a peek operator. PeekH supplies an empty parallel structure if its expression is already one. The Struct rule states that structurally congruent terms have the same reductions.
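The otherwise and peek rules can likewise be sketched over the list model of parallel structures. This is an illustrative encoding of the OtherV/OtherH and PeekV/PeekH behaviour only, with `random.choice` standing in for the non-deterministic choice of PeekV.

```python
import random

# Otherwise and peek over a list model of (fully evaluated) parallel structures.

def otherwise(p1, p2):
    """p1 >< p2: keep p1 if it produced values (OtherV), else fall back to p2 (OtherH)."""
    return p1 if p1 else p2

def peek(p):
    """Stay empty if p is empty (PeekH); otherwise pick one value (PeekV)."""
    return [] if not p else [random.choice(p)]

assert otherwise([1, 2], [9]) == [1, 2]   # OtherV: first structure has values
assert otherwise([], [9]) == [9]          # OtherH: first structure is empty
assert peek([]) == []                     # PeekH
assert peek([7, 8])[0] in (7, 8)          # PeekV: one of the values, chosen freely
```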

2.4 Parallelism

The previous section defines the reduction rules that may be used in a sequential version of the language. Certain operations, such as async, would have to be defined in the evaluation context in order for the sequential version to be deterministic, yet this would be an undesirable property in this extension, as we wish for it to be as parallel as possible.

In order to extend Encore with parallelism, the rules must be able to operate on multiple configurations and parallel expressions at a time. The parallel reduction rules (figure 2.9) extend the rules defined in section 2.3 with this trait. OneConfig reduces a single configuration, and ShareConfig reduces multiple configurations at the same time while sharing a common configuration. It is important to note that a configuration that does not reduce may also be an empty configuration. The last rule, ManyPar, reduces multiple elements in a parallel expression in the same step.

2.5 Type Soundness

Type soundness is a property that guarantees that a certain level of coherence exists in a programming language: well-typed programs are free from certain types of behaviors during their execution, and thus there is always some transition in the program state (i.e. the program will never be stuck). Type systems can statically check that the language primitives are being correctly used. This includes, but is not limited to, checking that the correct number of arguments is provided to functions, that the types of the inputs and outputs of functions are correct, that the called functions exist, etc.

The approach used in this thesis to prove soundness was inspired by the method proposed by Wright and Felleisen [41], where soundness is divided into two theorems: progress and preservation. This thesis also relies on the canonical forms concept introduced by Martin-Löf [42]. The progress theorem states that a well-typed term is not stuck: it is either a value or it can take an evaluation step. The preservation theorem states that if a well-typed term takes an evaluation step, then the resulting term is also well-typed.

It is useful to have a series of lemmas which ease the proof of the theorems. The first of these lemmas are the inversion lemmas, one for expressions and another for configurations. The former shows how a proof of some expression may have been generated. The latter does the same for some configuration.

Lemma 2.5.1 (Inversion of the typing relation of expressions). We distinguish all the possible cases that can arise, one case per type rule:

1. If Γ ⊢ c : R, then R = τ for some τ ∈ Basic.
2. If Γ ⊢ f : R, then R = Fut τ for some τ such that f:Fut τ ∈ Γ.
3. If Γ ⊢ x : R, then x:R ∈ Γ.
4. If Γ ⊢ λx.e : R, then R = τ → R′ for some R′ with Γ, x:τ ⊢ e : R′.
5. If Γ ⊢ e1 e2 : R, then there is some τ such that Γ ⊢ e1 : τ → R and Γ ⊢ e2 : τ.
6. If Γ ⊢ async e : R, then there is some τ such that R = Fut τ and Γ ⊢ e : τ.
7. If Γ ⊢ get e : R, then there is some τ such that R = τ and Γ ⊢ e : Fut τ.
8. If Γ ⊢ e1 ~~> e2 : R, then there are some τ and τ′ such that R = Fut τ, Γ ⊢ e1 : Fut τ′ and Γ ⊢ e2 : τ′ → τ.
9. If Γ ⊢ [] : R, then there is some τ such that R = Ct τ.
10. If Γ ⊢ [e] : R, then there is some τ such that R = Ct τ and Γ ⊢ e : τ.
11. If Γ ⊢ e1 ++ e2 : R, then there is some τ such that R = Ct τ, Γ ⊢ e1 : R and Γ ⊢ e2 : R.
12. If Γ ⊢ {} : R, then there is some τ such that R = Par τ.
13. If Γ ⊢ {e} : R, then there is some τ such that R = Par τ and Γ ⊢ e : τ.
14. If Γ ⊢ e1 | e2 : R, then there is some τ such that R = Par τ, Γ ⊢ e1 : R and Γ ⊢ e2 : R.
15. If Γ ⊢ e1 >> e2 : R, then there are some τ and τ′ such that R = Par τ, Γ ⊢ e1 : Par τ′ and Γ ⊢ e2 : τ′ → τ.
16. If Γ ⊢ e1 >< e2 : R, then there is some τ such that R = Par τ, Γ ⊢ e1 : R and Γ ⊢ e2 : R.
17. If Γ ⊢ liftf e : R, then there is some τ such that R = Par τ and Γ ⊢ e : Fut τ.
18. If Γ ⊢ extract e : R, then there is some τ such that R = Ct τ and Γ ⊢ e : Par τ.
19. If Γ ⊢ peek e : R, then there is some τ such that R = Par τ and Γ ⊢ e : R.
20. If Γ ⊢ join e : R, then there is some τ such that R = Par τ and Γ ⊢ e : Par R.

Proof. By induction on the derivation of the typing judgment Γ ⊢ e : τ.


Lemma 2.5.2 (Inversion of the typing relation of configurations).
1. If Γ ⊢ (fut f) ok and Γ ⊢ f : R, then there is some τ such that R = Fut τ and f:R ∈ Γ.
2. If Γ ⊢ (fut f v) ok, Γ ⊢ f : R and Γ ⊢ v : R′, then there is some τ such that R = Fut τ, R′ = τ and f:R ∈ Γ.
3. If Γ ⊢ (task e f) ok, Γ ⊢ e : R and Γ ⊢ f : R′, then there is some τ such that R = τ, R′ = Fut τ and f:R′ ∈ Γ.
4. If Γ ⊢ (chain f1 e f2) ok, Γ ⊢ f1 : R, Γ ⊢ e : R′ and Γ ⊢ f2 : R′′, then there are some τ and τ′ such that R = Fut τ, R′ = τ → τ′, R′′ = Fut τ′, f1:R ∈ Γ and f2:R′′ ∈ Γ.

Proof. By induction on the derivation of the typing judgment Γ ⊢ config ok.

A canonical form is an expression that has been fully evaluated to a value [42]. The canonical forms lemma gives us information about the possible shapes of a value.

Lemma 2.5.3 (Canonical forms). If ⊢ v : τ, then
1. if v is a value of type K, then v is a constant of the correct form.
2. if v is a value of type τ → τ′, then v has the form λx.e.
3. if v is a value of type Fut τ, then v has the form f.
4. if v is a value of type Ct τ, then either v is an empty container structure with the form [], a container structure with a value [v′], where v′ has type τ, or a container structure with the form v1 ++ v2, where v1 and v2 have type Ct τ.
5. if v is a value of type Par τ, then either v is an empty parallel structure with the form {}, a parallel structure with a value {v′}, where v′ has type τ, or a parallel structure with the form v1 | v2, where v1 and v2 have type Par τ.

Proof. By induction on a derivation of ⊢ v : τ.
1. If τ is K, then the only rule which lets us give a value of this type is Val c.
2. If τ is τ → τ′, then the only rule that can give us a value of this type is Val Fun.
3. If τ is Fut τ, then by inspection of the typing rules, only Val f gives a value of this type.
4. If τ is Ct τ, then by inspection of the typing rules, there are three possible cases for v:
(a) If v = [], then the only rule that gives us a value of this type is Val EmptyContainer.
(b) If v = [v′], then the only rule that gives us a value of this type is Val Container.
(c) If v = v1 ++ v2, then the only rule that gives us a value of this type is Val ContainerAppend.
5. If τ is Par τ, then by inspection of the typing rules, there are three possible cases for v:
(a) If v = {}, then the only rule that gives us a value of this type is Val EmptyPar.
(b) If v = {v′}, then the only rule that gives us a value of this type is Val Par.
(c) If v = v1 | v2, then the only rule that gives us a value of this type is ParT Par.


A future f depends on another future f′ if the configuration that leaves the value in f, such as (task e f) or (chain f′′ e f), contains f′ in the expression to evaluate, or is waiting for the result of f′ to be available. It is also possible that f may depend on a future variable f′ if f′ is in the value expression of f's future configuration, (fut f e). These dependencies are expressed as f ≺ f′, and, using the depset function (figure 2.10), are defined as one of three possibilities, depending on whether the future f is associated to a task, a chain or a future-with-value configuration.

• Associated to a task configuration: if (task e f) and f′ ∈ depset(e), then f ≺ f′.

• Associated to a chain configuration: if (chain f′′ e f) and f′ = f′′ ∨ f′ ∈ depset(e), then f ≺ f′.

• Associated to a future configuration with a value: if (fut f e) and f′ ∈ depset(e), then f ≺ f′.

In order for the expression of a task configuration (task e f) to advance a step in the progress theorem, it may require the result of some future variable f′, which may not yet have a value. These dependencies form the basis of dependency trees, which are formed by walking through all the possible future dependency paths. The leaves of any dependency tree have no dependencies, and their associated future variable will either have an attached value, or an associated task configuration with no future dependencies. Such a configuration is able to progress a step because it does not depend on the value of any future variables. The configuration config_dep_ex is shown as an example to help visualize a dependency tree:

config_dep_ex ≡ (fut f1 42) (fut f2 f1)
                (fut f3) (task ((get f1) + (get (get f2))) f3)
                (fut f4) (task (f3 ~~> λx.x) f4)

This configuration has the dependency tree shown in figure 2.11. It is easy to see that the different dependency paths of this tree are f4 ≺ f3 ≺ f2 ≺ f1 and f4 ≺ f3 ≺ f1. If f1 had been a future without a value, and with no future dependencies, its associated task would have been able to take a reduction step using one of the rules in section 2.3.3. This is because f1 would not be expecting the value of any future variables, and would have no reason not to be able to take a reduction step.
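The dependency paths of the example can be reproduced mechanically. In the sketch below (our encoding, not part of the thesis), each future maps to the futures it directly depends on, read off the configuration above, and `paths` walks every dependency path down to the leaves.

```python
# Direct dependencies (≺) of each future in the example configuration.
deps = {
    "f1": [],             # (fut f1 42): a value, no dependencies
    "f2": ["f1"],         # (fut f2 f1): its value expression mentions f1
    "f3": ["f2", "f1"],   # its task reads get f1 and get (get f2)
    "f4": ["f3"],         # its task is chained on f3
}

def paths(f):
    """Enumerate all dependency paths from future f down to the leaves."""
    if not deps[f]:
        return [[f]]
    return [[f] + rest for g in deps[f] for rest in paths(g)]

# The two paths named in the text: f4 ≺ f3 ≺ f2 ≺ f1 and f4 ≺ f3 ≺ f1.
assert sorted(paths("f4")) == [["f4", "f3", "f1"], ["f4", "f3", "f2", "f1"]]
```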

As tasks are the configurations that calculate the values which are placed into their associated futures, the only semantic rule that may not progress without an attached value for some future variable is the Get rule. This rule requires special care, as problems in the progress theorem may arise if the rule is not treated correctly. If there are no cycles in the dependency tree, there must be at least one future variable in the tree with an associated configuration that can reduce, using the progress theorem.

Theorem 2.5.1 (Progress). If we have a complete system configuration config such that Γ ⊢ config : ok, then either config is a terminal configuration, or there exists some configuration config′ such that config −→ config′.


Proof. By induction on a derivation of config.

Case C-Fut: Here (fut f) ∈ config. By the definition of a complete system configuration, there exists some task configuration (task e f) or some chain configuration (chain f′ e f) that is related to (fut f). It is thus possible to apply the progress theorem to this related configuration.

Case C-Task: Here (task e f) ∈ config. By the definition of a complete system configuration, there exists some future configuration (fut f) that is related to (task e f). If the expression e is a value, the appropriate C-Config case of this theorem applies to both the task and its related future configurations. If the expression e is not a value, we have different subcases to consider.

Subcase Val c: Here, e ≡ c. Since e is a basic value, the appropriate C-Config case of this theorem applies to both the task and its related future configurations.

Subcase Val f: Here, e ≡ f′. Since e is a future variable, the appropriate C-Config case of this theorem applies to both the task and its related future configurations.

Subcase Val x: Here, e ≡ x. Impossible, as there are no variables in Γ.

Subcase Val Fun: Here, e ≡ λx.e′. Since e is an abstraction, the appropriate C-Config case of this theorem applies to both the task and its related future configurations.

Subcase Val App: Here, e ≡ e1 e2, with ⊢ e1 : τ′ → τ and ⊢ e2 : τ′, by inversion. If e1 is not a value, then by evaluation context, e1 e2 −→ e1′ e2. If e1 is a value and e2 is not, then by evaluation context, e1 e2 −→ e1 e2′. If both expressions are values, then the canonical forms lemma tells us that e1 has the form λx.e′. Thus, the β-red rule applies to the task configuration.

Subcase Val Async: Here, e ≡ async e′, with ⊢ e′ : τ, by inversion. The Async rule applies to the task configuration.

Subcase Val Get: Here, e ≡ get e′, with ⊢ e′ : Fut τ, by inversion. If e′ is not a value, then by evaluation context, get e′ −→ get e′′. If e′ is a value, then the canonical forms lemma tells us that e′ has the form f′. Two possible cases can occur:
1. f′ has an attached value. That is, there exists some configuration (fut f′ v) in config, with some value v. Thus, the appropriate C-Config case of this theorem applies to this future and the task configurations.
2. f′ has no attached value. That is, there exists some configuration (fut f′) in config which still has not received its value. It is possible to traverse the dependency tree of f′ to find some configuration to which the progress theorem can be applied. There are three possible cases that can occur:
(a) A leaf is reached, and its future has no attached value. As leaves have no future dependencies, the future has to have an associated task configuration (as chain configurations have at least one dependency). If the task's expression is a value, the appropriate C-Config case of this theorem applies to the task and its related future configurations. If its expression is not a value, the task can take a reduction step by applying the progress theorem.
(b) A node is reached where its future has no attached value and its related configuration does not need the value of its dependencies to perform a reduction step. As chain configurations require the result of at least one dependency to take a step, the only other possible related configuration is a task configuration. It is thus possible for the task configuration related to this node to take a reduction step by applying the progress theorem to it.
(c) A node is reached where its future, f′′, has no attached value and its related configuration needs the value of a dependency to perform a reduction step. If the value of the dependency is available, the appropriate C-Config case of this theorem applies. If not, it is possible to walk down the dependency tree of f′′ until a node is found where the progress theorem can be applied to its related configuration.

Subcase Val Chain: Here, e ≡ e1 ~~> e2, with ⊢ e1 : Fut τ′ and ⊢ e2 : τ′ → τ, by inversion. If e1 is not a value, then by evaluation context, e1 ~~> e2 −→ e1′ ~~> e2. If e1 is a value and e2 is not, then by evaluation context, e1 ~~> e2 −→ e1 ~~> e2′. If both expressions are values, then the Chain rule applies to the task configuration.

Subcase Val EmptyContainer: Here, e ≡ []. Since e is an empty container structure, the appropriate C-Config case of this theorem applies to both the task and its related future configurations.

Subcase Val Container: Here, e ≡ [e′], with ⊢ e′ : τ, by inversion. If e′ is not a value, then by evaluation context, [e′] −→ [e′′]. If e′ is a value, then the canonical forms lemma tells us that e′ has the form v, in which case e is a container value singleton, and, thus, the appropriate C-Config case of this theorem applies to both the task and its related future configurations.

Subcase Val ContainerAppend: Here, e ≡ e1 ++ e2, with ⊢ e1 : Ct τ and ⊢ e2 : Ct τ, by inversion. If e1 is not a value, then by evaluation context, e1 ++ e2 −→ e1′ ++ e2. If e1 is a value and e2 is not, then by evaluation context, e1 ++ e2 −→ e1 ++ e2′. If both expressions are values, then the canonical forms lemma tells us that each expression has the form [], [vi] or [vi] ++ [vi′]. Therefore, e is a value with the form v1 ++ v2, and, thus, the appropriate C-Config case of this theorem applies to both the task and its related future configurations.

Subcase Val EmptyPar: Here, e ≡ {}. Since e is an empty parallel structure, the appropriate C-Config case of this theorem applies to both the task and its related future configurations.

Subcase Val Par: Here, e ≡ {e′}, with ⊢ e′ : τ, by inversion. If e′ is not a value, then by evaluation context, {e′} −→ {e′′}. If e′ is a value, then the canonical forms lemma tells us that e′ has the form v, in which case e is a parallel value singleton, and, thus, the appropriate C-Config case of this theorem applies to both the task and its related future configurations.

Subcase ParT Par: Here, e ≡ e1 | e2, with ⊢ e1 : Par τ and ⊢ e2 : Par τ, by inversion. If e1 is not a value, then by evaluation context, e1 | e2 −→ e1′ | e2. If e2 is not a value, then by evaluation context, e1 | e2 −→ e1 | e2′. If both expressions are values, then the canonical forms lemma tells us that each expression has the form {}, {vi} or {vi} | {vi′}. Therefore, e is a value with the form v1 | v2, and, thus, the appropriate C-Config case of this theorem applies to both the task and its related future configurations.

Subcase ParT Seq: Here, e ≡ e1 >> e2, with ⊢ e1 : Par τ′ and ⊢ e2 : τ′ → τ, by inversion. If e2 is not a value, then by evaluation context, e1 >> e2 −→ e1 >> e2′. If e2 is a value and e1 is not, then by evaluation context, e1 >> e2 −→ e1′ >> e2. If both expressions are values, then the canonical forms lemma tells us that e1 has the form {}, {v1} or {v1} | {v1′}, and e2 has the form λx.e2′′. For the first form of e1, the SeqE rule applies to the task configuration. For the second form of e1, the SeqS rule applies to the task configuration. For the third form of e1, the SeqM rule applies to the task configuration.

Subcase ParT Otherwise: Here, e ≡ e1 >< e2, with ⊢ e1 : Par τ and ⊢ e2 : Par τ, by inversion. If e1 is not a value, then by evaluation context, e1 >< e2 −→ e1′ >< e2. If e1 is a value, then the canonical forms lemma tells us that e1 is either {}, {v1} or {v1} | {v1′}. In the first case, the OtherH rule applies to the task configuration, and in the rest of the cases, the OtherV rule applies to the task configuration.

Subcase ParT LiftFut: Here, e ≡ liftf e′, with ⊢ e′ : Fut τ, by inversion. If e′ is not a value, then by evaluation context, liftf e′ −→ liftf e′′. If e′ is a value, then the canonical forms lemma tells us that e′ has the form f, and, thus, it is possible to apply the LiftFutGet rule to the expression.

Subcase ParT Extract: Here, e ≡ extract e′, with ⊢ e′ : Par τ, by inversion. If e′ is not a value, then by evaluation context, extract e′ −→ extract e′′. If e′ is a value, then the canonical forms lemma tells us that e′ has the form {}, {v} or {v} | {v′}. In the first case, the ExtractE rule applies to the task configuration. In the second case, the ExtractS rule applies to the task configuration. In the third case, the ExtractM rule applies to the task configuration.

Subcase ParT Peek: Here, e ≡ peek e′, with ⊢ e′ : Par τ, by inversion. If e′ is not a value, then by evaluation context, peek e′ −→ peek e′′. If e′ is a value, then the canonical forms lemma tells us that e′ is either {}, {v} or {v} | {v′}. In the first case, the PeekH rule applies to the task configuration, and in the rest of the cases, the PeekV rule applies to the task configuration.

Subcase ParT Join: Here, e ≡ join e′, with ⊢ e′ : Par τ, by inversion. If e′ is not a value, then by evaluation context, join e′ −→ join e′′. If e′ is a value, then the canonical forms lemma tells us that e′ has the form {}, {v} or {v} | {v′}. In the first case, the JoinE rule applies to the task configuration. In the second case, the JoinS rule applies to the task configuration. In the third case, the JoinM rule applies to the task configuration.

Case C-Chain: Here (chain f1 e f2) ∈ config. If the future f1 has an attached value, i.e. there exists some configuration (fut f1 v) ∈ config, then the appropriate C-Config case of this theorem applies to the chain and its related future configuration. If the future f1 has no attached value, i.e. there exists some configuration (fut f1) ∈ config, then, as with the Val Get subcase in the C-Task case of this proof, it is possible to traverse the dependency tree of f1 to find some configuration to which the progress theorem can be applied. There are three possible cases that can occur:

1. A leaf is reached, and its future has no attached value. As leaves have no future dependencies, the future has to have an associated task configuration (as chain configurations have at least one dependency). If the task's expression is a value, the appropriate C-Config case of this theorem applies to the task and its related future configurations. If its expression is not a value, the task can take a reduction step by applying the progress theorem.

2. A node is reached where its future has no attached value and its related configuration does not need the value of its dependencies to perform a reduction step. As chain configurations require the result of at least one dependency to take a step, the only other possible related configuration is a task configuration. It is thus possible for the task configuration related to this node to take a reduction step by applying the progress theorem to it.

3. A node is reached where its future, f′′, has no attached value and its related configuration needs the value of a dependency to perform a reduction step. If the value of the dependency is available, the appropriate C-Config case of this theorem applies. If not, it is possible to walk down the dependency tree of f′′ until a node is found where the progress theorem can be applied to its related configuration.

Case C-Config: Here config1 config2 ⊆ config. By the induction hypothesis, for i ∈ {1, 2}, either configi is a terminal configuration, or there exists some configuration configi′ such that configi −→ configi′. If both configi are not terminal configurations, then there are different subcases to consider:

Subcase 1: One of the configurations is (fut f v) and the other is (task (get f) f′). Thus, the Get rule applies to these configurations.

Subcase 2: One of the configurations is (fut f1 v) and the other is (chain f1 e f2). Thus, the ChainVal rule applies to these configurations.

Subcase 3: One of the configurations is (fut f) and the other is (task v f). Thus, the FutVal rule applies to these configurations.

Lemma 2.5.4 (Permutation). If Γ ⊢ e : τ and ∆ is a permutation of Γ, then ∆ ⊢ e : τ. A permutation is obtained by interchanging any two adjacent variables in Γ a certain number of times. This can also be done between variables and future variables, but not between future variables.

Proof. By induction on typing derivations.

Lemma 2.5.5 (Weakening). If Γ ⊢ e : τ and x ∉ dom(Γ), then Γ, x:τ′ ⊢ e : τ.

Proof. By induction on typing derivations.


Lemma 2.5.6 (Preservation of types under substitution). If Γ, x:τ′ ⊢ e : τ and Γ ⊢ e′ : τ′, then Γ ⊢ e[e′/x] : τ.

Proof. By induction on the depth of a derivation of the statement Γ, x:τ′ ⊢ e : τ.

Case Val c: Here, e = c, so c[e′/x] = c. The desired result is immediate.

Case Val f: Here, e = f, so f[e′/x] = f. The desired result is immediate.

Case Val x: Here, e = z, with z:τ ∈ (Γ, x:τ′). There are two subcases to consider, depending on whether z is x or another variable.

Subcase 1: If z = x, then z[e′/x] = e′. The required result is then Γ ⊢ e′ : τ′, which is among the assumptions of the lemma.

Subcase 2: If z ≠ x, then z[e′/x] = z, and the desired result is immediate.

Case Val Fun: Here, e = λy.e″ with τ = τ2 → τ1 and Γ, x:τ′, y:τ2 ⊢ e″ : τ1, by inversion. We assume x ≠ y and y ∉ FV(e′). Using permutation on the given subderivation, we obtain Γ, y:τ2, x:τ′ ⊢ e″ : τ1. Using weakening on the other given subderivation (Γ ⊢ e′ : τ′), we obtain Γ, y:τ2 ⊢ e′ : τ′. Now, by the induction hypothesis, Γ, y:τ2 ⊢ e″[e′/x] : τ1. By Val Fun, Γ ⊢ λy.e″[e′/x] : τ2 → τ1. But this is precisely the needed result, since, by the definition of substitution, e[e′/x] = λy.e″[e′/x].

Case Val App: Here e = e1 e2, with Γ, x:τ′ ⊢ e1 : τ″ → τ and Γ, x:τ′ ⊢ e2 : τ″, by inversion. By the induction hypothesis, Γ ⊢ e1[e′/x] : τ″ → τ and Γ ⊢ e2[e′/x] : τ″. By Val App, Γ ⊢ e1[e′/x] e2[e′/x] : τ, i.e., Γ ⊢ (e1 e2)[e′/x] : τ.

Case Val Async: Here e = async e″, with Γ, x:τ′ ⊢ e″ : τ″ and τ = Fut τ″, by inversion. By the induction hypothesis, Γ ⊢ e″[e′/x] : τ″. By Val Async, Γ ⊢ async e″[e′/x] : τ, i.e., Γ ⊢ (async e″)[e′/x] : τ.

Case Val Get: Here e = get e″, with Γ, x:τ′ ⊢ e″ : Fut τ, by inversion. By the induction hypothesis, Γ ⊢ e″[e′/x] : Fut τ. By Val Get, Γ ⊢ get e″[e′/x] : τ, i.e., Γ ⊢ (get e″)[e′/x] : τ.

Case Val Chain: Here e = e1 ↝ e2, with Γ, x:τ′ ⊢ e1 : Fut τ1, Γ, x:τ′ ⊢ e2 : τ1 → τ2 and τ = Fut τ2. By the induction hypothesis, Γ ⊢ e1[e′/x] : Fut τ1 and Γ ⊢ e2[e′/x] : τ1 → τ2. By Val Chain, Γ ⊢ e1[e′/x] ↝ e2[e′/x] : τ, i.e., Γ ⊢ (e1 ↝ e2)[e′/x] : τ.

Case Val EmptyContainer: Here e = [], so [][e′/x] = []. The desired result is immediate.

Case Val Container: Here e = [e″], with Γ, x:τ′ ⊢ e″ : τ″, and τ = Ct τ″, by inversion. By the induction hypothesis, Γ ⊢ e″[e′/x] : τ″. By Val Container, Γ ⊢ [e″[e′/x]] : τ, i.e., Γ ⊢ ([e″])[e′/x] : τ.

Case Val ContainerAppend: Here e = e1 ++ e2, with Γ, x:τ′ ⊢ e1 : Ct τ″, Γ, x:τ′ ⊢ e2 : Ct τ″ and τ = Ct τ″, by inversion. By the induction hypothesis, Γ ⊢ e1[e′/x] : Ct τ″ and Γ ⊢ e2[e′/x] : Ct τ″. By Val ContainerAppend, Γ ⊢ e1[e′/x] ++ e2[e′/x] : τ, i.e., Γ ⊢ (e1 ++ e2)[e′/x] : τ.

Case Val EmptyPar: Here e = {}, so {}[e′/x] = {}. The desired result is immediate.

Case Val Par: Here e = {e″}, with Γ, x:τ′ ⊢ e″ : τ″, and τ = Par τ″, by inversion. By the induction hypothesis, Γ ⊢ e″[e′/x] : τ″. By Val Par, Γ ⊢ {e″[e′/x]} : τ, i.e., Γ ⊢ ({e″})[e′/x] : τ.

Case ParT Par: Here e = e1 | e2, with Γ, x:τ′ ⊢ e1 : Par τ″, Γ, x:τ′ ⊢ e2 : Par τ″ and τ = Par τ″, by inversion. By the induction hypothesis, Γ ⊢ e1[e′/x] : Par τ″ and Γ ⊢ e2[e′/x] : Par τ″. By ParT Par, Γ ⊢ e1[e′/x] | e2[e′/x] : τ, i.e., Γ ⊢ (e1 | e2)[e′/x] : τ.

Case ParT Seq: Here e = e1 >> e2, with Γ, x:τ′ ⊢ e1 : Par τ″ and Γ, x:τ′ ⊢ e2 : τ″ → τ, by inversion. By the induction hypothesis, Γ ⊢ e1[e′/x] : Par τ″ and Γ ⊢ e2[e′/x] : τ″ → τ. By ParT Seq, Γ ⊢ e1[e′/x] >> e2[e′/x] : τ, i.e., Γ ⊢ (e1 >> e2)[e′/x] : τ.

Case ParT Otherwise: Here e = e1 >< e2, with Γ, x:τ′ ⊢ e1 : Par τ″, Γ, x:τ′ ⊢ e2 : Par τ″ and τ = Par τ″, by inversion. By the induction hypothesis, Γ ⊢ e1[e′/x] : Par τ″ and Γ ⊢ e2[e′/x] : Par τ″. By ParT Otherwise, Γ ⊢ e1[e′/x] >< e2[e′/x] : τ, i.e., Γ ⊢ (e1 >< e2)[e′/x] : τ.

Case ParT LiftFut: Here e = liftf e″, with Γ, x:τ′ ⊢ e″ : Fut τ″ and τ = Par τ″, by inversion. By the induction hypothesis, Γ ⊢ e″[e′/x] : Fut τ″. By ParT LiftFut, Γ ⊢ liftf e″[e′/x] : τ, i.e., Γ ⊢ (liftf e″)[e′/x] : τ.

Case ParT Extract: Here e = extract e″, with Γ, x:τ′ ⊢ e″ : Par τ″ and τ = Ct τ″, by inversion. By the induction hypothesis, Γ ⊢ e″[e′/x] : Par τ″. By ParT Extract, Γ ⊢ extract e″[e′/x] : τ, i.e., Γ ⊢ (extract e″)[e′/x] : τ.

Case ParT Peek: Here e = peek e″, with Γ, x:τ′ ⊢ e″ : Par τ″ and τ = Par τ″, by inversion. By the induction hypothesis, Γ ⊢ e″[e′/x] : Par τ″. By ParT Peek, Γ ⊢ peek e″[e′/x] : τ, i.e., Γ ⊢ (peek e″)[e′/x] : τ.

Case ParT Join: Here e = join e″, with Γ, x:τ′ ⊢ e″ : Par (Par τ″) and τ = Par τ″, by inversion. By the induction hypothesis, Γ ⊢ e″[e′/x] : Par (Par τ″). By ParT Join, Γ ⊢ join e″[e′/x] : τ, i.e., Γ ⊢ (join e″)[e′/x] : τ.

Lemma 2.5.7 (Subexpression well-typedness). If Γ ⊢ e : τ, and e′ is a subexpression of e, then there exists a Γ′ such that Γ′ ⊇ Γ, and some type τ′, such that Γ′ ⊢ e′ : τ′.

Proof. By induction on a derivation of Γ ⊢ e : τ.

1. If e ≡ (fut f e″), then there are two possibilities: e′ = e, or e′ is a subexpression of e″. In the first case, we are done by assumption. In the second case, with Γ ⊢ (fut f e″) ok and Γ ⊢ e″ : τ′ by inversion, the result follows by the induction hypothesis.

2. If e ≡ (task e″ f), then there are two possibilities: e′ = e, or e′ is a subexpression of e″. In the first case, we are done by assumption. In the second case, with Γ ⊢ (task e″ f) ok and Γ ⊢ e″ : τ′ by inversion, the result follows by the induction hypothesis.

3. If e ≡ c, then the only possible case is where e′ = e. Thus, we are done by assumption.

4. If e ≡ f, then the only possible case is where e′ = e. Thus, we are done by assumption.

5. If e ≡ x, then the only possible case is where e′ = e. Thus, we are done by assumption.

6. If e ≡ λx.e″, then there are two possibilities: e′ = e, or e′ is a subexpression of e″. In the first case, we are done by assumption. In the second case, with Γ ⊢ λx.e″ : τ″ → τ′ and Γ, x:τ″ ⊢ e″ : τ′ by inversion, the result follows by the induction hypothesis.


7. If e ≡ e1 e2, then there are three possibilities: e′ = e, e′ is a subexpression of e1, or e′ is a subexpression of e2. In the first case, we are done by assumption. In the second case, with Γ ⊢ e1 : τ″ → τ′ by inversion, the result follows by the induction hypothesis. In the third case, with Γ ⊢ e2 : τ″ by inversion, the result follows by the induction hypothesis.

8. If e ≡ async e″, then there are two possibilities: e′ = e, or e′ is a subexpression of e″. In the first case, we are done by assumption. In the second case, with Γ ⊢ async e″ : Fut τ′ and Γ ⊢ e″ : τ′ by inversion, the result follows by the induction hypothesis.

9. If e ≡ get e″, then there are two possibilities: e′ = e, or e′ is a subexpression of e″. In the first case, we are done by assumption. In the second case, with Γ ⊢ get e″ : τ′ and Γ ⊢ e″ : Fut τ′ by inversion, the result follows by the induction hypothesis.

10. If e ≡ e1 ↝ e2, then there are three possibilities: e′ = e, e′ is a subexpression of e1, or e′ is a subexpression of e2. In the first case, we are done by assumption. In the second case, with Γ ⊢ e1 : Fut τ′ by inversion, the result follows by the induction hypothesis. In the third case, with Γ ⊢ e2 : τ′ → τ″ by inversion, the result follows by the induction hypothesis.

11. If e ≡ [], then the only possible case is where e′ = e. Thus, we are done by assumption.

12. If e ≡ [e″], then there are two possibilities: e′ = e, or e′ is a subexpression of e″. In the first case, we are done by assumption. In the second case, with Γ ⊢ [e″] : Ct τ′ and Γ ⊢ e″ : τ′ by inversion, the result follows by the induction hypothesis.

13. If e ≡ e1 ++ e2, then there are three possibilities: e′ = e, e′ is a subexpression of e1, or e′ is a subexpression of e2. In the first case, we are done by assumption. In the second case, with Γ ⊢ e1 : Ct τ′ by inversion, the result follows by the induction hypothesis. In the third case, with Γ ⊢ e2 : Ct τ′ by inversion, the result follows by the induction hypothesis.

14. If e ≡ {}, then the only possible case is where e′ = e. Thus, we are done by assumption.

15. If e ≡ {e″}, then there are two possibilities: e′ = e, or e′ is a subexpression of e″. In the first case, we are done by assumption. In the second case, with Γ ⊢ {e″} : Par τ′ and Γ ⊢ e″ : τ′ by inversion, the result follows by the induction hypothesis.

16. If e ≡ e1 | e2, then there are three possibilities: e′ = e, e′ is a subexpression of e1, or e′ is a subexpression of e2. In the first case, we are done by assumption. In the second case, with Γ ⊢ e1 : Par τ′ by inversion, the result follows by the induction hypothesis. In the third case, with Γ ⊢ e2 : Par τ′ by inversion, the result follows by the induction hypothesis.

17. If e ≡ e1 >> e2, then there are three possibilities: e′ = e, e′ is a subexpression of e1, or e′ is a subexpression of e2. In the first case, we are done by assumption. In the second case, with Γ ⊢ e1 : Par τ′ by inversion, the result follows by the induction hypothesis. In the third case, with Γ ⊢ e2 : τ′ → τ″ by inversion, the result follows by the induction hypothesis.

18. If e ≡ e1 >< e2, then there are three possibilities: e′ = e, e′ is a subexpression of e1, or e′ is a subexpression of e2. In the first case, we are done by assumption. In the second case, with Γ ⊢ e1 : Par τ′ by inversion, the result follows by the induction hypothesis. In the third case, with Γ ⊢ e2 : Par τ′ by inversion, the result follows by the induction hypothesis.

19. If e ≡ liftf e″, then there are two possibilities: e′ = e, or e′ is a subexpression of e″. In the first case, we are done by assumption. In the second case, with Γ ⊢ liftf e″ : Par τ′ and Γ ⊢ e″ : Fut τ′ by inversion, the result follows by the induction hypothesis.

20. If e ≡ extract e″, then there are two possibilities: e′ = e, or e′ is a subexpression of e″. In the first case, we are done by assumption. In the second case, with Γ ⊢ extract e″ : Ct τ′ and Γ ⊢ e″ : Par τ′ by inversion, the result follows by the induction hypothesis.

21. If e ≡ peek e″, then there are two possibilities: e′ = e, or e′ is a subexpression of e″. In the first case, we are done by assumption. In the second case, with Γ ⊢ e″ : Par τ′ by inversion, the result follows by the induction hypothesis.

22. If e ≡ join e″, then there are two possibilities: e′ = e, or e′ is a subexpression of e″. In the first case, we are done by assumption. In the second case, with Γ ⊢ join e″ : Par τ′ and Γ ⊢ e″ : Par (Par τ′) by inversion, the result follows by the induction hypothesis.

Lemma 2.5.8 (Expression substitution). If Γ′ ⊢ e : τ′ is a subderivation of Γ ⊢ T[e] ok, and the expression e′ satisfies Γ′ ⊢ e′ : τ′, then Γ ⊢ T[e′] ok. Similarly, if Γ′ ⊢ e : τ′ is a subderivation of Γ ⊢ C[e] : τ, and the expression e′ satisfies Γ′ ⊢ e′ : τ′, then Γ ⊢ C[e′] : τ.

Proof. By induction on a derivation of Γ ⊢ T[e] ok or Γ ⊢ C[e] : τ.

1. If T ≡ (task C[ ] f), then Γ ⊢ C[e] : τ for some type τ, by inversion. By the induction hypothesis we have Γ ⊢ C[e′] : τ, for the same type τ. By C-Task we have Γ ⊢ (task C[e′] f) ok, which is the required result.

2. If C = [ ], then, since C[e] = e, we have τ′ = τ. Thus, as Γ ⊢ e′ : τ, we obtain Γ ⊢ C[e′] : τ, giving us the required result.

3. If C = C′[ ] e″, then Γ ⊢ C′[e] e″ : τ with Γ ⊢ C′[e] : τ′ → τ by inversion. By the induction hypothesis we have Γ ⊢ C′[e′] : τ′ → τ, and by Val App we have Γ ⊢ C′[e′] e″ : τ, as required.

4. If C = e″ C′[ ], then Γ ⊢ e″ C′[e] : τ with Γ ⊢ C′[e] : τ′ by inversion. By the induction hypothesis we have Γ ⊢ C′[e′] : τ′, and by Val App we have Γ ⊢ e″ C′[e′] : τ, as required.

5. If C = get C′[ ], then Γ ⊢ get C′[e] : τ with Γ ⊢ C′[e] : Fut τ by inversion. By the induction hypothesis we have Γ ⊢ C′[e′] : Fut τ, and by Val Get we have Γ ⊢ get C′[e′] : τ, as required.

6. If C = C′[ ] ↝ e″, then Γ ⊢ C′[e] ↝ e″ : τ with Γ ⊢ C′[e] : Fut τ′ by inversion. By the induction hypothesis we have Γ ⊢ C′[e′] : Fut τ′, and by Val Chain we have Γ ⊢ C′[e′] ↝ e″ : τ, as required.

7. If C = e″ ↝ C′[ ], then Γ ⊢ e″ ↝ C′[e] : Fut τ with Γ ⊢ C′[e] : τ′ → τ by inversion. By the induction hypothesis we have Γ ⊢ C′[e′] : τ′ → τ, and by Val Chain we have Γ ⊢ e″ ↝ C′[e′] : Fut τ, as required.

8. If C = [C′[ ]], then Γ ⊢ [C′[e]] : Ct τ with Γ ⊢ C′[e] : τ by inversion. By the induction hypothesis we have Γ ⊢ C′[e′] : τ, and by Val Container we have Γ ⊢ [C′[e′]] : Ct τ, as required.

9. If C = C′[ ] ++ e″, then Γ ⊢ C′[e] ++ e″ : Ct τ with Γ ⊢ C′[e] : Ct τ by inversion. By the induction hypothesis we have Γ ⊢ C′[e′] : Ct τ, and by Val ContainerAppend we have Γ ⊢ C′[e′] ++ e″ : Ct τ, as required.

10. If C = e″ ++ C′[ ], then Γ ⊢ e″ ++ C′[e] : τ with Γ ⊢ C′[e] : Ct τ by inversion. By the induction hypothesis we have Γ ⊢ C′[e′] : Ct τ, and by Val ContainerAppend we have Γ ⊢ e″ ++ C′[e′] : Ct τ, as required.

11. If C = {C′[ ]}, then Γ ⊢ {C′[e]} : Par τ with Γ ⊢ C′[e] : τ by inversion. By the induction hypothesis we have Γ ⊢ C′[e′] : τ, and by Val Par we have Γ ⊢ {C′[e′]} : Par τ, as required.

12. If C = C′[ ] | e″, then Γ ⊢ C′[e] | e″ : Par τ with Γ ⊢ C′[e] : Par τ by inversion. By the induction hypothesis we have Γ ⊢ C′[e′] : Par τ, and by ParT Par we have Γ ⊢ C′[e′] | e″ : Par τ, as required.

13. If C = e″ | C′[ ], then Γ ⊢ e″ | C′[e] : Par τ with Γ ⊢ C′[e] : Par τ by inversion. By the induction hypothesis we have Γ ⊢ C′[e′] : Par τ, and by ParT Par we have Γ ⊢ e″ | C′[e′] : Par τ, as required.

14. If C = e″ >> C′[ ], then Γ ⊢ e″ >> C′[e] : Par τ with Γ ⊢ C′[e] : τ″ → τ′ by inversion. By the induction hypothesis we have Γ ⊢ C′[e′] : τ″ → τ′, and by ParT Seq we have Γ ⊢ e″ >> C′[e′] : Par τ, as required.

15. If C = C′[ ] >> e″, then Γ ⊢ C′[e] >> e″ : Par τ with Γ ⊢ C′[e] : Par τ′ by inversion. By the induction hypothesis we have Γ ⊢ C′[e′] : Par τ′, and by ParT Seq we have Γ ⊢ C′[e′] >> e″ : Par τ, as required.

16. If C = C′[ ] >< e″, then Γ ⊢ C′[e] >< e″ : Par τ with Γ ⊢ C′[e] : Par τ by inversion. By the induction hypothesis we have Γ ⊢ C′[e′] : Par τ, and by ParT Otherwise we have Γ ⊢ C′[e′] >< e″ : Par τ, as required.

17. If C = liftf C′[ ], then Γ ⊢ liftf C′[e] : Par τ with Γ ⊢ C′[e] : Fut τ by inversion. By the induction hypothesis we have Γ ⊢ C′[e′] : Fut τ, and by ParT LiftFut we have Γ ⊢ liftf C′[e′] : Par τ, as required.

18. If C = extract C′[ ], then Γ ⊢ extract C′[e] : Ct τ with Γ ⊢ C′[e] : Par τ by inversion. By the induction hypothesis we have Γ ⊢ C′[e′] : Par τ, and by ParT Extract we have Γ ⊢ extract C′[e′] : Ct τ, as required.

19. If C = peek C′[ ], then Γ ⊢ peek C′[e] : Par τ with Γ ⊢ C′[e] : Par τ by inversion. By the induction hypothesis we have Γ ⊢ C′[e′] : Par τ, and by ParT Peek we have Γ ⊢ peek C′[e′] : Par τ, as required.

20. If C = join C′[ ], then Γ ⊢ join C′[e] : Par τ with Γ ⊢ C′[e] : Par (Par τ) by inversion. By the induction hypothesis we have Γ ⊢ C′[e′] : Par (Par τ), and by ParT Join we have Γ ⊢ join C′[e′] : Par τ, as required.

Theorem 2.5.2 (Preservation). If Γ ⊢ config ok and config −→ config′, then there exists a Γ′ such that Γ′ ⊇ Γ and Γ′ ⊢ config′ ok.

Proof. By induction on a derivation of config −→ config′.

Case OneConfig: Here, config ≡ config1 config2 and config′ ≡ config1′ config2. There are multiple subcases to be considered:

Subcase β-red: Here, config1 ≡ T[(λx.e) v], config1′ ≡ T[e[v/x]], and Γ′ = Γ. By Lemma 2.5.7, we have Γ ⊢ (λx.e) v : τ, for some τ, with Γ ⊢ λx.e : τ′ → τ and Γ ⊢ v : τ′, by inversion. There is only one typing rule for abstractions, Val Fun, from which we know Γ, x:τ′ ⊢ e : τ. By the substitution lemma, we have Γ ⊢ e[v/x] : τ. Using Lemma 2.5.8, we have that Γ ⊢ config1′ ok, as required.

Subcase Async: Here, config1 ≡ T[async e], config1′ ≡ (fut f) (task e f) T[f] and Γ′ = Γ, f:Fut τ′. By Lemma 2.5.7, we have Γ ⊢ async e : τ, for some τ, and by inversion τ = Fut τ′ for some τ′. By C-Fut, Γ′ ⊢ (fut f) ok, and by C-Task, Γ′ ⊢ (task e f) ok. As futset((fut f)) = {f} and futset((task e f)) = ∅, by using C-Config, we can conclude that Γ′ ⊢ config1′ ok.

Subcase Get: Here, config1 ≡ (fut f v) T[get f], with f:Fut τ ∈ Γ and Γ ⊢ v : τ, by inversion, and config1′ ≡ (fut f v) T[v], with Γ′ = Γ. By Lemma 2.5.7, we have Γ ⊢ get f : τ. As v also has type τ, by using Lemma 2.5.8, we have that Γ ⊢ T[v] ok. By C-FutV, we have Γ ⊢ (fut f v) ok. As futset((fut f v)) = {f} and futset(T[v]) = ∅, by using C-Config, we can conclude that Γ ⊢ config1′ ok.

Subcase Chain: Here, config1 ≡ T[f1 ↝ λx.e] with f1:Fut τ′ ∈ Γ, config1′ ≡ (fut f2) (chain f1 (λx.e) f2) T[f2], and Γ′ = Γ, f2:Fut τ. By Lemma 2.5.7 and inversion, we have Γ ⊢ f1 ↝ λx.e : Fut τ and Γ ⊢ λx.e : τ′ → τ. As f2 also has type Fut τ, by using Lemma 2.5.8, we have that Γ′ ⊢ T[f2] ok. By C-Fut, Γ′ ⊢ (fut f2) ok, and by C-Chain, Γ′ ⊢ (chain f1 (λx.e) f2) ok. As futset((fut f2)) = {f2}, futset((chain f1 (λx.e) f2)) = ∅, and futset(T[f2]) = ∅, by using C-Config, we can conclude that Γ′ ⊢ config1′ ok.

Subcase ChainVal: Here, config1 ≡ (fut f1 v) (chain f1 e f2), with f1:Fut τ′ ∈ Γ, f2:Fut τ ∈ Γ, Γ ⊢ e : τ′ → τ and Γ ⊢ v : τ′, by inversion. Also, config1′ ≡ (fut f1 v) (task (e v) f2), with Γ′ = Γ. By Val App, Γ ⊢ e v : τ. Thus, by C-FutV, C-Task and C-Config, we can conclude that Γ ⊢ config1′ ok.

Subcase FutVal: Here, config1 ≡ (fut f) (task v f), with f:Fut τ ∈ Γ and Γ ⊢ v : τ, by inversion, and config1′ ≡ (fut f v), with Γ′ = Γ. By C-FutV, Γ ⊢ config1′ ok, as required.

Subcase SeqLF: Here, config1 ≡ T[(liftf f) >> e] and config1′ ≡ T[liftf (f ↝ e)], with Γ′ = Γ. By Lemma 2.5.7, we have Γ ⊢ (liftf f) >> e : Par τ, for some τ, with Γ ⊢ f : Fut τ′ and Γ ⊢ e : τ′ → τ, by inversion. By Val Chain, we have that Γ ⊢ f ↝ e : Fut τ, which with ParT LiftFut gives us Γ ⊢ liftf (f ↝ e) : Par τ. Using Lemma 2.5.8, we have that Γ ⊢ config1′ ok, as required.

Subcase OtherV: Here, config1 ≡ T[e1 | {v} | e2 >< e3] and config1′ ≡ T[e1 | {v} | e2], with Γ′ = Γ. By Lemma 2.5.7, we have Γ ⊢ e1 | {v} | e2 >< e3 : Par τ, for some τ, with Γ ⊢ e1 | {v} | e2 : Par τ, by inversion. Thus, by using Lemma 2.5.8, we have that Γ ⊢ config1′ ok, as required.

Subcase OtherH: Here, config1 ≡ T[{} >< e] and config1′ ≡ T[e], with Γ′ = Γ. By Lemma 2.5.7, we have Γ ⊢ {} >< e : Par τ, for some τ, with Γ ⊢ e : Par τ, by inversion. Thus, by using Lemma 2.5.8, we have that Γ ⊢ config1′ ok, as required.

Subcase PeekV: Here, config1 ≡ T[peek (e1 | {v} | e2)] and config1′ ≡ T[{v}], with Γ′ = Γ. By Lemma 2.5.7, we have Γ ⊢ peek (e1 | {v} | e2) : Par τ, for some τ, with Γ ⊢ {v} : Par τ, by inversion. By using Lemma 2.5.8, we have that Γ ⊢ config1′ ok, as required.

Subcase PeekH: Here, config1 ≡ T[peek {}] and config1′ ≡ T[{}], with Γ′ = Γ. By Lemma 2.5.7, we have Γ ⊢ peek {} : Par τ, for some τ, with Γ ⊢ {} : Par τ, by inversion. By using Lemma 2.5.8, we have that Γ ⊢ config1′ ok, as required.

Subcase ManyPar: Here, T[e1 | e2] ∈ config1 and T[e1′ | e2′] ∈ config1′. By the induction hypothesis on config1, there exists some Γ′ such that Γ′ ⊇ Γ and Γ′ ⊢ config1′ ok. The only semantic rules that introduce new elements into Γ are Async and Chain, and in both of these cases the new variables are fresh. As the new variables are created in the same reduction step (which can be seen as "at the same time"), the order of their placement at the end of Γ does not matter. Thus, the variables created by the reduction step of T[e1], written Γ1, are arbitrarily placed before those created by the reduction step of T[e2], written Γ2, and Γ′ = Γ, Γ1, Γ2.

Case ShareConfig: Here, config ≡ config1 config2 config3 and config′ ≡ config1 config2′ config3′. By the induction hypothesis, for i ∈ {2, 3}, there exists some Γi such that Γi ⊇ Γ and Γi ⊢ configi′ ok. config1 is not considered, as it does not take a step in ShareConfig. The only semantic rules that introduce new elements into Γ are Async and Chain, and in both of these cases the new variables are fresh. It is possible to view Γi as Γi = Γ, Γi′, where Γi′ contains the new variables concatenated to Γ. As the new variables are created in the same reduction step (which can be seen as "at the same time"), the order of their placement at the end of Γ does not matter. Thus, the variables created by config2 are arbitrarily placed before those of config3, and Γ′ = Γ, Γ2′, Γ3′.

2.6 Deadlock-freedom

Deadlocks occur when it is not possible to take any reduction steps because multiple expressions are waiting for each other to finish. They appear in this system when there is a circular dependency among future variables. Two simple examples are the following well-formed configurations:


(fut f1) (fut f2) (task (get f2) f1) (task (get f1) f2) (2.1)

(fut f) (task (get f) f) (2.2)

Configuration 2.1 represents a simple deadlock between two tasks, and configuration 2.2 is a self-deadlocked task.

In order to avoid this, an explicit ordering on future variables is introduced into the context Γ. This order is expressed as f1 ≺ f2 ≺ · · · ≺ fn, where fi ≺ fi+1 symbolizes that the future fi was created before fi+1. The binary relation ≺ is transitive, that is, if f1 ≺ f2 and f2 ≺ f3, then f1 ≺ f3, but it is irreflexive, so fi ≺ fi can never hold.

The task that leaves a value in fi may depend on any number of future variables in the set {f1, . . . , fi−1}. No variable of this set can depend on fi, as doing so may lead to a deadlock.

The order of the future variables is the order of their creation, as expressed in the progress theorem proof. An example of this future variable order is Γ ≡ f:Fut τ, f′:Fut τ′, which indicates that the ordering is f ≺ f′. That is, f was created before, or at the same time as, f′.

With parallelism, multiple future variables might be created at the same time. If this happens, the order in which these futures are introduced into the context Γ does not matter. This is because Γ is the same for all configurations in a reduction step, so if any future variable depends on another for its result, the latter must already exist in Γ.

Defining a deadlock-freedom invariant for the type soundness proofs would ensure that, as long as this invariant holds, there would be no deadlocks during execution. An invariant that expresses the desired property is: if f depends on f′, then f′ ∈ Γ and f′ ≺ f.

Proof. By induction on a derivation of config −→ config′.
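The ordering invariant lends itself to a mechanical check: record the creation order of the futures and verify that every future's task depends only on strictly earlier futures. The following sketch (function and variable names are illustrative, not part of the formal system) models a configuration as a mapping from each future to the futures its task reads:

```python
# Hypothetical checker for the ordering invariant: every future may only
# depend on futures created strictly before it.
def respects_order(creation_order, deps):
    """creation_order: futures in creation order; deps: future -> its depset."""
    pos = {f: i for i, f in enumerate(creation_order)}
    return all(pos[d] < pos[f] for f, ds in deps.items() for d in ds)

# Configuration 2.1: each task gets the other's future -- a cycle.
assert not respects_order(["f1", "f2"], {"f1": {"f2"}, "f2": {"f1"}})
# Configuration 2.2: a task gets its own future -- a self-cycle.
assert not respects_order(["f"], {"f": {"f"}})
# A well-ordered configuration: f2's task only gets the earlier f1.
assert respects_order(["f1", "f2"], {"f1": set(), "f2": {"f1"}})
```

Whenever the check succeeds, the creation order is a topological order of the dependency graph, so no cycle, and hence no deadlock of this kind, can exist.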


(Env ∅)
∅ ⊢ ⋄

(Env x)
x ∉ dom(Γ)
──────────
Γ, x:τ ⊢ ⋄

(Val c)
c is a constant of type τ
─────────────────────────
Γ ⊢ c : τ

(Val f)
f:Fut τ ∈ Γ
─────────────
Γ ⊢ f : Fut τ

(Val x)
x:τ ∈ Γ
─────────
Γ ⊢ x : τ

(Val Fun)
Γ, x:τ ⊢ e : τ′
─────────────────
Γ ⊢ λx.e : τ → τ′

(Val App)
Γ ⊢ e1 : τ′ → τ    Γ ⊢ e2 : τ′
──────────────────────────────
Γ ⊢ e1 e2 : τ

(Val Async)
Γ ⊢ e : τ
───────────────────
Γ ⊢ async e : Fut τ

(Val Get)
Γ ⊢ e : Fut τ
─────────────
Γ ⊢ get e : τ

(Val Chain)
Γ ⊢ e1 : Fut τ′    Γ ⊢ e2 : τ′ → τ
──────────────────────────────────
Γ ⊢ e1 ↝ e2 : Fut τ

(Val EmptyContainer)
Γ ⊢ [] : Ct τ

(Val Container)
Γ ⊢ e : τ
──────────────
Γ ⊢ [e] : Ct τ

(Val ContainerAppend)
Γ ⊢ e1 : Ct τ    Γ ⊢ e2 : Ct τ
──────────────────────────────
Γ ⊢ e1 ++ e2 : Ct τ

(Val EmptyPar)
Γ ⊢ {} : Par τ

(Val Par)
Γ ⊢ e : τ
───────────────
Γ ⊢ {e} : Par τ

(ParT Par)
Γ ⊢ e1 : Par τ    Γ ⊢ e2 : Par τ
────────────────────────────────
Γ ⊢ e1 | e2 : Par τ

(ParT Seq)
Γ ⊢ e1 : Par τ′    Γ ⊢ e2 : τ′ → τ
──────────────────────────────────
Γ ⊢ e1 >> e2 : Par τ

(ParT Otherwise)
Γ ⊢ e1 : Par τ    Γ ⊢ e2 : Par τ
────────────────────────────────
Γ ⊢ e1 >< e2 : Par τ

(ParT LiftFut)
Γ ⊢ e : Fut τ
───────────────────
Γ ⊢ liftf e : Par τ

(ParT Extract)
Γ ⊢ e : Par τ
────────────────────
Γ ⊢ extract e : Ct τ

(ParT Peek)
Γ ⊢ e : Par τ
──────────────────
Γ ⊢ peek e : Par τ

(ParT Join)
Γ ⊢ e : Par (Par τ)
───────────────────
Γ ⊢ join e : Par τ

Figure 2.1: Type rules
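A handful of these rules can be transcribed into an executable checker, which is a useful sanity check when reading the figure. The sketch below covers only Val x, Val Fun, Val App, Val Async and Val Get, over an illustrative tuple encoding of terms and types (the encoding and helper names are assumptions, not the thesis syntax):

```python
# Illustrative checker for a fragment of the rules in Figure 2.1.
# Types: a base type is a string; ("fun", t1, t2) is t1 -> t2; ("fut", t) is Fut t.
def typecheck(gamma, e):
    tag = e[0]
    if tag == "var":                         # (Val x): x:τ ∈ Γ
        return gamma[e[1]]
    if tag == "lam":                         # (Val Fun)
        _, x, t, body = e
        return ("fun", t, typecheck({**gamma, x: t}, body))
    if tag == "app":                         # (Val App)
        tf = typecheck(gamma, e[1])
        ta = typecheck(gamma, e[2])
        assert tf[0] == "fun" and tf[1] == ta, "ill-typed application"
        return tf[2]
    if tag == "async":                       # (Val Async): τ becomes Fut τ
        return ("fut", typecheck(gamma, e[1]))
    if tag == "get":                         # (Val Get): Fut τ becomes τ
        t = typecheck(gamma, e[1])
        assert t[0] == "fut", "get expects a future"
        return t[1]
    raise ValueError("unknown form: %r" % (tag,))

# get (async x) has type int when x : int
assert typecheck({"x": "int"}, ("get", ("async", ("var", "x")))) == "int"
# λx.x annotated at int has type int -> int
assert typecheck({}, ("lam", "x", "int", ("var", "x"))) == ("fun", "int", "int")
```

Extending the checker to the container and ParT rules follows the same pattern, one branch per rule.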


(C-Fut)
f ∈ dom(Γ)
────────────────
Γ ⊢ (fut f) ok

(C-FutV)
f:Fut τ ∈ Γ    Γ ⊢ v : τ
────────────────────────
Γ ⊢ (fut f v) ok

(C-Task)
f:Fut τ ∈ Γ    Γ ⊢ e : τ
────────────────────────
Γ ⊢ (task e f) ok

(C-Chain)
f1:Fut τ1 ∈ Γ    f2:Fut τ2 ∈ Γ    Γ ⊢ e : τ1 → τ2
─────────────────────────────────────────────────
Γ ⊢ (chain f1 e f2) ok

(C-Config)
Γ ⊢ config1 ok    Γ ⊢ config2 ok    futset(config1) ∩ futset(config2) = ∅
─────────────────────────────────────────────────────────────────────────
Γ ⊢ config1 config2 ok

Figure 2.2: Configuration type system

futset(ε) = ∅
futset((fut f)) = {f}
futset((fut f v)) = {f}
futset((task e f)) = ∅
futset((chain f1 e f2)) = ∅
futset(config1 config2) = futset(config1) ∪ futset(config2)

Figure 2.3: Future set function needed for the configuration type system (figure 2.2)

C ::= [ ] | C e | v C
    | get C | C ↝ e | v ↝ C
    | [C] | C ++ e | v ++ C | {C}
    | (C | e) | (e | C)
    | e >> C | C >> v
    | C >< e
    | liftf C
    | extract C
    | peek C
    | join C

T ::= (task C f)

Figure 2.4: Evaluation and task context

(Expressions)    e ≡ e′
(C-contexts)     C[e] ≡ C[e′]
(T-contexts)     T[e] ≡ T[e′]

(CommutativeConfig)
config config′ ≡ config′ config

(AssociativeConfig)
(config config′) config″ ≡ config (config′ config″)

Figure 2.5: Structural congruence


(PeekM) peek (e1 | e2) = peek ((peek e1) | (peek e2))

(PeekS) peek (peek e) = peek e

(LiftFutGet) liftf f = {get f}

Figure 2.6: Parallel combinator equivalences

(SeqE) {} >> e′ −→ {}

(SeqS) {e} >> e′ −→ {e′ e}

(SeqM) (e1 | e2) >> e′ −→ (e1 >> e′) | (e2 >> e′)

(ExtractE) extract {} −→ []

(ExtractS) extract {e} −→ [e]

(ExtractM) extract (e1 | e2) −→ (extract e1) ++ (extract e2)

(JoinE) join {} −→ {}

(JoinS) join {e} −→ {e}

(JoinM) join (e1 | e2) −→ (join e1) | (join e2)

Figure 2.7: Recursive parallel combinator definitions
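The recursive definitions of Figure 2.7 translate almost verbatim once a ParT is modeled as a plain list of already-computed values; the sketch below elides futures and evaluation order, so it only illustrates the data flow of the rules:

```python
# List model of a ParT: {e1} | {e2} | ... becomes [v1, v2, ...].
def seq(p, f):
    # SeqE/SeqS/SeqM: apply f to every element of the ParT
    return [f(v) for v in p]

def extract(p):
    # ExtractE/ExtractS/ExtractM: collapse a ParT into a container
    return list(p)

def join(p):
    # JoinE/JoinS/JoinM: flatten a ParT of ParTs
    return [v for inner in p for v in inner]

assert seq([], str) == []                        # SeqE
assert seq([1, 2], lambda x: x + 1) == [2, 3]    # SeqS distributed by SeqM
assert extract([1, 2]) == [1, 2]                 # ExtractS/ExtractM
assert join([[1], [], [2, 3]]) == [1, 2, 3]      # JoinM over JoinE/JoinS
```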


(β-red) T [(λx.e) v] −→ T [e[v/x]]

(Async)
fresh f
─────────────────────────────────────
T[async e] −→ (fut f) (task e f) T[f]

(Get) (fut f v) T [get f ] −→ (fut f v) T [v]

(Chain)
fresh f2
────────────────────────────────────────────────────
T[f1 ↝ λx.e] −→ (fut f2) (chain f1 (λx.e) f2) T[f2]

(ChainVal) (fut f1 v) (chain f1 e f2) −→ (fut f1 v) (task (e v) f2)

(FutVal) (fut f) (task v f) −→ (fut f v)

(SeqLF) T[(liftf f) >> e] −→ T[liftf (f ↝ e)]

(OtherV) T [e1 | {v} | e2 >< e3] −→ T [e1 | {v} | e2]

(OtherH) T [{}>< e] −→ T [e]

(PeekV) T [peek (e1 | {v} | e2)] −→ T [{v}]

(PeekH) T [peek {}] −→ T [{}]

(Struct)
e ≡ e′    e′ −→ e″    e″ ≡ e‴
─────────────────────────────
e −→ e‴

Figure 2.8: Operational semantics

(OneConfig)
config1 −→ config1′
───────────────────────────────────
config1 config2 −→ config1′ config2

(ShareConfig)
config1 config2 −→ config1 config2′    config1 config3 −→ config1 config3′
──────────────────────────────────────────────────────────────────────────
config1 config2 config3 −→ config1 config2′ config3′

(ManyPar)
T[e1] −→ T[e1′]    T[e2] −→ T[e2′]
──────────────────────────────────
T[e1 | e2] −→ T[e1′ | e2′]

Figure 2.9: Parallel reduction rules


depset(c) = ∅
depset(f) = {f}
depset(x) = ∅
depset(λx.e) = depset(e)
depset(e1 e2) = depset(e1) ∪ depset(e2)
depset(async e) = depset(e)
depset(get e) = depset(e)
depset(e1 ↝ e2) = depset(e1) ∪ depset(e2)
depset([]) = ∅
depset([e]) = depset(e)
depset(e1 ++ e2) = depset(e1) ∪ depset(e2)
depset({}) = ∅
depset({e}) = depset(e)
depset(e1 | e2) = depset(e1) ∪ depset(e2)
depset(e1 >> e2) = depset(e1) ∪ depset(e2)
depset(e1 >< e2) = depset(e1) ∪ depset(e2)
depset(liftf e) = depset(e)
depset(extract e) = depset(e)
depset(peek e) = depset(e)
depset(join e) = depset(e)

Figure 2.10: Future dependency set function definition
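The definition is a single structural recursion: future variables contribute themselves, and every other form unions the depsets of its subexpressions. A sketch over an illustrative tuple encoding of expressions (the tags are assumptions, not the thesis syntax):

```python
def depset(e):
    """Future variables an expression may depend on (cf. Figure 2.10)."""
    if not isinstance(e, tuple):
        return set()          # constants and plain variables contribute nothing
    if e[0] == "futvar":      # depset(f) = {f}
        return {e[1]}
    # every other form unions the depsets of its subexpressions
    return set().union(set(), *(depset(sub) for sub in e[1:]))

# depset of (get f1) chained into a lambda that gets f2 is {f1, f2}
e = ("chain", ("get", ("futvar", "f1")), ("lam", ("get", ("futvar", "f2"))))
assert depset(e) == {"f1", "f2"}
assert depset(("const", 42)) == set()
```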

[Figure: dependency tree over the futures f1, f2, f3 and f4]

Figure 2.11: Dependency tree of configdep ex


Chapter 3

Additional Constructs

The proposed semantics have been kept to a bare minimum core language, which has proven to be simple, yet expressive enough to solve complicated problems. This simplicity, while useful for proving properties of the language extension and building other elements upon it, makes it hard for a programmer to use the proposed syntax directly. This chapter provides suggestions that could improve the legibility and expressiveness of the parallel combinators extension, which would be of particular use to the users and implementors of the language.

3.1 Core Language Extensions

One possible way to extend the parallel combinators extension is to enrich the proposed core language. The following constructs are a few examples that could be particularly useful.

3.1.1 Let Bindings

This extension greatly increases the readability of programs by helping the user avoid repetition. The let binding names an expression, which is then used inside another expression. The following shows the equivalence of this construct if it were built using the existing core, as well as its type rule and operational semantics if it were included in the language.

let x = e1 in e2 ≡ (λx.e2) e1

Γ ⊢ e1 : τ′    Γ, x:τ′ ⊢ e2 : τ
───────────────────────────────
Γ ⊢ let x = e1 in e2 : τ

T [let x = v in e] −→ T [e[v/x]]
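The desugaring can be exercised directly in any language with first-class functions; the hypothetical helper let_ below mirrors the equivalence, with the body passed as a function standing in for λx.e2:

```python
# let x = e1 in e2  ≡  (λx.e2) e1
def let_(e1, body):
    return (lambda x: body(x))(e1)

# let x = 21 in x + x
assert let_(21, lambda x: x + x) == 42
```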

3.1.2 Wildcards

Encore is a language with side effects, so at times one might want to run a computation and ignore its result. In these cases, choosing an explicit variable name is unnecessary, as the bound value will never be used. Wildcard variable binders, written as an underscore instead of a variable name, make it possible to discard a computation's result without naming it.

Γ ⊢ e : τ
─────────────────
Γ ⊢ λ_.e : τ′ → τ

T[(λ_.e) v] −→ T[e]

3.1.3 Sequencing

Encore, as a non-functional language with side effects, may sequence multiple expressions. The intention of this extension is to evaluate the computations of a sequence and use only the result of the last expression. Sequencing could be expressed using only wildcards in the presented subset of Encore, with let bindings included to increase legibility, yet this results in a verbose way of expressing a simple idea. Another possibility is to introduce syntax that expresses sequencing. The reason the Otherwise combinator was designated the >< symbol was to reserve ; as the syntax for this extension, a symbol commonly used to express sequencing in other programming languages.

e1 ; e2 ≡ let _ = e1 in e2

Γ ⊢ e1 : τ′      Γ ⊢ e2 : τ
───────────────────────────
      Γ ⊢ e1 ; e2 : τ

C ::= · · · | C ; e | v ; C
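The desugaring of e1 ; e2 into a wildcard let can be sketched as follows (a Python illustration; the helper names and the log list are assumptions of the sketch, not part of the proposed core):

```python
log = []

def e1():
    # e1 runs purely for its side effect; its result is discarded.
    log.append("effect of e1")
    return "ignored"

def e2():
    return 42

def sequenced():
    _ = e1()       # let _ = e1 ...
    return e2()    # ... in e2
```

The effect of e1 is still observed (it is recorded in `log`), yet only the value of e2 is returned, which is exactly the intended semantics of sequencing.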

3.2 Combining Combinators

The power of the proposed parallel combinators comes not only from the abstraction they provide over parallel computations, but also from their ability to form more complex and powerful combinators. This section presents a few combinators that surfaced from example programs built during the course of this thesis.

3.2.1 Prune Combinator

The inspiration for this combinator comes from Orc's pruning combinator. While this combinator was originally part of the proposed syntax, it was removed when it became clear that it could be expressed using the peek combinator.

e1 ≪ e2 ≡ e1 (async (peek e2))

Γ ⊢ e1 : Fut (Par τ′) → Par τ      Γ ⊢ e2 : Par τ′
──────────────────────────────────────────────────
              Γ ⊢ e1 ≪ e2 : Par τ

T [(λx.e1) ≪ e2] −→ (fut f) (task (peek e2) f) T [e1[f/x]]      (fresh f)


As with Orc's variant of this combinator, this operation executes two expressions in parallel, where the first expression is suspended if it depends on a value provided by the second expression, and continues once that value has been fulfilled. What is particularly different from Orc's pruning combinator is that this version uses Encore's futures to execute both expressions in parallel. This works by letting the first expression execute, blocking only if it awaits the future fulfilled by the second expression.
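The blocking behaviour described above can be sketched with ordinary thread-pool futures standing in for Encore's futures and tasks (a Python illustration under those stand-in assumptions, not an implementation of the thesis's semantics):

```python
from concurrent.futures import ThreadPoolExecutor

def prune(e1, e2):
    """Run e2 in its own task, fulfilling a future; hand that future to e1.
    e1 blocks only when (and if) it actually demands the future's value."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        fut = pool.submit(e2)   # plays the role of (task (peek e2) f)
        return e1(fut)          # plays the role of e1[f/x]

# e1 awaits the future, so it suspends until e2's value is available.
result = prune(lambda fut: fut.result() + 1, lambda: 41)
```

Here `fut.result()` is the point at which the first expression suspends, mirroring how the first expression in the prune combinator blocks on the future produced for the second expression.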

3.2.2 Bind Combinator

This combinator provides a seemingly simple, yet very powerful function: it takes every value that will be computed by one parallel expression, uses those values to initiate separate parallel executions in another expression, and ultimately flattens the result.

e1 ≫ e2 ≡ join (e1 >> e2)

Γ ⊢ e1 : Par τ′      Γ ⊢ e2 : τ′ → Par τ
────────────────────────────────────────
          Γ ⊢ e1 ≫ e2 : Par τ

T [e1 ≫ λx.e2] −→ T [join (e1 >> λx.e2)]
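The "map then flatten" structure of bind is the same one found in monadic bind. With plain lists standing in for parallel structures (an assumption of this Python sketch; `join`, `seq_comb`, and `bind` are illustrative names, not the thesis's operations):

```python
def join(pp):
    # Flatten a structure of structures into a single structure.
    return [x for p in pp for x in p]

def seq_comb(p, f):
    # The sequence combinator: apply f to every element of p.
    return [f(x) for x in p]

def bind(p, f):
    # Bind = flattening after sequencing, mirroring e1 ≫ e2 ≡ join (e1 >> e2).
    return join(seq_comb(p, f))
```

For example, `bind([1, 2, 3], lambda x: [x, x * 10])` yields `[1, 10, 2, 20, 3, 30]`: each element spawns its own small structure, and the results are flattened into one.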

3.2.3 Imitating Maybe Types

Encore currently does not provide a notion of maybe types, which are also known as option types. Maybe types express the possible absence of a value. Using Haskell's syntax, if an expression has the type Maybe Int, the values of that expression can be Just v (where v is a number) or Nothing.

Parallel combinators can be used to express this concept, although it is much less elegant than using real maybe types:

nothing ≡ {}        just e ≡ {e}

Γ ⊢ nothing : Par τ

Γ ⊢ e : τ
─────────────────
Γ ⊢ just e : Par τ

T [nothing] −→ T [{}]        T [just e] −→ T [{e}]

The empty nothing constructor of the maybe type can be represented with {}, and the constructor encapsulating some datum can be represented with {e}.
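The encoding can be sketched with lists of length at most one standing in for the parallel structures {} and {e} (a Python illustration; `from_maybe` is an assumed eliminator in the style of Haskell's `maybe` function, not part of the proposal):

```python
def nothing():
    # The empty parallel structure {} plays the role of Nothing.
    return []

def just(e):
    # The singleton structure {e} plays the role of Just e.
    return [e]

def from_maybe(default, m):
    # Eliminator: return the contained value, or a default if absent.
    return m[0] if m else default
```

This also shows why the encoding is less elegant than a real maybe type: nothing in the representation prevents a "maybe" from holding more than one element.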

3.3 Syntactic Sugar for Parallel Combinators

To facilitate writing programs with this syntax even further, it should be possible to express parallel combinators with a reduced syntax. Instead of using the parallel


combinator to continuously combine parallel structures, as has been done throughout the thesis ({e1} | · · · | {en}), the following syntax is proposed: {e1, . . . , en}.

To use this syntax, it is also important to incorporate the lifted futures, giving the opportunity to easily write them alongside the rest of the elements in a parallel structure. Thus, the proposed syntax would write liftf e inside the braces with a dedicated marker on e.

With these two proposals, it is easy to provide a desugaring function that would transform the provided shorthand into its core representation.
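Such a desugaring can be sketched as a small tree transformation (a Python illustration; the tuple-based constructors `single` and `union` are hypothetical stand-ins for the core syntax tree, which the thesis does not specify):

```python
def single(e):
    # The singleton parallel structure {e}.
    return ("single", e)

def union(p, q):
    # The parallel combinator p | q.
    return ("par", p, q)

def desugar(elems):
    """Desugar the shorthand {e1, ..., en} (n >= 1) into {e1} | ... | {en}."""
    parts = [single(e) for e in elems]
    result = parts[0]
    for p in parts[1:]:
        result = union(result, p)
    return result
```

For instance, `desugar([1, 2, 3])` produces the left-nested core term `({1} | {2}) | {3}`; whether the fold should nest left or right is one of the choices left to the implementors.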

3.4 Exceptions

Exceptions can occur not only from troublesome computations, such as division by zero or arithmetic overflow, but also from problems in the machine where the computation is being executed, such as being unable to open a file or memory corruption. Depending on the severity of the exception, these errors can cause the program to abort. Exceptions provide a way of signaling and recovering from an error condition during the execution of a program.

As explained previously, one alternative for keeping certain exceptions from occurring is the use of a maybe type. The problem with this solution is that not much information accompanies the error. One possible solution would be to use something similar to the Either monad in Haskell, which also provides attached data when an error takes place.

Using a maybe or an either monad might be cumbersome, as it forces the user to continuously check the value of the data being carried around in the program. This could be solved by raising an exception at run time, which would provide certain information about the error, and using an error handler around the expression to be evaluated. The error handler contains an alternative expression to be executed in the occurrence of an exception. There are multiple possibilities for the semantics of the error handler, including catching multiple types of errors and treating them with a single expression, or allowing a different expression to be executed for each caught error.

In the context of parallel combinators, where an error can occur in any element of the parallel structure, choosing one representation of exceptions over another may make it difficult for users to treat errors, as they might see unexpected behavior in their program. Questions one could ask when exploring options are: If one element of a parallel structure raises an error, could this element not be ignored? If an error is truly exceptional, should the whole parallel structure not fail? What should happen to the rest of the computations in a parallel structure if an exception is raised?

Various programming languages have various constructs to provide support for exceptions, yet I believe that a try-in-unless approach [43] would be a very elegant addition to the language semantics. The proposed syntax for this construct could be the following:

try x = e in e′ unless E1 ⇒ e1 | · · · | En ⇒ en

What this syntax states is that the value of the expression e is bound to the variable x,


and then the expression e′ is evaluated. If an exception, Ei, is raised during the evaluation of e, then its corresponding expression, ei, is evaluated. Thus, if a single element in a parallel structure raises an exception, the whole construct would be discarded and would have to be handled. To avoid this, it would be possible to use a fine-grained try-in-unless approach on each of the individual expressions in a parallel structure.
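The handler-selection behaviour of try-in-unless can be sketched as follows (a Python illustration; errors are modeled with Python exception classes, and `DivByZero`, `Overflow`, and `try_in_unless` are illustrative names, not the thesis's constructs):

```python
class DivByZero(Exception):
    pass

class Overflow(Exception):
    pass

def try_in_unless(e, body, handlers):
    """try x = e in e' unless E1 => e1 | ... | En => en.

    e: thunk computing the guarded expression.
    body: the continuation e', receiving the bound value x.
    handlers: list of (exception class, handler thunk) pairs, tried in order.
    """
    try:
        x = e()
    except Exception as err:
        for exc_type, handler in handlers:
            if isinstance(err, exc_type):
                return handler()
        raise  # no matching clause: the exception propagates
    return body(x)
```

Note that the handlers only guard the evaluation of e, not of the continuation body, matching the semantics described above.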


Chapter 4

Conclusions and Future Work

The main contribution of this thesis is the theoretical framework which integrates parallel computation combinators with futures in the Encore programming language. Its simplicity makes it a suitable starting point for implementing the parallel combinator extension not only in Encore, but also in any language with similar semantics. However, due to the theoretical nature of this thesis, it would not be suitable to translate its content directly into a language implementation. Details, such as the data structures used or the mechanisms to distribute tasks, have been abstracted away in order to improve the clarity of the semantics. This provides the implementors of this extension with a high degree of flexibility when integrating parallel combinators into Encore.

The subtleties of proving the soundness of the extension proved much more difficult than originally thought. Consequently, parts of the formalization, such as the integration with active objects, were not developed due to time restrictions. This is not to say that there is little left to do, as the work provided here can be extended in multiple directions. This chapter contains examples of how to continue working on the ideas and results produced by this thesis.

4.1 Formalizing Exceptions

Exceptions were discussed in section 3.4, yet they still need to be integrated into the semantics to be fully considered a part of the language. The proposed syntax for exceptions is easy to understand at a high level, yet may show unexpected behavior if it is not fully developed and incorporated into the formal semantics. The process of including this formalization in the language semantics, in addition to proving its soundness, might reveal other alternatives for expressing exceptions in the parallel combinators.

4.2 Avoiding Intermediate Parallel Structures

Certain parallel combinators, such as the sequence combinator, create intermediate parallel structures when they are used. This process is costly at run time, both in space


and time, yet can be optimized through techniques known as operation fusion or deforestation [44–46]. These techniques try to combine functions in such a way that the creation of intermediate structures is completely avoided. A simple example of fusion on a sequence combinator could be the following:

parstruct >> e1 >> e2 ≡ parstruct >> (λx. e2 (e1 x))

On the left-hand side of the equivalence, the expression parstruct >> e1 creates an intermediate parallel structure, e1′, which is then used for sequencing with e2: e1′ >> e2. The right-hand side of the equivalence creates only a single parallel structure, the result of sequencing the original parallel structure with the composition of the expressions e1 and e2.
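The same fusion law can be observed with lists standing in for parallel structures (an assumption of this Python sketch): the unfused pipeline materializes an intermediate list, while the fused version applies the composed function in a single pass.

```python
def unfused(xs):
    # parstruct >> e1 builds an intermediate structure ...
    intermediate = [x * 2 for x in xs]
    # ... which is then sequenced with e2.
    return [x + 1 for x in intermediate]

def fused(xs):
    # parstruct >> (λx. e2 (e1 x)): one pass, no intermediate structure.
    return [(x * 2) + 1 for x in xs]
```

Both produce identical results, so the rewrite is a pure optimization; it saves the allocation and traversal of the intermediate structure without changing observable behaviour.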

4.3 Dealing with Unneeded Tasks and Futures

If one were to naïvely implement what is proposed in this thesis, the implementation would be very inefficient. One of the main reasons is how the tasks of unneeded futures are handled in this thesis: they simply continue running. The peek combinator, in particular, may eliminate the need for the majority of the computations in a parallel combinator. These unwanted futures and their associated tasks stay in the complete system configuration, wasting hardware resources on calculations that will never be used. It would therefore be interesting to explore different options for removing unneeded tasks and futures during the evaluation of an expression. One possibility would be to formalize and integrate high-level calculi that can express memory management strategies into this extension, as well as to prove their correctness in the system [47].

4.4 (Efficiently) Implementing Parallel Combinators

The formalization of the parallel combinators provides no suggestions on how the extension should be implemented in Encore. Furthermore, an efficient implementation of parallel combinators is not trivial, as there is a myriad of options to consider: which data structures should represent parallel structures (and whether different situations call for different ones), how tasks should be distributed and managed, which optimizations could reduce the space or time the parallel combinators need to compute results, and so on. A significant amount of research would have to be conducted to ensure good strategies for implementing the parallel combinators extension.

4.5 Creating Benchmarks for the Parallel Combinators

This option could be combined with the implementation of parallel combinators, as it is interesting not only to have an implementation of this extension but also to compare it


with other solutions to a set of problems. By benchmarking the parallel combinators, it would be possible to compare not only their performance against solutions written in other programming languages, but also their expressiveness and legibility.

4.6 Exploring More Combinators

Only a few core combinators have been provided by this thesis, yet it would be beneficial to explore more complex functions that would be useful to programmers. The combinators already explored have been tested with a pen-and-paper approach on a few problems, an approach that does not scale well, as the interaction between combinators cannot be tested quickly. Once parallel combinators have been implemented and used, it is very likely that more combinators will surface, providing more possibilities to improve their usability.

4.7 Integrating the Semantics with Active Objects

A prominent feature of Encore is its use of active objects, yet the semantics proposed in this thesis make no use of them. Another direction for future work would be to combine parallel combinators with the semantics of active objects, bringing the formalization of this extension closer to the formalization of Encore [22].


Bibliography

[1] H. Sutter. "The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software". In: Dr. Dobb's Journal 30.3 (2005), pp. 202–210.

[2] A. Vajda. Programming Many-Core Chips. 1st. Springer Publishing Company, Incorporated, 2011.

[3] S. Marlow. "Parallel and Concurrent Programming in Haskell". In: Proceedings of the 4th Summer School Conference on Central European Functional Programming School. CEFP '11. Budapest, Hungary: Springer-Verlag, 2012, pp. 339–401.

[4] J. Nickolls et al. "Scalable Parallel Programming with CUDA". In: Queue 6.2 (Mar. 2008), pp. 40–53.

[5] Message Passing Interface Forum. MPI: A Message-Passing Interface Standard, Version 3.1. June 2015.

[6] J. Dean and S. Ghemawat. "MapReduce: Simplified Data Processing on Large Clusters". In: Commun. ACM 51.1 (Jan. 2008), pp. 107–113.

[7] C. Chambers et al. "FlumeJava: Easy, Efficient Data-parallel Pipelines". In: Proceedings of the 31st ACM SIGPLAN Conference on Programming Language Design and Implementation. PLDI '10. Toronto, Ontario, Canada: ACM, 2010, pp. 363–375.

[8] Z. Budimlic et al. "Concurrent Collections". In: Sci. Program. 18.3-4 (Aug. 2010), pp. 203–217.

[9] A. Prokopec et al. "FlowPools: A Lock-Free Deterministic Concurrent Dataflow Abstraction". In: Languages and Compilers for Parallel Computing. Ed. by H. Kasahara and K. Kimura. Vol. 7760. Lecture Notes in Computer Science. Springer Berlin Heidelberg, 2013, pp. 158–173.

[10] S. Peyton Jones et al. "Harnessing the Multicores: Nested Data Parallelism in Haskell". In: IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science. FSTTCS '08. 2008, pp. 383–414.

[11] G. E. Blelloch et al. "Implementation of a Portable Nested Data-parallel Language". In: Proceedings of the Fourth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. PPOPP '93. San Diego, California, USA: ACM, 1993, pp. 102–111.


[12] L. Bergstrom et al. "Data-only Flattening for Nested Data Parallelism". In: Proceedings of the 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. PPoPP '13. Shenzhen, China: ACM, 2013, pp. 81–92.

[13] A. Prokopec et al. "A Generic Parallel Collection Framework". In: Proceedings of the 17th International Conference on Parallel Processing - Volume Part II. Euro-Par '11. Bordeaux, France: Springer-Verlag, 2011, pp. 136–147.

[14] D. Kitchin et al. "The Orc Programming Language". In: Proceedings of the Joint 11th IFIP WG 6.1 International Conference FMOODS '09 and 29th IFIP WG 6.1 International Conference FORTE '09 on Formal Techniques for Distributed Systems. FMOODS '09/FORTE '09. Lisboa, Portugal: Springer-Verlag, 2009, pp. 1–25.

[15] D. Kitchin, A. Quark, and J. Misra. "Quicksort: Combining Concurrency, Recursion, and Mutable Data Structures". In: Reflections on the Work of C.A.R. Hoare. Ed. by A. Roscoe, C. B. Jones, and K. R. Wood. Springer London, 2010, pp. 229–254.

[16] J. Launchbury and T. Elliott. "Concurrent Orchestration in Haskell". In: Proceedings of the Third ACM Haskell Symposium on Haskell. Haskell '10. Baltimore, Maryland, USA: ACM, 2010, pp. 79–90.

[17] H. C. Baker Jr. and C. Hewitt. "The Incremental Garbage Collection of Processes". In: Proceedings of the 1977 Symposium on Artificial Intelligence and Programming Languages. New York, NY, USA: ACM, 1977, pp. 55–59.

[18] R. H. Halstead Jr. "Multilisp: A Language for Concurrent Symbolic Computation". In: ACM Trans. Program. Lang. Syst. 7.4 (Oct. 1985), pp. 501–538.

[19] ISO. ISO/IEC 14882:2011 Information technology — Programming languages — C++. Geneva, Switzerland: International Organization for Standardization, Feb. 28, 2012.

[20] J. Swaine et al. "Back to the Futures: Incremental Parallelization of Existing Sequential Runtime Systems". In: Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages and Applications. OOPSLA '10. Reno/Tahoe, Nevada, USA: ACM, 2010, pp. 583–597.

[21] A. Welc, S. Jagannathan, and A. Hosking. "Safe Futures for Java". In: Proceedings of the 20th Annual ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications. OOPSLA '05. San Diego, CA, USA: ACM, 2005, pp. 439–453.

[22] S. Brandauer et al. "Parallel Objects for Multicores: A Glimpse at the Parallel Language Encore". In: Formal Methods for Multicore Programming. Ed. by M. Bernardo and E. B. Johnsen. Vol. 9104. Lecture Notes in Computer Science. Springer International Publishing, 2015, pp. 1–56.

[23] J. Hughes. "Generalising Monads to Arrows". In: Sci. Comput. Program. 37.1-3 (May 2000), pp. 67–111.


[24] K. Claessen and J. Hughes. "QuickCheck: A Lightweight Tool for Random Testing of Haskell Programs". In: Proceedings of the Fifth ACM SIGPLAN International Conference on Functional Programming. ICFP '00. New York, NY, USA: ACM, 2000, pp. 268–279.

[25] G. Hutton. "Higher-order Functions for Parsing". In: Journal of Functional Programming 2.3 (July 1992), pp. 323–343.

[26] S. D. Swierstra and L. Duponcheel. "Deterministic, Error-Correcting Combinator Parsers". In: Advanced Functional Programming, Second International School-Tutorial Text. London, UK: Springer-Verlag, 1996, pp. 184–207.

[27] P. W. M. Koopman and M. J. Plasmeijer. "Efficient Combinator Parsers". In: Selected Papers from the 10th International Workshop on Implementation of Functional Languages. IFL '98. London, UK: Springer-Verlag, 1999, pp. 120–136.

[28] G. Hutton and E. Meijer. "Monadic Parsing in Haskell". In: J. Funct. Program. 8.4 (July 1998), pp. 437–444.

[29] D. Leijen. Parsec, a Fast Combinator Parser. Tech. rep. 35. Department of Computer Science, University of Utrecht (RUU), Oct. 2001.

[30] M. Abadi and L. Cardelli. "An Imperative Object Calculus". In: Proceedings of the 6th International Joint Conference CAAP/FASE on Theory and Practice of Software Development. TAPSOFT '95. London, UK: Springer-Verlag, 1995, pp. 471–485.

[31] A. Igarashi, B. C. Pierce, and P. Wadler. "Featherweight Java: A Minimal Core Calculus for Java and GJ". In: ACM Trans. Program. Lang. Syst. 23.3 (May 2001), pp. 396–450.

[32] D. Kitchin, W. R. Cook, and J. Misra. "A Language for Task Orchestration and Its Semantic Properties". In: Proceedings of the 17th International Conference on Concurrency Theory. CONCUR '06. Bonn, Germany: Springer-Verlag, 2006, pp. 477–491.

[33] I. Wehrman et al. "A Timed Semantics of Orc". In: Theor. Comput. Sci. 402.2-3 (July 2008), pp. 234–248.

[34] J. Niehren, J. Schwinghammer, and G. Smolka. "A Concurrent Lambda Calculus with Futures". In: Theor. Comput. Sci. 364.3 (Nov. 2006), pp. 338–356.

[35] F. S. de Boer, D. Clarke, and E. B. Johnsen. "A Complete Guide to the Future". In: Proceedings of the 16th European Conference on Programming. ESOP '07. Braga, Portugal: Springer-Verlag, 2007, pp. 316–330.

[36] C. Flanagan and M. Felleisen. "The Semantics of Future and an Application". In: J. Funct. Program. 9.1 (Jan. 1999), pp. 1–31.

[37] M. Felleisen and R. Hieb. "The Revised Report on the Syntactic Theories of Sequential Control and State". In: Theor. Comput. Sci. 103.2 (Sept. 1992), pp. 235–271.


[38] G. D. Plotkin. A Structural Approach to Operational Semantics. Tech. rep. DAIMI FN-19. Denmark: Aarhus University, 1981.

[39] R. Milner. "The Polyadic pi-Calculus: A Tutorial". In: Logic and Algebra of Specification. Ed. by F. L. Bauer, W. Brauer, and H. Schwichtenberg. Berlin, Heidelberg: Springer, 1993, pp. 203–246.

[40] R. Milner. Functions as Processes. Research Report RR-1154. 1990.

[41] A. K. Wright and M. Felleisen. "A Syntactic Approach to Type Soundness". In: Inf. Comput. 115.1 (Nov. 1994), pp. 38–94.

[42] P. Martin-Löf. "Constructive Mathematics and Computer Programming". In: Proc. of a Discussion Meeting of the Royal Society of London on Mathematical Logic and Programming Languages. London, United Kingdom: Prentice-Hall, Inc., 1985, pp. 167–184.

[43] N. Benton and A. Kennedy. "Exceptional Syntax". In: J. Funct. Program. 11.4 (July 2001), pp. 395–410.

[44] P. Wadler. "Deforestation: Transforming Programs to Eliminate Trees". In: Theor. Comput. Sci. 73.2 (Jan. 1988), pp. 231–248.

[45] A. Gill, J. Launchbury, and S. L. Peyton Jones. "A Short Cut to Deforestation". In: Proceedings of the Conference on Functional Programming Languages and Computer Architecture. FPCA '93. Copenhagen, Denmark: ACM, 1993, pp. 223–232.

[46] A. Takano and E. Meijer. "Shortcut Deforestation in Calculational Form". In: Proceedings of the Seventh International Conference on Functional Programming Languages and Computer Architecture. FPCA '95. La Jolla, California, USA: ACM, 1995, pp. 306–313.

[47] G. Morrisett, M. Felleisen, and R. Harper. "Abstract Models of Memory Management". In: Proceedings of the Seventh International Conference on Functional Programming Languages and Computer Architecture. FPCA '95. La Jolla, California, USA: ACM, 1995, pp. 66–77.
