a typed foreign function interface for mlmom22/fp/lec14-ctypes.pdf · foreign function interfaces...

67
A typed foreign function interface for ML Jeremy Yallop 26 November 2013 1 / 18

Upload: voliem

Post on 23-May-2018

225 views

Category:

Documents


1 download

TRANSCRIPT

A typed foreign function

interface for ML

Jeremy Yallop

26 November 2013

1 / 18

Itinerary

Background / using ctypes / inside ctypes

2 / 18

Itinerary

Background / using ctypes / inside ctypes

3 / 18

Foreign function interfaces

Function calls between (e.g.) ML and C

Calling system libraries or other C code from an ML program. RegisteringML functions as callbacks.

Different views of data

ML data is a tagged graph. C data is untagged and essentially flat.

Integration between runtimes

GC vs manual memory management. Possibly different calling conventions,

etc.

4 / 18

Foreign function interfaces

Function calls between (e.g.) ML and CCalling system libraries or other C code from an ML program. RegisteringML functions as callbacks.

Different views of data

ML data is a tagged graph. C data is untagged and essentially flat.

Integration between runtimes

GC vs manual memory management. Possibly different calling conventions,

etc.

4 / 18

Foreign function interfaces

Function calls between (e.g.) ML and CCalling system libraries or other C code from an ML program. RegisteringML functions as callbacks.

Different views of dataML data is a tagged graph. C data is untagged and essentially flat.

Integration between runtimes

GC vs manual memory management. Possibly different calling conventions,

etc.

4 / 18

Foreign function interfaces

Function calls between (e.g.) ML and CCalling system libraries or other C code from an ML program. RegisteringML functions as callbacks.

Different views of dataML data is a tagged graph. C data is untagged and essentially flat.

Integration between runtimesGC vs manual memory management. Possibly different calling conventions,

etc.

4 / 18

OCaml’s FFI

A single value representation

Values are immediates or pointers to blocks, distinguished by the low bits.

Macros for accessing OCaml values

Macros (Val int / Int val) for converting between C integers and taggedintegers. Macros and functions (caml alloc string / String val &c.) forallocating/accessing blocks.

Macros for interacting with the GC

Macros (CAMLParam* / CAMLreturn) for registering/unregistering local

values with the runtime.

5 / 18

OCaml’s FFI

A single value representationValues are immediates or pointers to blocks, distinguished by the low bits.

Macros for accessing OCaml values

Macros (Val int / Int val) for converting between C integers and taggedintegers. Macros and functions (caml alloc string / String val &c.) forallocating/accessing blocks.

Macros for interacting with the GC

Macros (CAMLParam* / CAMLreturn) for registering/unregistering local

values with the runtime.

5 / 18

OCaml’s FFI

A single value representationValues are immediates or pointers to blocks, distinguished by the low bits.

Macros for accessing OCaml valuesMacros (Val int / Int val) for converting between C integers and taggedintegers. Macros and functions (caml alloc string / String val &c.) forallocating/accessing blocks.

Macros for interacting with the GC

Macros (CAMLParam* / CAMLreturn) for registering/unregistering local

values with the runtime.

5 / 18

OCaml’s FFI

A single value representationValues are immediates or pointers to blocks, distinguished by the low bits.

Macros for accessing OCaml valuesMacros (Val int / Int val) for converting between C integers and taggedintegers. Macros and functions (caml alloc string / String val &c.) forallocating/accessing blocks.

Macros for interacting with the GCMacros (CAMLParam* / CAMLreturn) for registering/unregistering local

values with the runtime.

5 / 18

OCaml’s FFI: pitfalls

An example C stub

6 / 18

OCaml’s FFI: pitfalls

char *first line(size t max bytes, const char *filename){

char *buf = malloc(max bytes);if (buf == NULL) return NULL;

FILE *fp = fopen(filename, "r");if (fp != NULL) {

fgets(buf, max bytes, fp);fclose(fp);

}else { free(buf); buf = NULL; }

return buf;}

C code7 / 18

OCaml’s FFI: pitfalls

value first line(value max bytes, value filename){

char *buf = malloc(max bytes);if (buf == NULL) return NULL;

FILE *fp = fopen(filename, "r");if (fp != NULL) {

fgets(buf, max bytes, fp);fclose(fp);

}else { free(buf); buf = NULL; }

return buf;}

Parameters/return values become value7 / 18

OCaml’s FFI: pitfalls

value first line(value max bytes, value filename){

CAMLparam2(filename, max bytes);char *buf = malloc(max bytes);if (buf == NULL) return NULL;

FILE *fp = fopen(filename, "r");if (fp != NULL) {

fgets(buf, max bytes, fp);fclose(fp);

}else { free(buf); buf = NULL; }

CAMLreturn(buf);}

Add GC hooks for parameters7 / 18

OCaml’s FFI: pitfalls

value first line(value max bytes, value filename){

CAMLparam2(filename, max bytes);CAMLlocal1(buf);buf = caml alloc string(max bytes);

FILE *fp = fopen(filename, "r");if (fp != NULL) {

fgets(buf, max bytes, fp);fclose(fp);

}else failwith("fopen failed");

CAMLreturn(buf);}

Allocate the buffer in the OCaml heap

7 / 18

OCaml’s FFI: pitfalls

value first line(value max bytes, value filename){

CAMLparam2(filename, max bytes);const char *c filename = String val(filename);CAMLlocal1(buf);buf = caml alloc string(max bytes);

FILE *fp = fopen(c filename, "r");if (fp != NULL) {

fgets(String val(buf), max bytes, fp);fclose(fp);

}else failwith("fopen failed");

CAMLreturn(buf);}

Extract string addresses to pass to C

7 / 18

OCaml’s FFI: pitfalls

external first line : string −> int −> string = "first_line"

value first line(value max bytes, value filename){

CAMLparam2(filename, max bytes);const char *c filename = String val(filename);CAMLlocal1(buf);buf = caml alloc string(max bytes);

FILE *fp = fopen(c filename, "r");if (fp != NULL) {

fgets(String val(buf), max bytes, fp);fclose(fp);

}else failwith("fopen failed");

CAMLreturn(buf);}

Add an OCaml declaration7 / 18

OCaml’s FFI: pitfalls

external first line : string −> int −> string = "first_line"

value first line(value max bytes, value filename){

CAMLparam2(filename, max bytes);const char *c filename = String val(filename);CAMLlocal1(buf);buf = caml alloc string(max bytes);

FILE *fp = fopen(c filename, "r");if (fp != NULL) {

fgets(String val(buf), max bytes, fp);fclose(fp);

}else failwith("fopen failed");

CAMLreturn(buf);}

Compiles successfully!

7 / 18

OCaml’s FFI: pitfalls

external first line : string −> int −> string = "first_line"

value first line(value max bytes, value filename){

CAMLparam2(filename, max bytes);const char *c filename = String val(filename);CAMLlocal1(buf);buf = caml alloc string(max bytes);

FILE *fp = fopen(c filename, "r");if (fp != NULL) {

fgets(String val(buf), max bytes, fp);fclose(fp);

}else failwith("fopen failed");

CAMLreturn(buf);}

Bug: parameters interchanged (crash!)

7 / 18

OCaml’s FFI: pitfalls

external first line : string −> int −> string = "first_line"

value first line(value max bytes, value filename){

CAMLparam2(filename, max bytes);const char *c filename = String val(filename);CAMLlocal1(buf);buf = caml alloc string(max bytes);

FILE *fp = fopen(c filename, "r");if (fp != NULL) {

fgets(String val(buf), max bytes, fp);fclose(fp);

}else failwith("fopen failed");

CAMLreturn(buf);}

Bug: invalidated pointer (crash?)

7 / 18

OCaml’s FFI: pitfalls

external first line : string −> int −> string = "first_line"

value first line(value max bytes, value filename){

CAMLparam2(filename, max bytes);const char *c filename = String val(filename);CAMLlocal1(buf);buf = caml alloc string(max bytes);

FILE *fp = fopen(c filename, "r");if (fp != NULL) {

fgets(String val(buf), max bytes, fp);fclose(fp);

}else failwith("fopen failed");

CAMLreturn(buf);}

Bug: missing conversion (misbehaviour)

7 / 18

Itinerary

Background / using ctypes / inside ctypes

8 / 18

Outline of the ctypes approach

OCaml, not C

Access C values from OCaml, not vice-versa. Why? abstraction, typesafety, automatic memory management, . . . .

Types, not values

Types are sufficient to determine the interface.

“What,” not “how”

Build a typed embedded DSL, separating construction from interpretation.

9 / 18

Outline of the ctypes approach

OCaml, not CAccess C values from OCaml, not vice-versa. Why? abstraction, typesafety, automatic memory management, . . . .

Types, not values

Types are sufficient to determine the interface.

“What,” not “how”

Build a typed embedded DSL, separating construction from interpretation.

9 / 18

Outline of the ctypes approach

OCaml, not CAccess C values from OCaml, not vice-versa. Why? abstraction, typesafety, automatic memory management, . . . .

Types, not valuesTypes are sufficient to determine the interface.

“What,” not “how”

Build a typed embedded DSL, separating construction from interpretation.

9 / 18

Outline of the ctypes approach

OCaml, not CAccess C values from OCaml, not vice-versa. Why? abstraction, typesafety, automatic memory management, . . . .

Types, not valuesTypes are sufficient to determine the interface.

“What,” not “how”Build a typed embedded DSL, separating construction from interpretation.

9 / 18

Ctypes in action

[demo: hello, world]

10 / 18

Ctypes in action

11 / 18

Ctypes in action

11 / 18

Ctypes in action

11 / 18

Ctypes in action

11 / 18

Ctypes in action

11 / 18

Ctypes in action

11 / 18

Ctypes in action

11 / 18

Ctypes in action

11 / 18

Ctypes in action

11 / 18

Ctypes in action

11 / 18

Ctypes in action

11 / 18

Ctypes in action

11 / 18

Ctypes in action

11 / 18

Ctypes in action

11 / 18

Ctypes in action

11 / 18

Ctypes in action

11 / 18

Ctypes in action

11 / 18

Ctypes in action

11 / 18

Embedded DSLs

Tailored to a specific domain

Parsing, database queries, music, financial contracts, graphics, &c.

Host language functions for building terms

Can also borrow host language binding constructs.

Host language types for typing terms

For example, use a subset of ML types to type SQL tables or C types.

Separate building terms from interpretation

The meaning of “declarative.” Allows multiple interpretations.

12 / 18

Embedded DSLs

Tailored to a specific domainParsing, database queries, music, financial contracts, graphics, &c.

Host language functions for building terms

Can also borrow host language binding constructs.

Host language types for typing terms

For example, use a subset of ML types to type SQL tables or C types.

Separate building terms from interpretation

The meaning of “declarative.” Allows multiple interpretations.

12 / 18

Embedded DSLs

Tailored to a specific domainParsing, database queries, music, financial contracts, graphics, &c.

Host language functions for building termsCan also borrow host language binding constructs.

Host language types for typing terms

For example, use a subset of ML types to type SQL tables or C types.

Separate building terms from interpretation

The meaning of “declarative.” Allows multiple interpretations.

12 / 18

Embedded DSLs

Tailored to a specific domainParsing, database queries, music, financial contracts, graphics, &c.

Host language functions for building termsCan also borrow host language binding constructs.

Host language types for typing termsFor example, use a subset of ML types to type SQL tables or C types.

Separate building terms from interpretation

The meaning of “declarative.” Allows multiple interpretations.

12 / 18

Embedded DSLs

Tailored to a specific domainParsing, database queries, music, financial contracts, graphics, &c.

Host language functions for building termsCan also borrow host language binding constructs.

Host language types for typing termsFor example, use a subset of ML types to type SQL tables or C types.

Separate building terms from interpretationThe meaning of “declarative.” Allows multiple interpretations.

12 / 18

Embedded DSLs: multiple

interpretations in ctypes

Interpretation

Dynamic binding, dynamic call construction. Interactive, but withinterpretative overhead and some loss of safety.

Compilation

Generation of C stubs from ctypes values. Type safe and efficient but withsome complexity in the build system.

Multi-process implementation

Sandbox C libraries to contain memory corruption. Intriguing possibilities:fork-based debugging, improved parallelism, . . .

etc.13 / 18

Embedded DSLs: multiple

interpretations in ctypes

InterpretationDynamic binding, dynamic call construction. Interactive, but withinterpretative overhead and some loss of safety.

Compilation

Generation of C stubs from ctypes values. Type safe and efficient but withsome complexity in the build system.

Multi-process implementation

Sandbox C libraries to contain memory corruption. Intriguing possibilities:fork-based debugging, improved parallelism, . . .

etc.13 / 18

Embedded DSLs: multiple

interpretations in ctypes

InterpretationDynamic binding, dynamic call construction. Interactive, but withinterpretative overhead and some loss of safety.

CompilationGeneration of C stubs from ctypes values. Type safe and efficient but withsome complexity in the build system.

Multi-process implementation

Sandbox C libraries to contain memory corruption. Intriguing possibilities:fork-based debugging, improved parallelism, . . .

etc.13 / 18

Embedded DSLs: multiple

interpretations in ctypes

InterpretationDynamic binding, dynamic call construction. Interactive, but withinterpretative overhead and some loss of safety.

CompilationGeneration of C stubs from ctypes values. Type safe and efficient but withsome complexity in the build system.

Multi-process implementationSandbox C libraries to contain memory corruption. Intriguing possibilities:fork-based debugging, improved parallelism, . . .

etc.13 / 18

ctypes back in action

[demo: multi-process implementation]

14 / 18

ctypes back in action

15 / 18

ctypes back in action

15 / 18

ctypes back in action

15 / 18

ctypes back in action

15 / 18

ctypes back in action

15 / 18

ctypes back in action

15 / 18

ctypes back in action

15 / 18

ctypes back in action

15 / 18

Itinerary

Background / using ctypes / inside ctypes

16 / 18

Levels of type safety

algebraic data typesunsafe interface, unsafe implementation.

phantom typessafe interface, unsafe implementation.

generalized algebraic data typessafe interface, safe (and efficient!) implementation.

17 / 18

Other Embedded DSLs

parsing (parsec)

SQL

financial contracts

music

graphics

etc.

18 / 18