rubyslava slides-26.09.2013

Rubyslava / PyVo #32 26.09.2013 Imperative versus Functional Programming Jan Herich @janherich @itedge

Upload: jan-herich

Post on 05-Jul-2015


DESCRIPTION

Presentation about the benefits of immutable data and a functional approach to programming, in contrast with the more traditional imperative style. It compares code that solves the same problem, first written in imperative-style JavaScript and then rewritten in Clojure.

TRANSCRIPT

Page 1: Rubyslava slides-26.09.2013

Rubyslava / PyVo #32

26.09.2013

Imperative versus Functional Programming

Jan Herich

@janherich @itedge

Page 5: Rubyslava slides-26.09.2013

Core aspects of Imperative Programming

● Emphasis on mutable state
➢ In-place modification of variables (memory locations)
➢ The flow of the program is determined by directly checking those memory locations → a typical example is imperative looping:
    for (int i = 1; i < 11; i++) { System.out.println("Count is: " + i); }

● Rooted in a single-threaded premise
➢ Assuming that there is only one thread of execution
➢ So the world is effectively “stopped” when you look at it or change it

● Prevalent in most OO languages
➢ C++, Java, C#, Python, Ruby, etc.

Page 8: Rubyslava slides-26.09.2013

What's wrong with imperative programming?

● Uncoordinated mutation
➢ No built-in facilities in the language to coordinate changes
➢ This can result in brittle systems, even without concurrency
➢ Add concurrency/parallelism and everything only gets worse

● Complecting state and identity in OO
➢ Object reference → identity and value mixed together
➢ An object (identity) is a pointer to the memory that contains the value of its state
➢ There is no way to observe a stable state (even to copy it) without blocking others from changing it
➢ There is no way to associate the identity's state with a different value other than in-place memory mutation
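
A minimal JavaScript sketch of the identity/value point, assuming a hypothetical `account` object that is not part of the talk: plain assignment copies the reference (the identity), so a "snapshot" taken this way keeps changing along with the original.

```javascript
// Hypothetical example object; all names here are illustrative only.
const account = { owner: "alice", balance: 100 };

// Plain assignment copies the reference (identity), not the value.
const snapshot = account;

// A change made through one name is visible through the other.
account.balance -= 30;
console.log(snapshot.balance); // 70, the "snapshot" changed too

// To keep a stable value we have to copy it before anyone mutates the original.
const stableCopy = { ...account };
account.balance -= 20;
console.log(stableCopy.balance); // 70
console.log(account.balance);    // 50
```

The copy only stays stable because nothing else mutated `account` while it was being made, which is the coordination problem the slide describes.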

Page 11: Rubyslava slides-26.09.2013

Example of harmful mutation

Our initial objects:

    var record1 = {state_id: 'S2', county_id: 'C1', population: 3439, area: 97};
    var record2 = {state_id: 'S5', county_id: 'C2', population: 85345, area: 128};
    var record3 = {state_id: 'S2', county_id: 'C3', population: 7435, area: 157};

A reasonably nice function without side effects:

    var groupRecords = function(records, key) {
      var groups = {};
      for (var i = 0; i < records.length; i++) {
        var current = records[i];
        var keyValue = current[key];
        // Start a new group the first time a key value is seen
        if (!groups.hasOwnProperty(keyValue)) {
          groups[keyValue] = [];
        }
        groups[keyValue].push(current);
      }
      return groups;
    };

An ugly function with side effects, written by an inexperienced programmer who doesn't know much about object references:

    var sumRecords = function(records) {
      var first = records[0];
      for (var i = 1; i < records.length; i++) {
        var current = records[i];
        // Accumulates directly into the first record, mutating it
        first.population += current.population;
        first.area += current.area;
      }
      return first;
    };

New grouped data structure (grouped by state_id, so that grouped.S2 exists):

    var grouped = groupRecords([record1, record2, record3], 'state_id');

We don't expect this function call to mutate the original object record1, but it does:

    var state2summed = sumRecords(grouped.S2);
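
The effect can be checked directly. The sketch below reuses two of the slide's records and its mutating `sumRecords`, and adds a hypothetical `sumRecordsCopy` (not from the talk) that accumulates into a fresh object instead, as one possible imperative-style fix.

```javascript
// Same shape of data as on the slide.
const record1 = { state_id: "S2", county_id: "C1", population: 3439, area: 97 };
const record3 = { state_id: "S2", county_id: "C3", population: 7435, area: 157 };

// The mutating version from the slide.
function sumRecords(records) {
  const first = records[0];
  for (let i = 1; i < records.length; i++) {
    first.population += records[i].population;
    first.area += records[i].area;
  }
  return first;
}

// Hypothetical alternative: accumulate into a fresh copy of the first record.
function sumRecordsCopy(records) {
  const total = { ...records[0] };
  for (let i = 1; i < records.length; i++) {
    total.population += records[i].population;
    total.area += records[i].area;
  }
  return total;
}

const safe = sumRecordsCopy([record1, record3]);
console.log(safe.population, record1.population);    // 10874 3439

const summed = sumRecords([record1, record3]);
console.log(summed === record1, record1.population); // true 10874
```

The copying version works, but only by convention: nothing in the language stops the next function from mutating its arguments again.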

Page 15: Rubyslava slides-26.09.2013

How can we do better?

● What if every data structure in your program were immutable?
➢ It would solve our problem of mutable references leaking all over the codebase

● But how can we model any progress if everything is static?
➢ It's nice to be safe, but how can we actually accomplish anything if everything is immutable and we can't change it? It turns out we can, if we use persistent data structures → the fact that we can't change something in place doesn't mean that we can't model progress

● What is a persistent data structure?
➢ A persistent data structure is a data structure that always preserves the previous version of itself when it is modified. Such data structures are effectively immutable, as their operations do not (visibly) update the structure in place, but instead always yield a new updated structure
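
As a rough illustration of that definition, here is a tiny persistent list in JavaScript. The helpers `cons`, `EMPTY` and `toArray` are hypothetical names chosen for this sketch; it is not how Clojure implements its collections.

```javascript
// Hypothetical persistent singly linked list; every cell is frozen.
const EMPTY = Object.freeze({ empty: true });

function cons(head, tail) {
  return Object.freeze({ head, tail, empty: false });
}

// Helper for printing: walks the list and collects its elements.
function toArray(list) {
  const out = [];
  for (let cell = list; !cell.empty; cell = cell.tail) {
    out.push(cell.head);
  }
  return out;
}

const v1 = cons("b", cons("a", EMPTY));
const v2 = cons("c", v1); // "modifying" v1 yields a new version

console.log(toArray(v1));    // [ 'b', 'a' ]      the previous version is preserved
console.log(toArray(v2));    // [ 'c', 'b', 'a' ] the new version
console.log(v2.tail === v1); // true
```

Adding an element never touches the existing cells, so anyone still holding `v1` keeps seeing exactly the value they started with.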

Page 16: Rubyslava slides-26.09.2013

How persistent data structures work

● Creation of new data structures must be fast and efficient.

● This is achieved by structural sharing.

● Obviously, garbage collection is a must in this case.

[Figure: tree xs with nodes d, b, g, a, c, f, h, and tree ys with new nodes d', g', f' and e, sharing the unchanged nodes of xs]
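
Structural sharing in a tree can be sketched with path copying: inserting a key copies only the nodes on the path from the root to the insertion point and reuses every other subtree. The `node` and `insert` functions below are hypothetical helpers for a plain unbalanced binary search tree, using the node labels from the slide; real persistent collections use more elaborate trees.

```javascript
// Hypothetical immutable binary search tree node.
function node(key, left = null, right = null) {
  return Object.freeze({ key, left, right });
}

// Returns a new tree that contains key; only nodes on the search path are copied.
function insert(tree, key) {
  if (tree === null) return node(key);
  if (key < tree.key) return node(tree.key, insert(tree.left, key), tree.right);
  if (key > tree.key) return node(tree.key, tree.left, insert(tree.right, key));
  return tree; // key already present, reuse the whole subtree
}

// xs: d at the root, b and g below it, leaves a, c, f, h.
const xs = ["d", "b", "g", "a", "c", "f", "h"].reduce(insert, null);

// ys: xs plus e. Only d, g and f are copied; e is new.
const ys = insert(xs, "e");

console.log(ys.left === xs.left);               // true, subtree b is shared
console.log(ys.right.right === xs.right.right); // true, leaf h is shared
console.log(ys.right === xs.right);             // false, g was copied
console.log(xs.right.left.left);                // null, xs is unchanged
console.log(ys.right.left.left.key);            // e
```

Once nobody references `xs` any more, the old copies of d, g and f become garbage, which is why the slide calls garbage collection a must.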

Page 19: Rubyslava slides-26.09.2013

Our example revisited

Our initial references to VALUES:

    (def record-1 {:state-id "S2" :county-id "C1" :population 3439 :area 97})
    (def record-2 {:state-id "S5" :county-id "C2" :population 85345 :area 128})
    (def record-3 {:state-id "S2" :county-id "C3" :population 7435 :area 157})

The grouping function, now built on reduce:

    (defn group-records [records key]
      (reduce (fn [accumulator record]
                (let [key-val (get record key)
                      subrecords (get accumulator key-val [])]
                  (assoc accumulator key-val (conj subrecords record))))
              {}
              records))

New grouped VALUE:

    (def grouped (group-records [record-1 record-2 record-3] :state-id))

Our inexperienced programmer and his function again:

    (defn alter-records [records]
      ;; the first record remains unchanged
      ;; because it is a reference
      ;; to a value, and values don't
      ;; change :)
      (assoc (first records) :population 0))

We created a new VALUE state-2-altered, but our record-1 remains unchanged, even though those two VALUES partially share their structure:

    (def state-2-altered (alter-records (grouped "S2")))
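
For readers more at home in JavaScript, a loose approximation of this behaviour can be had with `Object.freeze`. The function names below are hypothetical, and the analogy is imperfect: `Object.freeze` is shallow, and the spread copy duplicates all fields instead of sharing structure the way Clojure's maps do.

```javascript
// Hypothetical frozen record standing in for a Clojure map value.
const record1 = Object.freeze({ state_id: "S2", county_id: "C1", population: 3439, area: 97 });

// In-place "update": in strict mode, writing to a frozen object throws a TypeError.
function resetPopulationInPlace(record) {
  "use strict";
  record.population = 0;
  return record;
}

// Non-destructive "update", similar in spirit to (assoc record :population 0).
function resetPopulation(record) {
  return Object.freeze({ ...record, population: 0 });
}

let mutationRejected = false;
try {
  resetPopulationInPlace(record1);
} catch (e) {
  mutationRejected = e instanceof TypeError;
}

const altered = resetPopulation(record1);

console.log(mutationRejected);   // true
console.log(record1.population); // 3439
console.log(altered.population); // 0
```

In Clojure no such opt-in is needed, because the core collections are immutable by default.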

Page 20: Rubyslava slides-26.09.2013

Key points from our Imperative/Functional comparison

● In our imperative example, we created many more identities than we really needed.

● The identities we created (record1, record2, record3) would be better modeled as values instead.

● The value of values should not be undervalued :)

Page 21: Rubyslava slides-26.09.2013

But what if we really need identities?

● Most programs need identities
➢ There are programs resembling huge functions, such as compilers, but most other programs need to model identities

● It's worthwhile to separate identity and state
➢ Instead of thinking about an identity's state as the contents of a particular memory block, it's better to think about it as a value currently associated with the identity
➢ The identity can be in different states at different times, but the state itself doesn't change
➢ Thus, the identity is not a state; the identity has a state, exactly one at any point in time

Page 22: Rubyslava slides-26.09.2013

How do we model identities?

● We need atomic references to values
➢ Because every 'value-swap' of the identity (remember, every identity has a state, which is an immutable value) needs to be atomic (similar to atomic database commits, always resulting in a consistent database state)
➢ In Clojure, those changes to references are controlled and coordinated by the system, so the cooperation is neither optional nor manual
➢ The world moves forward due to the cooperative efforts of its participants, and the programming language/system, Clojure, is in charge of world consistency management. The value of a reference (state of an identity) is always observable without coordination, and freely shareable between threads

Page 23: Rubyslava slides-26.09.2013

Example of atomic reference in Clojure

We define our initial cache data:

    (def initial-data ["D1" "D2" "D3"])

Now we create a special atomic reference named cache:

    (def cache (atom initial-data))

We read the cache into an intermediate variable:

    (def cache-data (deref cache))

The result is true:

    (= initial-data cache-data)

We swap the cache for a different value; in this case we add "D4" and "D5":

    (swap! cache conj "D4" "D5")

Whenever the cache is dereferenced from now on, we get the new data:

    (deref cache)

But the earlier cache reading cache-data is still unchanged, so this remains true:

    (= initial-data cache-data)
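
The same idea can be approximated in JavaScript. The `Atom` class and `conj` helper below are hypothetical stand-ins written for this sketch, not Clojure's implementation: JavaScript code runs on a single thread here, so the swap is trivially atomic, whereas Clojure's `swap!` has to handle concurrent updates from several threads.

```javascript
// Hypothetical Atom: an identity that points at one immutable value at a time.
class Atom {
  constructor(value) {
    this.value = value;
  }

  // Read the value currently associated with the identity.
  deref() {
    return this.value;
  }

  // Replace the current value with f(current, ...args) and return the new value.
  swap(f, ...args) {
    this.value = f(this.value, ...args);
    return this.value;
  }
}

// Non-destructive append, standing in for Clojure's conj on vectors.
const conj = (items, ...added) => Object.freeze([...items, ...added]);

const initialData = Object.freeze(["D1", "D2", "D3"]);
const cache = new Atom(initialData);
const cacheData = cache.deref();

cache.swap(conj, "D4", "D5");

console.log(cache.deref());             // [ 'D1', 'D2', 'D3', 'D4', 'D5' ]
console.log(cacheData === initialData); // true, the earlier reading is unchanged
console.log(cacheData.length);          // 3
```

Only the reference inside the atom changes; every value it has ever pointed to stays exactly as it was.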

Page 24: Rubyslava slides-26.09.2013

There is a lot more to discover

● There are more reference types in Clojure, but they are out of scope for this talk

● Beyond that, Clojure has many more cool features that are hard to find in most other languages, such as multimethods or true macros for metaprogramming

● Visit http://clojure.org/ to learn more