wtf is a method cache?

Upload: james-golick

Post on 10-May-2015

DESCRIPTION

Talk about the method caching patches I wrote that led to jamesgolick ruby. Given at RuPy 2013 in Budapest.

TRANSCRIPT

Page 1: WTF is a Method Cache?

BitLove

What the fuck is a method cache?

James Golick

Thursday, 17 July, 14

Page 2: WTF is a Method Cache?

James Golick

writing: http://jamesgolick.com

code: https://github.com/jamesgolick

shit talk: https://twitter.com/jamesgolick

podcast: http://realtalk.io

Page 3: WTF is a Method Cache?

@jamesgolick

Page 4: WTF is a Method Cache?

DISCLAIMER: COMPUTOLOGY AHEAD

Page 6: WTF is a Method Cache?

Big “O” Notation

Page 7: WTF is a Method Cache?

stuff.each do |thing|
  # ...
end

Variable Time

Page 8: WTF is a Method Cache?

a = stuff.pop
a = !!a
stuff.unshift a

Constant Time

Page 9: WTF is a Method Cache?

What the fuck is a method cache?

Page 10: WTF is a Method Cache?

1. Background

Page 11: WTF is a Method Cache?

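struct RClass {
    struct RClass *super;     /* parent class; NULL at the top of the hierarchy */
    struct st_table *m_tbl;   /* method table: method name -> method entry */
};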

Page 12: WTF is a Method Cache?

Method Resolution

Page 13: WTF is a Method Cache?

class A
  def a
    puts 'Hi!'
  end
end

class B < A; end
class C < B; end
class D < C; end
class E < D; end
class F < E; end

F.new.a

Page 14: WTF is a Method Cache?

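rb_method_entry_t *
vm_resolve_method(struct RClass *klass, symbol_t method_name)
{
    /* Simplified: the real st_lookup() reports a hit through an out-parameter. */
    rb_method_entry_t *ent = st_lookup(klass->m_tbl, method_name);

    if (ent) {
        return ent;
    } else {
        if (klass->super) {
            /* Not defined here: walk one level up the hierarchy. */
            return vm_resolve_method(klass->super, method_name);
        } else {
            return NULL;
        }
    }
}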

Page 15: WTF is a Method Cache?

Module Inclusion

Page 16: WTF is a Method Cache?

module A
  def a
    "Hello, World!"
  end
end

module B; include A; end
module C; include B; end

class D
  include C
end

D.new.a

Page 17: WTF is a Method Cache?

module A ICLASS ICLASS ICLASS

module B ICLASS ICLASS

module C ICLASS

class D

Page 18: WTF is a Method Cache?

irb> ActiveRecord::Base.included_modules.length
=> 71
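
To see how included modules lengthen the lookup chain, the module example from the earlier slide can be inspected with `Module#ancestors`. This is a small sketch; the `InclusionDemo` namespace is hypothetical and only keeps the constants from clashing with other code.

```ruby
# Hypothetical namespace (not from the talk) wrapping the slide's example.
module InclusionDemo
  module A
    def a
      "Hello, World!"
    end
  end

  module B; include A; end
  module C; include B; end

  class D
    include C
  end
end

d = InclusionDemo::D

# Every included module becomes one more link that method resolution
# has to walk before it even reaches the superclass (Object).
chain = d.ancestors.take_while { |mod| mod != Object }
puts chain.inspect

# The total count also includes modules mixed into Object, such as Kernel.
puts d.included_modules.length
```

The chain ahead of `Object` holds D, then C, B and A, so `D.new.a` is only found at the fourth link even though D has no explicit superclass.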

Page 19: WTF is a Method Cache?

Summary

• Methods are stored in a hashtable on the class where they’re defined.

• Method resolution is a variable time algorithm whose complexity depends on the depth of your class hierarchy.

• Module inclusion substantially increases the depth of your class hierarchy, especially if those modules themselves include modules.

• Method resolution is expensive.
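
The points above can be approximated in plain Ruby. The sketch below is a toy model, not the VM's code: `ResolutionDemo` and its `resolve` helper are hypothetical names, and `ancestors` stands in for following `super` pointers. It counts how many per-class method tables are consulted before a method is found.

```ruby
# Hypothetical namespace wrapping the class chain from the earlier slide.
module ResolutionDemo
  class A
    def a
      'Hi!'
    end
  end

  class B < A; end
  class C < B; end
  class D < C; end
  class E < D; end
  class F < E; end

  # Toy lookup: check each ancestor's own method table in turn and
  # report where the method was found and how many tables were checked.
  def self.resolve(klass, name)
    steps = 0
    klass.ancestors.each do |mod|
      steps += 1
      own = mod.instance_methods(false) + mod.private_instance_methods(false)
      return [mod, steps] if own.include?(name)
    end
    [nil, steps]
  end
end

owner, steps = ResolutionDemo.resolve(ResolutionDemo::F, :a)
puts "#{owner} found after #{steps} table lookups"
```

For `F` the method is found on `A` after six tables, one per level of the hierarchy, which is the cost a method cache tries to avoid paying on every call.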

Page 20: WTF is a Method Cache?

What the fuck is a method cache?

Page 21: WTF is a Method Cache?

2. Method Caching in the pre-jamesgolick era

Page 22: WTF is a Method Cache?

Instruction Caches

Page 23: WTF is a Method Cache?

static uint64_t global_vm_state = 0;

Page 24: WTF is a Method Cache?

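struct inline_cache {
    struct RClass *klass;     /* receiver class seen when the entry was filled */
    uint64_t vm_state;        /* value of global_vm_state at fill time */
    rb_method_entry_t *me;    /* cached method entry */
};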

Page 25: WTF is a Method Cache?

rb_method_entry_t *
vm_search_method(struct RClass *klass, rb_symbol_t method_name,
                 struct inline_cache *ic)
{
    rb_method_entry_t *me;

    if (is_valid_cache_entry(ic, klass)) {
        me = ic->me;                        /* hit: skip resolution entirely */
    } else {
        me = vm_resolve_method(klass, method_name);
        ic->me = me;                        /* refill the inline cache */
        ic->vm_state = GET_VM_STATE();
        ic->klass = klass;
    }
    return me;
}

Page 26: WTF is a Method Cache?

int
is_valid_cache_entry(struct inline_cache *ent, struct RClass *klass)
{
    /* Valid only for the same receiver class and an unchanged global VM state. */
    return ent->klass == klass && ent->vm_state == GET_VM_STATE();
}
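
The validity rule can be modelled in a few lines of Ruby. This is a rough sketch under stated assumptions: `GlobalStateCache` is a hypothetical class, a Hash keyed by class and method name stands in for the per-call-site structs, and `instance_method(name).owner` stands in for the slow resolution path.

```ruby
# Toy model of a cache validated against one global counter.
class GlobalStateCache
  attr_reader :hits, :misses

  def initialize
    @vm_state = 0   # stand-in for global_vm_state
    @entries = {}
    @hits = 0
    @misses = 0
  end

  # Returns the module that defines `name` for `klass`.
  def lookup(klass, name)
    entry = @entries[[klass, name]]
    if entry && entry[:vm_state] == @vm_state
      @hits += 1
    else
      @misses += 1
      # Slow path: full resolution, then refill the entry.
      entry = { vm_state: @vm_state, owner: klass.instance_method(name).owner }
      @entries[[klass, name]] = entry
    end
    entry[:owner]
  end

  # Bumping the counter makes every stored entry stale at once.
  def invalidate!
    @vm_state += 1
  end
end

cache = GlobalStateCache.new
cache.lookup(String, :upcase)   # miss
cache.lookup(String, :upcase)   # hit
cache.lookup(Array, :first)     # miss
cache.invalidate!               # e.g. a method was defined on an unrelated class
cache.lookup(String, :upcase)   # miss again
cache.lookup(Array, :first)     # miss again
puts "hits=#{cache.hits} misses=#{cache.misses}"
```

The single `invalidate!` call costs nothing up front, but every entry, including ones for classes that were never touched, has to be refilled afterwards.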

Page 27: WTF is a Method Cache?

Global Method Cache

Page 28: WTF is a Method Cache?

instruction cache

instruction cache

instruction cache

global cache

method resolution

Page 30: WTF is a Method Cache?

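struct method_cache_entry {
    struct RClass *klass;     /* receiver class seen when the entry was filled */
    uint64_t vm_state;        /* value of global_vm_state at fill time */
    rb_method_entry_t *me;    /* cached method entry */
};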

Page 31: WTF is a Method Cache?

#define METHOD_CACHE_SIZE 2048

static struct method_cache_entry method_cache[METHOD_CACHE_SIZE];

Page 32: WTF is a Method Cache?

rb_method_entry_t *
vm_resolve_method(struct RClass *klass, symbol_t method_name)
{
    /* One slot per (method_name mod size); a colliding name overwrites it. */
    /* Simplified: a complete check would also compare the method name. */
    struct method_cache_entry *ent =
        &method_cache[method_name % METHOD_CACHE_SIZE];
    rb_method_entry_t *me;

    if (is_valid_cache_entry(ent, klass)) {
        me = ent->me;
    } else {
        me = vm_resolve_method_without_cache(klass, method_name);
        ent->me = me;
        ent->vm_state = GET_VM_STATE();
        ent->klass = klass;
    }
    return me;
}

Page 33: WTF is a Method Cache?

int
is_valid_cache_entry(struct method_cache_entry *ent, struct RClass *klass)
{
    /* Valid only for the same receiver class and an unchanged global VM state. */
    return ent->klass == klass && ent->vm_state == GET_VM_STATE();
}
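
A fixed-size table indexed by a modulo has one slot per index, so two keys that map to the same slot keep evicting each other. The sketch below illustrates that behaviour in Ruby; `DirectMappedCache` is a hypothetical name, Integer keys stand in for method-name symbol ids, and the table is shrunk to 4 slots so a collision is easy to trigger.

```ruby
# Toy direct-mapped cache: one slot per (key mod size), no collision handling.
class DirectMappedCache
  attr_reader :misses

  def initialize(size)
    @size = size
    @slots = Array.new(size)
    @misses = 0
  end

  # Returns the cached value for `key`, or computes it with the block,
  # overwriting whatever previously lived in the slot.
  def fetch(key)
    slot = key % @size
    stored_key, value = @slots[slot]
    return value if stored_key == key

    @misses += 1
    value = yield
    @slots[slot] = [key, value]
    value
  end
end

cache = DirectMappedCache.new(4)
cache.fetch(1) { "method one" }    # miss: slot 1 filled
cache.fetch(1) { "method one" }    # hit
cache.fetch(5) { "method five" }   # miss: 5 % 4 == 1, evicts key 1
cache.fetch(1) { "method one" }    # miss again: key 1 was evicted
puts cache.misses
```

Four lookups produce three misses here, even though only two distinct keys were ever requested.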

Page 34: WTF is a Method Cache?

Cache Invalidation

Page 35: WTF is a Method Cache?

static uint64_t global_vm_state = 0;

#define INC_VM_STATE() (global_vm_state++)

void
rb_define_method(struct RClass *klass, symbol_t name, rb_method_entry_t *me)
{
    // ...
    INC_VM_STATE();   /* invalidates every cache entry in the VM at once */
    // ...
}

Page 36: WTF is a Method Cache?

Defining methods.

Aliasing methods.

Removing methods.

Setting or removing constants.

Defining a class.

Defining a module.

Including a module.

things that bust the cache

Page 37: WTF is a Method Cache?

Extending a module.

Using a refinement. (Ruby 2.0)

Garbage collecting a class.

Garbage collecting a module.

Changing the visibility of a constant.

Marshal loading an extended constant.

Autoload.

Built-in non-blocking IO methods.

things that bust the cache

Page 38: WTF is a Method Cache?

OpenStruct instantiation.

things that bust the cache
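
For orientation, here is what a handful of the listed operations look like in everyday Ruby. The sketch only performs the operations; it does not measure any invalidation, and `Greeter` and `Widget` are hypothetical names. The point is that none of this looks unusual, which is why global invalidation ends up happening so often.

```ruby
module Greeter                        # defining a module
  def greet
    "hello"
  end
end

class Widget                          # defining a class
  def name                            # defining a method
    "widget"
  end

  alias_method :title, :name          # aliasing a method
end

Widget.send(:remove_method, :name)    # removing a method
Widget.send(:include, Greeter)        # including a module

obj = Object.new
obj.extend(Greeter)                   # extending an object with a module

puts Widget.new.title                 # the alias survives removal of :name
puts Widget.new.greet
puts obj.greet
```

Each of these lines changes what some method call could resolve to, which is the reason a cache has to react to them at all.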

Page 39: WTF is a Method Cache?

Summary

• Method resolutions are cached in two places.

• Instruction caches are structs attached to the send instruction.

• The global method cache is a hash table fixed at 2048 entries with no collision semantics and a random eviction policy.

• Method cache entries are valid if their `vm_state` property is the same as the current value of the `global_vm_state` counter.

Page 40: WTF is a Method Cache?

Summary

• Method cache invalidation is always global, and happens frequently in most ruby code.

• Method cache invalidation is constant time.

Page 41: WTF is a Method Cache?

Numbers

Page 42: WTF is a Method Cache?

3. jamesgolick Method Caching

Page 43: WTF is a Method Cache?

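struct RClass {
    struct RClass *super;
    struct st_table *m_tbl;
    struct st_table *mc_tbl;             /* per-class method cache */
    uint64_t seq;                        /* current sequence number of this class */
    subclass_list_entry_t *subclasses;   /* linked list of subclasses */
};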

Page 44: WTF is a Method Cache?

static uint64_t rb_vm_sequence = 0;

#define NEXT_SEQ() ++rb_vm_sequence

Page 45: WTF is a Method Cache?

struct RClass *
class_alloc(...)
{
    struct RClass *klass;
    // ...
    klass->seq = NEXT_SEQ();   /* every new class starts with a fresh sequence */
    // ...
    return klass;
}

Page 46: WTF is a Method Cache?

struct inline_cache {
    uint64_t seq;             /* klass->seq at the time the entry was filled */
    rb_method_entry_t *me;    /* cached method entry */
};

Page 47: WTF is a Method Cache?

rb_method_entry_t *
vm_search_method(struct RClass *klass, rb_symbol_t method_name,
                 struct inline_cache *ic)
{
    rb_method_entry_t *me;

    if (ic->seq == klass->seq) {
        me = ic->me;                 /* hit: the class has not changed since fill */
    } else {
        me = vm_resolve_method(klass, method_name);
        ic->me = me;
        ic->seq = klass->seq;
    }
    return me;
}

Page 48: WTF is a Method Cache?

struct method_cache_entry {
    uint64_t seq;
    rb_method_entry_t *me;
};

rb_method_entry_t *
vm_resolve_method(struct RClass *klass, symbol_t method_name)
{
    struct method_cache_entry *ent;
    rb_method_entry_t *me;

    /* Look up the entry in the cache stored on the class itself. */
    ent = vm_get_method_cache_entry(klass, method_name);
    if (ent->seq == klass->seq) {
        me = ent->me;
    } else {
        me = vm_resolve_method_without_cache(klass, method_name);
        ent->me = me;
        ent->seq = klass->seq;
    }
    return me;
}

Page 49: WTF is a Method Cache?

void
rb_clear_cache_by_class(struct RClass *klass)
{
    subclass_list_entry_t *ent;

    klass->seq = NEXT_SEQ();             /* stale-out entries tagged with the old seq */

    ent = klass->subclasses;
    while (ent != NULL) {
        rb_clear_cache_by_class(ent->klass);   /* recurse into each subclass */
        ent = ent->next;
    }
}

Page 50: WTF is a Method Cache?

Object
  ActionController::Base
    UsersController
    SessionsController
  ActiveRecord::Base
    User
    Group

Page 52: WTF is a Method Cache?

Summary

• Both types of method cache entries now only need to store a seq and method entry.

• Method caches are now stored with the RClass structs and are “effectively” unbounded in size.

• Each RClass has a globally unique 64-bit identifier.

• Method cache entries are tagged with the sequence of their target klass at the time the cache entry was filled.

Page 53: WTF is a Method Cache?

Summary

• Entries are valid if their filled entry sequence is the same as the current sequence identifier of the klass that is the target of the invocation.

• Method caches are invalidated by assigning a new sequence value to a klass.

• When changes are made to a klass, we traverse all of its descendants and assign them new sequence values.

• This traversal is unfortunately a variable time algorithm, and can be quite expensive.
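
The scheme summarised above can be played through with a toy model. Everything here is a sketch: `ToyClass` is a hypothetical stand-in for RClass, its `clear_cache` mirrors the shape of `rb_clear_cache_by_class`, and the parent/child links assume the usual Rails hierarchy (models under ActiveRecord::Base, controllers under ActionController::Base).

```ruby
# Toy stand-in for RClass: a sequence number plus a list of subclasses.
class ToyClass
  @next_seq = 0   # stand-in for the global rb_vm_sequence counter

  def self.next_seq
    @next_seq += 1
  end

  attr_reader :name, :seq, :subclasses

  def initialize(name, parent = nil)
    @name = name
    @subclasses = []
    @seq = ToyClass.next_seq
    parent.subclasses << self if parent
  end

  # Assign a fresh sequence to this class and, recursively, to every
  # descendant. Cost grows with the size of the subtree.
  def clear_cache
    @seq = ToyClass.next_seq
    @subclasses.each(&:clear_cache)
  end
end

object  = ToyClass.new("Object")
ar_base = ToyClass.new("ActiveRecord::Base", object)
ac_base = ToyClass.new("ActionController::Base", object)
user    = ToyClass.new("User", ar_base)
group   = ToyClass.new("Group", ar_base)
users_c = ToyClass.new("UsersController", ac_base)

# Cache entries remember the sequence of their class at fill time.
filled = { user => user.seq, group => group.seq, users_c => users_c.seq }

ar_base.clear_cache   # e.g. a method was defined on ActiveRecord::Base

still_valid = filled.select { |klass, seq| klass.seq == seq }.keys.map(&:name)
puts still_valid.inspect
```

Only the entry for `UsersController` survives: the change under `ActiveRecord::Base` touched `User` and `Group` but left the controller side of the tree alone, at the price of walking the affected subtree.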

Page 54: WTF is a Method Cache?

4. rvm install jamesgolick

Page 55: WTF is a Method Cache?

Dat Patch

• Top-down class hierarchy tracking.

• Class#subclasses

• Module#included_in

• Possible future bug fixes.

• Hierarchical method cache invalidation.

• Method cache instrumentation.

Page 56: WTF is a Method Cache?

Instrumentation

• RubyVM::MethodCache.hits

• RubyVM::MethodCache.misses

• RubyVM::MethodCache.miss_time

• RubyVM::MethodCache.invalidation_time
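
A possible way to read these counters is sketched below. The four method names come from the slide; that they take no arguments and return numbers is an assumption, and `method_cache_stats` is a hypothetical helper. On a Ruby build without the patch the constant does not exist, so the helper returns nil instead of raising.

```ruby
# Returns a Hash of counters, or nil when RubyVM::MethodCache is unavailable.
# Assumption: each counter is a zero-argument method returning a number.
def method_cache_stats
  return nil unless defined?(RubyVM::MethodCache)

  cache = RubyVM::MethodCache
  {
    hits: cache.hits,
    misses: cache.misses,
    miss_time: cache.miss_time,
    invalidation_time: cache.invalidation_time
  }
end

stats = method_cache_stats
if stats
  total = stats[:hits] + stats[:misses]
  rate = total.zero? ? 0.0 : stats[:hits].to_f / total
  puts format("hit rate: %.1f%%", rate * 100)
else
  puts "RubyVM::MethodCache is not available on this Ruby"
end
```

Sampling the counters before and after a request and subtracting would give per-request numbers rather than process-wide totals.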

Page 57: WTF is a Method Cache?

Get The Code

• rvm install jamesgolick

• git clone git://github.com/jamesgolick/ruby.git

• https://github.com/jamesgolick/ruby

Page 58: WTF is a Method Cache?

Questions?
