CInt Function Stub Removal
ROOT Team Meeting, CERN
Leandro Franco (Joint work with Diego Marcos)
18-06-07
Modifying CInt? ...
A.K.A : The Piñata paradigm
[Figure labels: CInt; Newbie; Experienced Programmers]
Goal: Obtain the candies from the piñata... without breaking anybody's head.
Simple Idea
● The dictionaries are big: around 52% of the total library size.
● Why don't we just wipe them off from the face of earth?
● Short answer: we can't do it yet, but we will try.
● Long answer: the whole topic of these slides ;)
First steps
● One good way to shrink the dictionaries is to remove the stub functions.
● Such functions come from the need for a generic way to call a function in CInt, and from the impossibility of doing proper name mangling to find that function (i.e. CInt must behave like a compiler but doesn't have the means to do so).
Stub Functions
● To be able to solve the name mangling problem a traditional approach was taken:
“Any problem in computer science can be solved with another layer of indirection”
Wheeler's Law
Stub Functions
[Diagram: two paths to a library function]
Compiler/CInt → library function: mangling at compile time.
CInt → dictionary → library function: pseudo mangling at run time, then mangling at compile time.
The dictionary could be seen as a bijective function that maps C++ function declarations to strings (each string being the one the compiler associates with the symbol).
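To make the extra layer concrete, here is a minimal sketch of what a stub buys us. Everything in it is hypothetical (the `Stub` signature, `stub_A_HiA`, `dictionary()`, `call()` and the body of `A::HiA`); it is not the code CInt actually generates, only an illustration of "one uniform wrapper per function, found by a string".

```cpp
#include <map>
#include <string>
#include <vector>

// Toy class standing in for a real library class (hypothetical body).
struct A {
    int HiA() { return 42; }
};

// Hypothetical uniform signature: every stub looks the same to the interpreter.
using Stub = long (*)(void* self, const std::vector<long>& args);

// One stub per function: it unpacks the generic arguments and makes the real
// call, so the compiler (not the interpreter) resolves the mangled symbol.
long stub_A_HiA(void* self, const std::vector<long>& /*args*/) {
    return static_cast<A*>(self)->HiA();
}

// The "dictionary": maps a function declaration (as a string) to its stub.
const std::map<std::string, Stub>& dictionary() {
    static const std::map<std::string, Stub> table = {
        {"A::HiA()", &stub_A_HiA},
    };
    return table;
}

// Generic call by declaration string; throws std::out_of_range if unknown.
long call(const std::string& header, void* self,
          const std::vector<long>& args = {}) {
    return dictionary().at(header)(self, args);
}
```

With an object `A a;`, `call("A::HiA()", &a)` reaches `A::HiA` without the caller ever knowing the mangled name. The price is one such wrapper per function, which is what makes the dictionaries big.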
Stub Functions
● The idea is to avoid that layer of indirection.
– We still don't know how to do the mangling.
● But we know how to do the demangling (or at least, we know who to call to do it ;) ).
function header (X)    library (Y)
A::A()                 _ZN1AC1Ev
A::HiA()               _ZN1A3HiAEv
Instead of going from set X to set Y for a given x in X...

function header (X)    library (Y)
A::A()                 _ZN1AC1Ev
A::HiA()               _ZN1A3HiAEv
...go from set Y to set X for all y in Y.
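The Y-to-X direction can be sketched with the demangler that ships with GCC and Clang, `abi::__cxa_demangle` from `<cxxabi.h>`. The slides do not say which demangler is actually called, so treat that choice as an assumption; the helper names `demangle` and `build_reverse_map` are made up for this example.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cstdlib>    // std::free
#include <map>
#include <string>
#include <vector>

// Demangle one symbol; returns "" if the input is not a valid mangled name.
std::string demangle(const char* mangled) {
    int status = 0;
    char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    std::string result = (status == 0 && out != nullptr) ? out : "";
    std::free(out);  // the demangler allocates with malloc; free(nullptr) is fine
    return result;
}

// Go from Y to X for every y in Y, then index the result by x so that a
// function header can be looked up to obtain its mangled symbol.
std::map<std::string, std::string>
build_reverse_map(const std::vector<const char*>& symbols) {
    std::map<std::string, std::string> by_header;
    for (const char* y : symbols) {
        const std::string x = demangle(y);
        if (!x.empty()) by_header[x] = y;
    }
    return by_header;
}
```

For the two symbols in the table, `build_reverse_map({"_ZN1AC1Ev", "_ZN1A3HiAEv"})` yields a map in which `"A::HiA()"` points back to `_ZN1A3HiAEv`. Note that the whole symbol list has to be walked before a single lookup can be answered.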
Stub Functions
● This approach sets in stone its biggest side effect:
– We will need to demangle ALL the symbols in a library just to be able to call one function.
● The demangling process might not be too expensive but what happens when we have thousands and thousands of symbols in a library?
Efficiency
● Since we have to demangle all the symbols from the library at least once, we could cache the result.
– Expensive approach: libCore has 21000 symbols with an average length of 46 characters when demangled (i.e. 614 KB in cache).
● Try to demangle as little as possible. Don't do it more than once or twice, and don't even try if the symbols have already been registered.
● I'm not even mentioning the parsing needed between the demangling and the registering.
Are we winning the fight?
● CVS version of ROOT
– Libs size: 74.67 MB
– Objects size (dictionaries): 47.71 MB
– Source size (dictionaries): 50.37 MB
● Current status of pre-experimental version
– Libs size: 65.46 MB ( -9.21 MB, 12%)
– Objects size (dictionaries): 36.42 MB (-11.29 MB, 24%)
– Source size (dictionaries): 37.25 MB (-13.12 MB, 26%)
In every war, sacrifices must be made: space and time overhead
Let's start with a “normal” session
Real time: 0.37 s    Real time: 21.72 s    Rootmarks: 341.97
First Algorithm: be stupid.
Initial attempt: demangle all the symbols in a library for every used class
Real time: 0.76 s    Real time: 38.95 s    Rootmarks: 184
Spikes due to the silliness of the algorithm: first demangle everything and then register it.
Second Algorithm: don't be so stupid
At least remember the classes that have already been registered
Real time: 0.77 s    Real time: 37.29 s    Rootmarks: 183.68
Spikes due to the silliness of the algorithm: first demangle everything and then register it.
Third Algorithm: use the RAM
Demangle the symbols once and keep them in a cache
Real time: 0.69 s    Real time: 28.48 s    Rootmarks: 200.95
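A minimal sketch of the "demangle once, keep it in RAM" idea, assuming `abi::__cxa_demangle` as the demangler. The `DemangleCache` class and its miss counter are hypothetical and exist only to show that the second request for a symbol costs no demangling.

```cpp
#include <cxxabi.h>       // abi::__cxa_demangle (GCC/Clang)
#include <cstddef>        // std::size_t
#include <cstdlib>        // std::free
#include <string>
#include <unordered_map>

// Demangle one symbol; returns "" if the input is not a valid mangled name.
std::string demangle(const char* mangled) {
    int status = 0;
    char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    std::string result = (status == 0 && out != nullptr) ? out : "";
    std::free(out);
    return result;
}

// Hypothetical cache: mangled name -> demangled name, filled on first use.
class DemangleCache {
public:
    const std::string& get(const std::string& mangled) {
        auto it = cache_.find(mangled);
        if (it == cache_.end()) {
            ++misses_;  // only a miss pays for the demangling
            it = cache_.emplace(mangled, demangle(mangled.c_str())).first;
        }
        return it->second;
    }
    std::size_t misses() const { return misses_; }

private:
    std::unordered_map<std::string, std::string> cache_;
    std::size_t misses_ = 0;
};
```

The trade-off is the one noted on the Efficiency slide: every demangled string stays in memory for as long as the cache lives.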
Fourth Algorithm: Axel's idea
Keep a pointer to the mangled name and demangle twice (when needed)
Real time: 0.68 s    Real time: 26.97 s    Rootmarks: 205.16
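One loose reading of this idea, sketched under assumptions: keep only pointers to the mangled names (which already live in the library's symbol table) and demangle on demand during a lookup, so no demangled strings are stored. The slide does not spell out the real scheme, in particular what "twice" refers to, and `find_mangled` is a hypothetical helper.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang)
#include <cstdlib>    // std::free
#include <string>
#include <vector>

// Demangle one symbol; returns "" if the input is not a valid mangled name.
std::string demangle(const char* mangled) {
    int status = 0;
    char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    std::string result = (status == 0 && out != nullptr) ? out : "";
    std::free(out);
    return result;
}

// The table stores pointers only. Each lookup demangles entries on the fly
// and returns the pointer to the matching mangled name, or nullptr.
const char* find_mangled(const std::vector<const char*>& table,
                         const std::string& header) {
    for (const char* mangled : table) {
        if (demangle(mangled) == header) return mangled;
    }
    return nullptr;
}
```

Compared with a full cache, this keeps one pointer per symbol instead of one string per symbol, and pays with repeated demangling when the same function is looked up again.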
Fifth Algorithm: some tuning
A bit of optimization with the structures
Real time: 0.56 s    Real time: 26.51 s    Rootmarks: 200.1
Algorithms Comparison
How much are we willing to pay for this feature???
Demangling takes 15% of the time at startup (100 ms).
Which means there is still some room for improvement.
Problems so far... a plethora
● Easy ones
– ellipsis
– default parameters
– free-standing functions
– weird types like va_list
– many more...
● Not so easy:
– virtual functions... a real pain in the neck
– constructors, destructors (in-charge, deleting, etc)
– inline functions
– non-member operators
– ...
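The constructor and destructor item can be illustrated with a small sketch. Under the Itanium C++ ABI used by GCC, a class may have several constructor symbols (`C1` complete object, `C2` base object) and several destructor symbols (`D0` deleting, `D1` complete object, `D2` base object), and they all demangle to the same text, so the declaration-to-symbol mapping is no longer one-to-one. The helper `group_by_header` is hypothetical, and `abi::__cxa_demangle` is assumed as the demangler.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang)
#include <cstdlib>    // std::free
#include <map>
#include <string>
#include <vector>

// Demangle one symbol; returns "" if the input is not a valid mangled name.
std::string demangle(const char* mangled) {
    int status = 0;
    char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    std::string result = (status == 0 && out != nullptr) ? out : "";
    std::free(out);
    return result;
}

// Group mangled symbols by their demangled header to expose collisions.
std::map<std::string, std::vector<std::string>>
group_by_header(const std::vector<const char*>& symbols) {
    std::map<std::string, std::vector<std::string>> groups;
    for (const char* y : symbols) {
        groups[demangle(y)].push_back(y);
    }
    return groups;
}
```

Feeding it `_ZN1AC1Ev`, `_ZN1AC2Ev`, `_ZN1AD0Ev`, `_ZN1AD1Ev` and `_ZN1AD2Ev` gives two groups, `A::A()` with two symbols and `A::~A()` with three, so the demangled text alone cannot tell which variant to call.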
Work to be done
● Certain stub functions are not out of the dictionary yet:
– Constructors and destructors (Diego is working on it)
– Non-member operators
– Certain cases for std templates
● Without stubs we can also take the setup_memfunc calls out of the dictionary.
● What else can we take out?
– Shadow classes? Show members? Streamers?
– Class inheritance info? typedef? Data members info? ...?
Future is always bright (dict source)
● CVS Version: 50.37MB
● Current status: 37.25MB (-13.12 MB, 26.0%)
● No constructors, destructors: 30.09MB (-20.28 MB, 40.2%)
– Should be there soon enough.
● No memfuncs: 17.40MB (-32.97 MB, 65.4%)
– We still need the info (in a root file for instance).
● No memvars: 14.72MB (-35.65 MB, 70.7%)
● No inline issue: 13.89MB (-36.47 MB, 72.4%)
Conclusions
● We have gained a better understanding of C++.
● As my mother used to say:
– He who knows not the way, walks with desperation.
(fortunately, we finally have an idea of what we are doing and where we want to go)
● A lot of tuning is being done to bring times and memory down to something acceptable.
● We need a considerable amount of time to deal with a myriad of small (and not so small) issues.