Ruby Hacking Guide

Preface

This book explores several themes with the following goals in mind:

To have knowledge of the structure of ruby
To gain knowledge about language processing systems in general
To acquire skills in reading source code

Ruby is an object-oriented language developed by Yukihiro Matsumoto. The official implementation of the Ruby language is called ruby. It is actively developed and maintained by the open source community. Our first goal is to understand the inner workings of the ruby implementation. This book is going to investigate ruby as a whole.

Secondly, by knowing about the implementation of Ruby, we will be able to learn about other language processing systems. I tried to cover all topics necessary for implementing a language, such as hash tables, the scanner and parser, the evaluation procedure, and many others. Because this book is not intended as a textbook, covering every area and idea exhaustively was not reasonable. However, the parts relating to the essential structures of a language implementation are adequately explained. A brief summary of the Ruby language itself is also included so that readers who don’t know Ruby can read this book.

The main themes of this book are the first and second points above. However, what I want to emphasize most is the third one: to acquire skill in reading source code. I dare say it’s a “hidden” theme. I will explain why I thought it was necessary.

It is often said, “To be a skilled programmer, you should read source code written by others.” This is certainly true. But I haven’t found a book that explains how you can actually do it. There are many books that explain OS kernels and the interior of language processing systems by showing the concrete structure or “the answer,” but they don’t explain the way to reach that answer. It’s clearly one-sided.

Can you, perhaps, naturally read code just because you know how to write a program? Is it true that reading code is so easy that everyone in this world can read code written by others with no sweat? I don’t think so. Reading programs is certainly as difficult as writing programs.

Therefore, this book does not simply explain ruby as something already known; rather, it demonstrates the analyzing process as graphically as possible. Though I think I’m a reasonably seasoned Ruby programmer, I did not fully understand the inner structure of ruby at the time when I started to write this book. In other words, regarding the content of ruby, I started from a position as close as possible to the readers. This book is the summary of both the analyzing process started from that point and its result.

I asked Yukihiro Matsumoto, the author of ruby, for supervision. But I thought the spirit of this book would be lost if each analysis was monitored by the author of the language himself. Therefore I limited his review to the final stage of writing. In this way, without losing the sense of actually reading the source code, I think I could also assure the correctness of the contents.

To be honest, this book is not easy. At the very least, it is limited in its simplicity by the inherent complexity of its aim. However, this complexity may be what makes the book interesting to you. Do you find it interesting to be chattering around a piece of cake? Do you take to your desk to solve a puzzle that you know the answer to in a heartbeat? How about a suspense novel whose criminal you can guess halfway through? If you really want to come to new knowledge, you need to solve a problem engaging all your capacities. This is the book that lets you practice such idealism exhaustively. “It’s interesting because it’s difficult.” I will be glad if the number of people who think so increases because of this book.

Target audience

Firstly, knowledge about the Ruby language isn’t required. However, since knowledge of the Ruby language is absolutely necessary to understand certain explanations of its structure, supplementary explanations of the language are inserted here and there.

Knowledge about the C language is required, to some extent. I assume you can allocate some structs with malloc() at runtime to create a list or a stack and you have experience of using function pointers at least a few times.

Also, since the basics of object-oriented programming will not be explained in depth, you will probably have a difficult time without experience of using at least one object-oriented language. In this book, I tried to use many examples in Java and C++.

Structure of this book

This book has four main parts:

Part 1: Objects
Part 2: Syntactic analysis
Part 3: Evaluation
Part 4: Peripheral around the evaluator

Supplementary chapters are included at the beginning of each part when necessary. These provide a basic introduction for those who are not familiar with Ruby and the general mechanism of a language processing system.

Now, we are going through the overview of the four main parts. The symbol in parentheses after each explanation indicates the difficulty gauge. They are (C), (B), (A) in order from easy to hard, with (S) being the highest.

Part 1: Object
Chapter 1  Focuses on the basics of Ruby to get ready for Part 1. (C)
Chapter 2  Gives the concrete inner structure of Ruby objects. (C)
Chapter 3  Discusses the hash table. (C)
Chapter 4  Covers the Ruby class system. You may read through this chapter quickly at first, because it tells plenty of abstract stories. (A)
Chapter 5  Shows the garbage collector, which is responsible for generating and releasing objects. The first story in the low-level series. (B)
Chapter 6  Describes the implementation of global variables, class variables, and constants. (C)
Chapter 7  Outline of the security features of Ruby. (C)

Part 2: Syntactic analysis
Chapter 8  Covers the almost complete specification of the Ruby language, in order to prepare for Part 2 and Part 3. (C)
Chapter 9  Introduction to yacc, the minimum required to read the syntax file. (B)
Chapter 10  Looks through the rules and physical structure of the parser. (A)
Chapter 11  Explores the peripherals of lex_state, which is the most difficult part of the parser. The most difficult part of this book. (S)
Chapter 12  Finalization of Part 2 and connection to Part 3. (C)

Part 3: Evaluator
Chapter 13  Describes the basic mechanism of the evaluator. (C)
Chapter 14  Reads the evaluation stack that creates the main context of Ruby. (A)
Chapter 15  Talks about search and initialization of methods. (B)
Chapter 16  Delves into the implementation of the iterator, the most characteristic feature of Ruby. (A)
Chapter 17  Describes the implementation of the eval methods. (B)

Part 4: Peripheral around the evaluator
Chapter 18  Run-time loading of libraries in C and Ruby. (B)
Chapter 19  Describes the implementation of threads at the end of the core part. (A)

Environment

This book describes ruby 1.7.3, the 2002-09-12 version. It is included on the CD-ROM. Choose any one of ruby-rhg.tar.gz, ruby-rhg.lzh, or ruby-rhg.zip according to your convenience. The content is the same for all. Alternatively, you can obtain it from the support site of this book (http://i.loveruby.net/ja/rhg/).

For the publication of this book, the following build environments were prepared to confirm compilation and test basic operation. The details of this build test are given in doc/buildtest.html on the attached CD-ROM. However, this does not necessarily guarantee that ruby will run even under the same environments listed below. The author doesn’t guarantee in any form the execution of ruby.

BeOS 5 Personal Edition/i386
Debian GNU/Linux potato/i386
Debian GNU/Linux woody/i386
Debian GNU/Linux sid/i386
FreeBSD 4.4-RELEASE/Alpha (Requires the local patch for this book)
FreeBSD 4.5-RELEASE/i386
FreeBSD 4.5-RELEASE/PC98
FreeBSD 5-CURRENT/i386
HP-UX 10.20
HP-UX 11.00 (32bit mode)
HP-UX 11.11 (32bit mode)
Mac OS X 10.2
NetBSD 1.6F/i386
OpenBSD 3.1
Plamo Linux 2.0/i386
Linux for PlayStation 2 Release 1.0
Redhat Linux 7.3/i386
Solaris 2.6/Sparc
Solaris 8/Sparc
UX/4800
Vine Linux 2.1.5
Vine Linux 2.5
VineSeed
Windows 98SE (Cygwin, MinGW+Cygwin, MinGW+MSYS)
Windows Me (Borland C++ Compiler 5.5, Cygwin, MinGW+Cygwin, MinGW+MSYS, Visual C++ 6)
Windows NT 4.0 (Cygwin, MinGW+Cygwin)
Windows 2000 (Borland C++ Compiler 5.5, Visual C++ 6, Visual C++ .NET)
Windows XP (Visual C++ .NET, MinGW+Cygwin)

These numerous tests were not a solo effort by the author. These test builds couldn’t have been achieved without the magnificent cooperation of the people listed below.

I’d like to extend my warmest thanks from my heart.

Tietew
kjana
nyasu
sakazuki
Masahiro Sato
Kenichi Tamura
Morikyu
Yuya Kato
Yasuhiro Kubo
Kentaro Goto
Tomoyuki Shimomura
Masaki Sukeda
Koji Arai
Kazuhiro Nishiyama
Shinya Kawaji
Tetsuya Watanabe
Naokuni Fujimoto

However, the author bears the responsibility for this test. Please refrain from attempting to contact these people directly. If there’s any flaw in execution, please contact the author by e-mail: [email protected].

Web site

The web site for this book is http://i.loveruby.net/ja/rhg/. I will add information about related programs and additional documentation, as well as errata. In addition, I’m going to publicize the first few chapters of this book at the same time as the release. I will look for an opportunity to publicize more chapters, and the whole contents of the book will be on this web site in the end.

Acknowledgment

First of all, I would like to thank Mr. Yukihiro Matsumoto. He is the author of Ruby, and he made it public as open source software. Not only did he willingly approve my publishing a book that analyzes ruby, but he also agreed to supervise its content. In addition, he helped me during my stay in Florida with simultaneous translation. There are more things than I can enumerate for which I have to thank him. Instead of writing them all, I give this book to him.

Next, I would like to thank arton, who proposed that I publish this book. arton’s words always move me. One of the things I’m currently struggling with because of his words is that I have no reason not to get a .NET machine.

Koji Arai, the ‘captain’ of documentation in the Ruby society, conducted a scrutinizing review as if he had become the official editor of this book, though I had not asked him to. I thank him for all his reviews.

Also I’d like to mention those who gave me comments, pointed out mistakes and submitted proposals about the construction of the book throughout all my work.

Tietew, Yuya, Kawaji, Gotoken, Tamura, Funaba, Morikyu, Ishizuka, Shimomura, Kubo, Sukeda, Nishiyama, Fujimoto, Yanagawa (I’m sorry if anyone is missing): I thank all those people who contributed.

As a final note, I thank Otsuka, Haruta, and Kanemitsu for arranging everything even though I broke the deadline as many as four times and the manuscript exceeded the original plan by 200 pages.

I cannot list here the names of all the people who contributed to this book, but I can say that I couldn’t have successfully published this book without such assistance. Let me take this place to express my appreciation. Thank you very much.

Minero Aoki

If you want to send remarks, suggestions, or reports of typographical errors, please address them to Minero Aoki <[email protected]>.

“Rubyソースコード完全解説” can be reserved/ordered at Impress Direct.

Copyright © 2002-2004 Minero Aoki, All rights reserved.

The original work is Copyright © 2002-2004 Minero AOKI. Translated by Vincent ISAMBART and Clifford Escobar CAOILE. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License.

Introduction

Characteristics of Ruby

Some readers may already be familiar with Ruby, but (I hope) there are also many readers who are not. First let’s go through a rough summary of the characteristics of Ruby for such people.

Hereafter, capitalized “Ruby” refers to Ruby as a language specification, and lowercase “ruby” refers to the ruby command as an implementation.

Development style

Ruby is a language that is being developed by Yukihiro Matsumoto as an individual. Unlike C, Java, or Scheme, it does not have any standard. The specification is merely shown as the implementation, ruby, and it is changing continuously. For good or bad, it’s free.

Furthermore, ruby itself is free software. It’s probably necessary to mention at least two points here: the source code is open to the public, and it is distributed free of charge. Thanks to these conditions, an attempt like this book is possible.

If you’d like to know the exact license, you can read README and LEGAL. For the time being, I’d like you to remember that you can do at least the following things:

You can redistribute the source code of ruby
You can modify the source code of ruby
You can redistribute a copy of the source code with your modifications

There is no need for special permission or payment in any of these cases.

By the way, the purpose of this book is to read the original ruby, so the target source is the unmodified one unless otherwise specified. However, whitespace, newlines, and comments were added or removed without notice.

It’s conservative

Ruby is a very conservative language. It is equipped with only carefully chosen features that have been tested and refined in a variety of languages. Therefore it doesn’t have many fresh and experimental features. So it tends to appeal to programmers who put importance on practical functionality. Dyed-in-the-wool hackers like Scheme and Haskell lovers don’t seem to find ruby appealing, at least at a glance.

The library is conservative in the same way. Clear and unabbreviated names are given to new functions, while names that appear in C and Perl libraries have been taken from them. For example, printf, getpwent, sub, and tr.
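As a small illustration of the borrowed names, the sketch below calls Ruby methods whose names come from C (format, the string-returning sibling of printf) and from Perl (sub and tr). The sample strings and the helper name borrowed_names_demo are made up for this illustration.

```ruby
# Hypothetical helper: exercises library names that Ruby took over from C and Perl.
def borrowed_names_demo
  {
    # format uses the same directives as C's printf, but returns the string.
    printf_style: format("%05.1f", 3.14159),
    # sub replaces only the first match, like Perl's s/o/0/.
    sub: "hello world".sub(/o/, "0"),
    # tr maps characters one to one, like Perl's tr/el/ip/.
    tr: "hello".tr("el", "ip")
  }
end

p borrowed_names_demo
```

A programmer coming from C or Perl can guess what each call does from the name alone, which is the point the paragraph above makes.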

It is also conservative in implementation. Assembler is not an option for seeking speed. Portability is always given higher priority when it conflicts with speed.

It is an object-oriented language

Ruby is an object-oriented language. It is absolutely impossible to exclude this from the features of Ruby.

I will not give a page of this book to what an object-oriented language is. As for the object-oriented features of Ruby, the code that is about to be explained is the best example.

It is a script language

Ruby is a script language. It also seems absolutely impossible to exclude this from the features of Ruby. To gain everyone’s agreement, an introduction to Ruby must include “object-oriented” and “script language”.

However, what is a “script language”, for example? I couldn’t figure out the definition successfully. For example, John K. Ousterhout, the author of Tcl/Tk, gives a definition as “executable language using #! on UNIX”. There are other definitions depending on the viewpoint, such as one that can express a useful program in only one line, or one that can execute code by passing a program file from the command line, etc.

However, I dare to use another definition, because I don’t find much interest in “what” a script language is. I have only one measure to decide whether to call something a script language: whether no one would complain about calling it a script language. To fulfill this definition, I would define the meaning of “script language” as follows.

A language that its author calls a “script language”.

I’m sure this definition will never fail. And Ruby fulfills this point. Therefore I call Ruby a “script language”.

It’s an interpreter

ruby is an interpreter. That’s the fact. But why is it an interpreter? For example, couldn’t it be made as a compiler? It must be because in some respects being an interpreter is better than being a compiler… at least for ruby, it must be better. Well, what is good about being an interpreter?

As a preparation step for investigating this, let’s start by thinking about the difference between an interpreter and a compiler. If we attempt a theoretical comparison of the process by which a program is executed, there’s no difference between an interpreted language and a compiled language. Because a compiled program works by letting the CPU interpret the code compiled to machine language, it may be possible to say it also works as an interpreter. Then where is the place that actually makes a difference? It is a more practical place: the process of development.

I know somebody, as soon as they hear “in the process of development”, would claim, using a stereotypical phrase, that an interpreter removes the effort of compilation and thus makes the development procedure easier. But I don’t think that’s accurate. A language could be designed so that it doesn’t show the process of compilation. Actually, Delphi can compile a project by hitting just F5. Complaints about long compilation times stem from the size of the project or the optimization of the code. Compilation itself is not a drawback.

Well, why do people perceive an interpreter and a compiler so differently? I think it is because language developers so far have chosen either implementation based on the traits of each language. In other words, if it is a language for a comparatively small purpose such as a daily routine, it would be an interpreter. If it is for a large project where a number of people are involved in the development and accuracy is required, it would be a compiler. That may be because of speed, as well as the ease of creating a language.

Therefore, I think “it’s handy because it’s an interpreter” is an outsized myth. Being an interpreter doesn’t necessarily contribute to ease of use; rather, seeking ease of use naturally leads you toward building an interpreted language.

Anyway, ruby is an interpreter; this fact matters for where this book is heading, so I emphasize it here again. Though I don’t know about “it’s handy because it is an interpreter”, ruby is in any case implemented as an interpreter.

High portability

Even with the problem that its interfaces are fundamentally Unix-centered, I would insist ruby is highly portable. It doesn’t require any extremely unfamiliar library. It has only a few parts written in assembler. Therefore porting to a new platform is comparatively easy. Namely, it currently works on the following platforms.

Linux
Win32 (Windows 95, 98, Me, NT, 2000, XP)
Cygwin
djgpp
FreeBSD
NetBSD
OpenBSD
BSD/OS
Mac OS X
Solaris
Tru64 UNIX
HP-UX
AIX
VMS
UX/4800
BeOS
OS/2 (emx)
Psion

I heard that the main machine of the author Matsumoto runs Linux. Thus when using Linux, compilation should never fail.

Furthermore, you can expect stable functionality in a (typical) Unix environment. Considering the release cycle of packages, the primary choice of environment for playing around with ruby should currently be some flavor of PC UNIX.

On the other hand, the Win32 environment definitely tends to cause problems. The large gaps in the targeted OS model tend to cause problems around the machine stack and the linker. Yet, recently Windows hackers have contributed to better support. I use a native ruby on Windows 2000 and Me. Once it runs successfully, it doesn’t seem to show particular problems such as frequent crashing. The main problems on Windows may be the gaps in the specifications.

Another type of OS that many people may be interested in is probably Mac OS (prior to v9) and handheld OSes like Palm.

Around ruby 1.2 and before, it supported legacy Mac OS, but the development seems to be suspended. It doesn’t even compile. The biggest causes seem to be the compiler environment of legacy Mac OS and the decrease in developers. As for Mac OS X, there are no worries because its core is UNIX.

There seem to have been several discussions about porting to Palm, but I have never heard of a successful project. I guess the difficulty lies in the necessity of settling specification-level matters such as stdio on the Palm platform, rather than in the actual implementation. That said, I saw that a port to Psion has been done ([ruby-list:36028]).

How about the hot stories about VMs seen in Java and .NET? Because I’d like to talk about them together with the implementation, this topic will be in the final chapter.

Automatic memory control

Functionally it’s called GC, or Garbage Collection. In C terms, this feature allows you to skip free() after malloc(). Unused memory is detected by the system automatically and released. It’s so convenient that once you get used to GC you won’t be willing to do such manual memory control again.

GC has become a common topic because of its popularity in recent languages, where it comes as standard, and it is fun that its algorithms can still be improved further.

Typeless variables

The variables in Ruby don’t have types. The reason is probably that typeless variables conform better to polymorphism, which is one of the strongest advantages of an object-oriented language. Of course a language with typed variables has a way to deal with polymorphism. What I mean here is that typeless variables conform better.

“Better conformance” in this case is roughly a synonym for “handy”. Sometimes it is of crucial importance, and sometimes it doesn’t matter practically. Yet, this is certainly an appealing point if a language seeks to be “handy and easy”, and Ruby does.
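To make the relation between typeless variables and polymorphism concrete, here is a minimal sketch. The classes Duck and Robot and the method speak_all are hypothetical names chosen for illustration; they do not appear in ruby itself.

```ruby
# Hypothetical classes that share no ancestor except Object,
# but both respond to speak.
class Duck
  def speak
    "Quack"
  end
end

class Robot
  def speak
    "Beep"
  end
end

# The parameter carries no type; any object that responds to speak will do.
def speak_all(things)
  things.map { |thing| thing.speak }
end

speaker = Duck.new
speaker = Robot.new # the same variable can now refer to an object of another class
puts speaker.speak
puts speak_all([Duck.new, Robot.new]).join(", ")
```

In a language with typed variables the same effect would need a shared interface or base class to be declared first; here nothing has to be declared, which is the sense of “handy” used above.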

Most of syntactic elements are expressions

This topic is probably difficult to understand instantly without a little supplemental explanation. For example, the following C-language program results in a syntactic error.

    result = if (cond) { process(val); } else { 0; }

This is because the C-language syntax defines if as a statement. But you can write it as follows.

    result = cond ? process(val) : 0;

This rewrite is possible because the conditional operator (a?b:c) is defined as an expression.

On the other hand, in Ruby, you can write as follows because if is an expression.

    result = if cond then process(val) else nil end

Roughly speaking, if it can be an argument of a function or a method, you can consider it as an expression.
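The rule of thumb about arguments can be tried directly. In the sketch below, process is a hypothetical stand-in for the process(val) call in the text, and describe and run are helper names invented for the illustration.

```ruby
# Hypothetical stand-in for the process(val) call in the text.
def process(val)
  val * 2
end

def describe(value)
  "got #{value.inspect}"
end

def run(cond, val)
  # The whole if ... end yields a value, so it can be passed directly as an argument.
  describe(if cond then process(val) else nil end)
end

puts run(true, 21)
puts run(false, 21)
```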

Of course, there are other languages whose syntactic elements are mostly expressions. Lisp is the best example. Because of this characteristic, there seem to be many people who feel that “Ruby is similar to Lisp”.

Iterators

Ruby has iterators. What is an iterator? Before getting into iterators, I should mention the necessity of using an alternative term, because the word “iterator” has fallen out of favor recently. However, I don’t have a good alternative. So let us keep calling it “iterator” for the time being.

Well again, what is an iterator? If you know higher-order functions, for the time being, you can regard it as something similar to them. In C, the counterpart would be passing a function pointer as an argument. In C++, it would be a method in which the operation part of STL’s Iterator is enclosed. If you know sh or Perl, it’s good to imagine something like a custom for statement which we can define.

Yet, the above are merely examples of “similar” concepts. All of them are similar, but they are not identical to Ruby’s iterator. I will tell the precise story later, when the time is right.
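Until then, a minimal sketch may help to picture the “custom for statement” analogy. The method names countdown and countdown_to_a are hypothetical; only yield and the block syntax are part of Ruby itself.

```ruby
# A method becomes an iterator by calling yield;
# the caller supplies the loop body as a block.
def countdown(from)
  n = from
  while n > 0
    yield n
    n -= 1
  end
end

# Hypothetical helper that collects what the iterator yields.
def countdown_to_a(from)
  result = []
  countdown(from) { |n| result << n }
  result
end

p countdown_to_a(3)
```

The block plays the role that a function pointer argument would play in C, except that it can read and modify local variables of the calling method, such as result here.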

Written in C-language

Being written in C is not notable these days, but it’s still a characteristic for sure. At least it is not written in Haskell or PL/I, so there is a high possibility that ordinary people can read it. (Whether it is truly so, I’d like you to confirm for yourself.)

Well, I just said it’s in C, but the actual language version which ruby targets is basically K&R C. Until a little while ago, there were a decent number of (though not plenty of) K&R-only environments. But recently, there are few environments which do not accept programs written in ANSI C, so technically there’s no problem in moving on to ANSI C. However, also because of the author Matsumoto’s personal preference, it is still written in K&R style.

For this reason, the function definitions are all in K&R style, and the prototype declarations are not written so seriously. If you carelessly specify the -Wall option of gcc, plenty of warnings would be shown. If you try to compile it with a C++ compiler, it would warn about prototype mismatches and fail to compile. … These kinds of stories are often reported to the mailing list.

Extension library

We can write a Ruby library in C and load it at runtime without recompiling Ruby. This type of library is called a “Ruby extension library” or just an “extension library”.

Not only the fact that we can write it in C, but the very small difference in code expression between the Ruby level and the C level is also a significant trait. The operations available in Ruby can also be used in C in almost the same way. See the following example.

    # Method call
    obj.method(arg)                                   # Ruby
    rb_funcall(obj, rb_intern("method"), 1, arg);     # C

    # Block call
    yield arg                                         # Ruby
    rb_yield(arg);                                    # C

    # Raising exception
    raise ArgumentError, 'wrong number of arguments'       # Ruby
    rb_raise(rb_eArgError, "wrong number of arguments");   # C

    # Generating an object
    arr = Array.new                                   # Ruby
    VALUE arr = rb_ary_new();                         # C

This is good because it makes writing an extension library easy, and it is in fact an indispensable strength of ruby. However, it’s also a burden on the ruby implementation. You can see its effects in many places. The effects on GC and thread processing are especially prominent.

Thread

Ruby is equipped with threads. Assuming that very few people know nothing about threads these days, I will omit an explanation of threads themselves and start with the details.

ruby’s thread is an originally written user-level thread. The characteristic of this implementation is very high portability in both specification and implementation. Surprisingly, even MS-DOS can run the threads. Furthermore, you can expect the same behavior in any environment. Many people mention that this point is the best feature of ruby.

However, as a trade-off for such extreme portability, ruby abandons speed. It’s, say, probably the slowest of all user-level thread implementations in this world. The tendency of the ruby implementation may be seen here most clearly.
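For readers who have not used threads from Ruby code, here is a minimal sketch of the Thread API as seen by a Ruby program. It says nothing about the user-level implementation inside ruby 1.7.3 described above; on current ruby releases the same calls run on top of OS threads. The helper name run_workers is hypothetical.

```ruby
def run_workers(count)
  threads = (1..count).map do |i|
    # Arguments given to Thread.new are passed to the block,
    # so each thread gets its own copy of n.
    Thread.new(i) { |n| n * n }
  end
  # Thread#value waits for the thread to finish and returns the block's result.
  threads.map { |t| t.value }
end

p run_workers(3)
```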

Technique to read source code

Well. After an introduction of ruby, we are about to start reading source code. But wait.

Any programmer has to read source code somewhere, but I guess there are not many occasions when someone teaches you concretely how to read it. Why? Does it mean you can naturally read a program if you can write a program?

But I can’t think that reading programs written by other people is so easy. In the same way as for writing programs, there must be techniques and theories for reading programs. And they are necessary. Therefore, before starting to read ruby, I’d like to give a general summary of the approach you need to take in reading source code.

Principles

At first, I mention the principle.

Decide a goal

An important key to reading the source code is to set a concrete goal.

These are the words of the author of Ruby, Matsumoto. Indeed, his words are very convincing to me. When the motivation is a spontaneous idea like “Maybe I should read a kernel, at least…”, you would get the source code unpacked or explanatory books ready on the desk. But not knowing what to do, the study would be left untouched. Hasn’t this happened to you? On the other hand, when you have in mind “I’m sure there is a bug somewhere in this tool. I need to quickly fix it and make it work. Otherwise I will not be able to make the deadline…”, you will probably be able to fix the code in a blink, even if it’s written by someone else. Hasn’t this happened to you?

The difference between these two cases is the motivation you have. In order to know something, you at least have to know what you want to know. Therefore, the first step of all is to figure out what you want to know in explicit words.

However, of course this alone is not enough to make it your own “technique”, because a “technique” needs to be a common method that anybody can make use of by following it. In the following section, I will explain how to carry the first step through to the landing place where you finally achieve the goal.

Visualising the goal

Now let us suppose that our final goal is set as “Understand all about ruby”. This certainly counts as “one set goal”, but apparently it will not actually be useful for reading the source code. It will not trigger any concrete action. Therefore, your first job will be to drag the vague goal down to the level of a concrete thing.

Then how can we do it? The first way is thinking as if you were the person who wrote the program. You can utilize your knowledge of writing programs in this case. For example, when you are reading a traditional “structured” program by somebody, you will analyze it employing the strategies of structured programming too. That is, you will divide the target into pieces, little by little. If it is something circulating in an event loop such as a GUI program, first roughly browse the event loop, then try to find out the role of each event handler. Or, try to investigate the “M” of MVC (Model View Controller) first.

Second, it’s good to be aware of the method of analysis. Everybody might have certain analysis methods, but they are often applied relying on experience or intuition. In what way can we read source code well? Thinking about the way itself and being aware of it are crucially important.

Well, what are such methods like? I will explain them in the next section.

Analysis methods

The methods of reading source code can be roughly divided into two: one is the static method and the other is the dynamic method. The static method is to read and analyze the source code without running the program. The dynamic method is to watch the actual behavior using tools like a debugger.

It’s better to start studying a program by dynamic analysis. That is because what you can see there is the “fact”. The results from static analysis, because the program is not actually run, may well be “prediction” to a greater or lesser extent. If you want to know the truth, you should start from watching the fact.

Of course, you don’t know whether the results of dynamic analysis really are the fact. The debugger could have a bug, or the CPU may not be working properly due to overheating. The conditions of your configuration could be wrong. However, the results of dynamic analysis should at least be closer to the fact than those of static analysis.

Dynamic analysis

Using the target program

You can’t start without the target program. First of all, you need to know in advance what the program is like, and what its expected behaviors are.

Following the behavior using the debugger

If you want to see the paths of code execution and the data structures produced as a result, it’s quicker to look at the result by actually running the program than to emulate the behavior in your brain. In order to do so easily, use the debugger.

I would be happier if the data structure at runtime could be seen as a picture, but unfortunately we can scarcely find a tool for that purpose (especially few tools are available for free). For a snapshot of a comparatively simple structure, we might be able to write it out as text and convert it to a picture by using a tool like graphviz (see doc/graphviz.html on the attached CD-ROM). But it’s very difficult to find a way to do general-purpose, real-time analysis.

Tracer

You can use a tracer if you want to trace the procedures that code goes through. For C, there is a tool named ctrace (http://www.vicente.org/ctrace). For tracing system calls, you can use tools like strace (http://www.wi.leidenuniv.nl/~wichert/strace/), truss, and ktrace.
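The tools above target C programs and system calls. For Ruby-level code, a comparable trace can be sketched with TracePoint, which exists in ruby 2.0 and later and therefore not in the ruby 1.7.3 that this book reads. The methods entry, helper, and trace_calls are hypothetical names used only for this sketch.

```ruby
# Hypothetical code under study.
def helper(x)
  x + 1
end

def entry(x)
  helper(x) * 2
end

# Records the names of Ruby-level methods called while the given block runs.
def trace_calls
  calls = []
  tracer = TracePoint.new(:call) { |tp| calls << tp.method_id }
  tracer.enable { yield } # tracing is active only inside this block
  calls
end

p trace_calls { entry(1) }
```

The resulting list shows the order in which methods were entered, which is often enough to decide where to start reading.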

Print everywhere

There is a phrase, “printf debugging”. This method also works for analysis other than debugging. If you are watching the history of one variable, for example, it may be easier to look at the dump produced by embedded print statements than to track the variable with a debugger.
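A minimal sketch of watching the history of one variable with embedded print statements follows. The function running_total is a hypothetical piece of code under study, and sending the output to a separate log object is an assumption made here so that the dump can be inspected afterwards.

```ruby
require "stringio"

# Hypothetical function under study: we want the history of the variable total.
def running_total(values, log = $stderr)
  total = 0
  values.each do |v|
    total += v
    # The embedded print statement: one line per change of the variable.
    log.puts "total=#{total} after adding #{v}"
  end
  total
end

log = StringIO.new
running_total([1, 2, 3], log)
puts log.string
```

Reading the dump top to bottom gives the whole history at once, where a debugger would show it one stop at a time.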

Modifying the code and running it

Say, for example, that in a place where the behavior is not easy to understand, you just make a small change in some part of the code or a particular parameter and then re-run the program. Naturally the behavior would change, and you would be able to infer the meaning of the code from it.

It goes without saying that you should also keep an original binary and do the same thing on both of them.

Static analysis

The importance of names

Static analysis is simply source code analysis. And source code analysis is really an analysis of names. File names, function names, variable names, type names, member names: a program is a bunch of names.

This may seem obvious because one of the most powerful tools for creating abstractions in programming is naming, but keeping this in mind will make reading much more efficient.

Also, we’d like to know about coding rules beforehand to some extent. For example, in C, extern functions often use a prefix to distinguish the type of function. And in object-oriented programs, function names sometimes carry information about where they belong in their prefixes, and this becomes valuable information (e.g. rb_str_length).
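As a rough sketch of how such prefixes can be exploited, the snippet below groups function names by their first two underscore-separated parts. The helper group_by_prefix is hypothetical, and apart from rb_str_length the sample names are illustrative rather than taken from the text.

```ruby
# Groups C function names by their first two underscore-separated parts,
# for example "rb_str" for rb_str_length.
def group_by_prefix(names)
  names.group_by { |name| name.split("_").first(2).join("_") }
end

# Illustrative names only; apart from rb_str_length they are not taken from the text.
names = %w[rb_str_length rb_str_new rb_ary_new rb_ary_push]
p group_by_prefix(names)
```

Applied to a full list of function names, such a grouping gives a first guess at the modules of a program before any function body has been read.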

Reading documents

Sometimes a document that describes the internal structure is included. Especially look out for a file named HACKING etc.

Reading the directory structure

Look at the policy by which the directories are divided. Grasp the overview, such as how the program is structured and what its parts are.

Reading the file structure

While browsing (the names of) the functions, also look at the policy by which the files are divided. You should pay attention to the file names because they are like comments whose lifetime is very long.

Additionally, if a file contains several modules, the functions composing each module should be grouped together, so you can find out the module structure from the order of the functions.

Investigating abbreviations

As you encounter ambiguous abbreviations, make a list of them and investigate each of them as early as possible. For example, when “GC” is written, things will be very different depending on whether it means “Garbage Collection” or “Graphic Context”.

Abbreviations in a program are generally made by methods like taking the initial letters or dropping the vowels. In particular, abbreviations popular in the field of the target program are used without explanation, so you should become familiar with them at an early stage.
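The two abbreviation methods mentioned above can be written down as a small sketch. The helper names initials and drop_vowels are hypothetical, and real abbreviations are of course chosen by people, not by a fixed rule.

```ruby
# Takes the initial letter of each word: "Garbage Collection" -> "GC".
def initials(phrase)
  phrase.split.map { |word| word[0].upcase }.join
end

# Keeps the first letter and drops the vowels from the rest.
def drop_vowels(word)
  return word if word.empty?
  word[0] + word[1..-1].delete("aeiouAEIOU")
end

puts initials("Garbage Collection")
puts initials("Graphic Context")
puts drop_vowels("string")
```

Both phrases collapse to the same initials, which is exactly why an ambiguous abbreviation has to be checked against its context.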

Understanding data structure

If you find both data and code, you should first investigate the data structure. In other words, when exploring code in C, it’s better to start with header files. And in this case, let’s make the most of our imagination from their file names. For example, if you find frame.h, it would probably be the stack frame definition.

Also, you can understand many things from the member names of a struct and their types. For example, if you find the member next, which points to its own type, then it will be a linked list. Similarly, when you find members such as parent, children, and sibling, then it must be a tree structure. When you find prev, it will be a stack.

Understanding the calling relationship between functions

After names, the next most important thing to understand is the relationships between functions. A diagram that visualizes the calling relationships is specifically called a “call graph”, and it is very useful. For this, we’d like to utilize tools.

A text-based tool is sufficient, but it’s even better if a tool can generate diagrams. However, such tools are seldom available (especially few tools are free). When I analyzed ruby to write this book, I wrote a small command language and a parser in Ruby and generated diagrams half-automatically by passing the results to the tool named graphviz.

ReadingfunctionsReadinghowitworkstobeabletoexplainthingsdonebythefunctionconcisely.It’sgoodtoreaditpartbypartaslookingatthefigureofthefunctionrelationships.

Whatisimportantwhenreadingfunctionsisnot“whattoread”but“whatnottoread”.Theeaseofreadingisdecidedbyhowmuchwecancutoutthecodes.Whatshouldexactlybecutout?Itishardtounderstandwithoutseeingtheactualexample,thusitwillbeexplainedinthemainpart.

Additionally,whenyoudon’tlikeitscodingstyle,youcanconvertitbyusingthetoollikeindent.

Experimenting by modifying it as you like

It’s a mystery of the human body: when something is done using many parts of your body, it easily persists in your memory. I think the reason why quite a few people prefer manuscript paper to a keyboard is not only nostalgia; this fact is also related.

Therefore, because merely reading on a monitor is a very ineffective way to remember with our bodies, rewrite the code while reading. This often helps our bodies get used to the code relatively soon. If there are names or code you don’t like, rewrite them. If there’s a cryptic abbreviation, replace it with the unabbreviated form.

However, it goes without saying that you should also keep the original source aside and check the original when something does not make sense along the way. Otherwise, you would be puzzling for hours over a simple mistake of your own. And since the purpose of rewriting is to get used to the code and not rewriting itself, please be careful not to get too enthusiastic.

Reading the history

A program often comes with a document describing the history of changes. For example, GNU software always has a file named ChangeLog. This is the best resource for learning “the reason why the program is as it is”.

Alternatively, when a version control system like CVS or SCCS is used and you can access it, it is even more useful than ChangeLog. Taking CVS as an example, cvs annotate, which shows the change that modified a particular line, and cvs diff, which shows the differences from a specified version, are convenient.

Moreover, when there’s a mailing list or a newsgroup for developers, you should get the archives so that you can search them at any time, because they often contain the exact reason for a certain change. Of course, if you can search them online, that’s sufficient too.

The tools for static analysis

Since various tools are available for various purposes, I can’t describe them all. But if I have to choose only one of them, I’d recommend global. Its most attractive point is that its structure allows us to easily use it for other purposes. For instance, gctags, which comes with it, is actually a tool to create tag files, but you can also use it to create a list of the function names contained in a file.

    ~/src/ruby % gctags class.c | awk '{print $1}'
    SPECIAL_SINGLETON
    SPECIAL_SINGLETON
    clone_method
    include_class_new
    ins_methods_i
    ins_methods_priv_i
    ins_methods_prot_i
    method_list
            :
            :

That said, this is just this author’s recommendation; as a reader you can use whichever tool you like. But in that case, you should choose a tool with at least the following features.

- list the function names contained in a file
- find the location of a function or a variable from its name (preferably with the ability to jump to that location)
- function cross-reference
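To make the first feature concrete, here is a rough Ruby sketch that lists names which look like C function definitions. It is only an approximation under a stated assumption: the function name starts in column 0 and is immediately followed by an opening parenthesis. The helper function_names and the sample C text are invented for the illustration; tools such as gctags analyze the source far more carefully.

```ruby
# Rough sketch: collect identifiers that start a line and are followed by "(".
# This only works for sources that put the function name in column 0.
def function_names(c_source)
  c_source.each_line.map { |line| line[/\A([A-Za-z_]\w*)\s*\(/, 1] }.compact
end

# Made-up sample in the old K&R style; the bodies are placeholders.
sample = <<~C_SOURCE
  static VALUE
  clone_method(mid, body, tbl)
      ID mid;
  {
      return Qnil;
  }

  VALUE
  rb_class_new(super)
      VALUE super;
  {
      return Qnil;
  }
C_SOURCE

p(function_names(sample))  # Shows ["clone_method", "rb_class_new"]
```

Macro invocations written in column 0 would be listed too, so the output of such a sketch is a starting point for browsing rather than an exact list.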

Build

Target version

The version of ruby described in this book is 1.7 (2002-09-12). For ruby, a version is stable if its minor version is an even number, and it is a development version if the minor version is an odd number. Hence, 1.7 is a development version. Moreover, 9/12 does not mark any particular milestone, so this version is not distributed as an official package. Therefore, to get this version, you can take it from the CD-ROM attached to this book or from the support site\footnote{The support site of this book …… http://i.loveruby.net/ja/rhg/}, or you need to use CVS, which will be described later.

There are several reasons why it is not 1.6, the stable version, but 1.7. One is that 1.7 is easier to deal with, because both the specification and the implementation are better organized. Secondly, it’s easier to use CVS if we work on the edge of the development version. Additionally, it is likely that 1.8, the next stable version, will be out in the near future. And the last one is that investigating the edge makes us feel better.
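The even/odd rule for the minor version can be written down as a small Ruby sketch. The method name stable_version? is invented here, and the rule is only the one stated above for the ruby versions this book discusses.

```ruby
# Sketch of the numbering rule described in the text:
# an even minor version is stable, an odd one is a development version.
def stable_version?(version)
  minor = version.split(".")[1].to_i
  minor.even?
end

p(stable_version?("1.6"))  # Shows true
p(stable_version?("1.7"))  # Shows false
p(stable_version?("1.8"))  # Shows true
```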

Getting the source code

The archive of the target version is included in the attached CD-ROM. In the top directory of the CD-ROM, the following three versions are placed:

    ruby-rhg.tar.gz
    ruby-rhg.zip
    ruby-rhg.lzh

Use whichever one is convenient for you. Of course, whichever one you choose, the content is the same. For example, the tar.gz archive can be extracted as follows.

    ~/src % mount /mnt/cdrom
    ~/src % gzip -dc /mnt/cdrom/ruby-rhg.tar.gz | tar xf -
    ~/src % umount /mnt/cdrom

Compiling

Just by looking at the source code, you can “read” it. But in order to understand the program, you need to actually use it, modify it and experiment with it. When experimenting, it is meaningless unless you use the same version you are reading, so naturally you’d need to compile it yourself.

Therefore, from now on, I’ll explain how to compile. First, let’s start with the case of a Unix-like OS. There are several things to consider on Windows, so they will be described together in the next section. However, Cygwin runs on Windows but is almost Unix, so I’d like you to read this section for it.

Building on a Unix-like OS

A Unix-like OS is generally equipped with a C compiler, so the build succeeds in most cases by following the procedure below. Let us suppose ~/src/ruby is the place where the source code is extracted.

    ~/src/ruby % ./configure
    ~/src/ruby % make
    ~/src/ruby % su
    ~/src/ruby # make install

Below, I’ll describe several points to be careful about.

On some platforms like Cygwin and UX/4800, you need to specify the --enable-shared option at the configure phase, or you’d fail to link. --enable-shared is an option to move most of ruby out of the command into a shared library (libruby.so).

    ~/src/ruby % ./configure --enable-shared

A detailed tutorial about building is included in doc/build.html of the attached CD-ROM; I’d like you to try building while reading it.

Building on Windows

Building on Windows is much more complicated. The source of the problem is that there are multiple build environments.

- Visual C++
- MinGW
- Cygwin
- Borland C++ Compiler

First, the Cygwin environment is closer to UNIX than to Windows, so you can follow the building procedure for a Unix-like OS.

If you’d like to compile with Visual C++, Visual C++ 5.0 or later is required. There’s probably no problem with version 6 or .NET.

MinGW, or Minimalist GNU for Windows, is a port of the GNU compiling environment (namely, gcc and binutils) to Windows. Cygwin ports the whole UNIX environment. In contrast, MinGW ports only the tools needed to compile. Moreover, a program compiled with MinGW does not require any special DLL at runtime. This means that a ruby compiled with MinGW can be treated exactly the same as the Visual C++ version.

Alternatively, for personal use, you can download version 5.5 of the Borland C++ Compiler for free from the Borland site.\footnote{The Borland site: http://www.borland.co.jp} Because ruby started to support this environment fairly recently, there is some cause for anxiety, but there was no particular problem in the build test done before the publication of this book.

Then, among the above four environments, which one should we choose? First, the Visual C++ version is basically the least likely to cause problems, so I recommend it. If you have experience with UNIX, installing the whole of Cygwin and using it is a good choice. If you have no experience with UNIX and you don’t have Visual C++, using MinGW is probably a good choice.

Below, I’ll explain how to build with Visual C++ and MinGW, but only in outline. More detailed explanations, and how to build with the Borland C++ Compiler, are included in doc/build.html of the attached CD-ROM, so please check it when necessary.

Visual C++

Although we say Visual C++, the IDE is usually not used; we’ll build from the DOS prompt. In this case, we first need to initialize the environment variables so that Visual C++ itself can run. Since a batch file for this purpose comes with Visual C++, let’s execute it first.

    C:\> cd "\Program Files\Microsoft Visual Studio .NET\Vc7\bin"
    C:\Program Files\Microsoft Visual Studio .NET\Vc7\bin> vcvars32

This is the case of Visual C++ .NET. For version 6, it can be found in the following place.

    C:\Program Files\Microsoft Visual Studio\VC98\bin\

After executing vcvars32, all you have to do is move to the win32\ folder of the ruby source tree and build. Below, let us suppose the source tree is in C:\src.

    C:\> cd src\ruby
    C:\src\ruby> cd win32
    C:\src\ruby\win32> configure
    C:\src\ruby\win32> nmake
    C:\src\ruby\win32> nmake DESTDIR="C:\Program Files\ruby" install

Then, the ruby command will be installed in C:\Program Files\ruby\bin\, and the Ruby libraries in C:\Program Files\ruby\lib\. Because ruby does not use the registry or anything like it at all, you can uninstall it by deleting C:\Program Files\ruby and everything below it.

MinGW

As described before, MinGW is only an environment for compiling, so general UNIX tools like sed or sh are not available. However, because they are necessary to build ruby, you need to obtain them from somewhere. For this, there are again two methods: Cygwin and MSYS (Minimal SYStem).

However, I can’t recommend MSYS because troubles happened continuously at the build contest held before the publication of this book. In contrast, with Cygwin the build passes very straightforwardly. Therefore, in this book, I’ll explain the way using Cygwin.

First, install MinGW and all the development tools by using Cygwin’s setup.exe. Both Cygwin and MinGW are also included in the attached CD-ROM.\footnote{Cygwin and MinGW …… See also doc/win.html of the attached CD-ROM} After that, all you have to do is type the following from the Cygwin bash prompt.

    ~/src/ruby % ./configure --with-gcc='gcc -mno-cygwin' \
                                     --enable-shared i386-mingw32
    ~/src/ruby % make
    ~/src/ruby % make install

That’s it. Here the configure line spans multiple lines, but in practice we’d write it on one line and the backslash would not be necessary. The install location is \usr\local\ and below on the drive on which it is compiled. Because really complicated things occur around here, the explanation would be fairly long, so I explain it comprehensively in doc/build.html of the attached CD-ROM.

Building Details

Up to here, this has been a README-like description. This time, let’s look at exactly what was being done by what we have done. However, parts of the discussion here require fairly advanced knowledge. If you can’t understand it, I’d like you to skip this and jump directly to the next section. It is written so that you can understand it when you come back after reading the entire book.

Now, on whichever platform, building ruby is separated into three phases: configure, make and make install. Since make install hardly needs an explanation, I’ll explain the configure phase and the make phase.

configure

First, configure. It is a shell script, and we use it to detect system parameters. For example, it checks things like “whether the header file setjmp.h exists” or “whether alloca() is available”. The way of checking is unexpectedly simple.

    Target to check    Method
    commands           actually execute it and then check $?
    header files       if [ -f $includedir/stdio.h ]
    functions          compile a small program and check whether linking succeeds

When some differences are detected, they must somehow be reported to us. The first way is through the Makefile. If we provide a Makefile.in in which parameters are embedded in the form @param@, configure generates a Makefile in which they are substituted with the actual values. For example, as follows:

    Makefile.in:  CFLAGS = @CFLAGS@
                         ↓
    Makefile   :  CFLAGS = -g -O2
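The @param@ substitution can be imitated with a few lines of Ruby. This is only a sketch of the idea: the method name substitute and the parameter values are made up, and the real configure script performs the replacement with shell tools, not Ruby.

```ruby
# Replace every @NAME@ in the template with the detected value.
# Parameters that were not detected are left untouched in this sketch.
def substitute(template, params)
  template.gsub(/@(\w+)@/) do
    name = $1
    params.fetch(name) { "@#{name}@" }
  end
end

makefile_in = "CFLAGS = @CFLAGS@\nCC = @CC@\n"
detected = { "CFLAGS" => "-g -O2", "CC" => "gcc" }   # hypothetical results

print(substitute(makefile_in, detected))
# Prints:
#   CFLAGS = -g -O2
#   CC = gcc
```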

Alternatively, it writes out information about, for instance, whether certain functions or particular header files exist, into a header file. Because the output file name can be changed, it differs from program to program, but in ruby it is config.h. I’d like you to confirm that this file is created after executing configure. Its content is something like this.

▼ config.h

             :
             :
    #define HAVE_SYS_STAT_H 1
    #define HAVE_STDLIB_H 1
    #define HAVE_STRING_H 1
    #define HAVE_MEMORY_H 1
    #define HAVE_STRINGS_H 1
    #define HAVE_INTTYPES_H 1
    #define HAVE_STDINT_H 1
    #define HAVE_UNISTD_H 1
    #define _FILE_OFFSET_BITS 64
    #define HAVE_LONG_LONG 1
    #define HAVE_OFF_T 1
    #define SIZEOF_INT 4
    #define SIZEOF_SHORT 2
             :
             :

The meaning of each is easy to understand. HAVE_xxxx_H probably indicates whether a certain header file exists, and SIZEOF_SHORT must indicate the size of the C short type. Likewise, SIZEOF_INT indicates the byte length of int, and HAVE_OFF_T indicates whether the off_t type is defined.
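As a rough illustration of how a header check turns into a HAVE_xxxx_H line, here is a Ruby sketch. The helper names have_macro and config_h_lines are invented, the include directory is a temporary directory created just for the demonstration, and the real configure does this work in shell.

```ruby
require "tmpdir"

# Hypothetical helper: turn "sys/stat.h" into "HAVE_SYS_STAT_H".
def have_macro(header)
  "HAVE_" + header.upcase.gsub(/[^A-Z0-9]/, "_")
end

# Emit one #define per header that exists under includedir
# (the same idea as: if [ -f $includedir/stdio.h ]).
def config_h_lines(includedir, headers)
  found = headers.select { |h| File.file?(File.join(includedir, h)) }
  found.map { |h| "#define #{have_macro(h)} 1" }
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, "stdlib.h"), "")   # pretend this header exists
  puts(config_h_lines(dir, ["stdlib.h", "unistd.h"]))
  # Prints: #define HAVE_STDLIB_H 1
end
```

A header that is not found simply produces no line, which matches how the HAVE_ macros are later tested with #ifdef.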

As we can understand from the above, configure does detect the differences, but it does not automatically absorb them. Bridging the differences is left to each programmer. For example, as follows:

▼ A typical usage of the HAVE_ macro

    #ifdef HAVE_STDLIB_H
    #include <stdlib.h>
    #endif

(ruby.h)

autoconf

configure is not a ruby-specific tool. Whether certain functions exist, whether certain header files exist, … it is obvious that these tests follow a regular pattern. It would be wasteful if everyone who writes a program wrote their own distinct tool.

This is where a tool named autoconf comes in. In a file named configure.in or configure.ac, write “I’d like to do these checks”, process it with autoconf, and an adequate configure is generated. The .in of configure.in is probably an abbreviation of input. It’s the same as the relationship between Makefile and Makefile.in. .ac is, of course, an abbreviation of AutoConf.

Illustrating the discussion so far gives Figure 1.

Figure 1: The process until Makefile is created

For readers who want to know more details, I recommend “GNU Autoconf/Automake/Libtool” by Gary V. Vaughan, Ben Elliston, Tom Tromey and Ian Lance Taylor.

By the way, ruby’s configure is, as said before, generated by using autoconf, but not every configure in this world is generated with autoconf. It can be written by hand, or another tool can be used to generate it automatically. Anyway, it’s sufficient if ultimately there are a Makefile, a config.h and so on.

make

What is done at the second phase, make? Of course, it compiles the source code of ruby, but looking at the output of make, it seems to do many other things as well. I’ll briefly explain the process.

1. compile the source code composing ruby itself
2. create the static library libruby.a gathering the crucial parts of ruby
3. create “miniruby”, which is an always statically-linked ruby
4. create the shared library libruby.so when --enable-shared is specified
5. compile the extension libraries (under ext/) by using miniruby
6. at last, generate the real ruby

There are two reasons why it creates miniruby and ruby separately. The first one is that compiling the extension libraries requires ruby. With --enable-shared, ruby itself is dynamically linked, so it may not be able to run right away because of the library load paths. Therefore, miniruby, which is statically linked, is created and used during the build process.

The second reason is that, on a platform where shared libraries cannot be used, the extension libraries are sometimes statically linked into ruby itself. In this case, ruby cannot be created before all the extension libraries are compiled, but the extension libraries cannot be compiled without ruby. miniruby is used to resolve this dilemma.

CVS

The ruby archive included in the attached CD-ROM is, like the official release packages, just a snapshot: the appearance at one particular moment of ruby, which is a continuously changing program. How ruby has changed and why are not described there. Then how can we see the entire picture, including the past? We can do it by using CVS.

About CVS

In short, CVS is the undo list of an editor. If the source code is under the management of CVS, its past appearance can be restored at any time, and we can immediately find out who changed what, where, when and how. A program doing such a job is generally called a source code management system, and CVS is the most famous open-source source code management system in the world.

Since ruby is also managed with CVS, I’ll explain a little about the mechanism and usage of CVS. First, the most important ideas of CVS are the repository and the working copy. I said CVS is something like the undo list of an editor; in order to achieve this, the record of every change must be saved somewhere. The place that stores all of them is the “CVS repository”.

Put simply, the repository is what gathers all the past source code. Of course, this is only a concept; in reality, in order to save space, it is stored in the form of one recent appearance plus the differences (namely, patches). In any case, it is sufficient if we can obtain the appearance of a particular file at a particular moment at any time.

On the other hand, a “working copy” is the result of taking files from the repository by choosing a certain point in time. There’s only one repository, but you can have multiple working copies (Figure 2).

Figure 2: Repository and working copies

When you’d like to modify the source code, first take a working copy, edit it with an editor or the like, and “return” it. Then, the change is recorded in the repository. Taking a working copy from the repository is called “checkout”, and returning it is called “checkin” or “commit” (Figure 3). By checking in, the change is recorded in the repository, and we can obtain it at any time.

Figure 3: Checkin and Checkout

The biggest trait of CVS is that we can access it over the network. This means that if there’s one server which holds the repository, everyone can check in and check out over the internet at any time. But generally the access to check in is restricted, and we can’t do it freely.

Revision

How can we obtain a certain version from the repository? One way is to specify it by time. By requesting “give me the edge version as of that time”, it is selected. But in practice, we rarely specify by time. Most commonly, we use something named a “revision”.

“Revision” and “version” have almost the same meaning. But usually “version” is attached to the project itself, so using the word “version” can be confusing. Therefore, the word “revision” is used to indicate a slightly smaller unit.

In CVS, a file that has just been stored in the repository is revision 1.1. After checking it out, modifying it and checking it in, it becomes revision 1.2. Next it would be 1.3, then 1.4.
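The checkout/checkin cycle and the revision numbering can be mimicked with a toy model in Ruby. Everything here is an assumption made for the illustration: the class ToyRepositoryFile is invented, and it keeps every revision in full, whereas the text explains that CVS stores one recent appearance plus differences.

```ruby
# Toy model of a single file in a repository (not how CVS is implemented).
class ToyRepositoryFile
  attr_reader :head

  def initialize(text)
    @revisions = { "1.1" => text }   # the first stored revision is 1.1
    @head = "1.1"
  end

  # Return the text of a revision (the newest one by default).
  def checkout(revision = @head)
    @revisions.fetch(revision)
  end

  # Record new text and bump the last component: 1.1 -> 1.2 -> 1.3 ...
  def checkin(text)
    major, minor = @head.split(".").map(&:to_i)
    @head = "#{major}.#{minor + 1}"
    @revisions[@head] = text
    @head
  end
end

file = ToyRepositoryFile.new("puts 1\n")
p(file.checkin("puts 2\n"))   # Shows "1.2"
p(file.checkin("puts 3\n"))   # Shows "1.3"
p(file.checkout("1.1"))       # Shows "puts 1\n"
```

The point of the model is only that any past revision stays obtainable after later checkins.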

A simple usage example of CVS

Keeping the above in mind, I’ll talk about the usage of CVS very briefly. First, the cvs command is essential, so I’d like you to install it beforehand. The source code of cvs is included in the attached CD-ROM\footnote{cvs: archives/cvs-1.11.2.tar.gz}. How to install cvs is really far from the main line, so it won’t be explained here.

After installing it, let’s check out the source code of ruby as an experiment. Type the following commands while you are online.

    % cvs -d :pserver:[email protected]:/src login
    CVS Password: anonymous
    % cvs -d :pserver:[email protected]:/src checkout ruby

No options were specified, so the edge version is checked out automatically. The truly newest version of ruby should appear under ruby/.

Additionally, if you’d like to obtain the version of a certain day, you can use the -D option of cvs checkout. By typing the following, you can obtain a working copy of the version explained in this book.

    % cvs -d :pserver:[email protected]:/src checkout -D2002-09-12 ruby

Note that you have to write options immediately after checkout. If you wrote “ruby” first, it would cause a strange error complaining about a “missing module”.

And with anonymous access as in this example, we cannot check in. To practice checking in, it’s good to create a (local) repository and store a “Hello, World!” program in it. The concrete way to store it is not explained here. The manual that comes with cvs is fairly friendly. As for books you can read in Japanese, I recommend the translation of “Open Source Development with CVS” by Karl Fogel and Moshe Bar.

The composition of ruby

The physical structure

Now it is time to start reading the source code, but what should we do first? Look over the directory structure. In most cases, the directory structure, meaning the source tree, directly indicates the module structure of the program. Abruptly searching for main() with grep and reading from the top in processing order is not smart. Of course finding main() is also important, but first let’s take time to run ls or head to grasp the whole picture.

Below is the appearance of the top directory immediately after checking out from the CVS repository. Names ending with a slash are subdirectories.

    COPYING        compar.c       gc.c           numeric.c      sample/
    COPYING.ja     config.guess   hash.c         object.c       signal.c
    CVS/           config.sub     inits.c        pack.c         sprintf.c
    ChangeLog      configure.in   install-sh     parse.y        st.c
    GPL            cygwin/        instruby.rb    prec.c         st.h
    LEGAL          defines.h      intern.h       process.c      string.c
    LGPL           dir.c          io.c           random.c       struct.c
    MANIFEST       djgpp/         keywords       range.c        time.c
    Makefile.in    dln.c          lex.c          re.c           util.c
    README         dln.h          lib/           re.h           util.h
    README.EXT     dmyext.c       main.c         regex.c        variable.c
    README.EXT.ja  doc/           marshal.c      regex.h        version.c
    README.ja      enum.c         math.c         ruby.1         version.h
    ToDo           env.h          misc/          ruby.c         vms/
    array.c        error.c        missing/       ruby.h         win32/
    bcc32/         eval.c         missing.h      rubyio.h       x68/
    bignum.c       ext/           mkconfig.rb    rubysig.h
    class.c        file.c         node.h         rubytest.rb

Recently programs themselves have become larger, and there is much software whose source is finely divided into subdirectories, but ruby has consistently used the top directory for a long time. It becomes problematic if there are too many files, but we can get used to this amount.

The files at the top level can be categorized into six groups:

- documents
- the source code of ruby itself
- the tool to build ruby
- standard extension libraries
- standard Ruby libraries
- the others

The source code and the build tool are obviously important. Aside from them, I’ll list what seems useful for us.

ChangeLog

The record of changes to ruby. This is very important when investigating the reason for a certain change.

README.EXT README.EXT.ja

These describe how to create an extension library, but in the course of that, things relating to the implementation of ruby itself are also written.

Dissecting Source Code

From now on, I’ll further split the source code of ruby itself into smaller pieces. As for the main files, their categorization is described in README.EXT, so I’ll follow it. What is not described there, I categorized myself.

Ruby Language Core

    class.c      class relating API
    error.c      exception relating API
    eval.c       evaluator
    gc.c         garbage collector
    lex.c        reserved word table
    object.c     object system
    parse.y      parser
    variable.c   constants, global variables, class variables
    ruby.h       the main macros and prototypes of ruby
    intern.h     the prototypes of the C API of ruby. intern seems to be an
                 abbreviation of internal, but the functions written here
                 can be used from extension libraries.
    rubysig.h    the header file containing the macros relating to signals
    node.h       the definitions relating to the syntax tree nodes
    env.h        the definitions of the structs to express the context of
                 the evaluator

These are the parts that compose the core of the ruby interpreter. Most of the files explained in this book are contained here. Compared to the number of files in the entire ruby, they are really only a few. But in terms of byte size, these files occupy 50% of the whole. In particular, eval.c is 200KB and parse.y is 100KB; these files are large.

Utility

    dln.c        dynamic loader
    regex.c      regular expression engine
    st.c         hash table
    util.c       libraries for radix conversions and sort and so on

These are utilities for ruby. However, some of them are so large that you could not imagine it from the word “utility”. For instance, regex.c is 120KB.

Implementation of ruby command

    dmyext.c     dummy of the routine to initialize extension libraries
                 (DumMY EXTension)
    inits.c      the entry point for core and the routine to initialize
                 extension libraries
    main.c       the entry point of ruby command (this is unnecessary for
                 libruby)
    ruby.c       the main part of ruby command (this is also necessary for
                 libruby)
    version.c    the version of ruby

The implementation of the ruby command, that is, of what runs when you type ruby on the command line and execute it. This is the part that, for instance, interprets the command line options. Aside from the ruby command, there are other commands utilizing the ruby core, such as mod_ruby and vim. These commands work by linking to the libruby library (.a/.so/.dll and so on).

Class Libraries

    array.c      class Array
    bignum.c     class Bignum
    compar.c     module Comparable
    dir.c        class Dir
    enum.c       module Enumerable
    file.c       class File
    hash.c       class Hash (Its actual body is st.c)
    io.c         class IO
    marshal.c    module Marshal
    math.c       module Math
    numeric.c    class Numeric, Integer, Fixnum, Float
    pack.c       Array#pack, String#unpack
    prec.c       module Precision
    process.c    module Process
    random.c     Kernel#srand(), rand()
    range.c      class Range
    re.c         class Regexp (Its actual body is regex.c)
    signal.c     module Signal
    sprintf.c    ruby-specific sprintf()
    string.c     class String
    struct.c     class Struct
    time.c       class Time

The implementations of the Ruby class libraries. What is listed here is basically implemented in exactly the same way as ordinary Ruby extension libraries. This means that these libraries are also examples of how to write an extension library.

Files depending on a particular platform

    bcc32/       Borland C++ (Win32)
    beos/        BeOS
    cygwin/      Cygwin (the UNIX simulation layer on Win32)
    djgpp/       djgpp (the free developing environment for DOS)
    vms/         VMS (an OS formerly released by DEC)
    win32/       Visual C++ (Win32)
    x68/         Sharp X680x0 series (OS is Human68k)

The code specific to each platform is stored here.

fallback functions

    missing/

Files to make up for the functions which are missing on each platform. Mainly functions of libc.

Logical Structure

Now, of the above four groups, the core can be divided further into three. First, the “object space”, which creates the object world of Ruby. Second, the “parser”, which converts Ruby programs (in text) into the internal format. Third, the “evaluator”, which drives Ruby programs. Both the parser and the evaluator are built on top of the object space; the parser converts a program into the internal format, and the evaluator runs the program. Let me explain them in order.

Object Space

The first one is the object space. This is very easy to understand, because everything it deals with is basically in memory, so we can directly show or manipulate it by using functions. Therefore, in this book, the explanation will start with this part. Part 1 is from chapter 2 to chapter 7.

Parser

The second one is the parser. Some preliminary explanation is probably necessary for this.

The ruby command is the interpreter of the Ruby language. This means that it analyzes the text given as input on invocation and executes it accordingly. Therefore, ruby needs to be able to interpret the meaning of a program written as text, but unfortunately text is very hard for computers to understand. For computers, text files are merely byte sequences and nothing more. In order to comprehend the meaning of the text, some special gimmick is necessary. And that gimmick is the parser. By passing through the parser, a Ruby program (as text) is converted into a ruby-specific internal representation which can be easily handled by the program.

This internal representation is called a “syntax tree”. A syntax tree expresses a program as a tree structure; for instance, Figure 4 shows how an if statement is expressed.

Figure 4: an if statement and its corresponding syntax tree

The parser will be described in Part 2 “Syntactic Analysis”. Part 2 is from chapter 10 to chapter 12. Its target file is only parse.y.

Evaluator

Objects are easy to understand because they are tangible. As for the parser, what it does is ultimately converting one data format into another, so it’s reasonably easy to understand. However, the third one, the evaluator, is completely elusive.

What the evaluator does is “executing” a program by following the syntax tree. This sounds easy, but what is “executing”? Answering this question precisely is fairly difficult. What is “executing an if statement”? What is “executing a while statement”? What does “assigning to a local variable” mean? We cannot understand the evaluator without answering all such questions clearly and precisely.
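To give a first feeling for “executing by following a syntax tree”, here is a toy evaluator written in Ruby. It is purely illustrative: the tree is built from nested arrays, and the node names (:if, :lt, :lit, :lvar) are invented for this sketch and are not ruby’s real node types or its actual evaluation strategy.

```ruby
# Toy syntax tree: each node is [node_type, children...].
def evaluate(node, locals)
  type, *args = node
  case type
  when :lit                        # a literal value
    args[0]
  when :lvar                       # read a local variable
    locals.fetch(args[0])
  when :lt                         # left < right
    evaluate(args[0], locals) < evaluate(args[1], locals)
  when :if                         # evaluate only the selected branch
    cond, then_body, else_body = args
    if evaluate(cond, locals)
      evaluate(then_body, locals)
    else
      evaluate(else_body, locals)
    end
  else
    raise ArgumentError, "unknown node type: #{type}"
  end
end

# Tree for:  if i < 10 then "small" else "large" end
tree = [:if, [:lt, [:lvar, :i], [:lit, 10]], [:lit, "small"], [:lit, "large"]]

p(evaluate(tree, { i: 3 }))    # Shows "small"
p(evaluate(tree, { i: 42 }))   # Shows "large"
```

Even in this tiny form, “executing an if statement” has a precise meaning: evaluate the condition node, then evaluate exactly one of the two branch nodes.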

In this book, the evaluator will be discussed in Part 3 “Evaluate”. Its target file is eval.c. eval is an abbreviation of “evaluator”.

Now, I’ve briefly described the structure of ruby; however, even though the ideas were explained, that does not help us much in understanding the behavior of the program. In the next chapter, we’ll start by actually using ruby.

The original work is Copyright © 2002 - 2004 Minero AOKI. Translated by Vincent ISAMBART and Clifford Escobar CAOILE. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License.

Ruby Hacking Guide

Translated by Sebastian Krause

Chapter 1: Introduction

A Minimal Introduction to Ruby

Here the Ruby prerequisites are explained, which one needs to know in order to understand the first section. I won’t point out programming techniques or points one should be careful about. So don’t think you’ll be able to write Ruby programs just because you read this chapter. Readers who have prior experience with Ruby can skip this chapter.

We will talk about grammar extensively in the second section, hence I won’t delve into the finer points of grammar here. For hash literals and such, I’ll show only the most widely used notations. As a rule, I won’t omit things even where I can. This way the syntax becomes simpler. I won’t always say “We can omit this”.

Objects

Strings

Everything that can be manipulated in a Ruby program is an object. There are no primitives like Java’s int and long. For instance, if we write as below, it denotes a string object with content content.

    "content"

I casually called it a string object, but to be precise this is an expression which generates a string object. Therefore, if we write it several times, each time another string object is generated.

    "content"
    "content"
    "content"

Here three string objects with content content are generated.
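That each literal produces its own object can be checked with a small sketch. It uses local variables and the methods == and equal?, none of which have been introduced yet at this point, and it assumes the default setting in which string literals are not frozen (with frozen string literals, identical literals may share one object).

```ruby
a = "content"
b = "content"

p(a == b)        # Shows true   (the contents are the same)
p(a.equal?(b))   # Shows false  (they are two different objects)
```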

By the way, objects just existing there can’t be seen by programmers. Let’s show how to print them on the terminal.

    p("content")   # Shows "content"

Everything after a # is a comment. From now on, I’ll put the result of an expression in a comment after it.

p(……) calls the function p. It displays arbitrary objects “as such”. It’s basically a debugging function.

Precisely speaking, there are no functions in Ruby, but just for now we can think of it as a function. You can use functions wherever you are.

Various Literals

Now, let’s explain some more of the expressions which directly generate objects, the so-called literals. First, the integers and floating point numbers.

    # Integer
    1
    2
    100
    9999999999999999999999999   # Arbitrarily big integers

    # Float
    1.0
    99.999
    1.3e4     # 1.3×10^4

Don’t forget that these are all expressions which generate objects. I’m repeating myself, but there are no primitives in Ruby.

Below, an array object is generated.

    [1, 2, 3]

This program generates an array which consists of the three integers 1, 2 and 3, in that order. As the elements of an array can be arbitrary objects, the following is also possible.

    [1, "string", 2, ["nested", "array"]]

And finally, a hash table is generated by the expression below.

    {"key"=>"value", "key2"=>"value2", "key3"=>"value3"}

A hash table is a structure which expresses one-to-one relationships between arbitrary objects. The above line creates a table which stores the following relationships.

    "key"   →  "value"
    "key2"  →  "value2"
    "key3"  →  "value3"

If we ask a hash table created in this way “What corresponds to key?”, it’ll answer “That’s value.” How can we ask? We use methods.
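As a small preview of such a question, the sketch below looks up keys with the standard Hash methods [] and fetch. The variable name table is chosen only for this example, and local variables themselves are explained a little later.

```ruby
table = {"key"=>"value", "key2"=>"value2", "key3"=>"value3"}

p(table["key"])          # Shows "value"
p(table.fetch("key2"))   # Shows "value2"
p(table["missing"])      # Shows nil
```

Asking with [] for a key that is not stored answers nil, while fetch would raise an error for it.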

Method Calls

We can call methods on an object. In C++ jargon they are member functions. I don’t think it’s necessary to explain what a method is. I’ll just explain the notation.

    "content".upcase()

Here the upcase method is called on a string object (with content content). As upcase is a method which returns a new string with the small letters replaced by capital letters, we get the following result.

    p("content".upcase())   # Shows "CONTENT"

Method calls can be chained.

    "content".upcase().downcase()

Here the method downcase is called on the return value of "content".upcase().

There are no public fields (member variables) as in Java or C++. The object interface consists of methods only.

The Program

Top Level

In Ruby we can just write expressions and it becomes a program. One doesn’t need to define a main() as in C++ or Java.

    p("content")

This is a complete Ruby program. If we put this into a file called first.rb we can execute it from the command line as follows.

    % ruby first.rb
    "content"

With the -e option of the ruby program we don’t even need to create a file.

    % ruby -e 'p("content")'
    "content"

By the way, the place where p is written is the lowest nesting level of the program, which means the highest level from the program’s standpoint, thus it’s called “top-level”. Having a top-level is a characteristic trait of Ruby as a scripting language.

In Ruby, one line is usually one statement. A semicolon at the end isn’t necessary. Therefore the program below is interpreted as three statements.

    p("content")
    p("content".upcase())
    p("CONTENT".downcase())

When we execute it, it looks like this.

    % ruby second.rb
    "content"
    "CONTENT"
    "content"

Local Variables

In Ruby all variables and constants store references to objects. That’s why one can’t copy the content by assigning one variable to another variable. Variables of type Object in Java or pointers to objects in C++ are good to think of. However, you can’t change the value of each pointer itself.

In Ruby one can tell the classification (scope) of a variable by the beginning of its name. Local variables start with a lowercase letter or an underscore. One can write assignments by using “=”.

    str = "content"
    arr = [1, 2, 3]

An initial assignment serves as a declaration; an explicit declaration is not necessary. Because variables don’t have types, we can assign any kind of object indiscriminately. The program below is completely legal.

    lvar = "content"
    lvar = [1, 2, 3]
    lvar = 1

But even if we can, we don’t have to do it. If different kinds of objects are put in one variable, it tends to become difficult to read. In a real-world Ruby program one doesn’t do this kind of thing without a good reason. The above was just an example for the sake of it.

Variable reference also has a pretty sensible notation.

    str = "content"
    p(str)   # Shows "content"

In addition, let’s check the point that a variable holds a reference by taking an example.

    a = "content"
    b = a
    c = b

After we execute this program, all three local variables a, b and c point to the same object, a string object with content "content" created on the first line (Figure 1).

Figure 1: Ruby variables store references to objects
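The same point can be observed from within a program. The sketch below uses an array instead of a string and relies on the standard methods push, equal? and dup, which this chapter has not introduced; it is meant only as a quick experiment.

```ruby
a = [1, 2, 3]
b = a
c = b

b.push(4)        # modify the object through b

p(a)             # Shows [1, 2, 3, 4]
p(a.equal?(c))   # Shows true   (a and c refer to the same object)

d = a.dup        # an explicit copy creates a different object
p(a.equal?(d))   # Shows false
```

Because assignment copies only the reference, a change made through one variable is visible through all of them; getting an independent object requires an explicit copy.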

Bytheway,asthesevariablesarecalledlocal,theyshouldbelocaltosomewhere,butwecannottalkaboutthisscopewithoutreadingabitfurther.Let’ssayfornowthatthetoplevelisonelocalscope.

Constants

Constants start with a capital letter. They can only be assigned once (at their creation).

Const = "content"
PI = 3.1415926535
p(Const)   # Shows "content"

I’d like to say that if we assign twice an error occurs. But there is just a warning, not an error. It is this way in order to avoid raising an error even when the same file is loaded twice in applications that manipulate Ruby programs themselves, for instance in development environments. Therefore, it is allowed due to practical requirements and there’s no other choice, but essentially there should be an error. In fact, up until version 1.1 there really was an error.

C = 1
C = 2   # There is a warning but ideally there should be an error.

A lot of people are fooled by the word constant. A constant only guarantees that it does not switch to another object once it is assigned. It does not mean that the object it points to won’t change. The term “read only” might capture the concept better than “constant”.

By the way, to indicate that an object itself shouldn’t be changed, another means is used: freeze.

Figure 2: constant means read only
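The difference between “the constant does not switch objects” and “the object does not change” can be tried out directly. This is a hedged sketch: the constant names Greeting and Locked are made up for illustration, and the exact exception class raised when modifying a frozen object differs between Ruby versions, so the example only checks that some error was raised.

```ruby
Greeting = String.new("content")
Greeting << " (changed)"     # the object changes; the constant is not reassigned
p(Greeting)                  # Shows "content (changed)"

Locked = String.new("content").freeze
freeze_error = nil
begin
  Locked << "!"              # modifying a frozen object raises an error
rescue => e
  freeze_error = e
end

p(Locked.frozen?)            # Shows true
p(freeze_error.nil?)         # Shows false
p(Locked)                    # Shows "content"
```

No warning is printed for Greeting because it is never assigned a second time; only the string it points to is modified.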

And the scope of constants cannot be described yet either. It will be discussed in the next section, together with classes.

Control Structures

Since Ruby has a wide abundance of control structures, just lining them up can be a huge task. For now, I just mention that there are if and while.

if i < 10 then
  # body
end

while i < 10 do
  # body
end

In a conditional expression, only the two objects false and nil are false, and all other objects are true. 0 and the empty string are also true, of course.

It wouldn’t be wise if there were just false, so there is also true. And it is of course true.
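The truth rule is easy to check for a handful of values. The helper truthy? below is a hypothetical name introduced only for this sketch; it simply runs the value through an if.

```ruby
# Hypothetical helper: reports how a value behaves in a condition.
def truthy?(value)
  if value then true else false end
end

results = [false, nil, true, 0, "", []].map { |v| truthy?(v) }
p(results)   # Shows [false, false, true, true, true, true]
```

Only false and nil end up on the false side; 0, the empty string and the empty array are all treated as true.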

Classes and Methods

Classes

In an object-oriented system, methods essentially belong to objects. That holds only in an ideal world, though. In a normal program there are a lot of objects which have the same set of methods, and it would be an enormous amount of work if each object had to remember its own set of callable methods. Usually a mechanism like classes or multimethods is used to get rid of the duplication of definitions.

In Ruby, the concept of classes is used as the traditional way to bind objects and methods together. Namely, every object belongs to a class, and the methods which can be called are determined by the class. An object is then called “an instance of the XX class”.

For example the string "str" is an instance of the String class. And on this String class the methods upcase, downcase, strip and many others are defined. So it looks as if each string object can respond to all these methods.

# They all belong to the String class,
# hence the same methods are defined
"content".upcase()
"This is a pen.".upcase()
"chapter II".upcase()

"content".length()
"This is a pen.".length()
"chapter II".length()

By the way, what happens if the called method isn’t defined? In a static language a compiler error occurs but in Ruby there is a runtime exception. Let’s try it out. For this kind of program the -e option is handy.

% ruby -e '"str".bad_method()'
-e:1: undefined method `bad_method' for "str":String (NoMethodError)

When the method isn’t found there’s apparently a NoMethodError.

Always saying “the upcase method of String” and such is cumbersome. Let’s introduce a special notation: String#upcase refers to the method upcase defined in the class String.

By the way, if we write String.upcase it has a completely different meaning in the Ruby world. What could that be? I will explain it shortly.

Class Definition

Up to now we talked about already defined classes. We can of course also define our own classes. To define classes we use the class statement.

class C
end

This is the definition of a new class C. After we have defined it we can use it as follows.

class C
end
c = C.new()   # create an instance of C and assign it to the variable c

Note that the notation for creating a new instance is not new C. The astute reader might think: Hmm, this C.new() really looks like a method call. In Ruby the object generating expressions are indeed just methods.

In Ruby class names and constant names are the same. Then, what is stored in the constant whose name is the same as a class name? In fact, it’s the class. In Ruby all things which a program can manipulate are objects. So of course classes are also expressed as objects. Let’s call these class objects. Every class is an instance of the class Class.

In other words a class statement creates a new class object and assigns it to a constant named after the class. The creation of an instance, on the other hand, references this constant and calls a method on this object (usually new). If we look at the example below, it’s pretty obvious that the creation of an instance doesn’t differ from a normal method call.

S = "content"
class C
end

S.upcase()   # Get the object the constant S points to and call upcase
C.new()      # Get the object the constant C points to and call new

So new is not a reserved word in Ruby.

And we can also use p for an instance of a class even immediately after its creation.

class C
end

c = C.new()
p(c)       # #<C:0x2acbd7e4>

It won’t display as nicely as a string or an integer but it shows its respective class and its internal ID. This ID is the pointer value which points to the object.

Oh, I completely forgot to mention the notation of method names: Object.new means the class object Object and the new method called on the class itself. So Object#new and Object.new are completely different things, and we have to separate them strictly.

obj = Object.new()   # Object.new
obj.new()            # Object#new

In practice a method Object#new is almost never defined so the second line will raise an error. Please regard this as an example of the notation.

Method Definition

Even if we can define classes, it is useless if we cannot define methods. Let’s define a method for our class C.

class C
  def myupcase( str )
    return str.upcase()
  end
end

To define a method we use the def statement. In this example we defined the method myupcase. The name of the only parameter is str. As with variables, it’s not necessary to write parameter types or the return type. And we can use any number of parameters.

Let’s use the defined method. By default, methods can be called from the outside.

c = C.new()
result = c.myupcase("content")
p(result)   # Shows "CONTENT"

Of course if you get used to it you don’t need to assign every time. The line below gives the same result.

p(C.new().myupcase("content"))   # Also shows "CONTENT"

self

During the execution of a method the information about who is itself (the instance on which the method was called) is always saved and can be picked up in self. It is like the this in C++ or Java. Let’s check this out.

class C
  def get_self()
    return self
  end
end

c = C.new()
p(c)              # #<C:0x40274e44>
p(c.get_self())   # #<C:0x40274e44>

As we see, the above two expressions return the exact same object. We could confirm that self is c during the method call on c.

Then what is the way to call a method on itself? What first comes to mind is calling via self.

class C
  def my_p( obj )
    self.real_my_p(obj)   # called a method against oneself
  end

  def real_my_p( obj )
    p(obj)
  end
end

C.new().my_p(1)   # Output 1

But always adding the self when calling an own method is tedious. Hence, it is designed so that one can omit the object the method is called on (the receiver) whenever one calls a method on self.

class C
  def my_p( obj )
    real_my_p(obj)   # You can call without specifying the receiver
  end

  def real_my_p( obj )
    p(obj)
  end
end

C.new().my_p(1)   # Output 1

Instance Variables

As there is a saying “Objects are data and code”, just being able to define methods alone would not be so useful. Each object must also be able to store data. In other words instance variables. Or in C++ jargon member variables.

In the fashion of Ruby’s variable naming convention, the variable type can be determined by the first few characters. For instance variables it’s an @.

class C
  def set_i(value)
    @i = value
  end

  def get_i()
    return @i
  end
end

c = C.new()
c.set_i("ok")
p(c.get_i())   # Shows "ok"

Instance variables differ a bit from the variables seen before: we can reference them without assigning (defining) them. To see what happens we add the following lines to the code above.

c = C.new()
p(c.get_i())   # Shows nil

Calling get without set gives nil. nil is the object which indicates “nothing”. It’s mysterious that there’s really an object but it means nothing, but that’s just the way it is.

We can use nil like a literal as well.

p(nil)   # Shows nil

initialize

As we saw before, when we call ‘new’ on a freshly defined class, we can create an instance. That’s sure, but sometimes we might want to have a peculiar instantiation. In this case we don’t change the new method, we define the initialize method. When we do this, it gets called within new.

class C
  def initialize()
    @i = "ok"
  end
  def get_i()
    return @i
  end
end
c = C.new()
p(c.get_i())   # Shows "ok"

Strictly speaking this is the specification of the new method but not the specification of the language itself.
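To make “it gets called within new” a little more concrete, here is a rough sketch of what new does, written in Ruby itself. The method my_new and the class Greeter are hypothetical names used only for this illustration, and the real new is implemented in C, so this only approximates its behavior: create an uninitialized instance with allocate, then call initialize on it.

```ruby
class Greeter
  def initialize()
    @i = "ok"
  end
  def get_i()
    return @i
  end
end

# Hypothetical stand-in for Greeter.new(): allocate first, then initialize.
def my_new(klass, *args)
  obj = klass.allocate()         # an instance on which initialize has not run yet
  obj.send(:initialize, *args)   # initialize is private, so go through send
  return obj
end

obj = my_new(Greeter)
p(obj.get_i())                   # Shows "ok"
p(Greeter.allocate().get_i())    # Shows nil
```

The last line shows the state before initialize has run: the instance variable has not been assigned yet, so it reads as nil.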

Inheritance

Classes can inherit from other classes. For instance String inherits from Object. In this book, we’ll indicate this relation by a vertical arrow as in Fig. 3.

Figure 3: Inheritance

In the case of this illustration, the inherited class (Object) is called superclass or superior class. The inheriting class (String) is called subclass or inferior class. This point differs from C++ jargon, be careful. But it’s the same as in Java.

Anyway let’s try it out. Let our created class inherit from another class. To inherit from another class (or designate a superclass) write the following.

class C < SuperClassName
end

When we leave out the superclass like in the cases before, the class Object tacitly becomes the superclass.

Now, why should we want to inherit? Of course to hand over methods. Handing over means that the methods which were defined in the superclass also work in the subclass as if they were defined in there once more. Let’s check it out.

class C
  def hello()
    return "hello"
  end
end

class Sub < C
end

sub = Sub.new()
p(sub.hello())   # Shows "hello"

hello was defined in the class C but we could call it on an instance of the class Sub as well. Of course we don’t need to assign variables. The above is the same as the line below.

p(Sub.new().hello())

By defining a method with the same name, we can overwrite the method. In C++ and Object Pascal (Delphi) it’s only possible to overwrite functions explicitly defined with the keyword virtual but in Ruby every method can be overwritten unconditionally.

class C
  def hello()
    return "Hello"
  end
end

class Sub < C
  def hello()
    return "Hello from Sub"
  end
end

p(Sub.new().hello())   # Shows "Hello from Sub"
p(C.new().hello())     # Shows "Hello"

We can inherit over several steps. For instance as in Fig. 4 Fixnum inherits every method from Object, Numeric and Integer. When there are methods with the same name the nearer classes take preference. As there is no overloading by type at all, the rules are extremely straightforward.

Figure 4: Inheritance over multiple steps

In C++ it’s possible to create a class which inherits nothing. In Ruby, on the other hand, one has to inherit from the Object class either directly or indirectly. In other words when we draw the inheritance relations it becomes a single tree with Object at the top. For example, when we draw a tree of the inheritance relations among the important classes of the basic library, it would look like Fig. 5.

Figure 5: Ruby’s class tree

Once the superclass is appointed (in the definition statement) it’s impossible to change it. In other words, one can add a new class to the class tree but cannot change a position or delete a class.

Inheritance of Variables……?

In Ruby (instance) variables aren’t inherited. Even when it inherits, a class does not know which variables are going to be used.

But when an inherited method is called (on an instance of a subclass), the assignment of instance variables happens, which means they become defined. Then, since the namespace of instance variables is completely flat within each instance, they can be accessed by a method of whichever class.

class A
  def initialize()   # called when processing new()
    @i = "ok"
  end
end

class B < A
  def print_i()
    p(@i)
  end
end

B.new().print_i()   # Shows "ok"

If you can’t agree with this behavior, let’s forget about classes and inheritance. When there’s an instance obj of the class C, think of it as if all the methods of the superclass of C were defined in C. Of course we keep the overwrite rule in mind. Then the methods of C get attached to the instance obj (Fig. 6). This strong palpability is a specialty of Ruby’s object orientation.

Figure 6: A conception of a Ruby object

Modules

Only a single superclass can be designated. So Ruby looks like single inheritance. But because of modules it has in practice an ability which is identical to multiple inheritance. Let’s explain these modules next.

In short, modules are classes for which a superclass cannot be designated and instances cannot be created. A module is defined as follows.

module M
end

Here the module M was defined. Methods are defined exactly the same way as for classes.

module M
  def myupcase( str )
    return str.upcase()
  end
end

But because we cannot create instances, we cannot call them directly. To do that, we use the module by “including” it into other classes. Then we become able to deal with it as if the class inherited from the module.

module M
  def myupcase( str )
    return str.upcase()
  end
end

class C
  include M
end

p(C.new().myupcase("content"))   # "CONTENT" is shown

Even though no method was defined in the class C we can call the method myupcase. It means it “inherited” the method of the module M. Inclusion is functionally completely the same as inheritance. There’s no limit on defining methods or accessing instance variables.

I said we cannot specify any superclass of a module, but other modules can be included.

module M
end

module M2
  include M
end

In other words it’s functionally the same as appointing a superclass. But a class cannot come above a module. Only modules are allowed above modules.

The example below also contains the inheritance of methods.

module OneMore
  def method_OneMore()
    p("OneMore")
  end
end

module M
  include OneMore

  def method_M()
    p("M")
  end
end

class C
  include M
end

C.new().method_M()         # Output "M"
C.new().method_OneMore()   # Output "OneMore"

As with classes, when we sketch the inheritance it looks like Fig. 7.

Figure 7: multilevel inclusion

Besides, the class C also has a superclass. How does it relate to modules? For instance, let’s think of the following case.

# modcls.rb

class Cls
  def test()
    return "class"
  end
end

module Mod
  def test()
    return "module"
  end
end

class C < Cls
  include Mod
end

p(C.new().test())   # "class"? "module"?

C inherits from Cls and includes Mod. Which will be shown in this case, "class" or "module"? In other words, which one is “closer”, class or module? We’d better ask Ruby about Ruby, thus let’s execute it:

% ruby modcls.rb
"module"

Apparently a module takes preference before the superclass.

In general, in Ruby when a module is included, it is inherited by going in between the class and the superclass. As a picture it might look like Fig. 8.

Figure 8: The relation between modules and classes

And if we also take the modules included in the module into account, it would look like Fig. 9.

Figure 9: The relation between modules and classes (2)
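The order drawn in these figures can also be asked of Ruby directly with Module#ancestors, which lists a class followed by the modules and classes above it in method search order. The names Child, Parent, Mixin and NestedMixin below are hypothetical and only mirror the roles of C, Cls, Mod and a module included in Mod.

```ruby
module NestedMixin
end

module Mixin
  include NestedMixin
end

class Parent
end

class Child < Parent
  include Mixin
end

chain = Child.ancestors()
p(chain.take(4))   # Shows [Child, Mixin, NestedMixin, Parent]
```

The included module, and the module it includes in turn, both sit between the class and its superclass. The rest of the list (Object and whatever lies above it) depends on the Ruby version, so only the first four entries are shown.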

The Program revisited

Caution. This section is extremely important and explains elements which are not easy to get used to for programmers who have only used static languages before. For other parts just skimming is sufficient, but I’d like you to read this part carefully. The explanation will also be relatively attentive.

Nesting of Constants

First a repetition of constants. As a constant begins with a capital letter the definition goes as follows.

Const = 3

Now we reference the constant in this way.

p(Const)   # Shows 3

Actually we can also write this.

p(::Const)   # Shows 3 in the same way.

The :: in front shows that it’s a constant defined at the top level. You can think of the path in a filesystem. Assume there is a file vmunix in the root directory. Being at / one can write vmunix to access the file. One can also write /vmunix as its full path. It’s the same with Const and ::Const. At top level it’s okay to write only Const or to write the full path ::Const.

And what corresponds to a filesystem’s directories in Ruby? That should be class and module definition statements. However mentioning both is cumbersome, so I’ll just subsume them under class definition. When one enters a class definition the level for constants rises (as if entering a directory).

class SomeClass
  Const = 3
end

p(::SomeClass::Const)   # Shows 3
p(  SomeClass::Const)   # The same. Shows 3

SomeClass is defined at top level. Hence one can reference it by writing either SomeClass or ::SomeClass. And as the constant Const nested in the class definition is a Const “inside SomeClass”, it becomes ::SomeClass::Const.

As we can create a directory in a directory, we can create a class inside a class. For instance like this:

class C        # ::C
  class C2     # ::C::C2
    class C3   # ::C::C2::C3
    end
  end
end

By the way, for a constant defined in a class definition statement, should we always write its full name? Of course not. As with the filesystem, if one is inside the same class definition one can skip the ::. It becomes like this:

class SomeClass
  Const = 3
  p(Const)   # Shows 3.
end

“What?” you might think. Surprisingly, even inside a class definition statement, we can write code which is going to be executed. People who are used to only static languages will find this quite exceptional. I was also flabbergasted the first time I saw it.

Let’s add that we can of course also view a constant inside a method. The reference rules are the same as within the class definition (outside the method).

class C
  Const = "ok"
  def test()
    p(Const)
  end
end

C.new().test()   # Shows "ok"

Everything is executed

Looking at the big picture I want to write one more thing. In Ruby almost every part of a program is “executed”. Constant definitions, class definitions, method definitions and almost all the rest are executed in the apparent order.

Look for instance at the following code. I used various constructions which have been used before.

 1:  p("first")
 2:
 3:  class C < Object
 4:    Const = "in C"
 5:
 6:    p(Const)
 7:
 8:    def myupcase(str)
 9:       return str.upcase()
10:    end
11:  end
12:
13:  p(C.new().myupcase("content"))

This program is executed in the following order:

1: p("first")                   Shows "first"
3: < Object                     The constant Object is referenced and the class object Object is obtained
3: class C                      A new class object with superclass Object is generated, and assigned to the constant C
4: Const = "in C"               Assigning the value "in C" to the constant ::C::Const
6: p(Const)                     Showing the constant ::C::Const hence "in C"
8: def myupcase(...)...end      Define C#myupcase
13: C.new().myupcase(...)       Refer the constant C, call the method new on it, and then myupcase on the return value
9: return str.upcase()          Returns "CONTENT"
13: p(...)                      Shows "CONTENT"
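The execution order in the table can be observed by recording each step as it happens. This is a small sketch under a few assumptions: $trace is a hypothetical global variable (global variables are only introduced at the end of this chapter; one is used here because a local variable would not be visible inside the class body), and Traced is a made-up class name.

```ruby
$trace = []   # hypothetical global, used only to record the order of events

$trace << "before class"
class Traced
  $trace << "inside class body"      # runs while the class statement is executed
  def hello()
    $trace << "inside method"        # runs only when hello is called
    return "hello"
  end
  $trace << "after def"
end
$trace << "after class"
Traced.new().hello()

p($trace)
# Shows ["before class", "inside class body", "after def", "after class", "inside method"]
```

The def statement itself is executed in order (the method becomes defined at that point), but the method body runs only when the method is called.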

The Scope of Local Variables

At last we can talk about the scope of local variables.

The top level, the interior of a class definition, the interior of a module definition and a method body each have a completely independent local variable scope. In other words, the lvar variables in the following program are all different variables, and they do not influence each other.

lvar = 'toplevel'

class C
  lvar = 'in C'
  def method()
    lvar = 'in C#method'
  end
end

p(lvar)   # Shows "toplevel"

module M
  lvar = 'in M'
end

p(lvar)   # Shows "toplevel"

self as context

Previously, I said that during method execution oneself (an object on which the method was called) becomes self. That’s true but only half true. Actually during the execution of a Ruby program, self is always set wherever it is. It means there’s self also at the top level or in a class definition statement.

For instance the self at the top level is main. It’s an instance of the Object class which is nothing special. main is provided to set up self for the time being. There’s no deeper meaning attached to it.

Hence the top level’s self, i.e. main, is an instance of Object, such that one can call the methods of Object there. And in Object the module Kernel is included. In there the function-flavor methods like p and puts are defined (Fig. 10). That’s why one can call puts and p also at the top level.

Figure 10: main, Object and Kernel

Thus p isn’t a function, it’s a method. It is just that, because it is defined in Kernel, it can be called like a function as “its own” method wherever it is, no matter what the class of self is. Therefore, there aren’t functions in the true sense, there are only methods.

By the way, besides p and puts there are the function-flavor methods print, printf, sprintf, gets, fork, and exec and many more with somewhat familiar names. When you look at the choice of names you might be able to imagine Ruby’s character.

Well, since self is set up everywhere, self should also be in a class definition in the same way. The self in the class definition is the class itself (the class object). Hence it would look like this.

class C
  p(self)   # C
end

What should this be good for? In fact, we’ve already seen an example in which it is very useful. This one.

module M
end

class C
  include M
end

This include is actually a method call on the class object C. I haven’t mentioned it yet but the parentheses around arguments can be omitted for method calls. And I omitted the parentheses around include so that it wouldn’t look like a method call, because we had not yet finished the discussion of class definition statements.
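That include is an ordinary method call can be checked by writing the parentheses out, or by calling it with an explicit receiver from outside the class body. A hedged sketch follows: Shouting, Speaker and Listener are hypothetical names, and send is used for the explicit-receiver call because include has been a private method in some Ruby versions.

```ruby
module Shouting
  def myupcase(str)
    return str.upcase()
  end
end

class Speaker
  include(Shouting)   # the same call as `include Shouting`, parentheses written out
end

class Listener
end
Listener.send(:include, Shouting)   # the same method, called on the class object from outside

p(Speaker.new().myupcase("content"))    # Shows "CONTENT"
p(Listener.new().myupcase("content"))   # Shows "CONTENT"
```

Both classes end up with the module's method, because in both cases the receiver of include is the class object.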

Loading

In Ruby the loading of libraries also happens at runtime. Normally one writes this.

require("library_name")

If this looks like a method call, the impression isn’t false: require is a method. It’s not even a reserved word. When it is written this way, loading is executed on the line it is written, and the execution is handed over to (the code of) the library. As there is no concept like Java packages in Ruby, when we’d like to separate namespaces, it is done by putting files into a directory.

require("somelib/file1")
require("somelib/file2")

And in the library usually classes and such are defined with class statements or module statements. The constant scope of the top level is flat without the distinction of files, so one can see classes defined in another file without any special preparation. To partition the namespace of class names one has to explicitly nest modules as shown below.

# example of the namespace partition of net library
module Net
  class SMTP
    # ...
  end
  class POP
    # ...
  end
  class HTTP
    # ...
  end
end

More about Classes

The talk about Constants still goes on

Up to now we used the filesystem metaphor for the scope of constants, but I want you to completely forget that.

There is more about constants. Firstly one can also see constants in the “outer” class.

Const = "ok"
class C
  p(Const)   # Shows "ok"
end

The reason why this is designed in this way is because this becomes useful when modules are used as namespaces. Let’s explain this by adding a few things to the previous example of net library.

module Net
  class SMTP
    # Uses Net::SMTPHelper in the methods
  end
  class SMTPHelper   # Supports the class Net::SMTP
  end
end

In such a case, it’s convenient if we can refer to it also from the SMTP class just by writing SMTPHelper, isn’t it? Therefore, it is concluded that “it’s convenient if we can see the outer classes”.

The outer class can be referenced no matter how many levels deep the nesting is. When the same name is defined on different levels, the one which is found first from within is referred to.

Const = "far"
class C
  Const = "near"   # This one is closer than the one above
  class C2
    class C3
      p(Const)   # "near" is shown
    end
  end
end

There’s another way of searching constants. If the top level is reached when going further and further outside, then the own superclass is searched for the constant.

class A
  Const = "ok"
end
class B < A
  p(Const)   # "ok" is shown
end

Really, that’s pretty complicated.

Let’s summarize. When looking up a constant, first the outer classes are searched, then the superclasses. This is quite contrived, but let’s assume a class hierarchy as follows.

class A1
end
class A2 < A1
end
class A3 < A2
  class B1
  end
  class B2 < B1
  end
  class B3 < B2
    class C1
    end
    class C2 < C1
    end
    class C3 < C2
      p(Const)
    end
  end
end

When the constant Const in C3 is referenced, it’s looked up in the order depicted in Fig. 11.

Figure 11: Search order for constants

Be careful about one point. The superclasses of the classes outside, for instance A1 and B2, aren’t searched at all. If it’s outside once it’s always outside and if it’s superclass once it’s always superclass. Otherwise, the number of classes searched would become too big and the behavior of such a complicated thing would become unpredictable.
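The rule that the superclasses of outer classes are skipped can be tried out with a smaller hierarchy. All class and constant names below (OuterBase, InnerBase, Outer, Inner and the In... constants) are hypothetical and exist only for this sketch.

```ruby
class OuterBase
  InOuterBase = "superclass of the outer class"
end

class InnerBase
  InInnerBase = "superclass of the inner class"
end

class Outer < OuterBase
  InOuter = "outer class"

  class Inner < InnerBase
    def self.lookup()
      found = [InOuter, InInnerBase]   # outer class first, then the own superclass
      begin
        value = InOuterBase            # the outer class's superclass is not searched
        found << value
      rescue NameError => e
        found << e.class
      end
      return found
    end
  end
end

p(Outer::Inner.lookup())
# Shows ["outer class", "superclass of the inner class", NameError]
```

InOuterBase is reachable as Outer::InOuterBase or OuterBase::InOuterBase, but the bare name is not found from inside Inner, which matches the search order described above.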

Metaclasses

I said that a method can be called on anything that is an object. I also said that the methods that can be called are determined by the class of an object. Then shouldn’t there be a class for class objects? (Fig. 12)

Figure 12: A class of classes?

In this kind of situation, in Ruby, we can check in practice. It’s because there’s “a method which returns the class (class object) to which an object itself belongs”, Object#class.

p("string".class())   # String is shown
p(String.class())     # Class is shown
p(Object.class())     # Class is shown

Apparently String belongs to the class named Class. Then what’s the class of Class?

p(Class.class())   # Class is shown

Again Class. In other words, whatever object it is, by following like .class().class().class() …, it would reach Class in the end, then it will stall in the loop (Fig. 13).

Figure 13: The class of the class of the class…

Class is the class of classes. And what has a recursive structure as “X of X” is called a meta-X. Hence Class is a metaclass.

Metaobjects

Let’s change the target and think about modules. As modules are also objects, there also should be a class for them. Let’s see.

module M
end
p(M.class())   # Module is shown

The class of a module seems to be Module. And what should be the class of the class Module?

p(Module.class())   # Class

It’s again Class.

Now we change the direction and examine the inheritance relationships. What’s the superclass of Class and Module? In Ruby, we can find it out with Class#superclass.

p(Class.superclass())    # Module
p(Module.superclass())   # Object
p(Object.superclass())   # nil

So Class is a subclass of Module. Based on these facts, Figure 14 shows the relationships between the important classes of Ruby.

Figure 14: The class relationship between the important Ruby classes

Up to now we used new and include without any explanation, but finally I can explain their true form. new is really a method defined for the class Class. Therefore on whatever class (because it is an instance of Class), new can be used immediately. But new isn’t defined in Module. Hence it’s not possible to create instances of a module. And since include is defined in the Module class, it can be called on both modules and classes.

These three classes Object, Module and Class are objects that support the foundation of Ruby. We can say that these three objects describe Ruby’s object world itself. Namely they are objects which describe objects. Hence, Object, Module and Class are Ruby’s “meta-objects”.
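Where new and include are defined can be checked with Ruby's reflection methods. The names SampleModule and SampleClass are hypothetical; the check for include looks at both public and private methods because its visibility has differed between Ruby versions.

```ruby
module SampleModule
end

class SampleClass
end

# new is an instance method of Class, so every class object responds to it
p(Class.instance_methods(false).include?(:new))   # Shows true
p(SampleClass.respond_to?(:new))                   # Shows true

# a module is an instance of Module, not of Class, so it has no new
p(SampleModule.respond_to?(:new))                  # Shows false

# include is defined in Module, so modules and classes both have it
has_include = Module.method_defined?(:include) ||
              Module.private_method_defined?(:include)
p(has_include)                                     # Shows true
```

Since Class is a subclass of Module, everything defined in Module, include among it, is available on classes too, while new stays exclusive to classes.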

Singleton Methods

I said that methods can be called on anything that is an object. I also said that the methods that can be called are determined by the object’s class. However I think I also said that ideally methods belong to objects. Classes are just a means to eliminate the effort of defining the same method more than once.

Actually in Ruby there’s also a means to define methods for individual objects (instances), not depending on the class. To do this, you can write this way.

obj = Object.new()
def obj.my_first()
  puts("My first singleton method")
end
obj.my_first()   # Shows My first singleton method

As you already know Object is the root for every class. It’s very unlikely that a method with a name as weird as my_first is defined in such an important class. And obj is an instance of Object. However the method my_first can be called on obj. Hence we have created without doubt a method which has nothing to do with the class the object belongs to. These methods which are defined for each object individually are called singleton methods.
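That such a method really belongs to one object only can be confirmed by comparing two instances of the same class. In this brief sketch the variable other is introduced only for the comparison.

```ruby
obj = Object.new()
other = Object.new()   # a second instance of the same class

def obj.my_first()
  return "My first singleton method"
end

p(obj.singleton_methods())        # Shows [:my_first]
p(obj.respond_to?(:my_first))     # Shows true
p(other.respond_to?(:my_first))   # Shows false
```

Both objects are instances of Object, yet only obj responds to my_first.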

When are singleton methods used? First, they are used when defining something like the static methods of Java or C++. In other words methods which can be used without creating an instance. These methods are expressed in Ruby as singleton methods of a class object.

For example in UNIX there’s a system call unlink. This system call deletes a file entry from the filesystem. In Ruby it can be used directly as the singleton method unlink of the File class. Let’s try it out.

File.unlink("core")   # deletes the coredump

It’s cumbersome to say “the singleton method unlink of the object File”. We simply write File.unlink. Don’t mix it up and write File#unlink, or vice versa don’t write File.write for the method write defined in File.

▼ A summary of the method notation

notation       the target object        example
File.unlink    the File class itself    File.unlink("core")
File#write     an instance of File      f.write("str")
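Defining such a “static-like” method on one's own class uses the same def syntax as for any other singleton method, with the class object as the target. The class Temperature and its methods are hypothetical, chosen only to illustrate the two notations.

```ruby
class Temperature
  def initialize(celsius)
    @celsius = celsius
  end

  # Temperature.from_fahrenheit: a singleton method of the class object
  def Temperature.from_fahrenheit(fahrenheit)
    return Temperature.new((fahrenheit - 32) * 5.0 / 9.0)
  end

  # Temperature#celsius: a method called on an instance
  def celsius()
    return @celsius
  end
end

t = Temperature.from_fahrenheit(212)   # no instance is needed to call it
p(t.celsius())                         # Shows 100.0
```

In the notation of the table, this class has Temperature.from_fahrenheit and Temperature#celsius.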

Class Variables

Class variables were added to Ruby from 1.6 on, so they are a relatively new mechanism. As with constants, they belong to a class, and they can be referenced and assigned from both the class and its instances. Let’s look at an example. The beginning of the name is @@.

class C
  @@cvar = "ok"
  p(@@cvar)   # "ok" is shown

  def print_cvar()
    p(@@cvar)
  end
end

C.new().print_cvar()   # "ok" is shown

As the first assignment serves as the definition, a reference before an assignment like the one shown below leads to a runtime error. There is an ‘@’ in front but the behavior differs completely from instance variables.

% ruby -e '
class C
  @@cvar
end
'
-e:3: uninitialized class variable @@cvar in C (NameError)

Here I was a bit lazy and used the -e option. The program is the three lines between the single quotes.

Class variables are inherited. Or saying it differently, a variable in a superior class can be assigned and referenced in the inferior class.

class A
  @@cvar = "ok"
end

class B < A
  p(@@cvar)   # Shows "ok"
  def print_cvar()
    p(@@cvar)
  end
end

B.new().print_cvar()   # Shows "ok"
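“Can be assigned in the inferior class” means the subclass writes to the very same variable, not to a copy of its own. A short sketch, with hypothetical class and method names (Base, Derived, cvar, change):

```ruby
class Base
  @@cvar = "ok"

  def Base.cvar()
    return @@cvar
  end
end

class Derived < Base
  def Derived.change(value)
    @@cvar = value   # assigns the class variable inherited from Base
  end
end

Derived.change("changed in Derived")
p(Base.cvar())   # Shows "changed in Derived"
```

The assignment made through Derived is visible from Base, so the two classes share one variable.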

Global Variables

At last there are also global variables. They can be referenced from everywhere and assigned everywhere. The first letter of the name is a $.

$gvar = "global variable"
p($gvar)   # Shows "global variable"

As with instance variables, all kinds of names can be considered defined for global variables before assignments. In other words a reference before an assignment gives a nil and doesn’t raise an error.
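This can be tried with a global variable that is never assigned anywhere. The name $never_assigned_gvar is made up for this sketch; depending on the Ruby version and warning settings, reading it may print a warning, but no error is raised.

```ruby
p(defined?($never_assigned_gvar))   # Shows nil (the variable has not been assigned)

value = $never_assigned_gvar        # reading it is still allowed
p(value)                            # Shows nil

$never_assigned_gvar = "now assigned"
p($never_assigned_gvar)             # Shows "now assigned"
```

Compare this with the class variable above, where the same kind of early reference raised a NameError.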

Copyright © 2002-2004 Minero Aoki, All rights reserved.

English Translation: Sebastian Krause <[email protected]>

The original work is Copyright © 2002-2004 Minero AOKI. Translated by Vincent ISAMBART and Clifford Escobar CAOILE. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License

Ruby Hacking Guide

Translated by Vincent ISAMBART

Chapter 2: Objects

Structure of Ruby objects

Guideline

From this chapter, we will begin actually exploring the ruby source code. First, as declared at the beginning of this book, we’ll start with the object structure.

What are the necessary conditions for objects to be objects? There could be many ways to explain what an object is, but there are only three conditions that are truly indispensable.

1. The ability to differentiate itself from other objects (an identity)
2. The ability to respond to messages (methods)
3. The ability to store internal state (instance variables)

In this chapter, we are going to confirm these three features one by one.
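Before looking at the C side, the three conditions can be seen from the Ruby level. This is only an illustration from the outside, not a view of the implementation; the class Account and its method are hypothetical.

```ruby
class Account
  def initialize(owner)
    @owner = owner
  end

  def owner()
    return @owner
  end
end

a = Account.new("alice")
b = Account.new("alice")

identical = a.equal?(b)                    # 1. identity: two distinct objects
responds  = a.respond_to?(:owner)          # 2. methods: a responds to the message owner
state     = a.instance_variable_get(:@owner)   # 3. state: stored inside the object

p(identical)   # Shows false
p(responds)    # Shows true
p(state)       # Shows "alice"
```

Even though a and b were created with the same content, they remain two different objects, each with its own copy of the instance variable.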

The target file is mainly ruby.h, but we will also briefly look at other files such as object.c, class.c or variable.c.

VALUE and object struct

In ruby, the body of an object is expressed by a struct and always handled via a pointer. A different struct type is used for each class, but the pointer type will always be VALUE (figure 1).

Figure 1: VALUE and struct

Here is the definition of VALUE:

▼ VALUE

typedef unsigned long VALUE;

(ruby.h)

In practice, when using a VALUE, we cast it to the pointer to each object struct. Therefore if an unsigned long and a pointer have a different size, ruby will not work well. Strictly speaking, it will not work if there’s a pointer type that is bigger than sizeof(unsigned long). Fortunately, systems which cannot meet this requirement are rare nowadays, but some time ago it seems there were quite a few of them.

The structs, on the other hand, have several variations; a different struct is used based on the class of the object.

struct RObject    all things for which none of the following applies
struct RClass     class object
struct RFloat     floating-point numbers
struct RString    string
struct RArray     array
struct RRegexp    regular expression
struct RHash      hash table
struct RFile      IO, File, Socket, etc…
struct RData      all the classes defined at C level, except the ones mentioned above
struct RStruct    Ruby’s Struct class
struct RBignum    big integers

For example, for a string object, struct RString is used, so we will have something like the following.

Figure 2: String object

Let’s look at the definition of a few object structs.

▼ Examples of object struct

/* struct for ordinary objects */
struct RObject {
    struct RBasic basic;
    struct st_table *iv_tbl;
};

/* struct for strings (instance of String) */
struct RString {
    struct RBasic basic;
    long len;
    char *ptr;
    union {
        long capa;
        VALUE shared;
    } aux;
};

/* struct for arrays (instance of Array) */
struct RArray {
    struct RBasic basic;
    long len;
    union {
        long capa;
        VALUE shared;
    } aux;
    VALUE *ptr;
};

(ruby.h)

Before looking at every one of them in detail, let’s begin with something more general.

First, as VALUE is defined as unsigned long, it must be cast before being used as a pointer. That’s why Rxxxx() macros have been made for each object struct. For example, for struct RString there is RSTRING(), for struct RArray there is RARRAY(), etc… These macros are used like this:

VALUE str = ....;
VALUE arr = ....;
RSTRING(str)->len;   /* ((struct RString*)str)->len */
RARRAY(arr)->len;    /* ((struct RArray*)arr)->len */

Another important point to mention is that all object structs start with a member basic of type struct RBasic. As a result, if you cast this VALUE to struct RBasic*, you will be able to access the content of basic, regardless of the type of struct pointed to by VALUE.

Figure 3: struct RBasic

Because it is purposefully designed this way, struct RBasic must contain very important information for Ruby objects. Here is the definition for struct RBasic:

▼ struct RBasic

struct RBasic {
    unsigned long flags;
    VALUE klass;
};

(ruby.h)

flags are multipurpose flags, mostly used to register the struct type (for instance struct RObject). The type flags are named T_xxxx, and can be obtained from a VALUE using the macro TYPE(). Here is an example:

VALUE str;
str = rb_str_new();    /* creates a Ruby string (its struct is RString) */
TYPE(str);             /* the return value is T_STRING */

All the type flags are named T_xxxx, like T_STRING for struct RString and T_ARRAY for struct RArray. They correspond very straightforwardly to the type names.

The other member of struct RBasic, klass, contains the class this object belongs to. As the klass member is of type VALUE, what is stored is (a pointer to) a Ruby object. In short, it is a class object.

Figure 4: object and class

The relation between an object and its class will be detailed in the “Methods” section of this chapter.

By the way, this member is named klass so as not to conflict with the reserved word class when the file is processed by a C++ compiler.

About struct types

I said that the type of struct is stored in the flags member of struct RBasic. But why do we have to store the type of struct? It’s to be able to handle all different types of struct via VALUE. If you cast a pointer to a struct to VALUE, as the type information does not remain, the compiler won’t be able to help. Therefore we have to manage the type ourselves. That’s the consequence of being able to handle all the struct types in a unified way.

OK, but the struct used is determined by the class, so why are the struct type and the class stored separately? Being able to find the struct type from the class should be enough. There are two reasons for not doing this.

The first one is (I’m sorry for contradicting what I said before) that in fact there are structs that do not have a struct RBasic (i.e. they have no klass member). For example struct RNode, which will appear in the second part of the book. However, flags is guaranteed to be the first member even in special structs like this. So if you put the type of struct in flags, all the object structs can be differentiated in one unified way.

The second reason is that there is no one-to-one correspondence between class and struct. For example, all the instances of classes defined at the Ruby level use struct RObject, so finding a struct from a class would require keeping the correspondence between each class and struct. That’s why it’s easier and faster to put the information about the type in the struct.

The use of basic.flags

Regarding the use of basic.flags, because I feel bad saying that it holds the struct type “and such”, I’ll illustrate it entirely here (Figure 5). There is no need to understand everything right away, because this is prepared for the time when you will be wondering about it later.

Figure 5: Use of flags

Looking at the diagram, it seems that 21 bits are not used on 32 bit machines. On these additional bits, the flags FL_USER0 to FL_USER8 are defined, and are used for a different purpose for each struct. In the diagram I also put FL_USER0 (FL_SINGLETON) as an example.

Objects embedded in VALUE

As I said, VALUE is an unsigned long. As VALUE is a pointer, it may look like void* would also be all right, but there is a reason for not doing this. In fact, a VALUE is not always a pointer. The 6 cases for which VALUE is not a pointer are the following:

1. small integers
2. symbols
3. true
4. false
5. nil
6. Qundef

I’ll explain them one by one.

Small integers

All data are objects in Ruby, thus integers are also objects. But since there are so many kinds of integer objects, if each of them were expressed as a struct, it would risk slowing down execution significantly. For example, when incrementing from 0 to 50000, we would hesitate to create 50000 objects for only that purpose.

That’s why in ruby, integers that are small to some extent are treated specially and embedded directly into VALUE. “Small” means signed integers that can be held in sizeof(VALUE)*8-1 bits. In other words, on 32 bit machines, the integers have 1 bit for the sign, and 30 bits for the integer part. Integers in this range will belong to the Fixnum class and the other integers will belong to the Bignum class.

Let’s see in practice the INT2FIX() macro that converts from a C int to a Fixnum, and confirm that Fixnum are directly embedded in VALUE.

▼ INT2FIX

#define INT2FIX(i) ((VALUE)(((long)(i))<<1 | FIXNUM_FLAG))
#define FIXNUM_FLAG 0x01

(ruby.h)

In brief, shift 1 bit to the left, and bitwise or it with 1.

 110100001000   before conversion
1101000010001   after conversion

That means that Fixnum as VALUE will always be an odd number. On the other hand, as Ruby object structs are allocated with malloc(), they are generally arranged on addresses multiple of 4. So they do not overlap with the values of Fixnum as VALUE.
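The tagging described above can be tried outside of ruby with a small sketch. The function names int2fix(), fix2long() and is_fixnum() are hypothetical helpers written for this illustration, not the macros from ruby.h, and VALUE is again assumed to be uintptr_t. The shift is done on the unsigned type so that negative inputs do not run into undefined behaviour in C; integers that need the topmost bit are outside the Fixnum range and are not handled here.

```c
#include <stdint.h>

typedef uintptr_t VALUE;   /* assumption: stand-in for ruby's VALUE */

#define FIXNUM_FLAG 0x01

/* Tag a small integer: shift left by one bit, then set the lowest bit. */
static VALUE int2fix(intptr_t i)
{
    return ((VALUE)i << 1) | FIXNUM_FLAG;
}

/* Undo the tagging. Clearing the flag first makes the division exact. */
static intptr_t fix2long(VALUE v)
{
    return (intptr_t)(v - FIXNUM_FLAG) / 2;
}

/* A VALUE holds a Fixnum exactly when its lowest bit is set. */
static int is_fixnum(VALUE v)
{
    return (v & FIXNUM_FLAG) != 0;
}
```

With these helpers, the binary example above corresponds to int2fix(3336) giving 6673, and the address of any properly aligned struct fails the is_fixnum() test because its lowest bit is 0.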

Also, to convert int or long to VALUE, we can use macros like INT2NUM() or LONG2NUM(). Any conversion macro XXXX2XXXX with a name containing NUM can manage both Fixnum and Bignum. For example if INT2NUM() can’t convert an integer into a Fixnum, it will automatically convert it to Bignum. NUM2INT() will convert both Fixnum and Bignum to int. If the number can’t fit in an int, an exception will be raised, so there is no need to check the value range.

Symbols

What are symbols?

As this question is quite troublesome to answer, let’s start with the reasons why symbols were necessary. In the first place, there’s a type named ID used inside ruby. Here it is.

▼ ID

typedef unsigned long ID;

(ruby.h)

This ID is a number having a one-to-one association with a string. However, it’s not possible to have an association between all strings in this world and numerical values. It is limited to the one to one relationships inside one ruby process. I’ll speak of the method to find an ID in the next chapter “Names and name tables”.

In a language processor, there are a lot of names to handle. Method names or variable names, constant names, file names, class names… It’s troublesome to handle all of them as strings (char*), because of memory management and memory management and memory management… Also, lots of comparisons would certainly be necessary, but comparing strings character by character will slow down the execution. That’s why strings are not handled directly, something will be associated and used instead. And generally that “something” will be integers, as they are the simplest to handle.

These ID are found as symbols in the Ruby world. Up to ruby 1.4, the values of ID converted to Fixnum were used as symbols. Even today these values can be obtained using Symbol#to_i. However, as real use results came piling up, it was understood that making Fixnum and Symbol the same was not a good idea, so since 1.6 an independent class Symbol has been created.

Symbol objects are used a lot, especially as keys for hash tables. That’s why Symbol, like Fixnum, was made embedded in VALUE. Let’s look at the ID2SYM() macro converting ID to Symbol object.

▼ ID2SYM

#define SYMBOL_FLAG 0x0e
#define ID2SYM(x) ((VALUE)(((long)(x))<<8|SYMBOL_FLAG))

(ruby.h)

When shifting 8 bits left, x becomes a multiple of 256, that means a multiple of 4. Then after a bitwise or (in this case it’s the same as adding) with 0x0e (14 in decimal), the VALUE expressing the symbol is not a multiple of 4. Nor is it an odd number. So it does not overlap the range of any other VALUE. Quite a clever trick.

Finally, let’s see the reverse conversion of ID2SYM(), SYM2ID().

▼ SYM2ID()

#define SYM2ID(x) RSHIFT((long)x,8)

(ruby.h)

RSHIFT is a bit shift to the right. As a right shift may or may not preserve the sign depending on the platform, it was made a macro.
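The round trip between an ID and its symbol VALUE can be sketched the same way. The helper names id2sym(), sym2id() and is_symbol() are hypothetical, and both VALUE and ID are assumed to be uintptr_t here. Working on unsigned values sidesteps the sign question that RSHIFT exists to handle.

```c
#include <stdint.h>

typedef uintptr_t VALUE;   /* assumption: stand-in for ruby's VALUE */
typedef uintptr_t ID;      /* assumption: stand-in for ruby's ID */

#define SYMBOL_FLAG 0x0e

/* Shift the ID up by 8 bits and put the symbol tag in the low byte. */
static VALUE id2sym(ID id)
{
    return ((VALUE)id << 8) | SYMBOL_FLAG;
}

/* Drop the low byte again to get the ID back. */
static ID sym2id(VALUE v)
{
    return (ID)(v >> 8);
}

/* A symbol VALUE is recognised by its low byte being exactly the tag. */
static int is_symbol(VALUE v)
{
    return (v & 0xff) == SYMBOL_FLAG;
}
```

Because the low byte is 0x0e, the resulting VALUE is even (so it cannot be mistaken for a Fixnum) and leaves a remainder of 2 when divided by 4 (so it cannot be mistaken for an aligned pointer).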

true false nil

These three are Ruby special objects. true and false represent the boolean values. nil is an object used to denote that there is no object. Their values at the C level are defined like this:

▼ true false nil

#define Qfalse 0        /* Ruby's false */
#define Qtrue  2        /* Ruby's true */
#define Qnil   4        /* Ruby's nil */

(ruby.h)

This time they are even numbers, but as values as small as 0, 2 or 4 are never used as object addresses, they can’t overlap with other VALUE. It’s because usually the first block of virtual memory is not allocated, to make the programs dereferencing a NULL pointer crash.

And as Qfalse is 0, it can also be used as false at C level. In practice, in ruby, when a function returns a boolean value, it’s often made to return an int or VALUE, and returns Qtrue/Qfalse.

For Qnil, there is a macro dedicated to check if a VALUE is Qnil or not, NIL_P().

▼ NIL_P()

#define NIL_P(v) ((VALUE)(v) == Qnil)

(ruby.h)

The name ending with p is a notation coming from Lisp denoting that it is a function returning a boolean value. In other words, NIL_P means “is the argument nil?”. It seems the “p” character comes from “predicate.” This naming rule is used at many different places in ruby.

Also, in Ruby, false and nil are false (in conditional statements) and all the other objects are true. However, in C, nil (Qnil) is true. That’s why there’s the RTEST() macro to do Ruby-style test in C.

▼ RTEST()

#define RTEST(v) (((VALUE)(v) & ~Qnil) != 0)

(ruby.h)

As in Qnil only the third lower bit is 1, in ~Qnil only the third lower bit is 0. Then only Qfalse and Qnil become 0 with a bitwise and.

!=0 has been added to be certain to only have 0 or 1, to satisfy the requirements of the glib library that only wants 0 or 1 ([ruby-dev:11049]).
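The bit trick is easy to check by hand, but a small sketch makes it harder to get wrong. The macro bodies follow the listings above, with the constants cast to a VALUE that is assumed to be uintptr_t; ruby_truthy() is a hypothetical wrapper added only so that the result can be inspected as a plain int.

```c
#include <stdint.h>

typedef uintptr_t VALUE;   /* assumption: stand-in for ruby's VALUE */

#define Qfalse ((VALUE)0)
#define Qtrue  ((VALUE)2)
#define Qnil   ((VALUE)4)

#define NIL_P(v) ((VALUE)(v) == Qnil)
#define RTEST(v) (((VALUE)(v) & ~Qnil) != 0)

/* Ruby-style truth test: 0 only for Qfalse and Qnil, 1 for everything else. */
static int ruby_truthy(VALUE v)
{
    return RTEST(v) ? 1 : 0;
}
```

Note that a plain C test such as if (Qnil) would take the true branch, since 4 is nonzero. That mismatch is exactly what RTEST() papers over.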

By the way, what is the ‘Q’ of Qnil? ‘R’ I would have understood but why ‘Q’? When I asked, the answer was “Because it’s like that in Emacs.” I did not have the fun answer I was expecting…

Qundef

▼ Qundef

#define Qundef 6                /* undefined value for placeholder */

(ruby.h)

This value is used to express an undefined value in the interpreter. It can’t (must not) be found at all at the Ruby level.

Methods

I already brought up the three important points of a Ruby object: having an identity, being able to call a method, and keeping data for each instance. In this section, I’ll explain in a simple way the structure linking objects and methods.

struct RClass

In Ruby, classes exist as objects during the execution. Of course. So there must be a struct for class objects. That struct is struct RClass. Its struct type flag is T_CLASS.

As classes and modules are very similar, there is no need to differentiate their content. That’s why modules also use the struct RClass struct, and are differentiated by the T_MODULE struct flag.

▼ struct RClass

struct RClass {
    struct RBasic basic;
    struct st_table *iv_tbl;
    struct st_table *m_tbl;
    VALUE super;
};

(ruby.h)

First, let’s focus on the m_tbl (Method TaBLe) member. struct st_table is a hash table used everywhere in ruby. Its details will be explained in the next chapter “Names and name tables”, but basically, it is a table mapping names to objects. In the case of m_tbl, it keeps the correspondence between the name (ID) of the methods possessed by this class and the method entities themselves. As for the structure of the method entity, it will be explained in Part 2 and Part 3.

The fourth member super keeps, like its name suggests, the superclass. As it’s a VALUE, it’s (a pointer to) the class object of the superclass. In Ruby there is only one class that has no superclass (the root class): Object.

However I already said that all Object methods are defined in the Kernel module, Object just includes it. As modules are functionally similar to multiple inheritance, it may seem that having just super is problematic, but in ruby some clever conversions are made to make it look like single inheritance. The details of this process will be explained in the fourth chapter “Classes and modules”.

Because of this conversion, super of the struct of Object points to the struct RClass which is the entity of the Kernel object, and the super of Kernel is NULL. So to put it conversely, if super is NULL, its RClass is the entity of Kernel (figure 6).

Figure 6: Class tree at the C level

Methods search

With classes structured like this, you can easily imagine the method call process. The m_tbl of the object’s class is searched, and if the method was not found, the m_tbl of super is searched, and so on. If there is no more super, that is to say the method was not found even in Object, then it must not be defined.

The sequential search process in m_tbl is done by search_method().

▼ search_method()

static NODE*
search_method(klass, id, origin)
    VALUE klass, *origin;
    ID id;
{
    NODE *body;

    if (!klass) return 0;
    while (!st_lookup(RCLASS(klass)->m_tbl, id, &body)) {
        klass = RCLASS(klass)->super;
        if (!klass) return 0;
    }

    if (origin) *origin = klass;
    return body;
}

(eval.c)

This function searches the method named id in the class object klass.

RCLASS(value) is the macro doing:

((struct RClass*)(value))

st_lookup() is a function that searches in st_table the value corresponding to a key. If the value is found, the function returns true and puts the found value at the address given in third parameter (&body).

Nevertheless, doing this search each time whatever the circumstances would be too slow. That’s why in reality, once called, a method is cached. So starting from the second time it will be found without following super one by one. This cache and its search will be seen in the 15th chapter “Methods”.
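The climb through super can be modelled without any of ruby's data structures. The following is only a sketch under simplifying assumptions: struct method, struct klass, lookup_own() and search_method_sketch() are all hypothetical names, methods are keyed by C strings instead of ID, a plain array stands in for m_tbl, and no cache is involved.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified method entry and class. */
struct method { const char *name; int body; };

struct klass {
    const struct method *methods;   /* plays the role of m_tbl */
    size_t n_methods;
    const struct klass *super;      /* NULL at the top of the chain */
};

/* Look for name in one class only. */
static const struct method *lookup_own(const struct klass *k, const char *name)
{
    size_t i;

    for (i = 0; i < k->n_methods; i++)
        if (strcmp(k->methods[i].name, name) == 0)
            return &k->methods[i];
    return NULL;
}

/* Same shape as search_method(): follow super until found or the chain ends. */
static const struct method *search_method_sketch(const struct klass *k,
                                                 const char *name,
                                                 const struct klass **origin)
{
    const struct method *m;

    if (!k) return NULL;
    while ((m = lookup_own(k, name)) == NULL) {
        k = k->super;
        if (!k) return NULL;
    }
    if (origin) *origin = k;   /* report the class where the method was found */
    return m;
}
```

As in the real function, origin tells the caller which class in the chain actually supplied the method, which is not necessarily the class the search started from.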

Instance variables

In this section, I will explain the implementation of the third essential condition, instance variables.

rb_ivar_set()

Instance variable is the mechanism that allows each object to hold its specific data. Since it is specific to each object, it seems good to store it in each object itself (i.e. in its object struct), but is it really so? Let’s look at the function rb_ivar_set(), which assigns an object to an instance variable.

▼ rb_ivar_set()

/* assign val to the id instance variable of obj */
VALUE
rb_ivar_set(obj, id, val)
    VALUE obj;
    ID id;
    VALUE val;
{
    if (!OBJ_TAINTED(obj) && rb_safe_level() >= 4)
        rb_raise(rb_eSecurityError,
                 "Insecure: can't modify instance variable");
    if (OBJ_FROZEN(obj)) rb_error_frozen("object");
    switch (TYPE(obj)) {
      case T_OBJECT:
      case T_CLASS:
      case T_MODULE:
        if (!ROBJECT(obj)->iv_tbl)
            ROBJECT(obj)->iv_tbl = st_init_numtable();
        st_insert(ROBJECT(obj)->iv_tbl, id, val);
        break;
      default:
        generic_ivar_set(obj, id, val);
        break;
    }
    return val;
}

(variable.c)

rb_raise() and rb_error_frozen() are both error checks. This can always be said hereafter: Error checks are necessary in reality, but they are not the main part of the process. Therefore, we should wholly ignore them at first read.

After removing the error handling, only the switch remains, but

switch (TYPE(obj)) {
  case T_aaaa:
  case T_bbbb:
    ...
}

this form is an idiom of ruby. TYPE() is the macro returning the type flag of the object struct (T_OBJECT, T_STRING, etc.). In other words as the type flag is an integer constant, we can branch depending on it with a switch. Fixnum or Symbol do not have structs, but inside TYPE() a special treatment is done to properly return T_FIXNUM and T_SYMBOL, so there’s no need to worry.

Well, let’s go back to rb_ivar_set(). It seems only the treatments of T_OBJECT, T_CLASS and T_MODULE are different. These 3 have been chosen on the basis that their second member is iv_tbl. Let’s confirm it in practice.

▼ Structs whose second member is iv_tbl

/* TYPE(val) == T_OBJECT */
struct RObject {
    struct RBasic basic;
    struct st_table *iv_tbl;
};

/* TYPE(val) == T_CLASS or T_MODULE */
struct RClass {
    struct RBasic basic;
    struct st_table *iv_tbl;
    struct st_table *m_tbl;
    VALUE super;
};

(ruby.h)

iv_tbl is the Instance Variable TaBLe. It records the correspondences between the instance variable names and their values.

In rb_ivar_set(), let’s look again at the code for the structs having iv_tbl.

if (!ROBJECT(obj)->iv_tbl)
    ROBJECT(obj)->iv_tbl = st_init_numtable();
st_insert(ROBJECT(obj)->iv_tbl, id, val);
break;

ROBJECT() is a macro that casts a VALUE into a struct RObject*. It’s possible that what obj points to is actually a struct RClass, but when accessing only the second member, no problem will occur.

st_init_numtable() is a function creating a new st_table. st_insert() is a function doing associations in a st_table.

In conclusion, this code does the following: if iv_tbl does not exist, it creates it, then stores the [variable name → object] association.

There’s one thing to be careful about. As struct RClass is the struct of a class object, its instance variable table is for the class object itself. In Ruby programs, it corresponds to something like the following:

class C
  @ivar = "content"
end

generic_ivar_set()

What happens when assigning to an instance variable of an object whose struct type is not one of T_OBJECT, T_MODULE or T_CLASS?

▼ rb_ivar_set() in the case there is no iv_tbl

default:
  generic_ivar_set(obj, id, val);
  break;

(variable.c)

This is delegated to generic_ivar_set(). Before looking at this function, let’s first explain its general idea.

Structs that are not T_OBJECT, T_MODULE or T_CLASS do not have an iv_tbl member (the reason why they do not have it will be explained later). However, even if a struct does not have the member, if there’s another way of linking an instance to a struct st_table, it would be able to have instance variables. In ruby, these associations are solved by using a global st_table, generic_iv_tbl (figure 7).

Figure 7: generic_iv_tbl

Let’s see this in practice.

▼ generic_ivar_set()

static st_table *generic_iv_tbl;

static void
generic_ivar_set(obj, id, val)
    VALUE obj;
    ID id;
    VALUE val;
{
    st_table *tbl;

    /* for the time being you can ignore this */
    if (rb_special_const_p(obj)) {
        special_generic_ivar = 1;
    }
    /* initialize generic_iv_tbl if it does not exist */
    if (!generic_iv_tbl) {
        generic_iv_tbl = st_init_numtable();
    }

    /* the process itself */
    if (!st_lookup(generic_iv_tbl, obj, &tbl)) {
        FL_SET(obj, FL_EXIVAR);
        tbl = st_init_numtable();
        st_add_direct(generic_iv_tbl, obj, tbl);
        st_add_direct(tbl, id, val);
        return;
    }
    st_insert(tbl, id, val);
}

(variable.c)

rb_special_const_p() is true when its parameter is not a pointer. However, as this if part requires knowledge of the garbage collector, we’ll skip it for now. I’d like you to check it again after reading the chapter 5 “Garbage collection”.

st_init_numtable() already appeared some time ago. It creates a new hash table.

st_lookup() searches a value corresponding to a key. In this case it searches for what’s attached to obj. If an attached value can be found, the function returns true and stores the value at the address (&tbl) given as third parameter. In short, !st_lookup(...) can be read “if a value can’t be found”.

st_insert() was also already explained. It stores a new association in a table.

st_add_direct() is similar to st_insert(), but it does not check if the key was already stored before adding an association. It means, in the case of st_add_direct(), if a key already registered is being used, two associations linked to this same key will be stored. We can use st_add_direct() only when the check for existence has already been done, or when a new table has just been created. And this code meets these requirements.

FL_SET(obj, FL_EXIVAR) is the macro that sets the FL_EXIVAR flag in the basic.flags of obj. The basic.flags flags are all named FL_xxxx and can be set using FL_SET(). These flags can be unset with FL_UNSET(). The EXIVAR from FL_EXIVAR seems to be the abbreviation of EXternal Instance VARiable.

This flag is set to speed up the reading of instance variables. If FL_EXIVAR is not set, even without searching in generic_iv_tbl, we can see the object does not have any instance variables. And of course a bit check is way faster than searching a struct st_table.
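The combination of a side table and a flag bit can be sketched in a few lines. This is a deliberately crude model, not ruby's code: struct obj, FL_EXIVAR_SKETCH, side_table and the two functions are hypothetical, the table is a flat fixed-size array rather than a hash table of hash tables, and instance variable names are plain ints.

```c
#include <stddef.h>

#define FL_EXIVAR_SKETCH 0x1UL   /* hypothetical stand-in for FL_EXIVAR */
#define MAX_SIDE_ENTRIES 16

struct obj { unsigned long flags; };   /* a struct with no room for iv_tbl */

struct side_entry { const struct obj *owner; int id; long val; };

/* One global table shared by all objects, in the role of generic_iv_tbl. */
static struct side_entry side_table[MAX_SIDE_ENTRIES];
static size_t side_count;

/* Returns 0 on success, -1 when the fixed-size table is full. */
static int ivar_set_sketch(struct obj *o, int id, long val)
{
    size_t i;

    for (i = 0; i < side_count; i++) {
        if (side_table[i].owner == o && side_table[i].id == id) {
            side_table[i].val = val;   /* overwrite an existing entry */
            return 0;
        }
    }
    if (side_count == MAX_SIDE_ENTRIES) return -1;
    side_table[side_count].owner = o;
    side_table[side_count].id = id;
    side_table[side_count].val = val;
    side_count++;
    o->flags |= FL_EXIVAR_SKETCH;      /* remember that o has external ivars */
    return 0;
}

/* Returns 1 and stores the value in *out when found, 0 otherwise. */
static int ivar_get_sketch(const struct obj *o, int id, long *out)
{
    size_t i;

    if (!(o->flags & FL_EXIVAR_SKETCH)) return 0;   /* fast path: one bit test */
    for (i = 0; i < side_count; i++) {
        if (side_table[i].owner == o && side_table[i].id == id) {
            *out = side_table[i].val;
            return 1;
        }
    }
    return 0;
}
```

The first line of ivar_get_sketch() is the point of the flag: an object that never had an instance variable assigned is answered without touching the table at all.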

Gaps in structs

Now you understand the way instance variables are stored, but why are there structs without iv_tbl? Why is there no iv_tbl in struct RString or struct RArray? Couldn’t iv_tbl be part of RBasic?

To state the conclusion first, we could do such a thing, but should not. As a matter of fact, this problem is deeply linked to the way ruby manages objects.

In ruby, the memory used for string data (char[]) and such is directly allocated using malloc(). However, the object structs are handled in a particular way. ruby allocates them in clusters, and then distributes them from these clusters. Managed in this way, structs of diverse types (or rather sizes) would be hard to handle, thus RVALUE, which is the union of all the structs, is defined and an array of these unions is managed.

The size of a union is the same as the size of its biggest member, so for instance, if one of the structs is big, a lot of space would be wasted. Therefore, it’s preferable that the struct sizes be as similar as possible.

The most used struct is usually struct RString. After that, depending on each program, there comes struct RArray (array), RHash (hash), RObject (user defined object), etc. However, this struct RObject only uses the space of struct RBasic + 1 pointer. On the other hand, struct RString, RArray and RHash take the space of struct RBasic + 3 pointers. In other words, as the number of struct RObject increases, the memory space of two pointers is wasted for each object. Furthermore, if the size of RString were as much as 4 pointers, RObject would use at most half the size of the union, and this is too wasteful.

So the merit received from iv_tbl is more or less saving memory and speeding up. Furthermore we do not know if it is used often or not. In fact, generic_iv_tbl was not introduced before ruby 1.2, so it was not possible to use instance variables in String or Array at that time. Nevertheless, it was not much of a problem. Making large amounts of memory useless just for such functionality looks stupid.

If you take all this into consideration, you can conclude that increasing the size of object structs for iv_tbl does not do any good.
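The waste can be measured directly with sizeof. The sketch below uses reduced copies of the structs shown in this chapter, a two-member union slot as a hypothetical miniature of RVALUE, and VALUE as uintptr_t (an assumption). The exact byte counts depend on the platform, so the helper only reports the difference.

```c
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t VALUE;   /* assumption: stand-in for ruby's VALUE */

struct RBasic  { unsigned long flags; VALUE klass; };
struct RObject { struct RBasic basic; void *iv_tbl; };
struct RString {
    struct RBasic basic;
    long len;
    char *ptr;
    union { long capa; VALUE shared; } aux;
};

/* Hypothetical miniature of RVALUE: every slot is as big as its largest member. */
union slot {
    struct RObject object;
    struct RString string;
};

/* Bytes left unused in a slot that holds a struct RObject. */
static size_t wasted_in_object_slot(void)
{
    return sizeof(union slot) - sizeof(struct RObject);
}
```

On common platforms where long and pointers have the same size, the result is the size of two pointers, which matches the estimate in the text.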

rb_ivar_get()

We saw the rb_ivar_set() function that sets variables, so let’s see quickly how to get them.

▼ rb_ivar_get()

VALUE
rb_ivar_get(obj, id)
    VALUE obj;
    ID id;
{
    VALUE val;

    switch (TYPE(obj)) {
  /* (A) */
      case T_OBJECT:
      case T_CLASS:
      case T_MODULE:
        if (ROBJECT(obj)->iv_tbl &&
            st_lookup(ROBJECT(obj)->iv_tbl, id, &val))
            return val;
        break;
  /* (B) */
      default:
        if (FL_TEST(obj, FL_EXIVAR) || rb_special_const_p(obj))
            return generic_ivar_get(obj, id);
        break;
    }
  /* (C) */
    rb_warning("instance variable %s not initialized", rb_id2name(id));

    return Qnil;
}

(variable.c)

The structure is exactly the same as that of rb_ivar_set().

(A) For struct RObject or RClass, we search the variable in iv_tbl. As iv_tbl can also be NULL, we must check it before using it. Then if st_lookup() finds the relation, it returns true, so the whole if can be read as “If the instance variable has been set, return its value”.

(C) If no correspondence could be found, in other words if we read an instance variable that has not been set, we first leave the if then the switch. rb_warning() will then issue a warning and nil will be returned. That’s because you can read instance variables that have not been set in Ruby.

(B) On the other hand, if the struct is neither struct RObject nor RClass, the instance variable table is searched in generic_iv_tbl. What generic_ivar_get() does can be easily guessed, so I won’t explain it. I’d rather want you to focus on the condition of the if statement.

I already told you that the FL_EXIVAR flag is set on the object on which generic_ivar_set() is used. Here, that flag is utilized to make the check faster.

And what is rb_special_const_p()? This function returns true when its parameter obj does not point to a struct. As no struct means no basic.flags, no flag can be set in the first place. Thus FL_xxxx() is designed to always return false for such an object. Hence, objects that are rb_special_const_p() should be treated specially here.

Object Structs

In this section, about the important ones among object structs, we’ll briefly see their concrete appearances and how to deal with them.

struct RString

struct RString is the struct for the instances of the String class and its subclasses.

▼ struct RString

struct RString {
    struct RBasic basic;
    long len;
    char *ptr;
    union {
        long capa;
        VALUE shared;
    } aux;
};

(ruby.h)

ptr is a pointer to the string, and len the length of that string. Very straightforward.

Rather than a string, Ruby’s string is more a byte array, and can contain any byte including NUL. So when thinking at the Ruby level, ending the string with NUL does not mean anything. But as C functions require NUL, for convenience the ending NUL is there. However, its size is not included in len.

When dealing with a string from the interpreter or an extension library, you can access ptr and len by writing RSTRING(str)->ptr or RSTRING(str)->len, and it is allowed. But there are some points to pay attention to.

1. you have to check if str really points to a struct RString by yourself beforehand
2. you can read the members, but you must not modify them
3. you can’t store RSTRING(str)->ptr in something like a local variable and use it later

Why is that? First, there is an important software engineering principle: Don’t arbitrarily tamper with someone’s data. When there are interface functions, we should use them. However, there are also concrete reasons in ruby’s design why you should not refer to or store a pointer, and that’s related to the fourth member aux. However, to explain properly how to use aux, we have to explain first a little more of Ruby’s strings’ characteristics.

Ruby’s strings can be modified (are mutable). By mutable I mean after the following code:

s = "str"          # create a string and assign it to s
s.concat("ing")    # append "ing" to this string object
p(s)               # show "string"

the content of the object pointed by s will become “string”. It’s different from Java or Python string objects. Java’s StringBuffer is closer.

And what’s the relation? First, mutable means the length (len) of the string can change. We have to increase or decrease the allocated memory size each time the length changes. We can of course use realloc() for that, but generally malloc() and realloc() are heavy operations. Having to realloc() each time the string changes is a huge burden.

That’s why the memory pointed by ptr has been allocated with a size a little bigger than len. Because of that, if the added part can fit into the remaining memory, it’s taken care of without calling realloc(), so it’s faster. The struct member aux.capa contains the length including this additional memory.
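Here is a minimal sketch of the len/capa idea, independent of ruby. struct str_sketch and str_cat_sketch() are hypothetical names, and doubling the capacity is an arbitrary choice made for this illustration, not a statement about the growth policy ruby actually uses.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical cut-down string: len bytes in use, capa bytes available. */
struct str_sketch { long len; long capa; char *ptr; };

/* Append n bytes from src. realloc() is called only when the spare
   capacity is used up. Returns 0 on success, -1 on allocation failure. */
static int str_cat_sketch(struct str_sketch *s, const char *src, long n)
{
    if (!s->ptr || s->len + n > s->capa) {
        long new_capa = (s->len + n) * 2;                 /* arbitrary growth rule */
        char *p = realloc(s->ptr, (size_t)new_capa + 1);  /* +1 for the ending NUL */
        if (!p) return -1;
        s->ptr = p;
        s->capa = new_capa;
    }
    memcpy(s->ptr + s->len, src, (size_t)n);
    s->len += n;
    s->ptr[s->len] = '\0';   /* kept for C functions, not counted in len */
    return 0;
}
```

Starting from an empty string, appending "str" allocates room for 6 bytes, so the following append of "ing" fits in the spare capacity and leaves ptr untouched.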

So what is this other aux.shared? It’s to speed up the creation of literal strings. Have a look at the following Ruby program.

while true do      # repeat indefinitely
  a = "str"        # create a string with "str" as content and assign it to a
  a.concat("ing")  # append "ing" to the object pointed by a
  p(a)             # show "string"
end

Whatever the number of times you repeat the loop, the fourth line’s p has to show "string". And to do so, the expression "str" must every time create an object that holds a distinct char[]. But there must be also the high possibility that strings are not modified at all, and a lot of useless copies of char[] would be created in such situation. If possible, we’d like to share one common char[].

The trick to share is aux.shared. Every string object created with a literal uses one shared char[]. And after a change occurs, the object-specific memory is allocated. When using a shared char[], the flag ELTS_SHARED is set in the object struct’s basic.flags, and aux.shared contains the original object. ELTS seems to be the abbreviation of ELemenTS.

Then, let’s return to our talk about RSTRING(str)->ptr. Though referring to a pointer is OK, you must not assign to it. This is first because the value of len or capa will no longer agree with the actual body, and also because when modifying strings created as literals, aux.shared has to be separated.

Before ending this section, I’ll write some examples of dealing with RString. I’d like you to regard str as a VALUE that points to RString when reading this.

RSTRING(str)->len;               /* length */
RSTRING(str)->ptr[0];            /* first character */
str = rb_str_new("content", 7);  /* create a string with "content" as its content
                                    the second parameter is the length */
str = rb_str_new2("content");    /* create a string with "content" as its content
                                    its length is calculated with strlen() */
rb_str_cat2(str, "end");         /* Concatenate a C string to a Ruby string */

struct RArray

struct RArray is the struct for the instances of Ruby’s array class Array.

▼ struct RArray

struct RArray {
    struct RBasic basic;
    long len;
    union {
        long capa;
        VALUE shared;
    } aux;
    VALUE *ptr;
};

(ruby.h)

Except for the type of ptr, this structure is almost the same as struct RString. ptr points to the content of the array, and len is its length. aux is exactly the same as in struct RString. aux.capa is the “real” length of the memory pointed by ptr, and if ptr is shared, aux.shared stores the shared original array object.

From this structure, it’s clear that Ruby’s Array is an array and not a list. So when the number of elements changes in a big way, a realloc() must be done, and if an element must be inserted at another place than the end, a memmove() will occur. But even so, the move is so fast that we don’t notice it. Recent machines are really impressive.

And the way to access its members is similar to that of RString. With RARRAY(arr)->ptr and RARRAY(arr)->len, you can refer to the members, and it is allowed, but you must not assign to them, etc. We’ll only look at simple examples:

/* manage an array from C */
VALUE ary;
ary = rb_ary_new();             /* create an empty array */
rb_ary_push(ary, INT2FIX(9));   /* push a Ruby 9 */
RARRAY(ary)->ptr[0];            /* look what's at index 0 */
rb_p(RARRAY(ary)->ptr[0]);      /* do p on ary[0] (the result is 9) */

# manage an array from Ruby
ary = []      # create an empty array
ary.push(9)   # push 9
ary[0]        # look what's at index 0
p(ary[0])     # do p on ary[0] (the result is 9)

struct RRegexp

It’s the struct for the instances of the regular expression class Regexp.

▼ struct RRegexp

struct RRegexp {
    struct RBasic basic;
    struct re_pattern_buffer *ptr;
    long len;
    char *str;
};

(ruby.h)

ptr is the compiled regular expression. str is the string before compilation (the source code of the regular expression), and len is this string’s length.

As no code handling Regexp objects appears in this book, we won’t see how to use it. Even if you use it in extension libraries, as long as you do not want to use it in a very particular way, the interface functions are enough.

struct RHash

struct RHash is the struct for Hash object, which is Ruby’s hash table.

▼ struct RHash

struct RHash {
    struct RBasic basic;
    struct st_table *tbl;
    int iter_lev;
    VALUE ifnone;
};

(ruby.h)

It’s a wrapper for struct st_table. st_table will be detailed in the next chapter “Names and name tables”.

ifnone is the value when a key does not have an associated value, its default is nil. iter_lev is to make the hash table reentrant (multithread safe).

struct RFile

struct RFile is a struct for instances of the built-in IO class and its subclasses.

▼ struct RFile

struct RFile {
    struct RBasic basic;
    struct OpenFile *fptr;
};

(ruby.h)

▼ OpenFile

typedef struct OpenFile {
    FILE *f;                    /* stdio ptr for read/write */
    FILE *f2;                   /* additional ptr for rw pipes */
    int mode;                   /* mode flags */
    int pid;                    /* child's pid (for pipes) */
    int lineno;                 /* number of lines read */
    char *path;                 /* pathname for file */
    void (*finalize) _((struct OpenFile*)); /* finalize proc */
} OpenFile;

(rubyio.h)

All the members have been moved into struct OpenFile. As there aren’t many instances of IO objects, it’s OK to do it like this. The purpose of each member is written in the comments. Basically, it’s a wrapper around C’s stdio.

struct RData

struct RData has a different tenor from what we saw before. It is the struct for implementation of extension libraries.

Of course structs for classes created in extension libraries are necessary, but as the types of these structs depend on the created class, it’s impossible to know their size or struct in advance. That’s why a “struct for managing a pointer to a user defined struct” has been created on ruby’s side to manage this. This struct is struct RData.

▼ struct RData

struct RData {
    struct RBasic basic;
    void (*dmark) _((void*));
    void (*dfree) _((void*));
    void *data;
};

(ruby.h)

data is a pointer to the user defined struct, dfree is the function used to free that user defined struct, and dmark is the function to do “mark” of the mark and sweep.

Because explaining struct RData is still too complicated, for the time being let’s just look at its representation (figure 8). The detailed explanation of its members will be introduced after we’ll finish chapter 5 “Garbage collection”.

Figure 8: Representation of struct RData

The original work is Copyright © 2002 - 2004 Minero AOKI. Translated by Vincent ISAMBART and Clifford Escobar CAOILE. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License.

Ruby Hacking Guide

Translated by Clifford Escobar CAOILE

Chapter 3: Names and Name Table

st_table

st_table has already appeared several times as a method table and an instance table. In this chapter let’s look at the structure of the st_table in detail.

Summary

I previously mentioned that the st_table is a hash table. What is a hash table? It is a data structure that records one-to-one relations, for example, a variable name and its value, or a function name and its body, etc.

However, data structures other than hash tables can, of course, record one-to-one relations. For example, a list of the following structs will suffice for this purpose.

struct entry {
    ID key;
    VALUE val;
    struct entry *next;   /* point to the next entry */
};

However, this method is slow. If the list contains a thousand items, in the worst case, it is necessary to traverse a thousand links. In other words, the search time increases in proportion to the number of elements. This is bad. Since ancient times, various speed improvement methods have been conceived. The hash table is one of those improved methods. In other words, the point is not that the hash table is necessary but that it can be made faster.
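To see the linear cost concretely, here is a sketch of a search over such a list that also counts how many links it had to visit. list_find() and its steps parameter are hypothetical, added only for this illustration, and ID and VALUE are assumed to be uintptr_t.

```c
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t ID;      /* assumption: stand-in for ruby's ID */
typedef uintptr_t VALUE;   /* assumption: stand-in for ruby's VALUE */

struct entry {
    ID key;
    VALUE val;
    struct entry *next;    /* point to the next entry */
};

/* Walk the list from head until key is found. The number of entries
   visited is stored in *steps (if steps is not NULL). Returns the
   matching entry, or NULL when the key is absent. */
static struct entry *list_find(struct entry *head, ID key, size_t *steps)
{
    size_t n = 0;
    struct entry *e;

    for (e = head; e != NULL; e = e->next) {
        n++;
        if (e->key == key) break;
    }
    if (steps) *steps = n;
    return e;
}
```

With a thousand entries, finding the last key, or discovering that a key is missing, visits all thousand of them.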

Now then, let us examine the st_table. As it turns out, this library is not created by Matsumoto, rather:

▼ st.c credits

/* This is a public domain general purpose hash table package
   written by Peter Moore @ UCB. */

(st.c)

as shown above.

By the way, when I searched Google and found another version, it mentioned that st_table is a contraction of “STring TABLE”. However, I find it contradictory that it has both “general purpose” and “string” aspects.

What is a hash table?

A hash table can be thought of as follows. Let us think of an array with n items. For example, let us make n=64 (figure 1).

Figure 1: Array

Then let us specify a function f that takes a key and produces an integer i from 0 to n-1 (0-63). We call this f a hash function. f when given the same key always produces the same i. For example, if we can assume that the key is limited to positive integers, when the key is divided by 64, the remainder should always fall between 0 and 63. Therefore, this calculating expression has a possibility of being the function f.

When recording relationships, given a key, function f generates i, and places the value into index i of the array we have prepared. Index access into an array is very fast. The key concern is changing a key into an integer.

Figure 2: Array assignment

However, in the real world it isn’t that easy. There is a critical problem with this idea. Because n is only 64, if there are more than 64 relationships to be recorded, it is certain that there will be the same index for two different keys. It is also possible that with fewer than 64, the same thing can occur. For example, given the previous hash function “key % 64”, keys 65 and 129 will both have a hash value of 1. This is called a hash value collision. There are many ways to resolve such a collision.

One solution is to insert into the next element when a collision occurs. This is called open addressing. (Figure 3).

Figure 3: Open addressing

Other than using the array like this, there are other possible approaches, like using a pointer to a respective linked list in each element of the array. Then when a collision occurs, grow the linked list. This is called chaining. (Figure 4) st_table uses this chaining method.

Figure 4: Chaining
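Before reading st.c itself, it may help to see chaining in its barest form. The sketch below is not st_table: struct chain_entry, struct chain_table and the three functions are hypothetical, keys are plain unsigned integers, the hash function is the “key % 64” from the text, and nothing is ever resized or freed.

```c
#include <stdlib.h>

#define N_BINS 64

struct chain_entry {
    unsigned long key;
    long val;
    struct chain_entry *next;   /* next entry in the same bin */
};

struct chain_table {
    struct chain_entry *bins[N_BINS];   /* each bin heads a linked list */
};

/* The hash function f from the text: the remainder of dividing by 64. */
static unsigned long hash_sketch(unsigned long key)
{
    return key % N_BINS;
}

/* Insert or overwrite. Returns 0 on success, -1 on allocation failure. */
static int chain_insert(struct chain_table *t, unsigned long key, long val)
{
    struct chain_entry **bin = &t->bins[hash_sketch(key)];
    struct chain_entry *e;

    for (e = *bin; e != NULL; e = e->next) {
        if (e->key == key) { e->val = val; return 0; }
    }
    e = malloc(sizeof *e);
    if (!e) return -1;
    e->key = key;
    e->val = val;
    e->next = *bin;    /* on collision, the list in this bin simply grows */
    *bin = e;
    return 0;
}

/* Returns 1 and stores the value in *out when found, 0 otherwise. */
static int chain_lookup(const struct chain_table *t, unsigned long key, long *out)
{
    const struct chain_entry *e;

    for (e = t->bins[hash_sketch(key)]; e != NULL; e = e->next) {
        if (e->key == key) { *out = e->val; return 1; }
    }
    return 0;
}
```

Keys 65 and 129 land in the same bin, yet both stay retrievable because each lookup compares the stored key while walking that bin's list.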

However,ifitcanbedeterminedaprioriwhatsetofkeyswillbeused,itispossibletoimagineahashfunctionthatwillnevercreatecollisions.Thistypeoffunctioniscalleda“perfecthashfunction”.Actually,therearetoolswhichcreateaperfecthashfunctiongivenasetofarbitrarystrings.GNUgperfisoneofthose.ruby‘sparserimplementationusesGNUgperfbut…thisisnotthetimetodiscussit.We’lldiscussthisinthesecondpartofthebook.

Data Structure

Let us start looking at the source code. As written in the introductory chapter, if there is data and code, it is better to read the data first. The following is the data type of st_table.

▼ st_table

typedef struct st_table st_table;

struct st_table {
    struct st_hash_type *type;
    int num_bins;                   /* slot count */
    int num_entries;                /* total number of entries */
    struct st_table_entry **bins;   /* slot */
};

(st.h)

▼ struct st_table_entry

struct st_table_entry {
    unsigned int hash;
    char *key;
    char *record;
    st_table_entry *next;
};

(st.c)

st_table is the main table structure. st_table_entry is a holder that stores one value. st_table_entry contains a member called next which of course is used to make st_table_entry into a linked list. This is the chain part of the chaining method. The st_hash_type data type is used, but I will explain this later. First let me explain the other parts so you can compare and understand the roles.

Figure 5: st_table data structure

So, let us comment on st_hash_type.

▼ struct st_hash_type

struct st_hash_type {
    int (*compare)();   /* comparison function */
    int (*hash)();      /* hash function */
};

(st.h)

This is still Chapter 3 so let us examine it attentively.

int (*compare)()

This part shows, of course, the member compare which has a data type of "a pointer to a function that returns an int". hash is also of the same type. A variable of this type is assigned in the following way:

int great_function(int n)
{
    /* ToDo: Do something great! */
    return n;
}

{
    int (*f)();
    f = great_function;

And it is called like this:

    (*f)(7);
}

Here let us return to the st_hash_type commentary. Of the two members hash and compare, hash is the hash function f explained previously.

On the other hand, compare is a function that evaluates if the key is actually the same or not. With the chaining method, in the spot with the same hash value n, multiple elements can be inserted. To know exactly which element is being searched for, this time it is necessary to use a comparison function that we can absolutely trust. compare will be that function.

This st_hash_type is a good generalized technique. The hash table itself cannot determine what the stored keys' data type will be. For example, in ruby, st_table's keys are ID or char* or VALUE, but to write the same kind of hash for each (data type) is foolish. Usually, the things that change with the different key data types are things like the hash function. For things like memory allocation and collision detection, typically most of the code is the same. Only the parts where the implementation changes with a differing data type will be bundled up into a function, and a pointer to that function will be used. In this fashion, the majority of the code that makes up the hash table implementation can use it.

In object-oriented languages, in the first place, you can attach a procedure to an object and pass it (around), so this mechanism is not necessary. Perhaps it is more correct to say that this mechanism is built in as a feature of the language.
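To make the technique concrete, here is a minimal sketch of a table "type" made of two function pointers, in the spirit of st_hash_type. Everything in it is an assumption made for illustration: struct key_type, num_type, str_type and bin_for are hypothetical names, the prototypes are spelled out (unlike the old-style declarations in st.h), and the string hash is an arbitrary multiply-by-31 loop, not the one ruby uses.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical analogue of st_hash_type, with full prototypes. */
struct key_type {
    int (*compare)(const void *a, const void *b);  /* returns 0 when keys are equal */
    unsigned int (*hash)(const void *key);
};

/* Integer keys, passed by address. */
int num_compare(const void *a, const void *b)
{
    return *(const long *)a != *(const long *)b;
}

unsigned int num_hash(const void *key)
{
    return (unsigned int)*(const long *)key;
}

/* NUL-terminated string keys. */
int str_compare(const void *a, const void *b)
{
    return strcmp((const char *)a, (const char *)b) != 0;
}

unsigned int str_hash(const void *key)
{
    const unsigned char *p = (const unsigned char *)key;
    unsigned int h = 0;

    while (*p)
        h = h * 31 + *p++;
    return h;
}

const struct key_type num_type = { num_compare, num_hash };
const struct key_type str_type = { str_compare, str_hash };

/* Generic code: chooses a bin without knowing the real type of the key. */
unsigned int bin_for(const struct key_type *type, const void *key,
                     unsigned int num_bins)
{
    assert(type != 0 && num_bins > 0);
    return type->hash(key) % num_bins;
}
```

Only the two small functions differ per key type; bin_for(), and by extension allocation and chain handling, stays the same for every kind of key.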

st_hash_type example

The usage of a data structure like st_hash_type is good as an abstraction. On the other hand, what kind of code it actually passes through may be difficult to understand. If we do not examine what sort of function is used for hash or compare, we will not grasp the reality. To understand this, it is probably sufficient to look at st_init_numtable() introduced in the previous chapter. This function creates a table for integer data type keys.

▼ st_init_numtable()

st_table*
st_init_numtable()
{
    return st_init_table(&type_numhash);
}

(st.c)

st_init_table() is the function that allocates the table memory and so on. type_numhash is an st_hash_type (it is the member named "type" of st_table). Regarding this type_numhash:

▼ type_numhash

static struct st_hash_type type_numhash = {
    numcmp,
    numhash,
};

static int
numcmp(x, y)
    long x, y;
{
    return x != y;
}

static int
numhash(n)
    long n;
{
    return n;
}

(st.c)

Very simple. The table that the ruby interpreter uses is by and large this type_numhash.

st_lookup()

Now then, let us look at the function that uses this data structure. First, it's a good idea to look at the function that does the searching. Shown below is the function that searches the hash table, st_lookup().

▼ st_lookup()

int
st_lookup(table, key, value)
    st_table *table;
    register char *key;
    char **value;
{
    unsigned int hash_val, bin_pos;
    register st_table_entry *ptr;

    hash_val = do_hash(key, table);
    FIND_ENTRY(table, ptr, hash_val, bin_pos);

    if (ptr == 0) {
        return 0;
    }
    else {
        if (value != 0)  *value = ptr->record;
        return 1;
    }
}

(st.c)

The important parts are pretty much in do_hash() and FIND_ENTRY(). Let us look at them in order.

▼ do_hash()

#define do_hash(key,table) (unsigned int)(*(table)->type->hash)((key))

(st.c)

Just in case, let us write down the part of the macro body that is difficult to understand:

(table)->type->hash

is a function pointer, and the key is passed to it as a parameter. This is the syntax for calling the function. The * is not applied to table. In other words, this macro is a hash value generator for a key, using the hash function type->hash prepared for each data type.

Next, let us examine FIND_ENTRY().

▼ FIND_ENTRY()

#define FIND_ENTRY(table, ptr, hash_val, bin_pos) do {\
    bin_pos = hash_val%(table)->num_bins;\
    ptr = (table)->bins[bin_pos];\
    if (PTR_NOT_EQUAL(table, ptr, hash_val, key)) {\
        COLLISION;\
        while (PTR_NOT_EQUAL(table, ptr->next, hash_val, key)) {\
            ptr = ptr->next;\
        }\
        ptr = ptr->next;\
    }\
} while (0)

#define PTR_NOT_EQUAL(table, ptr, hash_val, key) ((ptr) != 0 && \
    (ptr->hash != (hash_val) || !EQUAL((table), (key), (ptr)->key)))

#define EQUAL(table,x,y) \
    ((x)==(y) || (*table->type->compare)((x),(y)) == 0)

(st.c)

COLLISION is a debug macro so we will (should) ignore it.

The parameters of FIND_ENTRY(), starting from the left, are:

1. st_table
2. the found entry will be pointed to by this parameter
3. hash value
4. temporary variable

And, the second parameter will point to the found st_table_entry*.

At the outermost level, a do .. while(0) is used to safely wrap up a multiple expression macro. This is ruby's, or rather, the C language preprocessor's idiom. If if(1) were used instead, there would be a danger of an else part getting attached to it. If while(1) were used, it would become necessary to add a break at the very end.

Also, there is no semicolon added after the while(0).

FIND_ENTRY();

This is so that the semicolon that is normally written at the end of an expression will not go to waste.
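A small sketch may help to see why the idiom matters. The macro SWAP_INT and the function order_pair below are hypothetical names made up for this example. Because the macro body is one do { ... } while (0) statement with no trailing semicolon, it can be used in an unbraced if/else just like a function call.

```c
#include <assert.h>

/* A multi-statement macro wrapped in do { ... } while (0).
 * Note that no semicolon follows the while (0). */
#define SWAP_INT(a, b) do { \
    int swap_tmp_ = (a);    \
    (a) = (b);              \
    (b) = swap_tmp_;        \
} while (0)

/* Put *lo and *hi in ascending order.
 * Returns 1 if they were swapped, 0 if they were already ordered. */
int order_pair(int *lo, int *hi)
{
    assert(lo != 0 && hi != 0);
    if (*lo > *hi)
        SWAP_INT(*lo, *hi);   /* the caller's semicolon ends the statement */
    else
        return 0;
    return 1;
}
```

Had the macro been written as a bare { ... } block, the semicolon after SWAP_INT(*lo, *hi) would have become an extra empty statement and the else would no longer compile.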

st_add_direct()

Continuing on, let us examine st_add_direct() which is a function that adds a new relationship to the hash table. This function does not check if the key is already registered. It always adds a new entry. This is the meaning of direct in the function name.

▼ st_add_direct()

void
st_add_direct(table, key, value)
    st_table *table;
    char *key;
    char *value;
{
    unsigned int hash_val, bin_pos;

    hash_val = do_hash(key, table);
    bin_pos = hash_val % table->num_bins;
    ADD_DIRECT(table, key, value, hash_val, bin_pos);
}

(st.c)

Just as before, the do_hash() macro that obtains a hash value is called here. After that, the next calculation is the same as at the start of FIND_ENTRY(), which is to exchange the hash value for a real index.

Then the insertion operation seems to be implemented by ADD_DIRECT(). Since the name is all uppercase, we can anticipate that it is a macro.

▼ ADD_DIRECT()

#define ADD_DIRECT(table, key, value, hash_val, bin_pos) \
do {                                                     \
    st_table_entry *entry;                               \
    if (table->num_entries / (table->num_bins)           \
                        > ST_DEFAULT_MAX_DENSITY) {      \
        rehash(table);                                   \
        bin_pos = hash_val % table->num_bins;            \
    }                                                    \
                                                         \
    /* (A) */                                            \
    entry = alloc(st_table_entry);                       \
                                                         \
    entry->hash = hash_val;                              \
    entry->key = key;                                    \
    entry->record = value;                               \
    /* (B) */                                            \
    entry->next = table->bins[bin_pos];                  \
    table->bins[bin_pos] = entry;                        \
    table->num_entries++;                                \
} while (0)

(st.c)

The first if is an exception case so I will explain it afterwards.

(A) Allocate and initialize a st_table_entry.

(B) Insert the entry into the start of the list. This is the idiom for handling the list. In other words,

entry->next = list_beg;
list_beg = entry;

makes it possible to insert an entry at the front of the list. This is similar to "cons-ing" in the Lisp language. Check for yourself that even if list_beg is NULL, this code holds true.
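The two-line idiom can be checked with a small self-contained sketch. The names struct node, push_front and list_length are hypothetical and only serve this illustration; the point is that the same two assignments work for an empty list (head is NULL) and for a non-empty one.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int value;
    struct node *next;
};

/* Link `entry` in front of the list whose head pointer is *list_beg. */
void push_front(struct node **list_beg, struct node *entry)
{
    assert(list_beg != NULL && entry != NULL);
    entry->next = *list_beg;   /* becomes NULL when the list was empty */
    *list_beg = entry;
}

/* Count the nodes by following next until NULL. */
int list_length(const struct node *n)
{
    int len = 0;

    for (; n != NULL; n = n->next)
        len++;
    return len;
}
```

Pushing onto an empty list leaves the new entry with a NULL next pointer, so it is at once the first and the last element.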

Now, let me explain the code I left aside.

▼ ADD_DIRECT() - rehash

    if (table->num_entries / (table->num_bins)           \
                        > ST_DEFAULT_MAX_DENSITY) {      \
        rehash(table);                                   \
        bin_pos = hash_val % table->num_bins;            \
    }                                                    \

(st.c)

DENSITY is "concentration". In other words, this conditional checks if the hash table is "crowded" or not. In the st_table, as the number of values that use the same bin_pos increases, the linked list becomes longer. In other words, search becomes slower. That is why, for a given bin count, when the average number of elements per bin becomes too large, the number of bins is increased and the crowding is reduced.

The current ST_DEFAULT_MAX_DENSITY is

▼ ST_DEFAULT_MAX_DENSITY

#define ST_DEFAULT_MAX_DENSITY 5

(st.c)

Because of this setting, once num_entries divided by num_bins (an integer division) exceeds 5, that is, once there are on average more than 5 st_table_entries per bin, the size will be increased.
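Since the condition uses integer division, the exact point where it fires is easy to misjudge. The following sketch isolates the test; needs_rehash and MAX_DENSITY are hypothetical names that mirror the condition in ADD_DIRECT(), and the bin count of 11 in the note below is just an example value.

```c
#include <assert.h>

#define MAX_DENSITY 5   /* mirrors ST_DEFAULT_MAX_DENSITY */

/* Same test as the `if` in ADD_DIRECT(): integer division, then compare. */
int needs_rehash(int num_entries, int num_bins)
{
    assert(num_bins > 0);
    return num_entries / num_bins > MAX_DENSITY;
}
```

With 11 bins, 65 entries still give 65 / 11 == 5, so nothing happens; the check only succeeds from 66 entries on, where the quotient becomes 6.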

st_insert()

st_insert() is nothing more than a combination of st_add_direct() and st_lookup(), so if you understand those two, this will be easy.

▼ st_insert()

int
st_insert(table, key, value)
    register st_table *table;
    register char *key;
    char *value;
{
    unsigned int hash_val, bin_pos;
    register st_table_entry *ptr;

    hash_val = do_hash(key, table);
    FIND_ENTRY(table, ptr, hash_val, bin_pos);

    if (ptr == 0) {
        ADD_DIRECT(table, key, value, hash_val, bin_pos);
        return 0;
    }
    else {
        ptr->record = value;
        return 1;
    }
}

(st.c)

It checks if the element is already registered in the table. Only when it is not registered will a new entry be added, and 0 is returned. If the key is already registered, no entry is added: the existing record is overwritten and 1 is returned.
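To see lookup, front insertion and the return values of the insert operation working together, here is a heavily reduced sketch of a chained table. It is not st.c: the names mini_table, mini_find, mini_lookup and mini_insert are hypothetical, keys are plain long integers hashed by identity, the bin count is fixed at 11, and rehashing and freeing are left out.

```c
#include <assert.h>
#include <stdlib.h>

#define MINI_BINS 11

struct mini_entry {
    unsigned int hash;
    long key;
    long record;
    struct mini_entry *next;
};

struct mini_table {
    int num_entries;
    struct mini_entry *bins[MINI_BINS];
};

/* Walk the chain of the key's bin; returns NULL when the key is absent. */
struct mini_entry *mini_find(struct mini_table *t, long key)
{
    unsigned int hash = (unsigned int)key;
    struct mini_entry *p = t->bins[hash % MINI_BINS];

    while (p != NULL && (p->hash != hash || p->key != key))
        p = p->next;
    return p;
}

/* Like st_lookup(): returns 1 and fills *record when found, otherwise 0. */
int mini_lookup(struct mini_table *t, long key, long *record)
{
    struct mini_entry *p = mini_find(t, key);

    if (p == NULL)
        return 0;
    if (record != NULL)
        *record = p->record;
    return 1;
}

/* Like st_insert(): returns 0 when a new entry was added,
 * 1 when an existing record was overwritten. */
int mini_insert(struct mini_table *t, long key, long record)
{
    struct mini_entry *p = mini_find(t, key);

    if (p != NULL) {
        p->record = record;
        return 1;
    }
    p = malloc(sizeof *p);
    assert(p != NULL);
    p->hash = (unsigned int)key;
    p->key = key;
    p->record = record;
    p->next = t->bins[p->hash % MINI_BINS];   /* front insertion */
    t->bins[p->hash % MINI_BINS] = p;
    t->num_entries++;
    return 0;
}
```

Keys 1, 12 and 23 all fall into bin 1 here, so they exercise the chain: the first two can be stored side by side, and looking up 23 walks the whole chain before reporting a miss.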

ID and Symbols

I've already discussed what an ID is. It is a correspondence between an arbitrary string of characters and a value. It is used to declare various names. The actual data type is unsigned int.

From char* to ID

The conversion from string to ID is executed by rb_intern(). This function is rather long, so let's omit the middle.

▼ rb_intern() (simplified)

static st_table *sym_tbl;       /*  char* to ID   */
static st_table *sym_rev_tbl;   /*  ID to char*   */

ID
rb_intern(name)
    const char *name;
{
    const char *m = name;
    ID id;
    int last;

    /* If for a name, there is a corresponding ID that is already
       registered, then return that ID */
    if (st_lookup(sym_tbl, name, &id))
        return id;

    /* omitted ... create a new ID */

    /* register the name and ID relation */
  id_regist:
    name = strdup(name);
    st_add_direct(sym_tbl, name, id);
    st_add_direct(sym_rev_tbl, id, name);
    return id;
}

(parse.y)

The string and ID correspondence relationship can be accomplished by using the st_table. There probably isn't any especially difficult part here.

What is the omitted section doing? It is treating global variable names and instance variable names as special and flagging them. This is because in the parser, it is necessary to know the variable's classification from the ID. However, the fundamental part of ID is unrelated to this, so I won't explain it here.

From ID to char*

The reverse of rb_intern() is rb_id2name(), which takes an ID and generates a char*. You probably know this, but the 2 in id2name is "to". "To" and "two" have the same pronunciation, so "2" is used for "to". This notation is often seen.

This function also sets the ID classification flags so it is long. Let me simplify it.

▼ rb_id2name() (simplified)

char *
rb_id2name(id)
    ID id;
{
    char *name;

    if (st_lookup(sym_rev_tbl, id, &name))
        return name;
    return 0;
}

Maybe it seems that it is a little over-simplified, but in reality if we remove the details it really becomes this simple.

The point I want to emphasize is that the found name is not copied. The ruby API does not require (or rather, it forbids) the free()-ing of the return value. Also, when parameters are passed, it always copies them. In other words, the creation and release is completed by one side, either by the user or by ruby.

So then, for a value whose creation and release cannot be completed on one side (a value that is handed over and not given back), a Ruby object is used. I have not yet discussed it, but a Ruby object is automatically released when it is no longer needed, even if we are not taking care of the object.
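The ownership policy described here (copy what comes in, never hand out anything the caller has to free) can be sketched with a toy interning table. This is only an illustration under assumptions: toy_intern and toy_id2name are hypothetical names, a fixed array with a linear search stands in for the two st_tables, and IDs are simply 1, 2, 3 and so on.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX_IDS 256

static char *toy_names[TOY_MAX_IDS];   /* index = ID - 1; owned by the table */
static unsigned int toy_count;

/* Returns the ID for name, registering a private copy on first sight.
 * Returns 0 when the table is full. */
unsigned int toy_intern(const char *name)
{
    unsigned int i;
    size_t len;

    assert(name != NULL);
    for (i = 0; i < toy_count; i++)
        if (strcmp(toy_names[i], name) == 0)
            return i + 1;
    if (toy_count == TOY_MAX_IDS)
        return 0;
    len = strlen(name) + 1;
    toy_names[toy_count] = malloc(len);
    assert(toy_names[toy_count] != NULL);
    memcpy(toy_names[toy_count], name, len);   /* copy: caller keeps `name` */
    return ++toy_count;
}

/* The returned pointer belongs to the table: callers must not free() it. */
const char *toy_id2name(unsigned int id)
{
    if (id == 0 || id > toy_count)
        return NULL;
    return toy_names[id - 1];
}
```

Because the name is copied at registration, the caller may modify or release its own buffer afterwards without affecting the table, and because the lookup returns the table's own storage, there is nothing for the caller to release.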

Converting VALUE and ID

ID is shown as an instance of the Symbol class at the Ruby level. And it can be obtained like so: "string".intern. The implementation of String#intern is rb_str_intern().

▼ rb_str_intern()

static VALUE
rb_str_intern(str)
    VALUE str;
{
    ID id;

    if (!RSTRING(str)->ptr || RSTRING(str)->len == 0) {
        rb_raise(rb_eArgError, "interning empty string");
    }
    if (strlen(RSTRING(str)->ptr) != RSTRING(str)->len)
        rb_raise(rb_eArgError, "string contains `\\0'");
    id = rb_intern(RSTRING(str)->ptr);
    return ID2SYM(id);
}

(string.c)

This function is quite reasonable as a ruby class library code example. Please pay attention to the part where RSTRING() is used and casted, and where the data structure's member is accessed.

Let's read the code. First, rb_raise() is merely error handling so we ignore it for now. The rb_intern() we previously examined is here, and also ID2SYM is here. ID2SYM() is a macro that converts ID to Symbol.

And the reverse operation is accomplished using Symbol#to_s and such. The implementation is in sym_to_s.

▼ sym_to_s()

static VALUE
sym_to_s(sym)
    VALUE sym;
{
    return rb_str_new2(rb_id2name(SYM2ID(sym)));
}

(object.c)

SYM2ID() is the macro that converts Symbol (VALUE) to an ID.

It looks like the function is not doing anything unreasonable. However, it is probably necessary to pay attention to the area around the memory handling. rb_id2name() returns a char* that must not be free()-d. rb_str_new2() copies the parameter's char* and uses the copy (and does not change the parameter). In this way the policy is consistent, which allows the line to be written just by chaining the functions.


The original work is Copyright © 2002 - 2004 Minero AOKI. Translated by Vincent ISAMBART and Clifford Escobar CAOILE. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License


Translated by Vincent ISAMBART

Chapter 4: Classes and modules

In this chapter, we'll see the details of the data structures created by classes and modules.

Classes and methods definition

First, I'd like to have a look at how Ruby classes are defined at the C level. This chapter investigates almost only particular cases, so I'd like you to first know the way that is used most often.

The main API to define classes and modules consists of the following 6 functions:

rb_define_class()

rb_define_class_under()

rb_define_module()

rb_define_module_under()

rb_define_method()

rb_define_singleton_method()

There are a few other versions of these functions, but the extension libraries and even most of the core library are defined using just this API. I'll introduce these functions to you one by one.

Class definition

rb_define_class() defines a class at the top-level. Let's take the Ruby array class, Array, as an example.

▼ Array class definition

VALUE rb_cArray;

void
Init_Array()
{
    rb_cArray  = rb_define_class("Array", rb_cObject);

(array.c)

rb_cObject and rb_cArray correspond respectively to Object and Array at the Ruby level. The added prefix rb shows that it belongs to ruby and the c that it is a class object. These naming rules are used everywhere in ruby.

This call to rb_define_class() defines a class called Array, which inherits from Object. At the same time as rb_define_class() creates the class object, it also defines the constant. That means that after this you can already access Array from a Ruby program. It corresponds to the following Ruby program:

class Array < Object

I'd like you to note the fact that there is no end. It was written like this on purpose. It is because with rb_define_class() the body of the class has not been executed.

Nested class definition

After that, there's rb_define_class_under(). This function defines a class nested in another class or module. This time the example is what is returned by stat(2), File::Stat.

▼ Definition of File::Stat

VALUE rb_cFile;
static VALUE rb_cStat;

    rb_cFile = rb_define_class("File", rb_cIO);
    rb_cStat = rb_define_class_under(rb_cFile, "Stat", rb_cObject);

(file.c)

This code corresponds to the following Ruby program:

class File < IO
  class Stat < Object

This time again I omitted the end on purpose.

Module definition

rb_define_module() is simple so let's end this quickly.

▼ Definition of Enumerable

VALUE rb_mEnumerable;

    rb_mEnumerable = rb_define_module("Enumerable");

(enum.c)

The m in the beginning of rb_mEnumerable is similar to the c for classes: it shows that it is a module. The corresponding Ruby program is:

module Enumerable

rb_define_module_under() is not used much so we'll skip it.

Method definition

This time the function is the one for defining methods, rb_define_method(). It's used very often. We'll take once again an example from Array.

▼ Definition of Array#to_s

    rb_define_method(rb_cArray, "to_s", rb_ary_to_s, 0);

(array.c)

With this the to_s method is defined in Array. The method body is given by a function pointer (rb_ary_to_s). The fourth parameter is the number of parameters taken by the method. As to_s does not take any parameters, it's 0. If we write the corresponding Ruby program, we'll have this:

class Array < Object
  def to_s
    # content of rb_ary_to_s()
  end
end

Of course the class part is not included in rb_define_method() and only the def part is accurate. But if there is no class part, it will look like the method is defined like a function, so I also wrote the enclosing class part.

One more example, this time taking a parameter:

▼ Definition of Array#concat

    rb_define_method(rb_cArray, "concat", rb_ary_concat, 1);

(array.c)

The class for the definition is rb_cArray (Array), the method name is concat, its body is rb_ary_concat() and the number of parameters is 1. It corresponds to the following Ruby program:

class Array < Object
  def concat( str )
    # content of rb_ary_concat()
  end
end

Singleton methods definition

We can define methods that are specific to a single object instance. They are called singleton methods. As I used File.unlink as an example in chapter 1 "Ruby language minimum", I first wanted to show it here, but for a particular reason we'll look at File.link instead.

▼ Definition of File.link

    rb_define_singleton_method(rb_cFile, "link", rb_file_s_link, 2);

(file.c)

It's used like rb_define_method(). The only difference is that here the first parameter is just the "object" where the method is defined. In this case, it's defined in rb_cFile.

Entry point

Being able to make definitions like before is great, but where are these functions called from, and by what means are they executed? These definitions are grouped in functions named Init_xxxx(). For instance, for Array a function Init_Array() like this has been made:

▼ Init_Array

void
Init_Array()
{
    rb_cArray  = rb_define_class("Array", rb_cObject);
    rb_include_module(rb_cArray, rb_mEnumerable);

    rb_define_singleton_method(rb_cArray, "allocate", rb_ary_s_alloc, 0);
    rb_define_singleton_method(rb_cArray, "[]", rb_ary_s_create, -1);
    rb_define_method(rb_cArray, "initialize", rb_ary_initialize, -1);
    rb_define_method(rb_cArray, "to_s", rb_ary_to_s, 0);
    rb_define_method(rb_cArray, "inspect", rb_ary_inspect, 0);
    rb_define_method(rb_cArray, "to_a", rb_ary_to_a, 0);
    rb_define_method(rb_cArray, "to_ary", rb_ary_to_a, 0);
    rb_define_method(rb_cArray, "frozen?",  rb_ary_frozen_p, 0);

(array.c)

The Init for the built-in functions are explicitly called during the startup of ruby. This is done in inits.c.

▼ rb_call_inits()

void
rb_call_inits()
{
    Init_sym();
    Init_var_tables();
    Init_Object();
    Init_Comparable();
    Init_Enumerable();
    Init_Precision();
    Init_eval();
    Init_String();
    Init_Exception();
    Init_Thread();
    Init_Numeric();
    Init_Bignum();
    Init_Array();

(inits.c)

This way, Init_Array() is called properly.

That explains it for the built-in libraries, but what about extension libraries? In fact, for extension libraries the convention is the same. Take the following code:

require "myextension"

With this, if the loaded extension library is myextension.so, at load time, the (extern) function named Init_myextension() is called. How they are called is beyond the scope of this chapter. For that, you should read chapter 18, "Load". Here we'll just end this with an example of Init.

The following example is from stringio, an extension library provided with ruby, that is to say not from a built-in library.

▼ Init_stringio() (beginning)

void
Init_stringio()
{
    VALUE StringIO = rb_define_class("StringIO", rb_cData);
    rb_define_singleton_method(StringIO, "allocate", strio_s_allocate, 0);
    rb_define_singleton_method(StringIO, "open", strio_s_open, -1);
    rb_define_method(StringIO, "initialize", strio_initialize, -1);
    rb_enable_super(StringIO, "initialize");
    rb_define_method(StringIO, "become", strio_become, 1);
    rb_define_method(StringIO, "reopen", strio_reopen, -1);

(ext/stringio/stringio.c)

Singleton classes

rb_define_singleton_method()

You should now be able to more or less understand how normal methods are defined. Somehow making the body of the method, then registering it in m_tbl will do. But what about singleton methods? We'll now look into the way singleton methods are defined.

▼ rb_define_singleton_method()

void
rb_define_singleton_method(obj, name, func, argc)
    VALUE obj;
    const char *name;
    VALUE (*func)();
    int argc;
{
    rb_define_method(rb_singleton_class(obj), name, func, argc);
}

(class.c)

As I explained, rb_define_method() is a function used to define normal methods, so the difference from normal methods is only rb_singleton_class(). But what on earth are singleton classes?

In brief, singleton classes are virtual classes that are only used to execute singleton methods. Singleton methods are functions defined in singleton classes. Classes themselves are in the first place (in a way) the "implementation" to link objects and methods, but singleton classes are even more on the implementation side. They are not formally part of the Ruby language, and don't appear much at the Ruby level.

rb_singleton_class()

Well, let's confirm what the singleton classes are made of. It's too simple to just show you the code of a function each time so this time I'll use a new weapon, a call graph.

rb_define_singleton_method
    rb_define_method
    rb_singleton_class
        SPECIAL_SINGLETON
        rb_make_metaclass
            rb_class_boot
            rb_singleton_class_attached

Call graphs are graphs showing calling relationships among functions (or more generally procedures). The call graphs showing all the calls written in the source code are called static call graphs. The ones expressing only the calls done during an execution are called dynamic call graphs.

This diagram is a static call graph and the indentation expresses which function calls which one. For instance, rb_define_singleton_method() calls rb_define_method() and rb_singleton_class(). And this rb_singleton_class() itself calls SPECIAL_SINGLETON() and rb_make_metaclass(). In order to obtain call graphs, you can use cflow and such. {cflow: see also doc/callgraph.html in the attached CD-ROM}

In this book, because I wanted to obtain call graphs that contain only functions, I created a ruby-specific tool by myself. Perhaps it can be generalized by modifying its code analyzing part, so I'd like to somehow get that done by around the publication of this book. These circumstances are also explained in doc/callgraph.html of the attached CD-ROM.

Let's go back to the code. When looking at the call graph, you can see that the calls made by rb_singleton_class() go very deep. Until now all call levels were shallow, so we could simply look at the functions without getting too lost. But at this depth, I easily forget what I was doing. In such a situation you should keep a call graph at hand to stay aware of where you are while reading. This time, as an example, we'll decode the procedures below rb_singleton_class() in parallel. We should look out for the following two points:

What exactly are singleton classes?
What is the purpose of singleton classes?

Normal classes and singleton classes

Singleton classes are special classes: they're basically the same as normal classes, but there are a few differences. We can say that finding these differences amounts to explaining singleton classes concretely.

What should we do to find them? We should find the differences between the function creating normal classes and the one creating singleton classes. For this, we have to find the function for creating normal classes. Since normal classes can be defined by rb_define_class(), it must, in one way or another, call a function that creates normal classes. For the moment, we'll not look at the content of rb_define_class() itself. I have some reasons to be interested in something that's deeper. That's why we will first look at the call graph of rb_define_class().

rb_define_class
    rb_class_inherited
    rb_define_class_id
        rb_class_new
            rb_class_boot
        rb_make_metaclass
            rb_class_boot
            rb_singleton_class_attached

I'm interested in rb_class_new(). Doesn't this name mean it creates a new class? Let's confirm that.

▼ rb_class_new()

VALUE
rb_class_new(super)
    VALUE super;
{
    Check_Type(super, T_CLASS);
    if (super == rb_cClass) {
        rb_raise(rb_eTypeError, "can't make subclass of Class");
    }
    if (FL_TEST(super, FL_SINGLETON)) {
        rb_raise(rb_eTypeError, "can't make subclass of virtual class");
    }
    return rb_class_boot(super);
}

(class.c)

Check_Type() checks the type of the object structure, so we can ignore it. rb_raise() is error handling so we can ignore it. Only rb_class_boot() remains. So let's look at it.

▼ rb_class_boot()

VALUE
rb_class_boot(super)
    VALUE super;
{
    NEWOBJ(klass, struct RClass);        /* allocates struct RClass */
    OBJSETUP(klass, rb_cClass, T_CLASS); /* initialization of the RBasic part */

    klass->super = super;       /* (A) */
    klass->iv_tbl = 0;
    klass->m_tbl = 0;
    klass->m_tbl = st_init_numtable();

    OBJ_INFECT(klass, super);
    return (VALUE)klass;
}

(class.c)

NEWOBJ() and OBJSETUP() are fixed expressions used when creating Ruby objects that possess one of the built-in structure types (struct Rxxxx). They are both macros. In NEWOBJ(), struct RClass is created and the pointer is put in its first parameter klass. In OBJSETUP(), the struct RBasic member of the RClass (and thus basic.klass and basic.flags) is initialized.

OBJ_INFECT() is a macro related to security. From now on, we'll ignore it.

At (A), the super member of klass is set to the super parameter. It looks like rb_class_boot() is a function that creates a class inheriting from super.

So, rb_class_boot() is a function that creates a class, and rb_class_new() is almost identical.

Then, let's once more look at rb_singleton_class()'s call graph:

rb_singleton_class
    SPECIAL_SINGLETON
    rb_make_metaclass
        rb_class_boot
        rb_singleton_class_attached

Here also rb_class_boot() is called. So up to that point, it's the same as in normal classes. What happens afterwards is what differs between normal classes and singleton classes, in other words the characteristics of singleton classes. If everything's clear so far, we just need to read rb_singleton_class() and rb_make_metaclass().

Compressed rb_singleton_class()

rb_singleton_class() is a little long so we'll first remove its non-essential parts.

▼ rb_singleton_class()

#define SPECIAL_SINGLETON(x,c) do {\
    if (obj == (x)) {\
        return c;\
    }\
} while (0)

VALUE
rb_singleton_class(obj)
    VALUE obj;
{
    VALUE klass;

    if (FIXNUM_P(obj) || SYMBOL_P(obj)) {
        rb_raise(rb_eTypeError, "can't define singleton");
    }
    if (rb_special_const_p(obj)) {
        SPECIAL_SINGLETON(Qnil, rb_cNilClass);
        SPECIAL_SINGLETON(Qfalse, rb_cFalseClass);
        SPECIAL_SINGLETON(Qtrue, rb_cTrueClass);
        rb_bug("unknown immediate %ld", obj);
    }

    DEFER_INTS;
    if (FL_TEST(RBASIC(obj)->klass, FL_SINGLETON) &&
        (BUILTIN_TYPE(obj) == T_CLASS ||
         rb_iv_get(RBASIC(obj)->klass, "__attached__") == obj)) {
        klass = RBASIC(obj)->klass;
    }
    else {
        klass = rb_make_metaclass(obj, RBASIC(obj)->klass);
    }
    if (OBJ_TAINTED(obj)) {
        OBJ_TAINT(klass);
    }
    else {
        FL_UNSET(klass, FL_TAINT);
    }
    if (OBJ_FROZEN(obj)) OBJ_FREEZE(klass);
    ALLOW_INTS;

    return klass;
}

(class.c)

The first and the second half are separated by a blank line. The first half handles special cases and the second half handles the general case. In other words, the second half is the trunk of the function. That's why we'll keep it for later and talk about the first half.

Everything that is handled in the first half is a non-pointer VALUE, which means that no object struct exists for it. First, Fixnum and Symbol are explicitly picked. Then, rb_special_const_p() is a function that returns true for non-pointer VALUEs, so at that point only Qtrue, Qfalse and Qnil should get caught. Other than that, there are no valid non-pointer VALUEs, so anything else would be reported as a bug with rb_bug().

DEFER_INTS() and ALLOW_INTS() both end with the same INTS so you should see them as a pair. That's the case, and they are macros related to signals. Because they are defined in rubysig.h, you can guess that INTS is the abbreviation of interrupts. You can ignore them.

Compressed rb_make_metaclass()

▼ rb_make_metaclass()

VALUE
rb_make_metaclass(obj, super)
    VALUE obj, super;
{
    VALUE klass = rb_class_boot(super);
    FL_SET(klass, FL_SINGLETON);
    RBASIC(obj)->klass = klass;
    rb_singleton_class_attached(klass, obj);
    if (BUILTIN_TYPE(obj) == T_CLASS) {
        RBASIC(klass)->klass = klass;
        if (FL_TEST(obj, FL_SINGLETON)) {
            RCLASS(klass)->super =
                    RBASIC(rb_class_real(RCLASS(obj)->super))->klass;
        }
    }

    return klass;
}

(class.c)

We already saw rb_class_boot(). It creates a (normal) class using the super parameter as its superclass. After that, the FL_SINGLETON of this class is set. This is clearly suspicious. Judging from its name, this flag is presumably the mark of a singleton class.

What are singleton classes?

With the above done, we'll furthermore throw away the declarations, because parameters, return values and local variables are all VALUE. That allows us to compress the code to the following:

▼ rb_singleton_class() rb_make_metaclass() (after compression)

rb_singleton_class(obj)
{
    if (FL_TEST(RBASIC(obj)->klass, FL_SINGLETON) &&
        (BUILTIN_TYPE(obj) == T_CLASS || BUILTIN_TYPE(obj) == T_MODULE) &&
        rb_iv_get(RBASIC(obj)->klass, "__attached__") == obj) {
        klass = RBASIC(obj)->klass;
    }
    else {
        klass = rb_make_metaclass(obj, RBASIC(obj)->klass);
    }
    return klass;
}

rb_make_metaclass(obj, super)
{
    klass = create a class with super as superclass;
    FL_SET(klass, FL_SINGLETON);
    RBASIC(obj)->klass = klass;
    rb_singleton_class_attached(klass, obj);
    if (BUILTIN_TYPE(obj) == T_CLASS) {
        RBASIC(klass)->klass = klass;
        if (FL_TEST(obj, FL_SINGLETON)) {
            RCLASS(klass)->super =
                    RBASIC(rb_class_real(RCLASS(obj)->super))->klass;
        }
    }

    return klass;
}

The condition of the if statement of rb_singleton_class() seems quite complicated. However, this condition is not connected to rb_make_metaclass(), which is the main flow, so we'll see it later. Let's first think about what happens on the false branch of the if.

The BUILTIN_TYPE() of rb_make_metaclass() is similar to TYPE() as it is a macro to get the structure type flag (T_xxxx). That means this check in rb_make_metaclass means "if obj is a class". For the moment we assume that obj is not a class, so we'll remove it.

With these simplifications, we get the following:

▼ rb_singleton_class() rb_make_metaclass() (after recompression)

rb_singleton_class(obj)
{
    klass = create a class with RBASIC(obj)->klass as superclass;
    FL_SET(klass, FL_SINGLETON);
    RBASIC(obj)->klass = klass;
    return klass;
}

But there is still a side of it that is quite hard to understand. That's because klass is used too often. So let's rename the klass variable to sclass.

▼ rb_singleton_class() rb_make_metaclass() (variable substitution)

rb_singleton_class(obj)
{
    sclass = create a class with RBASIC(obj)->klass as superclass;
    FL_SET(sclass, FL_SINGLETON);
    RBASIC(obj)->klass = sclass;
    return sclass;
}

Now it should be very easy to understand. To make it even simpler, I've represented what is done with a diagram (figure 1). In the horizontal direction is the "instance - class" relation, and in the vertical direction is inheritance (the superclasses are above).

Figure 1: rb_singleton_class

When comparing the first and last part of this diagram, you can understand that sclass is inserted without changing the structure. That's all there is to singleton classes. In other words the inheritance is increased by one step. By defining methods there, we can define methods which have absolutely nothing to do with the other instances of klass.
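The insertion can also be modelled with two tiny structs. The sketch below is a toy, not ruby's code: struct toy_class, struct toy_object and toy_singleton_class are hypothetical names, is_singleton plays the role of FL_SINGLETON, and the __attached__ check of the real condition is left out.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_class {
    struct toy_class *super;   /* inheritance (vertical in the figure) */
    int is_singleton;          /* plays the role of FL_SINGLETON */
};

struct toy_object {
    struct toy_class *klass;   /* instance - class relation (horizontal) */
};

/* Returns the singleton class of obj, creating it and inserting it
 * between obj and its former class on first use. */
struct toy_class *toy_singleton_class(struct toy_object *obj)
{
    struct toy_class *sclass;

    assert(obj != NULL && obj->klass != NULL);
    if (obj->klass->is_singleton)
        return obj->klass;             /* already has one */
    sclass = malloc(sizeof *sclass);
    assert(sclass != NULL);
    sclass->super = obj->klass;        /* the old class becomes the superclass */
    sclass->is_singleton = 1;
    obj->klass = sclass;               /* obj is now an instance of sclass */
    return sclass;
}
```

After the call, only the object it was called on points to the new class; any other instance of the original class is untouched, which is why methods defined in sclass stay private to that one object.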

Singleton classes and instances

By the way, did you notice that, during the compression process, the call to rb_singleton_class_attached() was stealthily removed? Here:

rb_make_metaclass(obj, super)
{
    klass = create a class with super as superclass;
    FL_SET(klass, FL_SINGLETON);
    RBASIC(obj)->klass = klass;
    rb_singleton_class_attached(klass, obj);   /* THIS */

Let's have a look at what it does.

▼ rb_singleton_class_attached()

void
rb_singleton_class_attached(klass, obj)
    VALUE klass, obj;
{
    if (FL_TEST(klass, FL_SINGLETON)) {
        if (!RCLASS(klass)->iv_tbl) {
            RCLASS(klass)->iv_tbl = st_init_numtable();
        }
        st_insert(RCLASS(klass)->iv_tbl,
                  rb_intern("__attached__"), obj);
    }
}

(class.c)

If the FL_SINGLETON flag of klass is set... in other words if it's a singleton class, put the __attached__ → obj relation in the instance variable table of klass (iv_tbl). That's what it looks like (in our case klass is always a singleton class... in other words its FL_SINGLETON flag is always set).

__attached__ does not have the @ prefix, but it's stored in the instance variables table so it's still an instance variable. Such an instance variable can never be read at the Ruby level so it can be used to keep values for the system's exclusive use.

Let's now think about the relationship between klass and obj. klass is the singleton class of obj. In other words, this "invisible" instance variable allows the singleton class to remember the instance it was created from. Its value is used when the singleton class is changed, notably to call hook methods on the instance (i.e. obj). For example, when a method is added to a singleton class, obj's singleton_method_added method is called. There is no logical necessity for doing so; it was done because that's how it was defined in the language.

But is it really all right? Storing the instance in __attached__ will force one singleton class to have only one attached instance. For example, by getting (in some way or another) the singleton class and calling new on it, won't a singleton class end up having multiple instances?

This cannot be done because the proper checks are done to prevent the creation of an instance of a singleton class.

Singleton classes are in the first place for singleton methods. Singleton methods are methods existing only on a particular object. If singleton classes could have multiple instances, they would be the same as normal classes. Hence, each singleton class has only one instance... or rather, it must be limited to one.

SummaryWe’vedonealot,maybemadearealmayhem,solet’sfinishandputeverythinginorderwithasummary.

Whataresingletonclasses?TheyareclassesthathavetheFL_SINGLETONflagsetandthatcanonlyhaveoneinstance.

Page 195: Ruby Hacking Guide

Whataresingletonmethods?Theyaremethodsdefinedinthesingletonclassofanobject.

Metaclasses

Inheritance of singleton methods

Infinite chain of classes

Even a class has a class, and it’s Class. And the class of Class is again Class. We find ourselves in an infinite loop (figure 2).

Figure 2: Infinite loop of classes

Up to here it’s something we’ve already gone through. What comes after that is the theme of this chapter. Why do classes have to make a loop?

First, in Ruby all data are objects. And classes are data in Ruby, so they have to be objects.

As they are objects, they must respond to methods. And setting the rule “to respond to methods you must belong to a class” made processing easier. That’s where the need for a class to also have a class comes from.

Let’s base ourselves on this and think about the way to implement it. First, we can try the most naïve way: Class’s class is ClassClass, ClassClass’s class is ClassClassClass…, chaining classes of classes one by one. But whichever way you look at it, this can’t be implemented effectively. That’s why it’s common in object oriented languages where classes are objects that Class’s class is Class itself, creating an endless virtual instance-class relationship.

((errata: This structure is implemented efficiently in recent Ruby 1.8, thus it can be implemented efficiently.))

I’m repeating myself, but the fact that Class’s class is Class is only to make the implementation easier; there’s nothing important in this logic.
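The loop is easy to see from the Ruby level. A tiny sketch that follows the class link a few times, starting from String (any class would do):

```ruby
chain = [String]
3.times { chain << chain.last.class }   # follow "class of" three times

p chain
```

After the first step the chain stays at Class forever, which is the loop of figure 2.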

“Class is also an object”

“Everything is an object” is often used as an advertising statement when speaking about Ruby. And as a part of that, “Classes are also objects!” also appears. But these expressions often go too far. When thinking about these sayings, we have to split them in two:

all data are objects
classes are data

Talking about data or code makes a discussion much harder to understand. That’s why here we’ll restrict the meaning of “data” to “what can be put in variables in programs”.

Being able to manipulate classes from programs gives programs the ability to manipulate themselves. This is called reflection. In Ruby, which is an object oriented language and furthermore has classes, it is equivalent to being able to directly manipulate classes.

Nevertheless, there’s also a way in which classes are not objects. For example, there’s no problem in providing a feature to manipulate classes as function-style methods (functions defined at the top-level). However, as inside the interpreter there are data structures to represent the classes, it’s more natural in object oriented languages to make them available directly. And Ruby made this choice.

Furthermore, an objective in Ruby is for all data to be objects. That’s why it’s appropriate to make them objects.

By the way, there is also a reason not linked to reflection why in Ruby classes had to be made objects. That is to prepare the place to define methods which are independent from instances (what are called static methods in Java and C++).

And to implement static methods, another thing was necessary: singleton methods. By chain reaction, that also makes singleton classes necessary. Figure 3 shows these dependency relationships.

Figure 3: Requirements dependencies

Class methods inheritance

In Ruby, singleton methods defined in a class are called class methods. However, their specification is a little strange. For some reason, class methods are inheritable.

class A
  def A.test    # defines a singleton method in A
    puts("ok")
  end
end

class B < A
end

B.test()  # calls it

This can’t occur with singleton methods of objects that are not classes. In other words, classes are the only ones handled specially. In the following section we’ll see how class methods are inherited.

Singleton class of a class

Assuming that class methods are inherited, where is this operation done? It must be done either at class definition (creation) or at singleton method definition. Then let’s first look at the code defining classes.

Class definition means of course rb_define_class(). Now let’s take the call graph of this function.

rb_define_class
    rb_class_inherited
    rb_define_class_id
        rb_class_new
            rb_class_boot
        rb_make_metaclass
            rb_class_boot
            rb_singleton_class_attached

If you’re wondering where you’ve seen it before, we looked at it in the previous section. At that time you did not see it, but if you look closely, rb_make_metaclass() appears. As we saw before, this function introduces a singleton class. This is very suspicious. Why is this called even if we are not defining a singleton method? Furthermore, why is the lower level rb_make_metaclass() used instead of rb_singleton_class()? It looks like we have to check these surroundings again.

rb_define_class_id()

Let’s first start our reading with its caller, rb_define_class_id().

▼ rb_define_class_id()

VALUE
rb_define_class_id(id, super)
    ID id;
    VALUE super;
{
    VALUE klass;

    if (!super) super = rb_cObject;
    klass = rb_class_new(super);
    rb_name_class(klass, id);
    rb_make_metaclass(klass, RBASIC(super)->klass);

    return klass;
}

(class.c)

rb_class_new() was a function that creates a class with super as its superclass. rb_name_class()’s name means it names a class, but for the moment we do not care about names so we’ll skip it. After that there’s the rb_make_metaclass() in question. I’m concerned by the fact that when called from rb_singleton_class(), the parameters were different. Last time was like this:

    rb_make_metaclass(obj, RBASIC(obj)->klass);

But this time is like this:

    rb_make_metaclass(klass, RBASIC(super)->klass);

So as you can see it’s slightly different. How do the results change depending on that? Let’s have once again a look at a simplified rb_make_metaclass().

rb_make_metaclass (once more)

▼ rb_make_metaclass (after first compression)

rb_make_metaclass(obj, super)
{
    klass = create a class with super as superclass;
    FL_SET(klass, FL_SINGLETON);
    RBASIC(obj)->klass = klass;
    rb_singleton_class_attached(klass, obj);
    if (BUILTIN_TYPE(obj) == T_CLASS) {
        RBASIC(klass)->klass = klass;
        if (FL_TEST(obj, FL_SINGLETON)) {
            RCLASS(klass)->super =
                RBASIC(rb_class_real(RCLASS(obj)->super))->klass;
        }
    }

    return klass;
}

Last time, the if statement was wholly skipped, but looking once again, something is done only for T_CLASS, in other words classes. This clearly looks important. In rb_define_class_id(), it’s called like this:

    rb_make_metaclass(klass, RBASIC(super)->klass);

Let’s expand rb_make_metaclass()’s parameter variables with the actual values.

▼ rb_make_metaclass (recompression)

rb_make_metaclass(klass, super_klass /* == RBASIC(super)->klass */)
{
    sclass = create a class with super_klass as superclass;
    RBASIC(klass)->klass = sclass;
    RBASIC(sclass)->klass = sclass;
    return sclass;
}

Doing this as a diagram gives something like figure 4. In it, the names between parentheses are singleton classes. This notation is often used in this book so I’d like you to remember it. This means that obj’s singleton class is written as (obj). And (klass) is the singleton class for klass. It looks like the singleton class is caught between a class and this class’s superclass’s class.

Figure 4: Introduction of a class’s singleton class

By expanding our imagination further from this result, we can think that the superclass’s class (the c in figure 4) must again be a singleton class. You’ll understand with one more inheritance level (figure 5).

Figure 5: Hierarchy of multi-level inheritance

As the relationship between super and klass is the same as the one between klass and klass2, c must be the singleton class (super). If you continue like this, finally you’ll arrive at the conclusion that Object’s class must be (Object). And that’s the case in practice. For example, by inheriting like in the following program:

class A < Object
end
class B < A
end

internally, a structure like figure 6 is created.

Figure 6: Class hierarchy and metaclasses

As classes and their metaclasses are linked and inherit like this, class methods are inherited.
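The parallel hierarchy of metaclasses can be checked from Ruby. This is a sketch assuming a current Ruby, where Object#singleton_class exposes what this book writes as (A) and (B); the class names ParentDemo and ChildDemo are invented for the example.

```ruby
class ParentDemo
  def self.test      # a class method, i.e. a singleton method of ParentDemo
    "ok"
  end
end

class ChildDemo < ParentDemo
end

puts ChildDemo.test   # found through the metaclass chain

# (ChildDemo) inherits from (ParentDemo), which inherits from (Object).
p ChildDemo.singleton_class.superclass.equal?(ParentDemo.singleton_class)
p ParentDemo.singleton_class.superclass.equal?(Object.singleton_class)

# The method itself is stored only in (ParentDemo).
p ParentDemo.singleton_methods(false)
```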

Class of a class of a class

You’ve understood how class method inheritance works, but by doing that, some questions have appeared in return. What is the class of a class’s singleton class? For this, we can check it by using debuggers. I’ve made figure 7 from the results of this investigation.

Figure 7: Class of a class’s singleton class

A class’s singleton class puts itself as its own class. Quite complicated.

The second question: the class of Object must be Class. Didn’t I properly confirm this in chapter 1: Ruby language minimum by using the class() method?

p(Object.class())   # Class

Certainly, that’s the case “at the Ruby level”. But “at the C level”, it’s the singleton class (Object). If (Object) does not appear at the Ruby level, it’s because Object#class skips the singleton classes. Let’s look at the body of the method, rb_obj_class(), to confirm that.

▼ rb_obj_class()

VALUE
rb_obj_class(obj)
    VALUE obj;
{
    return rb_class_real(CLASS_OF(obj));
}

VALUE
rb_class_real(cl)
    VALUE cl;
{
    while (FL_TEST(cl, FL_SINGLETON) || TYPE(cl) == T_ICLASS) {
        cl = RCLASS(cl)->super;
    }
    return cl;
}

(object.c)

CLASS_OF(obj) returns the basic.klass of obj. While in rb_class_real(), all singleton classes are skipped (advancing towards the superclass). In the first place, singleton classes are caught between a class and its superclass, like a proxy. That’s why when a “real” class is necessary, we have to follow the superclass chain (figure 8).

T_ICLASS will appear later when we talk about include.
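The same walk can be imitated at the Ruby level. The helper real_class below is hypothetical and only mirrors the singleton-class half of the loop in rb_class_real(); include classes are never visible from Ruby, so that half cannot be reproduced. It assumes a Ruby that provides Module#singleton_class?.

```ruby
# Hypothetical helper: climb the superclass chain until a class that is
# not a singleton class is found, like the loop in rb_class_real().
def real_class(klass)
  klass = klass.superclass while klass.singleton_class?
  klass
end

obj = Object.new
def obj.hello
  "hi"
end

sclass = obj.singleton_class
p sclass.singleton_class?             # the "C level" class of obj
p real_class(sclass)                  # what the walk arrives at
p obj.class                           # what Object#class reports
p real_class(String.singleton_class)  # walking up from (String)
```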

Figure 8: Singleton class and real class

Singleton class and metaclass

Well, the singleton classes that were introduced in classes are also one type of class; each is a class’s class. So it can be called a metaclass.

However, you should be wary of the fact that being a singleton class does not mean being a metaclass. The singleton classes introduced in classes are metaclasses. The important fact is not that they are singleton classes, but that they are the classes of classes. I was stuck on this point when I started learning Ruby. As I may not be the only one, I would like to make this clear.

Thinking about this, the rb_make_metaclass() function name is not very good. When used for a class, it does indeed create a metaclass, but when used for other objects, the created class is not a metaclass.

Then finally, even if you understood that some classes are metaclasses, it’s not as if there was any concrete gain. I’d like you not to care too much about it.

Bootstrap

We have nearly finished our talk about classes and metaclasses. But there is still one problem left. It’s about the 3 metaobjects Object, Module and Class. These 3 cannot be created with the common use API. To make a class, its metaclass must be built, but like we saw some time ago, the metaclass’s superclass is Class. However, as Class has not been created yet, the metaclass cannot be built. So in ruby, only these 3 classes’ creation is handled specially.

Then let’s look at the code:

▼ Object, Module and Class creation

rb_cObject = boot_defclass("Object", 0);
rb_cModule = boot_defclass("Module", rb_cObject);
rb_cClass  = boot_defclass("Class",  rb_cModule);

metaclass = rb_make_metaclass(rb_cObject, rb_cClass);
metaclass = rb_make_metaclass(rb_cModule, metaclass);
metaclass = rb_make_metaclass(rb_cClass, metaclass);

(object.c)

First, in the first half, boot_defclass() is similar to rb_class_boot(); it just creates a class with its given superclass set. These links give us something like the left part of figure 9.

And in the three lines of the second half, (Object), (Module) and (Class) are created and set (right part of figure 9). The classes of (Object) and (Module)… that is, themselves… are already set in rb_make_metaclass(), so there is no problem. With this, the metaobjects’ bootstrap is finished.
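The result of this bootstrap can be inspected from Ruby. A sketch under the assumption of a current Ruby: there, BasicObject sits above Object (it was added after the version this book reads), so only the links among Object, Module and Class are checked.

```ruby
# First half: the plain superclass links set by boot_defclass().
p Module.superclass   # Object
p Class.superclass    # Module

# Second half: the metaclasses are chained in the same order.
p Module.singleton_class.superclass.equal?(Object.singleton_class)
p Class.singleton_class.superclass.equal?(Module.singleton_class)

# And at the Ruby level every one of them reports Class as its class.
p [Object, Module, Class].map(&:class)
```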

Figure 9: Metaobjects creation

After taking everything into account, it gives us the final shape like figure 10.

Figure 10: Ruby metaobjects

Class names

In this section, we will analyse how the reciprocal conversion between classes and class names, in other words constants, is formed. Concretely, we will target rb_define_class() and rb_define_class_under().

Name → class

First we’ll read rb_define_class(). After the end of this function, the class can be found from the constant.

▼ rb_define_class()

VALUE
rb_define_class(name, super)
    const char *name;
    VALUE super;
{
    VALUE klass;
    ID id;

    id = rb_intern(name);
    if (rb_autoload_defined(id)) {             /* (A) autoload */
        rb_autoload_load(id);
    }
    if (rb_const_defined(rb_cObject, id)) {    /* (B) rb_const_defined */
        klass = rb_const_get(rb_cObject, id);  /* (C) rb_const_get */
        if (TYPE(klass) != T_CLASS) {
            rb_raise(rb_eTypeError, "%s is not a class", name);
        }                                      /* (D) rb_class_real */
        if (rb_class_real(RCLASS(klass)->super) != super) {
            rb_name_error(id, "%s is already defined", name);
        }
        return klass;
    }
    if (!super) {
        rb_warn("no super class for '%s', Object assumed", name);
    }
    klass = rb_define_class_id(id, super);
    rb_class_inherited(super, klass);
    st_add_direct(rb_class_tbl, id, klass);

    return klass;
}

(class.c)

This can be clearly divided into two parts: before and after rb_define_class_id(). The former is to acquire or create the class. The latter is to assign it to the constant. We will look at it in more detail below.

(A) In Ruby, there is a feature named autoload that automatically loads libraries when certain constants are accessed. The functions named rb_autoload_xxxx() are for its checks. You can ignore them without any problem.

(B) We determine whether the name constant has been defined or not in Object.

(C) Get the value of the name constant. This will be explained in detail in chapter 6.

(D) We’ve seen rb_class_real() some time ago. If the class c is a singleton class or an ICLASS, it climbs the super hierarchy up to a class that is not, and returns it. In short, this function skips the virtual classes that should not appear at the Ruby level.

That’s what we can read nearby.

As constants are involved around this, it is very troublesome. But I feel the chapter about constants is probably not the right place to talk about class definition; that’s the reason for such a halfway description around here.

Moreover, about this line coming after rb_define_class_id(),

    st_add_direct(rb_class_tbl, id, klass);

This part assigns the class to the constant. However, whichever way you look at it you do not see that. In fact, top-level classes and modules that are defined in C are separated from the other constants and regrouped in rb_class_tbl. The split is slightly related to the GC. It’s not essential.

Class → name

We understood how the class can be obtained from the class name, but how do we do the opposite? By doing things like calling p or Class#name, we can get the name of the class, but how is it implemented?

In fact this is done by rb_name_class(), which already appeared a long time ago. The call is around the following:

rb_define_class
    rb_define_class_id
        rb_name_class

Let’s look at its content:

▼ rb_name_class()

void
rb_name_class(klass, id)
    VALUE klass;
    ID id;
{
    rb_iv_set(klass, "__classid__", ID2SYM(id));
}

(variable.c)

__classid__ is another instance variable that can’t be seen from Ruby. As only VALUEs can be put in the instance variable table, the ID is converted to a Symbol using ID2SYM().

That’s how we are able to find the constant name from the class.

Nested classes

So, in the case of classes defined at the top-level, we know how the reciprocal link between name and class works. What’s left is the case of classes defined in modules or other classes, and for that it’s a little more complicated. The function to define these nested classes is rb_define_class_under().

▼ rb_define_class_under()

VALUE
rb_define_class_under(outer, name, super)
    VALUE outer;
    const char *name;
    VALUE super;
{
    VALUE klass;
    ID id;

    id = rb_intern(name);
    if (rb_const_defined_at(outer, id)) {
        klass = rb_const_get(outer, id);
        if (TYPE(klass) != T_CLASS) {
            rb_raise(rb_eTypeError, "%s is not a class", name);
        }
        if (rb_class_real(RCLASS(klass)->super) != super) {
            rb_name_error(id, "%s is already defined", name);
        }
        return klass;
    }
    if (!super) {
        rb_warn("no super class for '%s::%s', Object assumed",
                rb_class2name(outer), name);
    }
    klass = rb_define_class_id(id, super);
    rb_set_class_path(klass, outer, name);
    rb_class_inherited(super, klass);
    rb_const_set(outer, id, klass);

    return klass;
}

(class.c)

The structure is like the one of rb_define_class(): before the call to rb_define_class_id() is the redefinition check, after is the creation of the reciprocal link between constant and class. The first half is boringly similar to rb_define_class() so we’ll skip it. In the second half, rb_set_class_path() is new. We’re going to look at it.

rb_set_class_path()

This function gives the name name to the class klass nested in the class under. “Class path” means a constant name including all the nesting information starting from top-level, for example “Net::NetPrivate::Socket”.

▼ rb_set_class_path()

void
rb_set_class_path(klass, under, name)
    VALUE klass, under;
    const char *name;
{
    VALUE str;

    if (under == rb_cObject) {
        /* defined at top-level */
        str = rb_str_new2(name);    /* create a Ruby string from name */
    }
    else {
        /* nested constant */
        str = rb_str_dup(rb_class_path(under));  /* copy the return value */
        rb_str_cat2(str, "::");     /* concatenate "::" */
        rb_str_cat2(str, name);     /* concatenate name */
    }
    rb_iv_set(klass, "__classpath__", str);
}

(variable.c)

Everything except the last line is the construction of the class path, and the last line makes the class remember its own name. __classpath__ is of course another instance variable that can’t be seen from a Ruby program. In rb_name_class() there was __classid__, but id is different because it does not include nesting information (look at the table below).

__classpath__    Net::NetPrivate::Socket
__classid__      Socket

It means classes defined for example in rb_define_class() all have __classid__ or __classpath__ defined. So to find under’s class path we can look up these instance variables. This is done by rb_class_path(). We’ll omit its content.
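The difference between the two names can be seen from Ruby. In this sketch the module and class names are invented, and build_class_path is a hypothetical helper that only imitates the string construction in rb_set_class_path(); it is not how Ruby exposes the feature.

```ruby
module OuterDemo
  module InnerDemo
    class LeafDemo
    end
  end
end

# Hypothetical helper imitating rb_set_class_path(): a top-level class
# (no enclosing path) keeps its bare name, a nested one gets "outer::name".
def build_class_path(under_path, name)
  under_path.nil? ? name.dup : "#{under_path}::#{name}"
end

leaf = OuterDemo::InnerDemo::LeafDemo
p leaf.name                                       # the full class path
p leaf.name.split("::").last                      # the part __classid__ would hold
p build_class_path(OuterDemo::InnerDemo.name, "LeafDemo")
```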

Nameless classes

Contrary to what I have just said, there are in fact cases in which neither __classpath__ nor __classid__ is set. That is because in Ruby you can use a method like the following to create a class.

c = Class.new()

If a class is created like this, it won’t go through rb_define_class_id() and the class path won’t be set. In this case, c does not have any name, which is to say we get an unnamed class.

However, if later it’s assigned to a constant, a name will be attached to the class at that moment.

SomeClass = c   # the class name is SomeClass

Strictly speaking, the name is attached to the class the first time it is requested after the assignment to a constant. For instance, when calling p on this SomeClass class or when calling the Class#name method. When doing this, a value equal to the class is searched in rb_class_tbl, and a name has to be chosen. The following case can also happen:

class A
  class B
    C = tmp = Class.new()
    p(tmp)   # here we search for the name
  end
end

so in the worst case we have to search the whole constant space. However, generally, there aren’t many constants, so even searching all constants does not take too much time.
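The visible effect can be reproduced as follows. This sketch assumes a current Ruby, where an unnamed class reports nil as its name, and it uses Object.const_set (equivalent to writing the constant assignment at the top-level) with an invented constant name. When exactly the name is attached internally may differ from the version this book reads.

```ruby
c = Class.new
before = c.name                       # no constant refers to c yet

Object.const_set(:SomeDemoClass, c)   # same as: SomeDemoClass = c
after = c.name

p before
p after
```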

Include

We only talked about classes so let’s finish this chapter with something else and talk about module inclusion.

rb_include_module (1)

Includes are done by the ordinary method Module#include. Its corresponding function in C is rb_include_module(). In fact, to be precise, its body is rb_mod_include(), and there Module#append_features is called, and this function’s default implementation finally calls rb_include_module(). Mixing what’s happening in Ruby and C gives us the following call graph.

Module#include (rb_mod_include)
    Module#append_features (rb_mod_append_features)
        rb_include_module

Anyway, the manipulations that are usually regarded as inclusions are done by rb_include_module(). This function is a little long so we’ll look at it a half at a time.

▼ rb_include_module (first half)

      /* include module in class */
void
rb_include_module(klass, module)
    VALUE klass, module;
{
    VALUE p, c;
    int changed = 0;

    rb_frozen_class_p(klass);
    if (!OBJ_TAINTED(klass)) {
        rb_secure(4);
    }

    if (NIL_P(module)) return;
    if (klass == module) return;

    switch (TYPE(module)) {
      case T_MODULE:
      case T_CLASS:
      case T_ICLASS:
        break;
      default:
        Check_Type(module, T_MODULE);
    }

(class.c)

For the moment it’s only security and type checking, therefore we can ignore it. The process itself is below:

▼ rb_include_module (second half)

    OBJ_INFECT(klass, module);
    c = klass;
    while (module) {
        int superclass_seen = Qfalse;

        if (RCLASS(klass)->m_tbl == RCLASS(module)->m_tbl)
            rb_raise(rb_eArgError, "cyclic include detected");
        /* (A) skip if the superclass already includes module */
        for (p = RCLASS(klass)->super; p; p = RCLASS(p)->super) {
            switch (BUILTIN_TYPE(p)) {
              case T_ICLASS:
                if (RCLASS(p)->m_tbl == RCLASS(module)->m_tbl) {
                    if (!superclass_seen) {
                        c = p;  /* move the insertion point */
                    }
                    goto skip;
                }
                break;
              case T_CLASS:
                superclass_seen = Qtrue;
                break;
            }
        }
        c = RCLASS(c)->super =
                include_class_new(module, RCLASS(c)->super);
        changed = 1;
      skip:
        module = RCLASS(module)->super;
    }
    if (changed) rb_clear_cache();
}

(class.c)

First, what the (A) block does is written in the comment. It seems to be a special condition so let’s skip reading it for now. By extracting the important parts from the rest we get the following:

c = klass;
while (module) {
    c = RCLASS(c)->super = include_class_new(module, RCLASS(c)->super);
    module = RCLASS(module)->super;
}

In other words, it’s a repetition over module’s super. What is in module’s super must be a module included by module (because our intuition tells us so). Then the superclass of the class where the inclusion occurs is replaced with something. We do not understand much what, but at the moment I saw that I felt “Ah, doesn’t this look like the addition of elements to a list (like LISP’s cons)?” and it suddenly made the story faster. In other words it’s the following form:

list = new(item, list)

Thinking about this, it seems we can expect that module is inserted between c and c->super. If it’s like this, it fits the module’s specification.

But to be sure of this we have to look at include_class_new().

include_class_new()

▼ include_class_new()

static VALUE
include_class_new(module, super)
    VALUE module, super;
{
    NEWOBJ(klass, struct RClass);               /* (A) */
    OBJSETUP(klass, rb_cClass, T_ICLASS);

    if (BUILTIN_TYPE(module) == T_ICLASS) {
        module = RBASIC(module)->klass;
    }
    if (!RCLASS(module)->iv_tbl) {
        RCLASS(module)->iv_tbl = st_init_numtable();
    }
    klass->iv_tbl = RCLASS(module)->iv_tbl;     /* (B) */
    klass->m_tbl = RCLASS(module)->m_tbl;
    klass->super = super;                       /* (C) */
    if (TYPE(module) == T_ICLASS) {             /* (D) */
        RBASIC(klass)->klass = RBASIC(module)->klass;   /* (D-1) */
    }
    else {
        RBASIC(klass)->klass = module;                  /* (D-2) */
    }
    OBJ_INFECT(klass, module);
    OBJ_INFECT(klass, super);

    return (VALUE)klass;
}

(class.c)

We’re lucky there’s nothing we do not know.

(A) First create a new class.

(B) Transplant module’s instance variable and method tables into this class.

(C) Make the including class’s superclass (super) the superclass of this new class.

In other words, it looks like this function creates an include class which we can regard as something like an “avatar” of the module. The important point is that at (B) only the pointer is copied, without duplicating the table. Later, if a method is added, the module’s body and the include class will still have exactly the same methods (figure 11).
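The consequence of sharing the table instead of copying it is visible from Ruby: a method added to a module after the include is still found. A small sketch with invented names:

```ruby
module GreeterDemo
end

class HostDemo
  include GreeterDemo          # the include class is created here
end

host = HostDemo.new
before = host.respond_to?(:greet)

module GreeterDemo             # reopen the module after the include
  def greet
    "hello"
  end
end

after = host.respond_to?(:greet)
p before
p after
puts host.greet
```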

Figure 11: Include class

If you look closely at (A), the structure type flag is set to T_ICLASS. This seems to be the mark of an include class. This function’s name is include_class_new() so ICLASS’s I must be include.

And if you think about joining what this function and rb_include_module() do, we know that our previous expectations were not wrong. In brief, including is inserting the include class of a module between a class and its superclass (figure 12).

Figure 12: Include

At (D-2) the module is stored in the include class’s klass. At (D-1), the module’s body is taken out… I’d like to say so if possible, but in fact this check does not have any use. The T_ICLASS check is already done at the beginning of this function, so when arriving here there can’t still be a T_ICLASS. Modifications to ruby piled up piece by piece during quite a long period of time so there are quite a few small oversights.

There is one more thing to consider. The include class’s basic.klass is only used to point to the module’s body, so for example calling a method on the include class would be very bad. So include classes must not be seen from Ruby programs. And in practice all methods skip include classes, with no exception.
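Both points, the insertion between a class and its superclass and the hiding of include classes, can be observed from Ruby. A sketch with invented names:

```ruby
module MixinDemo
end

class BaseDemo
end

class DerivedDemo < BaseDemo
  include MixinDemo
end

# The method lookup order shows the module sitting between the class
# and its superclass (the module itself is shown, not the include class).
p DerivedDemo.ancestors.take(3)

# superclass skips the include class and reports the real superclass.
p DerivedDemo.superclass
```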

Simulation

It was complicated so let’s look at a concrete example. I’d like you to look at figure 13 (1). We have the c1 class and the m1 module that includes m2. From there, the changes made to include m1 in c1 are (2) and (3). ims are of course include classes.

Figure 13: Include

rb_include_module (2)

Well, now we can explain the part of rb_include_module() we skipped.

▼ rb_include_module (avoiding double inclusion)

        /* (A) skip if the superclass already includes module */
        for (p = RCLASS(klass)->super; p; p = RCLASS(p)->super) {
            switch (BUILTIN_TYPE(p)) {
              case T_ICLASS:
                if (RCLASS(p)->m_tbl == RCLASS(module)->m_tbl) {
                    if (!superclass_seen) {
                        c = p;  /* the inserting point is moved */
                    }
                    goto skip;
                }
                break;
              case T_CLASS:
                superclass_seen = Qtrue;
                break;
            }
        }

(class.c)

Among the superclasses of klass (p), if a p is T_ICLASS (an include class) and has the same method table as the module we want to include (module), it means that p is an include class of the module. Therefore, it is skipped so as not to include the module twice. However, if this module includes another module (module->super), it is checked once more.

But, because p is a module that has been included once, the modules included by it must also already be included… that’s what I thought for a moment, but we can have the following context:

module M
end
module M2
end
class C
  include M   # M2 is not yet included in M
end           # therefore M2 is not in C's superclasses

module M
  include M2  # as there M2 is included in M,
end
class C
  include M   # I would like here to only add M2
end

To say this conversely, there are cases where the result of an include is not propagated immediately.

For class inheritance, the class’s singleton methods were inherited, but in the case of modules there is no such thing. Therefore the singleton methods of the module are not inherited by the including class (or module). When you want to also inherit singleton methods, the usual way is to override Module#append_features.
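As a rough illustration of that last remark, the sketch below first shows that a module’s singleton method is not visible on the including class, then overrides append_features to hand over a set of methods as well. All names (PlainDemo, SharingDemo, ClassMethods and so on) are invented for the example, and keeping the shared methods in a nested module is just one possible arrangement.

```ruby
module PlainDemo
  def self.helper          # a singleton method of the module
    "from module"
  end
end

class UsesPlainDemo
  include PlainDemo
end

p UsesPlainDemo.respond_to?(:helper)   # the singleton method did not come along

module SharingDemo
  module ClassMethods
    def helper
      "from ClassMethods"
    end
  end

  # Override the hook that include calls on the module.
  def self.append_features(base)
    super                       # keep the default inclusion behaviour
    base.extend(ClassMethods)   # additionally give base the "class methods"
  end
end

class UsesSharingDemo
  include SharingDemo
end

puts UsesSharingDemo.helper
```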

The original work is Copyright © 2002-2004 Minero AOKI.
Translated by Vincent ISAMBART and Clifford Escobar CAOILE.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License.

Ruby Hacking Guide

Translated by Sebastian Krause & ocha-

Chapter 5: Garbage Collection

A conception of an executing program

It’s all of a sudden, but at the beginning of this chapter we’ll learn about the memory space of an executing program. In this chapter we’ll step inside the lower level parts of a computer quite a bit, so without preliminary knowledge it’ll be hard to follow. And it’ll also be necessary for the following chapters. Once we finish this here, the rest will be easier.

Memory Segments

A general C program has the following parts in the memory space:

1. the text area
2. a place for static and global variables
3. the machine stack
4. the heap

The text area is where the code lies. Obviously the second area holds static and global variables. Arguments and local variables of functions pile up on the machine stack. The heap is the place where memory allocated by malloc() lives.

Let’s talk a bit more about number three, the machine stack. Since it is called the machine “stack”, obviously it has a stack structure. In other words, new stuff is piled on top of it one after another. When we actually push values on the stack, each value is a tiny piece such as an int. But logically, there are slightly larger pieces. They are called stack frames.

One stack frame corresponds to one function call. In other words, when there is a function call, one stack frame is pushed. On return, one stack frame is popped. Figure 1 shows a really simplified appearance of the machine stack.

Figure 1: Machine Stack

In this picture, “above” is written above the top of the stack, but it is not necessarily always the case that the machine stack goes from low addresses to high addresses. For instance, on the x86 machine the stack goes from high to low addresses.

alloca()

By using malloc(), we can get an arbitrarily large memory area of the heap. alloca() is the machine stack version of it. But unlike malloc(), it’s not necessary to free the memory allocated with alloca(). Or one should say: it is freed automatically at the moment each function returns. That’s why it’s not possible to use an allocated value as the return value. It’s the same as “You must not return the pointer to a local variable.”

There has not been any difficulty so far. We can consider it something to locally allocate an array whose size can be changed at runtime.

However there exist environments where there is no native alloca(). There are still many who would like to use alloca() even in such an environment, so sometimes a function to do the same thing is written in C. But in that case, only the feature that we don’t have to free it ourselves is implemented, and it does not necessarily allocate the memory on the machine stack. In fact, it often does not. If it were possible, a native alloca() could have been implemented in the first place.

How can one implement alloca() in C? The simplest implementation is: first allocate memory normally with malloc(). Then remember the pair of the function which called alloca() and the assigned address in a global list. After that, check this list whenever alloca() is called; if there is memory allocated for functions that have already finished, free it by using free().

Figure 2: The behavior of an alloca() implemented in C

The missing/alloca.c of ruby is an example of an emulated alloca().

Overview

From here on we can at last talk about the main subject of this chapter: garbage collection.

What is GC?

Objects normally live in memory. Naturally, if a lot of objects are created, a lot of memory is used. If memory were infinite there would be no problem, but in reality there is always a memory limit. That’s why memory which is not used anymore must be collected and recycled. More concretely, the memory received through malloc() must be returned with free().

However, it would require a lot of effort if the management of malloc() and free() were entirely left to programmers. Especially in object oriented programs, because objects refer to each other, it is difficult to tell when to release memory.

That is where garbage collection comes in. Garbage Collection (GC) is a feature to automatically detect and free the memory which has become unnecessary. With garbage collection, the worry “When should I free()?” becomes unnecessary. The ease of writing programs differs considerably depending on whether it exists or not.

By the way, in some book that I’ve read, there’s a description “the thing to tidy up the fragmented usable memory is GC”. This task is called “compaction”. It is compaction because it makes a thing compact. Because compaction makes memory cache hits more frequent, it has some speed-up effect, but it is not the main purpose of GC. The purpose of GC is to collect memory. There are many GCs which collect memory but don’t do compaction. The GC of ruby also does not do compaction.

Then, in what kind of systems is GC available? In C and C++, there’s Boehm GC (http://www.hpl.hp.com/personal/Hans_Boehm/gc) which can be used as an add-on. And for recent languages such as Java, Perl, Python, C# and Eiffel, GC is standard equipment. And of course, Ruby has its GC. Let’s follow the details of ruby’s GC in this chapter. The target file is gc.c.

What does GC do?

Before explaining the GC algorithm, I should explain “what garbage collection is”. In other words, what kind of state of the memory is “the unnecessary memory”?

To make descriptions more concrete, let’s simplify the structure by assuming that there are only objects and links. This would look as shown in Figure 3.

Figure 3: Objects

The objects pointed to by global variables and the objects on the stack of a language are surely necessary. And objects pointed to by instance variables of these objects are also necessary. Furthermore, the objects that are reachable by following links from these objects are also necessary.

To put it more logically, the necessary objects are all objects which can be reached recursively via links from the “surely necessary objects” as the start points. This is depicted in figure 4. What is on the left of the line are all “surely necessary objects”, and the objects which can be reached from them are colored black. These objects colored black are the necessary objects. The rest of the objects can be released.

Figure 4: necessary objects and unnecessary objects

In technical terms, “the surely necessary objects” are called “the roots of GC”. That’s because they are the roots of the tree structures that emerge as a consequence of tracing necessary objects.

Mark and Sweep

GC was first implemented in Lisp. The GC first implemented in Lisp, which means the world’s first GC, is called mark & sweep GC. The GC of ruby is one type of it.

The image of Mark-and-Sweep GC is pretty close to our definition of “necessary object”. First, put “marks” on the root objects. Setting them as the start points, put “marks” on all reachable objects. This is the mark phase.

At the moment when there’s not any reachable object left, check all objects in the object pool and release (sweep) all objects that have not been marked. “Sweep” is the “sweep” of Minesweeper.

There are two advantages.

There does not need to be any (or almost any) concern for garbage collection outside the implementation of GC.
Cycles can also be released. (As for cycles, see also the section “Reference counting”)

There are also two disadvantages.

In order to sweep, every object must be touched at least once.
The load of the GC is concentrated at one point.

When using the emacs editor, "Garbage collecting..." sometimes appears and it completely stops reacting. That is an example of the second disadvantage. But this point can be alleviated by modifying the algorithm (it is called incremental GC).
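The two phases can be sketched as a toy simulation in Ruby. This is not ruby’s gc.c: ToyObject, mark and sweep are names invented here, and the “object pool” is just an array.

```ruby
# Toy object: a name, outgoing links, and a mark bit.
ToyObject = Struct.new(:name, :links, :marked)

# Mark phase: mark obj and everything reachable from it.
def mark(obj)
  return if obj.marked      # already visited; this also makes cycles safe
  obj.marked = true
  obj.links.each { |child| mark(child) }
end

# Sweep phase: touch every object in the pool once and split it into
# marked survivors and unmarked garbage.
def sweep(pool)
  survivors, garbage = pool.partition(&:marked)
  survivors.each { |obj| obj.marked = false }   # reset for the next GC
  [survivors, garbage]
end

a = ToyObject.new("a", [], false)
b = ToyObject.new("b", [], false)
c = ToyObject.new("c", [], false)
d = ToyObject.new("d", [], false)
a.links << b          # root -> a -> b
c.links << d          # c and d form an unreachable cycle
d.links << c

pool  = [a, b, c, d]
roots = [a]

roots.each { |root| mark(root) }
survivors, garbage = sweep(pool)

puts "kept:  #{survivors.map(&:name).join(', ')}"
puts "freed: #{garbage.map(&:name).join(', ')}"
```

Note that the unreachable cycle of c and d is released, which is the second advantage listed above, and that sweep has to visit the whole pool, which is the first disadvantage.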

Stop and Copy

Stop and Copy is a variation of Mark and Sweep. First, prepare several object areas. To simplify this description, assume there are two areas A and B here. And put an “active” mark on one of the areas. When creating an object, create it only in the “active” one. (Figure 5)

Figure 5: Stop and Copy (1)

When the GC starts, follow links from the roots in the same manner as mark-and-sweep. However, move objects to the other area instead of marking them (Figure 6). When all the links have been followed, discard all the elements which remain in A, and make B active next.

Figure 6: Stop and Copy (2)

Stop and Copy also has two advantages:

Compaction happens at the same time as collecting the memory
Since objects that reference each other move closer together, there’s more possibility of hitting the cache.

And also two disadvantages:

The object area needs to be more than twice as big
The positions of objects will be changed

It seems that not everything in this world is positive.

Reference counting

Reference counting differs a bit from the aforementioned GCs: the reach-check code is distributed in several places.

First, attach an integer count to each element. When referring via variables or arrays, the counter of the referenced object is increased. When the reference is dropped, the counter is decreased. When the counter of an object becomes zero, the object is released. This is the method called reference counting (Figure 7).

Figure 7: Reference counting

This method also has two advantages:

The load of GC is distributed over the entire program.
The object that becomes unnecessary is immediately freed.

And also two disadvantages.

The counter handling tends to be forgotten.
When doing it naively, cycles are not released.

I’ll explain the second point just in case. A cycle is a cycle of references as shown in Figure 8. If this is the case the counters will never reach zero and the objects will never be released.

Figure 8: Cycle
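Both the basic scheme and the cycle problem can be reproduced with a small sketch. Everything here is hypothetical (RcObj, rc_new, rc_retain, rc_release, rc_set_ref are names I made up for illustration), and each object holds at most one reference to keep the code short.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct RcObj {
    int count;           /* number of references to this object */
    struct RcObj *ref;   /* at most one outgoing reference */
} RcObj;

int rc_live = 0;         /* number of objects currently allocated */

RcObj *rc_new(void)
{
    RcObj *o = malloc(sizeof *o);
    if (o == NULL) abort();
    o->count = 1;        /* the creator holds the first reference */
    o->ref = NULL;
    rc_live++;
    return o;
}

void rc_retain(RcObj *o) { if (o) o->count++; }

void rc_release(RcObj *o)
{
    if (o == NULL) return;
    if (--o->count == 0) {       /* nobody refers to it any more */
        rc_release(o->ref);      /* drop what it was referring to */
        free(o);
        rc_live--;
    }
}

/* Stores target into holder->ref, adjusting both counters. */
void rc_set_ref(RcObj *holder, RcObj *target)
{
    rc_retain(target);
    rc_release(holder->ref);
    holder->ref = target;
}

/* a -> b: releasing a frees both. Returns the number of live objects. */
int rc_demo_chain(void)
{
    RcObj *a = rc_new(), *b = rc_new();
    rc_set_ref(a, b);
    rc_release(b);       /* drop our own reference to b */
    rc_release(a);       /* a reaches zero, which releases b as well */
    return rc_live;
}

/* a <-> b: after we drop our references both counters stay at 1. */
int rc_demo_cycle(void)
{
    RcObj *a = rc_new(), *b = rc_new();
    int leaked;
    rc_set_ref(a, b);
    rc_set_ref(b, a);
    rc_release(a);
    rc_release(b);
    leaked = rc_live;    /* objects that were never released */
    free(a); free(b);    /* clean up by hand so the demo itself is tidy */
    rc_live = 0;
    return leaked;
}
```

The chain is released completely, while the cycle leaves two objects behind until something other than the counters breaks it.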

By the way, the latest Python (2.2) uses reference counting GC but it can free cycles. However, this is not because of the reference counting itself, but because it sometimes invokes mark and sweep GC to check.

Object Management

Ruby’s garbage collection is only concerned with ruby objects. Moreover, it is only concerned with the objects created and managed by ruby. Conversely speaking, if memory is allocated without following a certain procedure, it won’t be taken care of. For instance, the following function will cause a memory leak even if ruby is running.

void not_ok()
{
    malloc(1024);  /* receive memory and discard it */
}

However, the following function does not cause a memory leak.

void this_is_ok()
{
    rb_ary_new();  /* create a ruby array and discard it */
}

Since rb_ary_new() uses Ruby’s proper interface to allocate memory, the created object is under the management of the GC of ruby, thus ruby will take care of it.

struct RVALUE

Since the substance of an object is a struct, managing objects means managing those structs. Of course the non-pointer objects like Fixnum, Symbol, nil, true and false are exceptions, but I won’t mention this every time, to keep the descriptions from becoming redundant.

Each struct type has a different size, but probably in order to keep management simpler, a union of all the structs of built-in classes is declared and the union is always used when dealing with memory. The declaration of that union is as follows.

▼ RVALUE

typedef struct RVALUE {
    union {
        struct {
            unsigned long flags;   /* 0 if not used */
            struct RVALUE *next;
        } free;
        struct RBasic  basic;
        struct RObject object;
        struct RClass  klass;
        struct RFloat  flonum;
        struct RString string;
        struct RArray  array;
        struct RRegexp regexp;
        struct RHash   hash;
        struct RData   data;
        struct RStruct rstruct;
        struct RBignum bignum;
        struct RFile   file;
        struct RNode   node;
        struct RMatch  match;
        struct RVarmap varmap;
        struct SCOPE   scope;
    } as;
} RVALUE;

(gc.c)

struct RVALUE is a struct that has only one element. I’ve heard that the reason why the union is not used directly is to make it easy to add members when debugging or when extending it in the future.

First, let’s focus on the first element of the union, free.flags. The comment says “0 if not used”, but is it true? Is there no possibility for free.flags to be 0 by chance?

As we’ve seen in Chapter 2: Objects, all object structs have struct RBasic as their first element. Therefore, whichever element of the union we access, obj->as.free.flags means the same as obj->as.basic.flags. And objects always have the struct-type flag (such as T_STRING), and that flag is never 0. Therefore, the flags of an “alive” object will never coincidentally be 0. Hence, we can confirm that setting flags to 0 is a necessary and sufficient way to represent “dead” objects.
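The aliasing argument can be checked with a cut-down model. The types below (ToyBasic, ToyString, ToyRVALUE) and the tag value TOY_T_STRING are stand-ins I invented; only the layout idea, a flags word at the start of every member, is taken from the text.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_T_STRING 0x07UL   /* any non-zero type tag will do */

struct ToyBasic  { unsigned long flags; void *klass; };
struct ToyString { struct ToyBasic basic; long len; char *ptr; };

typedef struct ToyRVALUE {
    union {
        struct { unsigned long flags; struct ToyRVALUE *next; } free;
        struct ToyBasic  basic;
        struct ToyString string;
    } as;
} ToyRVALUE;

/* A slot is "dead" exactly when its flags word is 0. */
int toy_is_free(const ToyRVALUE *slot)
{
    return slot->as.free.flags == 0;
}

/* Returns 1 when all the claims made in the text hold for this model. */
int toy_flags_demo(void)
{
    ToyRVALUE slot;

    /* a live String-like object always carries a non-zero type tag */
    slot.as.string.basic.flags = TOY_T_STRING;
    slot.as.string.basic.klass = NULL;
    slot.as.string.len = 0;
    slot.as.string.ptr = NULL;
    if (toy_is_free(&slot)) return 0;

    /* the same word is visible through every member of the union */
    if (slot.as.free.flags != slot.as.basic.flags) return 0;

    /* turning the slot into a free cell */
    slot.as.free.flags = 0;
    slot.as.free.next = NULL;
    return toy_is_free(&slot);
}
```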

Object heap

The memory for all the object structs is brought together in the global variable heaps. Hereafter, let’s call this the object heap.

▼ Object heap

#define HEAPS_INCREMENT 10
static RVALUE **heaps;
static int heaps_length = 0;
static int heaps_used   = 0;

#define HEAP_MIN_SLOTS 10000
static int *heaps_limits;
static int heap_slots = HEAP_MIN_SLOTS;

(gc.c)

heaps is an array of arrays of struct RVALUE. Since it is heapS, each contained array is probably a heap. Each element of a heap is a slot (Figure 9).

Figure 9: heaps, heap, slot

The length of heaps is heaps_length and it can be changed. The number of its entries actually in use (that is, the number of heaps allocated so far) is heaps_used. The length of each heap is in the corresponding heaps_limits[index]. Figure 10 shows the structure of the object heap.

Figure 10: conceptual diagram of heaps in memory

There is a reason why the structure has to be this way. For instance, if all structs were stored in a single array, the memory space would be the most compact, but we could not realloc() it because that could change the addresses. This is because VALUEs are mere pointers.

In the case of an implementation of Java, the counterparts of VALUEs are not addresses but indexes of objects. Since they are handled through a pointer table, objects are movable. However in this case, indexing of the array comes in every time an object access occurs and it lowers the performance to some degree.

On the other hand, what happens if it is a one-dimensional array of pointers to RVALUEs (that is, VALUEs)? This seems to work well at first glance, but it does not when it comes to GC. That is, as I’ll describe in detail later, the GC of ruby needs to recognize the integers “which seem to be VALUEs” (pointers to RVALUE). If all RVALUEs are allocated at addresses which are far from each other, it needs to compare all addresses of RVALUEs with all integers “which could be pointers”. This means the time for GC becomes of the order of O(n^2) or more, which is not acceptable.

According to these requirements, it is good that the object heap forms a structure in which the addresses are cohesive to some extent and whose position and total amount are not restricted at the same time.

freelist

Unused RVALUEs are managed by being linked into a single line, a linked list that starts with freelist. The as.free.next of RVALUE is the link used for this purpose.

▼ freelist

static RVALUE *freelist = 0;

(gc.c)

add_heap()

Now that we understand the data structure, let’s read the function add_heap(), which adds a heap. Because this function contains a lot of lines that are not part of the main line, I’ll show a version simplified by omitting error handling and casts.

▼ add_heap() (simplified)

static void
add_heap()
{
    RVALUE *p, *pend;

    /* extend heaps if necessary */
    if (heaps_used == heaps_length) {
        heaps_length += HEAPS_INCREMENT;
        heaps        = realloc(heaps,        heaps_length * sizeof(RVALUE*));
        heaps_limits = realloc(heaps_limits, heaps_length * sizeof(int));
    }

    /* increase heaps by 1 */
    p = heaps[heaps_used] = malloc(sizeof(RVALUE) * heap_slots);
    heaps_limits[heaps_used] = heap_slots;
    pend = p + heap_slots;
    if (lomem == 0 || lomem > p) lomem = p;
    if (himem < pend) himem = pend;
    heaps_used++;
    heap_slots *= 1.8;

    /* link the allocated RVALUE to freelist */
    while (p < pend) {
        p->as.free.flags = 0;
        p->as.free.next = freelist;
        freelist = p;
        p++;
    }
}

Please check the following points.

the length of a heap is heap_slots
heap_slots becomes 1.8 times larger every time a heap is added
the length of heaps[i] (the value of heap_slots when the heap was created) is stored in heaps_limits[i]

Plus, since lomem and himem are modified only by this function, this function alone is enough to understand their mechanism. These variables hold the lowest and the highest addresses of the object heap. These values are used later when determining the integers “which seem to be VALUEs”.

rb_newobj()

Considering all of the above points, we can tell the way to create an object in a second. If there is at least one RVALUE linked from freelist, we can use it. Otherwise, do GC or increase the heaps. Let’s confirm this by reading the rb_newobj() function, which creates an object.

▼ rb_newobj()

VALUE
rb_newobj()
{
    VALUE obj;

    if (!freelist) rb_gc();

    obj = (VALUE)freelist;
    freelist = freelist->as.free.next;
    MEMZERO((void*)obj, RVALUE, 1);
    return obj;
}

(gc.c)

If freelist is 0, in other words, if there are no unused structs, invoke GC and create space. Even if we could not collect any object, there’s no problem because in that case a new space is allocated in rb_gc(). Then take a struct from freelist, zero-fill it with MEMZERO(), and return it.
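To see how growing the heaps and popping the free list fit together, here is a self-contained toy version. It is a sketch under simplifying assumptions of my own: the names (Slot, toy_add_heap, toy_newobj and the toy_ variables) are hypothetical, the heaps table has a fixed capacity, the growth factor is 2 instead of 1.8, and no GC is attempted before a heap is added.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct Slot {
    unsigned long flags;   /* 0 means "free" */
    struct Slot *next;     /* free-list link */
} Slot;

#define TOY_MAX_HEAPS 8

Slot *toy_heaps[TOY_MAX_HEAPS];
int   toy_limits[TOY_MAX_HEAPS];
int   toy_heaps_used = 0;
int   toy_heap_slots = 4;          /* tiny on purpose */
Slot *toy_freelist = NULL;
uintptr_t toy_lomem = 0, toy_himem = 0;

void toy_add_heap(void)
{
    Slot *p, *pend;

    if (toy_heaps_used == TOY_MAX_HEAPS) abort();
    p = malloc(sizeof(Slot) * toy_heap_slots);
    if (p == NULL) abort();
    toy_heaps[toy_heaps_used] = p;
    toy_limits[toy_heaps_used] = toy_heap_slots;
    pend = p + toy_heap_slots;

    /* remember the lowest and highest addresses ever handed out */
    if (toy_lomem == 0 || toy_lomem > (uintptr_t)p) toy_lomem = (uintptr_t)p;
    if (toy_himem < (uintptr_t)pend) toy_himem = (uintptr_t)pend;

    toy_heaps_used++;
    toy_heap_slots *= 2;           /* the next heap will be larger */

    /* link every new slot into the free list */
    for (; p < pend; p++) {
        p->flags = 0;
        p->next = toy_freelist;
        toy_freelist = p;
    }
}

Slot *toy_newobj(void)
{
    Slot *obj;

    if (toy_freelist == NULL) toy_add_heap();
    obj = toy_freelist;
    toy_freelist = obj->next;
    obj->flags = 1;                /* any non-zero value marks it as in use */
    obj->next = NULL;
    return obj;
}
```

Allocating five objects exhausts the first four-slot heap, so the fifth call adds a second heap of eight slots.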

Mark

As described, ruby’s GC is Mark & Sweep. Its “mark” is, concretely speaking, to set a FL_MARK flag: look for the VALUEs in use, set the FL_MARK flag on the ones found, then, after all of them have been investigated, look through the object heap and free the objects on which FL_MARK has not been set.

rb_gc_mark()

rb_gc_mark() is the function to mark objects recursively.

▼ rb_gc_mark()

void
rb_gc_mark(ptr)
    VALUE ptr;
{
    int ret;
    register RVALUE *obj = RANY(ptr);

    if (rb_special_const_p(ptr)) return; /* special const not marked */
    if (obj->as.basic.flags == 0) return;       /* free cell */
    if (obj->as.basic.flags & FL_MARK) return;  /* already marked */

    obj->as.basic.flags |= FL_MARK;

    CHECK_STACK(ret);
    if (ret) {
        if (!mark_stack_overflow) {
            if (mark_stack_ptr - mark_stack < MARK_STACK_MAX) {
                *mark_stack_ptr = ptr;
                mark_stack_ptr++;
            }
            else {
                mark_stack_overflow = 1;
            }
        }
    }
    else {
        rb_gc_mark_children(ptr);
    }
}

(gc.c)

The definition of RANY() is as follows. It is not particularly important.

▼ RANY()

#define RANY(o) ((RVALUE*)(o))

(gc.c)

At the beginning there are the checks for non-pointers and already-freed objects, and the check for already-marked objects that stops the recursion. Then, with

obj->as.basic.flags |= FL_MARK;

obj (this is the ptr parameter of this function) is marked. Next, it’s the turn to follow the references from obj and mark them. rb_gc_mark_children() does it.

The rest, which starts with CHECK_STACK() and takes up many lines, is a device to prevent machine stack overflow. Since rb_gc_mark() uses recursive calls to mark objects, if there is a big object cluster, it is possible to run out of machine stack. To counter that, if the machine stack is about to overflow, it stops the recursive calls, piles up the objects on a global list, and marks them again later. I won’t go into this code because it is not part of the main line.
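The idea of postponing work when the recursion gets too deep can be illustrated as follows. This is a sketch, not ruby’s implementation: DM_DEPTH_LIMIT stands in for CHECK_STACK(), all dm_ names are hypothetical, and the toy simply asserts instead of handling an overflow of the pending list.

```c
#include <assert.h>
#include <stddef.h>

#define DM_DEPTH_LIMIT 4     /* stand-in for "the machine stack is nearly full" */
#define DM_STACK_SIZE  64

typedef struct DmNode {
    int marked;
    struct DmNode *next;     /* a single child is enough to form a long chain */
} DmNode;

DmNode *dm_pending[DM_STACK_SIZE];
int     dm_pending_len = 0;
int     dm_max_depth = 0;    /* deepest recursion level actually reached */

void dm_mark(DmNode *n, int depth);

void dm_mark_children(DmNode *n, int depth)
{
    dm_mark(n->next, depth + 1);
}

void dm_mark(DmNode *n, int depth)
{
    if (n == NULL || n->marked) return;
    n->marked = 1;
    if (depth > dm_max_depth) dm_max_depth = depth;

    if (depth >= DM_DEPTH_LIMIT) {
        /* too deep: remember the object, mark its children later */
        assert(dm_pending_len < DM_STACK_SIZE);
        dm_pending[dm_pending_len++] = n;
    } else {
        dm_mark_children(n, depth);
    }
}

void dm_mark_from_root(DmNode *root)
{
    dm_mark(root, 0);
    while (dm_pending_len > 0) {          /* drain the postponed objects */
        DmNode *n = dm_pending[--dm_pending_len];
        dm_mark_children(n, 0);           /* restart from a shallow stack */
    }
}

/* Marks a chain of 20 nodes and returns how many ended up marked. */
int dm_demo(void)
{
    static DmNode chain[20];
    int i, marked = 0;

    for (i = 0; i < 20; i++) {
        chain[i].marked = 0;
        chain[i].next = (i + 1 < 20) ? &chain[i + 1] : NULL;
    }
    dm_pending_len = 0;
    dm_max_depth = 0;
    dm_mark_from_root(&chain[0]);
    for (i = 0; i < 20; i++) marked += chain[i].marked;
    return marked;
}
```

All 20 nodes get marked although the recursion never goes deeper than DM_DEPTH_LIMIT.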

rb_gc_mark_children()

Now, as for rb_gc_mark_children(), it just lists up the internal types and marks them one by one, so it is not only long but also uninteresting. It is shown here with the simple enumerations omitted:

▼ rb_gc_mark_children()

void
rb_gc_mark_children(ptr)
    VALUE ptr;
{
    register RVALUE *obj = RANY(ptr);

    if (FL_TEST(obj, FL_EXIVAR)) {
        rb_mark_generic_ivar((VALUE)obj);
    }

    switch (obj->as.basic.flags & T_MASK) {
      case T_NIL:
      case T_FIXNUM:
        rb_bug("rb_gc_mark() called for broken object");
        break;

      case T_NODE:
        mark_source_filename(obj->as.node.nd_file);
        switch (nd_type(obj)) {
          case NODE_IF:         /* 1,2,3 */
          case NODE_FOR:
          case NODE_ITER:
                /* ………… omitted ………… */
        }
        return;   /* not need to mark basic.klass */
    }

    rb_gc_mark(obj->as.basic.klass);
    switch (obj->as.basic.flags & T_MASK) {
      case T_ICLASS:
      case T_CLASS:
      case T_MODULE:
        rb_gc_mark(obj->as.klass.super);
        rb_mark_tbl(obj->as.klass.m_tbl);
        rb_mark_tbl(obj->as.klass.iv_tbl);
        break;

      case T_ARRAY:
        if (FL_TEST(obj, ELTS_SHARED)) {
            rb_gc_mark(obj->as.array.aux.shared);
        }
        else {
            long i, len = obj->as.array.len;
            VALUE *ptr = obj->as.array.ptr;

            for (i=0; i < len; i++) {
                rb_gc_mark(*ptr++);
            }
        }
        break;

            /* ………… omitted ………… */

      default:
        rb_bug("rb_gc_mark(): unknown data type 0x%x(0x%x) %s",
               obj->as.basic.flags & T_MASK, obj,
               is_pointer_to_heap(obj) ? "corrupted object"
                                       : "non object");
    }
}

(gc.c)

The only thing I’d like you to confirm is that it calls rb_gc_mark() recursively. In the omitted parts, NODE and T_xxxx are enumerated respectively. NODE will be introduced in Part 2.

Additionally, let’s see the part that marks T_DATA (the struct used for extension libraries) because there’s something we’d like to check. This code is extracted from the second switch statement.

▼ rb_gc_mark_children() – T_DATA

      case T_DATA:
        if (obj->as.data.dmark) (*obj->as.data.dmark)(DATA_PTR(obj));
        break;

(gc.c)

Here, it does not use rb_gc_mark() or similar functions, but the dmark function given by the user. Inside it, of course, rb_gc_mark() or something similar might be used, but not using it is also possible. For example, in an extreme situation, if a user-defined object does not contain any VALUE, there’s no need to mark.

rb_gc()

By now, we’ve finished talking about each object. From now on, let’s see the function rb_gc() that presides over the whole. The objects marked here are “objects which are obviously necessary”. In other words, “the roots of GC”.

▼ rb_gc()

void
rb_gc()
{
    struct gc_list *list;
    struct FRAME * volatile frame; /* gcc 2.7.2.3 -O2 bug??  */
    jmp_buf save_regs_gc_mark;
    SET_STACK_END;

    if (dont_gc || during_gc) {
        if (!freelist) {
            add_heap();
        }
        return;
    }

    /* …… mark from the all roots …… */

    gc_sweep();
}

(gc.c)

The roots which should be marked will be shown one by one after this, but I’d like to mention just one point here.

In ruby the CPU registers and the machine stack are also roots. It means that the local variables and arguments of C are automatically marked. For example,

static int
f(void)
{
    VALUE arr = rb_ary_new();

    /* …… do various things …… */
}

in this way, we can protect an object just by putting it into a variable. This is a very significant trait of the GC of ruby. Because of this feature, ruby’s extension libraries are insanely easy to write.

However, what is on the stack is not only VALUEs. There are a lot of totally unrelated values. How to resolve this is the key when reading the implementation of GC.

The Ruby Stack

First, it marks the (ruby’s) stack frames used by the interpreter. Since you will be able to find out what they are after reaching Part 3, you don’t have to think so much about them for now.

▼ Marking the Ruby Stack

    /* mark frame stack */
    for (frame = ruby_frame; frame; frame = frame->prev) {
        rb_gc_mark_frame(frame);
        if (frame->tmp) {
            struct FRAME *tmp = frame->tmp;
            while (tmp) {
                rb_gc_mark_frame(tmp);
                tmp = tmp->prev;
            }
        }
    }
    rb_gc_mark((VALUE)ruby_class);
    rb_gc_mark((VALUE)ruby_scope);
    rb_gc_mark((VALUE)ruby_dyna_vars);

(gc.c)

ruby_frame, ruby_class, ruby_scope and ruby_dyna_vars are the variables that point to the top of each stack of the evaluator. They hold the frame, the class scope, the local variable scope, and the block local variables at that time, respectively.

Register

Next, it marks the CPU registers.

▼ marking the registers

    FLUSH_REGISTER_WINDOWS;
    /* Here, all registers must be saved into jmp_buf. */
    setjmp(save_regs_gc_mark);
    mark_locations_array((VALUE*)save_regs_gc_mark,
                         sizeof(save_regs_gc_mark) / sizeof(VALUE *));

(gc.c)

FLUSH_REGISTER_WINDOWS is special. We will see it later.

setjmp() is essentially a function for non-local jumps, but as a side effect the content of the registers is saved into its argument (which is a variable of type jmp_buf). Making use of this, it attempts to mark the content of the registers. Things around here really look like secret techniques.
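The trick of treating a jmp_buf as a plain array of words can be tried in isolation. The sketch below is mine, not ruby’s code: count_candidates and scan_saved_registers are hypothetical names, and what a jmp_buf actually contains is up to the C library (some libraries store certain registers in an encoded form), so the result of the scan is only illustrative.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Counts the words in words[0..n) whose value lies in [lo, hi). */
size_t count_candidates(const uintptr_t *words, size_t n,
                        uintptr_t lo, uintptr_t hi)
{
    size_t i, hits = 0;
    for (i = 0; i < n; i++) {
        if (lo <= words[i] && words[i] < hi) hits++;
    }
    return hits;
}

#define REG_WORDS (sizeof(jmp_buf) / sizeof(uintptr_t))

/* Saves the registers with setjmp() and scans the saved words for
   values that fall into the address range [lo, hi). */
size_t scan_saved_registers(uintptr_t lo, uintptr_t hi)
{
    jmp_buf regs;
    uintptr_t words[REG_WORDS + 1];

    memset(regs, 0, sizeof regs);
    setjmp(regs);                      /* side effect: registers are saved */
    memcpy(words, regs, REG_WORDS * sizeof(uintptr_t));
    return count_candidates(words, REG_WORDS, lo, hi);
}
```

A collector would pass the lowest and highest heap addresses as lo and hi and then examine each hit more closely.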

However, only djgpp and Human68k are specially treated. djgpp is a gcc environment for DOS. Human68k is an OS of the SHARP X680x0 Series. In these two environments, the ordinary setjmp() does not seem to save all the registers, so setjmp() is redefined as follows with inline assembler to explicitly write out the registers.

▼ the original version of setjmp

#ifdef __GNUC__
#if defined(__human68k__) || defined(DJGPP)
#if defined(__human68k__)
typedef unsigned long rb_jmp_buf[8];
__asm__ (".even\n\                   2-byte alignment
_rb_setjmp:\n\                       the label of rb_setjmp() function
        move.l  4(sp),a0\n\          load the first argument to the a0 register
        movem.l d3-d7/a3-a5,(a0)\n\  copy the registers to where a0 points to
        moveq.l #0,d0\n\             set 0 to d0 (as the return value)
        rts");                       return
#ifdef setjmp
#undef setjmp
#endif
#else
#if defined(DJGPP)
typedef unsigned long rb_jmp_buf[6];
__asm__ (".align 4\n\                order 4-byte alignment
_rb_setjmp:\n\                       the label for rb_setjmp() function
        pushl   %ebp\n\              push ebp to the stack
        movl    %esp,%ebp\n\         set the stack pointer to ebp
        movl    8(%ebp),%ebp\n\      pick up the first argument and set to ebp
        movl    %eax,(%ebp)\n\       in the followings, store each register
        movl    %ebx,4(%ebp)\n\      to where ebp points to
        movl    %ecx,8(%ebp)\n\
        movl    %edx,12(%ebp)\n\
        movl    %esi,16(%ebp)\n\
        movl    %edi,20(%ebp)\n\
        popl    %ebp\n\              restore ebp from the stack
        xorl    %eax,%eax\n\         set 0 to eax (as the return value)
        ret");                       return
#endif
#endif
int rb_setjmp (rb_jmp_buf);
#define jmp_buf rb_jmp_buf
#define setjmp rb_setjmp
#endif /* __human68k__ or DJGPP */
#endif /* __GNUC__ */

(gc.c)

Alignment is the constraint on where variables can be placed in memory. For example, on a 32-bit machine int is usually 32 bits, but we cannot always take 32 bits from an arbitrary position in memory. RISC machines in particular have strict constraints, such as “from a multiple of 4 bytes” or “from an even byte”. When there are such constraints, the memory access unit can be simplified (and thus made faster). When there’s the constraint “from a multiple of 4 bytes”, it is called “4-byte alignment”.

Plus, in the cc of djgpp or Human68k, there’s a rule that the compiler puts an underscore at the head of each function name. Therefore, when writing a C function in assembler, we need to put the underscore (_) at its head ourselves. This kind of rule is a technique to avoid name conflicts with library functions. It is said that UNIX also used to attach the underscore until some time ago, but the practice has almost disappeared now.

Now that the content of the registers has been written out into jmp_buf, it is marked by the following code:

▼ mark the registers (shown again)

    mark_locations_array((VALUE*)save_regs_gc_mark,
                         sizeof(save_regs_gc_mark) / sizeof(VALUE *));

(gc.c)

This is the first time that mark_locations_array() appears. I’ll describe it in the next section.

mark_locations_array()

▼ mark_locations_array()

static void
mark_locations_array(x, n)
    register VALUE *x;
    register long n;
{
    while (n--) {
        if (is_pointer_to_heap((void *)*x)) {
            rb_gc_mark(*x);
        }
        x++;
    }
}

(gc.c)

This function marks all the elements of an array, but it differs slightly from the previous mark functions. Until now, each place to be marked was one where we knew for sure that it holds a VALUE (a pointer to an object). This time, however, what it attempts to mark is the register space, and we can fully expect that there are also values that are not VALUEs. To counter that, it first tries to detect whether or not the value is a VALUE (a pointer), and if it seems to be one, the value is handled as a pointer. This kind of method is called “conservative GC”. It seems to be called conservative because it “tentatively leans toward the safe side”.

Next, we’ll look at the function that checks whether a value “looks like a VALUE”: is_pointer_to_heap().

is_pointer_to_heap()

▼ is_pointer_to_heap()

static inline int
is_pointer_to_heap(ptr)
    void *ptr;
{
    register RVALUE *p = RANY(ptr);
    register RVALUE *heap_org;
    register long i;

    if (p < lomem || p > himem) return Qfalse;

    /* check if there's the possibility that p is a pointer */
    for (i=0; i < heaps_used; i++) {
        heap_org = heaps[i];
        if (heap_org <= p && p < heap_org + heaps_limits[i] &&
            ((((char*)p)-((char*)heap_org))%sizeof(RVALUE)) == 0)
            return Qtrue;
    }
    return Qfalse;
}

(gc.c)

Briefly, it works as follows:

check if it is between the lowest and the highest of the addresses where RVALUEs reside
check if it is in the range of a heap
make sure the value points to the head of an RVALUE

Since the mechanism is like this, it’s obviously possible that a non-VALUE value is mistakenly handled as a VALUE. But at least, it will never fail to find the VALUEs in use. And with this many tests, a non-VALUE value will rarely be picked up unless someone arranges it intentionally. Therefore, considering the benefits we obtain from GC, this is an acceptable compromise.
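The range and alignment tests are easy to experiment with on a miniature heap. The following is a sketch with hypothetical names (Cell, heap_a, heap_b, looks_like_cell); it uses two fixed arrays as heaps and leaves out the quick lomem/himem pre-check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct Cell { unsigned long flags; void *payload; } Cell;

#define N_HEAPS  2
#define HEAP_LEN 4

Cell heap_a[HEAP_LEN], heap_b[HEAP_LEN];
Cell *cand_heaps[N_HEAPS] = { heap_a, heap_b };

/* Returns 1 when word could be the address of a Cell in one of the
   heaps: inside the heap's range and exactly at the head of a cell. */
int looks_like_cell(uintptr_t word)
{
    int i;
    for (i = 0; i < N_HEAPS; i++) {
        uintptr_t base = (uintptr_t)cand_heaps[i];
        uintptr_t end  = base + HEAP_LEN * sizeof(Cell);
        if (base <= word && word < end &&
            (word - base) % sizeof(Cell) == 0)
            return 1;
    }
    return 0;
}
```

An integer that happens to equal the address of a cell is accepted, which is exactly the conservative behaviour described above; an address in the middle of a cell is rejected.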

Register Window

This section is about FLUSH_REGISTER_WINDOWS, which has been deferred.

Register windows are a mechanism that allows a part of the machine stack to be kept inside the CPU. In short, it is a cache with a narrowed-down purpose. Recently, it exists only in the Sparc architecture. It’s possible that there are also VALUEs in register windows, so it’s necessary to flush them out into memory as well.

The content of the macro is like this:

▼ FLUSH_REGISTER_WINDOWS

#if defined(sparc) || defined(__sparc__)
# if defined(linux) || defined(__linux__)
#define FLUSH_REGISTER_WINDOWS  asm("ta  0x83")
# else /* Solaris, not sparc linux */
#define FLUSH_REGISTER_WINDOWS  asm("ta  0x03")
# endif
#else /* Not a sparc */
#define FLUSH_REGISTER_WINDOWS
#endif

(defines.h)

asm(...) is a built-in assembler. However, even though I call it assembler, this instruction named ta is the call of a privileged instruction. In other words, the call is not of the CPU but of the OS. That’s why the instruction is different for each OS. The comments mention only Linux and Solaris, but actually FreeBSD and NetBSD also work on Sparc, so this comment is wrong.

Plus, if it is not Sparc, it is unnecessary to flush, thus FLUSH_REGISTER_WINDOWS is defined as nothing. Turning a macro into nothing like this is a very famous technique that is also convenient when debugging.
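The “macro defined as nothing” technique looks like this in a debugging context. The macro TOY_PROBE, the switch TOY_ENABLE_PROBE and the function toy_work are all hypothetical names for this sketch.

```c
#include <assert.h>

int probe_hits = 0;

#ifdef TOY_ENABLE_PROBE
# define TOY_PROBE (probe_hits++)   /* count how often we pass here */
#else
# define TOY_PROBE                  /* expands to nothing */
#endif

int toy_work(int x)
{
    TOY_PROBE;    /* becomes the empty statement ";" when disabled */
    return x * 2;
}
```

Compiling with -DTOY_ENABLE_PROBE switches the probe on without touching the code that uses it.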

Machine Stack

Then, let’s go back to the rest of rb_gc(). This time, it marks the VALUEs in the machine stack.

▼ mark the machine stack

    rb_gc_mark_locations(rb_gc_stack_start, (VALUE*)STACK_END);
#if defined(__human68k__)
    rb_gc_mark_locations((VALUE*)((char*)rb_gc_stack_start + 2),
                         (VALUE*)((char*)STACK_END + 2));
#endif

(gc.c)

rb_gc_stack_start seems to be the start address (the end of the stack) and STACK_END seems to be the end address (the top). And rb_gc_mark_locations() practically marks the stack space.

rb_gc_mark_locations() appears twice in order to deal with architectures that are not 4-byte aligned. rb_gc_mark_locations() tries to mark each portion of sizeof(VALUE), so in a 2-byte alignment environment it sometimes cannot mark properly. In this case, it shifts the range by 2 bytes and marks again.

Now, rb_gc_stack_start, STACK_END and rb_gc_mark_locations(): let’s examine these three in this order.

Init_stack()

The first thing is rb_gc_stack_start. This variable is set only during Init_stack(). As the name Init_ might suggest, this function is called when the ruby interpreter is initialized.

▼ Init_stack()

void
Init_stack(addr)
    VALUE *addr;
{
#if defined(__human68k__)
    extern void *_SEND;
    rb_gc_stack_start = _SEND;
#else
    VALUE start;

    if (!addr) addr = &start;
    rb_gc_stack_start = addr;
#endif
#ifdef HAVE_GETRLIMIT
    {
        struct rlimit rlim;

        if (getrlimit(RLIMIT_STACK, &rlim) == 0) {
            double space = (double)rlim.rlim_cur*0.2;

            if (space > 1024*1024) space = 1024*1024;
            STACK_LEVEL_MAX = (rlim.rlim_cur - space) / sizeof(VALUE);
        }
    }
#endif
}

(gc.c)

What is important is only the part in the middle. It defines an arbitrary local variable (which is allocated on the stack) and sets its address to rb_gc_stack_start. The _SEND inside the code for __human68k__ is probably a variable defined by a library of the compiler or the system. Naturally, you can presume that it is a contraction of Stack END.

Meanwhile, the code after that, enclosed in HAVE_GETRLIMIT, appears to check the length of the stack and do mysterious things. This is in the same vein as what is done in rb_gc_mark() with CHECK_STACK() to prevent stack overflow. We can ignore this.

STACK_END

Next, we’ll look at STACK_END, which is the macro to detect the end of the stack.

▼ STACK_END

#ifdef C_ALLOCA
# define SET_STACK_END VALUE stack_end; alloca(0);
# define STACK_END (&stack_end)
#else
# if defined(__GNUC__) && defined(USE_BUILTIN_FRAME_ADDRESS)
#  define SET_STACK_END  VALUE *stack_end = __builtin_frame_address(0)
# else
#  define SET_STACK_END  VALUE *stack_end = alloca(1)
# endif
# define STACK_END (stack_end)
#endif

(gc.c)

As there are three variations of SET_STACK_END, let’s start with the bottom one. alloca() allocates a space at the end of the stack and returns it, so the return value and the end address of the stack should be very close. Hence, it considers the return value of alloca() as an approximate value of the end of the stack.

Let’s go back and look at the one at the top. When the macro C_ALLOCA is defined, alloca() is not natively defined; in other words, it indicates that a compatible function is defined in C. I mentioned that in this case alloca() internally allocates memory by using malloc(). However, that does not help at all in getting the position of the stack. To deal with this situation, it assumes that the local variable stack_end of the currently executing function is close to the end of the stack and uses its address (&stack_end).

Plus, this code contains alloca(0), whose purpose is not easy to see. This has been a feature of the alloca() defined in C since early times, and it means “please check and free the unused space”. Since this is used when doing GC, it attempts to free the memory allocated with alloca() at the same time. But I think it would be better to put it in another macro instead of mixing it into such a place…

And at last, the middle case is about __builtin_frame_address(). __GNUC__ is a symbol defined in gcc (the compiler of GNU C). Since the use is limited by this symbol, it is a built-in instruction of gcc. You can get the address of the n-times previous stack frame with __builtin_frame_address(n). As for __builtin_frame_address(0), it provides the address of the current frame.

rb_gc_mark_locations()

The last one is the rb_gc_mark_locations() function that actually marks the stack.

▼ rb_gc_mark_locations()

void
rb_gc_mark_locations(start, end)
    VALUE *start, *end;
{
    VALUE *tmp;
    long n;

    if (start > end) {
        tmp = start;
        start = end;
        end = tmp;
    }
    n = end - start + 1;
    mark_locations_array(start,n);
}

(gc.c)

Basically, delegating to the function mark_locations_array(), which marks a space, is sufficient. What this function does is adjust the arguments properly. Such an adjustment is required because the direction in which the machine stack extends is not fixed. If the machine stack extends toward lower addresses, end is smaller; if it extends toward higher addresses, start is smaller. Therefore, they are swapped here if necessary so that the smaller one becomes start.
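The argument adjustment can be tried out on an ordinary array. This sketch uses a hypothetical function scan_range that counts occurrences of a value instead of marking, so that the effect of swapping start and end is easy to observe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scans the words from start to end (both inclusive), in whichever
   order the two pointers were given, and counts those equal to needle. */
long scan_range(const uintptr_t *start, const uintptr_t *end,
                uintptr_t needle)
{
    const uintptr_t *tmp;
    long n, hits = 0;

    if (start > end) {      /* the "stack" grows toward lower addresses */
        tmp = start;
        start = end;
        end = tmp;
    }
    n = end - start + 1;
    while (n--) {
        if (*start == needle) hits++;
        start++;
    }
    return hits;
}
```

Passing the two ends in either order gives the same result, which is all the caller of rb_gc_mark_locations() relies on.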

The other root objects

Finally, it marks the built-in VALUE containers of the interpreter.

▼ The other roots

    /* mark the registered global variables */
    for (list = global_List; list; list = list->next) {
        rb_gc_mark(*list->varptr);
    }
    rb_mark_end_proc();
    rb_gc_mark_global_tbl();

    rb_mark_tbl(rb_class_tbl);
    rb_gc_mark_trap_list();

    /* mark the instance variables of true, false, etc if exist */
    rb_mark_generic_ivar_tbl();

    /* mark the variables used in the ruby parser (only while parsing) */
    rb_gc_mark_parser();

(gc.c)

When putting a VALUE into a global variable of C, the user is required to register its address via rb_gc_register_address(). As these addresses are saved in global_List, all of the objects are marked.

rb_mark_end_proc() marks the procedural objects which are registered via the END statement and the like in Ruby and executed when a program finishes. (END statements will not be described in this book.)

rb_gc_mark_global_tbl() marks the global variable table rb_global_tbl. (See also the next chapter “Variables and Constants”)

rb_mark_tbl(rb_class_tbl) marks rb_class_tbl, which was discussed in the previous chapter.

rb_gc_mark_trap_list() marks the procedural objects which are registered via Ruby’s function-like method trap. (This is related to signals and will also not be described in this book.)

rb_mark_generic_ivar_tbl() marks the instance variable table prepared for non-pointer VALUEs such as true.

rb_gc_mark_parser() marks the semantic stack of the parser. (The semantic stack will be described in Part 2.)

With this, the mark phase is finished.

Sweep

The special treatment for NODE

The sweep phase is the procedure that finds and frees the unmarked objects. But, for some reason, the objects of type T_NODE are specially treated. Take a look at the next part:

▼ at the beginning of gc_sweep()

static void
gc_sweep()
{
    RVALUE *p, *pend, *final_list;
    int freed = 0;
    int i, used = heaps_used;

    if (ruby_in_compile && ruby_parser_stack_on_heap()) {
        /* If the yacc stack is not on the machine stack,
           do not collect NODE while parsing */
        for (i = 0; i < used; i++) {
            p = heaps[i]; pend = p + heaps_limits[i];
            while (p < pend) {
                if (!(p->as.basic.flags & FL_MARK) &&
                                        BUILTIN_TYPE(p) == T_NODE)
                    rb_gc_mark((VALUE)p);
                p++;
            }
        }
    }

(gc.c)

NODE is an object that expresses a program in the parser. While compiling, NODEs are put on the stack prepared by a tool named yacc, but that stack is not always on the machine stack. Concretely speaking, when ruby_parser_stack_on_heap() is true, it indicates that the stack is not on the machine stack. In this case, a NODE could be accidentally collected in the middle of its creation, thus the objects of type T_NODE are unconditionally marked and protected from being collected while compiling (ruby_in_compile).

Finalizer

After it has reached here, all unmarked objects can be freed. However, there’s one thing to do before freeing. In Ruby the freeing of objects can be hooked, and it is necessary to call these hooks. Such a hook is called a “finalizer”.

▼ gc_sweep() Middle

    freelist = 0;
    final_list = deferred_final_list;
    deferred_final_list = 0;
    for (i = 0; i < used; i++) {
        int n = 0;

        p = heaps[i]; pend = p + heaps_limits[i];
        while (p < pend) {
            if (!(p->as.basic.flags & FL_MARK)) {
(A)             if (p->as.basic.flags) {
                    obj_free((VALUE)p);
                }
(B)             if (need_call_final && FL_TEST(p, FL_FINALIZE)) {
                    p->as.free.flags = FL_MARK; /* remains marked */
                    p->as.free.next = final_list;
                    final_list = p;
                }
                else {
                    p->as.free.flags = 0;
                    p->as.free.next = freelist;
                    freelist = p;
                }
                n++;
            }
(C)         else if (RBASIC(p)->flags == FL_MARK) {
                /* the objects that need to finalize */
                /* are left untouched */
            }
            else {
                RBASIC(p)->flags &= ~FL_MARK;
            }
            p++;
        }
        freed += n;
    }
    if (freed < FREE_MIN) {
        add_heap();
    }
    during_gc = 0;

(gc.c)

This goes through the whole object heap from the beginning, and frees the objects on which the FL_MARK flag is not set by using obj_free() (A). obj_free() frees, for instance, only the char[] used by String objects or the VALUE[] used by Array objects; it does not free the RVALUE struct and does not touch basic.flags at all. Therefore, even if a struct is manipulated after obj_free() is called, there’s no worry about crashing.

After it frees the objects, it branches based on the FL_FINALIZE flag (B). If FL_FINALIZE is set on an object, it means at least one finalizer is defined on the object, so the object is added to final_list. Otherwise, the object is immediately added to freelist. When finalizing, basic.flags becomes FL_MARK. The struct-type flag (such as T_STRING) is cleared because of this, so the object can be distinguished from alive objects.

Then, this phase completes by executing all the finalizers. Notice that the hooked objects have already died when the finalizers are called. It means that while executing the finalizers, one cannot use the hooked objects.

▼ gc_sweep() the rest

    if (final_list) {
        RVALUE *tmp;

        if (rb_prohibit_interrupt || ruby_in_compile) {
            deferred_final_list = final_list;
            return;
        }

        for (p = final_list; p; p = tmp) {
            tmp = p->as.free.next;
            run_final((VALUE)p);
            p->as.free.flags = 0;
            p->as.free.next = freelist;
            freelist = p;
        }
    }
}

(gc.c)

The for in the last half is the main finalizing procedure. The if in the first half handles the case where execution could not be passed to the Ruby program for various reasons. The objects whose finalization is deferred will appear in route (C) of the previous listing.
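The flow of the sweep, including the detour through the finalizer list, can be condensed into a toy version. The flag values and all names (FCell, fheap, toy_sweep, finalizers_run) are assumptions made for this sketch, a counter stands in for actually running a finalizer, and the deferred route (C) is left out.

```c
#include <assert.h>
#include <stddef.h>

#define F_MARK     0x1UL
#define F_FINALIZE 0x2UL
#define F_TYPE     0x10UL   /* stands in for a type tag such as T_STRING */

typedef struct FCell { unsigned long flags; struct FCell *next; } FCell;

#define FHEAP_LEN 4
FCell  fheap[FHEAP_LEN];
FCell *f_freelist = NULL;
int    finalizers_run = 0;

/* Sweeps fheap and returns the number of unmarked cells it found. */
int toy_sweep(void)
{
    FCell *final_list = NULL, *p, *tmp;
    int i, freed = 0;

    f_freelist = NULL;
    for (i = 0; i < FHEAP_LEN; i++) {
        p = &fheap[i];
        if (!(p->flags & F_MARK)) {
            if (p->flags & F_FINALIZE) {
                p->flags = F_MARK;       /* dead, but not yet reusable */
                p->next = final_list;
                final_list = p;
            } else {
                p->flags = 0;
                p->next = f_freelist;
                f_freelist = p;
            }
            freed++;
        } else {
            p->flags &= ~F_MARK;         /* survivor: clear the mark */
        }
    }

    /* run the finalizers, then hand the cells over to the free list */
    for (p = final_list; p; p = tmp) {
        tmp = p->next;
        finalizers_run++;
        p->flags = 0;
        p->next = f_freelist;
        f_freelist = p;
    }
    return freed;
}

int toy_sweep_demo(void)
{
    fheap[0].flags = F_TYPE | F_MARK;       /* reachable object */
    fheap[1].flags = F_TYPE;                /* garbage */
    fheap[2].flags = F_TYPE | F_FINALIZE;   /* garbage with a finalizer */
    fheap[3].flags = 0;                     /* already a free cell */
    finalizers_run = 0;
    return toy_sweep();
}
```

As in the listing above, cells that were already free are counted as well, because their flags word has no mark bit.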

rb_gc_force_recycle()

I’ll talk about a slightly different thing at the end. Until now, ruby’s garbage collector has decided whether or not to collect each object, but there’s also a way for users to explicitly let it collect a particular object. It’s rb_gc_force_recycle().

▼ rb_gc_force_recycle()

void
rb_gc_force_recycle(p)
    VALUE p;
{
    RANY(p)->as.free.flags = 0;
    RANY(p)->as.free.next = freelist;
    freelist = RANY(p);
}

(gc.c)

Its mechanism is not so special, but I introduced it because you’ll see it several times in Part 2 and Part 3.

Discussions

To free spaces

The space allocated by an individual object, say, the char[] of a String, is freed during the sweep phase, but the code to free the RVALUE struct itself has not appeared yet. And the object heap also does not manage the number of structs in use and such. This means that once ruby’s object space is allocated it is never freed.

For example, the mailer that I’m creating now temporarily uses almost 40 Mbytes when constructing the threads for 500 mails, and even if most of that space becomes unused as a consequence of GC, it will keep occupying the 40 Mbytes. Because my machine is also kind of modern, it does not matter if just 40 Mbytes are used. But if this occurs in a server which keeps running, there’s the possibility of it becoming a problem.

However, one also needs to consider that free() does not always mean a decrease in the amount of memory in use. If it does not return memory to the OS, the amount of memory in use by the process never decreases. And, depending on the implementation of malloc(), doing free() often does not return memory to the OS.

… I had written so, but just before the deadline of this book, RVALUE came to be freed. The attached CD-ROM also contains the edge ruby, so please check it with diff. … What a sad ending.

Generational GC

Mark & Sweep has a weak point: it “needs to touch the entire object space at least once”. The idea of generational GC may make up for this weak point.

The foundation of generational GC is the empirical rule that “most objects live either very long or very short”. You may be convinced of this by thinking for a few seconds about the programs you write.

Thinking based on this rule, one may come up with the idea that “long-lived objects do not need to be marked or swept each and every time”. Once an object is judged to be long-lived, it is treated specially and excluded from the GC targets. This can significantly decrease the number of target objects for both marking and sweeping. For example, if half of the objects are long-lived at a particular GC, the number of target objects is halved.

There’s a problem, though. Generational GC is very difficult to do if objects can’t be moved. This is because the long-lived objects need, as I just wrote, to “be treated specially”. Generational GC reduces cost by decreasing the number of objects dealt with, so if the generation an object belongs to is not clearly categorized, the result is equivalent to dealing with both generations. Furthermore, ruby’s GC is also a conservative GC, so it has to be built so that is_pointer_to_heap() still works. This is particularly difficult.

How can this problem be solved? … Mr. Kiyama Masato has published an implementation of generational GC for ruby. I’ll briefly describe how this patch deals with each problem. By courtesy of Mr. Kiyama, this generational GC patch and its paper are included on the attached CD-ROM. (See also doc/generational-gc.html)

Now let me start the explanation. To make explaining easier, from now on long-lived objects are called “old-generation objects” and short-lived objects are called “new-generation objects”.

First, the biggest problem: the special treatment of the old-generation objects. This is resolved by linking only the new-generation objects into a list named newlist. This list is realized by adding members to RVALUE.

Second, the way to detect the old-generation objects. This is done very simply: the newlist objects that were not garbage collected are removed from the newlist. In other words, once an object survives a GC, it is treated as an old-generation object.

Third, the way to detect references from old-generation objects to new-generation objects. In generational GC, the old-generation objects are, so to speak, kept in the marked state. Because marking does not descend into objects that are already marked, a new-generation object that is referenced only from the old generation will not be marked. (Figure 11)

Figure 11: reference over generations

This is not good, so at the moment an old-generation object refers to a new-generation object, the new-generation object must be turned into an old-generation one. The patch modifies the libraries and adds checks wherever this kind of reference can occur.
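The three points can be condensed into a toy model. This is only a minimal sketch and not the code of Mr. Kiyama’s patch: struct obj, the in_newlist flag and the functions gen_write_barrier() and gen_promote_survivors() are names I made up for illustration, and promoting everything reachable from the target is my own assumption about how the check would have to behave.

```c
#include <assert.h>
#include <stddef.h>

/* Toy object: one reference slot plus the generation bookkeeping.
   (Hypothetical layout, not ruby's RVALUE.) */
struct obj {
    struct obj *ref;       /* the only reference this object holds */
    int in_newlist;        /* 1 = new generation, 0 = old generation */
    int marked;            /* mark bit for the current GC cycle */
};

/* Third point: when an old-generation object starts to refer to a
   new-generation object, promote the target and what it reaches. */
static void gen_write_barrier(struct obj *holder, struct obj *target)
{
    assert(holder != NULL);
    holder->ref = target;
    if (!holder->in_newlist) {
        while (target != NULL && target->in_newlist) {
            target->in_newlist = 0;
            target = target->ref;
        }
    }
}

/* Second point: after a GC, every marked (surviving) new-generation
   object leaves the newlist and is treated as old from then on.
   Returns the number of promoted objects. */
static int gen_promote_survivors(struct obj **objs, size_t n)
{
    int promoted = 0;
    size_t i;
    for (i = 0; i < n; i++) {
        if (objs[i]->in_newlist && objs[i]->marked) {
            objs[i]->in_newlist = 0;
            objs[i]->marked = 0;
            promoted++;
        }
    }
    return promoted;
}
```

Note that gen_write_barrier() has to run on every store of a reference, which is exactly the “check all references” cost of the third point.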

This is the outline of its mechanism. This patch was scheduled to be included in ruby 1.7, but it has not been included yet. The reason is said to be its speed. It is inferred that the cost of the third point, “check all references”, matters, but the precise cause has not been figured out.

Compaction

Could ruby’s GC do compaction? Since a VALUE of ruby is a direct pointer to a struct, if the addresses of structs change because of compaction, all the VALUEs that point to the moved structs must be rewritten.

However, since ruby’s GC is a conservative GC, there can be cases where it is impossible to determine whether something really is a VALUE. If the value is rewritten in such a situation and it was not actually a VALUE, something awful will happen. Compaction and conservative GC are really incompatible.

But let’s devise countermeasures one way or another. The first way is to let VALUE be an object ID instead of a pointer. (Figure 12) This means sandwiching an indirect layer between VALUE and a struct. In this way, since VALUEs need not be rewritten, structs can be safely moved. The trade-offs are that access slows down and compatibility with extension libraries is lost.

Figure 12: reference through the object ID
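To make the indirect layer concrete, here is a minimal sketch of such a table. Nothing like this exists in ruby; OBJ_ID, handle_table, deref() and move_body() are hypothetical names chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_SIZE 16

typedef size_t OBJ_ID;               /* stands in for VALUE */

struct body { int payload; };        /* stands in for an object struct */

/* The indirect layer: object ID -> current address of the struct. */
static struct body *handle_table[TABLE_SIZE];

/* Every access pays one extra lookup. */
static struct body *deref(OBJ_ID id)
{
    assert(id < TABLE_SIZE);
    return handle_table[id];
}

/* Compaction moves the struct and fixes up exactly one table slot;
   no OBJ_ID held anywhere else has to change. */
static void move_body(OBJ_ID id, struct body *to)
{
    assert(id < TABLE_SIZE && handle_table[id] != NULL);
    *to = *handle_table[id];
    handle_table[id] = to;
}
```

The extra lookup in deref() is the slowdown mentioned above, and an extension library that casts a VALUE directly to a struct pointer would no longer work with such a scheme.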

The next way is to allow moving a struct only when it is pointed to solely by pointers that are “surely VALUE” (Figure 13). This method is called mostly-copying garbage collection. In ordinary programs there are not so many objects for which is_pointer_to_heap() is true, so the probability of being able to move the object structs is quite high.

Figure 13: Mostly-copying garbage collection

What is more, being able to move structs also makes the implementation of generational GC simpler. It seems worth trying.

volatile to protect from GC

I wrote that GC takes care of VALUEs on the stack, so if a VALUE is held in a local variable, it should certainly be marked. But in reality, due to the effects of optimization, the variable may disappear. For example, it can disappear in the following case:

    VALUE str;
    str = rb_str_new2("...");
    printf("%s\n", RSTRING(str)->ptr);

Because this code does not access str itself, some compilers keep only str->ptr in memory and discard str. If this happens, str would be collected and the process would crash. There is no choice in this case:

    volatile VALUE str;

We need to write it this way. volatile is a reserved word of C, and it has the effect of forbidding optimizations that involve this variable. If you see volatile in code related to Ruby, you can assume almost certainly that it exists for GC. When I read K&R, I thought “what is the use of this?”, and never expected to see so many of them in ruby.

Considering these aspects, the promise of conservative GC that “users don’t have to care about GC” seems not always true. There was once a discussion that “KSM, a GC for Scheme, does not need volatile”, but it seems it could not be applied to ruby because its algorithm has a hole.

When to invoke

Inside gc.c

When is GC invoked? Inside gc.c, there are three places that call rb_gc():

ruby_xmalloc()

ruby_xrealloc()

rb_newobj()

For ruby_xmalloc() and ruby_xrealloc(), it is when memory allocation fails. Doing GC may free memory, so space may become available again. rb_newobj() is in a similar situation: it invokes GC when the freelist becomes empty.

Inside the interpreter

There are several places outside gc.c in the interpreter that call rb_gc().

First, in io.c and dir.c, GC is invoked when the process runs out of file descriptors and a file could not be opened. If IO objects are garbage collected, files may be closed and file descriptors may become available.

In ruby.c, rb_gc() is sometimes called after loading a file. As I mentioned in the previous Sweep section, this compensates for the fact that NODE cannot be garbage collected during compilation.

Object Creation

We have finished with GC and can now deal with Ruby objects from their creation to their freeing. So I’d like to describe object creation here. This is not so much related to GC; rather, it is somewhat related to the discussion of classes in the previous chapter.

Allocation Framework

We’ve created objects many times. For example, in this way:

    class C
    end
    C.new()

At this time, how does C.new create an object?

First, C.new is actually Class#new. Its actual body is this:

▼ rb_class_new_instance()

    VALUE
    rb_class_new_instance(argc, argv, klass)
        int argc;
        VALUE *argv;
        VALUE klass;
    {
        VALUE obj;

        obj = rb_obj_alloc(klass);
        rb_obj_call_init(obj, argc, argv);

        return obj;
    }

(object.c)

rb_obj_alloc() calls the allocate method on klass. In other words, it calls C.allocate in the example here. By default this is Class#allocate, and its actual body is rb_class_allocate_instance().

▼ rb_class_allocate_instance()

    static VALUE
    rb_class_allocate_instance(klass)
        VALUE klass;
    {
        if (FL_TEST(klass, FL_SINGLETON)) {
            rb_raise(rb_eTypeError,
                     "can't create instance of virtual class");
        }
        if (rb_frame_last_func() != alloc) {
            return rb_obj_alloc(klass);
        }
        else {
            NEWOBJ(obj, struct RObject);
            OBJSETUP(obj, klass, T_OBJECT);
            return (VALUE)obj;
        }
    }

(object.c)

rb_newobj() is a function that returns an RVALUE taken from the freelist. NEWOBJ() is just rb_newobj() with a type cast. OBJSETUP() is a macro that initializes the struct RBasic part; you can think of it as existing only so that nobody forgets to set the FL_TAINT flag.

The rest is going back to rb_class_new_instance(), which then calls rb_obj_call_init(). This function calls initialize on the just-created object, and the initialization completes.

This is summarized as follows:

    SomeClass.new        = Class#new (rb_class_new_instance)
    SomeClass.allocate   = Class#allocate (rb_class_allocate_instance)
    SomeClass#initialize = Object#initialize (rb_obj_dummy)

One could say that the allocate class method initializes physically and initialize initializes logically. This mechanism, in which object creation is divided into allocate and initialize with new presiding over them, is called the “allocation framework”.
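The division of labor can be imitated outside ruby in a few lines of plain C. The following is only a sketch of the idea; struct klass, struct object, default_allocate(), point_initialize() and class_new() are hypothetical names and have nothing to do with ruby’s real structs.

```c
#include <assert.h>
#include <stdlib.h>

struct object;

/* A class supplies the two halves of object creation. */
struct klass {
    struct object *(*allocate)(struct klass *);   /* physical init */
    void (*initialize)(struct object *, int);     /* logical init */
};

struct object {
    struct klass *klass;
    int value;
};

/* Plays the role of Class#allocate: get memory, set up the header. */
static struct object *default_allocate(struct klass *k)
{
    struct object *obj = calloc(1, sizeof *obj);
    assert(obj != NULL);
    obj->klass = k;
    return obj;
}

/* Plays the role of a user-written initialize. */
static void point_initialize(struct object *obj, int arg)
{
    obj->value = arg;
}

/* Plays the role of Class#new: allocate first, then initialize. */
static struct object *class_new(struct klass *k, int arg)
{
    struct object *obj = k->allocate(k);
    k->initialize(obj, arg);
    return obj;
}
```

Because class_new() only goes through the two function pointers, a class can replace either half without touching the other, which is the point of the framework.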

Creating User Defined Objects

Next, we’ll examine instance creation for classes defined in extension libraries. Since they are user-defined, their struct is not fixed in advance, and unless we tell ruby how to allocate it, ruby cannot know how to create such an object. Let’s look at how to tell it.

Data_Wrap_Struct()

Whether a class is user-defined or not, its creation mechanism can follow the allocation framework. This means that when defining a new class SomeClass in C, we override both SomeClass.allocate and SomeClass#initialize.

Let’s look at the allocate side first. Here it does the physical initialization. What needs to be allocated? I mentioned that an instance of a user-defined class is a pair of a struct RData and a user-prepared struct. We’ll assume that the struct is of type struct my. To create a VALUE based on struct my, you can use Data_Wrap_Struct(). It is used like this:

    struct my *ptr = malloc(sizeof(struct my));  /* arbitrarily allocate in the heap */
    VALUE val = Data_Wrap_Struct(data_class, mark_f, free_f, ptr);

data_class is the class that val belongs to, and ptr is the pointer to be wrapped. mark_f is (a pointer to) the function that marks this struct. It does not mark ptr itself, though; it is used when the struct pointed to by ptr contains VALUEs. free_f, on the other hand, is the function that frees ptr itself. Both functions take ptr as their argument. Going back a little and reading the marking code may help you understand all of this in one go.

Let’s also look at the content of Data_Wrap_Struct().

▼ Data_Wrap_Struct()

    #define Data_Wrap_Struct(klass, mark, free, sval) \
        rb_data_object_alloc(klass, sval, \
                             (RUBY_DATA_FUNC)mark, \
                             (RUBY_DATA_FUNC)free)

    typedef void (*RUBY_DATA_FUNC) _((void*));

(ruby.h)

Most of it is delegated to rb_data_object_alloc().

▼ rb_data_object_alloc()

    VALUE
    rb_data_object_alloc(klass, datap, dmark, dfree)
        VALUE klass;
        void *datap;
        RUBY_DATA_FUNC dmark;
        RUBY_DATA_FUNC dfree;
    {
        NEWOBJ(data, struct RData);
        OBJSETUP(data, klass, T_DATA);
        data->data = datap;
        data->dfree = dfree;
        data->dmark = dmark;

        return (VALUE)data;
    }

(gc.c)

This is not complicated. Just as with ordinary objects, it prepares an RVALUE using NEWOBJ() and OBJSETUP(), and sets the members.

Now let’s go back to allocate. We’ve succeeded in creating a VALUE, so all that remains is to put this in some function and define that function on the class with rb_define_singleton_method().

Data_Get_Struct()

Next is initialize. Not only initialize but all the methods need a way to pull the struct my* out of the previously created VALUE. To do that, you can use the Data_Get_Struct() macro.

▼ Data_Get_Struct()

    #define Data_Get_Struct(obj,type,sval) do {\
        Check_Type(obj, T_DATA); \
        sval = (type*)DATA_PTR(obj);\
    } while (0)

    #define DATA_PTR(dta) (RDATA(dta)->data)

(ruby.h)

As you can see, it just takes the pointer (to struct my) from a member of RData. This is simple. Check_Type() just checks the struct type.
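The wrap and get pair is easy to imitate without ruby.h. The following stand-alone model is only a sketch: struct rdata, DATA_TAG, wrap_struct(), get_struct() and release() are hypothetical names, assert() stands in for the exception Check_Type() would raise, and release() shows what I assume the sweep phase does with dfree.

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*data_func)(void *);

enum { DATA_TAG = 1 };   /* plays the role of T_DATA */

/* Simplified stand-in for struct RData. */
struct rdata {
    int type;
    data_func dmark;
    data_func dfree;
    void *data;          /* the wrapped user struct */
};

struct my { int x; };

/* Counterpart of Data_Wrap_Struct(): pair the user pointer with
   its mark and free functions. */
static struct rdata *wrap_struct(data_func mark, data_func free_f, void *ptr)
{
    struct rdata *d = malloc(sizeof *d);
    assert(d != NULL);
    d->type = DATA_TAG;
    d->dmark = mark;
    d->dfree = free_f;
    d->data = ptr;
    return d;
}

/* Counterpart of Data_Get_Struct(): check the tag, hand back the pointer. */
static void *get_struct(struct rdata *d)
{
    assert(d->type == DATA_TAG);
    return d->data;
}

/* Free the user struct through dfree, then the wrapper itself. */
static void release(struct rdata *d)
{
    if (d->dfree) d->dfree(d->data);
    free(d);
}
```

Passing NULL as the mark function corresponds to a struct my that contains no VALUEs, so there is nothing for the GC to mark inside it.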

The Issues of the Allocation Framework

So, I’ve explained innocently until now, but the current allocation framework actually has a fatal issue. I just described that an object created with allocate is handed to initialize and the other methods, but if an object created with allocate is passed to a method of a different class, it is a very serious problem. For example, if an object created with the default Object.allocate (Class#allocate) is passed to a method of String, serious trouble follows. That is because the methods of String are written on the assumption that they are given a struct of type struct RString, whereas the given object is actually a struct RObject. To avoid such a situation, an object created with C.allocate must be passed only to the methods of C or its subclasses.

Of course, this is always true when things are done in the ordinary way. Since C.allocate creates an instance of the class C, it is not passed to the methods of other classes. As an exception, it may be passed to the methods of Object, but the methods of Object do not depend on the struct type.

However, what if things are not done in the ordinary way? Since C.allocate is exposed at the Ruby level, the definition of allocate can be moved to another class by using alias, super, or the like (which I have not described yet). In this way, you can create an object whose class is String but whose actual struct type is struct RObject. This means that you can crash ruby at will from the Ruby level. This is a problem.

The source of the issue is that allocate is exposed to the Ruby level as a method. Conversely, a solution is to define the content of allocate on the class in some way other than as a method. So,

    rb_define_allocator(rb_cMy, my_allocate);

an alternative like this is currently under discussion.

The original work is Copyright © 2002-2004 Minero AOKI. Translated by Vincent ISAMBART and Clifford Escobar CAOILE. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License.

Translated by Vincent ISAMBART

Chapter 6: Variables and constants

Outline of this chapter

Ruby variables

In Ruby there are quite a lot of different types of variables and constants. Let’s line them up, starting from the largest scope.

Global variables

Constants

Class variables

Instance variables

Local variables

Instance variables were already explained in chapter 2 “Objects”. In this chapter we’ll talk about:

Global variables

Class variables

Constants

We will talk about local variables in the third part of the book.

API for variables

The object of this chapter’s analysis is variable.c. Let me first introduce the APIs that serve as entry points.

    VALUE rb_iv_get(VALUE obj, char *name)
    VALUE rb_ivar_get(VALUE obj, ID name)
    VALUE rb_iv_set(VALUE obj, char *name, VALUE val)
    VALUE rb_ivar_set(VALUE obj, ID name, VALUE val)

These are the APIs to access instance variables, which have already been described. They are shown here again because their definitions are in variable.c.

    VALUE rb_cv_get(VALUE klass, char *name)
    VALUE rb_cvar_get(VALUE klass, ID name)
    VALUE rb_cv_set(VALUE klass, char *name, VALUE val)
    VALUE rb_cvar_set(VALUE klass, ID name, VALUE val)

These functions are the API for accessing class variables. Class variables belong directly to classes, so the functions take a class as parameter. They fall into two groups, depending on whether their name starts with rb_Xv or rb_Xvar. The difference lies in the type of the “name” parameter. The ones with a shorter name are generally easier to use because they take a char*. The ones with a longer name are more for internal use as they take an ID.

    VALUE rb_const_get(VALUE klass, ID name)
    VALUE rb_const_get_at(VALUE klass, ID name)
    VALUE rb_const_set(VALUE klass, ID name, VALUE val)

These functions are for accessing constants. Constants also belong to classes, so they take a class as parameter. rb_const_get() follows the superclass chain, whereas rb_const_get_at() does not (it just looks in klass).

    struct global_entry *rb_global_entry(ID name)
    VALUE rb_gv_get(char *name)
    VALUE rb_gvar_get(struct global_entry *ent)
    VALUE rb_gv_set(char *name, VALUE val)
    VALUE rb_gvar_set(struct global_entry *ent, VALUE val)

These last functions are for accessing global variables. They are a little different from the others due to the use of struct global_entry. We’ll explain this while describing the implementation.

Points of this chapter

The most important point when talking about variables is “Where and how are variables stored?”, in other words: data structures.

The second most important matter is how we search for the values. The scopes of Ruby variables and constants are quite complicated because variables and constants are sometimes inherited, sometimes looked for outside of the local scope… To understand this better, you should compare the implementation with the specification, thinking along the lines of “It behaves like this in this situation, so its implementation couldn’t be other than this!”

Class variables

Class variables are variables that belong to classes. In Java or C++ they are called static variables. They can be accessed from both the class and its instances. But “from an instance” or “from the class” is information only available in the evaluator, and we do not have one for the moment. So from the C level it’s like having no access range. We’ll just focus on the way these variables are stored.

Reading

The functions to get a class variable are rb_cvar_get() and rb_cv_get(). The function with the longer name takes an ID as parameter and the one with the shorter name takes a char*. Because the one taking an ID seems closer to the internals, we’ll look at it.

▼ rb_cvar_get()

    VALUE
    rb_cvar_get(klass, id)
        VALUE klass;
        ID id;
    {
        VALUE value;
        VALUE tmp;

        tmp = klass;
        while (tmp) {
            if (RCLASS(tmp)->iv_tbl) {
                if (st_lookup(RCLASS(tmp)->iv_tbl,id,&value)) {
                    if (RTEST(ruby_verbose)) {
                        cvar_override_check(id, tmp);
                    }
                    return value;
                }
            }
            tmp = RCLASS(tmp)->super;
        }

        rb_name_error(id,"uninitialized class variable %s in %s",
                      rb_id2name(id), rb_class2name(klass));
        return Qnil;                /* not reached */
    }

(variable.c)

This function reads a class variable in klass.

Error management functions like rb_raise() can simply be ignored, as I said before. The rb_name_error() that appears this time is a function for raising an exception, so it can be ignored for the same reasons. In ruby, you can assume that all functions ending with _error raise an exception.

After removing all this, we can see that it just follows klass’s superclass chain one by one, searching in each iv_tbl. … At this point, I’d like you to say “What? iv_tbl is the instance variable table, isn’t it?” As a matter of fact, class variables are stored in the instance variable table.

We can do this because when creating IDs, the whole name of the variable is taken into account, including the prefix: rb_intern() returns different IDs for “@var” and “@@var”. At the Ruby level, the variable type is determined only by the prefix, so there’s no way to access a class variable called @var from Ruby.
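A toy version shows why sharing one table is harmless. This is a sketch under simplifying assumptions: strings stand in for IDs, a small array stands in for st_table, and struct klass, tbl_set() and cvar_get() are hypothetical names, not ruby’s.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 8

struct entry { const char *name; int value; };

/* Toy class: one table shared by every kind of variable. */
struct klass {
    struct klass *super;
    struct entry iv_tbl[MAX_ENTRIES];
    int n;
};

/* Insert or overwrite; the prefix is part of the key. */
static void tbl_set(struct klass *k, const char *name, int value)
{
    int i;
    for (i = 0; i < k->n; i++) {
        if (strcmp(k->iv_tbl[i].name, name) == 0) {
            k->iv_tbl[i].value = value;
            return;
        }
    }
    assert(k->n < MAX_ENTRIES);
    k->iv_tbl[k->n].name = name;
    k->iv_tbl[k->n].value = value;
    k->n++;
}

/* Same shape as rb_cvar_get(): search each table up the superclass
   chain. Returns 1 and stores the value on success, 0 otherwise. */
static int cvar_get(struct klass *k, const char *name, int *value)
{
    struct klass *tmp;
    int i;
    for (tmp = k; tmp != NULL; tmp = tmp->super) {
        for (i = 0; i < tmp->n; i++) {
            if (strcmp(tmp->iv_tbl[i].name, name) == 0) {
                *value = tmp->iv_tbl[i].value;
                return 1;
            }
        }
    }
    return 0;
}
```

Storing “@var” and “@@var” in the same class produces two separate entries, and a lookup of “@@var” from a subclass finds only the second one.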


Constants

It’s a little abrupt, but I’d like you to recall the members of struct RClass. If we exclude the basic member, struct RClass contains:

VALUE super

struct st_table *iv_tbl

struct st_table *m_tbl

Then, considering that:

1. constants belong to a class
2. we can’t see any table dedicated to constants in struct RClass
3. class variables and instance variables are both in iv_tbl

Could it mean that the constants are also…

Assignment

rb_const_set() is a function to set the value of constants: it sets the constant id in the class klass to the value val.

▼ rb_const_set()

    void
    rb_const_set(klass, id, val)
        VALUE klass;
        ID id;
        VALUE val;
    {
        mod_av_set(klass, id, val, Qtrue);
    }

(variable.c)

mod_av_set() does all the hard work:

▼ mod_av_set()

    static void
    mod_av_set(klass, id, val, isconst)
        VALUE klass;
        ID id;
        VALUE val;
        int isconst;
    {
        char *dest = isconst ? "constant" : "class variable";

        if (!OBJ_TAINTED(klass) && rb_safe_level() >= 4)
            rb_raise(rb_eSecurityError, "Insecure: can't set %s", dest);
        if (OBJ_FROZEN(klass)) rb_error_frozen("class/module");
        if (!RCLASS(klass)->iv_tbl) {
            RCLASS(klass)->iv_tbl = st_init_numtable();
        }
        else if (isconst) {
            if (st_lookup(RCLASS(klass)->iv_tbl, id, 0) ||
                (klass == rb_cObject && st_lookup(rb_class_tbl, id, 0))) {
                rb_warn("already initialized %s %s", dest, rb_id2name(id));
            }
        }

        st_insert(RCLASS(klass)->iv_tbl, id, val);
    }

(variable.c)

This time again you can ignore the warning checks (rb_raise(), rb_error_frozen() and rb_warn()). Here’s what’s left:

▼ mod_av_set() (only the important part)

    if (!RCLASS(klass)->iv_tbl) {
        RCLASS(klass)->iv_tbl = st_init_numtable();
    }
    st_insert(RCLASS(klass)->iv_tbl, id, val);

We’re now sure constants also reside in the instance variable table. It means that in the iv_tbl of struct RClass, the following are mixed together:

1. the class’s own instance variables
2. class variables
3. constants

Reading

We now know how the constants are stored. We’ll now check how they really work.

rb_const_get()

We’ll now look at rb_const_get(), the function to read a constant. This function returns the constant referred to by id from the class klass.

▼ rb_const_get()

    VALUE
    rb_const_get(klass, id)
        VALUE klass;
        ID id;
    {
        VALUE value, tmp;
        int mod_retry = 0;

        tmp = klass;
      retry:
        while (tmp) {
            if (RCLASS(tmp)->iv_tbl &&
                st_lookup(RCLASS(tmp)->iv_tbl,id,&value)) {
                return value;
            }
            if (tmp == rb_cObject && top_const_get(id, &value))
                return value;
            tmp = RCLASS(tmp)->super;
        }
        if (!mod_retry && BUILTIN_TYPE(klass) == T_MODULE) {
            mod_retry = 1;
            tmp = rb_cObject;
            goto retry;
        }

        /* Uninitialized constant */
        if (klass && klass != rb_cObject) {
            rb_name_error(id, "uninitialized constant %s at %s",
                          rb_id2name(id),
                          RSTRING(rb_class_path(klass))->ptr);
        }
        else { /* global_uninitialized */
            rb_name_error(id, "uninitialized constant %s",rb_id2name(id));
        }
        return Qnil;                /* not reached */
    }

(variable.c)

There’s a lot of code in the way. First, we should at least remove the rb_name_error() in the second half. In the middle, what’s around mod_retry seems to be special handling for modules. Let’s also remove that for the time being. The function gets reduced to this:

▼ rb_const_get (simplified)

    VALUE
    rb_const_get(klass, id)
        VALUE klass;
        ID id;
    {
        VALUE value, tmp;

        tmp = klass;
        while (tmp) {
            if (RCLASS(tmp)->iv_tbl &&
                st_lookup(RCLASS(tmp)->iv_tbl,id,&value)) {
                return value;
            }
            if (tmp == rb_cObject && top_const_get(id, &value))
                return value;
            tmp = RCLASS(tmp)->super;
        }
    }

Now it should be pretty easy to understand. The function searches for the constant in iv_tbl while climbing klass’s superclass chain. That means:

    class A
      Const = "ok"
    end
    class B < A
      p(Const)    # can be accessed
    end

The only problem remaining is top_const_get(). This function is only called for rb_cObject, so top must mean “top-level”. If you don’t remember, at the top-level, the class is Object. This is the same kind of statement as “in the class statement defining C, the class becomes C”: it says that “the top-level’s class is Object”.

    # the class of the top-level is Object
    class A
      # the class is A
      class B
        # the class is B
      end
    end

So top_const_get() probably does something specific to the top level.

top_const_get()

Let’s look at this top_const_get function. It looks up the id constant, writes the value in klassp and returns.

▼ top_const_get()

    static int
    top_const_get(id, klassp)
        ID id;
        VALUE *klassp;
    {
        /* pre-defined class */
        if (st_lookup(rb_class_tbl, id, klassp)) return Qtrue;

        /* autoload */
        if (autoload_tbl && st_lookup(autoload_tbl, id, 0)) {
            rb_autoload_load(id);
            *klassp = rb_const_get(rb_cObject, id);
            return Qtrue;
        }
        return Qfalse;
    }

(variable.c)

rb_class_tbl was already mentioned in chapter 4 “Classes and modules”. It’s the table for storing the classes defined at the top-level. Built-in classes like String or Array, for example, have an entry in it. That’s why we should not forget to search in this table when looking for top-level constants.

The next block is related to autoloading. It is designed to be able to register a library that is loaded automatically when a particular top-level constant is accessed for the first time. This can be used like this:

    autoload(:VeryBigClass, "verybigclass")   # VeryBigClass is defined in it

After this, when VeryBigClass is accessed for the first time, the verybigclass library is loaded (with require). As long as VeryBigClass is defined in the library, execution can continue smoothly. It’s an efficient approach when a library is so big that a lot of time is spent on loading.

This autoload is processed by rb_autoload_xxxx(). We won’t discuss autoload further in this chapter because there will probably be a big change in how it works soon.

(translator’s note: The way autoload works did change in 1.8: autoloaded constants do not need to be defined at top-level anymore).

Other classes?

But where did the code for looking up constants in other classes end up? After all, constants are first looked up in the outside classes, then in the superclasses.

In fact, we do not yet have enough knowledge to look at that. The outside classes change depending on the location in the program. In other words, it depends on the program context. So we first need to understand how the internal state of the evaluator is handled. Specifically, this search in other classes is done in the ev_const_get() function of eval.c. We’ll look at it and finish with the constants in the third part of the book.

Global variables

General remarks

Global variables can be accessed from anywhere. Or, put the other way around, there is no need to restrict access to them. Because they are not attached to any context, the table only has to exist in one place, and there’s no need to do any check. Therefore the implementation is very simple.

But there is still quite a lot of code. The reason is that Ruby’s global variables are equipped with some gimmicks that make it hard to regard them as mere variables. Features like the following are only available for global variables:

you can “hook” access of global variables

you can alias them with alias

Let’s explain this simply.

Aliases of variables

    alias $newname $oldname

After this, you can use $newname instead of $oldname. alias for variables is mainly a countermeasure for “symbol variables”. “Symbol variables” are variables inherited from Perl, like $= or $0. $= decides whether upper and lower case letters are differentiated during string comparison. $0 shows the name of the main Ruby program. There are some other symbol variables, but since their names are only one character long, they are difficult to remember for people who don’t know Perl. So aliases were created to make them a little easier to understand.

That said, symbol variables are currently not recommended, and are being moved one by one into singleton methods of suitable modules. The current school of thought is that $= and others will be abolished in 2.0.

Hooks

You can “hook” reads and writes of global variables.

Although hooks can also be set at the Ruby level, I think their purpose is rather to prepare special variables for system use, like $KCODE, at the C level. $KCODE is the variable containing the encoding the interpreter currently uses to handle strings. Essentially only special strings like "EUC" or "UTF8" can be assigned to it, but this is too bothersome, so it is designed so that "e" or "u" can also be used.

    p($KCODE)      # "NONE" (default)
    $KCODE = "e"
    p($KCODE)      # "EUC"
    $KCODE = "u"
    p($KCODE)      # "UTF8"

Knowing that you can hook assignment of global variables, you should easily understand how this can be done. By the way, $KCODE’s K comes from “kanji” (the name of Chinese characters in Japanese).

You might say that even with alias or hooks, global variables just aren’t used much, so it’s functionality that doesn’t really matter. It’s appropriate not to talk much about unused features, and I’d like to use more pages for the analysis of the parser and evaluator. That’s why I’ll proceed with the explanation below at a half-heartedness level of 85%.

Data structure

I said that the key point when looking at how variables work is the way they are stored. First, I’d like you to firmly grasp the structure used by global variables.

▼ Data structure for global variables

    static st_table *rb_global_tbl;

    struct global_entry {
        struct global_variable *var;
        ID id;
    };

    struct global_variable {
        int   counter;      /* reference counter */
        void *data;         /* value of the variable */
        VALUE (*getter)();  /* function to get the variable */
        void  (*setter)();  /* function to set the variable */
        void  (*marker)();  /* function to mark the variable */
        int block_trace;
        struct trace_var *trace;
    };

(variable.c)

rb_global_tbl is the main table. All global variables are stored in this table. The keys of this table are of course variable names (ID). A value is expressed by a struct global_entry and a struct global_variable (figure 1).

Figure 1: Global variables table at execution time

The structure representing a variable is split in two to make aliases possible. When an alias is established, two global_entry structs point to the same struct global_variable.

It’s at this time that the reference counter (the counter member of struct global_variable) is necessary. I explained the general idea of a reference counter in the previous chapter, “Garbage collection”. Reviewing it briefly, when a new reference to the structure is made, the counter is incremented by 1. When the reference is not used anymore, the counter is decreased by 1. When the counter reaches 0, the structure is no longer useful, so free() can be called.
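How aliasing and the counter fit together can be sketched as follows. This is not the code of variable.c: the structs are cut down to the members needed here, and entry_init(), var_release() and entry_alias() are hypothetical helper names.

```c
#include <assert.h>
#include <stdlib.h>

/* Cut-down versions of the two structs shown above. */
struct global_variable {
    int counter;        /* reference counter */
    void *data;         /* value of the variable */
};

struct global_entry {
    struct global_variable *var;
};

/* Create an entry that owns a fresh variable (counter starts at 1). */
static void entry_init(struct global_entry *e)
{
    e->var = calloc(1, sizeof *e->var);
    assert(e->var != NULL);
    e->var->counter = 1;
}

/* Drop one reference; free the variable when nobody uses it.
   Returns 1 if the variable was freed. */
static int var_release(struct global_variable *var)
{
    if (--var->counter == 0) {
        free(var);
        return 1;
    }
    return 0;
}

/* alias $newname $oldname: make both entries share one variable. */
static void entry_alias(struct global_entry *newe, struct global_entry *olde)
{
    if (newe->var == olde->var) return;
    olde->var->counter++;          /* one more entry refers to it */
    var_release(newe->var);        /* the old target loses a reference */
    newe->var = olde->var;
}
```

After entry_alias() the shared variable has a counter of 2, so it survives until both names have given it up.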

When hooks are set at the Ruby level, a list of struct trace_var is stored in the trace member of struct global_variable, but I won’t talk about it, and omit struct trace_var.

Reading

You can get a general understanding of global variables just by looking at how they are read. The functions for reading them are rb_gv_get() and rb_gvar_get().

▼ rb_gv_get() rb_gvar_get()

    VALUE
    rb_gv_get(name)
        const char *name;
    {
        struct global_entry *entry;

        entry = rb_global_entry(global_id(name));
        return rb_gvar_get(entry);
    }

    VALUE
    rb_gvar_get(entry)
        struct global_entry *entry;
    {
        struct global_variable *var = entry->var;
        return (*var->getter)(entry->id, var->data, var);
    }

(variable.c)

A substantial part of the work seems to revolve around the rb_global_entry() function, but that does not prevent us from understanding what’s going on. global_id is a function that converts a char* to an ID and checks that it is the ID of a global variable. (*var->getter)(...) is of course a function call through the function pointer var->getter. If p is a function pointer, (*p)(arg) calls the function.

But the main part is still rb_global_entry().

▼ rb_global_entry()

    struct global_entry*
    rb_global_entry(id)
        ID id;
    {
        struct global_entry *entry;

        if (!st_lookup(rb_global_tbl, id, &entry)) {
            struct global_variable *var;
            entry = ALLOC(struct global_entry);
            st_add_direct(rb_global_tbl, id, entry);
            var = ALLOC(struct global_variable);
            entry->id = id;
            entry->var = var;
            var->counter = 1;
            var->data = 0;
            var->getter = undef_getter;
            var->setter = undef_setter;
            var->marker = undef_marker;

            var->block_trace = 0;
            var->trace = 0;
        }
        return entry;
    }

(variable.c)

The main work is done by the st_lookup() at the beginning. What follows just creates (and stores) a new entry. Since an entry is automatically created when a nonexistent global variable is accessed, rb_global_entry() never returns NULL.

This was mainly done for speed. When the parser finds a global variable, it gets the corresponding struct global_entry. When reading the value of the variable, the value is just obtained from the entry (using rb_gvar_get()).

Let’s now continue a little with the code that follows. var->getter and the others are set to undef_xxxx. undef probably means that they are the setter/getter/marker for a global variable whose state is undefined.

undef_getter() just shows a warning and returns nil, as even undefined global variables can be read. undef_setter() is a little bit interesting, so let’s look at it.

▼ undef_setter()

    static void
    undef_setter(val, id, data, var)
        VALUE val;
        ID id;
        void *data;
        struct global_variable *var;
    {
        var->getter = val_getter;
        var->setter = val_setter;
        var->marker = val_marker;

        var->data = (void*)val;
    }

(variable.c)

val_getter() takes the value from var->data and returns it. val_setter() just puts a value in var->data. Setting the handlers this way means no special handling is needed for undefined variables (figure 2). Skillfully done, isn’t it?

Figure 2: Setting and consultation of global variables
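The handler swap can be tried out in isolation. The sketch below imitates the idea with simplified signatures; struct gvar, gvar_init(), gvar_get() and gvar_set() are hypothetical names, the marker is left out, and NULL stands in for nil.

```c
#include <assert.h>
#include <stddef.h>

struct gvar;
typedef void *(*getter_fn)(struct gvar *);
typedef void (*setter_fn)(struct gvar *, void *);

/* Cut-down global variable: a value and its two handlers. */
struct gvar {
    void *data;
    getter_fn getter;
    setter_fn setter;
};

/* Ordinary handlers, used once the variable has a value. */
static void *val_getter(struct gvar *var) { return var->data; }
static void val_setter(struct gvar *var, void *val) { var->data = val; }

/* Reading an undefined variable yields NULL (nil in ruby). */
static void *undef_getter(struct gvar *var) { (void)var; return NULL; }

/* The first assignment swaps in the ordinary handlers. */
static void undef_setter(struct gvar *var, void *val)
{
    var->getter = val_getter;
    var->setter = val_setter;
    var->data = val;
}

/* A new variable starts in the undefined state. */
static void gvar_init(struct gvar *var)
{
    assert(var != NULL);
    var->data = NULL;
    var->getter = undef_getter;
    var->setter = undef_setter;
}

/* Callers never test "is it defined?"; they just call the handler. */
static void *gvar_get(struct gvar *var) { return var->getter(var); }
static void gvar_set(struct gvar *var, void *val) { var->setter(var, val); }
```

The state “undefined” is thus encoded in which functions are installed rather than in a flag, so gvar_get() and gvar_set() contain no branch at all.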

The original work is Copyright © 2002-2004 Minero AOKI. Translated by Vincent ISAMBART and Clifford Escobar CAOILE. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License.

Translated by Clifford Escobar CAOILE & ocha-

Chapter 7: Security

Fundamentals

I say security, but I don’t mean passwords or encryption. The Ruby security feature is used for handling untrusted objects in an environment like CGI programming.

For example, when you want to convert a string representing a number into an integer, you can use the eval method. However, eval is a method that “runs a string as a Ruby program.” If you eval a string from an unknown person on the network, it is very dangerous. Yet for the programmer to fully differentiate between safe and unsafe things is very tiresome and cumbersome, so it is certain that a mistake will be made. So let us make it part of the language: that was the reasoning behind this feature.

So then, how does Ruby protect us from that sort of danger? Causes of dangerous operations, for example opening unintended files, are roughly divided into two groups:

Dangerous data

Dangerous code

For the former, the code that handles these values is written by the programmers themselves, so it is (relatively) safe. For the latter, the program code absolutely cannot be trusted.

Because the solution is vastly different between the two causes, it is important to differentiate them by level. These are called security levels. The Ruby security level is represented by the $SAFE global variable. The value ranges from a minimum of 0 to a maximum of 4. Assigning to the variable raises the level. Once the level is raised it can never be lowered. And at each level, the permitted operations are limited.

I will not explain level 1 or 3. Level 0 is the normal program environment, where the security system is not running. Level 2 handles dangerous values. Level 4 handles dangerous code. We can skip 0 and move on to explain levels 2 and 4 in detail.

((errata: Level 1 handles dangerous values. “Level 2 has no use currently” is right.))

Level 1

This level is for dangerous data, for example in normal CGI applications.

A per-object “tainted mark” serves as the basis for the Level 1 implementation. All objects read in from outside are marked tainted, and any attempt to eval or File.open with a tainted object raises an exception and the attempt is stopped.

This tainted mark is “infectious”. For example, when taking a part of a tainted string, that part is also tainted.
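The infection can be pictured with a small model. This is only a sketch and not ruby’s string implementation: struct tstring, tstring_set(), tstring_substr() and guarded_open() are hypothetical names, and a return value of -1 stands in for the exception.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STR_MAX 64

struct tstring {
    char buf[STR_MAX];
    int tainted;         /* the per-object "tainted mark" */
};

static void tstring_set(struct tstring *s, const char *text, int tainted)
{
    assert(strlen(text) < STR_MAX);
    strcpy(s->buf, text);
    s->tainted = tainted;
}

/* Taking a part of a string: the mark is copied to the result. */
static void tstring_substr(const struct tstring *src, size_t start,
                           size_t len, struct tstring *dst)
{
    size_t n = strlen(src->buf);
    if (start > n) start = n;
    if (len > n - start) len = n - start;
    memcpy(dst->buf, src->buf + start, len);
    dst->buf[len] = '\0';
    dst->tainted = src->tainted;
}

/* A dangerous operation refuses tainted input.
   Returns 0 on success, -1 where ruby would raise an exception. */
static int guarded_open(const struct tstring *path)
{
    if (path->tainted) return -1;
    return 0;            /* a real version would open the file here */
}
```

Every operation that derives a new object has to copy the mark by hand, as tstring_substr() does; one operation that forgets to do so is enough to launder tainted data.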

Level4Thislevelisfordangerousprograms,forexample,runningexternal(unknown)programs,etc.

Atlevel1,operationsandthedataitusesarechecked,butatlevel4,operationsthemselvesarerestricted.Forexample,exit,fileI/O,threadmanipulation,redefiningmethods,etc.Ofcourse,thetaintedmarkinformationisused,butbasicallytheoperationsarethecriteria.

Unit of Security

$SAFE looks like a global variable but is in actuality a thread local variable. In other words, Ruby’s security system works on units of thread. In Java and .NET, rights can be set per component (object), but Ruby does not implement that. The assumed main target was probably CGI.

Therefore, if one wants to raise the security level of one part of the program, then it should be made into a different thread and have its security level raised. I haven’t yet explained how to create a thread, but I will show an example here:

# Raise the security level in a different thread
p($SAFE)   # 0 is the default
Thread.fork {    # Start a different thread
    $SAFE = 4    # Raise the level
    eval(str)    # Run the dangerous program
}
p($SAFE)   # Outside of the block, the level is still 0

Reliability of $SAFE

Even with the spreading of tainted marks and the restriction of operations implemented, ultimately it is still handled manually. In other words, the built-in libraries and external libraries must all cooperate completely; if one of them does not, the “tainted” mark stops spreading partway and the security is lost. And actually this kind of hole is often reported. For this reason, this writer does not wholly trust it.

That is not to say, of course, that all Ruby programs are dangerous. Even at $SAFE=0 it is possible to write a secure program, and even at $SAFE=4 it is possible to write a program that fits your whim. However, one cannot put too much confidence in $SAFE (yet).

In the first place, functionality and security do not go together. It is common sense that adding new features can make holes easier to open. Therefore it is prudent to think that ruby can probably be dangerous.

Implementation

From now on, we’ll start to look into its implementation. In order to wholly grasp the security system of ruby, we have to look at “where checks are made” rather than at the mechanism. However, this time we don’t have enough pages to do that, and just listing the places is not interesting. Therefore, in this chapter, I’ll only describe the mechanism used for security checks. The APIs to check are mainly these two:

rb_secure(n): If the current level is n or higher, it raises SecurityError.

SafeStringValue(): If the current level is 1 or higher and the string is tainted, it raises an exception.

We won’t read SafeStringValue() here.

Tainted Mark

The taint mark is, to be concrete, the FL_TAINT flag, which is set in basic->flags, and what is used to infect it is the OBJ_INFECT() macro. Here is its usage.

OBJ_TAINT(obj)            /* set FL_TAINT to obj */
OBJ_TAINTED(obj)          /* check if FL_TAINT is set to obj */
OBJ_INFECT(dest, src)     /* infect FL_TAINT from src to dest */

Since OBJ_TAINT() and OBJ_TAINTED() can be assumed not important, let’s briefly look over only OBJ_INFECT().

▼ OBJ_INFECT

#define OBJ_INFECT(x,s) do {                             \
    if (FL_ABLE(x) && FL_ABLE(s))                        \
        RBASIC(x)->flags |= RBASIC(s)->flags & FL_TAINT; \
} while (0)

(ruby.h)

FL_ABLE() checks if the argument VALUE is a pointer or not. If both objects are pointers (which means each of them has its flags member), it propagates the flag.

$SAFE

▼ ruby_safe_level

int ruby_safe_level = 0;

static void
safe_setter(val)
    VALUE val;
{
    int level = NUM2INT(val);

    if (level < ruby_safe_level) {
        rb_raise(rb_eSecurityError, "tried to downgrade safe level from %d to %d",
                 ruby_safe_level, level);
    }
    ruby_safe_level = level;
    curr_thread->safe = level;
}

(eval.c)

The substance of $SAFE is ruby_safe_level in eval.c. As I previously wrote, $SAFE is local to each thread, so it needs to be written in eval.c where the implementation of threads is located. In other words, it is in eval.c only because of the restrictions of C, but it could essentially be located in another place.

safe_setter() is the setter of the $SAFE global variable. Because this function is the only way to access it from the Ruby level, the security level cannot be lowered.

However, as you can see, from the C level, because static is not attached to ruby_safe_level, you can ignore the interface and modify the security level.

rb_secure()

▼ rb_secure()

void
rb_secure(level)
    int level;
{
    if (level <= ruby_safe_level) {
        rb_raise(rb_eSecurityError, "Insecure operation `%s' at level %d",
                 rb_id2name(ruby_frame->last_func), ruby_safe_level);
    }
}

(eval.c)

If the current safe level is more than or equal to level, this would raise SecurityError. It’s simple.
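The two C functions above can be restated as a toy model in Ruby. This is only a sketch for reading along: the class name SafeLevelModel and its methods are made up for illustration and are not part of ruby; only the two comparisons are taken from safe_setter() and rb_secure().

```ruby
# A toy model of safe_setter() and rb_secure(); not ruby's own code.
class SafeLevelModel
  attr_reader :level

  def initialize
    @level = 0              # corresponds to ruby_safe_level = 0
  end

  # Mirrors safe_setter(): the level may stay or rise, never fall.
  def level=(new_level)
    if new_level < @level
      raise SecurityError,
            "tried to downgrade safe level from #{@level} to #{new_level}"
    end
    @level = new_level
  end

  # Mirrors rb_secure(n): refuse the operation at level n or above.
  def secure(n, operation)
    if n <= @level
      raise SecurityError,
            "Insecure operation `#{operation}' at level #{@level}"
    end
  end
end

model = SafeLevelModel.new
model.secure(4, "exit")     # passes: level 0 is below 4
model.level = 4
begin
  model.secure(4, "exit")   # now refused
rescue SecurityError => e
  puts e.message
end
```

Note that SecurityError is not a StandardError, so a bare rescue would not catch it; the class has to be named.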

The original work is Copyright © 2002 - 2004 Minero AOKI. Translated by Vincent ISAMBART and Clifford Escobar CAOILE. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License.


Chapter 8: Ruby Language Details

I’ll talk about the details of Ruby’s syntax and evaluation which haven’t been covered yet. I didn’t intend a complete exposition, so I left out everything which doesn’t come up in this book. That’s why you won’t be able to write Ruby programs just by reading this. A complete exposition can be found in the Ruby reference manual (archives/ruby-refm.tar.gz on the attached CD-ROM).

Readers who know Ruby can skip over this chapter.

Literals

The expressiveness of Ruby’s literals is extremely high. In my opinion, what makes Ruby a script language is firstly the existence of the toplevel, secondly the expressiveness of its literals. Thirdly it might be the richness of its standard library.

A single literal already has enormous power, but even more when multiple literals are combined. In particular, the ability to build complex literals by combining hash and array literals is the biggest advantage of Ruby’s literals. One can straightforwardly write, for instance, a hash of arrays of regular expressions.

What kind of expressions are valid? Let’s look at them one by one.

Strings

Strings and regular expressions can’t be missing in a scripting language. Ruby’s string literals are even more varied than its other literals.

Single Quoted Strings

'string'              # 「string」
'\\begin{document}'   # 「\begin{document}」
'\n'                  # 「\n」backslash and an n, not a newline
'\1'                  # 「\1」backslash and 1
'\''                  # 「'」

This is the simplest form. In C, what is enclosed in single quotes becomes a character, but in Ruby, it becomes a string. Let’s call this a '-string. The backslash escape is in effect only for \ itself and '. If one puts a backslash in front of another character the backslash remains, as in the fourth example.
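The escape rule can be checked by counting characters. The following is a small sketch; the hash name samples is only for illustration.

```ruby
# In a '-string only \\ and \' are escapes; other backslashes stay.
samples = {
  'string'            => 6,
  '\\begin{document}' => 16,  # one backslash + "begin{document}"
  '\n'                => 2,   # backslash and n, not a newline
  '\1'                => 2,   # backslash and 1
  '\''                => 1    # a single quote
}

samples.each do |str, expected_length|
  puts "#{str.inspect}: #{str.length} characters (expected #{expected_length})"
end
```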

And Ruby’s strings aren’t divided by newline characters. If we write a string over several lines the newlines are contained in the string.

'multi
    line
        string'

And if the -K option is given to the ruby command, multibyte strings will be accepted. At present the three encodings EUC-JP (-Ke), Shift JIS (-Ks), and UTF8 (-Ku) can be specified.

'「漢字が通る」と「マルチバイト文字が通る」はちょっと違う'
# 'There's a little difference between "Kanji are accepted" and "Multibyte characters are accepted".'

Double Quoted Strings

"string"            # 「string」
"\n"                # newline
"\x0f"              # a byte given in hexadecimal form
"page#{n}.html"     # embedding an expression

With double quotes we can use expression expansion and backslash notation. The backslash notation is something classical that is also supported in C; for instance, \n is a newline and \b is a backspace. In Ruby, Ctrl-C and ESC can also be expressed, which is convenient. However, merely listing the whole notation is not fun, and as for its implementation, it just means a large number of cases to be handled with nothing especially interesting. Therefore, they are entirely left out here.

On the other hand, expression expansion is even more fantastic. We can write an arbitrary Ruby expression inside #{ } and it will be evaluated at runtime and embedded into the string. There are no limitations like only one variable or only one method. Having come this far, it is not a mere literal anymore; the entire thing can be considered an expression that produces a string.

"embedded #{lvar} expression"
"embedded #{@ivar} expression"
"embedded #{1 + 1} expression"
"embedded #{method_call(arg)} expression"
"embedded #{"string in string"} expression"

Strings with %

%q(string)     # same as 'string'
%Q(string)     # same as "string"
%(string)      # same as %Q(string) or "string"

If a lot of separator characters appear in a string, escaping all of them becomes a burden. In that case the separator characters can be changed by using %. In the following example, the same string is written as a "-string and as a %-string.

"<a href=\"http://i.loveruby.net#{path}\">"
%Q(<a href="http://i.loveruby.net#{path}">)

Both expressions have the same length, but the %-one is a lot nicer to look at. When there are more characters to escape, the %-string also has an advantage in length.

Here we have used parentheses as delimiters, but something else is fine, too. Like brackets or braces or #. Almost every symbol is fine, even %.

%q#this is string#
%q[this is string]
%q%this is string%
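The equivalences can be confirmed directly. In this sketch the value of path is made up for the demonstration.

```ruby
path = "/ja/"   # hypothetical value, only for the demonstration

plain   = "<a href=\"http://i.loveruby.net#{path}\">"
percent = %Q(<a href="http://i.loveruby.net#{path}">)
puts plain == percent       # the same string, written with fewer escapes

# %q behaves like a '-string, so #{...} is not expanded.
raw = %q(no #{expansion} here)
puts raw

# Paired delimiters may appear inside as long as they are balanced.
nested = %q(outer (inner) outer)
puts nested
```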

Here Documents

Here document is a syntax which can express strings spanning multiple lines. A normal string starts right after the delimiter " and everything until the ending " is the content. With a here document, the lines between the line which contains the starting <<EOS and the line which contains the ending EOS are the content.

"the characters between the starting symbol and the ending symbol
will become a string."

<<EOS
All lines between the starting and
the ending line are in this
here document
EOS

Here we used EOS as the identifier but any word is fine. Precisely speaking, all the characters matching [a-zA-Z_0-9] and multi-byte characters can be used.

The characteristic of a here document is that the delimiters are “the lines containing the starting identifier or the ending identifier”. The line which contains the start symbol is the starting delimiter. Therefore, the position of the start identifier within the line is not important. Taking advantage of this, it doesn’t matter if, for instance, it is written in the middle of an expression:

printf(<<EOS, count_n(str))
count=%d
EOS

In this case the string "count=%d\n" goes in the place of <<EOS. So it’s the same as the following.

printf("count=%d\n", count_n(str))

The position of the starting identifier is really not restricted, but on the contrary, there are strict rules for the ending symbol: it must be at the beginning of the line and there must not be another letter in that line. However if we write the start symbol with a minus like this <<-EOS we can indent the line with the end symbol.

     <<-EOS
It would be convenient if one could indent the content
of a here document. But that's not possible.
If you want that, writing a method to delete indents is
usually a way to go. But beware of tabs.
     EOS

Furthermore, the start symbol can be enclosed in single or double quotes. Then the properties of the whole here document change. When we change <<EOS to <<"EOS" we can use embedded expressions and backslash notation.

    <<"EOS"
One day is #{24 * 60 * 60} seconds.
Incredible.
EOS

But <<'EOS' is not the same as a single quoted string. It starts the complete literal mode. Everything, even backslashes, goes into the string as it is typed. This is useful for a string which contains many backslashes.
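The three variants can be compared side by side. A minimal sketch; the variable names are only for illustration.

```ruby
seconds = <<"EOS"
One day is #{24 * 60 * 60} seconds.
EOS

literal = <<'EOS'
\n and #{24 * 60 * 60} stay as typed.
EOS

indented = <<-EOS
  With a minus, the terminator below may be indented.
  EOS

print seconds, literal, indented
```

Note that <<-EOS only relaxes the rule for the terminator line; the leading spaces of the content line stay in the string.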

In Part 2, I’ll explain how to parse a here document. But I’d like you to try to guess it beforehand.

Characters

Ruby strings are byte sequences, there are no character objects. Instead there are the following expressions which return the integers which correspond to a certain character in ASCII code.

?a       # the integer which corresponds to "a"
?.       # the integer which corresponds to "."
?\n      # LF
?\C-a    # Ctrl-a

Regular Expressions

/regexp/
/^Content-Length:/i
/正規表現/
/\/\*.*?\*\//m        # An expression which matches C comments
/reg#{1 + 1}exp/      # the same as /reg2exp/

What is contained between slashes is a regular expression. Regular expressions are a language to designate string patterns. For example

/abc/

This regular expression matches a string where there’s an a followed by a b followed by a c. It matches “abc” or “fffffffabc” or “abcxxxxx”.

One can designate more special patterns.

/^From:/

This matches a string where there’s a From followed by a : at the beginning of a line. There are several more expressions of this kind, such that one can create quite complex patterns.

The uses are infinite: changing the matched part to another string, deleting the matched part, determining if there’s a match, and so on.

A more concrete use case would be, for instance, extracting the From: header from a mail, or changing the \n to an \r, or checking if a string looks like a mail address.

Since the regular expression itself is an independent language, it has its own parser and evaluator which are different from ruby’s. They can be found in regex.c. Hence, it’s enough for ruby to be able to cut out the regular expression part from a Ruby program and feed it to them. As a consequence, they are treated almost the same as strings from the grammatical point of view. Almost all of the features which strings have, like escapes, backslash notations and embedded expressions, can be used in the same way in regular expressions.

However, we can say they are treated the same as strings only from the viewpoint of “Ruby’s syntax”. As mentioned before, since the regular expression itself is a language, naturally we have to follow its language constraints. Describing regular expressions in detail would take so much space that another book could be written, so I’d like you to read another book on this subject. I recommend “Mastering Regular Expressions” by Jeffrey E.F. Friedl.

Regular Expressions with %

As with strings, regular expressions also have a syntax for changing delimiters. In this case it is %r. Looking at some examples is enough to understand it.

%r(regexp)
%r[/\*.*?\*/]            # matches a C comment
%r("(?:[^"\\]+|\\.)*")   # matches a string in C
%r{reg#{1 + 1}exp}       # embedding a Ruby expression
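These patterns can be tried on a small piece of C-like text. A sketch; the sample text and the variable names are made up.

```ruby
# A made-up line of C-like source to match against.
source = 'x = 1; /* set x */ puts("say \"hi\"");'

c_comment = %r[/\*.*?\*/]
c_string  = %r("(?:[^"\\]+|\\.)*")

puts source[c_comment]            # the comment, delimiters included
puts source[c_string]             # the string literal, escapes included
puts %r{reg#{1 + 1}exp}.source    # the pattern text after expansion
```

With %r the slashes and double quotes inside the patterns need no escaping, which is the point of changing the delimiter.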

Arrays

A comma-separated list enclosed in brackets [] is an array literal.

[1, 2, 3]
['This', 'is', 'an', 'array', 'of', 'string']

[/regexp/, {'hash'=>3}, 4, 'string', ?\C-a]

lvar = $gvar = @ivar = @@cvar = nil
[lvar, $gvar, @ivar, @@cvar]
[Object.new(), Object.new(), Object.new()]

Ruby’s array (Array) is a list of arbitrary objects. From a syntactical standpoint, its characteristic is that arbitrary expressions can be elements. As mentioned earlier, an array of hashes of regular expressions can easily be made. Not just literals but also expressions which combine variables or method calls can be written straightforwardly.

Note that this is “an expression which generates an array object” as with the other literals.

i = 0
while i < 5
  p([1,2,3].id)    # Each time another object id is shown.
  i += 1
end

Word Arrays

When writing scripts one uses arrays of strings a lot, hence there is a special notation only for arrays of strings. That is %w. With an example it’s immediately obvious.

%w( alpha beta gamma delta )   # ['alpha','beta','gamma','delta']
%w( 月 火 水 木 金 土 日 )
%w( Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec )

There’s also %W where expressions can be embedded. It’s a feature implemented fairly recently.

n = 5
%w( list0 list#{n} )   # ['list0', 'list#{n}']
%W( list0 list#{n} )   # ['list0', 'list5']
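The difference between the two forms is the same as between '-strings and "-strings, which a short check makes visible.

```ruby
n = 5
p %w( list0 list#{n} )   # no expansion, like '-strings
p %W( list0 list#{n} )   # expansion, like "-strings
```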

The author hasn’t come up with a good use of %W yet.

Hashes

Hash tables are data structures which store a one-to-one relation between arbitrary objects. Written as follows, they are expressions which generate tables.

{ 'key' => 'value', 'key2' => 'value2' }
{ 3 => 0, 'string' => 5, ['array'] => 9 }
{ Object.new() => 3, Object.new() => 'string' }

# Of course we can put it in several lines.
{ 0 => 0,
  1 => 3,
  2 => 6 }

We explained hashes in detail in the third chapter “Names and Nametables”. They are fast lookup tables which allocate memory slots depending on the hash values. In Ruby grammar, both keys and values can be arbitrary expressions.

Furthermore, when used as an argument of a method call, the {...} can be omitted under a certain condition.

  some_method(arg, key => value, key2 => value2)
# some_method(arg, {key => value, key2 => value2}) # same as above

With this we can imitate named (keyword) arguments.

button.set_geometry('x' => 80, 'y' => '240')

Of course in this case set_geometry must accept a hash as input. Real keyword arguments would be transformed into parameter variables, but that is not the case here because this is just an “imitation”.
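What the receiving side of such a call might look like is sketched below. The Button class and its accessors are hypothetical; only the call form comes from the text.

```ruby
# Hypothetical class, named after the example above.
class Button
  attr_reader :x, :y

  # opts is an ordinary Hash; the "keywords" are just its keys.
  def set_geometry(opts)
    @x = opts['x']
    @y = opts['y']
  end
end

button = Button.new
button.set_geometry('x' => 80, 'y' => 240)       # braces omitted
p [button.x, button.y]
button.set_geometry({'x' => 10, 'y' => 20})      # the same call with braces
p [button.x, button.y]
```

The method has to pick the values out of the hash by hand, which is exactly why this is only an imitation.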

Ranges

Range literals are oddballs which don’t appear in most other languages. Here are some expressions which generate Range objects.

0..5          # from 0 to 5 containing 5
0...5         # from 0 to 5 not containing 5
1+2 .. 9+0    # from 3 to 9 containing 9
'a'..'z'      # strings from 'a' to 'z' containing 'z'

If there are two dots the last element is included. If there are three dots it is not included. Not only integers but also floats and strings can be made into ranges; even a range between arbitrary objects can be created if you attempt it. However, this is a specification of the Range class, which is the class of range objects (meaning it is a library matter), and not a matter of grammar. From the parser’s standpoint, it just allows arbitrary expressions to be joined with .. or .... If a range cannot be generated from the objects obtained by evaluating them, it is a runtime error.

By the way, because the precedence of .. and ... is quite low, sometimes it is interpreted in a surprising way.

1..5.to_a()   # 1..(5.to_a())

I think my tastes are fairly well aligned with Ruby’s grammar, but this specification alone I cannot come to like.

Symbols

In Part 1, we talked about symbols at length. A symbol is something that corresponds one-to-one to an arbitrary string. In Ruby symbols are expressed with a : in front.

:identifier
:abcde

These examples are pretty normal. Actually, besides them, all variable names and method names can become symbols with a : in front. Like this:

:$gvar
:@ivar
:@@cvar
:CONST

Moreover, though we haven’t talked about this yet, [] or attr= can be used as method names, so naturally they can also be used as symbols.

:[]
:attr=

When one uses these symbols as values in an array, it’ll look quite complicated.

Numerical Values

This is the least interesting. One possible thing I can introduce here is that, when writing a million,

1_000_000

as written above, we can use underscore delimiters in the middle. But even this isn’t particularly interesting. From here on in this book, we’ll completely forget about numerical values.

Methods

Let’s talk about the definition and calling of methods.

Definition and Calls

def some_method( arg )
  ....
end

class C
  def some_method( arg )
    ....
  end
end

Methods are defined with def. If they are defined at toplevel they become function style methods; inside a class they become methods of this class. To call a method which was defined in a class, one usually has to create an instance with new as shown below.

C.new().some_method(0)

The Return Value of Methods

The return value of a method is, if a return is executed in the middle, its value. Otherwise, it’s the value of the statement which was executed last.

def one()     # 1 is returned
  return 1
  999
end

def two()     # 2 is returned
  999
  2
end

def three()   # 3 is returned
  if true then
    3
  else
    999
  end
end

If the method body is empty, the value automatically becomes nil, and an expression without a value cannot be put at the end. Hence every method has a return value.

Optional Arguments

Optional arguments can also be defined. If the number of arguments doesn’t suffice, the parameters are automatically assigned their default values.

def some_method( arg = 9 )  # default value is 9
  p arg
end

some_method(0)    # 0 is shown.
some_method()     # The default value 9 is shown.

There can also be several optional arguments. But in that case they must all come at the end of the argument list. If elements in the middle of the list were optional, the correspondence between arguments and parameters would be very unclear.

def right_decl( arg1, arg2, darg1 = nil, darg2 = nil )
  ....
end

# This is not possible
def wrong_decl( arg, default = nil, arg2 )  # A middle argument cannot be optional
  ....
end

Omitting argument parentheses

In fact, the parentheses of a method call can be omitted.

puts 'Hello, World!'   # puts("Hello, World!")
obj = Object.new       # obj = Object.new()

In Python we can get the method object by leaving out the parentheses, but there is no such thing in Ruby.

If you’d like to, you can omit more parentheses.

  puts(File.basename fname)
# puts(File.basename(fname)) same as the above

If we like we can leave out even more.

  puts File.basename fname
# puts(File.basename(fname)) same as the above

However, recently this kind of “nested omission” became a cause of warnings. It’s likely that this will not pass anymore in Ruby 2.0.

Actually even the parentheses of the parameter definition can be omitted.

def some_method param1, param2, param3
end

def other_method    # without arguments ... we see this a lot
end

Parentheses are often left out in method calls, but leaving out parentheses in the definition is not very popular. However if there are no arguments, the parentheses are frequently omitted.

Arguments and Lists

Because arguments form a list of objects, there’s nothing odd about doing the converse: expanding a list (an array) into arguments, as in the following example.

def delegate(a, b, c)
  p(a, b, c)
end

list = [1, 2, 3]
delegate(*list)   # identical to delegate(1, 2, 3)

In this way we can distribute an array into arguments. Let’s call this device a * argument for now. Here we used a local variable for demonstration, but of course there is no limitation. We can also directly put a literal or a method call instead.

m(*[1,2,3])    # We could have written the expanded form in the first place...
m(*mcall())

The * argument can be used together with ordinary arguments, but the * argument must come last. Otherwise, the correspondence to parameter variables cannot be determined uniquely.

In the definition on the other hand we can handle the arguments in bulk when we put a * in front of the parameter variable.

def some_method( *args )
  p args
end

some_method()          # prints []
some_method(0)         # prints [0]
some_method(0, 1)      # prints [0,1]

The surplus arguments are gathered in an array. Only one * parameter can be declared. It must also come after the default arguments.

def some_method0( arg, *rest )
end
def some_method1( arg, darg = nil, *rest )
end
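To see how the arguments are distributed among the three kinds of parameters, some_method1 can be given a body that simply returns what it received. The body is an addition for illustration.

```ruby
# some_method1 from above, with a body that returns its parameters.
def some_method1(arg, darg = nil, *rest)
  [arg, darg, rest]
end

p some_method1(0)             # only the required argument is given
p some_method1(0, 1)          # the default is overridden
p some_method1(0, 1, 2, 3)    # the surplus goes into rest
p some_method1(*[0, 1, 2])    # an expanded array behaves the same
```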

If we combine list expansion and bulk reception together, the arguments of one method can be passed as a whole to another method. This might be the most practical use of the * parameter.

# a method which passes its arguments to other_method
def delegate(*args)
  other_method(*args)
end

def other_method(a, b, c)
  return a + b + c
end

delegate(0, 1, 2)      # same as other_method(0, 1, 2)
delegate(10, 20, 30)   # same as other_method(10, 20, 30)

Various Method Call Expressions

The fact that ‘method call’ is a single feature does not mean that it has a single representation. This is where so-called syntactic sugar comes in. In Ruby there is a ton of it, and it is really attractive for a person who has a fetish for parsers. For instance the examples below are all method calls.

1 + 2                      # 1.+(2)
a == b                     # a.==(b)
~/regexp/                  # /regexp/.~
obj.attr = val             # obj.attr=(val)
obj[i]                     # obj.[](i)
obj[k] = v                 # obj.[]=(k,v)
`cvs diff abstract.rd`     # Kernel.`('cvs diff abstract.rd')

It’s hard to believe until you get used to it, but attr=, []=, ` are (indeed) all method names. They can appear as names in a method definition and can also be used as symbols.

class C
  def []( index ) end
  def +( another ) end
end
p(:attr=)
p(:[]=)
p(:`)
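That the sugared and the explicit forms are the same call can be checked with a small class of our own. Vec2 and its contents are hypothetical, defined only for this sketch.

```ruby
# Hypothetical class used only to show operator methods.
class Vec2
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  def +(other)               # called for  v + w
    Vec2.new(@x + other.x, @y + other.y)
  end

  def ==(other)              # called for  v == w
    other.is_a?(Vec2) && @x == other.x && @y == other.y
  end

  def [](index)              # called for  v[i]
    index == 0 ? @x : @y
  end

  def -@                     # called for  -v  (unary minus)
    Vec2.new(-@x, -@y)
  end
end

v = Vec2.new(1, 2)
w = Vec2.new(3, 4)
p((v + w) == v.+(w))         # sugar and explicit call agree
p(v[0], v.[](1))
p((-v)[0])
p(1.+(2))                    # works for built-in objects as well
```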

As there are people who don’t like sweets, there are also many people who dislike syntactic sugar. Maybe they feel it is unfair when things which are essentially the same appear in disguise. (Why’s everyone so serious?)

Let’s see some more details.

Symbol Appendices

obj.name?
obj.name!

First a small thing. It’s just appending a ? or a !. Call and definition do not differ, so it’s not too painful. There are conventions for what these method names are used for, but there is no enforcement at the language level. It’s just a convention at the human level. This is probably an influence from Lisp, in which a great variety of characters can be used in procedure names.

Binary Operators

1 + 2    # 1.+(2)

Binary operators will be converted to a method call on the object on the left hand side. Here the method + of the object 1 is called. As listed below there are many of them. There are the general operators + and -, also the equivalence operator == and the spaceship operator <=> as in Perl, all sorts. They are listed in order of their precedence.

**
* / %
+ -
<< >>
&
| ^
> >= < <=
<=> == === =~

The symbols & and | are methods, but the double symbols && and || are built-in operators. Remember how it is in C.

Unary Operators

+2
-1.0
~/regexp/

These are the unary operators. There are only three of them: + - ~. + and - work as they look (by default). The operator ~ matches a string or a regular expression with the variable $_. With an integer it stands for bit inversion.

To distinguish the unary + from the binary + the method names for the unary operators are +@ and -@ respectively. Of course they can be called by just writing +n or -n.

((errata: + or - as the prefix of a numeric literal is actually scanned as a part of the literal. This is a kind of optimization.))

Attribute Assignment

obj.attr = val   # obj.attr=(val)

This is the attribute assignment fashion. The above will be translated into a call to the method attr=. When using this together with method calls whose parentheses are omitted, we can write code which looks like attribute access.

class C
  def i() @i end          # We can write the definition in one line
  def i=(n) @i = n end
end

c = C.new
c.i = 99
p c.i    # prints 99

However it will turn out both are method calls. They are similar to get/set property in Delphi or slot accessors in CLOS.

Besides, we cannot define a method such as obj.attr(arg)=, which can take another argument in the attribute assignment fashion.

Index Notation

obj[i]    # obj.[](i)

The above will be translated into a method call for []. Array and hash access are also implemented with this device.

obj[i] = val   # obj.[]=(i, val)

This is the index assignment fashion. It is translated into a call for a method named []=.

super

We relatively often have a situation where we want to add a little bit to the behaviour of an already existing method rather than replacing it. Here a mechanism to call a method of the superclass when overriding a method is required. In Ruby, that’s super.

class A
  def test
    puts 'in A'
  end
end
class B < A
  def test
    super   # invokes A#test
  end
end

Ruby’s super differs from the one in Java. This single word means “call the method with the same name in the superclass”. super is a reserved word.

When using super, be careful about the difference between super with no arguments and super whose arguments are omitted. The super whose arguments are omitted passes all the given parameter variables.

class A
  def test( *args )
    p args
  end
end

class B < A
  def test( a, b, c )
    # super with no arguments
    super()    # shows []

    # super with omitted arguments. Same result as super(a, b, c)
    super      # shows [1, 2, 3]
  end
end

B.new.test(1,2,3)

Visibility

In Ruby, even when calling the same method, it can or cannot be called depending on the location (meaning the object). This functionality is usually called “visibility” (whether it is visible). In Ruby, the below three types of methods can be defined.

public

private

protected

public methods can be called from anywhere in any form. private methods can only be called in a form “syntactically” without a receiver. In effect they can only be called by instances of the class in which they were defined and by instances of its subclasses. protected methods can only be called by instances of the defining class and its subclasses. They differ from private methods in that they can still be called from other instances of the same class.
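The three rules can be tried out with a small class. Account and its methods are hypothetical names chosen for this sketch.

```ruby
# Hypothetical class used only to show the three visibilities.
class Account
  def initialize(balance)
    @balance = balance
  end

  def richer_than?(other)     # public: callable from anywhere
    balance > other.balance   # protected: another instance may be the receiver
  end

  def report
    format_balance            # private: called without a receiver
  end

  protected

  def balance
    @balance
  end

  private

  def format_balance
    "balance: #{@balance}"
  end
end

a = Account.new(10)
b = Account.new(5)
p a.richer_than?(b)
p a.report
begin
  a.format_balance            # a receiver given from outside: rejected
rescue NoMethodError => e
  puts e.message
end
```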

The terms are the same as in C++ but the meaning is slightly different. Be careful.

Usually we control visibility as shown below.

class C
  public
  def a1() end   # becomes public
  def a2() end   # becomes public

  private
  def b1() end   # becomes private
  def b2() end   # becomes private

  protected
  def c1() end   # becomes protected
  def c2() end   # becomes protected
end

Here public, private and protected are method calls without parentheses. These aren’t even reserved words.

public and private can also be used with an argument to set the visibility of a particular method. But its mechanism is not interesting. We’ll leave this out.

Module functions

Given a module ‘M’. If there are two methods with the exact same content

M.method_name

M#method_name (Visibility is private)

then we call this a module function.

It is not apparent why this should be useful. But let’s look at the next example which is happily used.

Math.sin(5)       # If used for a few times this is more convenient

include Math
sin(5)            # If used more often this is more practical

It’s important that both functions have the same content. With a different self but with the same code the behavior should still be the same. Instance variables become extremely difficult to use. Hence such a method is very likely one in which only procedures are written (like sin). That’s why they are called module “functions”.
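One way to obtain such a pair of methods is Module#module_function, which is sketched below. The Geometry module and the Disk class are hypothetical.

```ruby
# Hypothetical module; module_function creates both forms at once.
module Geometry
  module_function

  def circle_area(radius)
    Math::PI * radius * radius
  end
end

p Geometry.circle_area(1)          # the M.method_name form

class Disk
  include Geometry

  def initialize(radius)
    @radius = radius
  end

  def area
    circle_area(@radius)           # the M#method_name form, receiver omitted
  end
end

p Disk.new(1).area
p Geometry.private_instance_methods.include?(:circle_area)
```

circle_area uses no instance variables, so it behaves the same whichever self it runs with.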

Iterators

Ruby’s iterators differ a bit from Java’s or C++’s iterator classes or the ‘Iterator’ design pattern. Precisely speaking, those iterators are called exterior iterators, whereas Ruby’s iterators are interior iterators. This is difficult to understand from the definition, so let’s explain it with a concrete example.

arr = [0,2,4,6,8]

This array is given and we want to access the elements in order. In C style we would write the following.

i = 0
while i < arr.length
  print arr[i]
  i += 1
end

Using an iterator we can write:

arr.each do |item|
  print item
end

Everything from each do to end is the call to an iterator method. More precisely each is the iterator method and between do and end is the iterator block. The part between the vertical bars is called the block parameters, which become variables to receive the parameters passed from the iterator method to the block.

Saying it a little abstractly, an iterator is something like a piece of code which has been cut out and passed. In our example the piece print item has been cut out and is passed to the each method. Then each takes all the elements of the array in order and passes them to the cut out piece of code.

We can also think the other way round. The other parts except print item are being cut out and enclosed into the each method.

i = 0
while i < arr.length
  print arr[i]
  i += 1
end

arr.each do |item|
  print item
end

Comparison with higher order functions

What comes closest in C to iterators are functions which receive function pointers, that is, higher order functions. But there are two points in which iterators in Ruby and higher order functions in C differ.

Firstly, Ruby iterators can only take one block. For instance we can’t do the following.

# Mistake. Several blocks cannot be passed.
array_of_array.each do |i|
  ....
end do |j|
  ....
end

Secondly, Ruby’s blocks can share local variables with the code outside.

lvar = 'ok'
[0,1,2].each do |i|
  p lvar    # Can access the local variable outside the block.
end

That’s where iterators are convenient.

But variables can only be shared with the outside. They cannot be shared with the inside of the iterator method (e.g. each). Putting it intuitively, only the variables in the place where the source code looks continuous are visible.

Block Local Variables

Local variables which are assigned inside a block stay local to that block, meaning they become block local variables. Let’s check it out.

[0].each do
  i = 0
  p i     # 0
end

For now, to create a block, we apply each on an array of length 1 (we can fully leave out the block parameter). In that block, the i variable is first assigned, meaning declared. This makes i block local.

It is said to be block local, so it should not be accessible from the outside. Let’s test it.

% ruby -e '
[0].each do
  i = 0
end
p i     # Here occurs an error.
'
-e:5: undefined local variable or method `i'
for #<Object:0x40163a9c> (NameError)

When we referenced a block local variable from outside the block, surely an error occurred. Without a doubt it stayed local to the block.

Iterators can also be nested repeatedly. Each time the new block creates another scope.

lvar = 0
[1].each do
  var1 = 1
  [2].each do
    var2 = 2
    [3].each do
      var3 = 3
      #  Here lvar, var1, var2, var3 can be seen
    end
    # Here lvar, var1, var2 can be seen
  end
  # Here lvar, var1 can be seen
end
# Here only lvar can be seen

There’s one point which you have to keep in mind. Differing from today’s major languages, Ruby’s block local variables don’t do shadowing. Shadowing means for instance in C that in the code below the two declared variables i are different.

{
    int i = 3;
    printf("%d\n", i);         /* 3 */
    {
        int i = 99;
        printf("%d\n", i);     /* 99 */
    }
    printf("%d\n", i);         /* 3 (restored) */
}

Inside the block the i inside overshadows the i outside. That’s why it’s called shadowing.

But what happens with block local variables of Ruby where there’s no shadowing? Let’s look at this example.

i = 0
p i           # 0
[0].each do
  i = 1
  p i         # 1
end
p i           # 1 the change is preserved

Even when we assign i inside the block, if there is the same name outside, it would be used. Therefore when we assign to the inside i, the value of the outside i would be changed. On this point there came many complaints: “This is error prone. Please do shadowing.” Each time there’s nearly flaming but till now no conclusion was reached.

The syntax of iterators

There are some smaller topics left.

First, there are two ways to write an iterator. One is the do ~ end as used above, the other one is the enclosing in braces. The two expressions below have exactly the same meaning.

arr.each do |i|
  puts i
end

arr.each {|i|    # The author likes a four space indentation for
    puts i       # an iterator with braces.
}

But grammatically the precedence is different. The braces bind much stronger than do ~ end.

m m do .... end    # m(m) do....end
m m { .... }       # m(m() {....})
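Which call actually receives the block can be made visible with a helper that reports block_given?. The method probe is made up for this sketch.

```ruby
# Hypothetical helper: reports whether it received a block.
def probe(label, inner = nil)
  got_block = block_given? ? "with block" : "without block"
  "#{label} #{got_block}" + (inner ? " <- #{inner}" : "")
end

do_end = probe "outer", probe("inner") do end   # block goes to the outer call
braces = probe "outer", probe("inner") { }      # block goes to the inner call
puts do_end
puts braces
```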

And iterators are definitely methods, so there are also iterators that take arguments.

re = /^\d/                    # regular expression to match a digit at the beginning of the line
$stdin.grep(re) do |line|     # look repeatedly for this regular expression
  ....
end

yield

Of course users can write their own iterators. Methods which have a yield in their definition text are iterators. Let’s try to write an iterator with the same effect as Array#each:

# adding the definition to the Array class
class Array
  def my_each
    i = 0
    while i < self.length
      yield self[i]
      i += 1
    end
  end
end

# this is the original each
[0,1,2,3,4].each do |i|
  p i
end

# my_each works the same
[0,1,2,3,4].my_each do |i|
  p i
end

yield calls the block. At this point control is passed to the block, and when the execution of the block finishes it returns back to the same location. Think of it as a peculiar kind of function call. When the present method does not have a block a runtime error will occur.

% ruby -e '[0,1,2].each'
-e:1:in `each': no block given (LocalJumpError)
        from -e:1
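Both behaviours, the round trip of control and the error without a block, can be seen with a user-defined iterator. count_up is a made-up method for this sketch.

```ruby
# Hypothetical iterator: yields 1..count to the block.
def count_up(count)
  i = 1
  while i <= count
    yield i          # control passes to the block, then returns here
    i += 1
  end
end

sum = 0
count_up(3) { |n| sum += n }
p sum

begin
  count_up(1)        # no block attached: yield has nowhere to go
rescue LocalJumpError => e
  puts "LocalJumpError: #{e.message}"
end
```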

Proc

I said that iterators are like cut out code which is passed as an argument. But we can even more directly turn code into an object and carry it around.

twice = Proc.new {|n| n * 2 }
p twice.call(9)   # 18 will be printed

In short, it is like a function. As might be expected from the fact that it is created with new, the return value of Proc.new is an instance of the Proc class.

Proc.new surely looks like an iterator and it is indeed so. It is an ordinary iterator. There’s only some mystic mechanism inside Proc.new which turns an iterator block into an object.

Besides there is a function style method lambda provided which has the same effect as Proc.new. Choose whatever suits you.

twice = lambda {|n| n * 2 }

Iterators and Proc

Why did we start talking all of a sudden about Proc? Because there is a deep relationship between iterators and Proc. In fact, iterator blocks and Proc objects are quite the same thing. That’s why one can be transformed into the other.

First, to turn an iterator block into a Proc object one has to put an & in front of the parameter name.

def print_block( &block )
  p block
end

print_block() do end   # Shows something like <Proc:0x40155884>
print_block()          # Without a block nil is printed

With an & in front of the argument name, the block is transformed to a Proc object and assigned to the variable. If the method is not an iterator (there’s no block attached) nil is assigned.

And in the other direction, if we want to pass a Proc to an iterator we also use &.

block = Proc.new {|i| p i }
[0,1,2].each(&block)

This code means exactly the same as the code below.

[0,1,2].each {|i| p i }

If we combine these two, we can delegate an iterator block to a method somewhere else.

def each_item( &block )
  [0,1,2].each(&block)
end

each_item do |i|    # same as [0,1,2].each do |i|
  p i
end

Expressions

“Expressions” in Ruby are things which can be combined with others to create other expressions or statements. For instance a method call can be another method call’s argument, so it is an expression. The same goes for literals. But literals and method calls are not always combinations of elements. On the contrary, the “expressions” which I’m going to introduce always consist of some elements.

if

We probably do not need to explain the if expression. If the conditional expression is true, the body is executed. As explained in Part 1, every object except nil and false is true in Ruby.

if cond0 then
  ....
elsif cond1 then
  ....
elsif cond2 then
  ....
else
  ....
end

The elsif and else clauses can be omitted. Each then as well. But there are some finer requirements concerning then. For this kind of thing, looking at some examples is the best way to understand. Here the only thing I’d say is that the code below is valid.

# 1
if cond then ..... end

# 2
if cond; .... end

# 3
if cond then; .... end

# 4
if cond
then .... end

# 5
if cond
then
  ....
end

And in Ruby, if is an expression, so the entire if expression has a value. It is the value of the body whose condition is met. For example, if the condition of the first if is true, the value is that of its body.

p(if true  then 1 else 2 end)   #=> 1
p(if false then 1 else 2 end)   #=> 2
p(if false then 1 elsif true then 2 else 3 end)   #=> 2

If there’s no match, or the matched clause is empty, the value would be nil.

p(if false then 1 end)    #=> nil
p(if true  then   end)    #=> nil

unless

An if with a negated condition is an unless. The following two expressions have the same meaning.

unless cond then
  ....
end

if not (cond) then
  ....
end

unless can also have an else clause attached, but an elsif cannot be attached. Needless to say, then can be omitted.

unless also has a value, and the way it is determined is exactly the same as for if. That is, the value of the whole is the value of the body of the matched clause. If there’s no match or the matched clause is empty, the value would be nil.

and && or ||

The most likely use of and is probably as a boolean operation. For instance in the conditional expression of an if.

if cond1 and cond2
  puts 'ok'
end

But as in Perl, sh or Lisp, it can also be used as a conditional branch expression. The two following expressions have the same meaning.

if invalid?(key)
  return nil
end

invalid?(key) and return nil

&& and and have the same meaning. What differs is the binding order.

method arg0 &&  arg1    # method(arg0 && arg1)
method arg0 and arg1    # method(arg0) and arg1

Basically the symbolic operator creates an expression which can be an argument (arg). The alphabetical operator creates an expression which cannot become an argument (expr).
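The difference in binding order can be observed directly. In the sketch below, note is a hypothetical method that records the argument it actually receives. The constant name SEEN is likewise made up for the example.

```ruby
SEEN = []

# note is a hypothetical method that records and returns its argument.
def note(value)
  SEEN << value
  value
end

r1 = (note 1 && 2)    # parsed as note(1 && 2), so note receives 2
r2 = (note 1 and 2)   # parsed as note(1) and 2, so note receives 1

p SEEN   # => [2, 1]
p r1     # => 2
p r2     # => 2
```

Both results are 2, but the method saw different arguments, which is exactly the arg versus expr distinction described above.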

As for and, if the evaluation of the left hand side is true, the right hand side will also be evaluated.

On the other hand, or is the opposite of and. If the evaluation of the left hand side is false, the right hand side will also be evaluated.

valid?(key) or return nil

or and || have the same relationship as && and and. Only the precedence is different.

The Conditional Operator

There is a conditional operator similar to C:

cond ? iftrue : iffalse

The space between the symbols is important. If they bump together the following weirdness happens.

cond?iftrue:iffalse   # cond?(iftrue(:iffalse))

The value of the conditional operator is the value of the last executed expression. Either the value of the true side or the value of the false side.

while until

Here’s a while expression.

while cond do
  ....
end

This is the simplest loop syntax. As long as cond is true the body is executed. The do can be omitted.

until io_ready?(id) do
  sleep 0.5
end

until creates a loop whose condition is the opposite. As long as the condition is false the body is executed. The do can be omitted.

Naturally there is also jump syntax to exit a loop. break as in C/C++/Java is also break, but continue is next. Perhaps next has come from Perl.

i = 1         # starting at 0 would never leave the loop, since 0 * 2 is still 0
while true
  if i > 10
    break     # exit the loop
  elsif i % 2 == 0
    i *= 2
    next      # next loop iteration
  end
  i += 1
end

And there is another Perlism: the redo.

while cond
  # (A)
  ....
  redo
  ....
end

It will return to (A) and repeat from there. What differs from next is that it does not check the condition.
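A small sketch makes the difference from next visible. The variable names are invented for the example. The loop condition is i < 3, yet the body runs once more with i equal to 3 because redo jumps back without re-checking it.

```ruby
i = 0
log = []
retried = false

while i < 3
  log << i          # (A) the body restarts from here on redo
  i += 1
  if i == 3 && !retried
    retried = true
    redo            # back to (A) without testing i < 3
  end
end

p log   # => [0, 1, 2, 3]
```

With next in place of redo, the condition would be tested first and the log would stop at [0, 1, 2].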

I might come into the world top 100 if the amount of Ruby programs written were counted, but I haven’t used redo yet. It does not seem to be necessary after all, because I’ve lived happily without it.

case

A special form of the if expression. It performs branching on a series of conditions. The following two expressions are identical in meaning.

case value
when cond1 then
  ....
when cond2 then
  ....
when cond3, cond4 then
  ....
else
  ....
end

if cond1 === value
  ....
elsif cond2 === value
  ....
elsif cond3 === value or cond4 === value
  ....
else
  ....
end

The threefold equals === is, like ==, actually a method call. Notice that the receiver is the object on the left hand side. Concretely, if it is the === of a Range, it checks whether the value lies within the range. If it is a Class, it tests whether the value is an instance of it. If it is a regular expression, it tests whether the value matches. And so on. Since case has many grammatical elements, listing them all would be tedious, thus we will not cover them in this book.
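The following sketch shows a few receivers of === at work inside a case. The method name classify and the returned symbols are hypothetical. Only Range#===, Module#=== and Regexp#=== are standard Ruby.

```ruby
# classify is a hypothetical method; each when calls cond === value.
def classify(value)
  case value
  when 0..9      then :small_number    # Range#===  : inside the range?
  when Integer   then :other_integer   # Module#=== : instance of the class?
  when /\A\d+\z/ then :digit_string    # Regexp#=== : does the string match?
  when "a", "b"  then :a_or_b          # several conditions in one when
  else                :unknown
  end
end

p classify(5)       # => :small_number
p classify(42)      # => :other_integer
p classify("123")   # => :digit_string
p classify("b")     # => :a_or_b
p classify(nil)     # => :unknown
```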

Exceptions

This is a control structure which can pass over method boundaries and transmit errors. Readers who are acquainted with C++ or Java will know about exceptions. Ruby exceptions are basically the same.

In Ruby, exceptions are raised with the function-style method raise. raise is not a reserved word.

raise ArgumentError, "wrong number of argument"

In Ruby, exceptions are instances of the Exception class and its subclasses. This form takes an exception class as its first argument and an error message as its second argument. In the above case an instance of ArgumentError is created and “thrown”. The exception object abandons the part after the raise and starts to return upwards through the method call stack.

def raise_exception
  raise ArgumentError, "wrong number of argument"
  # the code after the exception will not be executed
  puts 'after raise'
end
raise_exception()

If nothing blocks the exception, it will move on and on and finally it will reach the top level. When there’s no place to return to any more, ruby prints a message and ends with a non-zero exit code.

% ruby raise.rb
raise.rb:2:in `raise_exception': wrong number of argument (ArgumentError)
        from raise.rb:7

However, an exit would be sufficient for this; for an exception there should be a way to set handlers. In Ruby, begin~rescue~end is used for this. It resembles the try~catch in C++ and Java.

def raise_exception
  raise ArgumentError, "wrong number of argument"
end

begin
  raise_exception()
rescue ArgumentError => err then
  puts 'exception catched'
  p err
end

rescue is a control structure which captures exceptions; it catches exception objects of the specified class and its subclasses. In the above example, an instance of ArgumentError comes flying into the place where ArgumentError is targeted, so it matches this rescue. By => err the exception object is assigned to the local variable err, and after that the rescue part is executed.

% ruby rescue.rb
exception catched
#<ArgumentError: wrong number of argument>

When an exception is rescued, execution goes through the rescue and then continues with the subsequent code as if nothing happened, but we can also make it retry from the begin. To do so, retry is used.

begin    # the place to return
  ....
rescue ArgumentError => err then
  retry  # retry your life
end
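A retry needs some state that changes between attempts, otherwise it loops forever. The sketch below uses a made-up attempts counter and a made-up error message to show the begin body being re-entered twice.

```ruby
attempts = 0

result = begin
  attempts += 1
  raise IOError, "port not ready" if attempts < 3   # fail the first two times
  "ready after #{attempts} attempts"
rescue IOError
  retry   # jump back to the begin
end

p result   # => "ready after 3 attempts"
```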

We can omit the => err and the then after rescue. We can also leave out the exception class. In this case, it means the same as when the StandardError class is specified.

If we want to catch more exception classes, we can just write them in line. When we want to handle different errors differently, we can specify several rescue clauses.

begin
  raise IOError, 'port not ready'
rescue ArgumentError, TypeError
rescue IOError
rescue NameError
end

When written in this way, a rescue clause that matches the exception class is searched for in order from the top. Only the matched clause will be executed. For instance, only the clause of IOError will be executed in the above case.

On the other hand, when there is an else clause, it is executed only when there is no exception.

begin
  nil    # Of course no error will occur here
rescue ArgumentError
  # This part will not be executed
else
  # This part will be executed
end

Moreover an ensure clause will be executed in every case: when there is no exception, when there is an exception, rescued or not.

begin
  f = File.open('/etc/passwd')
  # do stuff
ensure   # this part will be executed anyway
  f.close
end

By the way, this begin expression also has a value. The value of the whole begin~end expression is the value of the part which was executed last among the begin/rescue/else clauses. It means the last statement of the clauses aside from ensure. The reason why the ensure is not counted is probably because ensure is usually used for cleanup (thus it is not a main line).
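The rule for the value of a begin expression can be verified with three small cases. All variable names and symbols below are invented for this sketch.

```ruby
log = []

# The ensure clause runs, but its value is not the value of the expression.
v1 = begin
  :body
ensure
  log << :cleanup
end

# When an exception is rescued, the rescue clause supplies the value.
v2 = begin
  raise ArgumentError, "boom"
rescue ArgumentError
  :rescued
end

# Without an exception, the else clause runs last and supplies the value.
v3 = begin
  :body
rescue ArgumentError
  :rescued
else
  :no_error
end

p v1    # => :body
p v2    # => :rescued
p v3    # => :no_error
p log   # => [:cleanup]
```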

Variables and Constants

Referring to a variable or a constant. The value is the object the variable points to. We already talked in too much detail about the various behaviors.

lvar
@ivar
@@cvar
CONST
$gvar

I want to add one more thing. Among the variables starting with $, there are special kinds. They are not necessarily global variables and some have strange names.

First the Perlish variables $_ and $~. $_ saves the return value of gets and other methods, $~ contains the last match of a regular expression. They are incredible variables which are local variables and simultaneously thread local variables.

And the $! to hold the exception object when an error has occurred, the $? to hold the status of a child process, the $SAFE to represent the security level: they are all thread local.

Assignment

Variable assignments are all performed by =. All variables are typeless. What is saved is a reference to an object. In the implementation, it is a VALUE (pointer).

var = 1
obj = Object.new
@ivar = 'string'
@@cvar = ['array']
PI = 3.1415926535
$gvar = {'key' => 'value'}

However, as mentioned earlier, obj.attr = val is not an assignment but a method call.

Self Assignment

var += 1

This syntax is also in C/C++/Java. In Ruby,

var = var + 1

it is a shortcut of this code. Differing from C, the Ruby + is a method and thus part of the library. In C, the whole meaning of += is built into the language processor itself. And in C++, += and *= can be wholly overridden, but we cannot do this in Ruby. In Ruby += is always defined as an operation of the combination of + and assignment.

We can also combine self assignment and an attribute-access-flavor method. The result looks even more like an attribute.

class C
  def i() @i end          # A method definition can be written in one line.
  def i=(n) @i = n end
end

obj = C.new
obj.i = 1
obj.i += 2    # obj.i = obj.i + 2
p obj.i       # 3

If there is += there might also be ++, but this is not the case. Why is that so? In Ruby assignment is dealt with on the language level. But on the other hand methods are in the library. Keeping these two, the world of variables and the world of objects, strictly apart is an important peculiarity of Ruby. If ++ were introduced the separation might easily be broken. That’s why there’s no ++.

Some people don’t want to go without the brevity of ++. It has been proposed again and again on the mailing list but was always turned down. I am also in favor of ++, but not so much that I can’t do without it, and I have not felt much need of ++ in Ruby in the first place, so I’ve kept silent and decided to forget about it.

defined?

defined? is a syntax of a quite different color in Ruby. It tells whether an expression value is “defined” or not at runtime.

var = 1
defined?(var)   #=> "local-variable" (a true value)

In other words, it tells whether a value can be obtained from the expression received as its argument (is it okay to call it so?) when the expression is evaluated. That said, of course you can’t write an expression causing a parse error, and it cannot detect whether the expression contains a method call which would raise an error.
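A few probes show what defined? returns in practice: a descriptive string when a value can be obtained, nil otherwise. The names no_such_name, @unset and explode are hypothetical and exist only for this sketch.

```ruby
var = 1

# explode is a hypothetical method that would raise if it were called.
def explode
  raise "boom"
end

results = {
  local:     defined?(var),            # a local variable that exists
  missing:   defined?(no_such_name),   # neither a variable nor a method
  constant:  defined?(String),
  method:    defined?(puts),
  ivar:      defined?(@unset),         # an instance variable never assigned
  raising:   defined?(explode),        # not evaluated, so nothing is raised
}

p results[:local]      # => "local-variable"
p results[:missing]    # => nil
p results[:constant]   # => "constant"
p results[:method]     # => "method"
p results[:ivar]       # => nil
p results[:raising]    # => "method"
```

The last probe illustrates the limitation mentioned above: the method is reported as defined even though calling it would raise.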

I would have loved to tell you more about defined? but it will not appear again in this book. What a pity.

Statements

A statement is what basically cannot be combined with the other syntaxes; in other words, statements are lined up vertically.

But it does not mean there’s no evaluated value. For instance there are return values for class definition statements and method definition statements. However, relying on them is rarely recommended and isn’t useful, so you’d better regard them lightly. Here we also skip over the value of each statement.

The Ending of a statement

Up to now we just said “For now one line’s one statement”. But Ruby’s statement endings aren’t that straightforward.

First, a statement can be ended explicitly with a semicolon as in C. Of course then we can write two and more statements in one line.

puts 'Hello, World!'; puts 'Hello, World once more!'

On the other hand, when the expression apparently continues, such as just after opened parentheses, dyadic operators, or a comma, the statement continues automatically.

# 1 + 3 * method(6, 7 + 8)
1 +
  3 *
     method(
            6,
            7 + 8)

But it’s also totally no problem to use a backslash to explicitly indicate the continuation.

p 1 + \
  2

The Modifiers if and unless

The if modifier is an irregular version of the normal if. The following two programs mean exactly the same.

on_true() if cond

if cond
  on_true()
end

The unless is the negative version. Guard statements (statements which exclude exceptional conditions) can be conveniently written with it.

The Modifiers while and until

while and until also have a postfix notation.

process() while have_content?
sleep(1) until ready?

Combining this with begin and end gives a do-while loop like in C.

begin
  res = get_response(id)
end while need_continue?(res)
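What makes the begin~end form a do-while is that the body runs once before the condition is tested. The sketch below contrasts it with the plain modifier. The counter names are made up.

```ruby
# With begin ... end, the body runs once before the condition is tested.
runs_with_begin = 0
begin
  runs_with_begin += 1
end while false

# With a plain modifier, the condition is tested first.
runs_with_modifier = 0
runs_with_modifier += 1 while false

p runs_with_begin      # => 1
p runs_with_modifier   # => 0
```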

Class Definition

class C < SuperClass
  ....
end

Defines the class C which inherits from SuperClass.

We talked quite extensively about classes in Part 1. This statement is executed, the class being defined becomes self within the statement, and arbitrary expressions can be written within. Class definitions can be nested. They form the foundation of the Ruby execution image.

Method Definition

def m(arg)
end

I’ve already written about method definition and won’t add more. This section is put here to make it clear that they also belong to statements.

Singleton method definition

We already talked a lot about singleton methods in Part 1. They do not belong to classes but to objects; in fact, they belong to singleton classes. We define singleton methods by putting the receiver in front of the method name. Parameter declaration is done the same way as with ordinary methods.

def obj.some_method
end

def obj.some_method2( arg1, arg2, darg = nil, *rest, &block )
end

Definition of Singleton methods

class << obj
  ....
end

From the viewpoint of purpose, it is the statement to define several singleton methods in a bundle. From the viewpoint of means, it is the statement in which the singleton class of obj becomes self when executed. In all of the Ruby program, this is the only place where a singleton class is exposed.

class << obj
  p self   #=> #<Class:#<Object:0x40156fcc>>   # Singleton Class 「(obj)」
  def a() end   # def obj.a
  def b() end   # def obj.b
end

Multiple Assignment

With a multiple assignment, several assignments can be done all at once. The following is the simplest case:

a, b, c = 1, 2, 3

It’s exactly the same as the following.

a = 1
b = 2
c = 3

Just being concise is not interesting. In fact, it only becomes fun when an array gets mixed in.

a, b, c = [1, 2, 3]

This also has the same result as the above. Furthermore, the right hand side does not need to be a grammatical list or a literal. It can also be a variable or a method call.

tmp = [1, 2, 3]
a, b, c = tmp
ret1, ret2 = some_method()   # some_method might probably return several values

Precisely speaking it is as follows. Here we’ll assume obj is (the object of) the value of the right hand side,

1. obj if it is an array
2. if its to_ary method is defined, it is used to convert obj to an array.
3. [obj]

Decide the right-hand side by following this procedure and perform assignments. It means the evaluation of the right-hand side and the operation of assignments are totally independent from each other.
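The three cases of this procedure can each be exercised once. Pair is a hypothetical class written only to have a to_ary method. The variable names are arbitrary.

```ruby
# Pair is a hypothetical class whose only purpose is to define to_ary.
class Pair
  def initialize(left, right)
    @left  = left
    @right = right
  end

  def to_ary
    [@left, @right]
  end
end

a, b = [1, 2]            # case 1: already an array
c, d = Pair.new(3, 4)    # case 2: converted through to_ary
e, f = 5                 # case 3: treated as [5]

p [a, b]   # => [1, 2]
p [c, d]   # => [3, 4]
p [e, f]   # => [5, nil]
```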

And it goes on: both the left and right hand side can be infinitely nested.

a, (b, c, d) = [1, [2, 3, 4]]
a, (b, (c, d)) = [1, [2, [3, 4]]]
(a, b), (c, d) = [[1, 2], [3, 4]]

As the result of the execution of this program, each line will give a=1 b=2 c=3 d=4.

And it goes on. The left hand side can be index or attribute assignments.

i = 0
arr = []
arr[i], arr[i+1], arr[i+2] = 0, 2, 4
p arr    # [0, 2, 4]

obj.attr0, obj.attr1, obj.attr2 = "a", "b", "c"

And like with method parameters, * can be used to receive in a bundle.

first, *rest = 0, 1, 2, 3, 4
p first  # 0
p rest   # [1, 2, 3, 4]

When all of them are used all at once, it’s extremely confusing.

Block parameter and multiple assignment

We brushed over block parameters when we were talking about iterators. But there is a deep relationship between them and multiple assignment. For instance in the following case.

array.each do |i|
  ....
end

Every time the block is called, the yielded arguments are multi-assigned to i. Here there’s only one variable on the left hand side, so it does not look like multiple assignment. But if there are two or more variables, it looks a little more like it. For instance, Hash#each is a repeated operation on the pairs of keys and values, so usually we call it like this:

hash.each do |key, value|
  ....
end

In this case, an array consisting of a key and a value is yielded from the hash each time.

Hence we can also do the following by using nested multiple assignment.

# [[key,value],index] are yielded
hash.each_with_index do |(key, value), index|
  ....
end

alias

class C
  alias new orig
end

Defines another method new with the same body as the already defined method orig. Aliases are similar to hard links in a Unix file system. They are a means of assigning multiple names to one method body. To say this inversely, because the names themselves are independent of each other, even if one method name is overwritten by a subclass method, the other one still remains with the same behavior.
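The independence of the two names can be seen by overriding only one of them in a subclass. The class and method names below (Base, Sub, orig, other_name) are hypothetical.

```ruby
class Base
  def orig
    "base body"
  end
  alias other_name orig   # a second name for the same method body
end

class Sub < Base
  def orig                # overrides only the name orig
    "sub body"
  end
end

p Sub.new.orig         # => "sub body"
p Sub.new.other_name   # => "base body"
```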

undef

class C
  undef method_name
end

Prohibits the calling of C#method_name. It’s not just a simple revoking of the definition. Even if there were a method in the superclass it would also be forbidden. In other words the method is exchanged for a sign which says “This method must not be called”.

undef is extremely powerful; once it is set it cannot be deleted from the Ruby level, because it is used to cover up contradictions in the internal structure. The only remaining measure is inheriting and defining a method in the lower class. Even in that case, calling super would cause an error.

The method which corresponds to unlink in a file system is Module#remove_method. While defining a class, self refers to that class, so we can call it as follows. (Remember that Class is a subclass of Module.)

class C
  remove_method(:method_name)
end

But even with remove_method one cannot cancel the undef. It’s because the sign put up by undef prohibits any kind of search.

((errata: It can be redefined by using def))
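The contrast between remove_method and undef, and the errata about def, can be checked together. The classes Parent, Removed and Undefined are hypothetical names chosen for this sketch.

```ruby
class Parent
  def greet
    "parent"
  end
end

class Removed < Parent
  def greet
    "child"
  end
  remove_method :greet   # unlinks only Removed#greet
end

class Undefined < Parent
  undef greet            # puts up the "must not be called" sign
end

p Removed.new.greet      # => "parent"

undef_error = begin
  Undefined.new.greet
rescue NoMethodError => e
  e
end
p undef_error.class      # => NoMethodError

class Undefined
  def greet              # as the errata says, def makes it callable again
    "redefined"
  end
end
p Undefined.new.greet    # => "redefined"
```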

Some more small topics

Comments

# examples of bad comments.
1 + 1            # compute 1+1.
alias my_id id   # my_id is an alias of id.

From a # to the end of line is a comment. It doesn’t have a meaning for the program.

Embedded documents

=begin
This is an embedded document.
It's so called because it is embedded in the program.
Plain and simple.
=end

An embedded document stretches from an =begin outside a string at the beginning of a line to an =end. The interior can be arbitrary. The program ignores it as a mere comment.

Multi-byte strings

When the global variable $KCODE is set to either EUC, SJIS or UTF8, strings encoded in euc-jp, shift_jis, or utf8 respectively can be used in string data.

And if the option -Ke, -Ks or -Ku is given to the ruby command, multibyte strings can be used within the Ruby code. String literals, regular expressions and even operator names can contain multibyte characters. Hence it is possible to do something like this:

def 表示( arg )
  puts arg
end

表示 'にほんご'

But I really cannot recommend doing things like that.

The original work is Copyright © 2002 - 2004 Minero AOKI.
Translated by Vincent ISAMBART and Clifford Escobar CAOILE
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License

Translated by Vincent ISAMBART & ocha-

Chapter 9: yacc crash course

Outline

Parser and scanner

How to write parsers for programming languages has been an active area of research for a long time, and there is a quite firmly established tactic for doing it. If we limit ourselves to a grammar that is not too strange (or ambiguous), we can solve this problem by following this method.

The first part consists in splitting a string into a list of words (or tokens). This is called a scanner or lexer. The term “lexical analyzer” is also used, but is too complicated to say so we’ll use the name scanner.

When speaking about scanners, common sense first says “there are generally spaces at the end of a word”. And in practice, most programming languages were made like this, because it’s the easiest way.

There can also be exceptions. For example, in old Fortran, whitespace did not have any meaning. This means a whitespace did not end a word, and you could put spaces in the name of a variable. However that made parsing very complicated, so the compiler vendors, one by one, started ignoring that standard. Finally Fortran 90 followed this trend and made the fact that whitespace has an impact the standard.

By the way, it seems the reason whitespace had no meaning in Fortran 77 was that when writing programs on punch cards it was easy to make errors in the number of spaces.

List of symbols

I said that the scanner spits out a list of words (tokens), but, to be exact, what the scanner creates is a list of “symbols”, not words.

What are symbols? Let’s take numbers as an example. In a programming language, 1, 2, 3, 99 are all “numbers”. They can all be handled the same way by the grammar. Where we can write 1, we can also write 2 or 3. That’s why the parser does not need to handle them in different ways. For numbers, “number” is enough.

“number”, “identifier” and others can be grouped together as “symbol”. But be careful not to mix this with the Symbol class.

The scanner first splits the string into words and determines what these symbols are. For example, NUMBER or DIGIT for numbers, IDENTIFIER for names like “name”, IF for the reserved word if. These symbols are then given to the next phase.
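To make the idea concrete, here is a toy scanner, written as a sketch rather than as anything resembling ruby’s real one. The RESERVED table, the method names and the decision to let single characters such as '=' stand for themselves are all assumptions made for this illustration.

```ruby
# Hypothetical table of reserved words and the symbols they map to.
RESERVED = { "if" => :IF, "then" => :THEN, "else" => :ELSE, "end" => :END }

# Decide which symbol a word belongs to.
def symbol_for(word)
  if RESERVED.key?(word)
    RESERVED[word]
  elsif word.match?(/\A\d+\z/)
    :NUMBER
  elsif word.match?(/\A[A-Za-z_]\w*\z/)
    :IDENTIFIER
  else
    word   # a single character such as '=' stands for itself
  end
end

# Split the source into words and pair each word with its symbol.
def scan(source)
  source.scan(/\d+|[A-Za-z_]\w*|\S/).map { |word| [symbol_for(word), word] }
end

symbols = scan("if x then y = 42 end")
p symbols.map(&:first)
# => [:IF, :IDENTIFIER, :THEN, :IDENTIFIER, "=", :NUMBER, :END]
```

Note that x and y both become IDENTIFIER, and 42 becomes NUMBER: the grammar only needs the kind of word, while the word itself is carried along as a value.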

Parser generator

The list of words and symbols spat out by the scanner is going to be used to form a tree. This tree is called a syntax tree.

The name “parser” is also sometimes used to include both the scanner and the creation of the syntax tree. However, we will use the narrow sense of “parser”, the creation of the syntax tree. How does this parser make a tree from the list of symbols? In other words, on what should we focus to find the tree corresponding to a piece of code?

The first way is to focus on the meaning of the words. For example, let’s suppose we find the word var. If the definition of the local variable var has been found before this, we’ll understand it’s the reading of a local variable.

Another way is to only focus on what we see. For example, if after an identifier comes a ‘=’, we’ll understand it’s an assignment. If the reserved word if appears, we’ll understand it’s the start of an if statement.

The latter method, focusing only on what we see, is the current trend. In other words the language must be designed to be analyzed just by looking at the list of symbols. This choice was made because this way is simpler, can be more easily generalized and can therefore be automated using tools. These tools are called parser generators.

The most used parser generator under UNIX is yacc. Like many others, ruby’s parser is written using yacc. The input file for this tool is parse.y. That’s why to be able to read ruby’s parser, we need to understand yacc to some extent. (Note: Starting from 1.9, ruby requires bison instead of yacc. However, bison is mainly yacc with additional functionality, so this does not diminish the interest of this chapter.)

This chapter will be a simple presentation of yacc to be able to understand parse.y, and therefore we will limit ourselves to what’s needed to read parse.y. If you want to know more about parsers and parser generators, I recommend you a book I wrote called “Rubyを256倍使うための本 無道編” (The book to use 256 times more of Ruby - Unreasonable book). I do not recommend it because I wrote it, but because in this field it’s the easiest book to understand. And besides it’s cheap so the stakes will be low.

Nevertheless, if you would like a book from someone else (or can’t read Japanese), I recommend O’Reilly’s “lex & yacc programming” by John R. Levine, Tony Mason and Doug Brown. And if you are still not satisfied, you can also read “Compilers” (also known as the “dragon book” because of the dragon on its cover) by Alfred V. Aho, Ravi Sethi and Jeffrey D. Ullman.

Grammar

Grammar file

The input file for yacc is called the “grammar file”, as it’s the file where the grammar is written. The convention is to name this grammar file *.y. It will be given to yacc, which will generate C source code. This file can then be compiled as usual (figure 1 shows the full process).

Figure 1: File dependencies

The output file name is always y.tab.c and can’t be changed. Recent versions of yacc usually allow changing it on the command line, but for compatibility it was safer to keep y.tab.c. By the way, it seems the tab of y.tab.c comes from table, as lots of huge tables are defined in it. It’s good to have a look at the file once.

The grammar file’s content has the following form:

▼ General form of the grammar file

%{
Header
%}
%union ....
%token ....
%type ....

%%
Rules part
%%
User defined part

yacc’s input file is first divided into 3 parts by %%. The first part is called the definition part, and has a lot of definitions and setups. Between %{ and %} we can write anything we want in C, like for example necessary macros. After that, the instructions starting with % are special yacc instructions. Every time we use one, we’ll explain it.

The middle part of the file is called the rules part, and is the most essential part for yacc. It’s where the grammar we want to parse is written. We’ll explain it in detail in the next section.

The last part of the file, the user defined part, can be used freely by the user. yacc just copies this part verbatim into the output file. It’s used for example to put auxiliary routines needed by the parser.

What does yacc do.

What yacc takes care of is mainly this rules part in the middle. yacc takes the grammar written there and uses it to make a function called yyparse(). It’s the parser, in the narrow sense of the word.

In the narrow sense, so it means a scanner is needed. However, yacc won’t take care of it; it must be done by the user. The scanner is the function named yylex().

Even though yacc creates yyparse(), it only takes care of its core part. The “actions” we’ll mention later are out of its scope. You may think the part done by yacc is too small, but that’s not the case. This “core part” is so important that yacc has survived to this day even though we keep complaining about it.

But what on earth is this core part? That’s what we’re going to see.

BNF

When we want to write a parser in C, its code will be “cut the string this way, make this an if statement…” When using parser generators, we say the opposite, that is “I would like to parse this grammar.” Doing this creates for us a parser to handle the grammar. This means telling the specification gives us the implementation. That’s the convenient point of yacc.

But how can we tell the specification? With yacc, the method of description used is BNF (Backus-Naur Form). Let’s look at a very simple example.

if_stmt: IF expr THEN stmt END

Let’s see separately what’s at the left and at the right of the “:”. The part on the left side, if_stmt, is equal to the right part… is what I mean here. In other words, I’m saying that:

if_stmt and IF expr THEN stmt END are equivalent.

Here, if_stmt, IF, expr… are all “symbols”. expr is the abbreviation of expression, stmt of statement. It must be for sure the declaration of the if statement.

One definition is called a rule. The part at the left of “:” is called the left side and the right part is called the right side. This is quite easy to remember.

But something is missing. We do not want an if statement without being able to use else. And even if we could write else, having to always write the else even when it’s useless would be cumbersome. In this case we could do the following:

if_stmt: IF expr THEN stmt END
       | IF expr THEN stmt ELSE stmt END

“|” means “or”.

if_stmt is either “IF expr THEN stmt END” or “IF expr THEN stmt ELSE stmt END”.

That’s it.

Here I would like you to pay attention to the split done with |. With just this, one more rule is added. In fact, punctuating with | is just a shorter way to repeat the left side. The previous example has exactly the same meaning as the following:

if_stmt: IF expr THEN stmt END
if_stmt: IF expr THEN stmt ELSE stmt END

This means two rules are defined in the example.

This is not enough to complete the definition of the if statement. That’s because the symbols expr and stmt are not sent by the scanner, so their rules must be defined. To be closer to Ruby, let’s boldly add some rules.

stmt   : if_stmt
       | IDENTIFIER '=' expr   /* assignment */
       | expr

if_stmt: IF expr THEN stmt END
       | IF expr THEN stmt ELSE stmt END

expr   : IDENTIFIER       /* reading a variable */
       | NUMBER           /* integer constant */
       | funcall          /* FUNction CALL */

funcall: IDENTIFIER '(' args ')'

args   : expr             /* only one parameter */

I used two new elements. First, comments of the same form as in C, and second, characters expressed using '='. This '=' is also of course a symbol. Symbols like “=” are different from numbers as there is only one variety for them. That’s why for such symbols we can also use '='. It would be great to be able to use strings for, for example, reserved words, but due to limitations of the C language this cannot be done.

We add rules like this, to the point we complete writing all the grammar. With yacc, the left side of the first written rule is “the whole grammar we want to express”. So in this example, stmt expresses the whole program.

It was a little too abstract. Let’s explain this a little more concretely. By “stmt expresses the whole program”, I mean stmt and the rows of symbols expressed as equivalent by the rules are all recognized as grammar. For example, stmt and stmt are equivalent. Of course. Then expr is equivalent to stmt. That’s expressed like this in the rule. Then, NUMBER and stmt are equivalent. That’s because NUMBER is expr and expr is stmt.

We can also say that more complicated things are equivalent.

              stmt
               ↓
             if_stmt
               ↓
      IF expr THEN stmt END
          ↓          ↓
IF IDENTIFIER THEN expr END
                     ↓
IF IDENTIFIER THEN NUMBER END

When it has expanded until here, all elements become the symbols sent by the scanner. It means such a sequence of symbols is correct as a program. Or putting it the other way around, if this sequence of symbols is sent by the scanner, the parser can understand it in the opposite order of expanding.

IF IDENTIFIER THEN NUMBER END
                     ↓
IF IDENTIFIER THEN expr END
          ↓          ↓
      IF expr THEN stmt END
               ↓
             if_stmt
               ↓
              stmt

And stmt is a symbol expressing the whole program. That’s why this sequence of symbols is a correct program for the parser. When it’s the case, the parsing routine yyparse() ends returning 0.

By the way, the technical term expressing that the parser succeeded is that it “accepted” the input. The parser is like a government office: if you do not fill in the boxes of the documents exactly as it asked you to, it’ll refuse them. The accepted sequences of symbols are the ones for which the boxes were filled in correctly. Parsers and government offices are strangely similar, for instance in the fact that they care about details in specification and that they use complicated terms.

Terminal symbols and nonterminal symbols

Well, in the confusion of the moment I used without explaining it the expression “symbols coming from the scanner”. So let’s explain this. I use one word “symbol” but there are two types.

The first type of symbols are the ones sent by the scanner. They are, for example, IF, THEN, END, '=', … They are called terminal symbols. That’s because, as when we did the quick expansion before, we find them aligned at the end. In this chapter terminal symbols are always written in capital letters. However, symbols like '=' between quotes are special. Symbols like this are all terminal symbols, without exception.

The other type of symbols are the ones that never come from the scanner, for example if_stmt, expr or stmt. They are called nonterminal symbols. As they don’t come from the scanner, they only exist in the parser. Nonterminal symbols also always appear at one moment or the other as the left side of a rule. In this chapter, nonterminal symbols are always written in lower case letters.

How to test

I’m now going to tell you the way to process the grammar file with yacc.

%token A B C D E
%%
list: A B C
    | de

de  : D E

First, put all terminal symbols used after %token. However, you do not have to declare the symbols written with quotes (like '='). Then, put %% to mark a change of section and write the grammar. That’s all.

Let’s now process this.

% yacc first.y
% ls
first.y  y.tab.c
%

Like most Unix tools, “silence means success”.

There are also implementations of yacc that need semicolons at the end of (groups of) rules. When that’s the case we need to do the following:

%token A B C D E
%%
list: A B C
    | de
    ;

de  : D E
    ;

I hate these semicolons so in this book I’ll never use them.

Void rules

Let’s now look a little more at some of the established ways of grammar description. I’ll first introduce void rules.

void:

There’s nothing on the right side, so this rule is “void”. For example, the two following targets mean exactly the same thing.

target: A B C

target: A void B void C
void  :

What is the use of such a thing? It’s very useful. For example in the following case.

if_stmt : IF expr THEN stmts opt_else END

opt_else:
        | ELSE stmts

Using void rules, we can express cleverly the fact that “the else section may be omitted”. Compared to the rules made previously using two definitions, this way is shorter and we do not have to disperse the burden.

Recursive definitions

The following example is still a little hard to understand.

list: ITEM         /* rule 1 */
    | list ITEM    /* rule 2 */

This expresses a list of one or more items, in other words any of the following lists of symbols:

ITEM
ITEM ITEM
ITEM ITEM ITEM
ITEM ITEM ITEM ITEM
      :

Do you understand why? First, according to rule 1, list can be read as ITEM. If you merge this with rule 2, list can be ITEM ITEM.

list: list ITEM
    = ITEM ITEM

We now understand that the list of symbols ITEM ITEM is similar to list. By applying rule 2 to list again, we can say that 3 ITEM are also similar to list. By quickly continuing this process, the list can grow to any size. This is something like mathematical induction.
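The induction can be replayed mechanically. The sketch below is only an illustration of the two rules, not of how yacc works internally: derive_items is a hypothetical method that applies rule 2 as often as needed and then finishes with rule 1, recording each intermediate form.

```ruby
# Expand `list` into `count` ITEMs, recording every intermediate form.
def derive_items(count)
  raise ArgumentError, "rule 1 needs at least one ITEM" if count < 1

  form  = [:list]
  steps = [form]
  (count - 1).times do
    form = [:list, :ITEM] + form.drop(1)   # rule 2: list -> list ITEM
    steps << form
  end
  form = [:ITEM] + form.drop(1)            # rule 1: list -> ITEM
  steps << form
  steps
end

derive_items(3).each { |step| puts step.join(" ") }
# list
# list ITEM
# list ITEM ITEM
# ITEM ITEM ITEM
```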

I’ll now show you the next example. The following example expresses the lists with 0 or more ITEM.

list:
    | list ITEM

First, the first line means “list is equivalent to (void)”. By void I mean the list with 0 ITEM. Then, by looking at rule 2 we can say that “list ITEM” is equivalent to 1 ITEM. That’s because list is equivalent to void.

list: list   ITEM
    = (void) ITEM
    =        ITEM

By applying the same operations of replacement multiple times, we can understand that list is the expression of a list of 0 or more items.

With this knowledge, “lists of 2 or more ITEM” or “lists of 3 or more ITEM” are easy, and we can even create “lists of an even number of elements”.

list:
    | list ITEM ITEM

Construction of values

This abstract talk lasted long enough so in this section I’d really like to go on with a more concrete talk.

Shift and reduce

Up until now, various ways to write grammars have been explained, but what we want is being able to build a syntax tree. However, I’m afraid to say, only telling it the rules is not enough to be able to let it build a syntax tree, as might be expected. Therefore, this time, I’ll tell you the way to build a syntax tree by adding something to the rules.

We’ll first see what the parser does during the execution. We’ll use the following simple grammar as an example.

%token A B C
%%
program: A B C

In the parser there is a stack called the semantic stack. The parser pushes on it all the symbols coming from the scanner. This move is called “shifting the symbols”.

[ A B ] ← C   shift

And when the right side of any rule is equal to the end of the stack, it is “interpreted”. When this happens, the sequence of the right-hand side is replaced by the symbol of the left-hand side.

[ A B C ]
    ↓         reduction
[ program ]

This move is called “reduce A B C to program”. This term is a little presumptuous, but in short it is like, when you have enough tiles of haku and hatsu and chu respectively, it becomes “Big three dragons” in Japanese Mahjong, … this might be irrelevant.

And since program expresses the whole program, if there’s only a program on the stack, it probably means the whole program has been found. Therefore, if the input is just finished here, it is accepted.

Let’s try with a little more complicated grammar.

    %token IF E S THEN END
    %%
    program : if

    if      : IF expr THEN stmts END

    expr    : E

    stmts   : S
            | stmts S

The input from the scanner is this.

    IF  E  THEN  S  S  S  END

The transitions of the semantic stack in this case are shown below.

    Stack                       Move
    empty at first
    IF                          shift IF
    IF E                        shift E
    IF expr                     reduce E to expr
    IF expr THEN                shift THEN
    IF expr THEN S              shift S
    IF expr THEN stmts          reduce S to stmts
    IF expr THEN stmts S        shift S
    IF expr THEN stmts          reduce stmts S to stmts
    IF expr THEN stmts S        shift S
    IF expr THEN stmts          reduce stmts S to stmts
    IF expr THEN stmts END      shift END
    if                          reduce IF expr THEN stmts END to if
    program                     reduce if to program
                                accept.
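The table above can be reproduced with a few lines of Ruby. This is a deliberately naive sketch: it reduces whenever the end of the stack equals some right-hand side, trying longer right-hand sides first. A parser generated by yacc is driven by state tables instead. The names rules and parse_trace are invented for this illustration.

```ruby
# Grammar of the example, longest right-hand sides first so that
# "stmts S" is preferred over a bare "S".
def rules
  [
    ["if",      %w[IF expr THEN stmts END]],
    ["stmts",   %w[stmts S]],
    ["stmts",   %w[S]],
    ["expr",    %w[E]],
    ["program", %w[if]],
  ]
end

# Returns a list of [stack snapshot, move] pairs.
def parse_trace(tokens)
  stack = []
  trace = []
  tokens.each do |tok|
    stack.push(tok)                                   # shift
    trace << [stack.dup, "shift #{tok}"]
    # reduce as long as some right-hand side matches the end of the stack
    while (rule = rules.find { |_lhs, rhs| stack.last(rhs.size) == rhs })
      lhs, rhs = rule
      stack.pop(rhs.size)
      stack.push(lhs)
      trace << [stack.dup, "reduce #{rhs.join(' ')} to #{lhs}"]
    end
  end
  trace
end

parse_trace(%w[IF E THEN S S S END]).each do |stack, move|
  puts format("%-26s %s", stack.join(" "), move)
end
```

The input is accepted when the tokens run out while the stack holds only program.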

To close this section, there’s one thing to be cautious of: a reduction does not always mean decreasing the number of symbols. If there’s a void rule, it’s possible that a symbol is generated out of “void”.

Action

Now, I’ll start to describe the important parts. Whether shifting or reducing, merely moving things around inside the semantic stack is not meaningful. Since our ultimate goal was building a syntax tree, it is not enough unless it leads there. How does yacc do this for us? The answer yacc gives is that “we shall make it possible to hook the moment when the parser performs a reduction.” The hooks are called actions of the parser. An action can be written at the end of a rule as follows.

    program: A B C { /* Here is an action */ }

The part between { and } is the action. If you write it like this, the action will be executed at the moment A B C is reduced to program. You are free to do whatever you like in an action. Since it is C code, almost anything can be written.

The value of a symbol

This is even more important: each symbol has “its value”. Both terminal and nonterminal symbols do. As for terminal symbols, since they come from the scanner, their values are also given by the scanner. For example, 1 or 9 or maybe 108 for a NUMBER symbol. For an IDENTIFIER symbol, it might be "attr" or "name" or "sym". Anything is fine. Each symbol and its value are pushed together onto the semantic stack. The next figure shows the state at the moment S is shifted with its value.

    IF     expr    THEN    stmts   S
    value  value   value   value   value

According to the previous rule, stmts S can be reduced to stmts. If an action is written for the rule, it will be executed, and at that moment the values of the symbols corresponding to the right-hand side are passed to the action.

    IF    expr   THEN   stmts  S      /* Stack */
    v1    v2     v3     v4     v5
                        ↓      ↓
                stmts:  stmts  S      /* Rule */
                        ↓      ↓
                      { $1  +  $2; }  /* Action */

This way an action can take the value of each symbol on the right-hand side of a rule through $1, $2, $3, … yacc rewrites things like $1 and $2 into expressions that point into the stack. However, because this is written in C, it has to deal with things such as types; since that is tiresome, let’s assume for the moment that their types are int.

Next, the symbol on the left-hand side is pushed instead, but because every symbol has a value, the left-hand side symbol must also have one. It is expressed as $$ in actions; the value of $$ when leaving an action becomes the value of the left-hand side symbol.

    IF    expr   THEN   stmts  S      /* the stack just before reducing */
    v1    v2     v3     v4     v5
                        ↓      ↓
                stmts:  stmts  S      /* the rule whose right-hand side matches the end */
                  ↑     ↓      ↓
                { $$  = $1  +  $2; }  /* its action */

    IF    expr   THEN   stmts         /* the stack after reducing */
    v1    v2     v3     (v4+v5)
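The same reduction can be sketched in Ruby with a stack of [symbol, value] pairs. This is only an illustration under the assumption that every value is an integer; the helper name reduce is made up, and the block plays the role of the action { $$ = $1 + $2; }.

```ruby
# Replace the top `arity` entries of the stack with `lhs`.
# The block receives the values of the popped symbols ($1, $2, ...)
# and its result becomes the value of `lhs` ($$).
def reduce(stack, lhs, arity, &action)
  rhs = stack.pop(arity)
  values = rhs.map { |_symbol, value| value }
  stack.push([lhs, action.call(*values)])
  stack
end

# The stack just before reducing, with v1..v5 = 1..5.
stack = [["IF", 1], ["expr", 2], ["THEN", 3], ["stmts", 4], ["S", 5]]
reduce(stack, "stmts", 2) { |v1, v2| v1 + v2 }
p stack.last   # the new top entry: stmts together with v4 + v5
```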

To close this section, here is a small extra. The value of a symbol is sometimes called its “semantic value”. Therefore the stack that holds them is the “semantic value stack”, which is called the “semantic stack” for short.

yacc and types

It’s really cumbersome, but we cannot finish this talk without talking about types. What is the type of the value of a symbol? To give the bottom line first, it is the type named YYSTYPE. This must be the abbreviation of either YY Stack TYPE or Semantic value TYPE. And YYSTYPE is obviously a typedef of some other type. That type is the union defined with the directive named %union in the definition part.

We have not written %union before, but it did not cause an error. Why? This is because yacc considerately falls back to a default without asking. The default in C should naturally be int. Therefore, YYSTYPE is int by default.

For an example in a yacc book or for a calculator, int can be used unchanged. But in order to build a syntax tree, we want to use structs, pointers and various other things. Therefore, for instance, we use %union as follows.

    %union {
        struct node {
            int type;
            struct node *left;
            struct node *right;
        } *node;
        int num;
        char *str;
    }

Because this is not for practical use, arbitrary names are used for the types and members. Notice that, unlike ordinary C, there’s no semicolon at the end of the %union block.

And if this is written, it will look like the following in y.tab.c.

    typedef union {
        struct node {
            int type;
            struct node *left;
            struct node *right;
        } *node;
        int num;
        char *str;
    } YYSTYPE;

And, as for the semantic stack,

    YYSTYPE yyvs[256];       /* the substance of the stack (yyvs = YY Value Stack) */
    YYSTYPE *yyvsp = yyvs;   /* the pointer to the end of the stack */

we can expect something like this. Therefore, the values of the symbols appearing in actions would be

    /* the action before being processed by yacc */
    target: A B C { func($1, $2, $3); }

    /* after conversion, its appearance in y.tab.c */
    { func(yyvsp[-2], yyvsp[-1], yyvsp[0]); ;

naturally like this.

In this case, because the default int is used, a value can be accessed just by referring to the stack. If YYSTYPE is a union, it is also necessary to specify one of its members. There are two ways to do that: one is associating a member with each symbol, the other is specifying it every time.

Generally, the way of associating a member with each symbol is used. By using %token for terminal symbols and %type for nonterminal symbols, it is written as follows.

    %token<num> A B C    /* All of the values of A B C are of type int */
    %type<str> target    /* All of the values of target are of type char* */

On the other hand, if you’d like to specify it every time, you can write a member name right after the $ as follows.

    %union { char *str; }
    %%
    target: { $<str>$ = "In short, this is like typecasting"; }

You’d better avoid using this method if possible. Defining a member for each symbol is the basic approach.

Coupling the parser and the scanner together

I’ve now finished talking about all the ins and outs of the values inside the parser. What remains is the protocol for connecting with the scanner; after that, the heart of this story will be finished.

First, let’s recall that, as I mentioned, the scanner is the yylex() function. Each (terminal) symbol itself is returned (as an int) as the return value of the function. Since constants with the same names as the symbols are defined (#define) by yacc, we can write NUMBER for a NUMBER. Its value is passed by putting it into a global variable named yylval. This yylval is also of type YYSTYPE, and exactly the same things can be said of it as of the parser. In other words, if it is defined with %union, it becomes a union. But this time the member is not selected automatically; its member name has to be written by hand. A very simple example would look like the following.

    static int
    yylex()
    {
        yylval.str = next_token();
        return STRING;
    }
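To make the protocol concrete, here is a rough Ruby imitation of it. It is only an analogy, not how yacc or ruby implement scanning: the helper make_scanner, the whitespace-separated input and the two token kinds are all assumptions for the sketch. The lambda returns the symbol, and the value travels separately through a shared yylval holder.

```ruby
# Builds a toy scanner over whitespace-separated words.
# Returns the yylex lambda and the holder standing in for the global yylval.
def make_scanner(source)
  words  = source.split
  yylval = { value: nil }
  yylex  = lambda do
    word = words.shift
    return nil if word.nil?            # end of input (a real yylex returns 0)
    if word.match?(/\A\d+\z/)
      yylval[:value] = word.to_i       # value of a NUMBER
      :NUMBER
    else
      yylval[:value] = word            # value of an IDENTIFIER
      :IDENTIFIER
    end
  end
  [yylex, yylval]
end

yylex, yylval = make_scanner("attr 108")
while (symbol = yylex.call)
  puts "#{symbol} #{yylval[:value].inspect}"
end
```

The caller reads the return value to learn which symbol arrived and then looks at yylval for its value, which is the division of labour described above.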

Figure 2 summarizes the relationships described so far. I’d like you to check them one by one. yylval, $$, $1, $2 … all of these variables that form the interfaces are of type YYSTYPE.

Figure 2: Relationships among yacc related variables & functions

Embedded Action

So far it has been explained that an action is written at the end of a rule. However, it can actually be written in the middle of a rule.

    target: A B { puts("embedded action"); } C D

This is called an “embedded action”. An embedded action is merely syntactic sugar for the following definition:

    target: A B dummy C D

    dummy : /* void rule */
            {
                puts("embedded action");
            }

From this example, you should be able to tell everything, including when it is executed. The value of a symbol can also be taken. In other words, in this example, the value of the embedded action will come out as $3.

Practical Topics

Conflicts

I’m not afraid of yacc anymore.

If you thought so, that is too naive. The reason why everyone is so afraid of yacc is about to be revealed.

Up until now, I wrote rather carelessly “when the right-hand side of a rule matches the end of the stack”, but what happens if there’s a rule like this:

    target  : A B C
            | A B C

When the symbol sequence A B C actually comes, it is impossible to determine which rule matches. Such a thing cannot be interpreted even by humans, so yacc cannot understand it either. When yacc finds an odd grammar like this, it complains that a reduce/reduce conflict occurs. It means that multiple rules could be reduced at the same time.

    % yacc rrconf.y
    conflicts:  1 reduce/reduce

Usually, I think you won’t do such a thing except by accident. But how about the next example? The symbol sequence described is exactly the same.

    target  : abc
            | A bc

    abc     : A B C

    bc      :   B C

This is relatively likely. Especially when parts are moved around in complicated ways while developing the rules, this kind of rule is often created without noticing.

There’s also a similar pattern, as follows:

    target  : abc
            | ab C

    abc     : A B C

    ab      : A B

When the symbol sequence A B C comes, it’s hard to determine whether to choose a single abc or the combination of ab and C. In this case, yacc complains that a shift/reduce conflict occurs. This means there are both a rule that can shift and a rule that can reduce at the same time.

    % yacc srconf.y
    conflicts:  1 shift/reduce

The famous example of shift/reduce conflicts is “the hanging else problem”. For example, the if statement of the C language causes this problem. I’ll describe it with a simplified case:

    stmt     : expr ';'
             | if

    expr     : IDENTIFIER

    if       : IF '(' expr ')' stmt
             | IF '(' expr ')' stmt  ELSE stmt

In this grammar, the expression is only IDENTIFIER (a variable), and the body of if is only one statement. Now, what happens if the next program is parsed with this grammar?

    if (cond)
        if (cond)
            true_stmt;
        else
            false_stmt;

Written this way, it might look quite obvious. But actually, this can also be interpreted as follows.

    if (cond) {
        if (cond)
            true_stmt;
    }
    else {
        false_stmt;
    }

The question is: “of the two ifs, the inner one or the outer one, which is the one the else should be attached to?”

However, shift/reduce conflicts are relatively less harmful than reduce/reduce conflicts, because they can usually be resolved by choosing shift. Choosing shift is almost equivalent to “connecting the elements closer to each other”, and it tends to match human intuition. In fact, the hanging else can also be resolved by shifting. Hence yacc follows this trend: it chooses shift by default when a shift/reduce conflict occurs.
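What “choosing shift” means for the hanging else can be seen in a small Ruby sketch. Note that this is a hand-written recursive-descent parser over a simplified token list, not a yacc-generated one. The function name parse_stmt and the token spelling are assumptions made for illustration. Taking the else as soon as it appears, while still inside the inner if, corresponds to preferring the shift.

```ruby
# Parses either a plain statement (one token) or
#   if COND STMT [else STMT]
# and returns a nested array [:if, cond, then_branch, else_branch].
def parse_stmt(tokens)
  return tokens.shift unless tokens.first == "if"
  tokens.shift                        # consume "if"
  cond = tokens.shift
  then_branch = parse_stmt(tokens)
  else_branch = nil
  if tokens.first == "else"           # greedily take the else: the "shift" choice
    tokens.shift
    else_branch = parse_stmt(tokens)
  end
  [:if, cond, then_branch, else_branch]
end

tree = parse_stmt(%w[if cond1 if cond2 true_stmt else false_stmt])
p tree   # the else ends up inside the inner if
```

With this choice the else is attached to the nearest if, which is the first of the two interpretations shown above.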

Look-ahead

As an experiment, I’d like you to process the next grammar with yacc.

    %token A B C
    %%
    target  : A B C   /* rule 1 */
            | A B     /* rule 2 */

We can’t help expecting a conflict. At the time when it has read up to A B, rule 1 would attempt to shift and rule 2 would attempt to reduce. In other words, this should cause a shift/reduce conflict. However, …

    % yacc conf.y
    %

It’s odd, there’s no conflict. Why?

In fact, the parser created with yacc can look ahead by just one symbol. Before actually doing a shift or a reduce, it can decide what to do by peeking at the next symbol.

This is also taken into account when generating the parser: if the rule can be determined by a single symbol of look-ahead, the conflict is avoided. In the previous rules, for instance, if C comes right after A B, only rule 1 is possible and it is chosen (shift). If the input has finished, rule 2 is chosen (reduce).
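The decision described above can be written down as a toy Ruby parser for exactly this grammar. It is a hand-coded sketch with invented names (parse_target, expected), far simpler than the tables yacc generates, but it shows how one symbol of look-ahead is enough to choose between the shift and the reduce.

```ruby
# Grammar: target: A B C (rule 1) | A B (rule 2)
# Returns the list of moves taken for the given token sequence.
def parse_target(tokens)
  tokens = tokens.dup
  stack = []
  moves = []
  # which terminal may be shifted on top of a given stack
  expected = { [] => "A", %w[A] => "B", %w[A B] => "C" }
  loop do
    lookahead = tokens.first          # peek at the next symbol, nil at end of input
    if lookahead.nil? && (stack == %w[A B] || stack == %w[A B C])
      rule = stack.size == 3 ? 1 : 2
      moves << "reduce #{stack.join(' ')} to target (rule #{rule})"
      stack = ["target"]
    elsif lookahead.nil? && stack == ["target"]
      moves << "accept"
      return moves
    elsif !lookahead.nil? && expected[stack] == lookahead
      stack.push(tokens.shift)
      moves << "shift #{lookahead}"
    else
      moves << "error"
      return moves
    end
  end
end

puts parse_target(%w[A B C])
puts parse_target(%w[A B])
```

After A B, the parser shifts if the look-ahead is C and reduces by rule 2 if the input has ended, so the two rules never compete.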

Notice that the word “look-ahead” has two meanings: one is the look-ahead done while processing *.y with yacc. The other is the look-ahead done while actually executing the generated parser. The look-ahead during execution is not so difficult, but the look-ahead of yacc itself is pretty complicated. That’s because it needs to predict all possible input patterns and decide its behavior from the grammar rules alone.

However, because “all possible” is actually impossible, it handles “most” patterns. How broad a range of patterns it can cover shows the strength of a look-ahead algorithm. The look-ahead algorithm that yacc uses when processing grammar files is LALR(1), which is relatively powerful among the currently existing algorithms for resolving conflicts.

A lot of things have been introduced, but you don’t have to worry so much, because what we do in this book is only reading and not writing. What I wanted to explain here is not the look-ahead of grammars but the look-ahead during execution.

Operator Precedence

Since the abstract talk has lasted for a long time, I’ll talk more concretely. Let’s try to define the rules for infix operators such as + or *. There are established tactics for this too, so we’d better follow them obediently. Something like a calculator for arithmetic operations is defined below:

    expr    : expr '+' expr
            | expr '-' expr
            | expr '*' expr
            | expr '/' expr
            | primary

    primary : NUMBER
            | '(' expr ')'

primary is the smallest grammar unit. The point is that an expr between parentheses becomes a primary.

Then, if this grammar is written to some file and compiled, the result is this.

    % yacc infix.y
    16 shift/reduce conflicts

They conflict aggressively. Thinking for five minutes is enough to see that this rule causes a problem in the following and similar cases:

    1 - 1 - 1

This can be interpreted in both of the next two ways.

    (1 - 1) - 1
    1 - (1 - 1)

The former is natural as a numerical expression. But what yacc processes is only how the symbols appear; it attaches no meaning to them. Things such as the meaning of the - symbol are not considered at all. In order to correctly reflect human intention, we have to specify what we want step by step.

What we can do, then, is write this in the definition part.

    %left '+' '-'
    %left '*' '/'

These directives specify both the precedence and the associativity at the same time. I’ll explain them in order.

I think the term “precedence” often appears when talking about the grammar of a programming language. Describing it logically is complicated, so to put it intuitively, it is about which operator the parentheses attach to in the following and similar cases.

    1 + 2 * 3

If * has higher precedence, it would be this.

    1 + (2 * 3)

If + has higher precedence, it would be this.

    (1 + 2) * 3

As shown above, operator precedence resolves shift/reduce conflicts by defining which operators are stronger and which are weaker.

However, if the operators have the same precedence, how can it be resolved? Like this, for instance,

    1 - 2 - 3

because both operators are -, their precedences are exactly the same. In this case, it is resolved by using the associativity. Associativity has three types: left, right and nonassoc. They are interpreted as follows:

    Associativity                 Interpretation
    left (left-associative)       (1 - 2) - 3
    right (right-associative)     1 - (2 - 3)
    nonassoc (non-associative)    parse error

Most of the operators for numerical expressions are left-associative. Right-associativity is used mainly for the = of assignment and the not of negation.

    a = b = 1    # (a = (b = 1))
    not not a    # (not (not a))

The representatives of non-associative operators are probably the comparison operators.

    a == b == c   # parse error
    a <= b <= c   # parse error

However, this is not the only possibility. In Python, for instance, comparisons between three terms are possible.

The previously mentioned directives %left, %right and %nonassoc are used to specify the associativities their names indicate. Precedence is specified by the order of the directives: the lower an operator is written, the higher its precedence. Operators written on the same line have the same level of precedence.

    %left  '+' '-'    /* left-associative and third precedence  */
    %left  '*' '/'    /* left-associative and second precedence */
    %right '!'        /* right-associative and first precedence */
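The effect of precedence and associativity on the resulting tree can be tried with a short Ruby sketch. Be aware that this uses a different technique, precedence climbing in a hand-written parser, rather than yacc’s conflict resolution. The table prec_table, the function parse_expr and the right-associative "**" entry are assumptions added for illustration, and nonassoc is not handled.

```ruby
# operator => [precedence, associativity]; a larger number binds more tightly.
def prec_table
  {
    "+"  => [1, :left],  "-" => [1, :left],
    "*"  => [2, :left],  "/" => [2, :left],
    "**" => [3, :right],               # hypothetical right-associative operator
  }
end

# Parses a flat token list into nested [op, lhs, rhs] arrays.
def parse_expr(tokens, min_prec = 1)
  lhs = tokens.shift
  loop do
    op = tokens.first
    entry = prec_table[op]
    break if entry.nil? || entry[0] < min_prec
    prec, assoc = entry
    tokens.shift
    # left-associative: the right operand may only contain stronger operators
    rhs = parse_expr(tokens, assoc == :left ? prec + 1 : prec)
    lhs = [op, lhs, rhs]
  end
  lhs
end

p parse_expr(%w[1 - 2 - 3])    # groups as (1 - 2) - 3
p parse_expr(%w[1 + 2 * 3])    # groups as 1 + (2 * 3)
p parse_expr(%w[2 ** 3 ** 2])  # groups as 2 ** (3 ** 2)
```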

The original work is Copyright © 2002 - 2004 Minero AOKI. Translated by Vincent ISAMBART and Clifford Escobar CAOILE. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License

Translated by Robert GRAVINA & ocha-

Chapter 10: Parser

Outline of this chapter

Parser construction

The main source of the parser is parse.y. Because it is *.y, it is the input for yacc, and parse.c is generated from it.

Although one would expect lex.c to contain the scanner, this is not the case. This file is created by gperf, taking the file keywords as input, and defines the reserved word hash table. This tool-generated lex.c is #included in (the also tool-generated) parse.c. The details of this process are somewhat difficult to explain at this point, so we shall return to this later.

Figure 1 shows the parser construction process. For the benefit of those readers using Windows who may not be aware, the mv (move) command creates a new copy of a file and removes the original. cc is, of course, the C compiler and cpp the C pre-processor.

Figure 1: Parser construction process

Dissecting parse.y

Let’s now look at parse.y in a bit more detail. The following figure presents a rough outline of the contents of parse.y.

▼ parse.y

    %{
    header
    %}
    %union ....
    %token ....
    %type ....

    %%

    rules

    %%
    user code section
        parser interface
        scanner (character stream processing)
        syntax tree construction
        semantic analysis
        local variable management
        ID implementation

As for the rules and definitions part, it is as previously described. Since this part is indeed the heart of the parser, I’ll explain it ahead of the other parts, in the next section.

There are a considerable number of support functions defined in the user code section, but roughly speaking, they can be divided into the six parts listed above. The following table shows where each of the parts is explained in this book.

    Part                        Chapter                                  Section
    Parser interface            This chapter                             Section 3 “Scanning”
    Scanner                     This chapter                             Section 3 “Scanning”
    Syntax tree construction    Chapter 12 “Syntax tree construction”    Section 2 “Syntax tree construction”
    Semantic analysis           Chapter 12 “Syntax tree construction”    Section 3 “Semantic analysis”
    Local variable management   Chapter 12 “Syntax tree construction”    Section 4 “Local variables”
    ID implementation           Chapter 3 “Names and name tables”        Section 2 “ID and symbols”

General remarks about grammar rules

Coding rules

The grammar of ruby conforms to a coding standard and is thus easy to read once you are familiar with it.

Firstly, regarding symbol names, all non-terminal symbols are written in lower case characters. Terminal symbols are prefixed by some lower case character and then followed by upper case. Reserved words (keywords) are prefixed with the character k. Other terminal symbols are prefixed with the character t.

▼ Symbol name examples

    Token                    Symbol name
    (non-terminal symbol)    bodystmt
    if                       kIF
    def                      kDEF
    rescue                   kRESCUE
    varname                  tIDENTIFIER
    ConstName                tCONST
    1                        tINTEGER

The only exceptions to these rules are klBEGIN and klEND. These symbol names refer to the reserved words for “BEGIN” and “END”, respectively, and the l here stands for large. Since the reserved words begin and end already exist (naturally, with symbol names kBEGIN and kEND), these non-standard symbol names were required.

Important symbols

parse.y contains both grammar rules and actions; however, for now I would like to concentrate on the grammar rules alone. The script sample/exyacc.rb can be used to extract the grammar rules from this file. Aside from this, running yacc -v will create a log file y.output which also contains the grammar rules, however it is rather difficult to read. In this chapter I have used a slightly modified version of exyacc.rb (tools/exyacc2.rb, located on the attached CD-ROM) to extract the grammar rules.

▼ parse.y (rules)

    program         : compstmt

    bodystmt        : compstmt
                      opt_rescue
                      opt_else
                      opt_ensure

    compstmt        : stmts opt_terms
                           :
                           :

The output is quite long (over 450 lines of grammar rules), and as such I have only included the most important parts in this chapter.

Which symbols, then, are the most important? Names such as program, expr, stmt, primary, arg etc. are always very important. It’s because they represent the general parts of the grammatical elements of a programming language. The following table outlines the elements we should generally focus on in the syntax of a program.

    Syntax element                     Predicted symbol names
    Program                            program prog file input stmts whole
    Sentence                           statement stmt
    Expression                         expression expr exp
    Smallest element                   primary prim
    Left hand side of an expression    lhs (left hand side)
    Right hand side of an expression   rhs (right hand side)
    Function call                      funcall function_call call function
    Method call                        method method_call call
    Argument                           argument arg
    Function definition                defun definition function fndef
    Declarations                       declaration decl

In general, programming languages tend to have the following hierarchy structure.

    Program element   Properties
    Program           Usually a list of statements
    Statement         What cannot be combined with the others. A syntax tree trunk.
    Expression        What is a combination by itself and can also be a part of another expression. A syntax tree internal node.
    Primary           An element which cannot be further decomposed. A syntax tree leaf node.

Statements are things like function definitions in C or class definitions in Java. An expression can be a procedure call, an arithmetic expression etc., while a primary usually refers to a string literal or number. Some languages do not contain all of these symbol types, however they generally contain some kind of hierarchy of symbols such as program→stmt→expr→primary.

However, a lower-level structure can also be contained in a higher-level one. For example, in C a function call is an expression, but it can also be written on its own. That is, it is an expression, but it can also be a statement.

Conversely, when surrounded by parentheses, expressions become primaries. This is because the lower the level of an element, the higher its precedence.

The range of statements differs considerably between programming languages. Let’s consider assignment as an example. In C, because it is an expression, we can use the value of the whole assignment expression. But in Pascal, assignment is a statement, so we cannot do such a thing. Also, function and class definitions are typically statements; however, in languages such as Lisp and Scheme, since everything is an expression, there are no statements in the first place. Ruby is close to Lisp’s design in this regard.

Program structure

Now let’s turn our attention to the grammar rules of ruby. Firstly, in yacc, the left hand side of the first rule represents the entire grammar. Currently, it is program. Following the rules further and further from here, we find, just as in the established tactic, the four symbols program, stmt, expr and primary. Let’s add arg to them and look at their rules.

▼ ruby grammar (outline)

    program         : compstmt

    compstmt        : stmts opt_terms

    stmts           : none
                    | stmt
                    | stmts terms stmt

    stmt            : kALIAS fitem  fitem
                    | kALIAS tGVAR tGVAR
                          :
                          :
                    | expr

    expr            : kRETURN call_args
                    | kBREAK call_args
                          :
                          :
                    | '!' command_call
                    | arg

    arg             : lhs '=' arg
                    | var_lhs tOP_ASGN arg
                    | primary_value '[' aref_args ']' tOP_ASGN arg
                          :
                          :
                    | arg '?' arg ':' arg
                    | primary

    primary         : literal
                    | strings
                          :
                          :
                    | tLPAREN_ARG expr  ')'
                    | tLPAREN compstmt ')'
                          :
                          :
                    | kREDO
                    | kRETRY

If we focus on the last rule of each element, we can clearly make out a hierarchy of program→stmt→expr→arg→primary.

Also, we’d like to focus on this rule of primary.

    primary         : literal
                          :
                          :
                    | tLPAREN_ARG expr  ')'      /* here */

The name tLPAREN_ARG comes from t for terminal symbol, L for left and PAREN for parentheses; it is the open parenthesis. Why this isn’t '(' is covered in the next section “Context-dependent scanner”. Anyway, the purpose of this rule is to demote an expr to a primary. This creates a cycle which can be seen in Figure 2, and the arrow shows how this rule is reduced during parsing.

Figure 2: expr demotion

The next rule is also particularly interesting.

    primary         : literal
                          :
                          :
                    | tLPAREN compstmt ')'   /* here */

A compstmt, which is equivalent to the entire program (program), can be demoted to a primary with this rule. The next figure illustrates this rule in action.

Figure 3: program demotion

This means that for any syntax element in Ruby, if we surround it with parentheses it will become a primary and can be passed as an argument to a function, be used as the right hand side of an expression etc. This is an incredible fact. Let’s actually confirm it.

    p((class C; end))
    p((def a() end))
    p((alias ali gets))
    p((if true then nil else nil end))
    p((1 + 1 * 1 ** 1 - 1 / 1 ^ 1))

If we invoke ruby with the -c option (syntax check), we get the following output.

    % ruby -c primprog.rb
    Syntax OK

Indeed, it’s hard to believe, but it actually passes. Apparently, we did not get the wrong idea.

If we look at the details, it is not perfectly true, since some things are rejected by the semantic analysis (see also Chapter 12 “Syntax tree construction”). For example, passing a return statement as an argument to a function results in an error. But at least at the level of appearance, the rule that “surrounding anything in parentheses means it can be passed as an argument to a function” does hold.
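Beyond the syntax check, the same idea can be observed at run time. The following Ruby sketch is only an illustration; the helper identity is made up and simply returns its argument. It passes a parenthesized if, a parenthesized sequence of two statements and a parenthesized begin/rescue block as ordinary arguments.

```ruby
# Returns its argument unchanged; used only to show that the
# parenthesized constructs below are accepted as arguments.
def identity(x)
  x
end

from_if       = identity((if true then "then" else "else" end))
from_compstmt = identity((a = 10; a * 2))      # two statements, value of the last
from_begin    = identity((begin; Integer("x"); rescue ArgumentError; :rescued; end))
from_arith    = identity((1 + 1 * 1 ** 1 - 1 / 1 ^ 1))

p from_if, from_compstmt, from_begin, from_arith
```

In each case the value of the parenthesized construct is what reaches the method, just like the value of any other primary.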

In the next section I will cover the contents of the important elements one by one.

program

▼ program

    program         : compstmt

    compstmt        : stmts opt_terms

    stmts           : none
                    | stmt
                    | stmts terms stmt

As mentioned earlier, program represents the entire grammar, that is, the entire program. That program is equal to compstmt, and compstmt is almost equivalent to stmts. That stmts is a list of stmt delimited by terms. Hence, the entire program is a list of stmt delimited by terms.

terms is (of course) an abbreviation for “terminators”, the symbols that terminate sentences, such as semicolons or newlines. opt_terms means “OPTional terms”. The definitions are as follows:

▼ opt_terms

    opt_terms       :
                    | terms

    terms           : term
                    | terms ';'

    term            : ';'
                    | '\n'

The initial ; or \n of a terms can be followed by any number of ; only; based on that, you might start thinking that two or more consecutive newlines could cause a problem. Let’s try and see what actually happens.

    1 + 1   # first newline
            # second newline
            # third newline
    1 + 1

Run that with ruby -c.

    % ruby -c optterms.rb
    Syntax OK

Strange, it worked! What actually happens is this: consecutive newlines are simply discarded by the scanner, which returns only the first newline in a series.

By the way, although we said that program is the same as compstmt, if that were really true, you would question why compstmt exists at all. Actually, the distinction is there only for the execution of semantic actions. program exists to execute any semantic actions which should be done once in the processing of an entire program. If it were only a question of parsing, program could be omitted with no problems at all.

To generalize this point, the grammar rules can be divided into two groups: those which are needed for parsing the program structure, and those which are needed for the execution of semantic actions. The none rule which was mentioned earlier when talking about stmts is another one which exists for executing actions: it’s used to return a NULL pointer for an empty list of type NODE*.

stmt

Next is stmt. This one is rather involved, so we’ll look into it a bit at a time.

▼ stmt (1)

    stmt            : kALIAS fitem  fitem
                    | kALIAS tGVAR tGVAR
                    | kALIAS tGVAR tBACK_REF
                    | kALIAS tGVAR tNTH_REF
                    | kUNDEF undef_list
                    | stmt kIF_MOD expr_value
                    | stmt kUNLESS_MOD expr_value
                    | stmt kWHILE_MOD expr_value
                    | stmt kUNTIL_MOD expr_value
                    | stmt kRESCUE_MOD stmt
                    | klBEGIN '{' compstmt '}'
                    | klEND '{' compstmt '}'

Looking at that, somehow things start to make sense. The first few are alias, then undef, and the next few are all something followed by _MOD; as you can imagine, those should be statements with postposition modifiers.

expr_value and primary_value are grammar rules which exist to execute semantic actions. For example, expr_value represents an expr which has a value. Expressions which don’t have values are return and break, or return/break followed by a postposition modifier, such as an if clause. For a detailed definition of what it means to “have a value”, see chapter 12, “Syntax Tree Construction”. In the same way, primary_value is a primary which has a value.

As explained earlier, klBEGIN and klEND represent BEGIN and END.

▼ stmt (2)

                    | lhs '=' command_call
                    | mlhs '=' command_call
                    | var_lhs tOP_ASGN command_call
                    | primary_value '[' aref_args ']' tOP_ASGN command_call
                    | primary_value '.' tIDENTIFIER tOP_ASGN command_call
                    | primary_value '.' tCONSTANT tOP_ASGN command_call
                    | primary_value tCOLON2 tIDENTIFIER tOP_ASGN command_call
                    | backref tOP_ASGN command_call

Looking at these rules all at once is the right approach. The common point is that they all have command_call on the right-hand side. command_call represents a method call with the parentheses omitted. The new symbols which are introduced here are explained in the following table. I hope you’ll refer to the table as you check over each grammar rule.

    lhs            the left hand side of an assignment (Left Hand Side)
    mlhs           the left hand side of a multiple assignment (Multiple Left Hand Side)
    var_lhs        the left hand side of an assignment to a kind of variable (VARiable Left Hand Side)
    tOP_ASGN       compound assignment operator like += or *= (OPerator ASsiGN)
    aref_args      argument to a [] method call (Array REFerence)
    tIDENTIFIER    identifier which can be used as a local variable
    tCONSTANT      constant identifier (with leading uppercase letter)
    tCOLON2        ::
    backref        $1 $2 $3 ...

aref is Lisp jargon. There’s also aset as its counterpart, which is an abbreviation of “array set”. This abbreviation is used in a lot of places in the source code of ruby.

▼ stmt (3)

                    | lhs '=' mrhs_basic
                    | mlhs '=' mrhs

These two are multiple assignments. mrhs has the same structure as mlhs and it means multiple rhs (the right hand side). We’ve come to recognize that knowing the meanings of names makes comprehension much easier.

▼ stmt (4)

                    | expr

Lastly, it connects to expr.

expr

▼ expr

    expr            : kRETURN call_args
                    | kBREAK call_args
                    | kNEXT call_args
                    | command_call
                    | expr kAND expr
                    | expr kOR expr
                    | kNOT expr
                    | '!' command_call
                    | arg

Expression. The expression of ruby is very small in grammar. That’s because most of what is ordinarily contained in expr has gone into arg. Conversely speaking, what could not go into arg is left here. And what is left is, again, method calls without parentheses. call_args is a bare argument list, and command_call is, as previously mentioned, a method call without parentheses. If this kind of thing were contained in the “small” unit, it would cause a tremendous number of conflicts.

However, the two below are of a different kind.

    expr kAND expr
    expr kOR expr

kAND is “and”, and kOR is “or”. Since these two have roles as control structures, they must be contained in a “big” syntax unit which is larger than command_call. And since command_call is contained in expr, they need to be at least expr to work well. For example, the following usage is possible…

    valid_items.include? arg  or raise ArgumentError, 'invalid arg'
    # valid_items.include?(arg) or raise(ArgumentError, 'invalid arg')

However, if the rule for kOR existed in arg instead of expr, it would be joined as follows.

    valid_items.include?((arg or raise)) ArgumentError, 'invalid arg'

Obviously, this would end up as a parse error.
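The usage above can be run as is. The following Ruby sketch wraps it in a method so that both outcomes are visible; the method name check and the sample values are assumptions made for illustration.

```ruby
# `or` joins two command calls (method calls without parentheses),
# so the raise only runs when include? returns false.
def check(valid_items, arg)
  valid_items.include? arg or raise ArgumentError, 'invalid arg'
  :ok
end

p check([1, 2, 3], 2)

begin
  check([1, 2, 3], 9)
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

Both sides of or are parsed as whole command calls, which is the grouping shown in the comment of the original example.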

arg

▼ arg

    arg             : lhs '=' arg
                    | var_lhs tOP_ASGN arg
                    | primary_value '[' aref_args ']' tOP_ASGN arg
                    | primary_value '.' tIDENTIFIER tOP_ASGN arg
                    | primary_value '.' tCONSTANT tOP_ASGN arg
                    | primary_value tCOLON2 tIDENTIFIER tOP_ASGN arg
                    | backref tOP_ASGN arg
                    | arg tDOT2 arg
                    | arg tDOT3 arg
                    | arg '+' arg
                    | arg '-' arg
                    | arg '*' arg
                    | arg '/' arg
                    | arg '%' arg
                    | arg tPOW arg
                    | tUPLUS arg
                    | tUMINUS arg
                    | arg '|' arg
                    | arg '^' arg
                    | arg '&' arg
                    | arg tCMP arg
                    | arg '>' arg
                    | arg tGEQ arg
                    | arg '<' arg
                    | arg tLEQ arg
                    | arg tEQ arg
                    | arg tEQQ arg
                    | arg tNEQ arg
                    | arg tMATCH arg
                    | arg tNMATCH arg
                    | '!' arg
                    | '~' arg
                    | arg tLSHFT arg
                    | arg tRSHFT arg
                    | arg tANDOP arg
                    | arg tOROP arg
                    | kDEFINED opt_nl  arg
                    | arg '?' arg ':' arg
                    | primary

Although there are many rules here, the complexity of the grammar is not proportional to the number of rules. A grammar that merely has a lot of cases can be handled very easily by yacc; rather, the depth or recursiveness of the rules has more influence on the complexity.

That makes us curious about the rules defined recursively in the form arg OP arg for the operators, but because operator precedence is defined for all of these operators, this is virtually a mere enumeration. Let’s condense the “mere enumeration” in the arg rule by merging.

    arg: lhs '=' arg              /* 1 */
       | primary T_opeq arg       /* 2 */
       | arg T_infix arg          /* 3 */
       | T_pre arg                /* 4 */
       | arg '?' arg ':' arg      /* 5 */
       | primary                  /* 6 */

Since there’s no point in distinguishing terminal symbols from lists of terminal symbols, they are all expressed with symbols beginning with T_. opeq is operator + equal, T_pre represents the prefix operators such as '!' and '~', and T_infix represents the infix operators such as '*' and '%'.

To avoid conflicts in this structure, points like those written below become important (though they do not cover everything).

T_infix should not contain '='.

Since arg partially overlaps lhs, if '=' is contained, rule 1 and rule 3 cannot be distinguished.

T_opeq and T_infix should not have any common rule.

Since arg contains primary, if they have any common rule, rule 2 and rule 3 cannot be distinguished.

T_infix should not contain '?'.

If it does, rules 3 and 5 would produce a shift/reduce conflict.

T_pre should not contain '?' or ':'.

If it does, rules 4 and 5 would conflict in a very complicated way.

The conclusion is that all the requirements are met and this grammar does not conflict. We could say it’s a matter of course.

primary

Becauseprimaryhasalotofgrammarrules,we’llsplitthemupandshowtheminparts.

▼primary(1)

primary:literal|strings|xstring|regexp|words|qwords

Literals.literalisforSymbolliterals(:sym)andnumbers.

Page 440: Ruby Hacking Guide

▼primary(2)

|var_ref|backref|tFID

Variables.var_refisforlocalvariablesandinstancevariablesandetc.backrefisfor$1$2$3…tFIDisfortheidentifierswith!or?,say,include?reject!.There’snopossibilityoftFIDbeingalocalvariable,evenifitappearssolely,itbecomesamethodcallattheparserlevel.

▼primary(3)

|kBEGINbodystmtkEND

bodystmtcontainsrescueandensure.Itmeansthisisthebeginoftheexceptioncontrol.

▼primary(4)

|tLPAREN_ARGexpr')'|tLPARENcompstmt')'

Thishasalreadydescribed.Syntaxdemoting.

▼ primary (5)

        | primary_value tCOLON2 tCONSTANT
        | tCOLON3 cname

Constant references. tCONSTANT is for constant names (capitalized identifiers).

Both tCOLON2 and tCOLON3 are ::, but tCOLON3 represents only the :: that means the top level. In other words, it is the :: of ::Const. The :: of Net::SMTP is tCOLON2.

The reason why different symbols are used for the same token is to deal with method calls without parentheses. For example, it is needed to distinguish the next two from each other:

p Net::HTTP    # p(Net::HTTP)
p Net  ::HTTP  # p(Net(::HTTP))

If there is a space or a delimiter character such as an open parenthesis just before it, it becomes tCOLON3. In the other cases, it becomes tCOLON2.

▼ primary (6)

        | primary_value '[' aref_args ']'

Index-form calls, for instance, arr[i].

▼ primary (7)

        | tLBRACK aref_args ']'
        | tLBRACE assoc_list '}'

Array literals and Hash literals. This tLBRACK also represents '[', whereas the plain '[' terminal means a '[' without a space in front of it. The need for this distinction is also a side effect of method calls without parentheses.

The terminal symbols of this rule are very hard to tell apart because they differ by just one character. The following table shows how to read each type of parenthesis, so I’d like you to make use of it when reading.

▼ English names for each kind of parenthesis

Symbol   English Name
()       parentheses
{}       braces
[]       brackets

▼ primary (8)

        | kRETURN
        | kYIELD '(' call_args ')'
        | kYIELD '(' ')'
        | kYIELD
        | kDEFINED opt_nl '(' expr ')'

Syntaxes whose forms are similar to method calls. Respectively, return, yield, defined?.

There are arguments for yield here, but return does not have any arguments. Why? The fundamental reason is that yield itself has a return value but return does not. However, even though there are no arguments here, it does not mean you cannot pass values, of course. There was the following rule in expr.

kRETURN call_args

call_args is a bare argument list, so it can deal with return 1 or return nil. Things like return(1) are handled as return (1), that is, return followed by the parenthesized expression (1). For this reason, surrounding multiple arguments of a return with parentheses, as in the following code, should be impossible.

return(1, 2, 3)   # interpreted as return (1,2,3) and results in parse error

You will understand this area better if you check it again after reading the next chapter, “Finite-State Scanner”.

▼ primary (9)

        | operation brace_block
        | method_call
        | method_call brace_block

Method calls. method_call is a call with arguments (and also with parentheses), operation is a call with neither arguments nor parentheses, and brace_block is either { ~ } or do ~ end; when it is attached to a method, the method is an iterator. As for the question “It is called brace, so why does it include do ~ end?”, there is a reason more abyssal than the Mariana Trench, but again the only way to understand it is to read the next chapter, “Finite-State Scanner”.

▼ primary (10)

        | kIF expr_value then compstmt if_tail kEND         # if
        | kUNLESS expr_value then compstmt opt_else kEND    # unless
        | kWHILE expr_value do compstmt kEND                # while
        | kUNTIL expr_value do compstmt kEND                # until
        | kCASE expr_value opt_terms case_body kEND         # case
        | kCASE opt_terms case_body kEND                    # case (Form2)
        | kFOR block_var kIN expr_value do compstmt kEND    # for

The basic control structures. A little unexpectedly, things that look this big are put inside primary, which is “small”. Because primary is also arg, we can also do something like this.

p(if true then 'ok' end)   # shows "ok"

I mentioned that “almost all syntax elements are expressions” is one of the traits of Ruby. It is concretely expressed by the fact that if and while are in primary.

Why is there no problem if these “big” elements are contained in primary? That’s because Ruby’s syntax has the trait that such an element “begins with the terminal symbol A and ends with the terminal symbol B”. In the next section, we’ll think about this point again.

▼ primary (11)

        | kCLASS cname superclass bodystmt kEND                          # class definition
        | kCLASS tLSHFT expr term bodystmt kEND                          # singleton class definition
        | kMODULE cname bodystmt kEND                                    # module definition
        | kDEF fname f_arglist bodystmt kEND                             # method definition
        | kDEF singleton dot_or_colon fname f_arglist bodystmt kEND      # singleton method definition

Definition statements. I’ve been calling these “class statements” and the like, but strictly speaking I should probably have called them “class primaries”. These all fit the pattern “beginning with the terminal symbol A and ending with B”, so even if many more such rules were added, it would never be a problem.

▼ primary (12)

        | kBREAK
        | kNEXT
        | kREDO
        | kRETRY

Various jumps. These are, well, not important from the viewpoint of grammar.

Conflicting Lists

In the previous section, the question “is it all right that if is in such a primary?” was raised. Proving it rigorously is not easy, but explaining it intuitively is relatively easy. Here, let’s simulate with a small rule defined as follows:

%token A B o
%%
element   : A item_list B

item_list :
          | item_list item

item      : element
          | o

element is the element that we are going to examine. For example, if we think about if, it would be if. element is a list that starts with the terminal symbol A and ends with B. As for if, it starts with if and ends with end. The o contents are methods, variable references or literals. Each item of the list is either an o or a nested element.

With the parser based on this grammar, let’s try to parse the following input.

A  A  o  o  o  B  o  A  o  A  o  o  o  B  o  B  B

It is nested too many times for humans to comprehend without some help such as indentation. But it becomes relatively easy if you think of it in the following way. Because an A and B pair that contains only several o between them is certain to appear, replace it with a single o when it appears. All we have to do is repeat this procedure. Figure 4 shows the result.

Figure 4: parse a list which starts with A and ends with B
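The “replace an innermost A o ... o B with a single o” procedure can be tried out by hand-coding it. The following is only a sketch under assumptions of my own: tokens are the single characters 'A', 'B' and 'o', spaces are skipped, and reduce_element() is a name made up for this illustration (it is not what yacc generates). It returns how many times the replacement was performed, or -1 when the input is not a single well-formed element.

```c
#include <stddef.h>

/* Hypothetical helper: repeatedly collapse "A o* B" into one 'o'
 * using an explicit stack.  Returns the number of collapses, or -1
 * if the input is not exactly one well-formed element. */
int reduce_element(const char *input)
{
    char stack[256];
    size_t sp = 0;
    int reductions = 0;
    const char *p;

    for (p = input; *p; p++) {
        if (*p == ' ') continue;
        if (*p == 'A' || *p == 'o') {
            if (sp == sizeof stack) return -1;      /* input too deep */
            stack[sp++] = *p;
        }
        else if (*p == 'B') {
            while (sp > 0 && stack[sp - 1] == 'o') sp--;    /* drop the items */
            if (sp == 0 || stack[sp - 1] != 'A') return -1; /* no matching A */
            stack[sp - 1] = 'o';    /* the whole A ... B becomes one o */
            reductions++;
        }
        else {
            return -1;              /* unknown token */
        }
    }
    /* success only if everything collapsed into a single o */
    return (sp == 1 && stack[0] == 'o' && reductions > 0) ? reductions : -1;
}
```

For the input above the function performs four replacements, one per A. If every B is removed from the input, nothing can ever be collapsed and the function reports failure, which is the situation examined next.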

However, if the ending B is missing, ...

%token A o
%%
element   : A item_list    /* B is deleted for an experiment */

item_list :
          | item_list item

item      : element
          | o

I processed this with yacc and got 2 shift/reduce conflicts. It means this grammar is ambiguous. If we simply take B out of the previous input, the input would be as follows.

A  A  o  o  o  o  A  o  A  o  o  o  o

This is hard to interpret in any way. However, there was a rule that says “choose shift if it is a shift/reduce conflict”, so let’s follow it as an experiment and parse the input with shift (meaning the interior) taking precedence. (Figure 5)

Figure 5: parse a list of lists which start with A

It could be parsed. However, this is completely different from the intention of the input, and there is no longer any way to end a list in the middle.

Actually, method calls without parentheses in Ruby are in a similar situation. It’s not so easy to see, but the pair of a method name and its first argument is A. This is because there is no comma between just those two, so the pair can be recognized as the start of a new list.

“Practical” HTML also contains this pattern, for instance when </p> or </i> is omitted. That’s why yacc cannot be used for ordinary HTML at all.

Scanner

Parser Outline

I’ll explain the outline of the parser before moving on to the scanner. Take a look at Figure 6.

Figure 6: Parser Interface (Call Graph)

There are three official interfaces of the parser: rb_compile_cstr(), rb_compile_string() and rb_compile_file(). They read a program from a C string, a Ruby string object and a Ruby IO object, respectively, and compile it.

These functions, directly or indirectly, call yycompile(), and in the end control is handed over completely to yyparse(), which is generated by yacc. Since the heart of the parser is nothing but yyparse(), it helps to understand things by placing yyparse() at the center. In other words, the functions called before moving on to yyparse() are all preparations, and the functions after yyparse() are merely chore functions being pushed around by yyparse().

The remaining functions in parse.y are auxiliary functions called by yylex(), and these can also be clearly categorized.

First, the input buffer is at the lowest level of the scanner. ruby is designed so that you can input source programs via both Ruby IO objects and strings. The input buffer hides that and makes it look like a single byte stream.

The next level is the token buffer. It reads 1 byte at a time from the input buffer, and keeps the bytes until they form a token.

Therefore, the whole structure of yylex can be depicted as in Figure 7.

Figure 7: The whole picture of the scanner

The input buffer

Let’s start with the input buffer. Its interfaces are only these three: nextc(), pushback(), peek().

At the risk of repeating myself, the first thing to do is to investigate the data structures. The variables used by the input buffer are the following:

▼ the input buffer

static char *lex_pbeg;
static char *lex_p;
static char *lex_pend;

(parse.y)

The beginning, the current position and the end of the buffer. Apparently, this buffer is a simple single-line string buffer (Figure 8).

Figure 8: The input buffer

nextc()

Then, let’s look at the places that use them. First, I’ll start with nextc(), which seems the most orthodox.

▼ nextc()

static inline int
nextc()
{
    int c;

    if (lex_p == lex_pend) {
        if (lex_input) {
            VALUE v = lex_getline();

            if (NIL_P(v)) return -1;
            if (heredoc_end > 0) {
                ruby_sourceline = heredoc_end;
                heredoc_end = 0;
            }
            ruby_sourceline++;
            lex_pbeg = lex_p = RSTRING(v)->ptr;
            lex_pend = lex_p + RSTRING(v)->len;
            lex_lastline = v;
        }
        else {
            lex_lastline = 0;
            return -1;
        }
    }
    c = (unsigned char)*lex_p++;
    if (c == '\r' && lex_p <= lex_pend && *lex_p == '\n') {
        lex_p++;
        c = '\n';
    }

    return c;
}

(parse.y)

It seems that the first if tests whether we have reached the end of the input buffer. And the if inside it seems, since the else returns -1 (EOF), to test for the end of the whole input. Conversely speaking, when the input ends, lex_input becomes 0. ((errata: it does not. lex_input will never become 0 during an ordinary scan.))

From this, we can see that strings come into the input buffer bit by bit. Since the name of the function which updates the buffer is lex_getline, it is certain that one line comes in at a time.

Here is the summary:

if ( reached the end of the buffer )
    if (still there's more input)
        read the next line
    else
        return EOF
move the pointer forward
skip reading CR of CRLF
return c

Let’s also look at the function lex_getline(), which provides lines. The variables used by this function are shown together with it below.

▼ lex_getline()

static VALUE (*lex_gets)();     /* gets function */
static VALUE lex_input;         /* non-nil if File */

static VALUE
lex_getline()
{
    VALUE line = (*lex_gets)(lex_input);
    if (ruby_debug_lines && !NIL_P(line)) {
        rb_ary_push(ruby_debug_lines, line);
    }
    return line;
}

(parse.y)

Except for the first line, this is not important. Apparently, lex_gets is the pointer to the function that reads a line, and lex_input is the actual input. I searched for the places where lex_gets is set, and this is what I found:

▼ set lex_gets

NODE*
rb_compile_string(f, s, line)
    const char *f;
    VALUE s;
    int line;
{
    lex_gets = lex_get_str;
    lex_gets_ptr = 0;
    lex_input = s;

NODE*
rb_compile_file(f, file, start)
    const char *f;
    VALUE file;
    int start;
{
    lex_gets = rb_io_gets;
    lex_input = file;

(parse.y)

rb_io_gets() is not a function exclusive to the parser but part of the general-purpose library of Ruby. It is the function that reads a line from an IO object.

On the other hand, lex_get_str() is defined as follows:

▼ lex_get_str()

static int lex_gets_ptr;

static VALUE
lex_get_str(s)
    VALUE s;
{
    char *beg, *end, *pend;

    beg = RSTRING(s)->ptr;
    if (lex_gets_ptr) {
        if (RSTRING(s)->len == lex_gets_ptr) return Qnil;
        beg += lex_gets_ptr;
    }
    pend = RSTRING(s)->ptr + RSTRING(s)->len;
    end = beg;
    while (end < pend) {
        if (*end++ == '\n') break;
    }
    lex_gets_ptr = end - RSTRING(s)->ptr;
    return rb_str_new(beg, end - beg);
}

(parse.y)

lex_gets_ptr remembers how far the string has already been read. The function moves it to just past the next \n, and at the same time cuts out the part up to that point and returns it.

Here, let’s go back to nextc. As described, two functions with the same interface are prepared, the function pointer is switched when the parser is initialized, and all the other code is shared. It can also be said that the difference in code is converted into data and absorbed. A similar technique was used in st_table.

pushback()

With the knowledge of the physical structure of the buffer and nextc, we can understand the rest easily. pushback() writes back a character. In terms of the C library, it is ungetc().

▼ pushback()

static void
pushback(c)
    int c;
{
    if (c == -1) return;
    lex_p--;
}

(parse.y)

peek()

peek() checks the next character without moving the pointer forward.

▼ peek()

#define peek(c) (lex_p != lex_pend && (c) == *lex_p)

(parse.y)
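To see the three interfaces working together, here is a minimal standalone sketch of a line-at-a-time input buffer over a plain C string. It is not ruby’s code: the names input_init(), next_char(), push_back() and peek_char() and the static variables are stand-ins invented for this illustration, there is no VALUE or IO object involved, and the CR check uses a strict < because a plain buffer should not be read at its end pointer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for lex_input / lex_gets_ptr / lex_p / lex_pend. */
static const char *src;      /* whole source text */
static size_t src_pos;       /* how much of src has been handed out as lines */
static const char *buf_p;    /* current position in the current line */
static const char *buf_end;  /* end of the current line */

void input_init(const char *text)
{
    src = text;
    src_pos = 0;
    buf_p = buf_end = text;  /* empty buffer: first read fetches a line */
}

/* Load the next line (including its '\n'); returns 0 at end of input. */
static int get_line(void)
{
    size_t len = strlen(src);
    size_t end = src_pos;

    if (src_pos == len) return 0;
    while (end < len) {
        if (src[end++] == '\n') break;
    }
    buf_p = src + src_pos;
    buf_end = src + end;
    src_pos = end;
    return 1;
}

int next_char(void)
{
    int c;

    if (buf_p == buf_end && !get_line()) return -1;   /* EOF */
    c = (unsigned char)*buf_p++;
    if (c == '\r' && buf_p < buf_end && *buf_p == '\n') {
        buf_p++;             /* fold CRLF into a single '\n' */
        c = '\n';
    }
    return c;
}

void push_back(int c)
{
    if (c == -1) return;     /* EOF was never consumed */
    buf_p--;
}

int peek_char(int c)
{
    return buf_p != buf_end && c == *buf_p;
}
```

Note that, just like the original macro, peek_char() only looks inside the current line: at the end of a line it answers “no” rather than fetching the next one.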

The Token Buffer

The token buffer is the buffer of the next level. It keeps the characters until a token can be cut out. There are six interfaces, as follows:

newtok    begin a new token
tokadd    add a character to the buffer
tokfix    fix (terminate) a token
tok       the pointer to the beginning of the buffered string
toklen    the length of the buffered string
toklast   the last byte of the buffered string

Now, we’ll start with the data structures.

▼ The Token Buffer

static char *tokenbuf = NULL;
static int tokidx, toksiz = 0;

(parse.y)

tokenbuf is the buffer, tokidx is the end of the token (since it is an int, it seems to be an index), and toksiz is probably the buffer length. This is also simply structured. Depicted as a figure, it would look like Figure 9.

Figure 9: The token buffer

Let’s move straight on to the interface and read newtok(), which starts a new token.

▼ newtok()

static char*
newtok()
{
    tokidx = 0;
    if (!tokenbuf) {
        toksiz = 60;
        tokenbuf = ALLOC_N(char, 60);
    }
    if (toksiz > 4096) {
        toksiz = 60;
        REALLOC_N(tokenbuf, char, 60);
    }
    return tokenbuf;
}

(parse.y)

There is no interface that initializes the whole buffer, so it is possible that the buffer has not been initialized yet. Therefore, the first if checks for that and initializes it. ALLOC_N() is a macro that ruby defines, and it takes its arguments in almost the same way as calloc.

The initial allocation length is 60, and if the buffer has become too big (> 4096), it is shrunk back to the small size. Since a token is unlikely to become that long, this size is realistic.

Next, let’s look at tokadd(), which adds a character to the token buffer.

▼ tokadd()

static void
tokadd(c)
    char c;
{
    tokenbuf[tokidx++] = c;
    if (tokidx >= toksiz) {
        toksiz *= 2;
        REALLOC_N(tokenbuf, char, toksiz);
    }
}

(parse.y)

On the first line, a character is added. Then it checks the token length, and if the token is about to exceed the end of the buffer, it performs REALLOC_N(). REALLOC_N() is a realloc() whose arguments are specified in the same way as calloc().

The remaining interfaces are summarized below.

▼ tokfix() tok() toklen() toklast()

#define tokfix() (tokenbuf[tokidx]='\0')
#define tok() tokenbuf
#define toklen() tokidx
#define toklast() (tokidx>0?tokenbuf[tokidx-1]:0)

(parse.y)

There’s probably no question.
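The same grow-by-doubling buffer can be written with nothing but the standard library. The sketch below is an assumption-laden imitation, not ruby’s code: the TokBuf struct and the tb_* names are invented here, malloc()/realloc() stand in for ALLOC_N()/REALLOC_N(), and the initial size is deliberately tiny so that the growth is easy to observe.

```c
#include <stdlib.h>

typedef struct {
    char  *buf;   /* plays the role of tokenbuf */
    size_t idx;   /* plays the role of tokidx   */
    size_t size;  /* plays the role of toksiz   */
} TokBuf;

/* Start a new token; allocate on first use.  Returns 0 on failure. */
int tb_new(TokBuf *tb)
{
    tb->idx = 0;
    if (!tb->buf) {
        tb->size = 8;
        tb->buf = malloc(tb->size);
        if (!tb->buf) return 0;
    }
    return 1;
}

/* Append one character, doubling the buffer when it fills up so that
 * there is always room left for the terminating '\0'. */
int tb_add(TokBuf *tb, char c)
{
    tb->buf[tb->idx++] = c;
    if (tb->idx >= tb->size) {
        char *p = realloc(tb->buf, tb->size * 2);
        if (!p) return 0;
        tb->buf = p;
        tb->size *= 2;
    }
    return 1;
}

/* Terminate the token and return it as a C string (like tokfix + tok). */
const char *tb_fix(TokBuf *tb)
{
    tb->buf[tb->idx] = '\0';
    return tb->buf;
}

/* Last byte of the buffered token, or 0 when it is empty (like toklast). */
int tb_last(const TokBuf *tb)
{
    return tb->idx > 0 ? tb->buf[tb->idx - 1] : 0;
}
```

Growing as soon as idx reaches size, rather than when it exceeds it, is what guarantees that tb_fix() can always write the terminator without a further check.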

yylex()

yylex() is very long. Currently, it has more than 1000 lines. Most of it is occupied by a huge switch statement that branches on each character. First, I’ll show the overall structure with some parts left out.

▼ yylex outline

static int
yylex()
{
    static ID last_id = 0;
    register int c;
    int space_seen = 0;
    int cmd_state;

    if (lex_strterm) {
        /* ... string scan ... */
        return token;
    }
    cmd_state = command_start;
    command_start = Qfalse;
  retry:
    switch (c = nextc()) {
      case '\0':                /* NUL */
      case '\004':              /* ^D */
      case '\032':              /* ^Z */
      case -1:                  /* end of script. */
        return 0;

        /* white spaces */
      case ' ': case '\t': case '\f': case '\r':
      case '\13': /* '\v' */
        space_seen++;
        goto retry;

      case '#':         /* it's a comment */
        while ((c = nextc()) != '\n') {
            if (c == -1)
                return 0;
        }
        /* fall through */
      case '\n':
        /* ... omission ... */

      case xxxx:
            :
        break;
            :
        /* branches a lot for each character */
            :
            :
      default:
        if (!is_identchar(c) || ISDIGIT(c)) {
            rb_compile_error("Invalid char `\\%03o' in expression", c);
            goto retry;
        }

        newtok();
        break;
    }

    /* ... deal with ordinary identifiers ... */
}

(parse.y)

As for the return value of yylex(), zero means that the input has finished, and non-zero means a symbol.

Be careful: an extremely concise variable named “c” is used all over this function. The space_seen++ done when reading a space will become helpful later.

All that remains is to keep branching on each character and processing it, but since this is one monotonous procedure after another, it is boring for readers. Therefore, we’ll narrow it down to a few points. Not all characters will be explained in this book, but the rest is easy if you extend the same pattern.

'!'

Let’s start with something simple.

▼ yylex – '!'

case '!':
  lex_state = EXPR_BEG;
  if ((c = nextc()) == '=') {
      return tNEQ;
  }
  if (c == '~') {
      return tNMATCH;
  }
  pushback(c);
  return '!';

(parse.y)

I wrote out the meaning of the code, so I’d like you to read the two while comparing them.

case '!':
  move to EXPR_BEG
  if (the next character is '=' then) {
      token is 「!=(tNEQ)」
  }
  if (the next character is '~' then) {
      token is 「!~(tNMATCH)」
  }
  if it is neither, push the read character back
  token is '!'

This case clause is short, but it shows an important rule of the scanner: “the longest match rule”. The two characters "!=" can be interpreted in two ways, “! and =” or “!=”, but in this case "!=" must be selected. The longest match is essential for scanners of programming languages.
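The longest match rule is easy to isolate from the rest of yylex(). The following sketch works on a plain C string instead of the input buffer; scan_bang() and the TOK_* constants are names made up for this illustration and are not ruby’s tNEQ or tNMATCH.

```c
/* Hypothetical token codes for this sketch only. */
enum { TOK_NOT = '!', TOK_NEQ = 256, TOK_NMATCH };

/* *pp must point at '!'.  Scans the longest token that starts there,
 * advances *pp past it, and returns the token code. */
int scan_bang(const char **pp)
{
    const char *p = *pp;
    int tok;

    p++;                                   /* consume '!' */
    if (*p == '=')      { p++; tok = TOK_NEQ; }     /* "!=" wins over "!" */
    else if (*p == '~') { p++; tok = TOK_NMATCH; }  /* "!~" wins over "!" */
    else                { tok = TOK_NOT; }          /* nothing longer matched */
    *pp = p;
    return tok;
}
```

Not advancing the pointer in the last branch corresponds to pushback(c) in the original: the character that was looked at but did not extend the token is left for the next call.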

lex_state is the variable that represents the state of the scanner. It will be discussed at length in the next chapter, “Finite-State Scanner”, so you can ignore it for now. EXPR_BEG indicates “we are clearly at the beginning of an expression”. This is because, whether it is the ! of not, or !=, or !~, the symbol that follows is the beginning of an expression.

'>'

Next, we’ll look at '>' as an example of using yylval (the value of a symbol).

▼ yylex – '>'

case '>':
  switch (lex_state) {
    case EXPR_FNAME: case EXPR_DOT:
      lex_state = EXPR_ARG; break;
    default:
      lex_state = EXPR_BEG; break;
  }
  if ((c = nextc()) == '=') {
      return tGEQ;
  }
  if (c == '>') {
      if ((c = nextc()) == '=') {
          yylval.id = tRSHFT;
          lex_state = EXPR_BEG;
          return tOP_ASGN;
      }
      pushback(c);
      return tRSHFT;
  }
  pushback(c);
  return '>';

(parse.y)

Everything except yylval can be ignored. Concentrating on only one point when reading a program is essential.

At this point, for the symbol tOP_ASGN of >>=, it sets tRSHFT as its value. Since the union member used is id, its type is ID. tOP_ASGN is the symbol of self assignment; it represents all of the things like += and -= and *=. In order to distinguish them later, it passes the kind of self assignment as a value.

The reason why the self assignments are bundled is that it makes the rules shorter. Bundling as many things as possible at the scanner makes the rules more concise. Then, why are the binary arithmetic operators not bundled? It is because they differ in their precedences.
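The idea of “one bundled symbol plus a value that says which operator it was” can be sketched without yacc. In the following, the Token struct takes the place of the pair (return value of yylex(), yylval); scan_gt(), Token and the T_* constants are hypothetical names for this illustration only.

```c
/* Hypothetical token codes for this sketch only. */
enum { T_GT = '>', T_GEQ = 256, T_RSHFT, T_OP_ASGN };

typedef struct {
    int type;   /* what the grammar sees (the return value of yylex) */
    int value;  /* for T_OP_ASGN: which operator it stands for (like yylval.id) */
} Token;

/* *pp must point at '>'.  Scans ">", ">=", ">>" or ">>=" and advances *pp. */
Token scan_gt(const char **pp)
{
    const char *p = *pp + 1;
    Token t = { T_GT, 0 };

    if (*p == '=') {
        p++;
        t.type = T_GEQ;
    }
    else if (*p == '>') {
        p++;
        if (*p == '=') {
            p++;
            t.type = T_OP_ASGN;   /* bundled self assignment ...        */
            t.value = T_RSHFT;    /* ... remembered as "the >> kind"    */
        }
        else {
            t.type = T_RSHFT;
        }
    }
    *pp = p;
    return t;
}
```

With this shape, a grammar needs only one rule mentioning T_OP_ASGN, and the action attached to that rule can look at value to decide which operation to build.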

':'

If scanning were completely independent from parsing, this discussion would be simple. But in reality, it is not that simple. The Ruby grammar is particularly complex: a token can have a somewhat different meaning when there is a space in front of it, and the way tokens are split changes depending on the surrounding situation. The code for ':' shown below is an example where a space changes the behavior.

▼ yylex – ':'

case ':':
  c = nextc();
  if (c == ':') {
      if (lex_state == EXPR_BEG ||  lex_state == EXPR_MID ||
          (IS_ARG() && space_seen)) {
          lex_state = EXPR_BEG;
          return tCOLON3;
      }
      lex_state = EXPR_DOT;
      return tCOLON2;
  }
  pushback(c);
  if (lex_state == EXPR_END ||
      lex_state == EXPR_ENDARG ||
      ISSPACE(c)) {
      lex_state = EXPR_BEG;
      return ':';
  }
  lex_state = EXPR_FNAME;
  return tSYMBEG;

(parse.y)

Again, ignoring the things relating to lex_state, I’d like you to focus on space_seen.

space_seen is the variable that becomes true when there is a space before a token. If the condition is met, meaning there is a space in front of '::', it becomes tCOLON3; if there is not, it seems to become tCOLON2. This is as I explained at primary in the previous section.
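Stripped of everything else, the decision for '::' depends on three yes/no facts. The sketch below restates just that decision; classify_colon2() and its three parameters are simplifications invented here (the real code consults lex_state and IS_ARG() directly).

```c
/* Hypothetical token codes for this sketch only. */
enum { T_COLON2 = 256, T_COLON3 };

/* at_expr_start: an expression may begin here (roughly EXPR_BEG or EXPR_MID).
 * in_arg:        a method argument may begin here (roughly IS_ARG()).
 * space_seen:    whitespace was skipped just before the "::". */
int classify_colon2(int at_expr_start, int in_arg, int space_seen)
{
    if (at_expr_start || (in_arg && space_seen))
        return T_COLON3;    /* top-level "::Const" */
    return T_COLON2;        /* scope operator, as in "Net::HTTP" */
}
```

Under these assumptions, p Net::HTTP corresponds to the call with space_seen being 0 (tCOLON2), and p Net ::HTTP to the call with in_arg and space_seen both being 1 (tCOLON3).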

Identifier

Until now, since there were only symbols, each token was just one or two characters. This time, we’ll look at something a little longer: the scanning pattern of identifiers.

First, the outline of yylex was as follows:

yylex(...)
{
    switch (c = nextc()) {
      case xxxx:
        ....
      case xxxx:
        ....
      default:
    }

   the scanning code of identifiers
}

The next code is an extract from the end of the huge switch. It is relatively long, so I’ll show it with comments.

▼ yylex – identifiers

      case '@':                 /* an instance variable or a class variable */
        c = nextc();
        newtok();
        tokadd('@');
        if (c == '@') {         /* @@, meaning a class variable */
            tokadd('@');
            c = nextc();
        }
        if (ISDIGIT(c)) {       /* @1 and such  */
            if (tokidx == 1) {
    rb_compile_error("`@%c' is not a valid instance variable name", c);
            }
            else {
    rb_compile_error("`@@%c' is not a valid class variable name", c);
            }
        }
        if (!is_identchar(c)) { /* a strange character appears next to @  */
            pushback(c);
            return '@';
        }
        break;

      default:
        if (!is_identchar(c) || ISDIGIT(c)) {
            rb_compile_error("Invalid char `\\%03o' in expression", c);
            goto retry;
        }

        newtok();
        break;
    }

    while (is_identchar(c)) {   /* between characters that can be used as identifiers */
        tokadd(c);
        if (ismbchar(c)) {      /* if it is the head byte of a multi-byte character */
            int i, len = mbclen(c)-1;

            for (i = 0; i < len; i++) {
                c = nextc();
                tokadd(c);
            }
        }
        c = nextc();
    }
    if ((c == '!' || c == '?') &&
        is_identchar(tok()[0]) &&
        !peek('=')) {      /* the end character of name! or name? */
        tokadd(c);
    }
    else {
        pushback(c);
    }
    tokfix();

(parse.y)

Finally, I’d like you to focus on the condition at the place where ! or ? is added. This part is there so that the following are interpreted in this way.

obj.m=1       # obj.m  =   1       (not obj.m=)
obj.m!=1      # obj.m  !=  1       (not obj.m!)

((errata: this code is not relating to that condition))

This is “not” longest-match. The “longest-match” is a principle but not a constraint. Sometimes, you can refuse it.
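The “take a trailing ! or ? unless an = follows” condition can be tried out in isolation. The sketch below is a simplification under assumptions of my own: it handles ASCII identifiers only (no multi-byte characters, no @ prefixes), works on a plain C string, and scan_ident() is a name invented for this illustration.

```c
#include <ctype.h>
#include <stddef.h>

/* Copies the identifier at the start of src into out (capacity cap) and
 * returns the number of source characters consumed, or 0 if src does not
 * start with an identifier or out is too small. */
size_t scan_ident(const char *src, char *out, size_t cap)
{
    size_t n = 0;

    if (!(isalpha((unsigned char)src[0]) || src[0] == '_')) return 0;
    while (isalnum((unsigned char)src[n]) || src[n] == '_') {
        if (n + 2 >= cap) return 0;   /* keep room for '!'/'?' and '\0' */
        out[n] = src[n];
        n++;
    }
    /* Take a trailing '!' or '?' only when no '=' follows, so that
     * "m!=1" is read as m != 1 rather than m! = 1. */
    if ((src[n] == '!' || src[n] == '?') && src[n + 1] != '=') {
        out[n] = src[n];
        n++;
    }
    out[n] = '\0';
    return n;
}
```

The check src[n + 1] != '=' plays the role of !peek('=') in the original: it is one character of lookahead used to decline a longer match.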

The reserved words

After scanning the identifiers, there are about 100 more lines of code to determine the actual symbols. In the previous code, instance variables, class variables and local variables are all scanned at once, but they are categorized here.

This is fine, but inside it there is a slightly strange part: the part that filters out the reserved words. Since the reserved words do not differ from local variables in the kinds of characters they use, scanning them together and categorizing them later is more efficient.

Then, assuming there is str, a char* string, how can we determine whether it is a reserved word? First, of course, there is the way of comparing against many candidates with if statements and strcmp(). However, this is not smart at all. It is not flexible, and its running time grows linearly with the number of reserved words. Usually, only the data is separated out into a list or a hash in order to keep the code short.

/* convert the code to data */
struct entry {char *name; int symbol;};
struct entry table[] = {
    {"if",     kIF},
    {"unless", kUNLESS},
    {"while",  kWHILE},
    /* …… omission …… */
};

{
    ....
    return lookup_symbol(table, tok());
}
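Filled in, the “code converted to data” version might look like the sketch below. Everything here is an assumption made for illustration: the K_* token numbers are arbitrary, the table holds only three words, and lookup_symbol() is given a simpler signature than in the fragment above and does a plain linear search.

```c
#include <string.h>
#include <stddef.h>

/* Hypothetical token numbers for this sketch only. */
enum { K_IF = 257, K_UNLESS, K_WHILE };

struct entry { const char *name; int symbol; };

static const struct entry table[] = {
    { "if",     K_IF     },
    { "unless", K_UNLESS },
    { "while",  K_WHILE  },
};

/* Returns the symbol for a reserved word, or 0 if name is not reserved. */
int lookup_symbol(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(table[i].name, name) == 0) return table[i].symbol;
    }
    return 0;
}
```

Adding a reserved word now means adding one line of data; the search itself is still linear, which is the remaining weakness that a hash table addresses.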

So what does ruby do? It uses a hash table. Furthermore, it is a perfect hash. As I said when talking about st_table, if you know the set of possible keys beforehand, you can sometimes create a hash function that never conflicts. As for the reserved words, “the set of possible keys is known beforehand”, so it is likely that we can create a perfect hash function.

But “being able to create” and actually creating are different things. Creating one manually is far too cumbersome. Since reserved words can be added or removed, this kind of process must be automated.

This is where gperf comes in. gperf is one of the GNU products; it generates a perfect hash function from a set of values. To learn the usage of gperf itself in detail, I recommend running man gperf. Here, I’ll only describe how to use the generated result.

In ruby, the input file for gperf is keywords and the output is lex.c. parse.y #includes it directly. Basically, #including C files is not good, but splitting out a non-essential file for just one function is worse. In ruby in particular, there is the possibility that extern functions are used by extension libraries without anyone noticing, so a function whose compatibility we do not want to maintain should be static.

Then, in lex.c, a function named rb_reserved_word() is defined. By calling it with the char* of a reserved word as the key, you can look the word up. The return value is NULL if it is not found, and a struct kwtable* if it is found (in other words, if the argument is a reserved word). The definition of struct kwtable is as follows:

▼ kwtable

struct kwtable {char *name; int id[2]; enum lex_state state;};

(keywords)

name is the name of the reserved word, id[0] is its symbol, and id[1] is its symbol as a modifier (kIF_MOD and such). state is “the lex_state that the scanner should move to after reading this reserved word”. lex_state will be explained in the next chapter.

This is the place where the lookup is actually done.

▼ yylex() – identifier – call rb_reserved_word()

    struct kwtable *kw;

    /* See if it is a reserved word.  */
    kw = rb_reserved_word(tok(), toklen());
    if (kw) {

(parse.y)

Strings

The double quote (") part of yylex() is this.

▼ yylex – '"'

case '"':
  lex_strterm = NEW_STRTERM(str_dquote, '"', 0);
  return tSTRING_BEG;

(parse.y)

Surprisingly, it finishes after scanning only the first character. So this time, looking at the rules, tSTRING_BEG is found in the following part:

▼ rules for strings

string1         : tSTRING_BEG string_contents tSTRING_END

string_contents :
                | string_contents string_content

string_content  : tSTRING_CONTENT
                | tSTRING_DVAR string_dvar
                | tSTRING_DBEG term_push compstmt '}'

string_dvar     : tGVAR
                | tIVAR
                | tCVAR
                | backref

term_push       :

These rules are the part introduced to deal with embedded expressions inside strings. tSTRING_CONTENT is the literal part and tSTRING_DBEG is "#{". tSTRING_DVAR represents “a # that is in front of a variable”. For example,

".....#$gvar...."

is this kind of syntax. I have not explained it, but when the embedded expression is only a variable, { and } can be left out. This is often not recommended, though. The D of DVAR and DBEG seems to be an abbreviation of dynamic.

And backref represents the special variables relating to regular expressions, such as $1 $2 or $& $'.

term_push is “a rule defined for its action”.

Now, we’ll go back to yylex(). If it simply returned to the parser, then, since the context is the “interior” of a string, it would be a problem if a variable, an if and so on were suddenly scanned in the next yylex(). What plays an important role there is ...

case '"':
  lex_strterm = NEW_STRTERM(str_dquote, '"', 0);
  return tSTRING_BEG;

... lex_strterm. Let’s go back to the beginning of yylex().

▼ the beginning of yylex()

static int
yylex()
{
    static ID last_id = 0;
    register int c;
    int space_seen = 0;
    int cmd_state;

    if (lex_strterm) {
        /* scanning string */
        return token;
    }
    cmd_state = command_start;
    command_start = Qfalse;
  retry:
    switch (c = nextc()) {

(parse.y)

If lex_strterm exists, it enters the string mode without asking. Conversely, this means that whenever lex_strterm is set, a string is being scanned, so when parsing the embedded expressions inside strings you have to set lex_strterm to 0. And when the embedded expression ends, you have to set it back. This is done in the following part:

▼ string_content

string_content  : ....
                | tSTRING_DBEG term_push
                    {
                        $<num>1 = lex_strnest;
                        $<node>$ = lex_strterm;
                        lex_strterm = 0;
                        lex_state = EXPR_BEG;
                    }
                  compstmt '}'
                    {
                        lex_strnest = $<num>1;
                        quoted_term = $2;
                        lex_strterm = $<node>3;
                        if (($$ = $4) && nd_type($$) == NODE_NEWLINE) {
                            $$ = $$->nd_next;
                            rb_gc_force_recycle((VALUE)$4);
                        }
                        $$ = NEW_EVSTR($$);
                    }

(parse.y)

In the embedded action, lex_strterm is saved on the parser’s value stack as the value of the embedded action itself ($<node>$), and lex_strnest is saved as the value of tSTRING_DBEG; in effect this is a stack push. Both are recovered in the ordinary action (pop). This is a fairly smart way.

But why does it do such a tedious thing? Couldn’t it be done by scanning normally and then calling yyparse() recursively at the point where #{ is found? There is actually a problem: yyparse() can’t be called recursively. This is a well-known limitation of yacc. Since the yyval that is used to receive or pass a value is a global variable, careless recursive calls can destroy the value. With bison (the yacc of GNU), recursive calls are possible by using the %pure_parser directive, but the current ruby decided not to assume bison. In reality, byacc (Berkeley yacc) is often used on BSD-derived OSes, Windows and so on, so assuming bison would be a little cumbersome.

lex_strterm

As we’ve seen, when you consider lex_strterm as a boolean value, it represents whether or not the scanner is in the string mode. But its contents also have a meaning. First, let’s look at its type.

▼ lex_strterm

static NODE *lex_strterm;

(parse.y)

This definition shows that its type is NODE*. This is the type used for the syntax tree and will be discussed in detail in Chapter 12: Syntax tree construction. For the time being, you only need to remember two points: it is a structure that has three elements, and since it is a VALUE you don’t have to free() it.

▼ NEW_STRTERM()

#define NEW_STRTERM(func, term, paren) \
        rb_node_newnode(NODE_STRTERM, (func), (term), (paren))

(parse.y)

This is a macro that creates a node to be stored in lex_strterm. First, term is the terminating character of the string. For example, if it is a " string, it is ", and if it is a ' string, it is '.

paren is used to store the corresponding opening parenthesis when it is a % string. For example,

%Q(..........)

in this case, paren stores '(', and term stores the closing parenthesis ')'. If it is not a % string, paren is 0.

Lastly, func indicates the type of the string. The available types are defined as follows:

▼ func

#define STR_FUNC_ESCAPE 0x01  /* backslash notations such as \n are in effect  */
#define STR_FUNC_EXPAND 0x02  /* embedded expressions are in effect */
#define STR_FUNC_REGEXP 0x04  /* it is a regular expression */
#define STR_FUNC_QWORDS 0x08  /* %w(....) or %W(....) */
#define STR_FUNC_INDENT 0x20  /* <<-EOS(the finishing symbol can be indented) */

enum string_type {
    str_squote = (0),
    str_dquote = (STR_FUNC_EXPAND),
    str_xquote = (STR_FUNC_ESCAPE|STR_FUNC_EXPAND),
    str_regexp = (STR_FUNC_REGEXP|STR_FUNC_ESCAPE|STR_FUNC_EXPAND),
    str_sword  = (STR_FUNC_QWORDS),
    str_dword  = (STR_FUNC_QWORDS|STR_FUNC_EXPAND),
};

(parse.y)

The meaning of each member of enum string_type is as follows:

str_squote   ' string / %q
str_dquote   " string / %Q
str_xquote   command string (not explained in this book)
str_regexp   regular expression
str_sword    %w
str_dword    %W
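Because each string type is just an OR of feature bits, a scanner can ask “is feature X in effect?” with a single mask test. The flag values and the enum in the sketch below are copied from the listing above; the two query functions are hypothetical helpers added for illustration and do not appear in parse.y.

```c
/* Copied from the func listing. */
#define STR_FUNC_ESCAPE 0x01
#define STR_FUNC_EXPAND 0x02
#define STR_FUNC_REGEXP 0x04
#define STR_FUNC_QWORDS 0x08
#define STR_FUNC_INDENT 0x20

enum string_type {
    str_squote = (0),
    str_dquote = (STR_FUNC_EXPAND),
    str_xquote = (STR_FUNC_ESCAPE|STR_FUNC_EXPAND),
    str_regexp = (STR_FUNC_REGEXP|STR_FUNC_ESCAPE|STR_FUNC_EXPAND),
    str_sword  = (STR_FUNC_QWORDS),
    str_dword  = (STR_FUNC_QWORDS|STR_FUNC_EXPAND)
};

/* Hypothetical helper: are embedded expressions (#{...}) in effect? */
int expands_expressions(int func)
{
    return (func & STR_FUNC_EXPAND) != 0;
}

/* Hypothetical helper: is this a %w / %W word list? */
int is_word_list(int func)
{
    return (func & STR_FUNC_QWORDS) != 0;
}
```

For example, str_dword answers yes to both questions, while str_sword is a word list without expansion, which matches the difference between %W and %w.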

String scan function

What remains is to read yylex() in the string mode, in other words, the if at the beginning.

▼ yylex – string

if (lex_strterm) {
    int token;
    if (nd_type(lex_strterm) == NODE_HEREDOC) {
        token = here_document(lex_strterm);
        if (token == tSTRING_END) {
            lex_strterm = 0;
            lex_state = EXPR_END;
        }
    }
    else {
        token = parse_string(lex_strterm);
        if (token == tSTRING_END || token == tREGEXP_END) {
            rb_gc_force_recycle((VALUE)lex_strterm);
            lex_strterm = 0;
            lex_state = EXPR_END;
        }
    }
    return token;
}

(parse.y)

It is divided into two major groups: here documents and the others. But this time, we won’t read parse_string(). As I described previously, there are a lot of conditions and it is tremendously spaghetti-like code. If I tried to explain it, odds are high that readers would complain that “it is just as the code is written!”. Furthermore, although it requires a lot of effort, it is not interesting.

But not explaining it at all is also not a good thing to do, so a modified version in which a separate function is defined for each target to be scanned is contained in the attached CD-ROM (doc/parse_string.html). I’d like readers who are interested to look it over.

Here Document

In comparison to ordinary strings, here documents are fairly interesting. That may be because, unlike the other elements, they deal with a line at a time. Moreover, it is terrific that the starting symbol can exist in the middle of a program. First, I’ll show the code of yylex() that scans the starting symbol of a here document.

▼ yylex – '<'

case '<':
  c = nextc();
  if (c == '<' &&
      lex_state != EXPR_END &&
      lex_state != EXPR_DOT &&
      lex_state != EXPR_ENDARG &&
      lex_state != EXPR_CLASS &&
      (!IS_ARG() || space_seen)) {
      int token = heredoc_identifier();
      if (token) return token;

(parse.y)

As usual, we’ll ignore the herd of lex_state. Then, we can see that it reads only “<<” here and the rest is scanned in heredoc_identifier(). Therefore, here is heredoc_identifier().

▼ heredoc_identifier()

static int
heredoc_identifier()
{
    /* ... omission ... reading the starting symbol */
    tokfix();
    len = lex_p - lex_pbeg;   /* (A) */
    lex_p = lex_pend;         /* (B) */
    lex_strterm = rb_node_newnode(NODE_HEREDOC,
                        rb_str_new(tok(), toklen()),  /* nd_lit */
                        len,                          /* nd_nth */
          /* (C) */     lex_lastline);                /* nd_orig */

    return term == '`' ? tXSTRING_BEG : tSTRING_BEG;
}

(parse.y)

The part which reads the starting symbol (<<EOS) is not important, so it is totally left out. At this point, the input buffer has probably become as depicted in Figure 10. Let’s recall that the input buffer reads a line at a time.

Figure 10: scanning "printf(<<EOS, n)"

What heredoc_identifier() is doing is as follows: (A) len is the number of bytes read so far in the current line. (B) Then it suddenly moves lex_p to the end of the line. This means that, in the line that was read, the part after the starting symbol has been read but not parsed. When is that remaining part parsed? A hint for this mystery is that at (C) lex_lastline (the currently read line) and len (the length that has already been read) are saved.

Then, the dynamic call graph before and after heredoc_identifier is roughly as shown below:

yyparse
    yylex(case '<')
        heredoc_identifier(lex_strterm = ....)
    yylex(the beginning if)
        here_document

And this here_document() does the scanning of the body of the here document. Omitting the invalid cases and adding some comments, here_document() is shown below. Notice that lex_strterm remains unchanged after it was set in heredoc_identifier().

▼here_document()(simplified)

here_document(NODE*here){VALUEline;/*thelinecurrentlybeingscanned*/VALUEstr=rb_str_new("",0);/*astringtostoretheresults*/

/*...handlinginvalidconditions,omitted...*/

if(embededexpressionsnotineffect){do{line=lex_lastline;/*(A)*/rb_str_cat(str,RSTRING(line)->ptr,RSTRING(line)->len);lex_p=lex_pend;/*(B)*/if(nextc()==-1){/*(C)*/gotoerror;}}while(thecurrentlyreadlineisnotequaltothefinishingsymbol);}else{/*theembededexpressionsareavailable...omitted*/}heredoc_restore(lex_strterm);lex_strterm=NEW_STRTERM(-1,0,0);yylval.node=NEW_STR(str);returntSTRING_CONTENT;}

rb_str_cat() is the function that appends a char* to the end of a Ruby string. This means that the line currently being read, lex_lastline, is appended to str at (A). Once it has been appended, the current line is of no further use, so at (B) lex_p is abruptly moved to the end of the line. (C) is the catch: it looks like a check for whether the input is finished, but it actually reads the next “line”. Recall that nextc() automatically reads the next line when the current line has been read to the end. So, since the current line is forcibly finished at (B), lex_p moves to the next line at (C).

And finally, after leaving the do ~ while loop, comes heredoc_restore().

▼ heredoc_restore()

    static void
    heredoc_restore(here)
        NODE *here;
    {
        VALUE line = here->nd_orig;
        lex_lastline = line;
        lex_pbeg = RSTRING(line)->ptr;
        lex_pend = lex_pbeg + RSTRING(line)->len;
        lex_p = lex_pbeg + here->nd_nth;
        heredoc_end = ruby_sourceline;
        ruby_sourceline = nd_line(here);
        rb_gc_force_recycle(here->nd_lit);
        rb_gc_force_recycle((VALUE)here);
    }

(parse.y)

here->nd_orig holds the line which contains the starting symbol. here->nd_nth holds the length already read in the line that contains the starting symbol. This means scanning can continue from just after the starting symbol as if nothing had happened. (Figure 11)

Figure 11: The picture of assignation of scanning Here Document
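The save-and-restore trick can be tried out on a toy scale. The following is only a sketch in Ruby, not ruby's actual code: split_heredoc and its hash keys are hypothetical names, the input is an array of lines instead of a real input buffer, and embedded expressions are ignored. The saved line and offset play the roles of nd_orig and nd_nth.

```ruby
# Toy model of the here-document buffer trick (hypothetical helper).
# lines       - the source, one element per line
# start_index - index of the line containing the starting symbol
# terminator  - the finishing symbol, e.g. "EOS"
def split_heredoc(lines, start_index, terminator)
  saved_line = lines[start_index]                 # plays the role of nd_orig
  marker = "<<" + terminator
  offset = saved_line.index(marker) + marker.length  # plays the role of nd_nth

  # Read the body line by line, as the do ~ while loop does.
  body = String.new
  i = start_index + 1
  while lines[i] && lines[i].chomp != terminator
    body << lines[i]
    i += 1
  end

  # "Restore": continue right after the starting symbol of the saved line.
  { body: body, rest: saved_line[offset..-1], next_line: i + 1 }
end

src = ["printf(<<EOS, n)\n", "count=%d\n", "EOS\n", "puts 1\n"]
r = split_heredoc(src, 0, "EOS")
# r[:body] is the here-document text, r[:rest] is ", n)\n",
# and r[:next_line] points at the line after the finishing symbol.
```

The point of the sketch is the order of events: the body lines are consumed first, and only then does scanning resume in the middle of the line that was saved earlier.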

The original work is Copyright © 2002 - 2004 Minero AOKI. Translated by Vincent ISAMBART and Clifford Escobar CAOILE. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License.


Translated by Peter Zotov. I'm very grateful to my employer Evil Martians, who sponsored the work, and Nikolay Konovalenko, who put more effort in this translation than I could ever wish for. Without them, I would be still figuring out what COND_LEXPOP() actually does.

Chapter 11: Finite-state scanner

Outline

In theory, the scanner and the parser are completely independent of each other: the scanner is supposed to recognize tokens, while the parser is supposed to process the resulting series of tokens. It would be nice if things were that simple, but in reality they rarely are. Depending on the context of the program, it is often necessary to alter the way tokens are recognized or the symbols assigned to them. In this chapter we will take a look at the way the scanner and the parser cooperate.

Practical examples

In most programming languages, spaces don't have any specific meaning unless they are used to separate words. However, Ruby is not an ordinary language, and meanings can change significantly depending on the presence of spaces. Here is an example:

    a[i] = 1      # a[i] = (1)
    a [i]         # a([i])

The former is an example of assigning to an index. The latter is an example of omitting the method call parentheses and passing the array [i] as a parameter.

Here is another example.

    a  +  1    # (a) + (1)
    a  +1      # a(+1)

This seems to be really disliked by some.

However, the above examples might give one the impression that only omitting the method call parentheses can be a source of trouble. Let's look at a different example.

    `cvs diff parse.y`          # command call string
    obj.`("cvs diff parse.y")   # normal method call

Here, the former is a method call using a literal. In contrast, the latter is a normal method call (with '`' being the method name). Depending on the context, they could be handled quite differently.

Below is another example where the functioning changes dramatically:

    print(<<EOS)   # here-document
    ......
    EOS

    list = []
    list << nil    # list.push(nil)

The former is a method call using a here-document. The latter is a method call using an operator.

As demonstrated, Ruby's grammar contains many parts which are difficult to implement in practice. I couldn't realistically give a thorough description of all of them in just one chapter, so in this one I will look at the basic principles and those parts which present the most difficulty.
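These differences can be observed directly from Ruby code. The snippet below is a small sketch of my own, not taken from the ruby sources: the methods a and n and all the constant names are hypothetical, and each method has an optional parameter so that both readings are valid calls.

```ruby
# `a` and `n` are hypothetical methods with an optional parameter.
def a(x = [10, 20])
  x
end

def n(x = 10)
  x
end

INDEXED     = a[0]     # a()[0] : index into the returned array
ARRAY_ARG   = a [0]    # a([0]) : the array [0] is passed as the parameter

BINARY_PLUS = n + 1    # (n) + (1)
UNARY_PLUS  = n +1     # n(+1)

LIST = []
LIST << nil            # operator <<, same as LIST.push(nil)

TEXT = [<<EOS].first   # here-document
hello
EOS
```

Depending on the Ruby version and warning level, the `n +1` line may produce an "ambiguous first argument" warning, which is a good sign of how delicate this area is.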

lex_state

There is a variable called “lex_state”. “lex”, obviously, stands for “lexer”. Thus, it is a variable which shows the scanner's state.

What states are there? Let's look at the definitions.

▼ enum lex_state

    static enum lex_state {
        EXPR_BEG,      /* ignore newline, +/- is a sign. */
        EXPR_END,      /* newline significant, +/- is a operator. */
        EXPR_ARG,      /* newline significant, +/- is a operator. */
        EXPR_CMDARG,   /* newline significant, +/- is a operator. */
        EXPR_ENDARG,   /* newline significant, +/- is a operator. */
        EXPR_MID,      /* newline significant, +/- is a operator. */
        EXPR_FNAME,    /* ignore newline, no reserved words. */
        EXPR_DOT,      /* right after `.' or `::', no reserved words. */
        EXPR_CLASS,    /* immediate after `class', no here document. */
    } lex_state;

(parse.y)

The EXPR prefix stands for “expression”. EXPR_BEG is “Beginning of expression” and EXPR_DOT is “inside the expression, after the dot”.

To elaborate, EXPR_BEG denotes “Located at the head of the expression”. EXPR_END denotes “Located at the end of the expression”. EXPR_ARG denotes “Before the method parameter”. EXPR_FNAME denotes “Before the method name (such as def)”. The ones not covered here will be analyzed in detail below.

Incidentally, I am led to believe that lex_state actually denotes “after parentheses”, “head of statement”, so it shows the state of the parser rather than the scanner. However, it's still conventionally referred to as the scanner's state and here's why.

The meaning of “state” here is actually subtly different from how it's usually understood. The “state” of lex_state is “a state under which the scanner does x”. For example, an accurate description of EXPR_BEG would be “A state under which the scanner, if run, will react as if this is at the head of the expression”.

Technically, this “state” can be described as the state of the scanner if we look at the scanner as a state machine. However, delving there would be veering off topic and too tedious. I would refer any interested readers to any textbook on data structures.

Understanding the finite-state scanner

The trick to reading a finite-state scanner is to not try to grasp everything at once. Someone writing a parser would prefer not to use a finite-state scanner. That is to say, they would prefer not to make it the main part of the process. Scanner state management often ends up being an extra part attached to the main part. In other words, there is no such thing as a clean and concise diagram for state transitions.

What one should do is think toward specific goals: “This part is needed to solve this task”, “This code is for overcoming this problem”. Basically, read the code in accordance with the task at hand. If you start thinking about the mutual relationship between tasks, you'll invariably end up stuck. Like I said, there is simply no such thing.

However, there still needs to be an overarching objective. When reading a finite-state scanner, that objective would undoubtedly be to understand every state. For example, what kind of state is EXPR_BEG? It is a state where the parser is at the head of the expression.

The static approach

So, how can we understand what a state does? There are three basic approaches:

Look at the name of the state

The simplest and most obvious approach. For example, the name EXPR_BEG obviously refers to the head (beginning) of something.

Observe what changes under this state

Look at the way token recognition changes under the state, then test it in comparison to previous examples.

Look at the state from which it transitions

Look at which state it transitions from and which token causes it. For example, if '\n' is always followed by a transition to a HEAD state, it must denote the head of the line.

Let us take EXPR_BEG as an example. In ruby, all state transitions are expressed as assignments to lex_state, so first we need to grep for EXPR_BEG assignments to find them. Then we need to write down where they are located, for example at '#', '*' and '!' in yylex(). Then we need to recall the state prior to the transition and consider which case suits best (see Figure 1).

Figure 1: Transition to EXPR_BEG

((errata: 1. Actually when the state is EXPR_DOT, the state after reading a tIDENTIFIER would be either ARG or CMDARG. However, because the author wanted to roughly group them as FNAME/DOT and the others here, these two are shown together. Therefore, to be precise, EXPR_FNAME and EXPR_DOT should have also been separated. 2. ')' does not cause the transition from “everything else” to EXPR_BEG.))

This does indeed look like the head of a statement, especially the '\n' and the ';'. The open parentheses and the comma also suggest that it's the head not just of the statement, but of the expression as well.

The dynamic approach

There are other easy methods to observe the functioning. For example, you can use a debugger to “hook” yylex() and look at lex_state.

Another way is to rewrite the source code to output state transitions. In the case of lex_state we only have a few patterns for assignment and comparison, so the solution would be to grasp them as text patterns and rewrite the code to output state transitions. The CD that comes with this book contains the rubylex-analyser tool. When necessary, I will refer to it in this text.

The overall process looks like this: use a debugger or the aforementioned tool to observe the functioning of the program. Then look at the source code to confirm the acquired data and use it.
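For readers without the CD, a rough substitute is the Ripper library bundled with current CRuby. This is not the rubylex-analyser tool, and the token and state names it reports belong to a much newer parse.y than the one discussed here, so take it only as a sketch of the dynamic approach. I assume a Ruby recent enough that Ripper.lex returns a fourth "state" element; on older versions that column simply stays empty.

```ruby
require 'ripper'

# The program to observe; the constant name SRC is just for this sketch.
SRC = "m(a,\n  b, c) unless i\n"

# Each entry is [[line, column], event, text, state].
Ripper.lex(SRC).each do |(line, col), event, text, state|
  printf("%d:%-2d %-14s %-10s %s\n", line, col, event, text.inspect, state)
end
```

The printed state is the scanner state after the token was read, so the listing plays a similar role to the right-hand column of the rubylex-analyser output.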

Description of states

Here I will give simple descriptions of the lex_state states.

EXPR_BEG

Head of expression. Comes immediately after \n ( { [ ! ? : , or the operator op=. The most general state.

EXPR_MID

Comes immediately after the reserved words return, break, next and rescue. Invalidates binary operators such as * or &. Generally similar in function to EXPR_BEG.

EXPR_ARG

Comes immediately after elements which are likely to be the method name in a method call. Also comes immediately after '['. Except for cases where EXPR_CMDARG is used.

EXPR_CMDARG

Comes before the first parameter of a normal method call. For more information, see the section “The do conflict”.

EXPR_END

Used when there is a possibility that the statement is terminal. For example, after a literal or a closing parenthesis. Except for cases when EXPR_ENDARG is used.

EXPR_ENDARG

A special variant of EXPR_END. Comes immediately after the closing parenthesis corresponding to tLPAREN_ARG. Refer to the section “First parameter enclosed in parentheses”.

EXPR_FNAME

Comes before the method name, usually after def, alias, undef or the symbol ':'. A single “`” can be a name.

EXPR_DOT

Comes after the dot in a method call. Handled similarly to EXPR_FNAME. Various reserved words are treated as simple identifiers. A single '`' can be a name.

EXPR_CLASS

Comes after the reserved word class. This is a very limited state.

The following states can be grouped together:

BEG MID
END ENDARG
ARG CMDARG
FNAME DOT

They all express similar conditions. EXPR_CLASS is a little different, but only appears in a limited number of places, not warranting any special attention.

Line-break handling

The problem

In Ruby, a statement does not necessarily require a terminator. In C or Java a statement must always end with a semicolon, but Ruby has no such requirement. Statements usually take up only one line, and thus end at the end of the line.

On the other hand, when a statement is clearly continued, this happens automatically. Some conditions for “This statement is clearly continued” are as follows:

After a comma
After an infix operator
Parentheses or brackets are not balanced
Immediately after the reserved word if

Etc.

Implementation

So, what do we need to implement this grammar? Simply having the scanner ignore line-breaks is not sufficient. In a grammar like Ruby's, where statements are delimited by reserved words on both ends, conflicts don't happen as frequently as in C-like languages, but when I tried a simple experiment, I couldn't get it to work until I got rid of return, next and break and restored the method call parentheses wherever they had been omitted. To retain those features we need some kind of terminal symbol for the ends of statements. It doesn't matter whether it's \n or ';' but it is necessary.

Two solutions exist: parser-based and scanner-based. For the former, you can just optionally put \n in every place that allows it. For the latter, have the \n passed to the parser only when it has some meaning (ignoring it otherwise).

Which solution to use is up to your preferences, but usually the scanner-based one is used. That way produces more compact code. Moreover, if the rules are overloaded with meaningless symbols, it defeats the purpose of the parser-generator.

To sum up, in Ruby, line-breaks are best handled using the scanner. When a line needs to be continued, the \n will be ignored, and when it needs to be terminated, the \n is passed as a token. In yylex() this is found here:

▼ yylex() - '\n'

    case '\n':
      switch (lex_state) {
        case EXPR_BEG:
        case EXPR_FNAME:
        case EXPR_DOT:
        case EXPR_CLASS:
          goto retry;
        default:
          break;
      }
      command_start = Qtrue;
      lex_state = EXPR_BEG;
      return '\n';

(parse.y)

With EXPR_BEG, EXPR_FNAME, EXPR_DOT and EXPR_CLASS it will be goto retry. That is to say, the line-break is meaningless and shall be ignored. The label retry is found in front of the large switch in yylex().

In all other instances, line-breaks are meaningful and shall be passed to the parser, after which lex_state is restored to EXPR_BEG. Basically, whenever a line-break is meaningful, it will be the end of expr.

I recommend leaving command_start alone for the time being. To reiterate, trying to grasp too many things at once will only end in needless confusion.
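The decision made in this case can be written down as a tiny table lookup. The following Ruby sketch only restates the switch above; scan_newline and NEWLINE_IGNORED_IN are names I made up, states are plain symbols, and command_start is left out as recommended.

```ruby
# States in which the C code does `goto retry`, i.e. drops the line-break.
NEWLINE_IGNORED_IN = [:EXPR_BEG, :EXPR_FNAME, :EXPR_DOT, :EXPR_CLASS].freeze

# Returns [token, next_state]. A nil token means that nothing is passed
# to the parser and scanning simply continues in the same state.
def scan_newline(lex_state)
  return [nil, lex_state] if NEWLINE_IGNORED_IN.include?(lex_state)
  ["\n", :EXPR_BEG]
end

scan_newline(:EXPR_DOT)   # the line-break after a dot is ignored
scan_newline(:EXPR_ARG)   # the line-break ends the expression
```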

Let us now take a look at some examples using the rubylex-analyser tool.

    % rubylex-analyser -e '
    m(a,
      b, c) unless i
    '
    +EXPR_BEG
    EXPR_BEG     C      "\nm"  tIDENTIFIER          EXPR_CMDARG
    EXPR_CMDARG           "("  '('                  EXPR_BEG
                                                  0:cond push
                                                  0:cmd push
    EXPR_BEG     C        "a"  tIDENTIFIER          EXPR_CMDARG
    EXPR_CMDARG           ","  ','                  EXPR_BEG
    EXPR_BEG    S      "\n b"  tIDENTIFIER          EXPR_ARG
    EXPR_ARG              ","  ','                  EXPR_BEG
    EXPR_BEG    S         "c"  tIDENTIFIER          EXPR_ARG
    EXPR_ARG              ")"  ')'                  EXPR_END
                                                  0:cond lexpop
                                                  0:cmd lexpop
    EXPR_END    S    "unless"  kUNLESS_MOD          EXPR_BEG
    EXPR_BEG    S         "i"  tIDENTIFIER          EXPR_ARG
    EXPR_ARG             "\n"  \n                   EXPR_BEG
    EXPR_BEG     C       "\n"  '                    EXPR_BEG

As you can see, there is a lot of output here, but we only need the left and middle columns. The left column displays the lex_state before it enters yylex() while the middle column displays the tokens and their symbols.

The first token m and the second parameter b are preceded by a line-break, but the \n is simply attached to the front of the token text and is not treated as a terminal symbol. That is because the lex_state is EXPR_BEG.

However, in the second to last line \n is used as a terminal symbol. That is because the state is EXPR_ARG.

And that is how it should be used. Let us have another example.

    % rubylex-analyser -e 'class
    C < Object
    end'
    +EXPR_BEG
    EXPR_BEG     C    "class"  kCLASS               EXPR_CLASS
    EXPR_CLASS          "\nC"  tCONSTANT            EXPR_END
    EXPR_END    S         "<"  '<'                  EXPR_BEG
    +EXPR_BEG
    EXPR_BEG    S    "Object"  tCONSTANT            EXPR_ARG
    EXPR_ARG             "\n"  \n                   EXPR_BEG
    EXPR_BEG     C      "end"  kEND                 EXPR_END
    EXPR_END             "\n"  \n                   EXPR_BEG

The reserved word class is followed by EXPR_CLASS so the line-break is ignored. However, the superclass Object is followed by EXPR_ARG, so the \n appears.

    % rubylex-analyser -e 'obj.
    class'
    +EXPR_BEG
    EXPR_BEG     C      "obj"  tIDENTIFIER          EXPR_CMDARG
    EXPR_CMDARG           "."  '.'                  EXPR_DOT
    EXPR_DOT        "\nclass"  tIDENTIFIER          EXPR_ARG
    EXPR_ARG             "\n"  \n                   EXPR_BEG

'.' is followed by EXPR_DOT so the \n is ignored.

Note that class becomes tIDENTIFIER despite being a reserved word. This is discussed in the next section.

Reserved words and identical method names

The problem

In Ruby, reserved words can be used as method names. However, in actuality it's not as simple as “it can be used”: there exist three possible contexts:

Method definition (def xxxx)
Call (obj.xxxx)
Symbol literal (:xxxx)

All three are possible in Ruby. Below we will take a closer look at each.

First, the method definition. It is preceded by the reserved word def so it should work.

In case of the method call, omitting the receiver can be a source of difficulty. However, the scope of use here is even more limited, and omitting the receiver is actually forbidden. That is, when the method name is a reserved word, the receiver absolutely cannot be omitted. Perhaps it would be more accurate to say that it is forbidden in order to guarantee that parsing is always possible.

Finally, in case of the symbol, it is preceded by the terminal symbol ':' so it also should work. However, regardless of reserved words, the ':' here conflicts with the colon in a ? b : c. If this is avoided, there should be no further trouble.

For each of these cases, similarly to before, a scanner-based solution and a parser-based solution exist. For the former, have the scanner return tIDENTIFIER (for example) for a reserved word that comes after def or . or :. For the latter, make that into a rule. ruby uses both solutions, choosing between them for each of the three cases.
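All three contexts can be tried from Ruby itself. The class Keyworded and the constant names below are hypothetical and exist only for this sketch.

```ruby
class Keyworded
  def if              # method definition: the name follows `def`
    :called_if
  end

  def class_name
    self.class.name   # call: `class` follows a dot, so it is a method name
  end
end

obj = Keyworded.new
VIA_DOT    = obj.if                 # call with an explicit receiver
VIA_SYMBOL = obj.public_send(:if)   # symbol literal :if
```

Note that every call goes through an explicit receiver; writing a bare `if` in the hope of calling the method would of course be read as the reserved word.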

Method definition

The name part of the method definition. This is handled by the parser.

▼ Method definition rule

                    | kDEF fname
                      f_arglist
                      bodystmt
                      kEND
                    | kDEF singleton dot_or_colon fname
                      f_arglist
                      bodystmt
                      kEND

There exist only two rules for method definition: one for normal methods and one for singleton methods. For both, the name part is fname and it is defined as follows.

▼ fname

    fname           : tIDENTIFIER
                    | tCONSTANT
                    | tFID
                    | op
                    | reswords

reswords is a reserved word and op is a binary operator. Both rules consist of simply all terminal symbols lined up, so I won't go into detail here. Finally, tFID is a name that ends with a symbol character, such as gsub! and include?

Method call

Method calls with names identical to reserved words are handled by the scanner. The scan code for reserved words is shown below.

    Scanning the identifier
    result = (tIDENTIFIER or tCONSTANT)

    if (lex_state != EXPR_DOT) {
        struct kwtable *kw;

        /* See if it is a reserved word.  */
        kw = rb_reserved_word(tok(), toklen());
        Reserved word is processed
    }

EXPR_DOT expresses what comes after the method call dot. Under EXPR_DOT reserved words are universally not processed. The symbol for reserved words after the dot becomes either tIDENTIFIER or tCONSTANT.

Symbols

Reserved word symbols are handled by both the scanner and the parser. First, the rule.

▼ symbol

    symbol          : tSYMBEG sym

    sym             : fname
                    | tIVAR
                    | tGVAR
                    | tCVAR

    fname           : tIDENTIFIER
                    | tCONSTANT
                    | tFID
                    | op
                    | reswords

Reserved words (reswords) are explicitly passed through the parser. This is only possible because the special terminal symbol tSYMBEG is present at the start. If the symbol were, for example, ':' it would conflict with the conditional operator (a ? b : c) and stall. Thus, the trick is to recognize tSYMBEG on the scanner level.

But how to cause that recognition? Let's look at the implementation of the scanner.

▼ yylex - ':'

    case ':':
      c = nextc();
      if (c == ':') {
          if (lex_state == EXPR_BEG ||  lex_state == EXPR_MID ||
              (IS_ARG() && space_seen)) {
              lex_state = EXPR_BEG;
              return tCOLON3;
          }
          lex_state = EXPR_DOT;
          return tCOLON2;
      }
      pushback(c);
      if (lex_state == EXPR_END ||
          lex_state == EXPR_ENDARG ||
          ISSPACE(c)) {
          lex_state = EXPR_BEG;
          return ':';
      }
      lex_state = EXPR_FNAME;
      return tSYMBEG;

(parse.y)

The if in the first half handles the situation of two consecutive ':'. In this situation, the '::' is scanned in accordance with the basic rule of the leftmost longest match.

For the next if, the ':' is the aforementioned conditional operator. Both EXPR_END and EXPR_ENDARG come at the end of the expression, so a parameter does not appear. That is to say, since there can't be a symbol, the ':' is a conditional operator. Similarly, if the next letter is a space (ISSPACE(c)), a symbol is unlikely so it is again a conditional operator.

When none of the above applies, it's all symbols. In that case, a transition to EXPR_FNAME occurs to prepare for all method names. There is no particular danger to parsing here, but if this is forgotten, the scanner will not pass values for reserved words and the calculation of the value will be disrupted.

Modifiers

The problem

For example, for if there exists a normal notation and one for postfix modification.

    # Normal notation
    if cond then
      expr
    end

    # Postfix
    expr if cond

This could cause a conflict. The reason can be guessed: again, it's because method parentheses have been omitted previously. Observe this example:

    call if cond then a else b end

Reading this expression up to the if gives us two possible interpretations.

    call((if ....))
    call() if ....

When unsure, I recommend simply using trial and error and seeing if a conflict occurs. Let us try to handle it with yacc after changing kIF_MOD to kIF in the grammar.

    % yacc parse.y
    parse.y contains 4 shift/reduce conflicts and 13 reduce/reduce conflicts.

As expected, conflicts are aplenty. If you are interested, you can add the option -v to yacc and build a log. The nature of the conflicts should be shown there in great detail.

Implementation

So, what is there to do? In Ruby, on the symbol level (that is, on the scanner level) the normal if is distinguished from the postfix if by them being kIF and kIF_MOD respectively. This also applies to all other postfix operators. In all, there are five: kUNLESS_MOD, kUNTIL_MOD, kWHILE_MOD, kRESCUE_MOD and kIF_MOD. The distinction is made here:

▼ yylex - Reserved word

    struct kwtable *kw;

    /* See if it is a reserved word.  */
    kw = rb_reserved_word(tok(), toklen());
    if (kw) {
        enum lex_state state = lex_state;
        lex_state = kw->state;
        if (state == EXPR_FNAME) {
            yylval.id = rb_intern(kw->name);
        }
        if (kw->id[0] == kDO) {
            if (COND_P()) return kDO_COND;
            if (CMDARG_P() && state != EXPR_CMDARG)
                return kDO_BLOCK;
            if (state == EXPR_ENDARG)
                return kDO_BLOCK;
            return kDO;
        }
        if (state == EXPR_BEG)  /*** Here ***/
            return kw->id[0];
        else {
            if (kw->id[0] != kw->id[1])
                lex_state = EXPR_BEG;
            return kw->id[1];
        }
    }

(parse.y)

This is located at the end of yylex after the identifiers are scanned. The part that handles modifiers is the last (innermost) if ~ else.

Whether the return value is altered can be determined by whether or not the state is EXPR_BEG. This is where a modifier is identified. Basically, the variable kw is the key and if you look far above you will find that it is struct kwtable.

I've already described in the previous chapter how struct kwtable is a structure defined in keywords and the hash function rb_reserved_word() is created by gperf. I'll show the structure here again.

▼ keywords - struct kwtable

    struct kwtable {char *name; int id[2]; enum lex_state state;};

(keywords)

I've already explained about name and id[0]: they are the reserved word name and its symbol. Here I will speak about the remaining members.

First, id[1] is a symbol to deal with modifiers. For example, in case of if that would be kIF_MOD. When a reserved word does not have a modifier equivalent, id[0] and id[1] contain the same things.

Because state is enum lex_state it is the state to which a transition should occur after the reserved word is read. Below is a list created with the kwstat.rb tool which I made. The tool can be found on the CD.

    % kwstat.rb ruby/keywords
    ---- EXPR_ARG
    defined?  super     yield

    ---- EXPR_BEG
    and     case    else    ensure  if      module  or      unless  when
    begin   do      elsif   for     in      not     then    until   while

    ---- EXPR_CLASS
    class

    ---- EXPR_END
    BEGIN     __FILE__  end       nil       retry     true
    END       __LINE__  false     redo      self

    ---- EXPR_FNAME
    alias  def    undef

    ---- EXPR_MID
    break   next    rescue  return

    ---- modifiers
    if      rescue  unless  until   while
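To see how id[0], id[1] and state work together, here is a small Ruby sketch of the innermost if ~ else. It is not the real table: KWTABLE is a hand-copied subset using the states from the list above, keyword_token is a hypothetical name, and both the kDO special case and the EXPR_FNAME branch are left out.

```ruby
# name => [id[0], id[1], state]; only a few entries, copied by hand.
KWTABLE = {
  "if"     => [:kIF,     :kIF_MOD,     :EXPR_BEG],
  "while"  => [:kWHILE,  :kWHILE_MOD,  :EXPR_BEG],
  "rescue" => [:kRESCUE, :kRESCUE_MOD, :EXPR_MID],
  "end"    => [:kEND,    :kEND,        :EXPR_END],
}.freeze

# Returns [symbol, next_lex_state] for a reserved word read in `state`.
def keyword_token(name, state)
  normal, modifier, next_state = KWTABLE.fetch(name)
  return [normal, next_state] if state == :EXPR_BEG   # head of expression
  next_state = :EXPR_BEG if normal != modifier        # a real modifier
  [modifier, next_state]
end

keyword_token("if", :EXPR_BEG)   # normal if
keyword_token("if", :EXPR_END)   # postfix if, as in `expr if cond`
```

For a word without a modifier form, such as end, both branches return the same symbol and only the next state comes from the table.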

The do conflict

The problem

There are two iterator forms: do ~ end and { ~ }. Their difference is in priority: { ~ } has a much higher priority. A higher priority means that as part of the grammar a unit is “small”, which means it can be put into a smaller rule. For example, it can be put not into stmt but expr or primary. In the past { ~ } iterators were in primary while do ~ end iterators were in stmt.

By the way, there has been a request for an expression like this:

    m do .... end + m do .... end

To allow for this, put the do ~ end iterator in arg or primary. Incidentally, the condition for while is expr, meaning it contains arg and primary, so the do will cause a conflict here. Basically, it looks like this:

    while m do
      ....
    end

At first glance, the do looks like the do of while. However, a closer look reveals that it could be a m do ~ end bundling. Something that's not obvious even to a person will definitely cause yacc to conflict. Let's try it in practice.

    /* do conflict experiment */
    %token kWHILE kDO tIDENTIFIER kEND
    %%
    expr: kWHILE expr kDO expr kEND
        | tIDENTIFIER
        | tIDENTIFIER kDO expr kEND

I simplified the example to only include while, variable referencing and iterators. This rule causes a shift/reduce conflict if the head of the conditional contains tIDENTIFIER. If tIDENTIFIER is used for variable referencing and do is appended to while, then it's a reduction. If it's made an iterator do, then it's a shift.

Unfortunately, in a shift/reduce conflict the shift is prioritized, so if left unchecked, do will become an iterator do. That said, even if a reduction is forced through operator priorities or some other method, do won't shift at all, becoming unusable. Thus, to solve the problem without any contradictions, we need to either deal with it on the scanner level or write a rule that allows the use of operators without putting the do ~ end iterator into expr.

However, not putting do ~ end into expr is not a realistic goal. That would require all rules for expr (as well as for arg and primary) to be repeated. This leaves us only the scanner solution.

Rule-level solution

Below is a simplified example of a relevant rule.

▼ do symbol

    primary         : kWHILE expr_value do compstmt kEND

    do              : term
                    | kDO_COND

    primary         : operation brace_block
                    | method_call brace_block

    brace_block     : '{' opt_block_var compstmt '}'
                    | kDO opt_block_var compstmt kEND

As you can see, the terminal symbols for the do of while and for the iterator do are different. For the former it's kDO_COND while for the latter it's kDO. Then it's simply a matter of pointing that distinction out to the scanner.

Symbol-level solution

Below is a partial view of the yylex section that processes reserved words. It's the only part tasked with processing do, so looking at this code should be enough to understand the criteria for making the distinction.

▼ yylex - Identifier - Reserved word

    if (kw->id[0] == kDO) {
        if (COND_P()) return kDO_COND;
        if (CMDARG_P() && state != EXPR_CMDARG)
            return kDO_BLOCK;
        if (state == EXPR_ENDARG)
            return kDO_BLOCK;
        return kDO;
    }

(parse.y)

It's a little messy, but you only need the part associated with kDO_COND. That is because only two comparisons are meaningful. The first is the comparison between kDO_COND and kDO/kDO_BLOCK. The second is the comparison between kDO and kDO_BLOCK. The rest are meaningless. Right now we only need to distinguish the conditional do, so leave all the other conditions alone.

Basically, COND_P() is the key.

COND_P()

cond_stack

COND_P() is defined close to the head of parse.y

▼ cond_stack

    #ifdef HAVE_LONG_LONG
    typedef unsigned LONG_LONG stack_type;
    #else
    typedef unsigned long stack_type;
    #endif

    static stack_type cond_stack = 0;
    #define COND_PUSH(n) (cond_stack = (cond_stack<<1)|((n)&1))
    #define COND_POP() (cond_stack >>= 1)
    #define COND_LEXPOP() do {\
        int last = COND_P();\
        cond_stack >>= 1;\
        if (last) cond_stack |= 1;\
    } while (0)
    #define COND_P() (cond_stack&1)

(parse.y)

The type stack_type is either long (at least 32 bits) or long long (at least 64 bits). cond_stack is initialized by yycompile() at the start of parsing and after that is handled only through macros. All you need, then, is to understand those macros.

If you look at COND_PUSH/POP you will see that these macros use integers as stacks consisting of bits.

    MSB←   →LSB
    ...0000000000         Initial value 0
    ...0000000001         COND_PUSH(1)
    ...0000000010         COND_PUSH(0)
    ...0000000101         COND_PUSH(1)
    ...0000000010         COND_POP()
    ...0000000100         COND_PUSH(0)
    ...0000000010         COND_POP()

As for COND_P(), since it determines whether or not the least significant bit (LSB) is a 1, it effectively determines whether the head of the stack is a 1.

The remaining COND_LEXPOP() is a little weird. It leaves COND_P() at the head of the stack and executes a right shift. Basically, it “crushes” the second bit from the bottom with the lowermost bit.

    MSB←   →LSB
    ...0000000000         Initial value 0
    ...0000000001         COND_PUSH(1)
    ...0000000010         COND_PUSH(0)
    ...0000000101         COND_PUSH(1)
    ...0000000011         COND_LEXPOP()
    ...0000000110         COND_PUSH(0)
    ...0000000011         COND_LEXPOP()

((errata: It leaves COND_P() only when it is 1. When COND_P() is 0 and the second bottom bit is 1, it would become 1 after doing LEXPOP, thus COND_P() is not left in this case.))
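If you want to check such bit patterns by machine rather than by hand, the macros are easy to imitate. The class below is a sketch with hypothetical names (BitStack, top), mirroring the C macros one to one. One difference to keep in mind: Ruby integers grow as needed, so this model has no fixed depth, unlike stack_type.

```ruby
# A stack of bits kept in a single Integer, mirroring the COND_* macros.
class BitStack
  attr_reader :bits

  def initialize
    @bits = 0
  end

  def push(n)                 # COND_PUSH(n)
    @bits = (@bits << 1) | (n & 1)
  end

  def pop                     # COND_POP()
    @bits >>= 1
  end

  def lexpop                  # COND_LEXPOP()
    last = top
    @bits >>= 1
    @bits |= 1 if last == 1
  end

  def top                     # COND_P(), returned as 0 or 1
    @bits & 1
  end
end

s = BitStack.new
[1, 0, 1].each { |n| s.push(n) }   # 0b101
s.lexpop                           # 0b011: the top 1 overwrote the 0 below it
s.push(0)                          # 0b110
s.lexpop                           # 0b011: the top was 0, so a plain shift
```

The last step is the case the errata describes: the top was 0, the bit below it was 1, and after LEXPOP the top reads 1.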

Now I will explain what that means.

Investigating the function

Let us investigate the function of this stack. To do that I will list all the parts where COND_PUSH() and COND_POP() are used.

            | kWHILE {COND_PUSH(1);} expr_value do {COND_POP();}
    --
            | kUNTIL {COND_PUSH(1);} expr_value do {COND_POP();}
    --
            | kFOR block_var kIN {COND_PUSH(1);} expr_value do {COND_POP();}
    --
          case '(':
                    :
                    :
            COND_PUSH(0);
            CMDARG_PUSH(0);
    --
          case '[':
                    :
                    :
            COND_PUSH(0);
            CMDARG_PUSH(0);
    --
          case '{':
                    :
                    :
            COND_PUSH(0);
            CMDARG_PUSH(0);
    --
          case ']':
          case '}':
          case ')':
            COND_LEXPOP();
            CMDARG_LEXPOP();

From this we can derive the following general rules:

At the start of a conditional expression PUSH(1)
At opening parenthesis PUSH(0)
At the end of a conditional expression POP()
At closing parenthesis LEXPOP()

With this, you should see how it is used. If you think about it for a minute, the name cond_stack makes sense too, and COND_P() is clearly a macro that determines whether or not we are on the same level as the conditional expression (see Figure 2).

Figure 2: Changes of COND_P()

Using this trick should also make situations like the one shown below easy to deal with.

    while (m do .... end)   # do is an iterator do (kDO)
      ....
    end
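The example can be traced step by step. In the Ruby sketch below the stack is a plain integer, do_token is a hypothetical helper, and only the COND_P() branch of the scanner code is modelled; the CMDARG_P() checks that choose between kDO and kDO_BLOCK are deliberately ignored.

```ruby
# Which `do` the scanner would report, looking at COND_P() only.
def do_token(cond_stack)
  (cond_stack & 1) == 1 ? :kDO_COND : :kDO
end

stack = 0
stack = (stack << 1) | 1          # kWHILE : COND_PUSH(1)
PLAIN = do_token(stack)           # `while m do`  : the do of while

stack = (stack << 1) | 0          # '('    : COND_PUSH(0)
INSIDE_PARENS = do_token(stack)   # `while (m do` : an iterator do

last = stack & 1                  # ')'    : COND_LEXPOP()
stack >>= 1
stack |= 1 if last == 1
AFTER_PARENS = do_token(stack)    # back on the level of the condition
```

Inside the parentheses the top of the stack is 0, so a do found there cannot be mistaken for the do of while; once the parenthesis is closed, the 1 pushed by while is visible again.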

Because the stack is kept in a single integer, on a 32-bit machine in the absence of long long, if conditional expressions or parentheses are nested 32 levels deep, things could get strange. Of course, in reality you won't need to nest so deep so there's no actual risk.

Finally, the definition of COND_LEXPOP() looks a bit strange: that seems to be a way of dealing with lookahead. However, the rules now do not allow for lookahead to occur, so there's no purpose in making the distinction between POP and LEXPOP. Basically, at this time it would be correct to say that COND_LEXPOP() has no meaning.

tLPAREN_ARG(1)

The problem

This one is very complicated. It only became workable in Ruby 1.7 and only fairly recently. The core of the issue is interpreting this:

    call (expr) + 1

as one of the following:

    (call(expr)) + 1
    call((expr) + 1)

In the past, it was always interpreted as the former. That is, the parentheses were always treated as “Method parameter parentheses”. But since Ruby 1.7 it became possible to interpret it as the latter: basically, if a space is added, the parentheses become “Parentheses of expr”.

I will also provide an example to explain why the interpretation changed. First, I wrote a statement as follows:

    p m() + 1

So far so good. But let's assume the value returned by m is a fraction and there are too many digits. Then we will have it displayed as an integer.

    p m() + 1 .to_i   # ??

Uh-oh, we need parentheses.

    p (m() + 1).to_i

How to interpret this? Up to 1.6 it will be this:

    (p(m() + 1)).to_i

The much-needed to_i is rendered meaningless, which is unacceptable. To counter that, adding a space between it and the parentheses will cause the parentheses to be treated specially as expr parentheses.
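On a current Ruby the effect of the space is easy to observe. In the sketch below, show is a hypothetical stand-in for p that records its parameter instead of printing it, and m simply returns a fraction; neither name comes from the ruby sources.

```ruby
SEEN = []

def show(x)    # stand-in for `p`: remembers what it was given
  SEEN << x
  x
end

def m
  2.5
end

show (m() + 1).to_i   # with a space: show((m() + 1).to_i)
show(m() + 1).to_i    # no space:     (show(m() + 1)).to_i
```

After these two lines SEEN holds the integer from the first call and the untruncated fraction from the second, which is exactly the difference between the two interpretations. Older versions may print a warning about the parenthesized argument on the first line.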

For those eager to test this, this feature was implemented in parse.y revision 1.100 (2001-05-31). Thus, it should be relatively prominent when looking at the differences between it and 1.99. This is the command to find the difference.

    ~/src/ruby % cvs diff -r1.99 -r1.100 parse.y

Investigation

First let us look at how the set-up works in reality. Using the ruby-lexer tool (ruby-lexer: located in tools/ruby-lexer.tar.gz on the CD) we can look at the list of symbols corresponding to the program.

    % ruby-lexer -e 'm(a)'
    tIDENTIFIER '(' tIDENTIFIER ')' '\n'

Similarly to Ruby, -e is the option to pass the program directly from the command line. With this we can try all kinds of things. Let's start with the problem at hand: the case where the first parameter is enclosed in parentheses.

    % ruby-lexer -e 'm (a)'
    tIDENTIFIER tLPAREN_ARG tIDENTIFIER ')' '\n'

After adding a space, the symbol of the opening parenthesis became tLPAREN_ARG. Now let's look at normal expression parentheses.

    % ruby-lexer -e '(a)'
    tLPAREN tIDENTIFIER ')' '\n'

For normal expression parentheses it seems to be tLPAREN. To sum up:

    Input     Symbol of opening parenthesis
    m(a)      '('
    m (a)     tLPAREN_ARG
    (a)       tLPAREN

Thus the focus is distinguishing between the three. For now tLPAREN_ARG is the most important.

The case of one parameter

We'll start by looking at the yylex() section for '('

▼ yylex - '('

    case '(':
      command_start = Qtrue;
      if (lex_state == EXPR_BEG || lex_state == EXPR_MID) {
          c = tLPAREN;
      }
      else if (space_seen) {
          if (lex_state == EXPR_CMDARG) {
              c = tLPAREN_ARG;
          }
          else if (lex_state == EXPR_ARG) {
              c = tLPAREN_ARG;
              yylval.id = last_id;
          }
      }
      COND_PUSH(0);
      CMDARG_PUSH(0);
      lex_state = EXPR_BEG;
      return c;

(parse.y)

Since the first if is tLPAREN we're looking at a normal expression parenthesis. The distinguishing feature is that lex_state is either BEG or MID, that is, it's clearly at the beginning of the expression.

The following space_seen shows whether the parenthesis is preceded by a space. If there is a space and lex_state is either ARG or CMDARG, basically if it's before the first parameter, the symbol is not '(' but tLPAREN_ARG. This way, for example, the following situations are excluded:

    m(              # Parenthesis not preceded by a space. Method parenthesis ('(')
    m arg, (        # Unless first parameter, expression parenthesis (tLPAREN)

When it is neither tLPAREN nor tLPAREN_ARG, the input character c is used as is and becomes '('. This will definitely be a method call parenthesis.
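The three-way decision fits in a few lines. The Ruby sketch below restates only the branching of the case above: open_paren_token is a hypothetical name, states are plain symbols, and the side effects (yylval.id, COND_PUSH, CMDARG_PUSH, the transition to EXPR_BEG) are left out.

```ruby
# Returns the symbol chosen for an opening parenthesis.
def open_paren_token(lex_state, space_seen)
  if [:EXPR_BEG, :EXPR_MID].include?(lex_state)
    :tLPAREN                          # normal expression parenthesis
  elsif space_seen && [:EXPR_CMDARG, :EXPR_ARG].include?(lex_state)
    :tLPAREN_ARG                      # first parameter in parentheses
  else
    "("                               # method call parenthesis
  end
end

open_paren_token(:EXPR_CMDARG, false)   # m(a)
open_paren_token(:EXPR_CMDARG, true)    # m (a)
open_paren_token(:EXPR_BEG, false)      # (a)
```

The three calls correspond to the three inputs tried with ruby-lexer earlier, assuming that the state right after the identifier m is EXPR_CMDARG, as in the rubylex-analyser output shown for the line-break examples.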

If such a clear distinction is made on the symbol level, no conflict should occur even if rules are written as usual. Simplified, it becomes something like this:

    stmt         : command_call

    method_call  : tIDENTIFIER '(' args ')'    /* Normal method */

    command_call : tIDENTIFIER command_args    /* Method with parentheses omitted */

    command_args : args

    args         : arg
                 | args ',' arg

    arg          : primary

    primary      : tLPAREN compstmt ')'        /* Normal expression parenthesis */
                 | tLPAREN_ARG expr ')'        /* First parameter enclosed in parentheses */
                 | method_call

Now I need you to focus on method_call and command_call. If you leave the '(' without introducing tLPAREN_ARG, then command_args will produce args, args will produce arg, and arg will produce primary. Then '(' will appear in the place of tLPAREN_ARG and conflict with method_call (see Figure 3).

Figure 3: method_call and command_call

The case of two parameters and more

One might think that if the parenthesis becomes tLPAREN_ARG all will be well. That is not so. For example, consider the following:

    m (a, a, a)

Before now, expressions like this one were treated as method calls and did not produce errors. However, if tLPAREN_ARG is introduced, the opening parenthesis becomes an expr parenthesis, and if two or more parameters are present, that will cause a parse error. This needs to be resolved for the sake of compatibility.

Unfortunately, rushing ahead and just adding a rule like

    command_args : tLPAREN_ARG args ')'

will just cause a conflict. Let's look at the bigger picture and think carefully.

    stmt         : command_call
                 | expr

    expr         : arg

    command_call : tIDENTIFIER command_args

    command_args : args
                 | tLPAREN_ARG args ')'

    args         : arg
                 | args ',' arg

    arg          : primary

    primary      : tLPAREN compstmt ')'
                 | tLPAREN_ARG expr ')'
                 | method_call

    method_call  : tIDENTIFIER '(' args ')'

Look at the first rule of command_args. Here, args produces arg. Then arg produces primary, and out of there comes the tLPAREN_ARG rule. And since expr contains arg, once it is expanded it becomes like this:

    command_args : tLPAREN_ARG arg ')'
                 | tLPAREN_ARG arg ')'

This is a reduce/reduce conflict, which is very bad.

So, how can we deal with only 2+ parameters without causing a conflict? We’ll have to write rules that accommodate that situation specifically. In practice, it’s solved like this:

▼ command_args

    command_args    : open_args

    open_args       : call_args
                    | tLPAREN_ARG ')'
                    | tLPAREN_ARG call_args2 ')'

    call_args       : command
                    | args opt_block_arg
                    | args ',' tSTAR arg_value opt_block_arg
                    | assocs opt_block_arg
                    | assocs ',' tSTAR arg_value opt_block_arg
                    | args ',' assocs opt_block_arg
                    | args ',' assocs ',' tSTAR arg opt_block_arg
                    | tSTAR arg_value opt_block_arg
                    | block_arg

    call_args2      : arg_value ',' args opt_block_arg
                    | arg_value ',' block_arg
                    | arg_value ',' tSTAR arg_value opt_block_arg
                    | arg_value ',' args ',' tSTAR arg_value opt_block_arg
                    | assocs opt_block_arg
                    | assocs ',' tSTAR arg_value opt_block_arg
                    | arg_value ',' assocs opt_block_arg
                    | arg_value ',' args ',' assocs opt_block_arg
                    | arg_value ',' assocs ',' tSTAR arg_value opt_block_arg
                    | arg_value ',' args ',' assocs ','
                                      tSTAR arg_value opt_block_arg
                    | tSTAR arg_value opt_block_arg
                    | block_arg

    primary         : literal
                    | strings
                    | xstring
                           :
                    | tLPAREN_ARG expr ')'

Here command_args is followed by another level, open_args, which makes no difference as far as the rules are concerned. The key is the second and third rules of this open_args. This form is similar to the recent example, but is actually subtly different. The difference is that call_args2 has been introduced. The defining characteristic of this call_args2 is that the number of parameters is always two or more. This is evidenced by the fact that most rules contain ','. The only exception is assocs, but since assocs does not come out of expr it cannot conflict anyway.

That wasn’t a very good explanation. To put it simply, in a grammar where this:

    command_args : call_args

doesn’t work, and only in such a grammar, the next rule is used to make an addition. Thus, the best way to think here is “In what kind of grammar would this rule not work?” Furthermore, since a conflict only occurs when the primary of tLPAREN_ARG appears at the head of call_args, the scope can be limited further and the best way to think is “In what kind of grammar does this rule not work when a tIDENTIFIER tLPAREN_ARG sequence appears?” Below are a few examples.

    m (a, a)

This is a situation when the tLPAREN_ARG list contains two or more items.

    m ()

Conversely, this is a situation when the tLPAREN_ARG list is empty.

    m (*args)
    m (&block)
    m (k => v)

This is a situation when the tLPAREN_ARG list contains a special expression (one not present in expr).

This should be sufficient for most cases. Now let’s compare the above with a practical implementation.

▼ open_args (1)

    open_args       : call_args
                    | tLPAREN_ARG ')'

First, the rule deals with empty lists.

▼ open_args (2)

                    | tLPAREN_ARG call_args2 ')'

    call_args2      : arg_value ',' args opt_block_arg
                    | arg_value ',' block_arg
                    | arg_value ',' tSTAR arg_value opt_block_arg
                    | arg_value ',' args ',' tSTAR arg_value opt_block_arg
                    | assocs opt_block_arg
                    | assocs ',' tSTAR arg_value opt_block_arg
                    | arg_value ',' assocs opt_block_arg
                    | arg_value ',' args ',' assocs opt_block_arg
                    | arg_value ',' assocs ',' tSTAR arg_value opt_block_arg
                    | arg_value ',' args ',' assocs ','
                                      tSTAR arg_value opt_block_arg
                    | tSTAR arg_value opt_block_arg
                    | block_arg

And call_args2 deals with lists of two or more elements and with special elements such as assocs, array passing, or block passing. With this, the scope is now sufficiently broad.

tLPAREN_ARG (2)

The problem

In the previous section I said that the examples provided should be sufficient for “most” special method call expressions. I said “most” because iterators are still not covered. For example, the statements below will not work:

    m (a) {....}
    m (a) do .... end

In this section we will once again look at the previously introduced parts with solving this problem in mind.

Rule-level solution

Let us start with the rules. The first part here is all familiar rules, so focus on the do_block part.

▼ command_call

    command_call    : command
                    | block_command

    command         : operation command_args

    command_args    : open_args

    open_args       : call_args
                    | tLPAREN_ARG ')'
                    | tLPAREN_ARG call_args2 ')'

    block_command   : block_call

    block_call      : command do_block

    do_block        : kDO_BLOCK opt_block_var compstmt kEND
                    | tLBRACE_ARG opt_block_var compstmt '}'

Both do and { are completely new symbols, kDO_BLOCK and tLBRACE_ARG. Why isn’t it kDO or '{', you ask? In this kind of situation the best answer is an experiment, so we will try replacing kDO_BLOCK with kDO and tLBRACE_ARG with '{' and processing that with yacc.

    % yacc parse.y
    conflicts:  2 shift/reduce, 6 reduce/reduce

It conflicts badly. A further investigation reveals that this statement is the cause.

    m (a), b {....}

That is because this kind of statement is already supposed to work. b {....} becomes primary. And now a rule has been added that concatenates the block with m. That results in two possible interpretations:

    m((a), b) {....}
    m((a), (b {....}))

This is the cause of the 2 shift/reduce conflicts.

The other conflict has to do with do〜end.

    m((a)) do .... end     # Add do〜end using block_call
    m((a)) do .... end     # Add do〜end using primary

These two conflict. This accounts for the 6 reduce/reduce conflicts.

{〜} iterator

This is the important part. As shown previously, you can avoid a conflict by changing the do and '{' symbols.

▼ yylex - '{'

    case '{':
      if (IS_ARG() || lex_state == EXPR_END)
          c = '{';          /* block (primary) */
      else if (lex_state == EXPR_ENDARG)
          c = tLBRACE_ARG;  /* block (expr) */
      else
          c = tLBRACE;      /* hash */
      COND_PUSH(0);
      CMDARG_PUSH(0);
      lex_state = EXPR_BEG;
      return c;

(parse.y)

IS_ARG() is defined as

▼ IS_ARG

    #define IS_ARG() (lex_state == EXPR_ARG || lex_state == EXPR_CMDARG)

(parse.y)

Thus, when the state is EXPR_ENDARG, IS_ARG() will always be false. In other words, when lex_state is EXPR_ENDARG, '{' will always become tLBRACE_ARG, so the key to everything is the transition to EXPR_ENDARG.
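The three-way decision above can be restated as a small table-free function. The following is only a sketch in Ruby, not code from parse.y: the method names classify_lbrace and arg_state? are hypothetical, and the scanner states are modeled as plain symbols.

```ruby
# Hypothetical model of the '{' branch of yylex(); only the state names
# come from the listing above, everything else is illustrative.
ARG_STATES = [:EXPR_ARG, :EXPR_CMDARG].freeze

# Mirrors the IS_ARG() macro.
def arg_state?(lex_state)
  ARG_STATES.include?(lex_state)
end

# Returns the symbol the scanner would hand to the parser for '{'.
def classify_lbrace(lex_state)
  if arg_state?(lex_state) || lex_state == :EXPR_END
    :'{'             # block attached to a primary
  elsif lex_state == :EXPR_ENDARG
    :tLBRACE_ARG     # block attached to an expr
  else
    :tLBRACE         # hash literal
  end
end

p classify_lbrace(:EXPR_ENDARG)
```

Because EXPR_ENDARG is not one of the IS_ARG() states, the first branch can never capture it, which is the point made in the text.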

EXPR_ENDARG

Now we need to know how EXPR_ENDARG gets set. I used grep to find where it is assigned.

▼ Transition to EXPR_ENDARG

    open_args       : call_args
                    | tLPAREN_ARG  {lex_state = EXPR_ENDARG;} ')'
                    | tLPAREN_ARG call_args2 {lex_state = EXPR_ENDARG;} ')'

    primary         : tLPAREN_ARG expr {lex_state = EXPR_ENDARG;} ')'

That’s strange. One would expect the transition to EXPR_ENDARG to occur after the closing parenthesis corresponding to tLPAREN_ARG, but it’s actually assigned before ')'. I ran grep a few more times thinking there might be other parts setting EXPR_ENDARG but found nothing.

Maybe there’s some mistake. Maybe lex_state is being changed some other way. Let’s use rubylex-analyser to visualize the lex_state transition.

    % rubylex-analyser -e 'm (a) { nil }'
    +EXPR_BEG
    EXPR_BEG     C        "m"  tIDENTIFIER          EXPR_CMDARG
    EXPR_CMDARG S         "("  tLPAREN_ARG          EXPR_BEG
                                                  0:cond push
                                                  0:cmd push
                                                  1:cmd push-
    EXPR_BEG     C        "a"  tIDENTIFIER          EXPR_CMDARG
    EXPR_CMDARG           ")"  ')'                  EXPR_END
                                                  0:cond lexpop
                                                  1:cmd lexpop
    +EXPR_ENDARG
    EXPR_ENDARG S         "{"  tLBRACE_ARG          EXPR_BEG
                                                  0:cond push
                                                 10:cmd push
                                                  0:cmd resume
    EXPR_BEG    S       "nil"  kNIL                 EXPR_END
    EXPR_END    S         "}"  '}'                  EXPR_END
                                                  0:cond lexpop
                                                  0:cmd lexpop
    EXPR_END             "\n"  \n                   EXPR_BEG

The lines that are split into three big columns show the state transitions caused by yylex(). On the left is the state before yylex(). The middle two are the word text and its symbol. Finally, on the right is the lex_state after yylex().

The interesting parts here are the single-column lines that come out as +EXPR_ENDARG. This indicates a transition occurring during a parser action. According to this, for some reason the action is executed after reading the ')', a transition to EXPR_ENDARG occurs, and '{' is nicely changed into tLBRACE_ARG. This is actually a pretty high-level technique: generously (ab)using the (1) of LALR(1).

Abusing the lookahead

ruby -y can bring up a detailed display of the yacc parser engine. This time we will use it to trace the parser more closely.

    % ruby -yce 'm (a) {nil}' 2>&1 | egrep '^Reading|Reducing'
    Reducing via rule 1 (line 303),  -> @1
    Reading a token: Next token is 304 (tIDENTIFIER)
    Reading a token: Next token is 340 (tLPAREN_ARG)
    Reducing via rule 446 (line 2234), tIDENTIFIER  -> operation
    Reducing via rule 233 (line 1222),  -> @6
    Reading a token: Next token is 304 (tIDENTIFIER)
    Reading a token: Next token is 41 (')')
    Reducing via rule 392 (line 1993), tIDENTIFIER  -> variable
    Reducing via rule 403 (line 2006), variable  -> var_ref
    Reducing via rule 256 (line 1305), var_ref  -> primary
    Reducing via rule 198 (line 1062), primary  -> arg
    Reducing via rule 42 (line 593), arg  -> expr
    Reducing via rule 260 (line 1317),  -> @9
    Reducing via rule 261 (line 1317), tLPAREN_ARG expr @9 ')'  -> primary
    Reading a token: Next token is 344 (tLBRACE_ARG)
                             :
                             :

Here we’re using the option -c, which stops the process after compiling, and -e, which allows a program to be given on the command line. And we’re using egrep to single out the token read and reduction reports.

Start by looking at the middle of the list. ')' is read. Now look at the end: the reduction (execution) of the embedding action (@9) finally happens. Indeed, this would allow EXPR_ENDARG to be set after the ')' and before the '{'. But is this always the case? Let’s take another look at the part where it’s set.

    Rule 1    tLPAREN_ARG  {lex_state = EXPR_ENDARG;} ')'
    Rule 2    tLPAREN_ARG  call_args2  {lex_state = EXPR_ENDARG;} ')'
    Rule 3    tLPAREN_ARG  expr  {lex_state = EXPR_ENDARG;} ')'

The embedding action can be substituted with an empty rule. For example, we can rewrite rule 1 as follows with no change in meaning whatsoever.

    target  : tLPAREN_ARG tmp ')'
    tmp     :
                {
                    lex_state = EXPR_ENDARG;
                }

Assuming that the parser is positioned just before tmp, it’s possible that one terminal symbol will be read by lookahead. Thus the parser can read the next token past the (empty) tmp. And if we are certain that lookahead will occur, the assignment of EXPR_ENDARG to lex_state is guaranteed to happen after ')' has been read. But is ')' certain to be read by lookahead in this rule?

Ascertaining lookahead

This is actually pretty clear. Think about the following input.

    m () { nil }        # A
    m (a) { nil }       # B
    m (a,b,c) { nil }   # C

I also took the opportunity to rewrite the rules to make them easier to understand (with no actual changes).

    rule1: tLPAREN_ARG             e1  ')'
    rule2: tLPAREN_ARG  one_arg    e2  ')'
    rule3: tLPAREN_ARG  more_args  e3  ')'

    e1:   /* empty */
    e2:   /* empty */
    e3:   /* empty */

First, the case of input A. Reading up to

    m (         # ... tLPAREN_ARG

we arrive before the e1. If e1 is reduced here, another rule cannot be chosen anymore. Thus, a lookahead occurs to confirm whether to reduce e1 and continue with rule1 to the bitter end or to choose a different rule. Accordingly, if the input matches rule1 it is certain that ')' will be read by lookahead.

On to input B. First, reading up to here

    m (         # ... tLPAREN_ARG

Here a lookahead occurs for the same reason as described above. Further reading up to here

    m (a        # ... tLPAREN_ARG '(' tIDENTIFIER

Another lookahead occurs. It occurs because a decision is made between rule2 and rule3 depending on whether what follows is a ',' or a ')'. If what follows is a ',' then it can only be a comma to separate parameters, thus rule3, the rule for two or more parameters, is chosen. This is also true if the input is not a simple a but something like an if or a literal. When the input is complete, a lookahead occurs to choose between rule2 and rule3, the rules for one parameter and two or more parameters respectively.

A separate embedding action is present before ')' in every rule. There’s no going back after an action is executed, so the parser will try to postpone executing an action until it is as certain as possible. For that reason, situations where this certainty cannot be gained with a single lookahead have to be excluded when building a parser, since they are conflicts.

Proceeding to input C.

    m (a, b, c

At this point anything other than rule3 is unlikely, so we’re not expecting a lookahead. And yet, that is wrong. If what follows is '(' then it’s a method call, but if what follows is ',' or ')' it needs to be a variable reference. Basically, this time a lookahead is needed to confirm parameter elements instead of embedding action reduction.

But what about the other inputs? For example, what if the third parameter is a method call?

    m (a, b, c(....)    # ... ',' method_call

Once again a lookahead is necessary because a choice needs to be made between shift and reduction depending on whether what follows is ',' or ')'. Thus, in this rule in all instances the ')' is read before the embedding action is executed. This is quite complicated and more than a little impressive.

But would it be possible to set lex_state using a normal action instead of an embedding action? For example, like this:

                    | tLPAREN_ARG ')' { lex_state = EXPR_ENDARG; }

This won’t do because another lookahead is likely to occur before the action is reduced. This time the lookahead works to our disadvantage. With this it should be clear that abusing the lookahead of a LALR parser is pretty tricky and not something a novice should be doing.

do〜end iterator

So far we’ve dealt with the {〜} iterator, but we still have do〜end left. Since they’re both iterators, one would expect the same solutions to work, but it isn’t so. The priorities are different. For example,

    m a, b {....}          # m(a, (b{....}))
    m a, b do .... end     # m(a, b) do....end

Thus it’s only appropriate to deal with them differently.

That said, in some situations the same solutions do apply. The example below is one such situation.

    m (a) {....}
    m (a) do .... end

In the end, our only option is to look at the real thing. Since we’re dealing with do here, we should look in the part of yylex() that handles reserved words.

▼ yylex - Identifiers - Reserved words - do

    if (kw->id[0] == kDO) {
        if (COND_P()) return kDO_COND;
        if (CMDARG_P() && state != EXPR_CMDARG)
            return kDO_BLOCK;
        if (state == EXPR_ENDARG)
            return kDO_BLOCK;
        return kDO;
    }

(parse.y)

This time we only need the part that distinguishes between kDO_BLOCK and kDO. Ignore kDO_COND. Only look at what’s relevant at the moment: that is the rule when dealing with a finite-state scanner.
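The order of the tests in this listing matters, so it can help to restate it as a pure function of its three inputs. This is a sketch in Ruby under the assumption that COND_P(), CMDARG_P() and the saved state can be passed in as arguments; classify_do is a hypothetical name, not something defined in parse.y.

```ruby
# Hypothetical restatement of the `do` branch of yylex().
#   cond_p   - value COND_P() would return
#   cmdarg_p - value CMDARG_P() would return
#   state    - lex_state saved before the keyword was scanned
def classify_do(cond_p:, cmdarg_p:, state:)
  return :kDO_COND  if cond_p
  return :kDO_BLOCK if cmdarg_p && state != :EXPR_CMDARG
  return :kDO_BLOCK if state == :EXPR_ENDARG
  :kDO
end

p classify_do(cond_p: false, cmdarg_p: true,  state: :EXPR_ARG)
p classify_do(cond_p: false, cmdarg_p: false, state: :EXPR_ENDARG)
p classify_do(cond_p: false, cmdarg_p: true,  state: :EXPR_CMDARG)
```

The third call shows the excluded case: even with CMDARG_P() true, a state of EXPR_CMDARG falls through to plain kDO.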

The decision-making part using EXPR_ENDARG is the same as for tLBRACE_ARG, so priorities shouldn’t be an issue here. Similarly to '{', the right course of action is probably to make it kDO_BLOCK.

((errata: In the following case, priorities should have an influence. (But it does not in the actual code. It means this is a bug.)

    m m (a) { ... } # This should be interpreted as m(m(a) {...}),
                    # but is interpreted as m(m(a)) {...}
    m m (a) do ... end # as the same as this: m(m(a)) do ... end

))

The problem lies with CMDARG_P() and EXPR_CMDARG. Let’s look at both.

CMDARG_P()

▼ cmdarg_stack

    static stack_type cmdarg_stack = 0;
    #define CMDARG_PUSH(n) (cmdarg_stack = (cmdarg_stack<<1)|((n)&1))
    #define CMDARG_POP() (cmdarg_stack >>= 1)
    #define CMDARG_LEXPOP() do {\
        int last = CMDARG_P();\
        cmdarg_stack >>= 1;\
        if (last) cmdarg_stack |= 1;\
    } while (0)
    #define CMDARG_P() (cmdarg_stack&1)

(parse.y)

The structure and interface (macros) of cmdarg_stack are completely identical to those of cond_stack. It’s a stack of bits. Since it’s the same, we can use the same means to investigate it. Let’s list the places which use it. First, in the actions we have this:

    command_args    :  {
                            $<num>$ = cmdarg_stack;
                            CMDARG_PUSH(1);
                        }
                      open_args
                        {
                            /* CMDARG_POP() */
                            cmdarg_stack = $<num>1;
                            $$ = $2;
                        }

$<num>$ represents the left-hand side value with a forced cast. In this case it is the value of the embedding action itself, so it can be retrieved in the next action with $<num>1. Basically, it’s a structure where cmdarg_stack is hidden in $$ before open_args and then restored in the next action.

But why use a hide-restore system instead of a simple push-pop? That will be explained at the end of this section.

Searching yylex() for more CMDARG relations, I found this.

    Token             Relation
    '(' '[' '{'       CMDARG_PUSH(0)
    ')' ']' '}'       CMDARG_LEXPOP()

Basically, as long as the current position is enclosed in parentheses, CMDARG_P() is false.

Putting the two together, it can be said that CMDARG_P() is true when inside command_args, the parameters of a method call with parentheses omitted, and not enclosed in parentheses.
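To make the four macros concrete, here is a minimal sketch of the same bit stack in Ruby. BitStack and its method names are hypothetical stand-ins for the CMDARG_* macros; also note that a Ruby Integer is unbounded, whereas the C stack_type has a fixed width, so overflow behavior is not modeled.

```ruby
# Hypothetical Ruby model of cmdarg_stack and its macros.
class BitStack
  attr_reader :bits

  def initialize
    @bits = 0
  end

  # CMDARG_PUSH(n): shift left and store n in the lowest bit.
  def push(n)
    @bits = (@bits << 1) | (n & 1)
  end

  # CMDARG_POP(): drop the lowest bit.
  def pop
    @bits >>= 1
  end

  # CMDARG_LEXPOP(): drop the lowest bit, but if it was 1,
  # carry that 1 over to the new lowest bit.
  def lexpop
    last = top?
    @bits >>= 1
    @bits |= 1 if last
  end

  # CMDARG_P(): is the lowest bit set?
  def top?
    (@bits & 1) == 1
  end
end

s = BitStack.new
s.push(1)        # entering command_args
s.push(0)        # scanner saw '('
puts s.top?      # inside the parentheses
s.lexpop         # scanner saw ')'
puts s.top?      # back in command_args
```

In this run the popped bit is 0, so lexpop behaves exactly like pop; the difference only shows up when the lowest bit is 1 at the time of the pop.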

EXPR_CMDARG

Now let’s take a look at one more condition: EXPR_CMDARG. Like before, let us look for the place where a transition to EXPR_CMDARG occurs.

▼ yylex - Identifiers - State Transitions

    if (lex_state == EXPR_BEG ||
        lex_state == EXPR_MID ||
        lex_state == EXPR_DOT ||
        lex_state == EXPR_ARG ||
        lex_state == EXPR_CMDARG) {
        if (cmd_state)
            lex_state = EXPR_CMDARG;
        else
            lex_state = EXPR_ARG;
    }
    else {
        lex_state = EXPR_END;
    }

(parse.y)

This is code that handles identifiers inside yylex(). Leaving aside that there are a bunch of lex_state tests in here, let’s look first at cmd_state. And what is this?

▼ cmd_state

    static int
    yylex()
    {
        static ID last_id = 0;
        register int c;
        int space_seen = 0;
        int cmd_state;

        if (lex_strterm) {
            /* ……omitted…… */
        }
        cmd_state = command_start;
        command_start = Qfalse;

(parse.y)

Turns out it’s a yylex local variable. Furthermore, an investigation using grep revealed that this is the only place where its value is altered. This means it’s just a temporary variable for storing command_start during a single run of yylex.

When does command_start become true, then?

▼ command_start

    static int command_start = Qtrue;

    static NODE*
    yycompile(f, line)
        char *f;
        int line;
    {
                       :
        command_start = 1;

    static int
    yylex()
    {
                       :
      case '\n':
        /* ……omitted…… */
        command_start = Qtrue;
        lex_state = EXPR_BEG;
        return '\n';

      case ';':
        command_start = Qtrue;

      case '(':
        command_start = Qtrue;

(parse.y)

From this we understand that command_start is a static variable of parse.y that becomes true when one of \n ; ( is scanned.

Summing up what we’ve covered up to now: first, when \n ; ( is read, command_start becomes true, and during the next yylex() run cmd_state becomes true.

And here is the code in yylex() that uses cmd_state.

▼ yylex - Identifiers - State transitions

    if (lex_state == EXPR_BEG ||
        lex_state == EXPR_MID ||
        lex_state == EXPR_DOT ||
        lex_state == EXPR_ARG ||
        lex_state == EXPR_CMDARG) {
        if (cmd_state)
            lex_state = EXPR_CMDARG;
        else
            lex_state = EXPR_ARG;
    }
    else {
        lex_state = EXPR_END;
    }

(parse.y)

From this we understand the following: when an identifier is read right after \n ; ( and the state is one of EXPR_BEG MID DOT ARG CMDARG, a transition to EXPR_CMDARG occurs. However, right after \n ; ( the lex_state can only be EXPR_BEG, so for the transition to EXPR_CMDARG the lex_state test loses its meaning. The lex_state restriction is only important for the transition to EXPR_ARG.
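As a compact restatement of the listing, the transition after an identifier depends on just two inputs. The sketch below is a Ruby model with hypothetical names (state_after_identifier, ARG_LIKE_STATES); cmd_state is passed in as a boolean instead of being copied from command_start.

```ruby
# Hypothetical model of the state transition performed after an
# identifier is scanned; states are modeled as symbols.
ARG_LIKE_STATES = %i[EXPR_BEG EXPR_MID EXPR_DOT EXPR_ARG EXPR_CMDARG].freeze

def state_after_identifier(lex_state, cmd_state)
  if ARG_LIKE_STATES.include?(lex_state)
    cmd_state ? :EXPR_CMDARG : :EXPR_ARG
  else
    :EXPR_END
  end
end

# An identifier at the start of a statement (cmd_state true):
p state_after_identifier(:EXPR_BEG, true)
# The same identifier after a ',' (cmd_state false):
p state_after_identifier(:EXPR_BEG, false)
```

The two calls differ only in cmd_state, which is what separates the first identifier of a command from later ones.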

Based on the above we can now think of situations where the state is EXPR_CMDARG. For example, see the ones below. The underscore is the current position.

    m _
    m(m _
    m m _

((errata: The third one “m m _” is not EXPR_CMDARG. (It is EXPR_ARG.)))

Conclusion

Let us now return to the do decision code.

▼ yylex - Identifiers - Reserved words - kDO - kDO_BLOCK

    if (CMDARG_P() && state != EXPR_CMDARG)
        return kDO_BLOCK;

(parse.y)

This condition means: inside the parameters of a method call with parentheses omitted, but not before the first parameter. That means from the second parameter of command_call onward. Basically, like this:

    m arg, arg do .... end
    m (arg), arg do .... end

Why is the case of EXPR_CMDARG excluded? This example should clear it up.

    m do .... end

This pattern can already be handled using the do〜end iterator which uses kDO and is defined in primary. Thus, including that case would cause another conflict.

Reality and truth

Did you think we’re done? Not yet. Certainly, the theory is now complete, but only if everything that has been written is correct. As a matter of fact, there is one falsehood in this section. Well, more accurately, it isn’t a falsehood but an inexact statement. It’s in the part about CMDARG_P().

Actually, CMDARG_P() becomes true when inside command_args, that is to say, inside the parameter of a method call with parentheses omitted.

But where exactly is “inside the parameter of a method call with parentheses omitted”? Once again, let us use rubylex-analyser to inspect in detail.

    % rubylex-analyser -e  'm a,a,a,a;'
    +EXPR_BEG
    EXPR_BEG     C        "m"  tIDENTIFIER          EXPR_CMDARG
    EXPR_CMDARG S         "a"  tIDENTIFIER          EXPR_ARG
                                                  1:cmd push-
    EXPR_ARG              ","  ','                  EXPR_BEG
    EXPR_BEG              "a"  tIDENTIFIER          EXPR_ARG
    EXPR_ARG              ","  ','                  EXPR_BEG
    EXPR_BEG              "a"  tIDENTIFIER          EXPR_ARG
    EXPR_ARG              ","  ','                  EXPR_BEG
    EXPR_BEG              "a"  tIDENTIFIER          EXPR_ARG
    EXPR_ARG              ";"  ';'                  EXPR_BEG
                                                  0:cmd resume
    EXPR_BEG     C       "\n"  '                    EXPR_BEG

The 1:cmd push- in the right column is the push to cmdarg_stack. When the rightmost digit in that line is 1, CMDARG_P() becomes true. To sum up, the period during which CMDARG_P() is true can be described as:

    From immediately after the first parameter of a method call with parentheses omitted
    To the terminal symbol following the final parameter

But, very strictly speaking, even this is still not entirely accurate.

    % rubylex-analyser -e  'm a(),a,a;'
    +EXPR_BEG
    EXPR_BEG     C        "m"  tIDENTIFIER          EXPR_CMDARG
    EXPR_CMDARG S         "a"  tIDENTIFIER          EXPR_ARG
                                                  1:cmd push-
    EXPR_ARG              "("  '('                  EXPR_BEG
                                                  0:cond push
                                                 10:cmd push
    EXPR_BEG     C        ")"  ')'                  EXPR_END
                                                  0:cond lexpop
                                                  1:cmd lexpop
    EXPR_END              ","  ','                  EXPR_BEG
    EXPR_BEG              "a"  tIDENTIFIER          EXPR_ARG
    EXPR_ARG              ","  ','                  EXPR_BEG
    EXPR_BEG              "a"  tIDENTIFIER          EXPR_ARG
    EXPR_ARG              ";"  ';'                  EXPR_BEG
                                                  0:cmd resume
    EXPR_BEG     C       "\n"  '                    EXPR_BEG

When the first terminal symbol of the first parameter has been read, CMDARG_P() is true. Therefore, the complete answer would be:

    From the first terminal symbol of the first parameter of a method call with parentheses omitted
    To the terminal symbol following the final parameter

What repercussions does this fact have? Recall the code that uses CMDARG_P().

▼ yylex - Identifiers - Reserved words - kDO - kDO_BLOCK

    if (CMDARG_P() && state != EXPR_CMDARG)
        return kDO_BLOCK;

(parse.y)

EXPR_CMDARG stands for “Before the first parameter of command_call” and is excluded. But wait, this meaning is also included in CMDARG_P(). Thus, the final conclusion of this section:

    EXPR_CMDARG is completely useless

Truth be told, when I realized this, I almost broke down crying. I was sure it had to mean SOMETHING and spent enormous effort analyzing the source, but couldn’t understand anything. Finally, I ran all kinds of tests on the code using rubylex-analyser and arrived at the conclusion that it has no meaning whatsoever.

I didn’t spend so much time doing something meaningless just to fill up more pages. It was an attempt to simulate a situation likely to happen in reality. No program is perfect; all programs contain their own mistakes. Complicated situations like the one discussed here are where mistakes occur most easily, and when they do, reading the source material with the assumption that it’s flawless can really backfire. In the end, when reading the source code, you can only trust what actually happens.

Hopefully, this will teach you the importance of dynamic analysis. When investigating something, focus on what really happens. The source code will not tell you everything. It can’t tell anything other than what the reader infers.

And with this very useful sermon, I close the chapter.

((errata: This confidently written conclusion was wrong. Without EXPR_CMDARG, for instance, this program “m (m do end)” cannot be parsed. This is an example of the fact that correctness is not proved even if dynamic analyses are done so many times.))

Still not the end

Another thing I forgot. I can’t end the chapter without explaining why CMDARG_P() takes that value. Here’s the problematic part:

▼ command_args

    command_args    :  {
                            $<num>$ = cmdarg_stack;
                            CMDARG_PUSH(1);
                        }
                      open_args
                        {
                            /* CMDARG_POP() */
                            cmdarg_stack = $<num>1;
                            $$ = $2;
                        }

    open_args       : call_args

(parse.y)

All things considered, this looks like another influence from lookahead. command_args is always in the following context:

    tIDENTIFIER _

Thus, this looks like a variable reference or a method call. If it’s a variable reference, it needs to be reduced to variable, and if it’s a method call it needs to be reduced to operation. We cannot decide how to proceed without employing lookahead. Thus a lookahead always occurs at the head of command_args, and after the first terminal symbol of the first parameter is read, CMDARG_PUSH() is executed.

The reason why POP and LEXPOP exist separately in cmdarg_stack is also here. Observe the following example:

    % rubylex-analyser -e 'm m (a), a'
    -e:1: warning: parenthesize argument(s) for future version
    +EXPR_BEG
    EXPR_BEG     C        "m"  tIDENTIFIER          EXPR_CMDARG
    EXPR_CMDARG S         "m"  tIDENTIFIER          EXPR_ARG
                                                  1:cmd push-
    EXPR_ARG    S         "("  tLPAREN_ARG          EXPR_BEG
                                                  0:cond push
                                                 10:cmd push
                                                101:cmd push-
    EXPR_BEG     C        "a"  tIDENTIFIER          EXPR_CMDARG
    EXPR_CMDARG           ")"  ')'                  EXPR_END
                                                  0:cond lexpop
                                                 11:cmd lexpop
    +EXPR_ENDARG
    EXPR_ENDARG           ","  ','                  EXPR_BEG
    EXPR_BEG    S         "a"  tIDENTIFIER          EXPR_ARG
    EXPR_ARG             "\n"  \n                   EXPR_BEG
                                                 10:cmd resume
                                                  0:cmd resume

Looking only at the parts related to cmd and how they correspond to each other…

      1:cmd push-       parser push(1)
     10:cmd push        scanner push
    101:cmd push-       parser push(2)
     11:cmd lexpop      scanner pop
     10:cmd resume      parser pop(2)
      0:cmd resume      parser pop(1)

The cmd push- with a minus sign at the end is a parser push. As you can see, the pushes and pops do not correspond. Originally there were supposed to be two consecutive push- and the stack would become 110, but due to the lookahead the stack became 101 instead. CMDARG_LEXPOP() is a last-resort measure to deal with this. The scanner always pushes 0, so normally what it pops should also always be 0. When it isn’t 0, we can only assume that it’s 1 due to the parser push being late. Thus, the value is kept.

Conversely, at the time of the parser pop the stack is supposed to be back in its normal state, so an ordinary pop shouldn’t cause any trouble. The reason it isn’t actually done that way is probably just that all that matters is that it works correctly. Whether popping or hiding in $$ and restoring, the result is the same. On top of that, when you consider all the alterations still to come, it’s really impossible to tell how lookahead’s behavior will change. Moreover, this problem appears in a grammar that’s going to be forbidden in the future (that’s why there is a warning). To make something like this work, one would have to consider numerous possible situations and respond to each of them. And that is why I think this kind of implementation is right for ruby. Therein lies the real solution.
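The mismatch between pushes and pops can be replayed by hand. The sketch below is a Ruby simulation of the cmd lines in the trace for 'm m (a), a'; replay_cmd_trace is a hypothetical name, and the hide-and-restore of the parser action is modeled with two local variables instead of $<num>$.

```ruby
# Hypothetical replay of the cmdarg_stack operations in the trace above.
# Returns the stack, in binary, after each step.
def replay_cmd_trace
  stack = 0
  log = []

  push   = ->(n) { stack = (stack << 1) | (n & 1) }
  lexpop = lambda do
    last = stack & 1
    stack >>= 1
    stack |= 1 if last == 1
  end

  saved_outer = stack          # outer command_args hides the stack
  push.call(1);  log << stack.to_s(2)   # parser push (1)
  push.call(0);  log << stack.to_s(2)   # scanner push for "("
  saved_inner = stack          # inner command_args hides the stack (late)
  push.call(1);  log << stack.to_s(2)   # parser push (2)
  lexpop.call;   log << stack.to_s(2)   # scanner pop for ")"
  stack = saved_inner; log << stack.to_s(2)   # parser restore (2)
  stack = saved_outer; log << stack.to_s(2)   # parser restore (1)
  log
end

puts replay_cmd_trace.join(" ")
```

The sequence produced is 1, 10, 101, 11, 10, 0, matching the left column of the cmd lines. The step from 101 to 11 is where LEXPOP carries the late parser bit over, and the two restores then bring the stack back regardless of what the scanner did in between.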

The original work is Copyright © 2002 - 2004 Minero AOKI. Translated by Vincent ISAMBART and Clifford Escobar CAOILE. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License.


Chapter 12: Syntax tree construction

Node

NODE

As I’ve already described, a Ruby program is first converted to a syntax tree. To be more precise, a syntax tree is a tree structure made of structs called “nodes”. In ruby, all nodes are of type NODE.

▼ NODE

    typedef struct RNode {
        unsigned long flags;
        char *nd_file;
        union {
            struct RNode *node;
            ID id;
            VALUE value;
            VALUE (*cfunc)(ANYARGS);
            ID *tbl;
        } u1;
        union {
            struct RNode *node;
            ID id;
            int argc;
            VALUE value;
        } u2;
        union {
            struct RNode *node;
            ID id;
            long state;
            struct global_entry *entry;
            long cnt;
            VALUE value;
        } u3;
    } NODE;

(node.h)

As you might be able to infer from the struct name RNode, nodes are Ruby objects. This means the creation and release of nodes are taken care of by ruby’s garbage collector.

Therefore, flags naturally has the same role as basic.flags of the object struct. It means that T_NODE, which is the type of the struct, and flags such as FL_FREEZE are stored in it. As for NODE, in addition to these, its node type is stored in flags.

What does that mean? Since a program can contain various elements such as if, while, def and so on, there are also various corresponding node types. The three unions look complicated, but for each node type there is only one specific way in which they are used. For example, the table below shows the case of NODE_IF, the node of if.

    member   union member   role
    u1       u1.node        the condition expression
    u2       u2.node        the body of true
    u3       u3.node        the body of false

And, in node.h, macros to access each union member are available.

▼ the macros to access NODE

    #define nd_head  u1.node
    #define nd_alen  u2.argc
    #define nd_next  u3.node

    #define nd_cond  u1.node
    #define nd_body  u2.node
    #define nd_else  u3.node

    #define nd_orig  u3.value
                     :
                     :

(node.h)

For example, these are used as follows:

    NODE *head, *tail;
    head->nd_next = tail;    /* head->u3.node = tail */

In the source code, these macros are used almost without exception. The only exceptions are the two places where NODE is created in parse.y and where NODE is marked in gc.c.

By the way, why are such macros used? For one thing, it might be because it’s cumbersome to remember numbers like u1 that are not meaningful by themselves. But more important than that is that nothing should break if the corresponding number is changed, and it’s possible that it will actually be changed. For example, since the condition clause of if does not have to be stored in u1, someone might want to change it to u2 for some reason. But if u1 were used directly, they would need to modify a lot of places all over the source code, which is inconvenient. Since nodes are all declared as NODE, it’s hard to find the nodes that represent if. By preparing the access macros, this kind of trouble can be avoided, and conversely we can determine the node types from the macros.
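The idea of several meaningful names sharing one storage slot can be imitated outside C. The following Ruby sketch is only an analogy: SlotNode and SLOT_ALIASES are hypothetical, a Ruby slot can hold any object so the union itself is not modeled, and only a few of the macro names from node.h are included.

```ruby
# Hypothetical analogy of the node.h access macros: each nd_* name is an
# alias for one of the three slots u1, u2, u3.
SLOT_ALIASES = {
  nd_head: :u1, nd_alen: :u2, nd_next: :u3,
  nd_cond: :u1, nd_body: :u2, nd_else: :u3
}.freeze

SlotNode = Struct.new(:flags, :nd_file, :u1, :u2, :u3) do
  SLOT_ALIASES.each do |name, slot|
    define_method(name) { self[slot] }
    define_method("#{name}=") { |value| self[slot] = value }
  end
end

head = SlotNode.new(0, "-e")
tail = SlotNode.new(0, "-e")
head.nd_next = tail            # same effect as head.u3 = tail
puts head.u3.equal?(tail)
```

If the condition of if were ever moved from u1 to u2, only the SLOT_ALIASES table would change in this model, which is the maintenance argument made in the text.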

Node Type

I said that the node type is stored in the flags of a NODE struct. We’ll look at the form in which this information is stored. A node type can be set by nd_set_type() and obtained by nd_type().

▼ nd_type nd_set_type

    #define nd_type(n) (((RNODE(n))->flags>>FL_USHIFT)&0xff)
    #define nd_set_type(n,t) \
        RNODE(n)->flags=((RNODE(n)->flags&~FL_UMASK) \
                         |(((t)<<FL_USHIFT)&FL_UMASK))

(node.h)

▼ FL_USHIFT FL_UMASK

    #define FL_USHIFT    11
    #define FL_UMASK  (0xff<<FL_USHIFT)

(ruby.h)

This is not much trouble as long as we keep our focus on nd_type. Fig.1 shows what it looks like.

Fig.1: The usage of RNode.flags
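The two macros can be checked with ordinary integer arithmetic. This is a sketch in Ruby that passes the flags word around as a plain Integer instead of dereferencing RNODE(n); the constant values are the ones quoted from ruby.h above, while the method signatures are an assumption made for the example.

```ruby
FL_USHIFT = 11
FL_UMASK  = 0xff << FL_USHIFT

# Extract the 8-bit node type stored above the lowest FL_USHIFT bits.
def nd_type(flags)
  (flags >> FL_USHIFT) & 0xff
end

# Return a new flags word with the node type field replaced by `type`;
# all other bits are left untouched.
def nd_set_type(flags, type)
  (flags & ~FL_UMASK) | ((type << FL_USHIFT) & FL_UMASK)
end

flags = nd_set_type(0b101, 0x2a)   # 0b101 stands in for some low flag bits
puts nd_type(flags)                # the stored type
puts flags & ((1 << FL_USHIFT) - 1)  # the low bits are preserved
```

Values wider than 8 bits are truncated by the FL_UMASK mask, so a type can never spill into the bits above the type field.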

And, since macros cannot be used from debuggers, the nodetype() function is also available.

▼ nodetype

    static enum node_type
    nodetype(node)                  /* for debug */
        NODE *node;
    {
        return (enum node_type)nd_type(node);
    }

(parse.y)

File Name and Line Number

The nd_file of a NODE holds (the pointer to) the name of the file where the text that corresponds to this node exists. Since there’s the file name, we naturally expect that there’s also the line number, but the corresponding member cannot be found around here. Actually, the line number is embedded in flags by the following macros:

▼ nd_line nd_set_line

    #define NODE_LSHIFT (FL_USHIFT+8)
    #define NODE_LMASK  (((long)1<<(sizeof(NODE*)*CHAR_BIT-NODE_LSHIFT))-1)
    #define nd_line(n) \
        ((unsigned int)((RNODE(n)->flags >> NODE_LSHIFT) & NODE_LMASK))
    #define nd_set_line(n,l) \
        RNODE(n)->flags=((RNODE(n)->flags&~(-1<<NODE_LSHIFT)) \
                         |(((l)&NODE_LMASK)<<NODE_LSHIFT))

(node.h)

nd_set_line() is fairly spectacular. However, as the names suggest, it is certain that nd_set_line() and nd_line() work symmetrically. Thus, if we first examine the simpler nd_line() and grasp the relationship between the parameters, there’s no need to analyze nd_set_line() in the first place.

The first thing is NODE_LSHIFT. As you can guess from the description of the node types in the previous section, it is the number of used bits in flags. FL_USHIFT bits are reserved by the ruby system (11 bits, ruby.h), and 8 bits are for the node type.

The next thing is NODE_LMASK.

    sizeof(NODE*) * CHAR_BIT - NODE_LSHIFT

This is the number of the remaining bits. Let’s call it restbits. This makes the code a lot simpler.

    #define NODE_LMASK  (((long)1 << restbits) - 1)

Fig.2 shows what the above code seems to be doing. Note that a borrow occurs when subtracting 1. We can eventually understand that NODE_LMASK is a sequence filled with 1 whose size is the number of the bits that are still available.

Fig.2: NODE_LMASK

Now, let’s look at nd_line() again.

    (RNODE(n)->flags >> NODE_LSHIFT) & NODE_LMASK

By the right shift, the unused space is shifted to the LSB. The bitwise AND leaves only the unused space. Fig.3 shows how flags is used. Since FL_USHIFT is 11, on a 32-bit machine 32-(11+8)=13 bits are available for the line number.

Fig.3: How flags are used at NODE

… This means that if the line number reaches 2^13=8192 or more, the line number should be displayed wrongly. Let’s try.

    File.open('overflow.rb', 'w') {|f|
        10000.times { f.puts }
        f.puts 'raise'
    }

With my 686 machine, ruby overflow.rb properly displayed 1809 as a line number. I’ve succeeded. However, if you use a 64-bit machine, you need to create a somewhat bigger file in order to successfully fail.
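The number 1809 can be reproduced without a 32-bit machine by doing the masking by hand. The sketch below is a Ruby model of nd_line and nd_set_line; the word_bits parameter is an assumption that stands in for sizeof(NODE*) * CHAR_BIT, and flags is passed as a plain Integer.

```ruby
FL_USHIFT   = 11
NODE_LSHIFT = FL_USHIFT + 8

# NODE_LMASK for a machine whose pointers are word_bits wide.
def node_lmask(word_bits)
  (1 << (word_bits - NODE_LSHIFT)) - 1
end

# Store `line` in the bits above NODE_LSHIFT, keeping the low bits.
def nd_set_line(flags, line, word_bits)
  low = flags & ((1 << NODE_LSHIFT) - 1)   # same bits as flags & ~(-1 << NODE_LSHIFT)
  low | ((line & node_lmask(word_bits)) << NODE_LSHIFT)
end

def nd_line(flags, word_bits)
  (flags >> NODE_LSHIFT) & node_lmask(word_bits)
end

# The raise in overflow.rb sits on line 10001 (after 10000 empty lines).
puts nd_line(nd_set_line(0, 10_001, 32), 32)   # prints 1809
puts nd_line(nd_set_line(0, 10_001, 64), 64)   # prints 10001
```

With 32-bit words the mask keeps 13 bits, and 10001 - 8192 = 1809, which agrees with the line number reported in the experiment. With 64-bit words 45 bits are available, so the line number survives intact.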

rb_node_newnode()

Lastly let’s look at the function rb_node_newnode() that creates a node.

▼ rb_node_newnode()

    NODE*
    rb_node_newnode(type, a0, a1, a2)
        enum node_type type;
        NODE *a0, *a1, *a2;
    {
        NODE *n = (NODE*)rb_newobj();

        n->flags |= T_NODE;
        nd_set_type(n, type);
        nd_set_line(n, ruby_sourceline);
        n->nd_file = ruby_sourcefile;

        n->u1.node = a0;
        n->u2.node = a1;
        n->u3.node = a2;

        return n;
    }

(parse.y)

We’ve seen rb_newobj() in Chapter 5: Garbage collection. It is the function to get a vacant RVALUE. By attaching the T_NODE struct-type flag to it, the initialization as a VALUE is complete. Of course, it’s possible that some values that are not of type NODE* are passed for u1 u2 u3, but they are received as NODE* for the time being. Since the syntax trees of ruby do not contain double and such, if the values are received as pointers, the size will never be too small.

For the rest of this chapter, you can forget about the details you’ve learned so far, and assume NODE is

    flags
    nodetype
    nd_line
    nd_file
    u1
    u2
    u3

a struct type that has the above seven members.

Syntax Tree Construction

The role of the parser is to convert the source code, which is a byte sequence, to a syntax tree. Even once the input has passed the grammar, not even half of the task is finished, so we have to assemble nodes and create a tree. In this section, we’ll look at the construction process of that syntax tree.

YYSTYPE

Essentially this chapter is about actions, thus YYSTYPE, which is the type of $$ or $1, becomes important. Let’s look at the %union of ruby first.

▼ %union declaration

    %union {
        NODE *node;
        ID id;
        int num;
        struct RVarmap *vars;
    }

(parse.y)

struct RVarmap is a struct used by the evaluator and holds a block local variable. You can tell the rest. The most used one is of course node.

Landscape with Syntax Trees

I mentioned that looking at the facts first is a basic principle of code reading. Since what we want to know this time is what the generated syntax tree looks like, we should start with looking at the answer (the syntax tree).

It’s also fine to use a debugger to observe every time, but you can visualize the syntax tree more handily by using the tool nodedump contained in the attached CD-ROM. This tool is originally the NodeDump made by Pragmatic Programmers and remodeled for this book. The original version shows quite explanatory output, but this remodeled version deeply and directly displays the appearance of the syntax tree.

For example, in order to dump the simple expression m(a), you can do as follows:

    % ruby -rnodedump -e 'm(a)'
    NODE_NEWLINE
    nd_file = "-e"
    nd_nth  = 1
    nd_next:
        NODE_FCALL
        nd_mid = 9617 (m)
        nd_args:
            NODE_ARRAY
            nd_alen = 1
            nd_head:
                NODE_VCALL
                nd_mid = 9625 (a)
            nd_next = (null)

The -r option is used to specify the library to be loaded, and -e is used to pass a program. Then, the syntax tree expression of the program will be dumped.

I’ll briefly explain how to read the content. NODE_NEWLINE, NODE_FCALL and such are the node types. What is written at the same indent level as each node are the contents of its node members. For example, the root is NODE_NEWLINE, and it has the three members nd_file, nd_nth and nd_next. nd_file points to the C string "-e", nd_nth holds the C integer 1, and nd_next holds the next node, NODE_FCALL. But since this explanation in text is probably not intuitive, I recommend that you also check Fig.4 at the same time.

Page 562: Ruby Hacking Guide

Fig.4:SyntaxTree

I’llexplainthemeaningofeachnode.NODE_CALLisaFunctionCALL.NODE_ARRAYisasitsnamesuggeststhenodeofarray,andhereitexpressesthelistofarguments.NODE_VCALLisaVariableorCALL,areferencetoundefinedlocalvariablewillbecomethis.

Then,whatisNODE_NEWLINE?Thisisthenodetojointhenameofthecurrentlyexecutedfileandthelinenumberatruntimeandissetforeachstmt.Therefore,whenonlythinkingaboutthemeaningoftheexecution,thisnodecanbeignored.Whenyourequirenodedump-shortinsteadofnodedump,distractionslikeNODE_NEWLINEareleftoutinthefirstplace.Sinceitiseasiertoseeifitissimple,nodedump-shortwillbeusedlateronexceptforwhenparticularlywritten.

Page 563: Ruby Hacking Guide

Now,we’lllookatthethreetypeofcomposingelementsinordertograsphowthewholesyntaxtreeis.Thefirstoneistheleavesofasyntaxtree.Next,we’lllookatexpressionsthatarecombinationsofthatleaves,thismeanstheyarebranchesofasyntaxtree.Thelastoneisthelisttolistupthestatementsthatisthetrunkofasyntaxtreeinotherwords.

Leaf

First, let’s start with the edges, the leaves of the syntax tree: literals, variable references and so on. Among the rules, they belong to primary, and they are particularly simple even among the primary rules.

% ruby -rnodedump-short -e '1'
NODE_LIT
nd_lit = 1:Fixnum

1 as a numeric value. There’s no twist. However, notice that what is stored in the node is not a C 1 but a Ruby 1 (a Fixnum 1). This is because…

% ruby -rnodedump-short -e ':sym'
NODE_LIT
nd_lit = 9617:Symbol

In this way, a Symbol is represented by the same NODE_LIT when it becomes a syntax tree. As in the above example, a VALUE is always stored in nd_lit, so at execution time it can be handled in exactly the same way whether it is a Symbol or a Fixnum. All we need to do when dealing with it is retrieve the value in nd_lit and return it. Since we create a syntax tree in order to execute it, designing it to be convenient for execution is the right thing to do.

% ruby -rnodedump-short -e '"a"'
NODE_STR
nd_lit = "a":String

A string. This is also a Ruby string. String literals are copied when actually used.

% ruby -rnodedump -e '[0,1]'
NODE_NEWLINE
nd_file = "-e"
nd_nth  = 1
nd_next:
    NODE_ARRAY
    nd_alen = 2
    nd_head:
        NODE_LIT
        nd_lit = 0:Fixnum
    nd_next:
        NODE_ARRAY
        nd_alen = 1
        nd_head:
            NODE_LIT
            nd_lit = 1:Fixnum
        nd_next = (null)

An array. I can’t say this is a leaf, but let’s allow it here because it’s also a literal. It looks like a list of NODE_ARRAY nodes with an element node hanging from each. The reason why I didn’t use nodedump-short only in this case is … you will understand after finishing this section.

Branch

Next, we’ll focus on “combinations”, which are the branches. if will be taken as an example.

if

I feel like if is always used as an example. That’s because its structure is simple and there’s no reader who doesn’t know about if, so it is convenient for writers.

Anyway, this is an example of if. Let’s convert the following code to a syntax tree.

▼ The Source Program

if true
  'true expr'
else
  'false expr'
end

▼ Its syntax tree expression

NODE_IF
nd_cond:
    NODE_TRUE
nd_body:
    NODE_STR
    nd_lit = "true expr":String
nd_else:
    NODE_STR
    nd_lit = "false expr":String

Here, the previously described nodedump-short is used, so NODE_NEWLINE has disappeared. nd_cond is the condition, nd_body is the body of the true case, and nd_else is the body of the false case.

Then, let’s look at the code that builds this.

▼ if rule

        | kIF expr_value then
          compstmt
          if_tail
          kEND
            {
                $$ = NEW_IF(cond($2), $4, $5);
                fixpos($$, $2);
            }

(parse.y)

It seems that NEW_IF() is the macro that creates a NODE_IF. Among the values of the symbols, $2, $4 and $5 are used, so the correspondences between the symbols of the rule and $n are as follows:

kIF    expr_value  then  compstmt  if_tail  kEND
$1     $2          $3    $4        $5       $6
NEW_IF(expr_value,       compstmt, if_tail)

In other words, expr_value is the condition expression, compstmt ($4) is the true case, and if_tail is the false case.

Meanwhile, the macros that create nodes are all named NEW_xxxx, and they are defined in node.h. Let’s look at NEW_IF().

▼ NEW_IF()

#define NEW_IF(c,t,e) rb_node_newnode(NODE_IF,c,t,e)

(node.h)

As for the parameters, it seems that c stands for condition, t for then, and e for else. As described in the previous section, the order of the members of a node is not very meaningful, so you don’t need to pay much attention to parameter names in places like this.

And cond(), which processes the node of the condition expression in the action, is a semantic analysis function. It will be described later.

Additionally, fixpos() corrects the line number. A NODE is initialized with the file name and the line number of the moment it is “created”. However, by the time NODE_IF is created, the if has already been parsed all the way to end, so the line number would be wrong if left untouched. Therefore, it needs to be corrected by fixpos().

fixpos(dest, src)

This sets the line number of the node dest to that of the node src. As for if, the line number of the condition expression becomes the line number of the whole if expression.

elsif

Subsequently, let’s look at the rule of if_tail.

▼ if_tail

if_tail         : opt_else
                | kELSIF expr_value then
                  compstmt
                  if_tail
                    {
                        $$ = NEW_IF(cond($2), $4, $5);
                        fixpos($$, $2);
                    }

opt_else        : none
                | kELSE compstmt
                    {
                        $$ = $2;
                    }

(parse.y)

First, this rule expresses “a list that ends with opt_else after zero or more elsif clauses”. That’s because if_tail appears again and again while elsif continues, and disappears when opt_else comes in. We can see this by expanding the rule an arbitrary number of times.

if_tail: kELSIF .... if_tail
if_tail: kELSIF .... kELSIF .... if_tail
if_tail: kELSIF .... kELSIF .... kELSIF .... if_tail
if_tail: kELSIF .... kELSIF .... kELSIF .... opt_else
if_tail: kELSIF .... kELSIF .... kELSIF .... kELSE compstmt

Next, let’s focus on the actions. Surprisingly, elsif uses the same NEW_IF() as if. It means that the two programs below lose their difference once they become syntax trees.

if cond1                  if cond1
  body1                     body1
elsif cond2               else
  body2                     if cond2
elsif cond3                   body2
  body3                     else
else                          if cond3
  body4                         body3
end                           else
                                body4
                              end
                            end
                          end

Come to think of it, in C and similar languages there’s no distinction between the two even at the syntax level, so this might be a matter of course. Similarly, the conditional operator (a?b:c) becomes indistinguishable from an if statement once they become syntax trees.

Precedence was very meaningful in the context of grammar, but it is no longer necessary because the structure of the syntax tree contains that information. And differences in appearance, such as between if and the conditional operator, become completely meaningless; only the meaning (the behavior) matters. Therefore, there’s no problem at all if if and the conditional operator have the same syntax tree expression.

I’ll introduce a few more examples. and and && become the same. or and || are also equal to each other. not and !, if and modifier if, and so on: these pairs also become equal to each other.

Left Recursive and Right Recursive

By the way, the symbol of a list was always written at the left side when expressing a list in Chapter 9: yacc crash course. However, have you noticed that it is the opposite in if_tail? I’ll show only the crucial part again.

if_tail: opt_else
       | kELSIF ... if_tail

Surely, it is the opposite of the previous examples. if_tail, which is the symbol of the list, is at the right side.

In fact, there’s another established way of expressing lists,

list: END_ITEM
    | ITEM list

When you write it this way, it becomes a list that contains zero or more consecutive ITEMs and ends with END_ITEM.

As an expression of a list, it makes little difference which one is used, but the way the actions are executed is fatally different. With the form in which list is written at the right, the actions are executed sequentially from the last ITEM. We’ve already learned about the behavior of the stack when list is at the left, so let’s try the case where list is at the right. The input is 4 ITEMs and END_ITEM.

                                  empty at first
ITEM                              shift ITEM
ITEM ITEM                         shift ITEM
ITEM ITEM ITEM                    shift ITEM
ITEM ITEM ITEM ITEM               shift ITEM
ITEM ITEM ITEM ITEM END_ITEM      shift END_ITEM
ITEM ITEM ITEM ITEM list          reduce END_ITEM to list
ITEM ITEM ITEM list               reduce ITEM list to list
ITEM ITEM list                    reduce ITEM list to list
ITEM list                         reduce ITEM list to list
list                              reduce ITEM list to list
                                  accept.

When list was at the left, shifts and reductions were done in turns. This time, as you see, there are consecutive shifts followed by consecutive reductions.

The reason why if_tail places “list at the right” is to create the syntax tree from the bottom up. When creating it from the bottom up, the node of if is left in hand at the end. If if_tail were defined with “list at the left” instead, then in order to eventually leave the node of if in hand, it would have to traverse all the links of the elsif chain and add each newly found elsif to the end. This is cumbersome. And slow. Thus, if_tail is constructed in the “list at the right” manner.

Finally, as for the meaning of the headline: in grammar terms, “the left is list” is called left-recursive and “the right is list” is called right-recursive. These terms are used mainly when reading papers about processing grammars or writing a book about yacc.
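The difference between the two construction orders can be sketched in a few lines of C. This is only a model under stated assumptions, not ruby’s code: IfNode, new_if(), build_right() and append_left() are hypothetical names, conditions and bodies are reduced to plain integers, and only the else link of NODE_IF is kept.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-in for NODE_IF: a condition id, a body id,
   and a pointer to whatever sits in the else position. */
typedef struct IfNode {
    int cond;
    int body;
    struct IfNode *els;
} IfNode;

/* Counterpart of NEW_IF(c, t, e) in this model. */
IfNode *new_if(int cond, int body, IfNode *els)
{
    IfNode *n = malloc(sizeof *n);
    if (n == NULL) abort();
    n->cond = cond;
    n->body = body;
    n->els = els;
    return n;
}

/* Right-recursive order: the last clause is reduced first, so each action
   only wraps the tail it was handed. One step per clause. */
IfNode *build_right(const int *conds, const int *bodies, int n)
{
    IfNode *tail = NULL;                /* opt_else: none */
    for (int i = n - 1; i >= 0; i--)
        tail = new_if(conds[i], bodies[i], tail);
    return tail;                        /* the outermost if is left in hand */
}

/* Left-recursive order: clauses arrive first to last, so each new clause
   must be hung at the far end of the chain. *steps counts links walked. */
IfNode *append_left(IfNode *head, int cond, int body, long *steps)
{
    IfNode *n = new_if(cond, body, NULL);
    if (head == NULL) return n;
    IfNode *p = head;
    while (p->els != NULL) {
        p = p->els;
        (*steps)++;
    }
    p->els = n;
    return head;
}
```

Both functions produce the same chain; the left-recursive variant simply pays for a walk to the end on every clause, which is the cost the “list at the right” form avoids.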

Trunk

Leaf, branch, and finally the trunk. Let’s look at how the list of statements is joined.

▼ The Source Program

7
8
9

The dump of the corresponding syntax tree is shown below. This is not nodedump-short but the complete form.

▼ Its Syntax Tree

NODE_BLOCK
nd_head:
    NODE_NEWLINE
    nd_file = "multistmt"
    nd_nth  = 1
    nd_next:
        NODE_LIT
        nd_lit = 7:Fixnum
nd_next:
    NODE_BLOCK
    nd_head:
        NODE_NEWLINE
        nd_file = "multistmt"
        nd_nth  = 2
        nd_next:
            NODE_LIT
            nd_lit = 8:Fixnum
    nd_next:
        NODE_BLOCK
        nd_head:
            NODE_NEWLINE
            nd_file = "multistmt"
            nd_nth  = 3
            nd_next:
                NODE_LIT
                nd_lit = 9:Fixnum
        nd_next = (null)

We can see that a list of NODE_BLOCK is created and that NODE_NEWLINE nodes are attached as headers (Fig.5).

Fig.5: NODE_BLOCK and NODE_NEWLINE

It means that a NODE_NEWLINE is attached to each statement (stmt), and when there are several statements, they become a list of NODE_BLOCK. Let’s also see the code.

▼ stmts

stmts           : none
                | stmt
                    {
                        $$ = newline_node($1);
                    }
                | stmts terms stmt
                    {
                        $$ = block_append($1, newline_node($3));
                    }

(parse.y)

newline_node() caps the statement with a NODE_NEWLINE, and block_append() appends it to the list. It’s straightforward. Let’s look only at the content of block_append().

block_append()

In this function, the error checks sit in the very middle and get in the way. Thus I’ll show the code without that part.

▼ block_append() (omitted)

static NODE*
block_append(head, tail)
    NODE *head, *tail;
{
    NODE *end;

    if (tail == 0) return head;
    if (head == 0) return tail;

    if (nd_type(head) != NODE_BLOCK) {
        end = NEW_BLOCK(head);
        end->nd_end = end;       /* (A-1) */
        fixpos(end, head);
        head = end;
    }
    else {
        end = head->nd_end;      /* (A-2) */
    }

    /* ……omitted…… */

    if (nd_type(tail) != NODE_BLOCK) {
        tail = NEW_BLOCK(tail);
        tail->nd_end = tail;
    }
    end->nd_next = tail;
    head->nd_end = tail->nd_end; /* (A-3) */
    return head;
}

(parse.y)

According to the previous syntax tree dump, NODE_BLOCK was a linked list that uses nd_next. Reading with that in mind, the code says “if either head or tail is not a NODE_BLOCK, wrap it with a NODE_BLOCK and join the lists to each other.”

Additionally, at (A-1 ~ 3), the nd_end of the NODE_BLOCK at the head of the list always points to the NODE_BLOCK at the tail of the list. This is probably so that we don’t have to traverse all the elements when adding an element to the tail (Fig.6). Put the other way around, NODE_BLOCK is suitable when you need to add elements later.

Fig.6: Appending is easy.
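To see why keeping nd_end in the head makes appending cheap, here is a minimal standalone sketch. It is an assumption-laden miniature rather than the parser’s code: Block, new_block() and block_join() are hypothetical names, the element is a plain int, and the node-type check and fixpos() call of the real block_append() are left out.

```c
#include <stdlib.h>

/* Hypothetical miniature of NODE_BLOCK: nd_head holds the element,
   nd_next links the list, and nd_end is meaningful only on the first
   node, where it points at the last node of the list. */
typedef struct Block {
    int nd_head;
    struct Block *nd_end;
    struct Block *nd_next;
} Block;

Block *new_block(int value)
{
    Block *b = malloc(sizeof *b);
    if (b == NULL) abort();
    b->nd_head = value;
    b->nd_next = NULL;
    b->nd_end = b;                  /* (A-1): a one-element list ends at itself */
    return b;
}

/* Join two lists without walking either of them. */
Block *block_join(Block *head, Block *tail)
{
    if (tail == NULL) return head;
    if (head == NULL) return tail;
    Block *end = head->nd_end;      /* (A-2): jump straight to the last node */
    end->nd_next = tail;
    head->nd_end = tail->nd_end;    /* (A-3): the head remembers the new end */
    return head;
}
```

Calling block_join(list, new_block(x)) repeatedly builds the same shape as the 7, 8, 9 dump above, and each call touches only the head and the current last node regardless of the list length.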

The two types of lists

Now, I’ve explained the outline so far. Because the structure of the syntax tree will also appear in large amounts in Part 3, we won’t go further as long as we are in Part 2. But before ending, there’s one more thing I’d like to talk about. It is about the two general-purpose lists.

The two general-purpose lists are BLOCK and LIST. BLOCK is, as previously described, a linked list of NODE_BLOCK that joins statements. LIST is, although it is called LIST, a list of NODE_ARRAY. This is what is used for array literals. LIST is also used to store the arguments of a method or the list of a multiple assignment.

As for the difference between the two lists, looking at how the node members are used helps to understand it.

NODE_BLOCK
    nd_head    holding an element
    nd_end     pointing to the NODE_BLOCK at the end of the list
    nd_next    pointing to the next NODE_BLOCK

NODE_ARRAY
    nd_head    holding an element
    nd_alen    the length of the list that follows this node
    nd_next    pointing to the next NODE_ARRAY

The usage differs only in the second members, nd_end and nd_alen. And this is exactly the reason each of the two node types exists. Since the size can be stored in NODE_ARRAY, we use an ARRAY list when the size of the list is frequently required. Otherwise, we use a BLOCK list, which is very fast to join. I don’t describe this topic in detail because the code that uses these lists is necessary to understand their significance and it is not shown here, but when that code appears in Part 3, I’d like you to recall this and think “Oh, this uses the length”.

Semantic Analysis

As I briefly mentioned at the beginning of Part 2, there are two types of analysis: appearance analysis and semantic analysis. The appearance analysis is mostly done by yacc; what remains is doing the semantic analysis inside actions.

Errors inside actions

What does semantic analysis mean precisely? For example, there are type checks in a language that has types. There are also checks that variables with the same name are not defined multiple times, that variables are not used before their definitions, that the procedure being used is defined, that return is not used outside of procedures, and so on. These are part of semantic analysis.

What kind of semantic analysis is done in the current ruby? Since error checks occupy almost all of the semantic analysis in ruby, searching for the places that generate errors seems a good way. In a parser written with yacc, yyerror() is supposed to be called when an error occurs. Put the other way around, there’s an error wherever yyerror() exists. So, I made a list of the places that call yyerror() inside the actions.

an expression not having its value (void value expression) at a place where a value is required
an alias of $n
BEGIN inside of a method
END inside of a method
return outside of methods
a local variable at a place where a constant is required
a class statement inside of a method
an invalid parameter variable ($gvar and CONST and such)
parameters with the same name appear twice
an invalid receiver of a singleton method (def ().method and such)
a singleton method definition on literals
an odd number of a list for hash literals
an assignment to self/nil/true/false/__FILE__/__LINE__
a constant assignment inside of a method
a multiple assignment inside of a conditional expression

These checks can roughly be categorized by purpose as follows:

for the better error message
in order not to make the rule too complex
the others (pure semantic analysis)

For example, “return outside of a method” is a check in order not to make the rule too complex. Since this error is a structural problem, it can be dealt with by the grammar. For example, it would be possible by defining the rules separately for inside and outside of methods and listing everything that is allowed and not allowed in each. But this is cumbersome in any case, and rejecting it in an action is far more concise.

And “an assignment to self” seems to be a check for the better error message. In comparison to “return outside of methods”, rejecting it by the grammar is much easier, but if it is rejected by the parser, the output would be just "parse error". Compared to that, the current

% ruby -e 'self = 1'
-e:1: Can't change the value of self
self = 1
      ^

error is much more friendly.

Of course, we cannot always say that a given check exists exactly “for this purpose”. For example, “return outside of methods” can also be considered a check “for the better error message”. The purposes overlap each other.

Now, the issue is “pure semantic analysis”; in Ruby there are few things that belong to this category. In a typed language, type analysis is a big event, but because variables are not typed in Ruby, it is meaningless. What stands out instead is the check that an expression has a value.

To put “having its value” precisely, it means “you can obtain a value as a result of evaluating it”. return and break do not have values by themselves. Of course, a value is passed to the place returned to, but no value is left at the place where return is written. Therefore, for example, the next expression is odd:

i = return(1)

Since this kind of expression is clearly due to a misunderstanding or a simple mistake, it’s better to reject it when compiling. Next, we’ll look at value_expr, which is one of the functions that check whether an expression takes a value.

value_expr()

value_expr() is the function that checks whether an expr has a value.

▼ value_expr()

static int
value_expr(node)
    NODE *node;
{
    while (node) {
        switch (nd_type(node)) {
          case NODE_CLASS:
          case NODE_MODULE:
          case NODE_DEFN:
          case NODE_DEFS:
            rb_warning("void value expression");
            return Qfalse;

          case NODE_RETURN:
          case NODE_BREAK:
          case NODE_NEXT:
          case NODE_REDO:
          case NODE_RETRY:
            yyerror("void value expression");
            /* or "control never reach"? */
            return Qfalse;

          case NODE_BLOCK:
            while (node->nd_next) {
                node = node->nd_next;
            }
            node = node->nd_head;
            break;

          case NODE_BEGIN:
            node = node->nd_body;
            break;

          case NODE_IF:
            if (!value_expr(node->nd_body)) return Qfalse;
            node = node->nd_else;
            break;

          case NODE_AND:
          case NODE_OR:
            node = node->nd_2nd;
            break;

          case NODE_NEWLINE:
            node = node->nd_next;
            break;

          default:
            return Qtrue;
        }
    }

    return Qtrue;
}

(parse.y)

Algorithm Summary: It sequentially checks the nodes of the tree; if it hits “an expression certainly not having its value”, the tree does not have a value. Then it reports that with rb_warning() or yyerror() and returns Qfalse. If it finishes traversing the entire tree without hitting any “expression not having its value”, the tree does have a value. Thus it returns Qtrue.

Here, notice that it does not always need to check the whole tree. For example, let’s assume value_expr() is called on the argument of a method. Here:

▼ check the value of arg by using value_expr()

arg_value       : arg
                    {
                        value_expr($1);
                        $$ = $1;
                    }

(parse.y)

Inside this argument $1, there can also be other nested method calls. But the arguments of the inner method must have been already checked with value_expr(), so you don’t have to check them again.

Let’s think more generally. Assume an arbitrary grammar element A exists; if value_expr() has been called on all of its composing elements, the necessity to check the element A again disappears.

Then, for example, how about if? Can it be handled as if value_expr() had already been called for all its elements? To give only the bottom line, it can’t. That is because, since if is a statement (which does not use a value), its main body should not have to return a value. For example, in the next case:

def method
  if true
    return 1
  else
    return 2
  end
  5
end

This if statement does not need a value. But in the next case, its value is necessary.

def method( arg )
  tmp = if arg
        then 3
        else 98
        end
  tmp * tmp / 3.5
end

So, in this case, the if statement must be checked when checking the entire assignment expression. This kind of thing is laid out in the switch statement of value_expr().
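The shape of that switch can be tried out on a much smaller tree type. The following is a sketch under assumptions, not the real function: the Kind enum, the Node struct with its two generic links a and b, and the name has_value() are all hypothetical, and error reporting is dropped so that only the traversal remains.

```c
#include <stddef.h>

/* Hypothetical node kinds, far fewer than ruby's real NODE_xxx types. */
enum Kind { K_LIT, K_RETURN, K_IF, K_AND, K_BLOCK };

typedef struct Node {
    enum Kind kind;
    struct Node *a;     /* K_IF: then-body   K_AND: left    K_BLOCK: element */
    struct Node *b;     /* K_IF: else-body   K_AND: right   K_BLOCK: next    */
} Node;

/* Returns 1 if evaluating `node` can leave a value behind, else 0.
   The real value_expr() also reports the problem before returning Qfalse. */
int has_value(const Node *node)
{
    while (node != NULL) {
        switch (node->kind) {
        case K_RETURN:
            return 0;                       /* nothing is left at this spot */
        case K_BLOCK:
            while (node->b != NULL)         /* only the last statement counts */
                node = node->b;
            node = node->a;
            break;
        case K_IF:
            if (!has_value(node->a)) return 0;  /* then-body must qualify too */
            node = node->b;
            break;
        case K_AND:
            node = node->b;                 /* only the second operand is inspected */
            break;
        default:
            return 1;
        }
    }
    return 1;
}
```

With this model, an if whose else branch is a return is rejected when it is used as a value, while the same if is never inspected as long as nobody asks for its value, which mirrors the two def method examples above.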

Removing Tail Recursion

By the way, when looking over the whole of value_expr(), we can see that the following pattern appears frequently:

while (node) {
    switch (nd_type(node)) {
      case NODE_XXXX:
        node = node->nd_xxxx;
        break;
         :
         :
    }
}

This expression carries the same meaning after being modified to the following:

return value_expr(node->nd_xxxx)

Code like this, which makes a recursive call just before return, is called a tail recursion. It is known that this can generally be converted to goto. This method is often used when optimizing. As for Scheme, the specification requires that tail recursions be removed by language processors. This is because recursion is often used instead of loops in Lisp-like languages.

However, be careful: it is a tail recursion only when “calling just before return”. For example, take a look at the NODE_IF case of value_expr(),

if (!value_expr(node->nd_body)) return Qfalse;
node = node->nd_else;
break;

As shown above, the first call is a recursive call. Rewriting this into the form that uses return,

return value_expr(node->nd_body) && value_expr(node->nd_else);

If the left value_expr() is true, the right value_expr() is executed as well. In this case, the left value_expr() is not “just before” return. Therefore, it is not a tail recursion. Hence, it can’t be converted to goto.
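As a generic illustration outside of ruby’s source, the three cases can be put side by side on an ordinary linked list. The Cell type and the function names are hypothetical and chosen only for this sketch.

```c
#include <stddef.h>

typedef struct Cell {
    int value;
    struct Cell *next;
} Cell;

/* Tail-recursive: the recursive call is the very last thing that happens.
   Expects a non-empty list. */
int last_recursive(const Cell *c)
{
    if (c->next == NULL) return c->value;
    return last_recursive(c->next);
}

/* The same function after the tail call has been turned into a jump:
   rebind the argument, then go back to the top. */
int last_loop(const Cell *c)
{
    while (c->next != NULL)
        c = c->next;
    return c->value;
}

/* Not a tail call: the addition still has to run after the call returns,
   so the work cannot be replaced by a plain jump. */
int sum_recursive(const Cell *c)
{
    if (c == NULL) return 0;
    return c->value + sum_recursive(c->next);
}
```

last_recursive() and last_loop() correspond to the return value_expr(...) form and the while/switch form respectively; sum_recursive() is in the same position as the first call in the NODE_IF case, where something remains to be done after the call comes back.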

The whole picture of the value check

As for value checks, we won’t read the functions further. You might think it’s too early, but all of the other functions, just like value_expr(), only traverse and check nodes step by step, so they are not interesting at all. However, I’d like to cover the whole picture at least, so I’ll finish this section by just showing the call graph of the relevant functions (Fig.7).

Fig.7: the call graph of the value check functions

Local Variables

Local Variable Definitions

Variable definitions in Ruby come in quite a variety. Constants and class variables are defined on the first assignment. As for instance variables and global variables, all names can be considered already defined, so you can refer to them without assigning beforehand (although it produces warnings).

The definitions of local variables are again completely different from all of the above. A local variable is defined when its assignment appears in the program. For example, as follows:

lvar = nil
p lvar      # being defined

In this case, as the assignment to lvar is written on the first line, lvar is defined at that moment. When it is undefined, it ends up with a runtime exception NameError as follows:

% ruby lvar.rb
lvar.rb:1: undefined local variable or method `lvar' for #<Object:0x40163a9c> (NameError)

Why does it say "local variable or method"? As for methods, the parentheses around the arguments can be omitted when calling, so when there are no arguments, a call can’t be distinguished from a local variable. To resolve this situation, ruby tries to call it as a method when it finds an undefined local variable. Then if the corresponding method is not found, it generates an error such as the one above.

By the way, it is defined when “it appears”, which means it is defined even if the assignment is never executed. The initial value of a defined variable is nil.

if false
  lvar = "this assignment will never be executed"
end
p lvar   # shows nil

Moreover, since it is defined “when” it “appears”, the definition has to come before the reference in the symbol sequence. For example, in the next case, it is not defined.

p lvar       # not defined!
lvar = nil   # although appearing here...

Be careful about the point of “in the symbol sequence”. It has absolutely nothing to do with the order of evaluation. For example, in the next code, naturally the condition expression is evaluated first, but in the symbol sequence, at the moment when p appears the assignment to lvar has not appeared yet. Therefore, this produces NameError.

p(lvar) if lvar = true

What we’ve learned so far is that local variables are strongly influenced by appearance. When a symbol sequence that expresses an assignment appears, the variable is defined in order of appearance. Based on this information, we can infer that ruby seems to define local variables while parsing, because the order of the symbol sequence no longer exists after leaving the parser. And in fact, it is true. In ruby, the parser defines local variables.

Block Local Variables

The local variables newly defined in an iterator block are called block local variables or dynamic variables. Block local variables are, in the language specification, identical to local variables. However, the two differ in their implementations. We’ll look at how they differ from now on.

The data structure

We’ll start with the local variable table struct local_vars.

▼ struct local_vars

static struct local_vars {
    ID *tbl;                    /* the table of local variable names */
    int nofree;                 /* whether it is used from outside */
    int cnt;                    /* the size of the tbl array */
    int dlev;                   /* the nesting level of dyna_vars */
    struct RVarmap* dyna_vars;  /* block local variable names */
    struct local_vars *prev;
} *lvtbl;

(parse.y)

The member name prev indicates that struct local_vars is a linked list in the reverse direction. … Based on this, we can expect a stack. The simultaneously declared global variable lvtbl points to the local_vars that is the top of that stack.

And struct RVarmap is defined in env.h; it is available to other files and is also used by the evaluator. This is used to store the block local variables.

▼ struct RVarmap

struct RVarmap {
    struct RBasic super;
    ID id;                  /* the variable name */
    VALUE val;              /* its value */
    struct RVarmap *next;
};

(env.h)

Since there’s a struct RBasic at the top, this is a Ruby object. It means it is managed by the garbage collector. And since it is joined by the next member, it is probably a linked list.

Based on the observations we’ve made and the information that will be explained, Fig.8 illustrates the image of both structs while the parser is running.

Fig.8: The image of local variable tables at runtime

Local Variable Scope

When looking over the list of function names in parse.y, we can find functions such as local_push(), local_pop() and local_cnt(). However you think about it, they appear to be related to local variables. Moreover, because the names are push and pop, this is clearly a stack. So first, let’s find the places where these functions are used.

▼ local_push() local_pop() used examples

        | kDEF fname
            {
                $<id>$ = cur_mid;
                cur_mid = $2;
                in_def++;
                local_push(0);
            }
          f_arglist
          bodystmt
          kEND
            {
                /* NOEX_PRIVATE for toplevel */
                $$ = NEW_DEFN($2, $4, $5,
                      class_nest ? NOEX_PUBLIC : NOEX_PRIVATE);
                if (is_attrset_id($2))
                    $$->nd_noex = NOEX_PUBLIC;
                fixpos($$, $4);
                local_pop();
                in_def--;
                cur_mid = $<id>3;
            }

(parse.y)

I found a use at def. They can also be found in class definitions, singleton class definitions and module definitions. In other words, these are the places where the scope of local variables is cut. Moreover, as for how they are used, it pushes where the method definition starts and pops when the definition ends. This means, as we expected, it is almost certain that the functions starting with local_ are related to local variables. And it also reveals that the part between push and pop is probably a local variable scope.

Moreover, I also searched for local_cnt().

▼ NEW_LASGN()

#define NEW_LASGN(v,val) rb_node_newnode(NODE_LASGN,v,val,local_cnt(v))

(node.h)

This is found in node.h. Even though there are also uses in parse.y, I went and found it in another file. I was probably getting desperate.

This NEW_LASGN is “new local assignment”. This should mean the node of an assignment to a local variable. Also considering where it is used, the parameter v is apparently the local variable name. val is probably (a syntax tree that represents) the right-hand side value.

Based on the above observations, local_push() is at the beginning of the local variable scope, local_cnt() adds a local variable when a local variable assignment appears along the way, and local_pop() is used when ending the scope. This perfect scenario emerges (Fig.9).

Fig.9: the flow of the local variable management

Then, let’s look at the content of the functions.

push and pop

▼ local_push()

static void
local_push(top)
    int top;
{
    struct local_vars *local;

    local = ALLOC(struct local_vars);
    local->prev = lvtbl;
    local->nofree = 0;
    local->cnt = 0;
    local->tbl = 0;
    local->dlev = 0;
    local->dyna_vars = ruby_dyna_vars;
    lvtbl = local;
    if (!top) {
        /* preserve the variable table of the previous scope into val */
        rb_dvar_push(0, (VALUE)ruby_dyna_vars);
        ruby_dyna_vars->next = 0;
    }
}

(parse.y)

As we expected, it seems that struct local_vars is used as a stack. Also, we can see that lvtbl points to the top of the stack. The lines related to rb_dvar_push() will be read later, so they are left untouched for now.

Subsequently, we’ll look at local_pop() and local_tbl() at the same time.

▼ local_tbl local_pop

static ID*
local_tbl()
{
    lvtbl->nofree = 1;
    return lvtbl->tbl;
}

static void
local_pop()
{
    struct local_vars *local = lvtbl->prev;

    if (lvtbl->tbl) {
        if (!lvtbl->nofree) free(lvtbl->tbl);
        else lvtbl->tbl[0] = lvtbl->cnt;
    }
    ruby_dyna_vars = lvtbl->dyna_vars;
    free(lvtbl);
    lvtbl = local;
}

(parse.y)

I’d like you to look at local_tbl(). This is the function that obtains the current local variable table (lvtbl->tbl). By calling it, the nofree of the current table becomes true. The meaning of nofree is naturally “Don’t free()”. In other words, this is like reference counting: “this table will be used, so please don’t free() it”. Put the other way around, when local_tbl() was never called for a table, that table is freed and discarded at the moment it is popped. For example, this probably happens for a method without any local variables.

However, the “necessary table” here means lvtbl->tbl. As you can see, lvtbl itself is freed at the moment it is popped. It means only the generated lvtbl->tbl is used in the evaluator. So the structure of lvtbl->tbl becomes important. Let’s look at local_cnt(), the function that (seemingly) adds variables, which will probably help us understand that structure.

And before that, I’d like you to remember that lvtbl->cnt is stored at index 0 of lvtbl->tbl.

Adding variables

The function that (seemingly) adds a local variable is local_cnt().

▼ local_cnt()

static int
local_cnt(id)
    ID id;
{
    int cnt, max;

    if (id == 0) return lvtbl->cnt;

    for (cnt=1, max=lvtbl->cnt+1; cnt<max; cnt++) {
        if (lvtbl->tbl[cnt] == id) return cnt-1;
    }
    return local_append(id);
}

(parse.y)

This scans lvtbl->tbl and searches for an entry equal to id. If one is found, it simply returns cnt-1. If nothing is found, it does local_append(). local_append() must be, as it is called append, the procedure that appends. In other words, local_cnt() checks whether the variable was already registered; if it was not, it adds it by using local_append(), and in either case returns its number.

What is the meaning of the return value of this function? lvtbl->tbl seems to be an array of the variables, so there are one-to-one correspondences between the variable names and “their index – 1 (cnt-1)” (Fig.10).

Fig.10: The correspondences between the variable names and the return values

Moreover, since this return value is calculated so that it starts from 0, the local variable space is probably an array, and this returns the index to access that array. If it were not, then as with instance variables or constants, (the ID of) the variable name could have been used as a key in the first place.

You might wonder why it avoids index 0 (the loop starts from cnt=1); it is probably in order to store a value there at local_pop().

Based on the knowledge we’ve gained, we can understand the role of local_append() without actually looking at its content. It registers a local variable and returns “(the index of the variable in lvtbl->tbl) – 1”. It is shown below; let’s make sure.

▼ local_append()

static int
local_append(id)
    ID id;
{
    if (lvtbl->tbl == 0) {
        lvtbl->tbl = ALLOC_N(ID, 4);
        lvtbl->tbl[0] = 0;
        lvtbl->tbl[1] = '_';
        lvtbl->tbl[2] = '~';
        lvtbl->cnt = 2;
        if (id == '_') return 0;
        if (id == '~') return 1;
    }
    else {
        REALLOC_N(lvtbl->tbl, ID, lvtbl->cnt+2);
    }

    lvtbl->tbl[lvtbl->cnt+1] = id;
    return lvtbl->cnt++;
}

(parse.y)

It seems definitely true. lvtbl->tbl is an array of the local variable names, and its index – 1 is the return value (the local variable ID).

Note that it increases lvtbl->cnt. Since the code that increases lvtbl->cnt exists only here, its meaning can be decided from this code alone. Then, what is the meaning? Since “lvtbl->cnt increases by 1 when a new variable is added”, “lvtbl->cnt holds the number of local variables in this scope”.

Finally, I’ll explain tbl[1] and tbl[2]. These '_' and '~' are, as you can guess if you are familiar with Ruby, the special variables named $_ and $~. Though they look identical to global variables, they are actually local variables. Even if you don’t explicitly use them, these variables are implicitly assigned when methods such as Kernel#gets are called, so their space must always be allocated.
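The table layout described here can be condensed into a small standalone model. Everything in it is an assumption made for illustration: LocalTable, local_index() and local_seal() are hypothetical names, ID is taken to be an unsigned long, plain realloc() stands in for ALLOC_N/REALLOC_N, and the automatic registration of '_' and '~' is left to the caller instead of being done on the first allocation.

```c
#include <stdlib.h>

typedef unsigned long ID;   /* assumption: an integer type wide enough for an ID */

/* Hypothetical cut-down version of struct local_vars. */
typedef struct {
    ID *tbl;    /* tbl[0] is reserved, names start at tbl[1] */
    int cnt;    /* number of variables registered so far */
} LocalTable;

/* Look `id` up and register it if it is new. Either way, return its
   zero-based slot number, which is (index in tbl) - 1. */
int local_index(LocalTable *t, ID id)
{
    for (int i = 1; i <= t->cnt; i++) {
        if (t->tbl[i] == id) return i - 1;
    }
    /* Room for tbl[0] plus cnt existing names plus the new one. */
    ID *grown = realloc(t->tbl, sizeof(ID) * (size_t)(t->cnt + 2));
    if (grown == NULL) abort();
    t->tbl = grown;
    t->tbl[t->cnt + 1] = id;
    return t->cnt++;
}

/* What local_pop() does to a table that is kept: store the count in tbl[0]. */
void local_seal(LocalTable *t)
{
    if (t->tbl != NULL) t->tbl[0] = (ID)t->cnt;
}
```

Starting from an empty table and registering '_' and '~' first reproduces the numbering in the text: the two special variables get slots 0 and 1, and the first ordinary variable gets slot 2. Asking for an already registered name returns the same slot again without growing the table.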

Summary of local variables

Since the description of local variables was complex in various ways, let’s summarize it.

First, it seems that local variables differ from the other variables in that they are not managed with st_table. Then, where are they stored? It seems the answer is an array. Moreover, a different array is used for each scope.

The array is lvtbl->tbl, and index 0 holds the lvtbl->cnt that is set at local_pop(). In other words, it holds the number of local variables. Indexes 1 and above hold the local variable names defined in the scope. Fig.11 shows the final appearance we expect.

Fig.11: correspondences between local variable names and the return values

BlockLocalVariablesTherestisdyna_varswhichisamemberofstructlocal_vars.Inotherwords,thisisabouttheblocklocalvariables.Ithoughtthattheremustbethefunctionstodosomethingwiththis,lookedoverthelistofthefunctionnames,andfoundthemasexpected.Therearethesuspiciousfunctionsnameddyna_push()dyna_pop()dyna_in_block().Moreover,hereistheplacewheretheseareused.

▼anexampleusingdyna_pushdyna_pop

1651brace_block:'{'1652{1653$<vars>$=dyna_push();1654}1655opt_block_var1656compstmt'}'1657{1658$$=NEW_ITER($3,0,$4);1659fixpos($$,$4);1660dyna_pop($<vars>2);1661}

(parse.y)

pushatthebeginningofaniteratorblock,popattheend.Thismust

Page 602: Ruby Hacking Guide

betheprocessofblocklocalvariables.

Now, we are going to look at the functions.

▼ dyna_push()

static struct RVarmap*
dyna_push()
{
    struct RVarmap* vars = ruby_dyna_vars;

    rb_dvar_push(0, 0);
    lvtbl->dlev++;
    return vars;
}

(parse.y)

Incrementing lvtbl->dlev seems to mark the existence of a block local variable scope. Meanwhile, rb_dvar_push() is …

▼ rb_dvar_push()

void
rb_dvar_push(id, value)
    ID id;
    VALUE value;
{
    ruby_dyna_vars = new_dvar(id, value, ruby_dyna_vars);
}

(eval.c)

It creates a struct RVarmap that has the variable name (id) and the value (value) as its members, and adds it to the top of the global variable ruby_dyna_vars. This is, yet again, the cons form. In dyna_push(), ruby_dyna_vars is not set aside, so it seems to add directly onto the ruby_dyna_vars of the previous scope.

Moreover, the value of the id member of the RVarmap added here is 0. Although it was not discussed in depth in this book, an ID of ruby is never 0 as long as it is created normally by rb_intern(). Thus, we can infer that this RVarmap, like NUL or NULL, probably serves as a sentinel. Under this assumption, we can explain why a variable holder (RVarmap) is added even though no variable has been added.

Next, dyna_pop().

▼ dyna_pop()

static void
dyna_pop(vars)
    struct RVarmap* vars;
{
    lvtbl->dlev--;
    ruby_dyna_vars = vars;
}

(parse.y)

By decrementing lvtbl->dlev, it records the fact that the block local variable scope has ended. Something also seems to be done with the argument; we’ll look at this all together later.

The place where a block local variable is added has not appeared yet. Something like local_cnt() for local variables is missing. So I grepped a lot for dvar and dyna, and found this code.

▼ assignable() (partial)

static NODE*
assignable(id, val)
    ID id;
    NODE *val;
{
                            :
        rb_dvar_push(id, Qnil);
        return NEW_DASGN_CURR(id, val);

(parse.y)

assignable() is the function that creates nodes related to assignments; this excerpt is the fragment of that function that deals with block local variables. It seems to add a new variable (to ruby_dyna_vars) by using the rb_dvar_push() that we’ve just seen.

ruby_dyna_vars in the parser

Now, taking all of the above into consideration, let’s imagine what ruby_dyna_vars looks like at the moment when a local variable scope has finished being parsed.

First, as I said previously, the RVarmap with id=0 that is added at the beginning of a block scope is a sentinel which represents a break between two block scopes. We’ll call this “the header of ruby_dyna_vars”.

Next, among the previously shown actions of the rule for the iterator block, I’d like you to focus on this part:

$<vars>$ = dyna_push();    /* what is assigned to $<vars>$ ... */
        :
        :
dyna_pop($<vars>2);        /* ... appears as $<vars>2 */

dyna_push() returns the ruby_dyna_vars at that moment. dyna_pop() puts its argument into ruby_dyna_vars. This means ruby_dyna_vars is saved and restored for each block local variable scope. Therefore, when parsing the following program,

iter {
    a = nil
    iter {
        b = nil
        iter {
            c = nil
            # nesting level 3
        }
        bb = nil
        # nesting level 2
        iter {
            e = nil
        }
    }
    # nesting level 1
}

Fig.12 shows the ruby_dyna_vars in this situation.

Fig.12: ruby_dyna_vars when all scopes have finished being parsed

This structure is fairly smart, because the variables of the outer levels can naturally be reached by traversing the whole list, even if the nesting level is deep. The search is simpler than it would be with a separate table for each level.

Also, in the figure it looks as if bb hangs at a strange place, but this is correct. When a variable is found at a nest level that has decreased after once increasing, it is attached after the list of the original level. Moreover, in this way, the local variable rule that “only the variables which already exist in the symbol sequence are defined” is expressed in a natural form.
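The save-and-restore behaviour of this list can be tried out with a small stand-alone sketch. It is not the code of parse.y or eval.c: struct varmap, block_push(), block_pop() and dvar_defined() are simplified, hypothetical stand-ins, and nodes of a finished scope are simply abandoned here, where ruby relies on its garbage collector.

```c
#include <stdlib.h>

typedef unsigned long ID;    /* stand-in for ruby's ID type (assumption) */

struct varmap {              /* simplified stand-in for struct RVarmap */
    ID id;                   /* 0 marks the header (sentinel) of a block scope */
    struct varmap *next;
};

static struct varmap *dyna_vars = NULL;  /* plays the role of ruby_dyna_vars */

/* cons a new entry onto the head of the list */
static void dvar_push(ID id)
{
    struct varmap *v = malloc(sizeof *v);
    if (!v) abort();
    v->id = id;
    v->next = dyna_vars;
    dyna_vars = v;
}

/* Enter a block scope: remember the current head, then add a sentinel. */
static struct varmap *block_push(void)
{
    struct varmap *saved = dyna_vars;
    dvar_push(0);
    return saved;
}

/* Leave a block scope: restore the head saved by block_push(). */
static void block_pop(struct varmap *saved)
{
    dyna_vars = saved;
}

/* A variable is visible if it appears anywhere on the chain. */
static int dvar_defined(ID id)
{
    for (struct varmap *v = dyna_vars; v; v = v->next)
        if (v->id == id) return 1;
    return 0;
}
```

A single traversal answers "is this name visible here?" for every enclosing block at once, which is the simplicity the text points out.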

And finally, at each boundary of local variable scopes (not of block local variable scopes), this whole link is saved to or restored from lvtbl->dyna_vars. I’d like you to go back a little and check local_push() and local_pop().

By the way, although creating the ruby_dyna_vars list was a huge task, the list itself is not used by the evaluator. It is used only to check the existence of variables and will be garbage collected as soon as parsing is finished. After entering the evaluator, another chain is created all over again. There’s quite a deep reason for this, … and we’ll look at it once again in Part 3.

The original work is Copyright © 2002 - 2004 Minero AOKI. Translated by Vincent ISAMBART and Clifford Escobar CAOILE. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License


Chapter 13: Structure of the evaluator

Outline

Interface

We are not familiar with the word “Hyo-ka-ki” (evaluator). Literally, it must be a “-ki” (device) for “hyo-ka” (evaluating). Then, what is “hyo-ka”?

“Hyo-ka” is the standard translation of “evaluate”. However, when the subject is programming languages, it can be considered a mistranslation. It’s hard to avoid the word “hyo-ka” giving the impression of judging “whether something is good or bad”.

“Evaluate” in the context of programming languages has nothing to do with “good or bad”; its meaning is closer to “speculating” or “executing”. The origin of “evaluate” is the Latin “ex + value + ate”. Translated directly, it is “turn it into a value”. This may be the simplest way to understand it: to determine the value of an expression expressed in text.

Frankly speaking, the bottom line is that evaluating is executing a written expression and getting its result. Then why is it not just called “execute”? It’s because evaluating is not only executing.

For example, in an ordinary programming language, when we write “3”, it is dealt with as the integer 3. This situation is sometimes described as “the result of evaluating “3” is 3”. It’s hard to say that a constant expression is executed, but it is certainly an evaluation. It would be perfectly fine for there to exist a programming language in which the character “3”, when evaluated, is dealt with as the integer 6.

I’ll introduce another example. When an expression consists of multiple constants, the constants are sometimes calculated during the compiling process (constant folding). We usually don’t call this “executing”, because executing refers to the process in which the created binary is running. However, no matter when the calculation is done, you’ll get the same result from the same program.

In other words, “evaluating” is usually equal to “executing”, but essentially “evaluating” is different from “executing”. For now, this is the only point I’d like you to remember.

The characteristics of ruby's evaluator.

The biggest characteristic of ruby’s evaluator is that, as is true of the whole ruby interpreter, the difference in expressive power between C-level code (extension libraries) and Ruby-level code is small. In ordinary programming languages, the set of interpreter features usable from extension libraries is usually very limited, but there are remarkably few limits in ruby. Defining classes, defining methods, and calling methods without limitation can all be taken for granted. We can also use exception handling and iterators. Furthermore, threads.

But we have to pay for this convenience somewhere. Some code is weirdly hard to implement, some code has a lot of overhead, and there are many places where almost the same thing is implemented twice, once for C and once for Ruby.

Additionally, ruby is a dynamic language, which means that you can construct and evaluate a string at runtime. This is done by eval, a function-like method. As you’d expect, it is named after “evaluate”. By using it, you can even do something like this:

lvar = 1
answer = eval("lvar + lvar")    # the answer is 2

There are also Module#module_eval and Object#instance_eval, and each behaves slightly differently. I’ll describe them in detail in Chapter 17: Dynamic evaluation.

eval.c

The evaluator is implemented in eval.c. However, this eval.c is a really huge file: it has 9000 lines, its size is 200K bytes, and it contains 309 functions. It is a hard one to fight against. When a file gets this big, it’s impossible to figure out its structure just by looking over it.

So what can we do? First, the bigger the file, the less likely it is that its content is not divided up at all. In other words, its inside must be modularized into small portions. Then, how can we find the modules? I’ll list some ways.

The first way is to print the list of the defined functions and look at their prefixes. rb_dvar_, rb_mod_, rb_thread: there are plenty of functions with these prefixes. Each prefix clearly indicates a group of functions of the same type.

Alternatively, as we can tell when looking at the code of the class libraries, Init_xxxx() is always put at the end of a block in ruby. Therefore, Init_xxxx() also indicates a break between modules.

Additionally, the names are obviously important, too. Since eval(), rb_eval() and eval_node() appear close to each other, we naturally think there should be a deep relationship among them.

Finally, in the source code of ruby, the definitions of types or variables and the declarations of prototypes often indicate a break between modules.

Keeping these points in mind while looking, it seems that eval.c can be mainly divided into the modules listed below:

Safe Level                   already explained in Chapter 7: Security
Method Entry Manipulations   finding or deleting syntax trees which are actual method bodies
Evaluator Core               the heart of the evaluator, with rb_eval() at its center
Exception                    generation of exceptions and creation of backtraces
Method                       the implementation of method calls
Iterator                     the implementation of functions that are related to blocks
Load                         loading and evaluating external files
Proc                         the implementation of Proc
Thread                       the implementation of Ruby threads

Among them, “Load” and “Thread” are parts that essentially should not be in eval.c. They are in eval.c merely because of the restrictions of the C language. To put it more precisely, they need macros such as PUSH_TAG defined in eval.c. So, I decided to exclude these two topics from Part 3 and deal with them in Part 4. And it’s probably all right if I don’t explain the safe level here, because I’ve already done so in Part 1.

Excluding the above three, six items are left to be described. The table below shows the corresponding chapter for each of them:

Method Entry Manipulations   the next chapter: Context
Evaluator Core               the whole of Part 3
Exception                    this chapter
Method                       Chapter 15: Methods
Iterator                     Chapter 16: Blocks
Proc                         Chapter 16: Blocks

From main by way of ruby_run to rb_eval

Call Graph

The true core of the evaluator is a function called rb_eval(). In this chapter, we will follow the path from main() to that rb_eval(). First of all, here is a rough call graph around rb_eval:

main                         ....main.c
    ruby_init                ....eval.c
        ruby_prog_init       ....ruby.c
    ruby_options             ....eval.c
        ruby_process_options ....ruby.c
    ruby_run                 ....eval.c
        eval_node
            rb_eval
                *
        ruby_stop

I put the file names on the right side where the flow moves to another file. Looking at this carefully, the first thing we notice is that functions of eval.c call back into functions of ruby.c.

I wrote “calling back” because main.c and ruby.c are, relatively speaking, for the implementation of the ruby command. eval.c is the implementation of the evaluator itself, which keeps a little distance from the ruby command. In other words, eval.c is supposed to be used by ruby.c, and calling functions of ruby.c from eval.c makes eval.c less independent.

Then, why is it this way? It’s mainly because of the restrictions of the C language. Because functions such as ruby_prog_init() and ruby_process_options() start to use the API of the ruby world, an exception may occur. However, in order to stop a Ruby exception, it’s necessary to use the macro named PUSH_TAG(), which can only be used in eval.c. In other words, essentially, ruby_init() and ruby_run() should have been defined in ruby.c.

Then, why isn’t PUSH_TAG an extern function or something else that is available to other files? Actually, PUSH_TAG can only be used as a pair with POP_TAG, as follows:

PUSH_TAG();
/* do lots of things */
POP_TAG();

Because of their implementation, the two macros must be put in the same function. It’s possible to implement them so that they can be split across different functions, but that was not done because it would be slower.

The next thing we notice is that the sequential calls of the functions named ruby_xxxx from main() seem very meaningful. Since they are so obviously symmetric, it would be odd if there were no relationship between them.

Actually, these three functions are deeply related. Simply put, all three are “built-in Ruby interfaces”. That is, they are used only when creating a command with a built-in ruby interpreter and not when writing extension libraries. Since the ruby command itself can in theory be considered one of the programs with Ruby built in, it is natural for it to use these interfaces.

What is the ruby_ prefix?

So far, all of ruby’s functions have been prefixed with rb_. Why are there two types, rb_ and ruby_? I investigated but could not understand the difference, so I asked directly. The answer was, “ruby_ is for the auxiliary functions of the ruby command and rb_ is for the official interfaces”.

“Then, why do variables like ruby_scope start with ruby_?”, I asked further. It seems this is just a coincidence. Variables like ruby_scope were originally named the_xxxx, but in the middle of version 1.3 there was a change to add prefixes to all interfaces. At that time ruby_ was added to the “may-be-internals-for-some-reasons” variables.

The bottom line is that ruby_ is attached to things that support the ruby command or to internal variables, and rb_ is attached to the official interfaces of the ruby interpreter.

main()

First, straightforwardly, I’ll start with main(). It is nice that this is very short.

▼ main()

int
main(argc, argv, envp)
    int argc;
    char **argv, **envp;
{
#if defined(NT)
    NtInitialize(&argc, &argv);
#endif
#if defined(__MACOS__) && defined(__MWERKS__)
    argc = ccommand(&argv);
#endif

    ruby_init();
    ruby_options(argc, argv);
    ruby_run();
    return 0;
}

(main.c)

The NT in #if defined(NT) is obviously the NT of Windows NT. But somehow NT is also defined on Win9x, so it really means the Win32 environment. NtInitialize() initializes argc, argv and the socket system (WinSock) for Win32. Because this function only does initialization, it’s not interesting and not related to the main topic. Thus, I omit it.

And __MACOS__ is not “Ma-Ko-Su” but Mac OS. In this case, it means Mac OS 9 and before, and it does not include Mac OS X. Even though such an #ifdef remains, as I wrote at the beginning of this book, the current version cannot run on Mac OS 9 and before. It’s just a legacy from when ruby was able to run on it. Therefore, I also omit this code.

By the way, as readers familiar with the C language probably know, identifiers starting with an underscore are reserved for the system libraries or the OS. However, although they are called “reserved”, using one almost never results in an error; with a slightly weird cc, though, it could. One example is the cc of HP-UX. HP-UX is a UNIX made by HP. If anyone claims that HP-UX is not weird, I would deny it out loud.

Anyway, conventionally, we don’t define such identifiers in user applications.

Now, I’ll start to briefly explain the built-in Ruby interfaces.

ruby_init()

ruby_init() initializes the Ruby interpreter. Since only a single interpreter of the current Ruby can exist in a process, it needs neither arguments nor a return value. This point is generally considered a “lack of features”.

When there’s only a single interpreter, what suffers more than anything is the development environment side: namely, applications such as irb, RubyWin, and RDE. Even when a rewritten program is reloaded, the classes which are supposed to be deleted remain. Countering this with the reflection API is not impossible but requires a lot of effort.

However, it seems that Mr. Matsumoto (Matz) purposefully limits the number of interpreters to one. The reason seems to be that “it’s impossible to initialize completely”. For instance, “loaded extension libraries cannot be removed” is given as an example.

The code of ruby_init() is omitted because it’s unnecessary to read.

ruby_options()

ruby_options() is what parses the command-line options for the Ruby interpreter. Of course, depending on the command, we do not have to use it.

Inside this function, -r (load a library) and -e (pass a program from the command line) are processed. This is also where the file passed as a command-line argument is parsed as a Ruby program.

The ruby command reads the main program from a file if one was given, otherwise from stdin. After that, using rb_compile_string() or rb_compile_file(), introduced in Part 2, it compiles the text into a syntax tree. The result is set into the global variable ruby_eval_tree.

I also omit the code of ruby_options() because it just does the necessary things one by one and is not interesting.

ruby_run()

Finally, ruby_run() starts to evaluate the syntax tree which was set in ruby_eval_tree. We don’t always need to call this function either. Other than ruby_run(), for instance, we can evaluate a string by using a function named rb_eval_string().

▼ ruby_run()

void
ruby_run()
{
    int state;
    static int ex;
    volatile NODE *tmp;

    if (ruby_nerrs > 0) exit(ruby_nerrs);

    Init_stack((void*)&tmp);
    PUSH_TAG(PROT_NONE);
    PUSH_ITER(ITER_NOT);
    if ((state = EXEC_TAG()) == 0) {
        eval_node(ruby_top_self, ruby_eval_tree);
    }
    POP_ITER();
    POP_TAG();

    if (state && !ex) ex = state;
    ruby_stop(ex);
}

(eval.c)

We can see the macros PUSH_xxxx(), but we can ignore them for now. I’ll explain them later when the time comes. The only important thing here is eval_node(). Its content is:

▼ eval_node()

static VALUE
eval_node(self, node)
    VALUE self;
    NODE *node;
{
    NODE *beg_tree = ruby_eval_tree_begin;

    ruby_eval_tree_begin = 0;
    if (beg_tree) {
        rb_eval(self, beg_tree);
    }

    if (!node) return Qnil;
    return rb_eval(self, node);
}

(eval.c)

This calls rb_eval() on ruby_eval_tree. ruby_eval_tree_begin stores the statements registered by BEGIN. But this is also not important.

And ruby_stop(), called inside ruby_run(), terminates all threads, finalizes all objects, checks exceptions and, in the end, calls exit(). This is also not important, so we won’t look at it.

rb_eval()

Outline

Now, rb_eval(). This function is the very core of ruby. One rb_eval() call processes a single NODE, and the whole syntax tree is processed by calling it recursively. (Fig.1)

Fig.1: rb_eval

rb_eval() is, like yylex(), made of a huge switch statement that branches on the type of each node. First, let’s look at the outline.

▼ rb_eval() Outline

static VALUE
rb_eval(self, n)
    VALUE self;
    NODE *n;
{
    NODE *nodesave = ruby_current_node;
    NODE * volatile node = n;
    int state;
    volatile VALUE result = Qnil;

#define RETURN(v) do { \
    result = (v);      \
    goto finish;       \
} while (0)

  again:
    if (!node) RETURN(Qnil);

    ruby_last_node = ruby_current_node = node;
    switch (nd_type(node)) {
      case NODE_BLOCK:
        .....
      case NODE_POSTEXE:
        .....
      case NODE_BEGIN:
                   :
        (plenty of case statements)
                   :
      default:
        rb_bug("unknown node type %d", nd_type(node));
    }
  finish:
    CHECK_INTS;
    ruby_current_node = nodesave;
    return result;
}

(eval.c)

In the omitted part, the code to process each kind of node is listed. By branching like this, it processes each node. When the code is short, it is processed inside rb_eval(). But when it gets long, it is split into a separate function. Most of the functions in eval.c were created in this way.

When returning a value from rb_eval(), it uses the macro RETURN() instead of return, in order to always pass through CHECK_INTS. Since this macro is related to threads, you can ignore it until the chapter about them.

And finally, the local variables result and node are volatile for the sake of GC.

NODE_IF

Now, taking the if statement as an example, let’s look concretely at how rb_eval() evaluates. From here on, in the description of rb_eval(),

The source code (a Ruby program)
Its corresponding syntax tree
The partial code of rb_eval() to process the node

these three will be listed at the beginning.

▼ source program

if true
  'true expr'
else
  'false expr'
end

▼ its corresponding syntax tree (nodedump)

NODE_NEWLINE
nd_file = "if"
nd_nth  = 1
nd_next:
    NODE_IF
    nd_cond:
        NODE_TRUE
    nd_body:
        NODE_NEWLINE
        nd_file = "if"
        nd_nth  = 2
        nd_next:
            NODE_STR
            nd_lit = "true expr":String
    nd_else:
        NODE_NEWLINE
        nd_file = "if"
        nd_nth  = 4
        nd_next:
            NODE_STR
            nd_lit = "false expr":String

As we’ve seen in Part 2, elsif and unless can, by arranging the way they are assembled, be bundled into the single NODE_IF type, so we don’t have to treat them specially.

▼ rb_eval() − NODE_IF

case NODE_IF:
  if (trace_func) {
      call_trace_func("line", node, self,
                      ruby_frame->last_func,
                      ruby_frame->last_class);
  }
  if (RTEST(rb_eval(self, node->nd_cond))) {
      node = node->nd_body;
  }
  else {
      node = node->nd_else;
  }
  goto again;

(eval.c)

Only the last if statement is important. Rewritten without any change in its meaning, it becomes this:

if (RTEST(rb_eval(self, node->nd_cond))) {      (A)
    RETURN(rb_eval(self, node->nd_body));       (B)
}
else {
    RETURN(rb_eval(self, node->nd_else));       (C)
}

First, at (A), it evaluates (the node of) the Ruby condition and tests its value with RTEST(). I’ve mentioned that RTEST() is a macro that tests whether or not a VALUE is true in Ruby. If it was true, it evaluates the then side clause at (B). If false, it evaluates the else side clause at (C).

In addition, I’ve mentioned that an if statement of Ruby also has its own value, so it’s necessary to return a value. Since the value of an if is the value of whichever of the then side and the else side was executed, that value is returned by using the macro RETURN().

In the original listing, it does not call rb_eval() recursively but just does a goto. This is the “conversion from tail recursion to goto” which also appeared in the previous chapter, “Syntax tree construction”.
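To make the switch-plus-goto shape concrete, here is a toy evaluator in the same style. It is a sketch under assumptions, not ruby’s code: the node types N_INT, N_TRUE, N_FALSE and N_IF, struct node and eval() are all made up for this illustration, and values are plain long integers with 0 standing in for both nil and false.

```c
#include <stddef.h>

enum node_type { N_INT, N_TRUE, N_FALSE, N_IF };

struct node {
    enum node_type type;
    long value;                      /* used by N_INT */
    struct node *cond, *body, *els;  /* used by N_IF */
};

/* Evaluate a tree; a missing node evaluates to 0, loosely like nil. */
static long eval(struct node *node)
{
  again:
    if (!node) return 0;
    switch (node->type) {
      case N_INT:   return node->value;
      case N_TRUE:  return 1;
      case N_FALSE: return 0;
      case N_IF:
        /* the condition needs a real recursive call ... */
        node = eval(node->cond) ? node->body : node->els;
        /* ... but the chosen branch is in tail position, so just loop */
        goto again;
    }
    return 0;  /* not reached for well-formed trees */
}
```

Only the condition costs a stack frame; a long elsif chain, which nests in the else branch, is walked by the goto without growing the machine stack.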

NODE_NEWLINE

Since there was a NODE_NEWLINE in the nodes for the if statement, let’s look at the code for it.

▼ rb_eval() – NODE_NEWLINE

case NODE_NEWLINE:
  ruby_sourcefile = node->nd_file;
  ruby_sourceline = node->nd_nth;
  if (trace_func) {
      call_trace_func("line", node, self,
                      ruby_frame->last_func,
                      ruby_frame->last_class);
  }
  node = node->nd_next;
  goto again;

(eval.c)

There’s nothing particularly difficult.

call_trace_func() has already appeared at NODE_IF. Here is a simple explanation of what it is. It is a feature for tracing a Ruby program from the Ruby level. The debugger (debug.rb), the tracer (tracer.rb), the profiler (profile.rb), irb (the interactive ruby command) and more use this feature.

By using the function-like method set_trace_func you can register a Proc object to trace with, and that Proc object is stored in trace_func. If trace_func is not 0, that is, not Qfalse, it is treated as a Proc object and executed (at call_trace_func()).

This call_trace_func() has nothing to do with the main topic and is not very interesting either. Therefore, from now on in this book, I’ll completely ignore it. If you are interested in it, I’d like you to take it on after finishing Chapter 16: Blocks.

Pseudo-local Variables

NODE_IF and the like are interior nodes of a syntax tree. Let’s look at the leaves, too.

▼ rb_eval() Pseudo-Local Variable Nodes

case NODE_SELF:
  RETURN(self);

case NODE_NIL:
  RETURN(Qnil);

case NODE_TRUE:
  RETURN(Qtrue);

case NODE_FALSE:
  RETURN(Qfalse);

(eval.c)

We’ve seen self as the argument of rb_eval(). I’d like you to confirm it by going back a little. The others probably need no explanation.

Jump Tag

Next, I’d like to explain NODE_WHILE, which corresponds to while, but implementing break or next with only recursive calls of a function is difficult. Since ruby implements these constructs by using what is called a “jump tag”, I’ll start by describing that first.

Simply put, a “jump tag” is a wrapper of setjmp() and longjmp(), which are library functions of the C language. Do you know about setjmp()? This function has already appeared in gc.c, but it is used in a very abnormal way there. setjmp() is usually used to jump across functions. I’ll explain by taking the code below as an example. The entry point is parent().

▼ setjmp() and longjmp()

#include <setjmp.h>
#include <stdio.h>

jmp_buf buf;

void child2(void) {
    longjmp(buf, 34);  /* go back straight to parent
                          the return value of setjmp becomes 34 */
    puts("This message will never be printed.");
}

void child1(void) {
    child2();
    puts("This message will never be printed.");
}

void parent(void) {
    int result;
    if ((result = setjmp(buf)) == 0) {
        /* normally returned from setjmp */
        child1();
    } else {
        /* returned from child2 via longjmp */
        printf("%d\n", result);   /* shows 34 */
    }
}

First, when setjmp() is called in parent(), the execution state at that time is saved to the argument buf. To put it a little more directly, the address of the top of the machine stack and the CPU registers are saved. If the return value of setjmp() was 0, it means we returned normally from setjmp(), so you can write the subsequent code as usual. This is the if side. Here, it calls child1().

Next, control moves to child2(), which calls longjmp, and then it can go straight back to the place where the argument buf was setjmp-ed. So in this case, it goes back to the setjmp in parent(). When coming back via longjmp, the return value of setjmp becomes the value of the second argument of longjmp, so the else side is executed. And even if we pass 0 to longjmp, it is forced to another value (the C standard makes setjmp return 1 in that case), so trying it is fruitless.
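The remark about passing 0 is easy to check with a small helper. The names jump_with() and roundtrip() below are made up for this sketch; the assignment inside the if follows the same pattern as the example above.

```c
#include <setjmp.h>

static jmp_buf buf;

/* Jump back to wherever buf was last saved by setjmp(). */
static void jump_with(int value)
{
    longjmp(buf, value);
}

/* Returns what setjmp() reports after a longjmp(buf, value). */
static int roundtrip(int value)
{
    int result;
    if ((result = setjmp(buf)) == 0) {
        jump_with(value);   /* never returns here */
    }
    return result;          /* reached only on the second return of setjmp */
}
```

Under these assumptions roundtrip(34) gives 34, while roundtrip(0) gives 1, because a setjmp() that is resumed by longjmp() is never allowed to report 0.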

Fig.2 shows the state of the machine stack. Ordinary functions return only once for each call. However, setjmp() can return twice. Does it help you grasp the concept if I say that it is something like fork()?

Fig.2: setjmp() longjmp() Image

Now, we’ve learned about setjmp() as preparation. In eval.c, EXEC_TAG() corresponds to setjmp() and JUMP_TAG() corresponds to longjmp(). (Fig.3)

Fig.3: “tag jump” image

Looking at this image, it seems that EXEC_TAG() does not have any arguments. Where has jmp_buf gone? Actually, in ruby, jmp_buf is wrapped by the struct struct tag. Let’s look at it.

▼ struct tag

struct tag {
    jmp_buf buf;
    struct FRAME *frame;   /* FRAME when PUSH_TAG */
    struct iter *iter;     /* ITER  when PUSH_TAG */
    ID tag;                /* tag type */
    VALUE retval;          /* the return value of this jump */
    struct SCOPE *scope;   /* SCOPE when PUSH_TAG */
    int dst;               /* the destination ID */
    struct tag *prev;
};

(eval.c)

Because there’s the member prev, we can infer that struct tag is probably a stack structure using a linked list. Moreover, by looking around it, we can find the macros PUSH_TAG() and POP_TAG(), so it definitely seems to be a stack.

▼ PUSH_TAG() POP_TAG()

static struct tag *prot_tag;   /* the pointer to the head of the machine stack */

#define PUSH_TAG(ptag) do {             \
    struct tag _tag;                    \
    _tag.retval = Qnil;                 \
    _tag.frame = ruby_frame;            \
    _tag.iter = ruby_iter;              \
    _tag.prev = prot_tag;               \
    _tag.scope = ruby_scope;            \
    _tag.tag = ptag;                    \
    _tag.dst = 0;                       \
    prot_tag = &_tag

#define POP_TAG()                       \
    if (_tag.prev)                      \
        _tag.prev->retval = _tag.retval;\
    prot_tag = _tag.prev;               \
} while (0)

(eval.c)

I’d like you to be flabbergasted here, because the actual tag is allocated entirely on the machine stack as a local variable (Fig.4). Moreover, the do ~ while is divided between the two macros. This might be one of the most awful usages of the C preprocessor. Here are the PUSH and POP macros coupled together, with only the essentials extracted, to make them easier to read.

do {
    struct tag _tag;
    _tag.prev = prot_tag;   /* save the previous tag */
    prot_tag = &_tag;       /* push a new tag on the stack */
    /* do several things */
    prot_tag = _tag.prev;   /* restore the previous tag */
} while (0);

This method does not have any function call overhead, and its memory allocation cost is next to nothing. This technique is only possible because the ruby evaluator is made of recursive calls of rb_eval().

Fig.4: the tag stack is embedded in the machine stack

Because of this implementation, PUSH_TAG and POP_TAG must appear as a pair in one and the same function. Plus, since they are not supposed to be used carelessly outside the evaluator, we can’t make them available to other files.

Additionally, let’s also take a look at EXEC_TAG() and JUMP_TAG().

▼ EXEC_TAG() JUMP_TAG()

#define EXEC_TAG()    setjmp(prot_tag->buf)

#define JUMP_TAG(st) do {               \
    ruby_frame = prot_tag->frame;       \
    ruby_iter = prot_tag->iter;         \
    longjmp(prot_tag->buf,(st));        \
} while (0)

(eval.c)

In this way, setjmp and longjmp are wrapped by EXEC_TAG() and JUMP_TAG() respectively. The name EXEC_TAG() can look like a wrapper of longjmp() at first sight, but this is the one that executes setjmp().

Based on all of the above, I’ll explain the mechanism of while. First, when starting a while, it does EXEC_TAG() (setjmp). After that, it executes the main body by calling rb_eval() recursively. If there’s a break or next, it does JUMP_TAG() (longjmp). Then, it can go back to the start point of the while loop. (Fig.5)

Fig.5: the implementation of while by using “tag jump”

Though break was taken as the example here, break is not the only thing that cannot be implemented without jumping. Even if we limit ourselves to while, there are also next and redo. Additionally, return from a method and exceptions also have to climb over the wall of rb_eval(). And since it’s cumbersome to use a different tag stack for each case, we want a single stack to handle all cases in one way or another.

All we need to make this possible is to attach information about “what the purpose of this jump is”. Conveniently, the return value of setjmp() can be specified as the argument of longjmp(), so we can use this. The types are expressed by the following flags:

▼ tag type

#define TAG_RETURN      0x1    /* return */
#define TAG_BREAK       0x2    /* break */
#define TAG_NEXT        0x3    /* next */
#define TAG_RETRY       0x4    /* retry */
#define TAG_REDO        0x5    /* redo */
#define TAG_RAISE       0x6    /* general exceptions */
#define TAG_THROW       0x7    /* throw (won't be explained in this book) */
#define TAG_FATAL       0x8    /* fatal : exceptions which are not catchable */
#define TAG_MASK        0xf

(eval.c)

The meanings are as written in each comment. The last one, TAG_MASK, is the bitmask for extracting these flags from a return value of setjmp(). It is needed because the return value of setjmp() can also include information which is not about the “type of jump”.
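As a tiny illustration of the masking, the sketch below separates the type bits from the rest of a state value. The helper jump_type() is hypothetical, and the extra upper bits in the usage example are an invented placeholder; the text does not say what information ruby actually stores there.

```c
#define TAG_BREAK 0x2
#define TAG_MASK  0xf

/* Hypothetical helper: keep only the "type of jump" bits of a state value. */
static int jump_type(int state)
{
    return state & TAG_MASK;
}
```

For instance, jump_type(0x32) yields TAG_BREAK no matter what the made-up upper bits 0x30 were meant to carry.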

NODE_WHILE

Now, by examining the code of NODE_WHILE, let’s check the actual usage of tags.

▼ The Source Program

while true
  'true_expr'
end

▼ Its corresponding syntax tree (nodedump-short)

NODE_WHILE
nd_state = 1 (while)
nd_cond:
    NODE_TRUE
nd_body:
    NODE_STR
    nd_lit = "true_expr":String

▼ rb_eval – NODE_WHILE

case NODE_WHILE:
  PUSH_TAG(PROT_NONE);
  result = Qnil;
  switch (state = EXEC_TAG()) {
    case 0:
      if (node->nd_state && !RTEST(rb_eval(self, node->nd_cond)))
          goto while_out;
      do {
        while_redo:
          rb_eval(self, node->nd_body);
        while_next:
          ;
      } while (RTEST(rb_eval(self, node->nd_cond)));
      break;

    case TAG_REDO:
      state = 0;
      goto while_redo;
    case TAG_NEXT:
      state = 0;
      goto while_next;
    case TAG_BREAK:
      state = 0;
      result = prot_tag->retval;
    default:
      break;
  }
while_out:
  POP_TAG();
  if (state) JUMP_TAG(state);
  RETURN(result);

(eval.c)

An idiom that will appear over and over again appeared in the above code.

PUSH_TAG(PROT_NONE);
switch (state = EXEC_TAG()) {
  case 0:
    /* process normally */
    break;
  case TAG_a:
    state = 0;    /* clear state because the jump waited for comes */
    /* do the process of when jumped with TAG_a */
    break;
  case TAG_b:
    state = 0;    /* clear state because the jump waited for comes */
    /* do the process of when jumped with TAG_b */
    break;
  default:
    break;        /* this jump is not waited for, then ... */
}
POP_TAG();
if (state) JUMP_TAG(state);   /* .. jump again here */

First, since PUSH_TAG() and POP_TAG() are the mechanism described earlier, they must always be used as a pair. They also need to be written so that they enclose EXEC_TAG(). Then, EXEC_TAG() is applied to the jmp_buf just pushed. This means doing setjmp(). If the return value is 0, it means we have returned immediately from setjmp(), so the normal processing is done (this usually contains rb_eval()). If the return value of EXEC_TAG() is not 0, it means we have returned via longjmp(), so it picks out only the jumps it needs by using case and lets the rest (default) pass.
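The idiom can be exercised outside ruby with a stripped-down version of the four macros. This is a sketch under assumptions, not eval.c itself: the tag keeps only buf and prev, PUSH_TAG() takes no argument, JUMP_TAG() does not restore ruby_frame or ruby_iter, and loop_body() and run() are hypothetical functions written for the demonstration.

```c
#include <setjmp.h>
#include <stddef.h>

#define TAG_BREAK 0x2
#define TAG_RAISE 0x6

struct tag {
    jmp_buf buf;
    struct tag *prev;
};

static struct tag *prot_tag = NULL;

/* PUSH_TAG opens a do { block and POP_TAG closes it, as in eval.c */
#define PUSH_TAG() do { \
    struct tag _tag; \
    _tag.prev = prot_tag; \
    prot_tag = &_tag

#define POP_TAG() \
    prot_tag = _tag.prev; \
} while (0)

#define EXEC_TAG()   setjmp(prot_tag->buf)
#define JUMP_TAG(st) longjmp(prot_tag->buf, (st))

/* Inner handler: waits only for TAG_BREAK, lets everything else pass. */
static int loop_body(int jump_kind)
{
    int state;
    volatile int handled = 0;

    PUSH_TAG();
    switch (state = EXEC_TAG()) {
      case 0:
        JUMP_TAG(jump_kind);       /* simulate break or raise in the body */
        break;
      case TAG_BREAK:
        state = 0;                 /* the jump we waited for */
        handled = 1;
        break;
      default:
        break;                     /* not ours */
    }
    POP_TAG();
    if (state) JUMP_TAG(state);    /* pass it on to the next tag */
    return handled;
}

/* Outer handler: returns the state that reached it, or 0 if none did. */
static int run(int jump_kind)
{
    int state;

    PUSH_TAG();
    if ((state = EXEC_TAG()) == 0) {
        loop_body(jump_kind);
    }
    POP_TAG();
    return state;
}
```

With these definitions run(TAG_BREAK) returns 0 because the inner handler consumes the jump, while run(TAG_RAISE) returns TAG_RAISE because the inner handler pops its own tag and jumps again to the outer one.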

It might be helpful to also see the code of the jumping side. The code below is the handler of the node for redo.

▼ rb_eval() – NODE_REDO

case NODE_REDO:
  CHECK_INTS;
  JUMP_TAG(TAG_REDO);
  break;

(eval.c)

As a result of jumping via JUMP_TAG(), it goes back to the last EXEC_TAG(). The return value at that time is the argument TAG_REDO. Keeping this in mind, I’d like you to look at the code of NODE_WHILE and check which route is taken.

The idiom has been explained enough; now I’ll explain the code of NODE_WHILE in a little more detail. As mentioned, since the inside of case 0: is the main process, I extracted only that part. Additionally, I moved some labels to improve readability.

  if (node->nd_state && !RTEST(rb_eval(self, node->nd_cond)))
      goto while_out;
  do {
      rb_eval(self, node->nd_body);
  } while (RTEST(rb_eval(self, node->nd_cond)));
while_out:

There are two places that call rb_eval() on node->nd_cond, which corresponds to the conditional expression. It seems that only the first test of the condition is separated out. This is to deal with both do ~ while and while at once. When node->nd_state is 0 it is a do ~ while, and when it is 1 it is an ordinary while. The rest can probably be understood by following it step by step, so I won’t explain it in particular.
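The trick of serving both loop forms with one do ~ while and a guarded first test can be isolated as follows. run_loop() is a hypothetical function written for this sketch; its state parameter plays the role of nd_state and the comparison i < limit stands in for evaluating nd_cond.

```c
/* state != 0: test the condition before the first iteration (while).
   state == 0: run the body once before testing (do ~ while).
   Returns how many times the body ran. */
static int run_loop(int state, int limit)
{
    int i = 0, runs = 0;

    if (state && !(i < limit)) goto out;
    do {
        runs++;      /* the loop body */
        i++;
    } while (i < limit);
  out:
    return runs;
}
```

With a condition that is false from the start (limit 0), the while form runs the body zero times and the do ~ while form runs it once; when the condition starts out true, both forms behave identically.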

By the way, I feel that it would easily become an infinite loop if there were a next or redo in the condition. Since that is of course exactly what the code means, it would be the fault of whoever wrote it, but I’m a little curious about it. So, I actually tried it.

% ruby -e 'while next do nil end'
-e:1: void value expression

It’s simply rejected at parse time. That’s safe, but not an interesting result. What produces this error is value_expr() of parse.y.

The value of an evaluation of while

while had not had a value for a long time, but it has been able to return a value by using break since ruby 1.7. This time, let’s focus on the flow of the value of an evaluation. Keeping in mind that the value of the local variable result becomes the return value of rb_eval(), I’d like you to look at the following code:

result = Qnil;
switch (state = EXEC_TAG()) {
  case 0:
    /* the main process */
  case TAG_REDO:
  case TAG_NEXT:
    /* each jump */

  case TAG_BREAK:
    state = 0;
    result = prot_tag->retval;     (A)
  default:
    break;
}
RETURN(result);

The only thing we should focus on is (A). The return value of the jump seems to be passed via prot_tag->retval, that is, via a struct tag. Here is the passing side:

▼ rb_eval() – NODE_BREAK

#define return_value(v) prot_tag->retval = (v)

case NODE_BREAK:
  if (node->nd_stts) {
      return_value(avalue_to_svalue(rb_eval(self, node->nd_stts)));
  }
  else {
      return_value(Qnil);
  }
  JUMP_TAG(TAG_BREAK);
  break;

(eval.c)

In this way, by using the macro return_value(), it assigns the value to the struct at the top of the tag stack.

The basic flow is this, but in practice there could be another EXEC_TAG() between the EXEC_TAG() of NODE_WHILE and the JUMP_TAG() of NODE_BREAK. For example, the rescue of exception handling can exist between them.

    while cond       # EXEC_TAG() for NODE_WHILE
      begin          # EXEC_TAG() again for rescue
        break 1
      rescue
      end
    end

Therefore, it’s hard to determine whether or not the struct tag in effect when doing JUMP_TAG() at NODE_BREAK is the one which was pushed at NODE_WHILE. In this case, because retval is propagated in POP_TAG() as shown below, the return value can be passed to the next tag without particular thought.

▼ POP_TAG()

    #define POP_TAG()                       \
        if (_tag.prev)                      \
            _tag.prev->retval = _tag.retval;\
        prot_tag = _tag.prev;               \
    } while (0)

(eval.c)

This can probably be depicted as Fig.6.

Fig.6: Transferring the return value

Exception

As the second example of the usage of “tag jump”, we’ll look at how exceptions are dealt with.

raise

When I explained while, we looked at the setjmp() side first. This time, we’ll look at the longjmp() side first for a change. It’s rb_exc_raise() which is the substance of raise.

▼ rb_exc_raise()

    void
    rb_exc_raise(mesg)
        VALUE mesg;
    {
        rb_longjmp(TAG_RAISE, mesg);
    }

(eval.c)

mesg is an exception object (an instance of Exception or one of its subclasses). Notice that it jumps with TAG_RAISE this time. The code below is a heavily simplified rb_longjmp().

▼ rb_longjmp() (simplified)

    static void
    rb_longjmp(tag, mesg)
        int tag;
        VALUE mesg;
    {
        if (NIL_P(mesg))
            mesg = ruby_errinfo;
        set_backtrace(mesg, get_backtrace(mesg));
        ruby_errinfo = mesg;
        JUMP_TAG(tag);
    }

Well, though this can be considered a matter of course, this just jumps as usual by using JUMP_TAG().

What is ruby_errinfo? By doing grep a few times, I figured out that this variable is the substance of the global variable $! of Ruby. Since this variable indicates the exception which is currently occurring, naturally its substance ruby_errinfo should have the same meaning as well.
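The relationship between $! and the exception being handled can be observed from Ruby. A minimal sketch, assuming only standard Ruby behavior; the variable names are made up for the illustration.

```ruby
seen   = nil
caught = nil

begin
  raise ArgumentError, "boom"
rescue => e
  caught = e
  seen   = $!   # the exception currently being handled
end
```

Inside the rescue clause, $! and the rescued object are the very same object, which is what one would expect if both are views of a single evaluator-level variable.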

The Big Picture

▼ the source program

    begin
      raise('exception raised')
    rescue
      'rescue clause'
    ensure
      'ensure clause'
    end

▼ the syntax tree (nodedump-short)

    NODE_BEGIN
    nd_body:
        NODE_ENSURE
        nd_head:
            NODE_RESCUE
            nd_head:
                NODE_FCALL
                nd_mid = 3857 (raise)
                nd_args:
                    NODE_ARRAY [
                    0:
                        NODE_STR
                        nd_lit = "exception raised":String
                    ]
            nd_resq:
                NODE_RESBODY
                nd_args = (null)
                nd_body:
                    NODE_STR
                    nd_lit = "rescue clause":String
                nd_head = (null)
            nd_else = (null)
        nd_ensr:
            NODE_STR
            nd_lit = "ensure clause":String

As the right order of rescue and ensure is decided at the parser level, the order is strictly decided in the syntax tree as well. NODE_ENSURE is always at the “top”, NODE_RESCUE comes next, and the main body (where raise exists) is last. Since NODE_BEGIN is a node that does nothing, you can consider NODE_ENSURE to be virtually at the top.

This means that, since NODE_ENSURE and NODE_RESCUE are above the main body which we want to protect, we can stop raise by merely doing EXEC_TAG(). Or rather, it is probably more accurate to say that the two nodes are put above it in the syntax tree for this purpose.

ensure

We are going to look at the handler of NODE_ENSURE, which is the node of ensure.

▼ rb_eval() – NODE_ENSURE

    case NODE_ENSURE:
      PUSH_TAG(PROT_NONE);
      if ((state = EXEC_TAG()) == 0) {
          result = rb_eval(self, node->nd_head);   (A-1)
      }
      POP_TAG();
      if (node->nd_ensr) {
          VALUE retval = prot_tag->retval;   (B-1)
          VALUE errinfo = ruby_errinfo;

          rb_eval(self, node->nd_ensr);            (A-2)
          return_value(retval);              (B-2)
          ruby_errinfo = errinfo;
      }
      if (state) JUMP_TAG(state);            (B-3)
      break;

(eval.c)

This branch using if is another idiom to deal with tags. It interrupts a jump by doing EXEC_TAG() and then evaluates the ensure clause (node->nd_ensr). As for the flow of the process, it’s probably straightforward.

Again, we’ll try to think about the value of an evaluation. To check the specification first,

    begin
      expr0
    ensure
      expr1
    end

for the above statement, the value of the whole begin will be the value of expr0 regardless of whether or not ensure exists. This behavior is reflected in the code (A-1,2), so the value of the evaluation of an ensure clause is completely discarded.

At (B-1,3), it deals with the evaluated value when a jump occurred in the main body. I mentioned that the value in this case is stored in prot_tag->retval, so it saves the value to a local variable to prevent it from being carelessly overwritten during the execution of the ensure clause (B-1). After the evaluation of the ensure clause, it restores the value by using return_value() (B-2). When no jump has occurred, that is, when state==0, prot_tag->retval is not used in the first place.
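Both points, the discarded ensure value and the jump value that survives the ensure clause, can be confirmed with a short Ruby script. This is an illustrative sketch assuming standard Ruby semantics; the names ran, value and loop_value are not from the book.

```ruby
ran = []

# Case (A-1,2): the ensure clause runs, but its value is discarded.
value = begin
  :body
ensure
  ran << :first_ensure
  :ensure_value
end

# Case (B-1..3): a break jumps out through the ensure clause,
# and the break value is still the value of the loop afterwards.
loop_value = while true
  begin
    break :from_break
  ensure
    ran << :second_ensure
    :ignored
  end
end
```

In both cases the ensure clause leaves a trace in ran, while neither :ensure_value nor :ignored shows up as a result.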

rescue

It’s been a little while, so I’ll show the syntax tree of rescue again just in case.

▼ Source Program

    begin
      raise()
    rescue ArgumentError, TypeError
      'error raised'
    end

▼ Its Syntax Tree (nodedump-short)

    NODE_BEGIN
    nd_body:
        NODE_RESCUE
        nd_head:
            NODE_FCALL
            nd_mid = 3857 (raise)
            nd_args = (null)
        nd_resq:
            NODE_RESBODY
            nd_args:
                NODE_ARRAY [
                0:
                    NODE_CONST
                    nd_vid  = 4733 (ArgumentError)
                1:
                    NODE_CONST
                    nd_vid  = 4725 (TypeError)
                ]
            nd_body:
                NODE_STR
                nd_lit = "error raised":String
            nd_head = (null)
        nd_else = (null)

I’d like you to make sure that (the syntax tree of) the statement to be rescued is “under” NODE_RESCUE.

▼ rb_eval() – NODE_RESCUE

    case NODE_RESCUE:
    retry_entry:
      {
          volatile VALUE e_info = ruby_errinfo;

          PUSH_TAG(PROT_NONE);
          if ((state = EXEC_TAG()) == 0) {
              result = rb_eval(self, node->nd_head); /* evaluate the body */
          }
          POP_TAG();
          if (state == TAG_RAISE) { /* an exception occurred at the body */
              NODE * volatile resq = node->nd_resq;

              while (resq) { /* deal with the rescue clause one by one */
                  ruby_current_node = resq;
                  if (handle_rescue(self, resq)) { /* If dealt with by this clause */
                      state = 0;
                      PUSH_TAG(PROT_NONE);
                      if ((state = EXEC_TAG()) == 0) {
                          result = rb_eval(self, resq->nd_body);
                      }                          /* evaluate the rescue clause */
                      POP_TAG();
                      if (state == TAG_RETRY) { /* Since retry occurred, */
                          state = 0;
                          ruby_errinfo = Qnil;  /* the exception is stopped */
                          goto retry_entry;     /* convert to goto */
                      }
                      if (state != TAG_RAISE) {  /* Also by rescue and such */
                          ruby_errinfo = e_info; /* the exception is stopped */
                      }
                      break;
                  }
                  resq = resq->nd_head; /* move on to the next rescue clause */
              }
          }
          else if (node->nd_else) { /* when there is an else clause, */
              if (!state) { /* evaluate it only when any exception has not occurred. */
                  result = rb_eval(self, node->nd_else);
              }
          }
          if (state) JUMP_TAG(state); /* the jump was not waited for */
      }
      break;

(eval.c)

Even though the size is not small, it’s not difficult because it simply deals with the nodes one by one. This is the first time handle_rescue() has appeared, but for some reasons we cannot look at this function now. I’ll explain only its effects here. Its prototype is this,

    static int handle_rescue(VALUE self, NODE *resq)

and it determines whether the class of the currently occurring exception (ruby_errinfo) is the class expressed by resq (TypeError, for instance) or one of its subclasses. The reason for passing self is that it’s necessary to call rb_eval() inside this function in order to evaluate resq.
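The control flow of the handler (rescue clauses tried in order, retry turned into a restart of the body, else evaluated only when nothing was raised) can be traced with a small Ruby program. This is a sketch under the assumption of standard Ruby semantics; attempts and result are illustrative names.

```ruby
attempts = 0

result = begin
  attempts += 1
  # Fail only on the first pass through the body.
  raise TypeError, "first try fails" if attempts == 1
  :body_ok
rescue ArgumentError
  :not_reached          # skipped: TypeError is not an ArgumentError
rescue TypeError
  retry                 # restart the body from the top
else
  :else_clause          # runs because the second pass raised nothing
end
```

The first pass raises, the second rescue clause matches and retries, and the second pass finishes normally, so the else clause supplies the final value.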

The original work is Copyright © 2002 - 2004 Minero AOKI. Translated by Vincent ISAMBART and Clifford Escobar CAOILE. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License.


Chapter 14: Context

The range covered by this chapter is really broad. First of all, I’ll describe how the internal state of the evaluator is expressed. After that, as an actual example, we’ll read how the state is changed on a class definition statement. Subsequently, we’ll examine how the internal state influences method definition statements. Lastly, we’ll observe how both statements change the behaviors of variable definitions and variable references.

The Ruby stack

Context and Stack

With an image of a typical procedural language, each time a procedure is called, the information which is necessary to execute the procedure, such as the local variable space and the place to return to, is stored in a struct (a stack frame) and it is pushed on the stack. When returning from a procedure, the struct which is on the top of the stack is popped and the state is returned to the previous method. The executing image of a C program which was explained in Chapter 5: Garbage collection is a perfect example.

What to be careful about here is that what changes during the execution is only the stack; the program, on the contrary, remains unchanged wherever it is. For example, if it is “a reference to the local variable i”, there’s just an order of “give me i of the current frame”; it is not written as “give me i of that frame”. In other words, “only” the state of the stack influences the consequence. This is why, even if a procedure is called at any time and any number of times, we only have to write its code once (Fig.1).

Fig.1: What is changing is only the stack
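The idea that the same code always asks for “i of the current frame” can be seen in a recursive method. The following is a minimal Ruby sketch; countdown and seen are hypothetical names chosen for the illustration.

```ruby
# The body is written once; each call gets its own local variable i.
def countdown(i, seen)
  seen << i                           # i of the current frame
  countdown(i - 1, seen) if i > 0     # pushes a new frame with a new i
  seen << i                           # after returning, still this frame's i
end

trace = countdown(2, [])
```

The values recorded on the way back out mirror those recorded on the way in, although the text of the method never says which frame to read from.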

The execution of Ruby is also basically nothing but chained calls of methods, which are procedures, so essentially it has the same image as above. In other words, with the same code, the things being accessed, such as the local variable scope and the block local scope, will be changing. And these kinds of scopes are expressed by stacks.

However in Ruby, for instance, you can temporarily go back to the scope previously used by using iterators or Proc. This cannot be implemented by simply pushing and popping a stack. Therefore the frames of the Ruby stack will be intricately rearranged during execution. Although I call it a “stack”, it could be better to consider it as a list.

Other than the method call, the local variable scope can also be changed by class definitions. So, method calls do not match the transitions of the local variable scope. Since there are also blocks, it’s necessary to handle them separately. For these various reasons, surprisingly, there are seven stacks.

    Stack Pointer   Stack Frame Type    Description
    ruby_frame      struct FRAME        the records of method calls
    ruby_scope      struct SCOPE        the local variable scope
    ruby_block      struct BLOCK        the block scope
    ruby_iter       struct iter         whether or not the current FRAME is an iterator
    ruby_class      VALUE               the class to define methods on
    ruby_cref       NODE (NODE_CREF)    the class nesting information

C has only one stack and Ruby has seven stacks, so by simple arithmetic, the executing image of Ruby is at least seven times more complicated than C. But it is actually not seven times at all; it’s at least twenty times more complicated.

First, I’ll briefly describe these stacks and their stack frame structs. They are defined in either eval.c or env.h. Basically these stack frames would be touched only by eval.c … is how it should be if it were possible, but gc.c needs to know the struct types when marking, so some of them are exposed in env.h.

Of course, marking could be done in a file other than gc.c, but that requires separate functions, which causes a slowdown. Ordinary programs had better not care about such things, but both the garbage collector and the core of the evaluator are ruby’s biggest bottlenecks, so it’s quite worthwhile to optimize even for just one method call.

ruby_frame

ruby_frame is a stack to record method calls. The stack frame struct is struct FRAME. This terminology is a bit confusing, but please be aware that I’ll write just “frame” when it means a stack frame as a general noun and FRAME when it means struct FRAME.

▼ ruby_frame

    extern struct FRAME {
        VALUE self;          /* self */
        int argc;            /* the argument count */
        VALUE *argv;         /* the array of argument values */
        ID last_func;        /* the name of this FRAME (when called) */
        ID orig_func;        /* the name of this FRAME (when defined) */
        VALUE last_class;    /* the class of last_func's receiver */
        VALUE cbase;         /* the base point for searching constants and class variables */
        struct FRAME *prev;
        struct FRAME *tmp;   /* to protect from GC. this will be described later */
        struct RNode *node;  /* the file name and the line number of the currently executed line. */
        int iter;            /* is this called with a block? */
        int flags;           /* the below two */
    } *ruby_frame;

    #define FRAME_ALLOCA 0   /* FRAME is allocated on the machine stack */
    #define FRAME_MALLOC 1   /* FRAME is allocated by malloc */

(env.h)

First of all, since there’s the prev member, you can infer that the stack is made of a linked list (Fig.2).

Fig.2: ruby_frame

The fact that ruby_xxxx points to the top stack frame is common to all stacks and won’t be mentioned every time.

The first member of the struct is self. There is also self in the arguments of rb_eval(), so why does this struct remember another self? This is for the C-level functions. More precisely, it’s for rb_call_super(), which corresponds to super. In order to execute super, the receiver of the current method is required, but the caller side of rb_call_super() could not have such information. However, the chain of rb_eval() is interrupted before the execution of the user-defined C code starts. Therefore, the conclusion is that there needs to be a way to obtain the information of self out of nothing. And FRAME is the right place to store it.

Thinking a little further, it’s mysterious that there are argc and argv. Because parameter variables are local variables after all, it is unnecessary to preserve the given arguments after assigning them to the local variables with the same names at the beginning of the method, isn’t it? Then, what is the use of them? The answer is that this is actually for super again. In Ruby, when calling super without any arguments, the values of the parameter variables of the method will be passed to the method of the superclass. Thus, (the local variable space for) the parameter variables must be reserved.
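The behavior of super without arguments described here is visible at the language level. A minimal sketch, assuming ordinary Ruby semantics; Base, Derived and greet are hypothetical names used only for this illustration.

```ruby
class Base
  def greet(name, punct)
    "hello #{name}#{punct}"
  end
end

class Derived < Base
  def greet(name, punct)
    # Bare super: no arguments are written, yet the superclass method
    # receives this method's parameters.
    "[derived] " + super
  end
end

message = Derived.new.greet("alice", "!")
```

The superclass method gets both arguments even though the call site passes none explicitly, so the evaluator has to keep them reachable for the duration of the method.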

Additionally, the difference between last_func and orig_func shows up in cases such as when the method is aliased. For instance,

    class C
      def orig() end
      alias ali orig
    end
    C.new.ali

in this case, last_func=ali and orig_func=orig. Not surprisingly, these members also have to do with super.
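A similar pair of names can be observed from Ruby code through __method__ and __callee__. This is only an analogy at the language level, and it assumes Ruby 2.0 or later, where __callee__ reports the name used at the call site; it says nothing definite about how those methods are implemented.

```ruby
class C
  def orig
    # __method__: the name the method was defined with
    # __callee__: the name it was actually called by
    [__method__, __callee__]
  end
  alias ali orig
end

names = C.new.ali
```

Calling through the alias yields two different names for the same method body, which is the distinction the two FRAME members record.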

ruby_scope

ruby_scope is the stack to represent the local variable scope. Method definition statements, class definition statements, module definition statements and singleton class definition statements all have different scopes. The stack frame struct is struct SCOPE. I’ll call this frame SCOPE.

▼ ruby_scope

    extern struct SCOPE {
        struct RBasic super;
        ID *local_tbl;        /* an array of the local variable names */
        VALUE *local_vars;    /* the space to store local variables */
        int flags;            /* the below four */
    } *ruby_scope;

    #define SCOPE_ALLOCA  0         /* local_vars is allocated by alloca */
    #define SCOPE_MALLOC  1         /* local_vars is allocated by malloc */
    #define SCOPE_NOSTACK 2         /* POP_SCOPE is done  */
    #define SCOPE_DONT_RECYCLE 4    /* Proc is created with this SCOPE */

(env.h)

Since the first element is struct RBasic, this is a Ruby object. This is in order to handle Proc objects. For example, let’s try to think about a case like this:

    def make_counter
      lvar = 0
      return Proc.new { lvar += 1 }
    end

    cnt = make_counter()
    p cnt.call    # 1
    p cnt.call    # 2
    p cnt.call    # 3
    cnt = nil  # cut the reference. The created Proc finally becomes unnecessary here.

The Proc object created by this method will persist longer than the method that creates it. And, because the Proc can refer to the local variable lvar, the local variables must be preserved until the Proc disappears. Thus, if it were not handled by the garbage collector, no one could determine the time to free them.
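That each call creates its own surviving scope can be checked by making two counters. A short sketch reusing the make_counter idea from the text; the variables a, b and results are illustrative.

```ruby
def make_counter
  lvar = 0
  Proc.new { lvar += 1 }
end

a = make_counter    # first call: one scope, one lvar
b = make_counter    # second call: a separate scope with its own lvar

results = [a.call, a.call, b.call, a.call]
```

Each Proc keeps counting in its own lvar long after make_counter has returned, so the two scopes must stay alive independently for as long as their Procs do.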

There are two reasons why struct SCOPE is separated from struct FRAME. Firstly, things like class definition statements are not method calls but create distinct local variable scopes. Secondly, when a called method is defined in C, Ruby’s local variable space is unnecessary.

ruby_block

struct BLOCK is the real body of a Ruby iterator block or a Proc object; it is also a kind of snapshot of the evaluator at some point. This frame will also be written briefly as BLOCK, in the same manner as FRAME and SCOPE.

▼ ruby_block

    static struct BLOCK *ruby_block;

    struct BLOCK {
        NODE *var;               /* the block parameters (mlhs) */
        NODE *body;              /* the code of the block body */
        VALUE self;              /* the self when this BLOCK is created */
        struct FRAME frame;      /* the copy of ruby_frame when this BLOCK is created */
        struct SCOPE *scope;     /* the ruby_scope when this BLOCK is created */
        struct BLOCKTAG *tag;    /* the identity of this BLOCK */
        VALUE klass;             /* the ruby_class when this BLOCK is created */
        int iter;                /* the ruby_iter when this BLOCK is created */
        int vmode;               /* the scope_vmode when this BLOCK is created */
        int flags;               /* BLOCK_D_SCOPE, BLOCK_DYNAMIC */
        struct RVarmap *dyna_vars;   /* the block local variable space */
        VALUE orig_thread;       /* the thread that creates this BLOCK */
        VALUE wrapper;           /* the ruby_wrapper when this BLOCK is created */
        struct BLOCK *prev;
    };

    struct BLOCKTAG {
        struct RBasic super;
        long dst;                /* destination, that is, the place to return */
        long flags;              /* BLOCK_DYNAMIC, BLOCK_ORPHAN */
    };

    #define BLOCK_D_SCOPE 1      /* having distinct block local scope */
    #define BLOCK_DYNAMIC 2      /* BLOCK was taken from a Ruby program */
    #define BLOCK_ORPHAN  4      /* the FRAME that creates this BLOCK has finished */

(eval.c)

Note that frame is not a pointer. This is because the entire content of struct FRAME will be copied and preserved. A struct FRAME is (for better performance) allocated on the machine stack, but a BLOCK could persist longer than the FRAME that creates it, and the copy is a preparation for that case.

Additionally, struct BLOCKTAG is separated in order to detect the same block when multiple Proc objects are created from that block. The Proc objects which were created from one and the same block have the same BLOCKTAG.
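The “snapshot” character of a block can be seen by calling a Proc after its creating method has returned. A minimal Ruby sketch; Maker, make and @tag are hypothetical names for the illustration.

```ruby
class Maker
  def initialize
    @tag = :maker
  end

  def make
    local = 10
    # The block remembers self and the local variables of this call.
    Proc.new { [self.class, @tag, local] }
  end
end

snapshot = Maker.new.make.call   # called after make has already returned
```

The Proc still sees the self and the local variable from the call that created it, which is why a BLOCK has to carry its own copies and references rather than rely on the caller’s frame.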

ruby_iter

The stack ruby_iter indicates whether the currently called method is an iterator (whether it is called with a block). The frame is struct iter. But for consistency I’ll call it ITER.

▼ ruby_iter

    static struct iter *ruby_iter;

    struct iter {
        int iter;           /* the below three */
        struct iter *prev;
    };

    #define ITER_NOT 0      /* the currently evaluated method is not an iterator */
    #define ITER_PRE 1      /* the method which is going to be evaluated next is an iterator */
    #define ITER_CUR 2      /* the currently evaluated method is an iterator */

(eval.c)

Although for each method we can determine whether it is an iterator or not, there’s another struct that is distinct from struct FRAME. Why?

It’s obvious that you need to inform the method when “it is an iterator”, but you also need to inform it when “it is not an iterator”. However, pushing a whole BLOCK just for this is very heavy. It would also needlessly increase the work of procedures on the caller side, such as variable references. Thus, it’s better to push the smaller and lighter ITER instead of BLOCK. This will be discussed in detail in Chapter 16: Blocks.
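Why “it is not an iterator” must be communicated explicitly becomes clear with a nested call. The following sketch assumes standard Ruby semantics; called_with_block and outer are illustrative names.

```ruby
def called_with_block
  block_given?
end

def outer
  # outer itself may have a block, but none is passed on here,
  # so the callee must be told that it is not an iterator.
  called_with_block
end

direct_with    = called_with_block { }
direct_without = called_with_block
via_outer      = outer { }
```

Even though a block exists somewhere up the call chain in the third case, the inner method correctly reports that it has none.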

ruby_dyna_vars

The block local variable space. The frame struct is struct RVarmap, which we have already seen in Part 2. From now on, I’ll call it just VARS.

▼ struct RVarmap

    struct RVarmap {
        struct RBasic super;
        ID id;                  /* the name of the variable */
        VALUE val;              /* the value of the variable */
        struct RVarmap *next;
    };

(env.h)

Note that a frame is not a single struct RVarmap but a list of the structs (Fig.3). And each frame corresponds to a local variable scope. Since it corresponds to a “local variable scope” and not a “block local variable scope”, even if blocks are nested, for instance, only a single list is used to express them. The break between blocks is similar to the one in the parser: it is expressed by an RVarmap (header) whose id is 0. Details are deferred again; they will be explained in Chapter 16: Blocks.

Fig.3: ruby_dyna_vars

ruby_class

ruby_class represents the current class on which a method is defined. Since self will be that class when it’s a normal class definition statement, ruby_class == self. But when it is the top level or in the middle of particular methods like eval and instance_eval, self != ruby_class is possible.
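The difference between self and the place where def puts a method can be observed with instance_eval. This is a small sketch assuming standard Ruby behavior; Sample, obj and only_here are names invented for the example.

```ruby
# In a normal class body, self is the class that def will define on.
class Sample
  SELF_IN_BODY = self
  def regular; :instance_method; end
end

# Inside instance_eval, self is obj, but def cannot define a method
# "on obj" as a class; it lands on obj's singleton class instead.
obj = Object.new
obj.instance_eval do
  def only_here; :singleton; end
end
```

In the second case self (an ordinary object) and the class that receives the method are necessarily different things.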

The frame of ruby_class is a simple VALUE and there’s no particular frame struct. Then, how could it be like a stack? Moreover, there were many structs without the prev pointer; how could these form a stack? The answer is deferred to the next section.

From now on, I’ll call this frame CLASS.

ruby_cref

ruby_cref represents the information of the nesting of a class. I’ll call this frame CREF, with the same way of naming as before. Its struct is …

▼ ruby_cref

    static NODE *ruby_cref = 0;

(eval.c)

… surprisingly NODE. This is used just as a “defined struct which can be pointed to by a VALUE”. The node type is NODE_CREF and the assignments of its members are shown below:

    Union Member    Macro To Access    Usage
    u1.value        nd_clss            the outer class (VALUE)
    u2              –                  –
    u3.node         nd_next            preserve the previous CREF

Even though the member name is nd_next, the value it actually has is the “previous (prev)” CREF. Taking the following program as an example, I’ll explain the actual appearance.

    class A
      class B
        class C
          nil   # (A)
        end
      end
    end

Fig.4 shows how ruby_cref is when evaluating the code (A).

Fig.4: ruby_cref

However, illustrating this image every time is tedious and its intention becomes unclear. Therefore, the same state as Fig.4 will be expressed in the following notation:

    A ← B ← C
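The same nesting information is visible from Ruby through Module.nesting, which lists the enclosing classes starting from the innermost one. A minimal sketch; the constant NESTING is introduced only for this illustration.

```ruby
class A
  class B
    class C
      # Evaluated at the point marked (A) in the example above.
      NESTING = Module.nesting
    end
  end
end

chain = A::B::C::NESTING
```

Reading the resulting array from its first element to its last walks from C back to A, the same direction in which each CREF points to its previous one.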


PUSH / POP Macros

For each stack frame struct, macros to push and pop are available. For instance, PUSH_FRAME and POP_FRAME for FRAME. Because these will appear in a moment, I’ll explain their usage and content then.

The other states

While they are not as important as the main stacks, the evaluator of ruby has several other states. This is a brief list of them. However, some of them are not stacks. Actually, most of them are not.

    Variable Name       Type     Meaning
    scope_vmode         int      the default visibility when a method is defined
    ruby_in_eval        int      whether or not parsing after the evaluation is started
    ruby_current_node   NODE*    the file name and the line number of what currently being evaluated
    ruby_safe_level     int      $SAFE
    ruby_errinfo        VALUE    the exception currently being handled
    ruby_wrapper        VALUE    the wrapper module to isolate the environment

Module Definition

The class statement, the module statement and the singleton class definition statement are all implemented in similar ways.

Because seeing similar things three times in a row is not interesting, this time let’s examine the module statement, which has the fewest elements (and thus is the simplest).

First of all, what is the module statement? Put the other way around, what should happen in the module statement? Let’s try to list several features:

    a new module object should be created
    the created module should be self
    it should have an independent local variable scope
    if you write a constant assignment, a constant should be defined on the module
    if you write a class variable assignment, a class variable should be defined on the module
    if you write a def statement, a method should be defined on the module
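Most of the features listed above can be checked directly in Ruby. The following is a rough sketch, assuming it is run at the top level of a script; Demo, outer_local and the constants inside it are hypothetical names for the illustration.

```ruby
outer_local = :outside

module Demo
  SELF_IN_BODY = self                                  # self is the module
  SEES_OUTER   = defined?(outer_local) ? true : false  # independent local scope
  LIMIT        = 3                                     # constant lands on the module
  @@shared     = :class_variable                       # so does the class variable

  def helper                                           # and so does the method
    :defined_on_module
  end
end
```

Each line in the body exercises one item of the list, and none of them can reach the local variable defined outside the module statement.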

What is the way to achieve these things? … is the point of this section. Now, let’s start to look at the code.

Investigation

▼ The Source Program

    module M
      a = 1
    end

▼ Its Syntax Tree

    NODE_MODULE
    nd_cname = 9621 (M)
    nd_body:
        NODE_SCOPE
        nd_rval = (null)
        nd_tbl = 3 [ _ ~ a ]
        nd_next:
            NODE_LASGN
            nd_cnt = 2
            nd_value:
                NODE_LIT
                nd_lit = 1:Fixnum

nd_cname seems to be the module name. cname is probably either Const NAME or Class NAME. I dumped several things and found that there’s always NODE_SCOPE in nd_body. Since its member nd_tbl holds a local variable table and its name is similar to struct SCOPE, it appears certain that this NODE_SCOPE plays an important role in creating a local variable scope.

NODE_MODULE

Let’s examine the handler of NODE_MODULE in rb_eval(). The parts that are not close to the main line, such as ruby_raise() and error handling, were cut drastically. So far there have been 200 pages of such cutting work, so it has already become unnecessary to show the original code.

▼ rb_eval() – NODE_MODULE (simplified)

    case NODE_MODULE:
      {
          VALUE module;

          if (rb_const_defined_at(ruby_class, node->nd_cname)) {
              /* just obtain the already created module */
              module = rb_const_get(ruby_class, node->nd_cname);
          }
          else {
              /* create a new module and set it into the constant */
              module = rb_define_module_id(node->nd_cname);
              rb_const_set(ruby_cbase, node->nd_cname, module);
              rb_set_class_path(module,ruby_class,rb_id2name(node->nd_cname));
          }

          result = module_setup(module, node->nd_body);
      }
      break;

First, we’d like to make sure the module is nested and defined on (the module held by) ruby_class. We can understand it from the fact that it calls rb_const_xxxx() on ruby_class. ruby_cbase appears just once, but it is usually identical to ruby_class, so we can ignore it. Even if they are different, it rarely causes a problem.

The first half branches with if because it needs to check whether the module has already been defined. This is because, in Ruby, we can make “additional” definitions on the same module any number of times.

    module M
      def a    # M#a is defined
      end
    end
    module M   # add a definition (not re-defining or overwriting)
      def b    # M#b is defined
      end
    end

In this program, the two methods, a and b, will be defined on the module M.

In this case, at the second definition of M the module M has already been set to the constant, so just obtaining and using it is sufficient. If the constant M does not exist yet, it means this is the first definition and the module is created (by rb_define_module_id()).
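That the second module statement reuses the existing module rather than creating a new one can be confirmed by comparing object identities. A small sketch, assuming M has not been defined elsewhere in the same program; first_id and second_id are illustrative names.

```ruby
module M
  def a; :a; end
end
first_id = M.object_id

module M            # same constant: the existing module is obtained and extended
  def b; :b; end
end
second_id = M.object_id

methods_on_m = M.instance_methods(false).sort
```

The identity is unchanged across the two statements and both methods end up on the one module.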

Lastly, module_setup() is the function executing the body of a module statement. Not only module statements but also class statements and singleton class statements are executed by module_setup(). This is the reason why I said “all of these three types of statements are similar things”. For now, I’d like you to note that node->nd_body (NODE_SCOPE) is passed as an argument.

module_setup

For the module, class and singleton class statements, module_setup() executes their bodies. Finally, the Ruby stack manipulations will appear in large amounts.

▼ module_setup()

    static VALUE
    module_setup(module, n)
        VALUE module;
        NODE *n;
    {
        NODE * volatile node = n;
        int state;
        struct FRAME frame;
        VALUE result;               /* OK */
        TMP_PROTECT;

        frame = *ruby_frame;
        frame.tmp = ruby_frame;
        ruby_frame = &frame;

        PUSH_CLASS();
        ruby_class = module;
        PUSH_SCOPE();
        PUSH_VARS();

        /* (A) ruby_scope->local_vars initialization */
        if (node->nd_tbl) {
            VALUE *vars = TMP_ALLOC(node->nd_tbl[0]+1);
            *vars++ = (VALUE)node;
            ruby_scope->local_vars = vars;
            rb_mem_clear(ruby_scope->local_vars, node->nd_tbl[0]);
            ruby_scope->local_tbl = node->nd_tbl;
        }
        else {
            ruby_scope->local_vars = 0;
            ruby_scope->local_tbl  = 0;
        }

        PUSH_CREF(module);
        ruby_frame->cbase = (VALUE)ruby_cref;
        PUSH_TAG(PROT_NONE);
        if ((state = EXEC_TAG()) == 0) {
            if (trace_func) {
                call_trace_func("class", ruby_current_node, ruby_class,
                                ruby_frame->last_func,
                                ruby_frame->last_class);
            }
            result = rb_eval(ruby_class, node->nd_next);
        }
        POP_TAG();
        POP_CREF();
        POP_VARS();
        POP_SCOPE();
        POP_CLASS();

        ruby_frame = frame.tmp;
        if (trace_func) {
            call_trace_func("end", ruby_last_node, 0,
                            ruby_frame->last_func, ruby_frame->last_class);
        }
        if (state) JUMP_TAG(state);

        return result;
    }

(eval.c)

This is too big to read all in one gulp. Let’s cut the parts that seem unnecessary.

First, the parts around trace_func can be deleted unconditionally.

We can see the idioms related to tags. Let’s simplify them by expressing them with Ruby’s ensure.

Immediately after the start of the function, the argument n is purposefully assigned to the local variable node, but volatile is attached to node and it is never assigned after that, so this is to prevent it from being garbage collected. If we assume that the argument was node from the beginning, it would not change the meaning.

In the first half of the function, there’s a part manipulating ruby_frame in a complicated way. It is obviously paired with the part ruby_frame = frame.tmp in the last half. We’ll focus on this part later, but for the time being it can be considered as a push and pop of ruby_frame.

Plus, it seems that the code (A) can be, as commented, summarized as the initialization of ruby_scope->local_vars. This will be discussed later.

Consequently, it could be summarized as follows:

▼ module_setup (simplified)

    static VALUE
    module_setup(module, node)
        VALUE module;
        NODE *node;
    {
        struct FRAME frame;
        VALUE result;

        push FRAME
        PUSH_CLASS();
        ruby_class = module;
        PUSH_SCOPE();
        PUSH_VARS();
        ruby_scope->local_vars initialization
        PUSH_CREF(module);
        ruby_frame->cbase = (VALUE)ruby_cref;
        begin
            result = rb_eval(ruby_class, node->nd_next);
        ensure
            POP_TAG();
            POP_CREF();
            POP_VARS();
            POP_SCOPE();
            POP_CLASS();
            pop FRAME
        end
        return result;
    }

It does rb_eval() with node->nd_next, so it’s certain that this is the code of the module body. The problems are about the others. There are 5 points to see.

    Things that occur on PUSH_SCOPE() PUSH_VARS()
    How the local variable space is allocated
    The effect of PUSH_CLASS
    The relationship between ruby_cref and ruby_frame->cbase
    What is done by manipulating ruby_frame

Let’s investigate them in order.

Creating a local variable scope

PUSH_SCOPE() pushes a local variable space and PUSH_VARS() pushes a block local variable space, thus a new local variable scope is created by these two. Let’s examine the contents of these macros and what is done.

▼ PUSH_SCOPE() POP_SCOPE()

    #define PUSH_SCOPE() do {               \
        volatile int _vmode = scope_vmode;  \
        struct SCOPE * volatile _old;       \
        NEWOBJ(_scope, struct SCOPE);       \
        OBJSETUP(_scope, 0, T_SCOPE);       \
        _scope->local_tbl = 0;              \
        _scope->local_vars = 0;             \
        _scope->flags = 0;                  \
        _old = ruby_scope;                  \
        ruby_scope = _scope;                \
        scope_vmode = SCOPE_PUBLIC

    #define POP_SCOPE()                                      \
        if (ruby_scope->flags & SCOPE_DONT_RECYCLE) {        \
           if (_old) scope_dup(_old);                        \
        }                                                    \
        if (!(ruby_scope->flags & SCOPE_MALLOC)) {           \
            ruby_scope->local_vars = 0;                      \
            ruby_scope->local_tbl  = 0;                      \
            if (!(ruby_scope->flags & SCOPE_DONT_RECYCLE) && \
                ruby_scope != top_scope) {                   \
                rb_gc_force_recycle((VALUE)ruby_scope);      \
            }                                                \
        }                                                    \
        ruby_scope->flags |= SCOPE_NOSTACK;                  \
        ruby_scope = _old;                                   \
        scope_vmode = _vmode;                                \
    } while (0)

(eval.c)

As with tags, SCOPEs also form a stack by being synchronized with the machine stack. What differs slightly is that the spaces of the stack frames are allocated in the heap; the machine stack is used only in order to create the stack structure (Fig.5).

Fig.5: The machine stack and the SCOPE stack

Additionally, the flags like SCOPE_something that repeatedly appear in the macros cannot be explained until I finish talking about the form in which each stack frame is remembered and about blocks. Thus, these will be discussed all at once in Chapter 16: Blocks.

Allocating the local variable space

As I mentioned many times, the local variable scope is represented by struct SCOPE. But struct SCOPE is literally a “scope” and it does not have the real body to store local variables. To put it more precisely, it has the pointer to a space but there’s still no array at the place it points to. The following part of module_setup prepares the array.

▼ The preparation of the local variable slots

    if (node->nd_tbl) {
        VALUE *vars = TMP_ALLOC(node->nd_tbl[0]+1);
        *vars++ = (VALUE)node;
        ruby_scope->local_vars = vars;
        rb_mem_clear(ruby_scope->local_vars, node->nd_tbl[0]);
        ruby_scope->local_tbl = node->nd_tbl;
    }
    else {
        ruby_scope->local_vars = 0;
        ruby_scope->local_tbl  = 0;
    }

(eval.c)

The TMP_ALLOC() at the beginning will be described in the next section. To put it shortly, it is “an alloca that is assured to allocate on the stack (therefore, we do not need to worry about GC)”.

node->nd_tbl in fact holds the local variable name table that appeared in Chapter 12: Syntax tree construction. It means that nd_tbl[0] contains the table size and the rest is an array of ID. This table is directly preserved in local_tbl of SCOPE, and local_vars is allocated to store the local variable values. Because they are confusing, it’s a good idea to write some comments such as “this is the variable name”, “this is the value”. The one with tbl is for the names.

Fig.6: ruby_scope->local_vars

Where is this node used? I examined all the local_vars members but could not find any access to index -1 in eval.c. Expanding the range of files to investigate, I found the access in gc.c.

▼ rb_gc_mark_children() – T_SCOPE

    case T_SCOPE:
      if (obj->as.scope.local_vars &&
            (obj->as.scope.flags & SCOPE_MALLOC)) {
          int n = obj->as.scope.local_tbl[0]+1;
          VALUE *vars = &obj->as.scope.local_vars[-1];

          while (n--) {
              rb_gc_mark(*vars);
              vars++;
          }
      }
      break;

(gc.c)

Apparently, this is a mechanism to protect node from GC. But why is it necessary to mark it here? node is purposefully stored into the volatile local variable, so it would not be garbage-collected during the execution of module_setup().

Honestly speaking, for a while I was thinking it might merely be a mistake, but it turned out that it’s actually very important. The issue is this line, two lines further down:

▼ ruby_scope->local_tbl

    ruby_scope->local_tbl = node->nd_tbl;

(eval.c)

The local variable name table prepared by the parser is directly used. When is this table freed? It’s the time when the node is no longer referred to from anywhere. Then, when should node be freed? It’s the time after the SCOPE assigned on this line has disappeared completely. Then, when is that?

SCOPE sometimes persists longer than the statement that causes its creation. As will be discussed in Chapter 16: Blocks, if a Proc object is created, it refers to the SCOPE. Thus, even if module_setup() has finished, the SCOPE created there is not necessarily no longer in use. That’s why it’s not sufficient that node is only referred to from (the stack frame of) module_setup(). It must be referred to “directly” from SCOPE.

On the other hand, the volatile node of the local variable cannot be removed. Without it, node is floating in the air until it is assigned to local_vars.

However then, local_vars of SCOPE is not safe, is it? TMP_ALLOC() is, as I mentioned, an allocation on the stack, so it becomes invalid at the time module_setup() ends. In fact, at the moment when a Proc is created, the allocation method is abruptly switched to malloc(). Details will be described in Chapter 16: Blocks.

Lastly, rb_mem_clear() seems to be zero-filling, but actually it is Qnil-filling of an array of VALUE (array.c). By this, all defined local variables are initialized as nil.
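The nil initialization is observable in Ruby: a local variable that the parser has registered, but whose assignment never ran, reads as nil rather than raising an error. A minimal sketch; never_assigned is a hypothetical name.

```ruby
if false
  never_assigned = 1   # never executed, but the parser registers the variable
end

value = never_assigned # the slot exists and was filled with nil
```

Reading the variable succeeds because its slot was allocated along with the rest of the scope, and it holds nil because nothing else was ever stored there.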

TMP_ALLOC

Next, let’s read TMP_ALLOC, which allocates the local variable space. This macro is actually paired with TMP_PROTECT, which sits silently at the beginning of module_setup(). Its typical usage is this:

    VALUE *ptr;
    TMP_PROTECT;

    ptr = TMP_ALLOC(size);

The reason why TMP_PROTECT is in the place for the local variable definitions is that … Let’s see its definition.

▼ TMP_ALLOC()

    #ifdef C_ALLOCA
    # define TMP_PROTECT NODE * volatile tmp__protect_tmp=0
    # define TMP_ALLOC(n)                                           \
        (tmp__protect_tmp = rb_node_newnode(NODE_ALLOCA,             \
                                 ALLOC_N(VALUE,n), tmp__protect_tmp, n), \
         (void*)tmp__protect_tmp->nd_head)
    #else
    # define TMP_PROTECT typedef int foobazzz
    # define TMP_ALLOC(n) ALLOCA_N(VALUE,n)
    #endif

(eval.c)

… it is because it defines a local variable.

As described in Chapter 5: Garbage collection, in the environment of #ifdef C_ALLOCA (that is, where the native alloca() does not exist) malloc() is used to emulate alloca(). However, the arguments of a method are obviously VALUEs, and the GC could not find a VALUE if it is stored in the heap. Therefore, it is ensured that the GC can find it through a NODE.

Fig.7: anchor the space to the stack through NODE

On the contrary, in the environment with the true alloca(), we can naturally use alloca() and there’s no need to use TMP_PROTECT. Thus, a harmless statement is arbitrarily written.

By the way, why are they so determined to use alloca()? Merely because "alloca() is faster than malloc()", they say. One might think such a tiny difference is not worth caring about, but because the core of the evaluator is the biggest bottleneck of ruby, … the same as above.

Changing the place to define methods on.

The value at the top of the ruby_class stack is the class on which methods are currently defined. Conversely, if one pushes a value onto ruby_class, it changes the class to define methods on. This is exactly what is necessary for a class statement. Therefore, it is also necessary to do PUSH_CLASS() in module_setup(). Here is the code for it:

    PUSH_CLASS();
    ruby_class = module;
         :
         :
    POP_CLASS();

Why is there an assignment to ruby_class after doing PUSH_CLASS()? We can understand it surprisingly easily by looking at the definition.

▼ PUSH_CLASS() POP_CLASS()

    #define PUSH_CLASS() do { \
        VALUE _class = ruby_class

    #define POP_CLASS() ruby_class = _class; \
    } while (0)

(eval.c)

Because ruby_class is not modified by PUSH_CLASS itself, nothing is actually pushed until the value is set by hand. Thus, these two are closer to “save and restore” than to “push and pop”.
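The “save and restore” reading can be sketched in Ruby. This is only an illustration: the global $current_class and the method with_current_class are hypothetical stand-ins for ruby_class and the PUSH_CLASS()/POP_CLASS() pair, and nothing here is taken from eval.c.

```ruby
# Hypothetical stand-in for the evaluator's global ruby_class.
$current_class = :Object

# Mirrors: PUSH_CLASS(); ruby_class = klass; ... ; POP_CLASS();
def with_current_class(klass)
  saved = $current_class     # PUSH_CLASS(): only remembers the old value
  $current_class = klass     # the assignment that has to be done "by hand"
  yield
ensure
  $current_class = saved     # POP_CLASS(): puts the old value back
end

with_current_class(:Outer) do
  p $current_class
  with_current_class(:Inner) { p $current_class }
  p $current_class
end
p $current_class
```

Nesting works only because each call keeps its own saved value in a local variable, which is the same role the block-local _class plays in the C macros.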

You might think that it could be a cleaner macro if the class were passed as the argument of PUSH_CLASS()… That is absolutely true, but because there are some places where the class cannot be obtained before pushing, it is done this way.

Nesting Classes

ruby_cref represents the class nesting information at runtime. Therefore, it is naturally predicted that ruby_cref will be pushed on module statements and class statements. In module_setup(), it is pushed as follows:

    PUSH_CREF(module);
    ruby_frame->cbase = (VALUE)ruby_cref;
       :
       :
    POP_CREF();

Here, module is the module being defined. Let’s also see the definitions of PUSH_CREF() and POP_CREF().

▼ PUSH_CREF() POP_CREF()

    #define PUSH_CREF(c) \
        ruby_cref = rb_node_newnode(NODE_CREF,(c),0,ruby_cref)
    #define POP_CREF() ruby_cref = ruby_cref->nd_next

(eval.c)

Unlike PUSH_SCOPE and the like, there are no complicated techniques here and it is very easy to deal with. Though it would not be good either if there were nothing of the kind at all.
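In other words, the two macros push and pop the head of a singly linked list. A minimal Ruby model, assuming a hypothetical CrefNode struct in place of NODE_CREF (the names push_cref, pop_cref and cref_nesting are mine, not ruby’s):

```ruby
# Hypothetical model of a NODE_CREF cell: a class plus a link to the outer cell.
CrefNode = Struct.new(:klass, :next_node)

# PUSH_CREF(c): allocate a new cell in front of the current list.
def push_cref(cref, klass)
  CrefNode.new(klass, cref)
end

# POP_CREF(): step back to the enclosing cell. The popped cell is left intact,
# so anything still pointing at it keeps seeing the old nesting.
def pop_cref(cref)
  cref.next_node
end

# Walk the list from the innermost class outward.
def cref_nesting(cref)
  names = []
  while cref
    names << cref.klass
    cref = cref.next_node
  end
  names
end

top   = CrefNode.new(:Object, nil)
outer = push_cref(top, :Outer)
inner = push_cref(outer, :Inner)
p cref_nesting(inner)             # nesting as seen inside Outer::Inner
p cref_nesting(pop_cref(inner))   # nesting after leaving Inner
```

Because popping never modifies a cell, a method that remembered inner at definition time can still walk the full nesting later.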

The remaining unsolved question is the meaning of ruby_frame->cbase. It is the information used to refer to a class variable or a constant from the current FRAME. Details will be discussed in the last section of this chapter.

Replacing frames

Lastly, let’s focus on the manipulation of ruby_frame. The first thing is its definition:

    struct FRAME frame;

It is not a pointer. This means that the entire FRAME is allocated on the stack. Both the management structure of the Ruby stack and the local variable space are on the stack, but in the case of FRAME the entire struct is stored on the stack. The extreme consumption of the machine stack by ruby is the fruit of these “small techniques” piling up.

Next, let’s look at where several things are done with frame.

    frame = *ruby_frame;      /* copy the entire struct */
    frame.tmp = ruby_frame;   /* protect the original FRAME from GC */
    ruby_frame = &frame;      /* replace ruby_frame */
           :
           :
    ruby_frame = frame.tmp;   /* restore */

That is, ruby_frame seems to be temporarily replaced (not pushed). Why is it doing such a thing?

I described FRAME as being “pushed on method calls”, but to be more precise, it is the stack frame that represents “the main environment in which a Ruby program is executed”. You can infer this from, for instance, ruby_frame->cbase, which appeared previously. last_func, which is “the last called method name”, also suggests it.

Then, why is FRAME not straightforwardly pushed? It is because this is a place where pushing a FRAME is not allowed. One would like to push a FRAME, but if a FRAME is pushed, it will appear in the backtrace of the program when an exception occurs. A backtrace is the thing displayed like the following:

    % ruby t.rb
    t.rb:11:in `c': some error occured (ArgumentError)
            from t.rb:7:in `b'
            from t.rb:3:in `a'
            from t.rb:14

But module statements and class statements are not method calls, so it is not desirable for them to appear here. That’s why the frame is “replaced” instead of “pushed”.

The method definition

As the next topic after the module definitions, let’s look at the method definitions.

Investigation

▼ The Source Program

    def m(a, b, c)
      nil
    end

▼ Its Syntax Tree

    NODE_DEFN
    nd_mid  = 9617 (m)
    nd_noex = 2 (NOEX_PRIVATE)
    nd_defn:
        NODE_SCOPE
        nd_rval = (null)
        nd_tbl = 5 [ _ ~ a b c ]
        nd_next:
            NODE_ARGS
            nd_cnt  = 3
            nd_rest = -1
            nd_opt = (null)
            NODE_NIL

I dumped several things and found that there’s always a NODE_SCOPE in nd_defn. NODE_SCOPE is, as we’ve seen with the module statements, the node that stores the information needed to push a local variable scope.

NODE_DEFN

Subsequently, we will examine the corresponding code of rb_eval(). This part contains a lot of error handling and is tedious, so it is all omitted again. The way of omitting is as usual: deleting every part that directly or indirectly calls rb_raise(), rb_warn() or rb_warning().

▼ rb_eval() − NODE_DEFN (simplified)

    NODE *defn;
    int noex;

    if (SCOPE_TEST(SCOPE_PRIVATE) || node->nd_mid == init) {
        noex = NOEX_PRIVATE;                 (A)
    }
    else if (SCOPE_TEST(SCOPE_PROTECTED)) {
        noex = NOEX_PROTECTED;               (B)
    }
    else if (ruby_class == rb_cObject) {
        noex =  node->nd_noex;               (C)
    }
    else {
        noex = NOEX_PUBLIC;                  (D)
    }

    defn = copy_node_scope(node->nd_defn, ruby_cref);
    rb_add_method(ruby_class, node->nd_mid, defn, noex);
    result = Qnil;

In the first half there are words like private and protected, so it is probably related to visibility. noex, which is used in the names of the flags, seems to stand for NOde EXposure. Let’s examine the if statements in order.

(A) SCOPE_TEST() is a macro to check whether the flag given as its argument is set in scope_vmode. Therefore, the first half of this conditional means “is it a private scope?”. The latter half means “it’s private if this is defining initialize”. The method initialize, which initializes an object, unconditionally becomes private.

(B) It is protected if the scope is protected (not surprisingly). My feeling is that there are few cases where protected is required in Ruby.

(C) This is a bug. I found this just before the submission of this book, so I couldn’t fix it beforehand. In the latest code this part is probably already removed. The original intention is to force the methods defined at top level to be private.

(D) If none of the above conditions applies, it is public.
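Cases (A), (B) and (D) can be observed from the Ruby side. The class VisibilityDemo below is a made-up example, and the result is what I would expect on a current Ruby rather than something checked against the ruby version this book reads. Case (C) is left out because it concerns top-level definitions.

```ruby
class VisibilityDemo
  def initialize; end   # (A) initialize is always private

  def pub; end          # (D) nothing special: public

  private
  def priv; end         # (A) defined while the scope is private

  protected
  def prot; end         # (B) defined while the scope is protected
end

p VisibilityDemo.private_instance_methods(false).sort
p VisibilityDemo.protected_instance_methods(false)
p VisibilityDemo.public_instance_methods(false)
```

Passing false to the reflection methods restricts the answer to methods defined in VisibilityDemo itself, so inherited methods do not clutter the result.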

Actually, there is nothing worth caring about up to here. The important part is the next two lines.

    defn = copy_node_scope(node->nd_defn, ruby_cref);
    rb_add_method(ruby_class, node->nd_mid, defn, noex);

copy_node_scope() is a function to copy (only) the NODE_SCOPE attached to the top of the method body. It is important that ruby_cref is passed… but details will be described soon.

After copying, the definition is finished by adding it with rb_add_method(). The place to define on is of course ruby_class.

copy_node_scope()

copy_node_scope() is called from only two places: the method definition (NODE_DEFN) and the singleton method definition (NODE_DEFS) in rb_eval(). Therefore, looking at these two is sufficient to see how it is used. Plus, the usages at these two places are almost the same.

▼ copy_node_scope()

    static NODE*
    copy_node_scope(node, rval)
        NODE *node;
        VALUE rval;
    {
        NODE *copy = rb_node_newnode(NODE_SCOPE,0,rval,node->nd_next);

        if (node->nd_tbl) {
            copy->nd_tbl = ALLOC_N(ID, node->nd_tbl[0]+1);
            MEMCPY(copy->nd_tbl, node->nd_tbl, ID, node->nd_tbl[0]+1);
        }
        else {
            copy->nd_tbl = 0;
        }
        return copy;
    }

(eval.c)

I mentioned that the argument rval is the class nesting information (ruby_cref) at the time the method is defined. Apparently it is named rval because it will be set to nd_rval.

The main if statement copies nd_tbl of NODE_SCOPE, in other words the local variable name table. The +1 at ALLOC_N allocates the additional space for nd_tbl[0]. As we’ve seen in Part 2, nd_tbl[0] holds the local variable count, which was “the actual length of nd_tbl – 1”.

To summarize, copy_node_scope() makes a copy of the NODE_SCOPE which is the header of the method body. In addition, nd_rval is set, and it is the ruby_cref (the class nesting information) at the time the method is defined. This information will be used later when referring to constants or class variables.

rb_add_method()

The next thing is rb_add_method(), the function that registers a method entry.

▼ rb_add_method()

    void
    rb_add_method(klass, mid, node, noex)
        VALUE klass;
        ID mid;
        NODE *node;
        int noex;
    {
        NODE *body;

        if (NIL_P(klass)) klass = rb_cObject;
        if (ruby_safe_level >= 4 &&
            (klass == rb_cObject || !OBJ_TAINTED(klass))) {
            rb_raise(rb_eSecurityError, "Insecure: can't define method");
        }
        if (OBJ_FROZEN(klass)) rb_error_frozen("class/module");
        rb_clear_cache_by_id(mid);
        body = NEW_METHOD(node, noex);
        st_insert(RCLASS(klass)->m_tbl, mid, body);
    }

(eval.c)

NEW_METHOD() is a macro to create a NODE. rb_clear_cache_by_id() is a function to manipulate the method cache. This will be explained in the next chapter “Method”.

Let’s look at the syntax tree which is eventually stored in m_tbl of a class. I prepared nodedump-method for this kind of purpose. (nodedump-method: comes with nodedump. nodedump is tools/nodedump.tar.gz on the attached CD-ROM)

    % ruby -e '
    class C
      def m(a)
        puts "ok"
      end
    end
    require "nodedump-method"
    NodeDump.dump C, :m        # dump the method m of the class C
    '
    NODE_METHOD
    nd_noex = 0 (NOEX_PUBLIC)
    nd_cnt = 0
    nd_body:
        NODE_SCOPE
        nd_rval = Object <- C
        nd_tbl = 3 [ _ ~ a ]
        nd_next:
            NODE_ARGS
            nd_cnt  = 1
            nd_rest = -1
            nd_opt = (null)
            U⽛S頏著

    ** unhandled**

There is NODE_METHOD at the top and, next, the NODE_SCOPE previously copied by copy_node_scope(). These probably represent the header of a method. I dumped several things and there was no NODE_SCOPE with the methods defined in C, thus it seems to indicate that the method is defined at the Ruby level.

Additionally, the parameter variable name (a) appears in nd_tbl of NODE_SCOPE. I mentioned that parameter variables are equivalent to local variables, and this briefly implies it.

I’ll omit the explanation of NODE_ARGS here because it will be described in the next chapter “Method”.

Lastly, there is no need to care much about the nd_cnt of NODE_METHOD this time. It is used when dealing with alias.

Assignment and Reference

Come to think of it, most of the stacks are used to realize a variety of variables. We have learned how to push the various stacks; this time let’s examine the code that references variables.

Local variable

All the information necessary to assign or refer to local variables has already appeared, so you can probably predict it. There are the following two points:

the local variable scope is an array pointed to by ruby_scope->local_vars

the correspondence between each local variable name and each array index has already been resolved at the parser level.

Therefore, the code for the local variable reference node NODE_LVAR is as follows:

▼ rb_eval() − NODE_LVAR

    case NODE_LVAR:
      if (ruby_scope->local_vars == 0) {
          rb_bug("unexpected local variable");
      }
      result = ruby_scope->local_vars[node->nd_cnt];
      break;

(eval.c)

It goes without saying, but node->nd_cnt is the value that local_cnt() of the parser returns.
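The division of labor between parser and evaluator can be modeled in a few lines of Ruby. This is a sketch under assumptions: LocalTable and eval_lvar are hypothetical names, and the model ignores the two reserved slots (_ and ~) that the real table always starts with.

```ruby
# Hypothetical "parser side": give each variable name a slot exactly once.
class LocalTable
  def initialize
    @names = []
  end

  # Roughly the role of local_cnt(): return the slot for a name,
  # registering the name the first time it is seen.
  def local_cnt(name)
    @names.index(name) || (@names << name).size - 1
  end

  def size
    @names.size
  end
end

# Hypothetical "evaluator side": names are gone, only array indexing is left.
def eval_lvar(local_vars, index)
  raise "unexpected local variable" if local_vars.nil?
  local_vars[index]
end

table = LocalTable.new
a_idx = table.local_cnt(:a)
b_idx = table.local_cnt(:b)
local_vars = Array.new(table.size)   # every slot starts as nil
local_vars[b_idx] = 42
p eval_lvar(local_vars, a_idx)
p eval_lvar(local_vars, b_idx)
```

Since the name lookup happens once at parse time, a variable reference at run time costs only one array access, which is the point of the NODE_LVAR code above.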

Constant

Complete Specification

In Chapter 6: Variables and constants, I talked about the form in which constants are stored and about their API. Constants belong to classes and are inherited in the same way as methods. As for their actual representation, they are registered in iv_tbl of struct RClass together with instance variables and class variables.

The search path of a constant is first the outer classes and second the superclasses; however, rb_const_get() only searches the superclasses. Why? To answer this question, I need to reveal the last specification of constants. Take a look at the following code:

    class A
      C = 5
      def A.new
        puts C
        super
      end
    end

A.new is a singleton method of A, so its class is the singleton class (A). If it were interpreted by following the rule, it could not obtain the constant C, which belongs to A.

But because it is written so close by, wanting to refer to the constant C is human nature. Therefore, such a reference is possible in Ruby. It can be said that this specification reflects the characteristic of Ruby that “the emphasis is on the appearance of the source code”.

If I generalize this rule, when referring to a constant from inside a method, the search starts from the place where the method definition is “written” and refers to the constants of the outer classes.

And “the class where the method is written” depends on its context, thus it cannot be handled without information from both the parser and the evaluator. This is why rb_const_get() did not have the search path of the outer classes.
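The rule “start from where the definition is written” can be checked with a small experiment. LexicalConstDemo and its two methods are made-up names, and the behavior described is what I expect from a current Ruby, which as far as I know still follows this rule.

```ruby
class LexicalConstDemo
  LIMIT = 5

  # Written inside the class body: LIMIT is found through the enclosing
  # class, even though this is a singleton method.
  def LexicalConstDemo.inside
    LIMIT
  end
end

# The same kind of singleton method, but written outside the class body.
# The enclosing scope is now the top level, where LIMIT is not visible.
def LexicalConstDemo.outside
  LIMIT
end

p LexicalConstDemo.inside
begin
  LexicalConstDemo.outside
rescue NameError => e
  puts "NameError: #{e.message}"
end
```

Both methods end up in the same singleton class, so the only thing that distinguishes them is the position of the def in the source text.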

cbase

Then, let’s look at the code that refers to constants including the outer classes. Ordinary constant references, to which :: is not attached, become NODE_CONST in the syntax tree. The corresponding code in rb_eval() is…

▼ rb_eval() − NODE_CONST

    case NODE_CONST:
      result = ev_const_get(RNODE(ruby_frame->cbase), node->nd_vid,self);
      break;

(eval.c)

First, nd_vid appears to stand for Variable ID and probably means a constant name. And ruby_frame->cbase is “the class where the method definition is written”. The value is set when invoking the method, thus the code that sets it has not appeared yet. The value to be set comes from the nd_rval that appeared in copy_node_scope() of the method definition. I’d like you to go back a little and check that the member holds the ruby_cref at the time the method is defined.

This means, first, the ruby_cref link is built when defining a class or a module. Assume that the just defined class is C (Fig. 8-1).

Defining the method m (this is probably C#m) here, the current ruby_cref is memorized by the method entry (Fig. 8-2).

After that, when the class statement finishes, ruby_cref starts to point to another node, but node->nd_rval naturally continues to point to the same thing (Fig. 8-3).

Then, when invoking the method C#m, node->nd_rval is fetched and stored into the just pushed ruby_frame->cbase (Fig. 8-4).

…This is the mechanism. Complicated.

Fig. 8: CREF Transfer

ev_const_get()

Now, let’s go back to the code of NODE_CONST. Since only ev_const_get() is left, we’ll look at it.

▼ ev_const_get()

    static VALUE
    ev_const_get(cref, id, self)
        NODE *cref;
        ID id;
        VALUE self;
    {
        NODE *cbase = cref;
        VALUE result;

        while (cbase && cbase->nd_next) {
            VALUE klass = cbase->nd_clss;

            if (NIL_P(klass)) return rb_const_get(CLASS_OF(self), id);
            if (RCLASS(klass)->iv_tbl &&
                st_lookup(RCLASS(klass)->iv_tbl, id, &result)) {
                return result;
            }
            cbase = cbase->nd_next;
        }
        return rb_const_get(cref->nd_clss, id);
    }

(eval.c)

((According to the errata, the description of ev_const_get() was wrong. I omit this part for now.))

Class variable

What class variables refer to is also ruby_cref. Needless to say, unlike the constants, which search over the outer classes one after another, it uses only the first element. Let’s look at the code of NODE_CVAR, which is the node to refer to a class variable.

What is cvar_cbase()? As cbase is attached, it is probably related to ruby_frame->cbase, but how do they differ? Let’s look at it.

▼ cvar_cbase()

    static VALUE
    cvar_cbase()
    {
        NODE *cref = RNODE(ruby_frame->cbase);

        while (cref && cref->nd_next &&
               FL_TEST(cref->nd_clss, FL_SINGLETON)) {
            cref = cref->nd_next;
            if (!cref->nd_next) {
                rb_warn("class variable access from toplevel singleton method");
            }
        }
        return cref->nd_clss;
    }

(eval.c)

It seems to traverse cbase up to a class that is not a singleton class. This feature was added to deal with the following kind of code:

    class C                           class C
      @@cvar = 1                        @@cvar = 1
      class << C                        def C.m
        def m                             @@cvar
          @@cvar                        end
        end                             def C.m2
        def m2                            @@cvar + @@cvar
          @@cvar + @@cvar               end
        end                           end
      end
    end

Both the left and right code end up defining the same methods, but if you write in the way of the right side, it is tedious to write the class name repeatedly as the number of methods increases. Therefore, when defining multiple singleton methods, many people choose to write in the way of the left side, using the singleton class definition statement to bundle them.

However, these two differ in the value of ruby_cref. The one using the singleton class definition has ruby_cref=(C) and the other one, defining singleton methods separately, has ruby_cref=C. This may cause class variables to refer to different places, which is not convenient.

Therefore, assuming it is a rare case to define class variables on singleton classes, it skips over singleton classes. This reflects again that the emphasis is on usability rather than consistency.

And in the case of a constant reference, since it searches all of the outer classes, C is included in the search path either way, so there’s no problem. Plus, as for an assignment, since it cannot be written inside methods in the first place, it is also not related.
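The two writing styles can be mixed in one class to confirm that they see the same class variable. CvarDemo is a made-up class, and the result is what I expect on a current Ruby, which to my knowledge also skips singleton classes here.

```ruby
class CvarDemo
  @@cvar = 1

  class << CvarDemo        # singleton class definition statement
    def m
      @@cvar
    end
  end

  def CvarDemo.m2          # singleton method defined separately
    @@cvar + @@cvar
  end
end

p CvarDemo.m
p CvarDemo.m2
```

If singleton classes were not skipped, m would look for @@cvar on the singleton class of CvarDemo and fail, while m2 would keep working.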

Multiple Assignment

If someone asked “where is the most complicated specification of Ruby?”, I would instantly answer that it is multiple assignment. It is even impossible to understand the big picture of multiple assignment, and I have an account of why I think so. In short, the specification of multiple assignment was defined without even the slightest intention of making the whole specification well-organized. The basis of the specification is always “the behavior which seems convenient in several typical use cases”. This can be said about the entire Ruby, but particularly about multiple assignment.

Then, how can we avoid getting lost in the jungle of code? This is similar to reading the stateful scanner: the point is not to try to see the whole picture. There is no whole picture in the first place, so we could not see it. Cutting the code into blocks, such as “this code is written for this specification, that code is written for that specification, …”, and understanding the correspondences one by one in this manner is the only way.

But this book is about understanding the overall structure of ruby and is not “Advanced Ruby Programming”. Thus, dealing with very tiny things is not fruitful. So here, we only think about the basic structure of multiple assignment and the very simple “multiple-to-multiple” case.

First, following the standard procedure, let’s start with the syntax tree.

▼ The Source Program

    a, b = 7, 8

▼ Its Syntax Tree

    NODE_MASGN
    nd_head:
        NODE_ARRAY [
        0:
            NODE_LASGN
            nd_cnt = 2
            nd_value:
        1:
            NODE_LASGN
            nd_cnt = 3
            nd_value:
        ]
    nd_value:
        NODE_REXPAND
        nd_head:
            NODE_ARRAY [
            0:
                NODE_LIT
                nd_lit = 7:Fixnum
            1:
                NODE_LIT
                nd_lit = 8:Fixnum
            ]

Both the left-hand and right-hand sides are lists of NODE_ARRAY, and in addition there’s NODE_REXPAND on the right side. REXPAND may be “Right value EXPAND”. We are curious about what this node is doing. Let’s see.

▼ rb_eval() − NODE_REXPAND

    case NODE_REXPAND:
      result = avalue_to_svalue(rb_eval(self, node->nd_head));
      break;

(eval.c)

You can ignore avalue_to_svalue(). NODE_ARRAY is evaluated by rb_eval() (because it is the node of the array literal), turned into a Ruby array and returned. So, before the left-hand side is handled, everything on the right-hand side is evaluated. This enables even the following code:

    a, b = b, a    # swap variables in one line
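Here is the same thing as a runnable snippet. The method names swap_demo and rotate_demo are only for illustration; the second method shows that the idea extends to three variables for the same reason.

```ruby
def swap_demo
  a, b = 1, 2
  a, b = b, a      # the right-hand side [2, 1] is built before any assignment
  [a, b]
end

def rotate_demo
  x, y, z = :first, :second, :third
  x, y, z = y, z, x
  [x, y, z]
end

p swap_demo
p rotate_demo
```

If the assignments were performed one at a time while the right-hand side was still being read, a would already be overwritten when b is assigned, and a temporary variable would be needed.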

Let’s look at NODE_MASGN on the left-hand side.

▼ rb_eval() − NODE_MASGN

    case NODE_MASGN:
      result = massign(self, node, rb_eval(self, node->nd_value),0);
      break;

(eval.c)

Here there is only the evaluation of the right-hand side; the rest is delegated to massign().

massign()

▼ massi ……

    static VALUE
    massign(self, node, val, pcall)
        VALUE self;
        NODE *node;
        VALUE val;
        int pcall;
    {

(eval.c)

I’m sorry this is halfway, but I’d like you to stop and pay attention to the 4th argument. pcall is Proc CALL; it indicates whether or not the function is being used to call a Proc object. Between Proc calls and the others there’s a little difference in the strictness of the check of multiple assignments, so a flag is received to select the check. Obviously, the value is either 0 or 1.

Then, I’d like you to look at the previous code calling massign(): it was pcall=0. Therefore, we can safely assume pcall=0 for the time being and eliminate the variable. That is, when there’s an argument like pcall which slightly changes the behavior, we always need to consider two patterns of scenarios, which is really cumbersome. Even though there’s only one actual function massign(), thinking as if there were two functions, one for pcall=0 and one for pcall=1, makes it much simpler to read.

When writing a program we must avoid duplication as much as possible, but this principle does not apply when reading. If the patterns are limited, copying the code and letting it be redundant is rather the right approach. There are phrases like “optimize for speed” and “optimize for code size”; in this case we’ll “optimize for readability”.

So, assuming pcall=0 and cutting the code as much as possible, the final appearance is as follows:

▼ massign() (simplified)

    static VALUE
    massign(self, node, val  /* , pcall=0 */)
        VALUE self;
        NODE *node;
        VALUE val;
    {
        NODE *list;
        long i = 0, len;

        val = svalue_to_mvalue(val);
        len = RARRAY(val)->len;
        list = node->nd_head;
        /* (A) */
        for (i=0; list && i<len; i++) {
            assign(self, list->nd_head, RARRAY(val)->ptr[i], pcall);
            list = list->nd_next;
        }
        /* (B) */
        if (node->nd_args) {
            if (node->nd_args == (NODE*)-1) {
                /* no check for mere `*' */
            }
            else if (!list && i<len) {
                assign(self, node->nd_args,
                       rb_ary_new4(len-i, RARRAY(val)->ptr+i), pcall);
            }
            else {
                assign(self, node->nd_args, rb_ary_new2(0), pcall);
            }
        }

        /* (C) */
        while (list) {
            i++;
            assign(self, list->nd_head, Qnil, pcall);
            list = list->nd_next;
        }
        return val;
    }

val is the right-hand side value. And there’s the suspicious conversion called svalue_to_mvalue(); since mvalue_to_svalue() appeared previously and svalue_to_mvalue() appears this time, you can infer “it must be converting back”. ((errata: it was avalue_to_svalue() in the previous case. Therefore, it’s hard to infer “converting back”, but you can ignore them anyway.)) Thus, both can be set aside. In the next line, since it uses RARRAY(), you can infer that the right-hand side value is a Ruby Array. Meanwhile, the left-hand side is node->nd_head, which is the value assigned to the local variable list. This list is also a node (NODE_ARRAY).

We’ll look at the code clause by clause.

(A) assign is, as the name suggests, a function to perform a one-to-one assignment. Since the left-hand side is expressed by a node, if it is, for instance, NODE_IASGN (an assignment to an instance variable), it assigns with rb_ivar_set(). So, what it is doing here is matching the length of whichever of list and val is shorter and doing one-to-one assignments. (Fig. 9)

Fig. 9: assign when corresponded

(B) if there are remainders on the right-hand side, turn them into a Ruby array and assign it to (the left-hand side expressed by) node->nd_args.

(C) if there are remainders on the left-hand side, assign nil to all of them.
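The three clauses can be restated as a small Ruby model. It is a sketch only: massign_model is a hypothetical name, the left-hand sides are represented by symbols instead of nodes, the result is returned as a Hash instead of being assigned, and the “mere *” case of clause (B) is not modeled.

```ruby
# targets: names of the ordinary left-hand sides
# rest:    name of the `*var` target, or nil when there is none
# values:  the already evaluated right-hand side array
def massign_model(targets, rest, values)
  result = {}
  i = 0
  # (A) pair up targets and values while both remain
  while i < targets.size && i < values.size
    result[targets[i]] = values[i]
    i += 1
  end
  # (B) leftover values go to the rest target as an array
  if rest
    leftover = (i == targets.size && i < values.size)
    result[rest] = leftover ? values[i..-1] : []
  end
  # (C) leftover targets receive nil
  targets[i..-1].each { |name| result[name] = nil }
  result
end

p massign_model([:a, :b], nil, [7, 8])
p massign_model([:a], :rest, [1, 2, 3])
p massign_model([:a, :b, :c], nil, [1])
```

The three calls correspond to `a, b = 7, 8`, `a, *rest = 1, 2, 3` and `a, b, c = 1` respectively.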

By the way, the procedure of assuming pcall=0 and then cutting the code is very similar to the data flow analysis and constant folding used in the optimization phase of compilers. Therefore, we can probably automate it to some extent.

The original work is Copyright © 2002 - 2004 Minero AOKI.
Translated by Vincent ISAMBART and Clifford Escobar CAOILE
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License


Chapter 15: Methods

In this chapter, I’ll talk about method searching and invoking.

Searching methods

Terminology

In this chapter, both method calls and method definitions are discussed, and a great variety of “arguments” will appear. Therefore, to avoid confusion, let’s strictly define the terms here:

    m(a)          # a is a "normal argument"
    m(*list)      # list is an "array argument"
    m(&block)     # block is a "block argument"

    def m(a)      # a is a "normal parameter"
    def m(a=nil)  # a is an "option parameter", nil is "its default value".
    def m(*rest)  # rest is a "rest parameter"
    def m(&block) # block is a "block parameter"

In short, they are all “arguments” when passing and “parameters” when receiving, and each adjective is attached according to its type.

However, among the above, the “block arguments” and the “block parameters” will be discussed in the next chapter.

Investigation

▼ The Source Program

    obj.method(7,8)

▼ Its Syntax Tree

    NODE_CALL
    nd_mid = 9049 (method)
    nd_recv:
        NODE_VCALL
        nd_mid = 9617 (obj)
    nd_args:
        NODE_ARRAY [
        0:
            NODE_LIT
            nd_lit = 7:Fixnum
        1:
            NODE_LIT
            nd_lit = 8:Fixnum
        ]

The node for a method call is NODE_CALL. nd_args holds the arguments as a list of NODE_ARRAY.

Additionally, as nodes for method calls, there are also NODE_FCALL and NODE_VCALL. NODE_FCALL is for the “method(args)” form, and NODE_VCALL corresponds to method calls in the “method” form, which is the same form as local variables. FCALL and VCALL could actually be integrated into one, but because there’s no need to prepare arguments for VCALL, they are separated from each other only in order to save both time and memory.

Now, let’s look at the handler of NODE_CALL in rb_eval().

▼ rb_eval() − NODE_CALL

    case NODE_CALL:
      {
          VALUE recv;
          int argc; VALUE *argv; /* used in SETUP_ARGS */
          TMP_PROTECT;

          BEGIN_CALLARGS;
          recv = rb_eval(self, node->nd_recv);
          SETUP_ARGS(node->nd_args);
          END_CALLARGS;

          SET_CURRENT_SOURCE();
          result = rb_call(CLASS_OF(recv),recv,node->nd_mid,argc,argv,0);
      }
      break;

(eval.c)

The problems are probably the three macros BEGIN_CALLARGS, SETUP_ARGS() and END_CALLARGS. Since rb_eval() seems to evaluate the receiver and rb_call() to invoke the method, we can roughly imagine that the evaluation of the arguments is done in the three macros, but what is actually done? BEGIN_CALLARGS and END_CALLARGS are difficult to understand before talking about iterators, so they are explained in the next chapter “Block”. Here, let’s investigate only SETUP_ARGS().

SETUP_ARGS()

SETUP_ARGS() is the macro to evaluate the arguments of a method. Inside this macro, as the comment in the original program says, the variables named argc and argv are used, so they must be defined in advance. And because it uses TMP_ALLOC(), TMP_PROTECT must also be used in advance. Therefore, something like the following is boilerplate:

    int argc; VALUE *argv;   /* used in SETUP_ARGS */
    TMP_PROTECT;

    SETUP_ARGS(args_node);

args_node is (the node that represents) the arguments of the method; the macro turns it into an array of the values obtained by evaluating it and stores that in argv. Let’s look at it:

▼ SETUP_ARGS()

    #define SETUP_ARGS(anode) do {\
        NODE *n = anode;\
        if (!n) {\                             no arguments
            argc = 0;\
            argv = 0;\
        }\
        else if (nd_type(n) == NODE_ARRAY) {\  only normal arguments
            argc=n->nd_alen;\
            if (argc > 0) {\   arguments present
                int i;\
                n = anode;\
                argv = TMP_ALLOC(argc);\
                for (i=0;i<argc;i++) {\
                    argv[i] = rb_eval(self,n->nd_head);\
                    n=n->nd_next;\
                }\
            }\
            else {\            no arguments
                argc = 0;\
                argv = 0;\
            }\
        }\
        else {\                                 both or one of an array argument
            VALUE args = rb_eval(self,n);\      and a block argument
            if (TYPE(args) != T_ARRAY)\
                args = rb_ary_to_ary(args);\
            argc = RARRAY(args)->len;\
            argv = ALLOCA_N(VALUE, argc);\
            MEMCPY(argv, RARRAY(args)->ptr, VALUE, argc);\
        }\
    } while (0)

(eval.c)

This is a bit long, but since it clearly branches in three ways, it is not so terrible actually. The meaning of each branch is written as a comment.

We don’t have to care about the case with no arguments, and the two remaining branches do similar things. Roughly speaking, what they do consists of three steps:

1. allocate a space to store the arguments
2. evaluate the expressions of the arguments
3. copy the values into the variable space

Written as code (and tidied up a little), it becomes as follows.

    /***** else if clause, argc!=0 *****/
    int i;
    n = anode;
    argv = TMP_ALLOC(argc);                         /* 1 */
    for (i = 0; i < argc; i++) {
        argv[i] = rb_eval(self, n->nd_head);        /* 2,3 */
        n = n->nd_next;
    }

    /***** else clause *****/
    VALUE args = rb_eval(self, n);                  /* 2 */
    if (TYPE(args) != T_ARRAY)
        args = rb_ary_to_ary(args);
    argc = RARRAY(args)->len;
    argv = ALLOCA_N(VALUE, argc);                   /* 1 */
    MEMCPY(argv, RARRAY(args)->ptr, VALUE, argc);   /* 3 */

TMP_ALLOC() is used in the else if side, but ALLOCA_N(), which is ordinary alloca(), is used in the else side. Why? Isn’t it dangerous in the C_ALLOCA environment, because there alloca() is equivalent to malloc()?

The point is that “in the else side the values of the arguments are also stored in args”. If I illustrate, it would look like Figure 1.

Figure 1: Being in the heap is all right.

If at least one VALUE is on the stack, the others can be successively marked through it. This kind of VALUE plays the role of tying the other VALUEs to the stack like an anchor. Namely, it becomes an “anchor VALUE”. In the else side, args is the anchor VALUE.

For your information, “anchor VALUE” is a word I just coined.

rb_call()

SETUP_ARGS() was a bit of a digression. Let’s go back to the main line. The function to invoke a method is rb_call(). The original contains code such as raising exceptions when the method cannot be found; as usual, I’ll skip all of it.

▼ rb_call() (simplified)

    static VALUE
    rb_call(klass, recv, mid, argc, argv, scope)
        VALUE klass, recv;
        ID    mid;
        int argc;
        const VALUE *argv;
        int scope;
    {
        NODE  *body;
        int    noex;
        ID     id = mid;
        struct cache_entry *ent;

        /* search over method cache */
        ent = cache + EXPR1(klass, mid);
        if (ent->mid == mid && ent->klass == klass) {
            /* cache hit */
            klass = ent->origin;
            id    = ent->mid0;
            noex  = ent->noex;
            body  = ent->method;
        }
        else {
            /* cache miss, searching step-by-step */
            body = rb_get_method_body(&klass, &id, &noex);
        }

        /* ... check the visibility ... */

        return rb_call0(klass, recv, mid, id,
                        argc, argv, body, noex & NOEX_UNDEF);
    }

The basic way of searching for methods was discussed in chapter 2: “Object”. It follows the superclasses and searches each m_tbl. This is done by search_method().

The principle is certainly this, but when it comes to actual execution, if it searched by looking up the hash tables many times for each method call, it would be too slow. To improve this, in ruby, once a method is called, it is cached. If a method is called once, it is often called again immediately. This is known as an empirical fact, and this cache records a high hit rate.

The lookup of the cache is the first half of rb_call(). With only this line,

    ent = cache + EXPR1(klass, mid);

the cache is searched. We’ll examine its mechanism in detail later.

When the cache is not hit, the following rb_get_method_body() searches the class tree step-by-step and caches the result at the same time. Figure 2 shows the entire flow of searching.
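The slow path, “follow the superclasses and look into each method table”, can be imitated from Ruby with reflection. This is a sketch, not how search_method() is written: search_method_model and the three sample classes are made-up names, and ancestors is used because it also lists included modules, which sit in the superclass chain as well.

```ruby
class SearchBase
  def hello; :base; end
end

module SearchMixin
  def mixed; :mixin; end
end

class SearchDerived < SearchBase
  include SearchMixin
end

# Walk the ancestor chain and return the first class or module whose own
# method table defines mid, or nil when none does.
def search_method_model(klass, mid)
  klass.ancestors.find do |k|
    k.instance_methods(false).include?(mid) ||
      k.private_instance_methods(false).include?(mid)
  end
end

p search_method_model(SearchDerived, :hello)
p search_method_model(SearchDerived, :mixed)
p search_method_model(SearchDerived, :no_such_method)
```

Every call walks the whole chain again and performs one lookup per class, which is exactly the cost the method cache is meant to avoid.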

Figure 2: Method Search

Method Cache

Next, let’s examine the structure of the method cache in detail.

▼ Method Cache

    #define CACHE_SIZE 0x800
    #define CACHE_MASK 0x7ff
    #define EXPR1(c,m) ((((c)>>3)^(m))&CACHE_MASK)

    struct cache_entry {            /* method hash table. */
        ID mid;                     /* method's id */
        ID mid0;                    /* method's original id */
        VALUE klass;                /* receiver's class */
        VALUE origin;               /* where method defined  */
        NODE *method;
        int noex;
    };

    static struct cache_entry cache[CACHE_SIZE];

(eval.c)

To describe the mechanism briefly, it is a hash table. I mentioned that the principle of a hash table is to convert a table search into an array indexing. Three things are necessary to accomplish this: an array to store the data, a key, and a hash function.

First, the array here is an array of struct cache_entry. And the method is uniquely determined by only the class and the method name, so these two become the key of the hash calculation. The rest is done by creating a hash function to generate the index (0x000 ~ 0x7ff) of the cache array from the key. It is EXPR1(). Among its arguments, c is the class object and m is the method name (ID). (Figure 3)

Figure 3: Method Cache

However, EXPR1() is not a perfect hash function or anything, so a different method can generate the same index by coincidence. But because this is nothing more than a cache, conflicts do not cause a problem. They just slow the performance down a little.
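To see why a conflict only costs time, here is a small Ruby model of the cache. It is a sketch under assumptions: classes and method IDs are plain integers, MethodCacheModel and CacheEntry are hypothetical names, and the block passed to fetch stands in for the slow search.

```ruby
CACHE_SIZE = 0x800
CACHE_MASK = 0x7ff

# Same arithmetic as EXPR1(c, m); c and m are plain integers here.
def expr1(c, m)
  ((c >> 3) ^ m) & CACHE_MASK
end

CacheEntry = Struct.new(:mid, :klass, :method_body)

class MethodCacheModel
  attr_reader :hits, :misses

  def initialize
    @table = Array.new(CACHE_SIZE)
    @hits = 0
    @misses = 0
  end

  # Return the cached body on a hit. Otherwise run the slow search
  # (the block) and overwrite whatever occupied the slot.
  def fetch(klass, mid)
    index = expr1(klass, mid)
    ent = @table[index]
    if ent && ent.mid == mid && ent.klass == klass
      @hits += 1
      return ent.method_body
    end
    @misses += 1
    body = yield
    @table[index] = CacheEntry.new(mid, klass, body)
    body
  end
end

cache = MethodCacheModel.new
cache.fetch(0x1000, 5) { :body_a }   # miss: the slot is empty
cache.fetch(0x1000, 5) { :unused }   # hit
cache.fetch(0x5000, 5) { :body_b }   # same index, different class: miss
p [cache.hits, cache.misses]
```

The entry keeps the full key (class and method ID), so an index collision is detected as a miss and the slot is simply reused; a wrong method body is never returned.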

The effect of Method Cache

By the way, how effective is the method cache in actuality? We cannot be convinced just by being told “it is known as …”. Let’s measure it ourselves.

    Type                        Program                 Hit Rate
    generating LALR parser      racc ruby.y             99.9%
    generating a mail thread    a mailer                99.1%
    generating a document       rd2html rubyrefm.rd     97.8%

Surprisingly, in all three experiments the hit rate is more than 95%. This is awesome. Apparently, the effect of “it is known as …” is outstanding.

Invocation

rb_call0()

After many things, we have finally arrived at method invocation. However, this rb_call0() is huge. As it is more than 200 lines, it would come to 5 or 6 pages. Laying out the whole thing here would be disastrous. Let’s look at it by dividing it into small portions, starting with the outline:

▼ rb_call0() (Outline)

    static VALUE
    rb_call0(klass, recv, id, oid, argc, argv, body, nosuper)
        VALUE klass, recv;
        ID    id;
        ID    oid;
        int argc;                   /* OK */
        VALUE *argv;                /* OK */
        NODE *body;                 /* OK */
        int nosuper;
    {
        NODE *b2;                   /* OK */
        volatile VALUE result = Qnil;
        int itr;
        static int tick;
        TMP_PROTECT;

        switch (ruby_iter->iter) {
          case ITER_PRE:
            itr = ITER_CUR;
            break;
          case ITER_CUR:
          default:
            itr = ITER_NOT;
            break;
        }

        if ((++tick & 0xff) == 0) {
            CHECK_INTS;             /* better than nothing */
            stack_check();
        }
        PUSH_ITER(itr);
        PUSH_FRAME();

        ruby_frame->last_func = id;
        ruby_frame->orig_func = oid;
        ruby_frame->last_class = nosuper?0:klass;
        ruby_frame->self = recv;
        ruby_frame->argc = argc;
        ruby_frame->argv = argv;

        switch (nd_type(body)) {
            /* ... main process ... */

          default:
            rb_bug("unknown node type %d", nd_type(body));
            break;
        }
        POP_FRAME();
        POP_ITER();
        return result;
    }

(eval.c)

First, an ITER is pushed and whether or not the method is an iterator is finally fixed. As its value is used by the PUSH_FRAME() which comes immediately after it, PUSH_ITER() needs to appear beforehand. PUSH_FRAME() will be discussed soon.

And to describe the “… main process …” part first, it branches based on the following node types and each branch does its invoking process.

    NODE_CFUNC      methods defined in C
    NODE_IVAR       attr_reader
    NODE_ATTRSET    attr_writer
    NODE_SUPER      super
    NODE_ZSUPER     super without arguments
    NODE_DMETHOD    invoke UnboundMethod
    NODE_BMETHOD    invoke Method
    NODE_SCOPE      methods defined in Ruby

Some of the above nodes are not explained in this book, but they are not so important and can be ignored. The important ones are only NODE_CFUNC, NODE_SCOPE and NODE_ZSUPER.

PUSH_FRAME()

▼ PUSH_FRAME() POP_FRAME()

    #define PUSH_FRAME() do {               \
        struct FRAME _frame;                \
        _frame.prev = ruby_frame;           \
        _frame.tmp  = 0;                    \
        _frame.node = ruby_current_node;    \
        _frame.iter = ruby_iter->iter;      \
        _frame.cbase = ruby_frame->cbase;   \
        _frame.argc = 0;                    \
        _frame.argv = 0;                    \
        _frame.flags = FRAME_ALLOCA;        \
        ruby_frame = &_frame

    #define POP_FRAME()                     \
        ruby_current_node = _frame.node;    \
        ruby_frame = _frame.prev;           \
    } while (0)

(eval.c)

First, we'd like to make sure the entire FRAME is allocated on the stack. This is identical to module_setup(). The rest is basically just ordinary initialization.

One more thing to add: the flag FRAME_ALLOCA indicates how the FRAME was allocated. FRAME_ALLOCA obviously means “it is on the stack”.

rb_call0() – NODE_CFUNC

A lot of things are written in this part of the original code, but most of them are related to trace_func, and the substantive code is only the following line:

▼ rb_call0() − NODE_CFUNC (simplified)

    case NODE_CFUNC:
      result = call_cfunc(body->nd_cfnc, recv, len, argc, argv);
      break;


Then, as for call_cfunc() …

▼ call_cfunc() (simplified)

    static VALUE
    call_cfunc(func, recv, len, argc, argv)
        VALUE (*func)();
        VALUE recv;
        int len, argc;
        VALUE *argv;
    {
        if (len >= 0 && argc != len) {
            rb_raise(rb_eArgError, "wrong number of arguments(%d for %d)",
                     argc, len);
        }

        switch (len) {
          case -2:
            return (*func)(recv, rb_ary_new4(argc, argv));
            break;
          case -1:
            return (*func)(argc, argv, recv);
            break;
          case 0:
            return (*func)(recv);
            break;
          case 1:
            return (*func)(recv, argv[0]);
            break;
          case 2:
            return (*func)(recv, argv[0], argv[1]);
            break;
            :
            :
          default:
            rb_raise(rb_eArgError, "too many arguments(%d)", len);
            break;
        }
        return Qnil;                /* not reached */
    }

(eval.c)

As shown above, it branches based on the argument count. The maximum argument count is 15.
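As a small aside, the value that call_cfunc() switches on can be glimpsed from the Ruby level. The sketch below is not part of eval.c; it only prints Method#arity for a few built-in methods, which is 0 or more for a fixed argument count and -1 for a variable one. The three methods are chosen purely as examples, and how each one is implemented internally depends on the interpreter in use.

```ruby
# Arity of a few built-in methods.
# A non-negative value is a fixed argument count; -1 means "variable".
fixed_zero = "abc".method(:length).arity   # takes no arguments
fixed_one  = 1.method(:+).arity            # takes exactly one argument
variable   = [].method(:push).arity        # takes any number of arguments

puts fixed_zero
puts fixed_one
puts variable
```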

Note that neither SCOPE nor VARS is pushed in the case of NODE_CFUNC. This makes sense because a method defined in C does not use Ruby's local variables. But it also means that if the “current” local variables are accessed from C, they are actually the local variables of the previous FRAME. And in some places, say, rb_svar (eval.c), this is actually done.

rb_call0() – NODE_SCOPE

NODE_SCOPE is for invoking a method defined in Ruby. This part forms the foundation of Ruby.

▼ rb_call0() − NODE_SCOPE (outline)

    case NODE_SCOPE:
      {
          int state;
          VALUE *local_vars;  /* OK */
          NODE *saved_cref = 0;

          PUSH_SCOPE();

          /* (A) forward CREF */
          if (body->nd_rval) {
              saved_cref = ruby_cref;
              ruby_cref = (NODE*)body->nd_rval;
              ruby_frame->cbase = body->nd_rval;
          }
          /* (B) initialize ruby_scope->local_vars */
          if (body->nd_tbl) {
              local_vars = TMP_ALLOC(body->nd_tbl[0]+1);
              *local_vars++ = (VALUE)body;
              rb_mem_clear(local_vars, body->nd_tbl[0]);
              ruby_scope->local_tbl = body->nd_tbl;
              ruby_scope->local_vars = local_vars;
          }
          else {
              local_vars = ruby_scope->local_vars = 0;
              ruby_scope->local_tbl  = 0;
          }
          b2 = body = body->nd_next;

          PUSH_VARS();
          PUSH_TAG(PROT_FUNC);

          if ((state = EXEC_TAG()) == 0) {
              NODE *node = 0;
              int i;

              /* …… (C) assign the arguments to the local variables …… */

              if (trace_func) {
                  call_trace_func("call", b2, recv, id, klass);
              }
              ruby_last_node = b2;
              /* (D) method body */
              result = rb_eval(recv, body);
          }
          else if (state == TAG_RETURN) { /* back via return */
              result = prot_tag->retval;
              state = 0;
          }
          POP_TAG();
          POP_VARS();
          POP_SCOPE();
          ruby_cref = saved_cref;
          if (trace_func) {
              call_trace_func("return", ruby_last_node, recv, id, klass);
          }
          switch (state) {
            case 0:
              break;

            case TAG_RETRY:
              if (rb_block_given_p()) {
                  JUMP_TAG(state);
              }
              /* fall through */
            default:
              jump_tag_but_local_jump(state);
              break;
          }
      }
      break;

(eval.c)

(A) CREF forwarding, which was described in the section on constants in the previous chapter. In other words, cbase is transplanted from the method entry to the FRAME.

(B) The content here is completely identical to what is done in module_setup(). An array is allocated for local_vars of SCOPE. With this, PUSH_SCOPE() and PUSH_VARS(), the creation of the local variable scope is complete. After this, one can execute rb_eval() in exactly the same environment as the interior of the method.

(C) This sets the received arguments to the parameter variables. The parameter variables are in essence identical to the local variables. Since things such as the number of arguments are specified by NODE_ARGS, all it has to do is set them one by one. Details will be explained soon. And,

(D) this executes the method body. Obviously, the receiver (recv) becomes self. In other words, it becomes the first argument of rb_eval(). With this, the method is completely invoked.

Set Parameters

Next, we'll examine the part that was skipped entirely, which sets the parameters. But before that, I'd like you to check the syntax tree of the method once more.

    % ruby -rnodedump -e 'def m(a) nil end'
    NODE_SCOPE
    nd_rval = (null)
    nd_tbl = 3 [ _ ~ a ]
    nd_next:
        NODE_BLOCK
        nd_head:
            NODE_ARGS
            nd_cnt  = 1
            nd_rest = -1
            nd_opt = (null)
        nd_next:
            NODE_BLOCK
            nd_head:
                NODE_NEWLINE
                nd_file = "-e"
                nd_nth  = 1
                nd_next:
                    NODE_NIL
            nd_next = (null)

NODE_ARGS is the node that specifies the parameters of a method. I aggressively dumped several things, and it seems its members are used as follows:

nd_cnt     the number of normal parameters
nd_rest    the variable ID of the rest parameter; -1 if the rest parameter is missing
nd_opt     holds the syntax tree that represents the default values of the option parameters; a list of NODE_BLOCK

With this much information, the local variable ID of each parameter variable can be uniquely determined. First, as I mentioned, 0 and 1 are always $_ and $~. From 2 on, the necessary number of ordinary parameters are lined up. The number of option parameters can be determined from the length of the NODE_BLOCK list. Next to them comes the rest parameter.

For example, if you write a definition as below,

    def m(a, b, c = nil, *rest)
      lvar1 = nil
    end

local variable IDs are assigned as follows.

    0   1   2  3  4  5     6
    $_  $~  a  b  c  rest  lvar1
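The ordering rule above can be sketched as a tiny Ruby function. This is only an illustration of the layout described in the text: local_slots and its parameters are hypothetical names, not anything found in eval.c.

```ruby
# Hypothetical helper (not from eval.c): lay out local variable slots
# in the order described in the text.
def local_slots(normal, optional, rest, locals)
  slots = ["$_", "$~"]          # slots 0 and 1 are always reserved
  slots.concat(normal)          # ordinary parameters start at slot 2
  slots.concat(optional)        # then the option parameters
  slots << rest if rest         # then the rest parameter, if any
  slots.concat(locals)          # then the other local variables
  slots
end

# Corresponds to: def m(a, b, c = nil, *rest); lvar1 = nil; end
layout = local_slots(%w[a b], %w[c], "rest", %w[lvar1])
layout.each_with_index { |name, id| puts "#{id} #{name}" }
```

Because the position of every parameter follows from nd_cnt, the length of nd_opt and nd_rest alone, the evaluator can copy arguments straight into local_vars without looking names up.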

Are you still with me? With this in mind, let's look at the code.

▼ rb_call0() − NODE_SCOPE − assignments of arguments

    if (nd_type(body) == NODE_ARGS) { /* no body */
        node = body;           /* NODE_ARGS */
        body = 0;              /* the method body */
    }
    else if (nd_type(body) == NODE_BLOCK) { /* has body */
        node = body->nd_head;  /* NODE_ARGS */
        body = body->nd_next;  /* the method body */
    }
    if (node) {  /* has some parameters */
        if (nd_type(node) != NODE_ARGS) {
            rb_bug("no argument-node");
        }

        i = node->nd_cnt;
        if (i > argc) {
            rb_raise(rb_eArgError, "wrong number of arguments(%d for %d)",
                     argc, i);
        }
        if (node->nd_rest == -1) {  /* no rest parameter */
            /* counting the number of parameters */
            int opt = i;   /* the number of parameters (i is nd_cnt) */
            NODE *optnode = node->nd_opt;

            while (optnode) {
                opt++;
                optnode = optnode->nd_next;
            }
            if (opt < argc) {
                rb_raise(rb_eArgError,
                         "wrong number of arguments(%d for %d)", argc, opt);
            }
            /* assigned for the second time in rb_call0 */
            ruby_frame->argc = opt;
            ruby_frame->argv = local_vars+2;
        }

        if (local_vars) { /* has parameters */
            if (i > 0) {             /* has normal parameters */
                /* +2 to skip the spaces for $_ and $~ */
                MEMCPY(local_vars+2, argv, VALUE, i);
            }
            argv += i; argc -= i;
            if (node->nd_opt) {      /* has option parameters */
                NODE *opt = node->nd_opt;

                while (opt && argc) {
                    assign(recv, opt->nd_head, *argv, 1);
                    argv++; argc--;
                    opt = opt->nd_next;
                }
                if (opt) {
                    rb_eval(recv, opt);
                }
            }
            local_vars = ruby_scope->local_vars;
            if (node->nd_rest >= 0) { /* has rest parameter */
                VALUE v;

                /* make an array of the remaining parameters
                   and assign it to a variable */
                if (argc > 0)
                    v = rb_ary_new4(argc,argv);
                else
                    v = rb_ary_new2(0);
                ruby_scope->local_vars[node->nd_rest] = v;
            }
        }
    }

(eval.c)

Since there are more comments than before, you should be able to understand what it is doing by following it step by step.

One thing I'd like to mention is argc and argv of ruby_frame. They seem to be updated only when there is no rest parameter. Why only in that case?

This point can be understood by thinking about the purpose of argc and argv. These members actually exist for super without arguments. It means the following form:

super


This super passes the parameters of the currently executing method as they are. To be able to pass them at that moment, the arguments are saved in ruby_frame->argv.

Going back to the previous story, if there's a rest parameter, passing the original parameter list somehow seems more convenient. If there isn't, passing the values after the option parameters have been assigned seems better.

    def m(a, b, *rest)
      super     # probably 5, 6, 7, 8 should be passed
    end
    m(5, 6, 7, 8)

    def m(a, b = 6)
      super     # probably 5, 6 should be passed
    end
    m(5)

This is a question of which is better as a specification rather than “it must be so”. If a method has a rest parameter, the method in the superclass is supposed to have a rest parameter as well. Thus, if the processed values were passed, it would very likely be inconvenient.
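The two cases can be tried directly. The following is a minimal sketch; the class names Base, WithRest and WithOption are made up for the illustration, and Base#m simply returns whatever arguments super handed to it.

```ruby
class Base
  def m(*args)
    args               # report what super handed over
  end
end

class WithRest < Base
  def m(a, b, *rest)
    super              # no argument list: the current arguments are passed
  end
end

class WithOption < Base
  def m(a, b = 6)
    super              # the default value of b has already been assigned
  end
end

p WithRest.new.m(5, 6, 7, 8)
p WithOption.new.m(5)
```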

Now, I've said various things, but the story of method invocation is all done. What remains, as the ending of this chapter, is to look at the implementation of super, which was just discussed.

super


What correspond to super are NODE_SUPER and NODE_ZSUPER. NODE_SUPER is ordinary super, and NODE_ZSUPER is super without arguments.

▼ rb_eval() − NODE_SUPER

    case NODE_SUPER:
    case NODE_ZSUPER:
      {
          int argc; VALUE *argv; /* used in SETUP_ARGS */
          TMP_PROTECT;

          /* (A) case when super is forbidden */
          if (ruby_frame->last_class == 0) {
              if (ruby_frame->orig_func) {
                  rb_name_error(ruby_frame->last_func,
                                "superclass method `%s' disabled",
                                rb_id2name(ruby_frame->orig_func));
              }
              else {
                  rb_raise(rb_eNoMethodError,
                           "super called outside of method");
              }
          }
          /* (B) setup or evaluate parameters */
          if (nd_type(node) == NODE_ZSUPER) {
              argc = ruby_frame->argc;
              argv = ruby_frame->argv;
          }
          else {
              BEGIN_CALLARGS;
              SETUP_ARGS(node->nd_args);
              END_CALLARGS;
          }

          /* (C) yet mysterious PUSH_ITER() */
          PUSH_ITER(ruby_iter->iter?ITER_PRE:ITER_NOT);
          SET_CURRENT_SOURCE();
          result = rb_call(RCLASS(ruby_frame->last_class)->super,
                           ruby_frame->self, ruby_frame->orig_func,
                           argc, argv, 3);
          POP_ITER();
      }
      break;

(eval.c)

For super without arguments, I said that ruby_frame->argv is used directly as the arguments; this is directly shown at (B).

(C) Just before calling rb_call(), PUSH_ITER() is done. This cannot be explained in detail either, but in this way the block passed to the current method can be handed over to the next method (meaning the superclass method that is about to be called).
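The effect of (C) is visible from the Ruby level: a block given to a method is also seen by the superclass method reached through super. A minimal sketch follows; Parent, Child and each_pair_sum are hypothetical names used only for this illustration.

```ruby
class Parent
  def each_pair_sum
    yield 1, 2         # yields to whatever block reached this method
  end
end

class Child < Parent
  def each_pair_sum
    super              # the block given to Child#each_pair_sum goes along
  end
end

seen = nil
Child.new.each_pair_sum { |a, b| seen = a + b }
p seen
```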

And finally, (A): when ruby_frame->last_class is 0, calling super seems to be forbidden. Since the error message says “must be enabled by rb_enable_super()”, it seems super becomes callable by calling rb_enable_super(). ((errata: The error message “must be enabled by rb_enable_super()” exists not in this list but in rb_call_super().)) Why?

First, if we investigate in what kind of situation last_class becomes 0, it seems to be while executing a method whose substance is defined in C (NODE_CFUNC). Moreover, it is the same when aliasing or replacing such a method.

I understood that much, but even after reading the source code, I couldn't understand what follows from it. So I searched for “rb_enable_super” in ruby's mailing list archives and found it. According to that mail, the situation is as follows:

For example, there's a method named String.new. Of course, this is a method to create a string. String.new creates a struct of T_STRING. Therefore, when writing an instance method of String, you can expect that the receiver is always of T_STRING.

Then, the super of String.new is Object.new. Object.new creates a struct of T_OBJECT. What happens if String.new is replaced by a new definition and super is called?

    def String.new
      super
    end

As a consequence, an object whose struct is of T_OBJECT but whose class is String is created. However, a method of String is written expecting a struct of T_STRING, so naturally it crashes.

How can we avoid this? The answer is to forbid calling any method that expects a struct of a different struct type. But the information of “expected struct type” is attached neither to methods nor to classes. For example, if there were a way to obtain T_STRING from the String class, it could be checked before calling, but currently we can't do such a thing. Therefore, as the second-best plan, “super from methods defined in C is forbidden” was adopted. In this way, as long as the layer of methods at the C level is written precisely, it at least cannot crash. And when the case is “it's absolutely safe, so allow super”, super can be enabled by calling rb_enable_super().

In short, the heart of the problem is a mismatch of struct types. This is the same as the problem that occurs in the allocation framework.

Then, the way to solve this is to solve the root of the problem, namely that “the class does not know the struct type of the instance”. But resolving this requires at least a new API, and going deeper would lose compatibility. Therefore, for the time being, the final solution has not been decided yet.

The original work is Copyright © 2002-2004 Minero AOKI. Translated by Vincent ISAMBART and Clifford Escobar CAOILE. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License


Chapter 16: Blocks

Iterator

In this chapter, BLOCK, which is the last big name among the seven Ruby stacks, comes in. After finishing this, the internal state of the evaluator is virtually understood.

The Whole Picture

What is the mechanism of iterators? First, let's think about a small program as below:

▼ The Source Program

    iter_method() do
      9   # a mark to find this block
    end

Let's check the terms just in case. In this program, iter_method is an iterator method, and do ~ end is an iterator block. Here is the dumped syntax tree of this program.

▼ Its Syntax Tree

    NODE_ITER
    nd_iter:
        NODE_FCALL
        nd_mid = 9617 (iter_method)
        nd_args = (null)
    nd_var = (null)
    nd_body:
        NODE_LIT
        nd_lit = 9:Fixnum

Looking for the block by using the 9 written in the iterator block as a trace, we can understand that NODE_ITER seems to represent the iterator block. And the NODE_FCALL that calls iter_method is “below” that NODE_ITER. In other words, the node of the iterator block appears earlier than the call of the iterator method. This means that, before an iterator method is called, a block is pushed by another node.

And by following the flow of the code with a debugger, I found that the invocation of an iterator is separated into 3 steps: NODE_ITER, NODE_CALL and NODE_YIELD. This means,

1. push a block (NODE_ITER)
2. call the method which is an iterator (NODE_CALL)
3. yield (NODE_YIELD)

that's all.

Push a block

First, let's start with the first step, that is NODE_ITER, which is the node to push a block.

▼ rb_eval() − NODE_ITER (simplified)

    case NODE_ITER:
      {
        iter_retry:
          PUSH_TAG(PROT_FUNC);
          PUSH_BLOCK(node->nd_var, node->nd_body);

          state = EXEC_TAG();
          if (state == 0) {
              PUSH_ITER(ITER_PRE);
              result = rb_eval(self, node->nd_iter);
              POP_ITER();
          }
          else if (_block.tag->dst == state) {
              state &= TAG_MASK;
              if (state == TAG_RETURN || state == TAG_BREAK) {
                  result = prot_tag->retval;
              }
          }
          POP_BLOCK();
          POP_TAG();
          switch (state) {
            case 0:
              break;

            case TAG_RETRY:
              goto iter_retry;

            case TAG_BREAK:
              break;

            case TAG_RETURN:
              return_value(result);
              /* fall through */
            default:
              JUMP_TAG(state);
          }
      }
      break;

Since the original code contains the support for the for statement, that part is deleted. After removing the code relating to tags, only the push/pop of ITER and BLOCK are left. Because the rest is just an ordinary rb_eval() of NODE_FCALL, these ITER and BLOCK are the necessary conditions to turn a method into an iterator.

The necessity of pushing BLOCK is fairly reasonable, but what's ITER for? Actually, to think about the meaning of ITER, you need to think from the viewpoint of the side that uses BLOCK.

For example, suppose a method has just been called, and ruby_block exists. But since BLOCK is pushed regardless of the boundaries of method calls, the existence of a block does not mean the block was pushed for that method. It's possible that the block was pushed for the previous method. (Figure 1)

Figure 1: no one-to-one correspondence between FRAME and BLOCK

So, in order to determine for which method the block is pushed, ITER is used. BLOCK is not pushed for each FRAME because pushing BLOCK is a little heavy. How heavy? Let's check it in practice.

PUSH_BLOCK()

The arguments of PUSH_BLOCK() are (the syntax trees of) the block parameter and the block body.

▼ PUSH_BLOCK() POP_BLOCK()

    #define PUSH_BLOCK(v,b) do {            \
        struct BLOCK _block;                \
        _block.tag = new_blktag();          \
        _block.var = v;                     \
        _block.body = b;                    \
        _block.self = self;                 \
        _block.frame = *ruby_frame;         \
        _block.klass = ruby_class;          \
        _block.frame.node = ruby_current_node;\
        _block.scope = ruby_scope;          \
        _block.prev = ruby_block;           \
        _block.iter = ruby_iter->iter;      \
        _block.vmode = scope_vmode;         \
        _block.flags = BLOCK_D_SCOPE;       \
        _block.dyna_vars = ruby_dyna_vars;  \
        _block.wrapper = ruby_wrapper;      \
        ruby_block = &_block

    #define POP_BLOCK()                     \
       if (_block.tag->flags & (BLOCK_DYNAMIC))              \
           _block.tag->flags |= BLOCK_ORPHAN;                \
       else if (!(_block.scope->flags & SCOPE_DONT_RECYCLE)) \
           rb_gc_force_recycle((VALUE)_block.tag);           \
       ruby_block = _block.prev;            \
    } while (0)

(eval.c)


Let's make sure that a BLOCK is “the snapshot of the environment at the moment of creation”. As a proof of it, except for CREF and BLOCK, the six stack frames are saved. CREF can be substituted by ruby_frame->cbase, so there's no need to push it.

And I'd like to check three points about the mechanism of push. BLOCK is fully allocated on the stack. BLOCK contains a full copy of the FRAME at that moment. BLOCK differs from many of the other stack frame structs in having a pointer to the previous BLOCK (prev).

The flags used in various ways in POP_BLOCK() are not explained now because they can only be understood after seeing the implementation of Proc later.

As for the claim that “BLOCK is heavy”, it certainly seems a little heavy. Looking inside new_blktag(), we can see that it does malloc() and stores plenty of members. But let's defer the final judgment until after looking at PUSH_ITER() and comparing with it.

PUSH_ITER()

▼ PUSH_ITER() POP_ITER()

    #define PUSH_ITER(i) do {               \
        struct iter _iter;                  \
        _iter.prev = ruby_iter;             \
        _iter.iter = (i);                   \
        ruby_iter = &_iter

    #define POP_ITER()                      \
        ruby_iter = _iter.prev;             \
    } while (0)

(eval.c)

In contrast, this is apparently light. It only uses stack space and has only two members. Even if this is pushed for each FRAME, it would probably matter little.

Iterator Method Call

After pushing a block, the next thing is to call an iterator method (a method which is an iterator). This also needs a little machinery. Do you remember that there's code to modify the value of ruby_iter at the beginning of rb_call0? Here.

▼ rb_call0() − moving to ITER_CUR

    switch (ruby_iter->iter) {
      case ITER_PRE:
        itr = ITER_CUR;
        break;
      case ITER_CUR:
      default:
        itr = ITER_NOT;
        break;
    }

(eval.c)

Since ITER_PRE was pushed previously at NODE_ITER, this code makes ruby_iter ITER_CUR. At this moment, a method finally “becomes” an iterator. Figure 2 shows the state of the stacks.

Figure 2: the state of the Ruby stacks on an iterator call.

The possible values of ruby_iter are not two boolean values (for that method or not) but three steps, because there's a little gap between the moment a block is pushed and the moment an iterator method is invoked. For example, there's the evaluation of the arguments of the iterator method. Since it may contain method calls, one of those methods could mistakenly think that the just-pushed block is for itself and use it during the evaluation. Therefore, the moment when a method becomes an iterator, that is, turns into ITER_CUR, has to be the place inside rb_call() that is just before finishing the invocation.

▼ the processing order

    method(arg) { block }  # push a block
    method(arg) { block }  # evaluate the arguments
    method(arg) { block }  # a method call

For example, in the last chapter “Method”, there's a macro named BEGIN_CALLARGS in the handler of NODE_CALL. This is where the three-step ITER is made use of. Let's go back a little and look at it.

BEGIN_CALLARGS END_CALLARGS

▼ BEGIN_CALLARGS END_CALLARGS

    #define BEGIN_CALLARGS do {\
        struct BLOCK *tmp_block = ruby_block;\
        if (ruby_iter->iter == ITER_PRE) {\
            ruby_block = ruby_block->prev;\
        }\
        PUSH_ITER(ITER_NOT)

    #define END_CALLARGS \
        ruby_block = tmp_block;\
        POP_ITER();\
    } while (0)

(eval.c)

When ruby_iter is ITER_PRE, ruby_block is set aside. This code is important, for instance, in the case below:

    obj.m1 { yield }.m2 { nil }

The evaluation order of this expression is:

1. push the block of m2
2. push the block of m1
3. call the method m1
4. call the method m2

Therefore, if there were no BEGIN_CALLARGS, m1 would call the block of m2.

And if one more iterator is chained, the number of BEGIN_CALLARGS increases accordingly, so there's no problem.
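The guarantee that each method in a chain sees its own block can be confirmed from the Ruby level. This is a small sketch; the class Obj and the symbols recorded in calls are made up for the illustration, and the blocks record a marker instead of calling yield so that the snippet runs at the top level.

```ruby
class Obj
  def m1
    yield          # must reach the block written next to m1
    self
  end

  def m2
    yield          # must reach the block written next to m2
    self
  end
end

calls = []
Obj.new.m1 { calls << :block_of_m1 }.m2 { calls << :block_of_m2 }
p calls
```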

Block Invocation

The third phase of iterator invocation, that is, the last phase, is block invocation.

▼ rb_eval() − NODE_YIELD

    case NODE_YIELD:
      if (node->nd_stts) {
          result = avalue_to_yvalue(rb_eval(self, node->nd_stts));
      }
      else {
          result = Qundef;    /* no arg */
      }
      SET_CURRENT_SOURCE();
      result = rb_yield_0(result, 0, 0, 0);
      break;

(eval.c)

nd_stts is the parameter of yield. avalue_to_yvalue() was mentioned a little in the part on multiple assignments, but you can ignore it. ((errata: actually, it was not mentioned. You can ignore this anyway.)) The heart of the behavior is not this but rb_yield_0(). Since this function is also very long, I show the code after extremely simplifying it. Most of the simplification methods have been used previously.

- cut the code relating to trace_func.
- cut errors
- cut the code that exists only to protect against GC
- As with massign(), there's the parameter pcall. This parameter changes the strictness of the parameter check, so it is not important here. Therefore, assume pcall=0 and perform constant folding.

And this time, I turn on the “optimize for readability” option as follows.

- when a code branching has equivalent kinds of branches, leave the main one and cut the rest.
- if a condition is true/false in almost all cases, assume it is true/false.
- assume no tag jump occurs, and delete all code relating to tags.

With all this done, it becomes much shorter.

▼ rb_yield_0() (simplified)

    static VALUE
    rb_yield_0(val, self, klass, /* pcall=0 */)
        VALUE val, self, klass;
    {
        volatile VALUE result = Qnil;
        volatile VALUE old_cref;
        volatile VALUE old_wrapper;
        struct BLOCK * volatile block;
        struct SCOPE * volatile old_scope;
        struct FRAME frame;
        int state;

        PUSH_VARS();
        PUSH_CLASS();
        block = ruby_block;
        frame = block->frame;
        frame.prev = ruby_frame;
        ruby_frame = &(frame);
        old_cref = (VALUE)ruby_cref;
        ruby_cref = (NODE*)ruby_frame->cbase;
        old_wrapper = ruby_wrapper;
        ruby_wrapper = block->wrapper;
        old_scope = ruby_scope;
        ruby_scope = block->scope;
        ruby_block = block->prev;
        ruby_dyna_vars = new_dvar(0, 0, block->dyna_vars);
        ruby_class = block->klass;
        self = block->self;

        /* set the block arguments */
        massign(self, block->var, val, pcall);

        PUSH_ITER(block->iter);
        /* execute the block body */
        result = rb_eval(self, block->body);
        POP_ITER();

        POP_CLASS();
        /* …… collect ruby_dyna_vars …… */
        POP_VARS();
        ruby_block = block;
        ruby_frame = ruby_frame->prev;
        ruby_cref = (NODE*)old_cref;
        ruby_wrapper = old_wrapper;
        ruby_scope = old_scope;

        return result;
    }

As you can see, most of the stack frames are replaced with what was saved in ruby_block. The ones that are simply saved and restored are easy to understand, so let's look at the handling of the other frames that we need to be careful about.

FRAME

    struct FRAME frame;

    frame = block->frame;     /* copy the entire struct */
    frame.prev = ruby_frame;  /* by these two lines …… */
    ruby_frame = &(frame);    /* …… frame is pushed */

Unlike the other frames, a FRAME is not used in its saved state; instead, a new FRAME is created by duplicating it. This would look like Figure 3.

Figure 3: push a copied frame


From the code we've seen so far, it seems that a FRAME is never “reused”. When pushing a FRAME, a new FRAME is always created.

BLOCK

    block = ruby_block;
         :
    ruby_block = block->prev;
         :
    ruby_block = block;

The most mysterious part is this behavior of BLOCK. We can't easily tell whether it is saving or popping. It's comprehensible that the first and third statements form a pair, and that the state is eventually restored. But what is the consequence of the second statement?

To put the conclusion of much pondering in one phrase: “going back to the ruby_block of the moment when the block was pushed”. An iterator is, in short, syntax for going back to a previous frame. Therefore, all we have to do is turn the state of the stack frames into what it was at the moment the block was created. And the value of ruby_block at the moment the block was created was certainly block->prev. That is why prev holds it.
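“Going back to the environment where the block was created” is observable from the Ruby level. In the sketch below (run_block and creation_site are hypothetical names), both methods have a local variable named x, and the block sees the one from the place where it was written, not the one in the method that yields.

```ruby
def run_block
  x = "inside run_block"     # same name, different scope
  yield
end

def creation_site
  x = "where the block was written"
  run_block { x }            # the block reads the x of the creation site
end

puts creation_site
```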

Additionally, to the question “is it safe to assume that what is invoked is always the top of ruby_block?”, there's no choice but to say “on the rb_yield_0 side, you can assume so”. Pushing the block that should be invoked onto the top of ruby_block is the job of the side that prepares the block, not of rb_yield_0.

An example of this is BEGIN_CALLARGS, which was discussed in the previous chapter. When iterator calls cascade, two blocks are pushed and the top of the stack is the block that should not be used. Therefore, it is purposefully checked and set aside.

VARS

Come to think of it, I think we have not yet looked at the contents of PUSH_VARS() and POP_VARS(). Let's see them here.

▼ PUSH_VARS() POP_VARS()

    #define PUSH_VARS() do { \
        struct RVarmap * volatile _old; \
        _old = ruby_dyna_vars; \
        ruby_dyna_vars = 0

    #define POP_VARS() \
       if (_old && (ruby_scope->flags & SCOPE_DONT_RECYCLE)) {   \
           if (RBASIC(_old)->flags) /* if were not recycled */ \
               FL_SET(_old, DVAR_DONT_RECYCLE); \
       }\
       ruby_dyna_vars = _old; \
    } while (0)

(eval.c)

This is also not pushing a new struct; “set aside/restore” is a closer description. In practice, in rb_yield_0, PUSH_VARS() is used only to set aside the value. What actually prepares ruby_dyna_vars is this line.

    ruby_dyna_vars = new_dvar(0, 0, block->dyna_vars);

This takes the dyna_vars saved in BLOCK and sets it. An entry is attached at the same time. I'd like you to recall the description of the structure of ruby_dyna_vars in Part 2: it said that an RVarmap whose id is 0, such as the one created here, is used as the break between block scopes.

However, in fact, the form of the link stored in ruby_dyna_vars differs slightly between the parser and the evaluator. Let's look at the dvar_asgn_curr() function, which assigns a block local variable in the current block.

▼ dvar_asgn_curr()

    static inline void
    dvar_asgn_curr(id, value)
        ID id;
        VALUE value;
    {
        dvar_asgn_internal(id, value, 1);
    }

    static void
    dvar_asgn_internal(id, value, curr)
        ID id;
        VALUE value;
        int curr;
    {
        int n = 0;
        struct RVarmap *vars = ruby_dyna_vars;

        while (vars) {
            if (curr && vars->id == 0) {
                /* first null is a dvar header */
                n++;
                if (n == 2) break;
            }
            if (vars->id == id) {
                vars->val = value;
                return;
            }
            vars = vars->next;
        }
        if (!ruby_dyna_vars) {
            ruby_dyna_vars = new_dvar(id, value, 0);
        }
        else {
            vars = new_dvar(id, value, ruby_dyna_vars->next);
            ruby_dyna_vars->next = vars;
        }
    }

(eval.c)

The last if statement is to add a variable. If we focus on it, we can see that a link is always pushed in at the “next” of ruby_dyna_vars. This means it would look like Figure 4.


Figure 4: the structure of ruby_dyna_vars

This differs from the case of the parser in one point: the headers (id=0) that indicate the breaks of scopes are attached before the links. If a header were attached after the links, the first entry of the scope could not be inserted properly. (Figure 5) ((errata: It was described that ruby_dyna_vars of the evaluator always forms a single straight link. But according to the errata, it was wrong. That part and the relevant descriptions are removed.))

Figure 5: The entry cannot be inserted properly.
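The “insert right after the head” behavior of dvar_asgn_internal() can be modeled in a few lines of Ruby. This is only a sketch under assumptions: Varmap, add_dvar and ids are hypothetical stand-ins for struct RVarmap and the last if statement, and the lookup loop of the real function is left out.

```ruby
# Hypothetical model of struct RVarmap: id 0 marks a scope header.
Varmap = Struct.new(:id, :val, :next)

# Mirror of the last if statement of dvar_asgn_internal():
# a new entry is always linked in right after the head.
def add_dvar(head, id, value)
  return Varmap.new(id, value, nil) if head.nil?
  head.next = Varmap.new(id, value, head.next)
  head
end

# Collect the ids in link order, starting from the head.
def ids(head)
  list = []
  while head
    list << head.id
    head = head.next
  end
  list
end

head = Varmap.new(0, nil, nil)   # header attached for the current block scope
head = add_dvar(head, :a, 1)
head = add_dvar(head, :b, 2)
p ids(head)
```

With the header at the front, the head pointer never changes while variables are added, which is why the header has to come before the entries of its scope.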


Target Specified Jump

The code relating to jump tags was omitted in the previously shown code, but the jump handling of rb_yield_0 contains an effort that we've never seen before. Why is the effort necessary? I'll tell the reason in advance. I'd like you to look at the program below:

    [0].each do
      break
    end
    # the place to reach by break

As shown here, when break is done from inside a block, it is necessary to get out of the block and go to the method that pushed the block. What does this actually mean? Let's think by looking at the (dynamic) call graph when invoking an iterator.

    rb_eval(NODE_ITER)                   .... catch(TAG_BREAK)
        rb_eval(NODE_CALL)               .... catch(TAG_BREAK)
            rb_eval(NODE_YIELD)
                rb_yield_0
                    rb_eval(NODE_BREAK)  .... throw(TAG_BREAK)

Since what pushed the block is NODE_ITER, break should go back to a NODE_ITER. However, NODE_CALL is waiting for TAG_BREAK before NODE_ITER, in order to turn a break across methods into an error. This is a problem. We need to somehow find a way to go straight back to a NODE_ITER.

And actually, “going back to a NODE_ITER” is still a problem. If iterators are nested, there could be multiple NODE_ITERs, so the one that corresponds to the current block is not always the first NODE_ITER. In other words, we need to narrow it down to “the NODE_ITER that pushed the block currently being invoked”.
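The nested case can be seen from the Ruby level: a break inside the inner block leaves only the inner iterator call, and the outer iteration continues. The method name first_even_in_each below is made up for this sketch.

```ruby
def first_even_in_each(rows)
  found = []
  rows.each do |row|
    hit = row.each do |n|
      break n if n.even?     # leaves only the inner each
    end
    found << hit             # execution resumes here after break
  end
  found
end

p first_even_in_each([[1, 2, 3], [5, 8, 9]])
```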

Then, let's see how this is resolved.

▼ rb_yield_0() − the parts relating to tags

    PUSH_TAG(PROT_NONE);
    if ((state = EXEC_TAG()) == 0) {
        /* …… evaluate the body …… */
    }
    else {
        switch (state) {
          case TAG_REDO:
            state = 0;
            CHECK_INTS;
            goto redo;
          case TAG_NEXT:
            state = 0;
            result = prot_tag->retval;
            break;
          case TAG_BREAK:
          case TAG_RETURN:
            state |= (serial++ << 8);
            state |= 0x10;
            block->tag->dst = state;
            break;
          default:
            break;
        }
    }
    POP_TAG();

(eval.c)

The parts of TAG_BREAK and TAG_RETURN are crucial.

First, serial is a static variable of rb_yield_0(), so its value is different on every call of rb_yield_0. “serial” is the serial of “serial number”.

The reason for left-shifting by 8 bits seems to be to avoid overlapping with the values of TAG_xxxx. TAG_xxxx is in the range 0x1 ~ 0x8, so 4 bits are enough. And the bit-or with 0x10 seems to guard against the overflow of serial. On a 32-bit machine, serial can use only 24 bits (only about 16 million calls), so a recent machine can make it overflow in less than 10 seconds. When this happens, the top 24 bits all become 0. Therefore, if 0x10 did not exist, state would be the same value as TAG_xxxx (see also Figure 6).

Figure 6: block->tag->dst
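The bit arithmetic can be checked with a short Ruby sketch. The concrete values of TAG_BREAK and TAG_MASK below are assumptions made for the illustration (the text only says that the tags fit in 4 bits), and make_dst is a hypothetical helper that imitates the two |= statements on a 32-bit int.

```ruby
TAG_MASK  = 0xf          # assumed: the low 4 bits hold TAG_xxxx
TAG_BREAK = 0x2          # assumed value, for illustration only

# Mirror of: state |= (serial++ << 8); state |= 0x10;  (32-bit int)
def make_dst(tag, serial)
  (tag | (serial << 8) | 0x10) & 0xffff_ffff
end

dst_a   = make_dst(TAG_BREAK, 1)
dst_b   = make_dst(TAG_BREAK, 2)
wrapped = make_dst(TAG_BREAK, 1 << 24)   # serial's low 24 bits are all 0

printf("%#x %#x %#x\n", dst_a, dst_b, wrapped)
printf("%#x\n", wrapped & TAG_MASK)      # the original tag is recoverable
```

Even in the wrapped case the result differs from the bare tag because of the 0x10 bit, while masking with TAG_MASK still recovers the tag, which is what the NODE_ITER side relies on.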

Now, tag->dst has become a value that differs from TAG_xxxx and is unique for each call. In this situation, because an ordinary switch like the previous ones cannot receive it, the side that stops the jump needs to make some effort. The place that makes the effort is this part of rb_eval: NODE_ITER:

▼ rb_eval() − NODE_ITER (to stop jumps)

    case NODE_ITER:
      {
          state = EXEC_TAG();
          if (state == 0) {
              /* …… invoke an iterator …… */
          }
          else if (_block.tag->dst == state) {
              state &= TAG_MASK;
              if (state == TAG_RETURN || state == TAG_BREAK) {
                  result = prot_tag->retval;
              }
          }
      }

In the corresponding NODE_ITER and rb_yield_0, block should point to the same thing, so the tag->dst that was set in rb_yield_0 comes in here. Because of this, only the corresponding NODE_ITER can properly stop the jump.

Check of a block

Whether or not the method currently being evaluated is an iterator, in other words, whether there's a block, can be checked with rb_block_given_p(). After reading all of the above, we can understand its implementation.

▼ rb_block_given_p()

    int
    rb_block_given_p()
    {
        if (ruby_frame->iter && ruby_block)
            return Qtrue;
        return Qfalse;
    }

(eval.c)

I think there's no problem. What I'd like to talk about this time is actually another checking function, rb_f_block_given_p().

▼ rb_f_block_given_p()

    static VALUE
    rb_f_block_given_p()
    {
        if (ruby_frame->prev && ruby_frame->prev->iter && ruby_block)
            return Qtrue;
        return Qfalse;
    }

(eval.c)

This is the substance of Ruby's block_given?. Compared with rb_block_given_p(), it differs in checking the prev of ruby_frame. Why is this?

Thinking about the mechanism of pushing a block, checking the current ruby_frame as rb_block_given_p() does is right. But when block_given? is called from the Ruby level, since block_given? itself is a method, an extra FRAME is pushed. Hence, we need to check the previous one.
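Seen from the Ruby level, this means block_given? answers for the method that calls it, not for block_given? itself. A minimal sketch, with takes_block? as a made-up method name:

```ruby
def takes_block?
  block_given?       # asks about the caller of block_given?, i.e. this method
end

p takes_block? { nil }
p takes_block?
```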

Proc


To describe a Proc object from the viewpoint of implementation, it is “a BLOCK that can be brought out to the Ruby level”. Being able to be brought out to the Ruby level means having more latitude, but it also means that when and where it will be used becomes completely unpredictable. Focusing on how this fact influences things, let's look at the implementation.

Proc object creation

A Proc object is created with Proc.new. Its substance is proc_new().

▼ proc_new()

    static VALUE
    proc_new(klass)
        VALUE klass;
    {
        volatile VALUE proc;
        struct BLOCK *data, *p;
        struct RVarmap *vars;

        if (!rb_block_given_p() && !rb_f_block_given_p()) {
            rb_raise(rb_eArgError,
                     "tried to create Proc object without a block");
        }

        /* (A) allocate both struct RData and struct BLOCK */
        proc = Data_Make_Struct(klass, struct BLOCK,
                                blk_mark, blk_free, data);
        *data = *ruby_block;

        data->orig_thread = rb_thread_current();
        data->wrapper = ruby_wrapper;
        data->iter = data->prev?Qtrue:Qfalse;
        /* (B) the essential initialization is finished by here */
        frame_dup(&data->frame);
        if (data->iter) {
            blk_copy_prev(data);
        }
        else {
            data->prev = 0;
        }
        data->flags |= BLOCK_DYNAMIC;
        data->tag->flags |= BLOCK_DYNAMIC;

        for (p = data; p; p = p->prev) {
            for (vars = p->dyna_vars; vars; vars = vars->next) {
                if (FL_TEST(vars, DVAR_DONT_RECYCLE)) break;
                FL_SET(vars, DVAR_DONT_RECYCLE);
            }
        }
        scope_dup(data->scope);
        proc_save_safe_level(proc);

        return proc;
    }

(eval.c)

The creation of a Proc object itself is unexpectedly simple. Between (A) and (B), a space for a Proc object is allocated and its initialization completes. Data_Make_Struct() is a simple macro that does both malloc() and Data_Wrap_Struct() at the same time.

The problems exist after that:

frame_dup()

blk_copy_prev()

FL_SET(vars,DVAR_DONT_RECYCLE)

scope_dup()

Thesefourhavethesamepurposes.Theyare:

Page 763: Ruby Hacking Guide

moveallofwhatwereputonthemachinestacktotheheap.preventfromcollectingevenifafterPOP

Here,“all”meanstheallthingsincludingprev.Fortheallstackframespushedthere,itduplicateseachframebydoingmalloc()andcopying.VARSisusuallyforcedtobecollectedbyrb_gc_force_recycle()atthesamemomentofPOP,butthisbehaviorisstoppedbysettingtheDVAR_DONT_RECYCLEflag.Andsoon.Reallyextremethingsaredone.

Why are these extreme things necessary? Because, unlike iterator blocks, a Proc can persist longer than the method that created it. And the end of a method means that the things allocated on the machine stack, such as FRAME, ITER, and the local_vars of SCOPE, are invalidated. It’s easy to predict the consequence of using invalidated memory. (An example answer: it becomes troublesome.)
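Seen from the Ruby level, the situation that forces this copying looks roughly like the following. This is only a small sketch; make_counter and its variables are names made up for illustration, and the code runs on a current Ruby rather than on the ruby 1.7 discussed here.

```ruby
# A local variable of a method that has already returned is still
# reachable through the Proc created inside that method.
def make_counter
  count = 0                 # lives in the scope of make_counter
  Proc.new { count += 1 }   # the block captures that scope
end

counter = make_counter      # make_counter has finished at this point
first  = counter.call
second = counter.call
puts first, second
```

If count had stayed on the machine stack, both calls would touch memory that was invalidated when make_counter returned.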

I tried to contrive a way to at least use the same FRAME from multiple Procs, but since there are places such as old_frame where pointers to local variables are set aside, it does not seem to go well. If it requires a lot of effort anyway, another approach, say, allocating all of them with malloc() from the first place, seems more worth a try.

Anyway, I sentimentally think it’s surprising that it runs at that speed even though it does these extreme things. Indeed, these are good times.


Floating Frame

Previously, I covered it in just one phrase, “duplicate all frames”, but since that was unclear, let’s look at more details. There are two points:

How to duplicate all

Why all of them are duplicated

First, let’s start with a summary of how each stack frame is saved.

Frame        location   has prev pointer?
FRAME        stack      yes
SCOPE        stack      no
local_tbl    heap
local_vars   stack
VARS         heap       no
BLOCK        stack      yes

CLASS, CREF, and ITER are not necessary this time. Since CLASS is a general Ruby object, rb_gc_force_recycle() is never called on it, even by mistake (it’s impossible), and both CREF and ITER become unnecessary once their values at that moment are stored in FRAME. The four frames in the above table are important because they will be modified or referred to multiple times later. The remaining three will not.

Now, the discussion moves to how to duplicate all. I said “how”, but it is not about things such as “by malloc()”. The problem is how to duplicate “all”. The reason is that, as you can see in the above table, there are some frames without any prev pointer. In other words, we cannot follow links. In this situation, how can we duplicate all?

A fairly clever technique is used to counter this. Let’s take SCOPE as an example. A function named scope_dup() was used previously to duplicate SCOPE, so let’s look at it first.

▼scope_dup() only the beginning

static void
scope_dup(scope)
    struct SCOPE *scope;
{
    ID *tbl;
    VALUE *vars;

    scope->flags |= SCOPE_DONT_RECYCLE;

(eval.c)

As you can see, SCOPE_DONT_RECYCLE is set. Next, take a look at the definition of POP_SCOPE():

▼POP_SCOPE() only the beginning

#define POP_SCOPE()                                 \
    if (ruby_scope->flags & SCOPE_DONT_RECYCLE) {   \
        if (_old) scope_dup(_old);                  \
    }                                               \

(eval.c)


When it pops, if the SCOPE_DONT_RECYCLE flag was set on the current SCOPE (ruby_scope), it also does scope_dup() on the previous SCOPE (_old). In other words, SCOPE_DONT_RECYCLE is also set on that one. In this way, the flag is propagated one by one at the time of each pop. (Figure 7)

Figure 7: flag propagation

Since VARS does not have any prev pointer either, the same technique is used to propagate the DVAR_DONT_RECYCLE flag.
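The propagation can be modeled in a few lines of Ruby. This is a toy model under my own assumptions, not ruby’s actual data structures: ScopeModel, ScopeStack and their methods are hypothetical names, and the only thing the model keeps from the real code is the idea that a marked frame marks the frame below it at pop time.

```ruby
# Toy model of the flag propagation (hypothetical names throughout).
ScopeModel = Struct.new(:name, :dont_recycle)

class ScopeStack
  def initialize
    @stack = []
  end

  def push(name)
    @stack.push(ScopeModel.new(name, false))
    @stack.last
  end

  # Plays the role of scope_dup(): mark the current scope.
  def mark_current
    @stack.last.dont_recycle = true
  end

  # Plays the role of POP_SCOPE(): if the popped scope is marked,
  # the scope below it is marked too. No prev pointer is needed.
  def pop
    current = @stack.pop
    below = @stack.last
    below.dont_recycle = true if current.dont_recycle && below
    current
  end
end

stack  = ScopeStack.new
outer  = stack.push(:outer)
middle = stack.push(:middle)
inner  = stack.push(:inner)

stack.mark_current                      # a Proc is created in the innermost scope
flags_before = [outer, middle, inner].map(&:dont_recycle)
stack.pop                               # inner is popped, middle gets marked
stack.pop                               # middle is popped, outer gets marked
flags_after = [outer, middle, inner].map(&:dont_recycle)
p flags_before, flags_after
```

The point of the design is that nothing is marked in advance: the cost is paid lazily, one frame per pop.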

Next, the second point: let’s think about “why all of them are duplicated”. We can understand that the local variables of SCOPE can be referred to later if a Proc is created from it. However, is it necessary to copy all of them, including the previous SCOPE, in order to accomplish that?

Honestly speaking, I couldn’t find the answer to this question and worried for almost three days about how to write this section. I’ve only just got the answer. Take a look at the next program:

def get_proc
  Proc.new { nil }
end

env = get_proc { p 'ok' }
eval("yield", env)

I have not explained this feature, but by passing a Proc object as the second argument of eval, you can evaluate the string in that environment.

It means, as readers who have read this far can probably tell, that it pushes the various environments taken from the Proc (meaning BLOCK) and evaluates. In this case, it naturally also pushes BLOCK, and you can turn that BLOCK into a Proc again. Then, using that Proc when doing eval… if things are done like this, you can access almost all information of ruby_block from the Ruby level as you like. This is the reason why the entire stacks need to be fully duplicated. ((errata: we cannot access ruby_block as we like from the Ruby level. The reason why all SCOPEs are duplicated was not understood. It seems all we can do is investigate the mailing list archives from the time when this change was applied. (It is still not certain whether we can find out the reason in this way.)))

Invocation of Proc

Next, we’ll look at the invocation of a created Proc. Since Proc#call can be used from Ruby to invoke it, we can follow its substance.

The substance of Proc#call is proc_call():


▼proc_call()

static VALUE
proc_call(proc, args)
    VALUE proc, args;           /* OK */
{
    return proc_invoke(proc, args, Qtrue, Qundef);
}

(eval.c)

It delegates to proc_invoke(). When I looked up invoke in a dictionary, it was defined as something like “call on (God, etc.) for help”, but in the context of programming, it is often used with almost the same meaning as “activate”.

The prototype of proc_invoke() is,

proc_invoke(VALUE proc, VALUE args, int pcall, VALUE self)

However, according to the previous code, pcall=Qtrue and self=Qundef in this case, so these two can be removed by constant folding.

▼proc_invoke (simplified)

static VALUE
proc_invoke(proc, args, /* pcall=Qtrue */, /* self=Qundef */)
    VALUE proc, args;
    VALUE self;
{
    struct BLOCK * volatile old_block;
    struct BLOCK _block;
    struct BLOCK *data;
    volatile VALUE result = Qnil;
    int state;
    volatile int orphan;
    volatile int safe = ruby_safe_level;
    volatile VALUE old_wrapper = ruby_wrapper;
    struct RVarmap * volatile old_dvars = ruby_dyna_vars;

    /* (A) take BLOCK from proc and assign it to data */
    Data_Get_Struct(proc, struct BLOCK, data);
    /* (B) blk_orphan */
    orphan = blk_orphan(data);

    ruby_wrapper = data->wrapper;
    ruby_dyna_vars = data->dyna_vars;
    /* (C) push BLOCK from data */
    old_block = ruby_block;
    _block = *data;
    ruby_block = &_block;

    /* (D) transition to ITER_CUR */
    PUSH_ITER(ITER_CUR);
    ruby_frame->iter = ITER_CUR;

    PUSH_TAG(PROT_NONE);
    state = EXEC_TAG();
    if (state == 0) {
        proc_set_safe_level(proc);
        /* (E) invoke the block */
        result = rb_yield_0(args, self, 0, pcall);
    }
    POP_TAG();

    POP_ITER();
    if (ruby_block->tag->dst == state) {
        state &= TAG_MASK;      /* target specified jump */
    }
    ruby_block = old_block;
    ruby_wrapper = old_wrapper;
    ruby_dyna_vars = old_dvars;
    ruby_safe_level = safe;

    switch (state) {
      case 0:
        break;
      case TAG_BREAK:
        result = prot_tag->retval;
        break;
      case TAG_RETURN:
        if (orphan) {   /* orphan procedure */
            localjump_error("return from proc-closure", prot_tag->retval);
        }
        /* fall through */
      default:
        JUMP_TAG(state);
    }
    return result;
}

There are three crucial points: C, D, and E.

(C) At NODE_ITER a BLOCK is created from the syntax tree and pushed, but this time, a BLOCK is taken from the Proc and pushed.

(D) Previously it was ITER_PRE before becoming ITER_CUR at rb_call0(), but this time it goes directly into ITER_CUR.

(E) In the case of an ordinary iterator, a method call exists before yield occurs and control then goes to rb_yield_0(), but this time rb_yield_0() is called directly and invokes the block that was just pushed.

In other words, in the case of an iterator, the procedure is separated into three places, NODE_ITER ~ rb_call0() ~ NODE_YIELD. But this time, they are done all at once.

Finally, I’ll talk about the meaning of blk_orphan(). As the name suggests, it is a function to determine the state of “the method which created the Proc has finished”. For example, if the SCOPE used by a BLOCK has already been popped, you can determine that it has finished.
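The orphan case can be observed from the Ruby level. The following is a minimal sketch for a current Ruby; make_returning_proc is a made-up name, and the exact error message differs from the "return from proc-closure" string of ruby 1.7, so only the exception class is checked.

```ruby
# A Proc that executes `return` after its creating method has finished
# has nowhere to return to.
def make_returning_proc
  Proc.new { return :from_proc }
end

orphan = make_returning_proc   # the creating method has already finished

outcome =
  begin
    orphan.call
    :no_error
  rescue LocalJumpError => e
    e.class
  end
p outcome
```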

Block and Proc

In the previous chapter, various things about arguments and parameters of methods were discussed, but I have not described block parameters yet. Although it is brief, here I’ll give the final part of that series.

def m(&block)
end

This is a “block parameter”. The way to enable this is very simple. If m is an iterator, it is certain that a BLOCK was already pushed, so turn it into a Proc and assign it to (in this case) the local variable block. Turning a block into a Proc is just a matter of calling proc_new(), which was previously described. The reason why just calling it is enough may be a little hard to see. However, whether it is Proc.new or m, the situation “a method is called and a BLOCK is pushed” is the same. Therefore, from the C level, you can turn a block into a Proc at any time by just calling proc_new().

And if m is not an iterator, all we have to do is simply assign nil.

Next is the side that passes a block.

m(&block)


This is a “block argument”. This is also simple: take a BLOCK from (a Proc object stored in) block and push it. What differs from PUSH_BLOCK() is only whether a BLOCK has already been created in advance or not.

The function that does this procedure is block_pass(). If you are curious, check and confirm around it. However, it really does only what was described here, so it’s possible you’ll be disappointed…
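Both directions can be tried in a few lines. This is just an illustrative sketch; capture and run_with are names I made up for it.

```ruby
# Block parameter: the pushed block arrives as a Proc (or nil).
def capture(&block)
  block
end

# An ordinary iterator that yields to whatever block was pushed.
def run_with(value)
  yield value
end

doubler  = capture { |x| x * 2 }    # block -> Proc
no_block = capture                  # not called as an iterator -> nil
result   = run_with(10, &doubler)   # block argument: Proc -> block
p doubler.class, no_block, result
```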

The original work is Copyright © 2002-2004 Minero AOKI. Translated by Vincent ISAMBART and Clifford Escobar CAOILE. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License


Chapter 17: Dynamic evaluation

Overview

I finished describing the mechanism of the evaluator in the previous chapter. In this chapter, by including the parser in addition to it, let’s examine the big picture as “the evaluator in a broad sense”. There are three targets: eval, Module#module_eval and Object#instance_eval.

eval

I’ve already described eval, but I’ll introduce a few more small details about it here.

By using eval, you can compile and evaluate a string at runtime, on the spot. Its return value is the value of the last expression of the program.

p eval("1 + 1")   # 2


You can also refer to a variable in its scope from inside a string passed to eval.

lvar = 5
@ivar = 6
p eval("lvar + @ivar")   # 11

Readers who have read this far cannot simply read past the words “its scope”. For instance, you are curious about what the “scope” of constants is, aren’t you? I am. To put the bottom line first, you can basically think of it as directly inheriting the environment outside of eval.

And you can also define methods and define classes.

def a
  eval('class C;  def test() puts("ok") end  end')
end

a()              # define class C and C#test
C.new.test       # shows ok

Moreover, as mentioned a little in the previous chapter, when you pass a Proc as the second argument, the string can be evaluated in its environment.

def new_env
  n = 5
  Proc.new { nil }   # turn the environment of this method into an object and return it
end

p eval('n * 3', new_env())   # 15
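For readers trying this on a current Ruby: as far as I know, eval there expects a Binding rather than a Proc as its second argument, so the example above needs a small adaptation. The sketch below assumes such a Ruby and uses Proc#binding to get at the environment; tripled is a name made up for illustration.

```ruby
def new_env
  n = 5
  Proc.new { nil }   # the Proc still carries the environment of this method
end

env = new_env
# Assumption: a current Ruby, where the environment is passed as a Binding.
tripled = eval('n * 3', env.binding)
p tripled
```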


module_eval and instance_eval

When a Proc is passed as the second argument of eval, the evaluation can be done in its environment. module_eval and instance_eval are limited (or shortcut) versions of it. With module_eval, you can evaluate in an environment that is as if inside a module statement or a class statement.

lvar = "toplevel lvar"    # a local variable to confirm this scope

module M
end
M.module_eval(<<'EOS')    # a suitable situation to use here-document
    p lvar   # referable
    p self   # shows M
    def ok   # define M#ok
      puts 'ok'
    end
EOS

With instance_eval, you can evaluate in an environment like that of a singleton class statement, where self is the object.

lvar = "toplevel lvar"    # a local variable to confirm this scope

obj = Object.new
obj.instance_eval(<<'EOS')
    p lvar   # referable
    p self   # shows #<Object:0x40274f5c>
    def ok   # define obj.ok
      puts 'ok'
    end
EOS
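Where the methods defined by def end up can be checked directly. The following is a small sketch; Greeter, hello and the result variables are names made up for illustration.

```ruby
module Greeter
end
Greeter.module_eval(<<'EOS')
  def hello          # becomes an instance method of Greeter
    'hello'
  end
EOS

obj = Object.new
obj.instance_eval(<<'EOS')
  def ok             # becomes a singleton method of obj only
    'ok'
  end
EOS

includes_hello = Greeter.instance_methods(false).include?(:hello)
singleton_ok   = obj.singleton_methods.include?(:ok)
other_has_ok   = Object.new.respond_to?(:ok)
p includes_hello, singleton_ok, other_has_ok
```

This matches the description above: module_eval behaves like the body of a module statement, and instance_eval behaves like the body of a singleton class statement.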

Additionally, module_eval and instance_eval can also be used as iterators; in that case a block is evaluated in each environment. For instance,

obj = Object.new
p obj                 # #<Object:0x40274fac>
obj.instance_eval {
    p self            # #<Object:0x40274fac>
}

Like this.

However, between the case of using a string and the case of using a block, the behavior around local variables differs. For example, when a block is created in method a and then passed to instance_eval in method b, the block refers to the local variables of a. When a string is created in method a and then passed to instance_eval in method b, the code inside the string refers to the local variables of b. The scope of local variables is decided “at compile time”, and the consequence differs because a string is compiled every time it is evaluated, whereas a block is compiled when the file is loaded.
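The difference can be reproduced with a short experiment. This is a sketch for a current Ruby; ScopeDemo and its methods are hypothetical names playing the roles of the “a” and “b” methods in the text.

```ruby
class ScopeDemo
  # Plays the role of method "a" for the block case.
  def make_block
    name = "from a"
    Proc.new { name }
  end

  # Plays the role of method "b" for the block case.
  def run_block(blk)
    name = "from b"
    instance_eval(&blk)
  end

  # Plays the role of method "a" for the string case.
  def make_string
    name = "from a"
    "name"
  end

  # Plays the role of method "b" for the string case.
  def run_string(src)
    name = "from b"
    instance_eval(src)
  end
end

demo = ScopeDemo.new
block_result  = demo.run_block(demo.make_block)
string_result = demo.run_string(demo.make_string)
p block_result, string_result
```

The block sees the variable of the method where it was written, and the string sees the variable of the method where it is evaluated.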

eval

eval()

The eval of Ruby branches many times based on the presence and absence of its parameters. Let’s assume the form of the call is limited to the following:

eval(prog_string, some_block)

Then, since this makes the actual interface function rb_f_eval() almost meaningless, we’ll start with the function eval(), which is one step lower. The function prototype of eval() is:

static VALUE
eval(VALUE self, VALUE src, VALUE scope, char *file, int line);

scope is the Proc of the second parameter. file and line are the file name and line number where the string to eval is supposed to be located. Then, let’s see the content:

▼eval() (simplified)

static VALUE
eval(self, src, scope, file, line)
    VALUE self, src, scope;
    char *file;
    int line;
{
    struct BLOCK *data = NULL;
    volatile VALUE result = Qnil;
    struct SCOPE * volatile old_scope;
    struct BLOCK * volatile old_block;
    struct RVarmap * volatile old_dyna_vars;
    VALUE volatile old_cref;
    int volatile old_vmode;
    volatile VALUE old_wrapper;
    struct FRAME frame;
    NODE *nodesave = ruby_current_node;
    volatile int iter = ruby_frame->iter;
    int state;

    if (!NIL_P(scope)) {  /* always true now */
        Data_Get_Struct(scope, struct BLOCK, data);
        /* push BLOCK from data */
        frame = data->frame;
        frame.tmp = ruby_frame; /* to prevent from GC */
        ruby_frame = &(frame);
        old_scope = ruby_scope;
        ruby_scope = data->scope;
        old_block = ruby_block;
        ruby_block = data->prev;
        old_dyna_vars = ruby_dyna_vars;
        ruby_dyna_vars = data->dyna_vars;
        old_vmode = scope_vmode;
        scope_vmode = data->vmode;
        old_cref = (VALUE)ruby_cref;
        ruby_cref = (NODE*)ruby_frame->cbase;
        old_wrapper = ruby_wrapper;
        ruby_wrapper = data->wrapper;
        self = data->self;
        ruby_frame->iter = data->iter;
    }
    PUSH_CLASS();
    ruby_class = ruby_cbase;  /* == ruby_frame->cbase */

    ruby_in_eval++;
    if (TYPE(ruby_class) == T_ICLASS) {
        ruby_class = RBASIC(ruby_class)->klass;
    }
    PUSH_TAG(PROT_NONE);
    if ((state = EXEC_TAG()) == 0) {
        NODE *node;

        result = ruby_errinfo;
        ruby_errinfo = Qnil;
        node = compile(src, file, line);
        if (ruby_nerrs > 0) {
            compile_error(0);
        }
        if (!NIL_P(result)) ruby_errinfo = result;
        result = eval_node(self, node);
    }
    POP_TAG();
    POP_CLASS();
    ruby_in_eval--;
    if (!NIL_P(scope)) {  /* always true now */
        int dont_recycle = ruby_scope->flags & SCOPE_DONT_RECYCLE;

        ruby_wrapper = old_wrapper;
        ruby_cref  = (NODE*)old_cref;
        ruby_frame = frame.tmp;
        ruby_scope = old_scope;
        ruby_block = old_block;
        ruby_dyna_vars = old_dyna_vars;
        data->vmode = scope_vmode; /* save the modification of the visibility scope */
        scope_vmode = old_vmode;
        if (dont_recycle) {
            /* ……copy SCOPE BLOCK VARS…… */
        }
    }
    if (state) {
        if (state == TAG_RAISE) {
            /* ……prepare an exception object…… */
            rb_exc_raise(ruby_errinfo);
        }
        JUMP_TAG(state);
    }

    return result;
}

(eval.c)

If this function were shown without any preamble, you would probably feel “oww!”. But we’ve defeated many functions of eval.c by now, so this is not enough to be an enemy of ours. This function is just continuously saving and restoring the stacks. The points we need to care about are only the following three:

unusually, FRAME is also replaced (not copied and pushed)

ruby_cref is substituted (?) by ruby_frame->cbase

only scope_vmode is not simply restored but influences data.

And the main parts are the compile() and eval_node() located around the middle. Though eval_node() may already have been forgotten, it is the function that starts the evaluation of the node given as its parameter. It was also used in ruby_run().

Here is compile().

▼compile()

static NODE*
compile(src, file, line)
    VALUE src;
    char *file;
    int line;
{
    NODE *node;

    ruby_nerrs = 0;
    Check_Type(src, T_STRING);
    node = rb_compile_string(file, src, line);

    if (ruby_nerrs == 0) return node;
    return 0;
}

(eval.c)

ruby_nerrs is the variable incremented in yyerror(). In other words, if this variable is non-zero, it indicates that at least one parse error happened. And rb_compile_string() was already discussed in Part 2. It is a function that compiles a Ruby string into a syntax tree.
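At the Ruby level, a failed compile shows up as a SyntaxError raised by eval. The sketch below is only an observation from the outside; try_compile is a hypothetical helper, and unlike compile() it also evaluates the string when parsing succeeds.

```ruby
# Report whether a string gets through the parser when passed to eval.
def try_compile(src)
  eval(src)
  :ok
rescue SyntaxError
  :parse_error
end

good = try_compile("1 + 1")
bad  = try_compile("1 +")    # the expression is incomplete
p good, bad
```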


One thing that becomes a problem here is local variables. As we’ve seen in Chapter 12: Syntax tree construction, local variables are managed by using lvtbl. However, since a SCOPE (and possibly also VARS) already exists, we need to parse in a way that writes over and adds to it. This is in fact the heart of eval(), and is the most difficult part. Let’s go back to parse.y again and complete this investigation.

top_local

I’ve mentioned that the functions named local_push() and local_pop() are used when pushing struct local_vars, which is the management table of local variables, but actually there’s one more pair of functions to push the management table. It is the pair of top_local_init() and top_local_setup(). They are called in this sort of way.

▼How top_local_init() is called

program :   { top_local_init(); }
          compstmt
            { top_local_setup(); }

Of course, in actuality various other things are also done, but all of them are cut here because they are not important. And this is its content:

▼top_local_init()

static void
top_local_init()
{
    local_push(1);
    lvtbl->cnt = ruby_scope->local_tbl?ruby_scope->local_tbl[0]:0;
    if (lvtbl->cnt > 0) {
        lvtbl->tbl = ALLOC_N(ID, lvtbl->cnt+3);
        MEMCPY(lvtbl->tbl, ruby_scope->local_tbl, ID, lvtbl->cnt+1);
    }
    else {
        lvtbl->tbl = 0;
    }
    if (ruby_dyna_vars)
        lvtbl->dlev = 1;
    else
        lvtbl->dlev = 0;
}

(parse.y)

This means that local_tbl is copied from ruby_scope to lvtbl. As for block local variables, since it’s better to see them all at once later, we’ll focus on ordinary local variables for the time being. Next, here is top_local_setup().

▼top_local_setup()

static void
top_local_setup()
{
    int len = lvtbl->cnt;  /* the number of local variables after parsing */
    int i;                 /* the number of local variables before parsing */

    if (len > 0) {
        i = ruby_scope->local_tbl ? ruby_scope->local_tbl[0] : 0;

        if (i < len) {
            if (i == 0 || (ruby_scope->flags & SCOPE_MALLOC) == 0) {
                VALUE *vars = ALLOC_N(VALUE, len+1);
                if (ruby_scope->local_vars) {
                    *vars++ = ruby_scope->local_vars[-1];
                    MEMCPY(vars, ruby_scope->local_vars, VALUE, i);
                    rb_mem_clear(vars+i, len-i);
                }
                else {
                    *vars++ = 0;
                    rb_mem_clear(vars, len);
                }
                ruby_scope->local_vars = vars;
                ruby_scope->flags |= SCOPE_MALLOC;
            }
            else {
                VALUE *vars = ruby_scope->local_vars-1;
                REALLOC_N(vars, VALUE, len+1);
                ruby_scope->local_vars = vars+1;
                rb_mem_clear(ruby_scope->local_vars+i, len-i);
            }
            if (ruby_scope->local_tbl &&
                ruby_scope->local_vars[-1] == 0) {
                free(ruby_scope->local_tbl);
            }
            ruby_scope->local_vars[-1] = 0;  /* NODE is not necessary anymore */
            ruby_scope->local_tbl = local_tbl();
        }
    }
    local_pop();
}

(parse.y)

Since local_vars can be either on the stack or in the heap, the code becomes somewhat complex. However, this is just updating local_tbl and local_vars of ruby_scope. (When SCOPE_MALLOC is set, local_vars was allocated by malloc().) And here, because there’s no point in using alloca(), it is forced to change its allocation method to malloc().
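The visible effect of this “adding to an existing scope” can be seen from Ruby. The sketch below assumes a current Ruby, where the environment is a Binding object; fresh_binding and the variable names are made up for illustration, and the internals behind it are of course not those of ruby 1.7.

```ruby
def fresh_binding
  binding            # an environment with no local variables yet
end

env = fresh_binding
before = env.local_variable_defined?(:added)
eval("added = 10", env)                       # parsing adds a new local variable
after  = env.local_variable_defined?(:added)
value  = eval("added + 1", env)               # a later eval still finds it
p before, after, value
```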


Block Local Variable

By the way, how about block local variables? To think about this, we have to go back to the entry point of the parser first, which is yycompile().

▼setting ruby_dyna_vars aside

static NODE*
yycompile(f, line)
{
    struct RVarmap *vars = ruby_dyna_vars;
         :
    n = yyparse();
         :
    ruby_dyna_vars = vars;
}

This looks like a mere save-restore, but the point is that this does not clear ruby_dyna_vars. This means that the parser, too, directly adds elements to the link of RVarmap created in the evaluator.

However, according to the previous description, the structure of ruby_dyna_vars differs between the parser and the evaluator. How does it deal with the difference in the way of attaching the header (an RVarmap whose id=0)?

What is helpful here is the “1” of local_push(1) in top_local_init(). When the argument of local_push() is true, it does not attach the first header of ruby_dyna_vars. It means that it would look like Figure 1. Now, it is assured that we can refer to the block local variables of the outside scope from inside a string passed to eval.

Figure 1: ruby_dyna_vars inside eval

Well, it’s sure that we can refer to them, but didn’t you say that ruby_dyna_vars is entirely freed in the parser? What can we do if the link created in the evaluator is freed? … I’d like the readers who noticed this to be relieved by reading the next part.

▼yycompile() - freeing ruby_dyna_vars

vp = ruby_dyna_vars;
ruby_dyna_vars = vars;
lex_strterm = 0;
while (vp && vp != vars) {
    struct RVarmap *tmp = vp;
    vp = vp->next;
    rb_gc_force_recycle((VALUE)tmp);
}

(parse.y)

It is designed so that the loop stops when it reaches the link created in the evaluator (vars).
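The stopping condition is easy to model. The following is a toy model, not the real RVarmap: Varmap and entries_to_recycle are hypothetical names, and “recycling” is represented by merely collecting the ids.

```ruby
# Toy linked list standing in for the chain of RVarmap.
Varmap = Struct.new(:id, :next_map)

# Walk from the current head and collect entries until the head that
# the evaluator had created is reached; that part is left alone.
def entries_to_recycle(head, evaluator_head)
  recycled = []
  vp = head
  while vp && !vp.equal?(evaluator_head)
    recycled << vp.id
    vp = vp.next_map
  end
  recycled
end

evaluator_b = Varmap.new(:b, nil)
evaluator_a = Varmap.new(:a, evaluator_b)   # the chain that existed before parsing
parser_y    = Varmap.new(:y, evaluator_a)   # entries added by the parser
parser_x    = Varmap.new(:x, parser_y)

recycled = entries_to_recycle(parser_x, evaluator_a)
p recycled
```

Identity comparison (equal?) is used on purpose, since it corresponds to the pointer comparison vp != vars.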


instance_eval

The Whole Picture

The substance of Module#module_eval is rb_mod_module_eval(), and the substance of Object#instance_eval is rb_obj_instance_eval().

▼rb_mod_module_eval() rb_obj_instance_eval()

VALUE
rb_mod_module_eval(argc, argv, mod)
    int argc;
    VALUE *argv;
    VALUE mod;
{
    return specific_eval(argc, argv, mod, mod);
}

VALUE
rb_obj_instance_eval(argc, argv, self)
    int argc;
    VALUE *argv;
    VALUE self;
{
    VALUE klass;

    if (rb_special_const_p(self)) {
        klass = Qnil;
    }
    else {
        klass = rb_singleton_class(self);
    }

    return specific_eval(argc, argv, klass, self);
}

(eval.c)


These two methods have a common part as “a method to replace self with class”, and that part is defined as specific_eval(). Figure 2 shows it and also what will be described next. The names in parentheses are calls made through function pointers.

Figure 2: Call Graph

Whether it is instance_eval or module_eval, it can accept both a block and a string, thus it branches into a particular process for each, yield and eval respectively. However, most of those are common again, and that part is extracted as exec_under().

But for those who read it, having to face 2 times 2 = 4 ways simultaneously is not a good plan. Therefore, here we assume only the case when

1. it is an instance_eval
2. which takes a string as its argument

Then, extracting all functions under rb_obj_instance_eval() in-line and folding constants, we’ll read the result.

After Absorbed

After all that, it becomes very comprehensible in comparison to the version before being absorbed.

▼specific_eval() - instance_eval, eval, string

static VALUE
instance_eval_string(self, src, file, line)
    VALUE self, src;
    const char *file;
    int line;
{
    VALUE sclass;
    VALUE result;
    int state;
    int mode;

    sclass = rb_singleton_class(self);

    PUSH_CLASS();
    ruby_class = sclass;
    PUSH_FRAME();
    ruby_frame->self       = ruby_frame->prev->self;
    ruby_frame->last_func  = ruby_frame->prev->last_func;
    ruby_frame->last_class = ruby_frame->prev->last_class;
    ruby_frame->argc       = ruby_frame->prev->argc;
    ruby_frame->argv       = ruby_frame->prev->argv;
    if (ruby_frame->cbase != sclass) {
        ruby_frame->cbase = rb_node_newnode(NODE_CREF, sclass, 0,
                                            ruby_frame->cbase);
    }
    PUSH_CREF(sclass);

    mode = scope_vmode;
    SCOPE_SET(SCOPE_PUBLIC);
    PUSH_TAG(PROT_NONE);
    if ((state = EXEC_TAG()) == 0) {
        result = eval(self, src, Qnil, file, line);
    }
    POP_TAG();
    SCOPE_SET(mode);

    POP_CREF();
    POP_FRAME();
    POP_CLASS();
    if (state) JUMP_TAG(state);

    return result;
}

It seems that this pushes the singleton class of the object to CLASS, CREF, and ruby_frame->cbase. The main process is a single shot of eval(). It is unusual that things such as initializing FRAME by a struct-copy are missing, but this does not make much of a difference either.

Before being absorbed

Though the author said it becomes more friendly to read, it’s possible that it was already simple before being absorbed, so let’s check what is simplified in comparison to the before-absorbed version.

The first one is specific_eval(). Since this function exists to share the code of the interface to Ruby, almost all of it is for parsing the parameters. Here is the result of cutting them all.


▼specific_eval() (simplified)

static VALUE
specific_eval(argc, argv, klass, self)
    int argc;
    VALUE *argv;
    VALUE klass, self;
{
    if (rb_block_given_p()) {

        return yield_under(klass, self);
    }
    else {

        return eval_under(klass, self, argv[0], file, line);
    }
}

(eval.c)

As you can see, this branches cleanly in two ways based on whether there’s a block or not, and each route never influences the other. Therefore, when reading, we should read them one by one. To begin with, the absorbed version is improved in this respect.

And file and line are irrelevant when reading yield_under(), thus in the case when the yield route is absorbed into the main body, it becomes obvious that we don’t have to think about the parsing of these parameters at all.

Next, we’ll look at eval_under() and eval_under_i().

▼eval_under()

static VALUE
eval_under(under, self, src, file, line)
    VALUE under, self, src;
    const char *file;
    int line;
{
    VALUE args[4];

    if (ruby_safe_level >= 4) {
        StringValue(src);
    }
    else {
        SafeStringValue(src);
    }
    args[0] = self;
    args[1] = src;
    args[2] = (VALUE)file;
    args[3] = (VALUE)line;
    return exec_under(eval_under_i, under, under, args);
}

static VALUE
eval_under_i(args)
    VALUE *args;
{
    return eval(args[0], args[1], Qnil, (char*)args[2], (int)args[3]);
}

(eval.c)

In this function, in order to make its arguments single, it stores them into the args array and passes it. We can imagine that this args exists as a temporary container to pass values from eval_under() to eval_under_i(), but we cannot be sure that it is truly so. It’s possible that args is modified inside exec_under().

As a way to share code, this is a very proper way to do it. But for those who read it, this kind of indirect passing is hard to follow. In particular, because there are extra casts for file and line to fool the compiler, it is hard to imagine what their actual types were. The parts around this entirely disappeared in the absorbed version, so you don’t have to worry about getting lost.
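The shape of this indirect passing can be imitated in Ruby. This is a rough model under my own assumptions: all the names ending in _model and CLASS_STACK are hypothetical, the single array stands in for the args array, and pushing one symbol stands in for the whole CLASS and CREF handling of exec_under().

```ruby
# Stand-in for the CLASS stack that exec_under() pushes onto.
CLASS_STACK = [:Object]

# Pushes `under`, calls a one-argument callback, and always pops.
def exec_under_model(callback, under, args)
  CLASS_STACK.push(under)
  callback.call(args)
ensure
  CLASS_STACK.pop
end

# The callback unpacks the array again, like eval_under_i().
eval_under_i_model = lambda do |args|
  _self_obj, src, file, line = args
  "#{file}:#{line}: eval #{src.inspect} under #{CLASS_STACK.last}"
end

# Packs its arguments into one array, like eval_under().
def eval_under_model(callback, under, self_obj, src, file, line)
  args = [self_obj, src, file, line]
  exec_under_model(callback, under, args)
end

message = eval_under_model(eval_under_i_model, :Singleton, :obj, "1 + 1", "(eval)", 1)
puts message
```

In Ruby the array keeps the types of its elements, so the casts that make the C version hard to read have no counterpart here.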

However, it’s too much to say that absorbing and extracting always makes things easier to understand. For example, when calling exec_under(), under is passed as both the second and third arguments, but is it all right to merge both parameter variables of the exec_under() side into under? That is to say, the second and third arguments of exec_under() in fact indicate the CLASS and CREF that should be pushed. CLASS and CREF are “different things”, so it might be better to use different variables. Also in the previous absorbed version, for only this point,

VALUE sclass = .....;
VALUE cbase = sclass;

I thought that I would write it this way, but also thought it could give a strange impression if only these variables were abruptly left, thus they were merged into sclass. This is only because of the flow of the text.

By now, I’ve extracted arguments and functions many times, and each time I repeatedly explained the reason for extracting. They are

there are only a few possible patterns

the behavior can slightly change

Definitely, I’m not saying “In whatever ways, extracting various things always makes things simpler”.

In any case, what has first priority is comprehensibility for ourselves, not keeping to a methodology. When extracting makes things simpler, extract. When we feel that not extracting, or conversely bundling things as a procedure, makes things easier to understand, let us do so. As for ruby, I often extracted because the original is written properly, but if source code was written by a poor programmer, aggressively bundling things into functions should often be a good choice.

The original work is Copyright © 2002-2004 Minero AOKI. Translated by Vincent ISAMBART and Clifford Escobar CAOILE. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License


Translated by Vincent ISAMBART


Chapter 18: Loading

Outline

Interface

At the Ruby level, there are two procedures that can be used for loading: require and load.

require 'uri'            # load the uri library
load '/home/foo/.myrc'   # read a resource file

They are both normal methods, compiled and evaluated exactly like any other code. It means loading occurs after compilation has given control to the evaluation stage.

These two functions each have their own use. require is for loading libraries, and load is for loading an arbitrary file. Let’s see this in more detail.

require

require has four features:

the file is searched for in the load path

it can load extension libraries

the .rb/.so extension can be omitted

a given file is never loaded more than once

Ruby’s load path is in the global variable $:, which contains an array of strings. For example, displaying the content of $: in the environment I usually use would show:

% ruby -e 'puts $:'
/usr/lib/ruby/site_ruby/1.7
/usr/lib/ruby/site_ruby/1.7/i686-linux
/usr/lib/ruby/site_ruby
/usr/lib/ruby/1.7
/usr/lib/ruby/1.7/i686-linux
.

Calling puts on an array displays one element on each line, so it’s easy to read.

As I ran configure using --prefix=/usr, the library path is /usr/lib/ruby and below, but if you compile it normally from the source code, the libraries will be in /usr/local/lib/ruby and below. In a Windows environment, there will also be a drive letter.

Then, let’s try to require the standard library nkf.so from the load path.

require 'nkf'

If the required name has no extension, require silently compensates. First, it tries with .rb, then with .so. On some platforms it also tries the platform’s specific extension for extension libraries, for example .dll in a Windows environment or .bundle on Mac OS X.

Let’s do a simulation on my environment. ruby checks the following paths in sequential order.

/usr/lib/ruby/site_ruby/1.7/nkf.rb
/usr/lib/ruby/site_ruby/1.7/nkf.so
/usr/lib/ruby/site_ruby/1.7/i686-linux/nkf.rb
/usr/lib/ruby/site_ruby/1.7/i686-linux/nkf.so
/usr/lib/ruby/site_ruby/nkf.rb
/usr/lib/ruby/site_ruby/nkf.so
/usr/lib/ruby/1.7/nkf.rb
/usr/lib/ruby/1.7/nkf.so
/usr/lib/ruby/1.7/i686-linux/nkf.rb
/usr/lib/ruby/1.7/i686-linux/nkf.so    found!
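The order of this list can be generated mechanically. The following is a sketch of the search order only, not of ruby’s actual lookup code; candidate_paths is a hypothetical helper and it does not touch the file system.

```ruby
# For each directory of the load path, try each extension in order.
def candidate_paths(load_path, name, extensions)
  load_path.flat_map do |dir|
    extensions.map { |ext| File.join(dir, name + ext) }
  end
end

load_path  = ["/usr/lib/ruby/site_ruby/1.7", "/usr/lib/ruby/1.7/i686-linux"]
candidates = candidate_paths(load_path, "nkf", [".rb", ".so"])
puts candidates
```

The directory loop is the outer one, so a file found in an earlier directory wins regardless of its extension.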

nkf.so has been found in /usr/lib/ruby/1.7/i686-linux. Once the file has been found, require’s last feature (not loading the file more than once) locks the file. The locks are strings put in the global variable $". In our case the string "nkf.so" has been put there. Even if the extension has been omitted when calling require, the file name in $" has the extension.

require 'nkf'   # after loading nkf...
p $"            # ["nkf.so"]  the file is locked

require 'nkf'   # nothing happens if we require it again
p $"            # ["nkf.so"]  the content of the lock array does not change

There are two reasons for adding the missing extension. The first one is not to load it twice if the same file is later required with its extension. The second one is to be able to load both nkf.rb and nkf.so. In fact the extensions are disparate (.so .dll .bundle etc.) depending on the platform, but at locking time they all become .so. That’s why when writing a Ruby program you can ignore the differences of extensions and consider it’s always .so. So you can say that ruby is quite UNIX oriented.

By the way, $" can be freely modified even at the Ruby level, so we cannot say it’s a strong lock. You can for example load an extension library multiple times if you clear $".
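The lock and its weakness can be tried with a throwaway library. This sketch assumes a current Ruby, where $" is also available as $LOADED_FEATURES and, as far as I know, holds full paths rather than bare names such as "nkf.so"; the file name rhg_lock_demo.rb and the global $rhg_load_count are made up for illustration.

```ruby
require 'tmpdir'

results = nil
Dir.mktmpdir do |dir|
  # A tiny library that counts how many times it has been evaluated.
  File.write(File.join(dir, "rhg_lock_demo.rb"),
             "$rhg_load_count = ($rhg_load_count || 0) + 1\n")
  $LOAD_PATH.unshift(dir)
  begin
    first  = require 'rhg_lock_demo'   # loaded, and the lock is taken
    second = require 'rhg_lock_demo'   # locked, nothing happens
    locked = $LOADED_FEATURES.any? { |f| f.end_with?("rhg_lock_demo.rb") }
    # Removing the entry by hand releases the lock.
    $LOADED_FEATURES.delete_if { |f| f.end_with?("rhg_lock_demo.rb") }
    third  = require 'rhg_lock_demo'   # loaded once more
    results = [first, second, locked, third, $rhg_load_count]
  ensure
    $LOAD_PATH.delete(dir)
  end
end
p results
```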

load

load is a lot easier than require. Like require, it searches for the file in $:. But it can only load Ruby programs. Furthermore, the extension cannot be omitted: the complete file name must always be given.

load 'uri.rb'   # load the URI library that is part of the standard library

In this simple example we try to load a library, but the proper way to use load is, for example, to load a resource file giving its full path.

Flow of the whole process

If we roughly split it, “loading a file” can be split into:

finding the file

reading the file and mapping it to an internal form

evaluating it

The only difference between require and load is how to find the file. The rest is the same in both.

We will develop the last evaluation part a little more. Loaded Ruby programs are basically evaluated at the top-level. It means the defined constants will be top-level constants and the defined methods will be function-style methods.

### mylib.rb
MY_OBJECT = Object.new
def my_p(obj)
  p obj
end

### first.rb
require 'mylib'
my_p MY_OBJECT   # we can use the constants and methods defined in another file

Only the local variable scope of the top-level changes when the file changes. In other words, local variables cannot be shared between different files. You can of course share them using for example Proc but this has nothing to do with the load mechanism.

Some people also misunderstand the loading mechanism. Whatever the class you are in when you call load, it does not change anything. Even if, like in the following example, you load a file in the module statement, it does not serve any purpose, as everything that is at the top-level of the loaded file is put at the Ruby top-level.


require 'mylib'     # whatever the place you require from, be it at the top-level
module SandBox
  require 'mylib'   # or in a module, the result is the same
end
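This can be confirmed with a temporary file. The sketch below uses load with a full path so that it does not depend on the load path; the file name, the constants RHG_DEMO_PATH and RHG_DEMO_CONSTANT, and the local variable rhg_demo_local are all made up for illustration.

```ruby
require 'tmpdir'
require 'fileutils'

dir = Dir.mktmpdir
RHG_DEMO_PATH = File.join(dir, "rhg_toplevel_demo.rb")
File.write(RHG_DEMO_PATH,
           "RHG_DEMO_CONSTANT = :from_loaded_file\nrhg_demo_local = 1\n")

module SandBox
  load RHG_DEMO_PATH   # loading inside a module statement
end

# The constant lands at the top-level, not inside SandBox ...
at_top_level  = Object.const_defined?(:RHG_DEMO_CONSTANT)
inside_module = SandBox.const_defined?(:RHG_DEMO_CONSTANT, false)
# ... and the local variable of the loaded file is not shared with this file.
local_shared  = binding.local_variable_defined?(:rhg_demo_local)

FileUtils.remove_entry(dir)
p at_top_level, inside_module, local_shared
```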

Highlights of this chapter

With the above knowledge in mind, we are going to read the code. But because this time the specification is defined in great detail, if we simply read it, it could end up as a mere enumeration of code. Therefore, in this chapter, we are going to narrow the target to the following 3 points:

loading serialisation

the repartition of the functions in the different source files

how extension libraries are loaded

Regarding the first point, you will understand it when you see it.

For the second point, the functions that appear in this chapter come from 4 different files: eval.c, ruby.c, file.c, and dln.c. Why is it this way? We’ll try to think about the realistic situation behind it.

The third point is just like its name says. We will see how the currently popular trend of execution time loading, more commonly referred to as plug-ins, works. This is the most interesting part of this chapter, so I’d like to use as many pages as possible to talk about it.


Searching the library

rb_f_require()

The body of require is rb_f_require. First, we will only look at the part concerning the file search. Having many different cases is bothersome so we will limit ourselves to the case when no file extension is given.

▼rb_f_require() (simplified version)

VALUE
rb_f_require(obj, fname)
    VALUE obj, fname;
{
    VALUE feature, tmp;
    char *ext, *ftptr; /* OK */
    int state;
    volatile int safe = ruby_safe_level;

    SafeStringValue(fname);
    ext = strrchr(RSTRING(fname)->ptr, '.');
    if (ext) {
        /* ...if the file extension has been given... */
    }
    tmp = fname;
    switch (rb_find_file_ext(&tmp, loadable_ext)) {
      case 0:
        break;

      case 1:
        feature = fname = tmp;
        goto load_rb;

      default:
        feature = tmp;
        fname = rb_find_file(tmp);
        goto load_dyna;
    }
    if (rb_feature_p(RSTRING(fname)->ptr, Qfalse))
        return Qfalse;
    rb_raise(rb_eLoadError, "No such file to load -- %s",
             RSTRING(fname)->ptr);

  load_dyna:
    /* ...load an extension library... */
    return Qtrue;

  load_rb:
    /* ...load a Ruby program... */
    return Qtrue;
}

static const char *const loadable_ext[] = {
    ".rb", DLEXT,    /* DLEXT=".so", ".dll", ".bundle"... */
#ifdef DLEXT2
    DLEXT2,          /* DLEXT2=".dll" on Cygwin, MinGW */
#endif
    0
};

(eval.c)

In this function the goto labels load_rb and load_dyna are actually like subroutines, and the two variables feature and fname are more or less their parameters. These variables have the following meaning.

variable   meaning                                         example
feature    the library file name that will be put in $"    uri.rb, nkf.so
fname      the full path to the library                    /usr/lib/ruby/1.7/uri.rb

The name feature can be found in the function rb_feature_p(). This function checks if a file has been locked (we will look at it just after).

The functions actually searching for the library are rb_find_file() and rb_find_file_ext(). rb_find_file() searches for a file in the load path $:. rb_find_file_ext() does the same, but the difference is that it takes as a second parameter a list of extensions (i.e. loadable_ext) and tries them in sequential order.
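To make the return values used in the switch statement of rb_f_require() concrete, here is a minimal Ruby sketch of such a search. It is not the real implementation: find_file_ext and LOADABLE_EXT are hypothetical names, ".so" merely stands in for the platform-dependent DLEXT, and the loop nesting (extensions outside, directories inside) is an assumption.

```ruby
require "tmpdir"

# Hypothetical stand-in for loadable_ext; the real DLEXT depends on the platform.
LOADABLE_EXT = [".rb", ".so"].freeze

# Returns [0, nil] when nothing is found, otherwise [n, feature] where n is
# the 1-based position of the matching extension (1 means ".rb").
def find_file_ext(name, load_path, exts = LOADABLE_EXT)
  exts.each_with_index do |ext, i|
    feature = name + ext
    load_path.each do |dir|
      return [i + 1, feature] if File.file?(File.join(dir, feature))
    end
  end
  [0, nil]
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, "uri.rb"), "")
  File.write(File.join(dir, "nkf.so"), "")
  p find_file_ext("uri", [dir])      # position 1: a Ruby program (load_rb)
  p find_file_ext("nkf", [dir])      # position 2: an extension library (load_dyna)
  p find_file_ext("missing", [dir])  # position 0: not found
end
```

With this shape, "case 1" corresponds to a Ruby program and any larger value to an extension library, which matches how the switch statement dispatches.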

Below we will first look through the file searching code in its entirety, then we will look at the code of the require lock in load_rb.

rb_find_file()

First the file search continues in rb_find_file(). This function searches for the file path in the global load path $: (rb_load_path). The string contamination check is tiresome so we’ll only look at the main part.

▼ rb_find_file() (simplified version)

VALUE
rb_find_file(path)
    VALUE path;
{
    VALUE tmp;
    char *f = RSTRING(path)->ptr;
    char *lpath;

    if (rb_load_path) {
        long i;

        Check_Type(rb_load_path, T_ARRAY);
        tmp = rb_ary_new();
        for (i=0; i<RARRAY(rb_load_path)->len; i++) {
            VALUE str = RARRAY(rb_load_path)->ptr[i];
            SafeStringValue(str);
            if (RSTRING(str)->len > 0) {
                rb_ary_push(tmp, str);
            }
        }
        tmp = rb_ary_join(tmp, rb_str_new2(PATH_SEP));
        if (RSTRING(tmp)->len == 0) {
            lpath = 0;
        }
        else {
            lpath = RSTRING(tmp)->ptr;
        }
    }

    f = dln_find_file(f, lpath);
    if (file_load_ok(f)) {
        return rb_str_new2(f);
    }
    return 0;
}

(file.c)

If we write what happens in Ruby we get the following:

tmp = []                     # make an array
$:.each do |path|            # repeat on each element of the load path
  tmp.push path if path.length > 0 # check the path and push it
end
lpath = tmp.join(PATH_SEP)   # concatenate all elements in one string separated by PATH_SEP

dln_find_file(f, lpath)      # main processing

PATH_SEP is the path separator: ':' under UNIX, ';' under Windows. rb_ary_join() creates a string by putting it between the different elements. In other words, the load path that had become an array is back to a string with a separator.

Why? It’s only because dln_find_file() takes the paths as a string with PATH_SEP as a separator. But why is dln_find_file() implemented like that? It’s just because dln.c is not a library for ruby. Even though it was written by the same author, it’s a general purpose library. That’s precisely why, when I sorted the files by category in the Introduction, I put this file in the Utility category. General purpose libraries cannot receive Ruby objects as parameters or read ruby global variables.

dln_find_file() also expands, for example, ~ to the home directory, but in fact this is already done in the omitted part of rb_find_file(). So in ruby’s case it’s not necessary.

Loading wait

Here, the file search is quickly finished. Next comes the loading code. Or more accurately, it is the code “up to just before the load”. The code of rb_f_require()’s load_rb is shown below.

▼ rb_f_require(): load_rb

  load_rb:
    if (rb_feature_p(RSTRING(feature)->ptr, Qtrue))
        return Qfalse;
    ruby_safe_level = 0;
    rb_provide_feature(feature);
    /* the loading of Ruby programs is serialised */
    if (!loading_tbl) {
        loading_tbl = st_init_strtable();
    }
    /* partial state */
    ftptr = ruby_strdup(RSTRING(feature)->ptr);
    st_insert(loading_tbl, ftptr, curr_thread);
    /* ...load the Ruby program and evaluate it... */
    st_delete(loading_tbl, &ftptr, 0); /* loading done */
    free(ftptr);
    ruby_safe_level = safe;

(eval.c)

As mentioned above, rb_feature_p() checks if a lock has been put in $". And rb_provide_feature() pushes a string into $", in other words it locks the file.

The problem comes after. As the comment says, “the loading of Ruby programs is serialised”. In other words, a file can only be loaded from one thread, and if during the loading another thread tries to load the same file, that thread will wait for the first loading to be finished. If that were not the case:

Thread.fork {
    require 'foo'   # At the beginning of require, foo.rb is added to $"
}                   # However the thread changes during the evaluation of foo.rb
require 'foo'   # foo.rb is already in $" so the function returns immediately
# (A) the classes of foo are used...

With something like this, even though the foo library is not really loaded yet, the code at (A) would end up being executed.

The mechanism to enter the waiting state is very simple. A st_table is created in the loading_tbl global variable, and a “feature => loading thread” association is recorded in it. curr_thread is a variable from eval.c, and its value is the currently running thread. That makes an exclusive lock. And in rb_feature_p(), we wait for the loading thread to end as follows.

▼ rb_feature_p() (second half)

rb_thread_t th;

while (st_lookup(loading_tbl, f, &th)) {
    if (th == curr_thread) {
        return Qtrue;
    }
    CHECK_INTS;
    rb_thread_schedule();
}

(eval.c)

When rb_thread_schedule() is called, control is transferred to another thread, and this function only returns after control has come back to the thread where it was called. When the file name disappears from loading_tbl, the loading is finished so the function can end. The curr_thread check is there so that a thread does not lock itself (figure 1).
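The whole lock can be imitated at the Ruby level. The following is only a sketch of the idea, not ruby’s code: serialized_require, FEATURES, LOADING and GUARD are invented names, the block stands for “load and evaluate the file”, and the Mutex is an assumption needed because Ruby-level code, unlike the C code above, can be switched out at any point.

```ruby
FEATURES = []         # plays the role of $"
LOADING  = {}         # plays the role of loading_tbl: feature => loading thread
GUARD    = Mutex.new  # assumption: protects the two tables in Ruby-level code

# Returns true if this call performed the load, false if the feature was
# already provided (after waiting for a load in progress, if any).
def serialized_require(feature)
  first = GUARD.synchronize do
    next false if FEATURES.include?(feature)
    FEATURES << feature                 # like rb_provide_feature()
    LOADING[feature] = Thread.current   # like st_insert(loading_tbl, ...)
    true
  end

  unless first
    # like the loop in rb_feature_p(): give way until the loader has finished
    busy = lambda do
      GUARD.synchronize { LOADING.key?(feature) && LOADING[feature] != Thread.current }
    end
    Thread.pass while busy.call
    return false
  end

  begin
    yield                               # stands for "load and evaluate the file"
  ensure
    GUARD.synchronize { LOADING.delete(feature) }   # like st_delete()
  end
  true
end

log     = []
started = Queue.new
loader  = Thread.new do
  serialized_require("foo") { started << true; sleep 0.05; log << :foo_loaded }
end
started.pop                                   # the other thread is now mid-load
serialized_require("foo") { log << :loaded_again }
log << :foo_used                              # (A) runs only after the load is done
loader.join
p log
```

The second call neither loads the file again nor returns early: it waits, so :foo_used is always logged after :foo_loaded.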

Figure 1: Serialisation of loads

Loading of Ruby programs

rb_load()

We will now look at the loading process itself. Let’s start with the part inside rb_f_require()’s load_rb that loads Ruby programs.

▼ rb_f_require() - load_rb - loading

PUSH_TAG(PROT_NONE);
if ((state = EXEC_TAG()) == 0) {
    rb_load(fname, 0);
}
POP_TAG();

(eval.c)

The rb_load() which is called here is actually the “meat” of the Ruby-level load. This means it needs to search once again, but looking at the same procedure once again is too much trouble. Therefore, that part is omitted in the code below.

And the second argument wrap is folded to 0, because it is 0 in the calling code above.

▼ rb_load() (simplified edition)

void
rb_load(fname, /* wrap=0 */)
    VALUE fname;
{
    int state;
    volatile ID last_func;
    volatile VALUE wrapper = 0;
    volatile VALUE self = ruby_top_self;
    NODE *saved_cref = ruby_cref;

    PUSH_VARS();
    PUSH_CLASS();
    ruby_class = rb_cObject;
    ruby_cref = top_cref;           /* (A-1) change CREF */
    wrapper = ruby_wrapper;
    ruby_wrapper = 0;
    PUSH_FRAME();
    ruby_frame->last_func = 0;
    ruby_frame->last_class = 0;
    ruby_frame->self = self;
    /* (A-2) change ruby_frame->cbase */
    ruby_frame->cbase = (VALUE)rb_node_newnode(NODE_CREF,ruby_class,0,0);
    PUSH_SCOPE();
    /* at the top-level the visibility is private by default */
    SCOPE_SET(SCOPE_PRIVATE);
    PUSH_TAG(PROT_NONE);
    ruby_errinfo = Qnil;            /* make sure it's nil */
    state = EXEC_TAG();
    last_func = ruby_frame->last_func;
    if (state == 0) {
        NODE *node;

        /* (B) this is dealt with as eval for some reasons */
        ruby_in_eval++;
        rb_load_file(RSTRING(fname)->ptr);
        ruby_in_eval--;
        node = ruby_eval_tree;
        if (ruby_nerrs == 0) {   /* no parse error occurred */
            eval_node(self, node);
        }
    }
    ruby_frame->last_func = last_func;
    POP_TAG();
    ruby_cref = saved_cref;
    POP_SCOPE();
    POP_FRAME();
    POP_CLASS();
    POP_VARS();
    ruby_wrapper = wrapper;
    if (ruby_nerrs > 0) {   /* a parse error occurred */
        ruby_nerrs = 0;
        rb_exc_raise(ruby_errinfo);
    }
    if (state) jump_tag_but_local_jump(state);
    if (!NIL_P(ruby_errinfo))   /* an exception was raised during the loading */
        rb_exc_raise(ruby_errinfo);
}

Just when we thought we had made it through the storm of stack manipulations, we have entered it again. Although this is tough, let’s cheer up and read it.

As is usual with long functions, almost all of the code is occupied by idioms: PUSH/POP, tag protection and re-jumping. Among them, what we want to focus on is the things at (A), which relate to CREF. Since a loaded program is always executed at the top level, it sets ruby_cref aside (without pushing) and brings back top_cref. ruby_frame->cbase also becomes a new one.

And one more place: at (B), ruby_in_eval is somehow turned on. What part is influenced by this variable? I investigated it and it turned out that it seems to be only rb_compile_error(). When ruby_in_eval is true, the message is stored in the exception object, but when it is not true, the message is printed to stderr. In other words, when it is a parse error of the main program of the command, it wants to print directly to stderr, but inside the evaluator that is not appropriate, so it stops doing so. It seems the “eval” of ruby_in_eval means neither the eval method nor the eval() function but “evaluate” as a general noun. Or, it’s possible that it indicates eval.c.


rb_load_file()

Then, all of a sudden, the source file here is ruby.c. Or to put it more accurately, essentially it would be preferable if the entire loading code were put in ruby.c, but rb_load() has no choice but to use PUSH_TAG and such. Therefore, putting it in eval.c is inevitable. If that were not the case, all of it would be put in eval.c in the first place.

Now, on to rb_load_file().

▼ rb_load_file()

void
rb_load_file(fname)
    char *fname;
{
    load_file(fname, 0);
}

(ruby.c)

Delegated entirely. The second argument script of load_file() is a boolean value that indicates whether it is loading the file given as the argument of the ruby command. Now, because we’d like to assume we are loading a library, let’s fold it by replacing it with script=0. Furthermore, in the code below, nonessential things have already been removed, with the meaning taken into account.

▼ load_file() (simplified edition)

static void
load_file(fname, /* script=0 */)
    char *fname;
{
    VALUE f;
    {
        FILE *fp = fopen(fname, "r");   (A)
        if (fp == NULL) {
            rb_load_fail(fname);
        }
        fclose(fp);
    }
    f = rb_file_open(fname, "r");       (B)
    rb_compile_file(fname, f, 1);       (C)
    rb_io_close(f);
}

(A) The call to fopen() is to check if the file can be opened. If there is no problem, it’s immediately closed. It may seem a little useless but it’s an extremely simple and yet highly portable and reliable way to do it.

(B) The file is opened once again, this time using the Ruby level library File.open. The file was not opened with File.open from the beginning so as not to raise any Ruby exception. Here if any exception occurred we would like to have a loading error, but getting the errors related to open, for example Errno::ENOENT, Errno::EACCES…, would be problematic. We are in ruby.c so we cannot stop a tag jump.

(C) Using the parser interface rb_compile_file(), the program is read from an IO object, and compiled into a syntax tree. The syntax tree is added to ruby_eval_tree so there is no need to get the result.

That’s all for the loading code. Finally, the calls were quite deep so the call graph of rb_f_require() is shown below.

rb_f_require           ....eval.c
    rb_find_file            ....file.c
        dln_find_file           ....dln.c
            dln_find_file_1
    rb_load
        rb_load_file            ....ruby.c
            load_file
                rb_compile_file     ....parse.y
        eval_node

You must bring call graphs on a long trip. It’s common knowledge.

The number of open required for loading

Previously, there was an open used just to check if a file can be opened, but in fact, during the loading process of ruby, other functions such as rb_find_file_ext() also internally do checks by using open. How many times is open() called in the whole process?

If you’re wondering that, actually counting it is the right attitude for a programmer. We can easily count it by using a system call tracer. The tool to use would be strace on Linux, truss on Solaris, ktrace or truss on BSD. As you can see, the name differs for each OS and there’s no consistency, but you can find them by googling.

If you’re using Windows, your IDE will probably have a tracer built in. Well, as my main environment is Linux, I looked using strace.

The output is done on stderr so it was redirected using 2>&1.

% strace ruby -e 'require "rational"' 2>&1 | grep '^open'
open("/etc/ld.so.preload", O_RDONLY)    = -1 ENOENT
open("/etc/ld.so.cache", O_RDONLY)      = 3
open("/usr/lib/libruby-1.7.so.1.7", O_RDONLY) = 3
open("/lib/libdl.so.2", O_RDONLY)       = 3
open("/lib/libcrypt.so.1", O_RDONLY)    = 3
open("/lib/libc.so.6", O_RDONLY)        = 3
open("/usr/lib/ruby/1.7/rational.rb", O_RDONLY|O_LARGEFILE) = 3
open("/usr/lib/ruby/1.7/rational.rb", O_RDONLY|O_LARGEFILE) = 3
open("/usr/lib/ruby/1.7/rational.rb", O_RDONLY|O_LARGEFILE) = 3
open("/usr/lib/ruby/1.7/rational.rb", O_RDONLY|O_LARGEFILE) = 3

The opens up to libc.so.6 belong to the implementation of dynamic linking, which leaves the four other opens. So it seems that three of them are useless.

Loading of extension libraries

rb_f_require() - load_dyna

This time we will see the loading of extension libraries. We will start with rb_f_require()’s load_dyna. However, we do not need the part about locking any more, so it was removed.

▼ rb_f_require() - load_dyna

{
    int volatile old_vmode = scope_vmode;

    PUSH_TAG(PROT_NONE);
    if ((state = EXEC_TAG()) == 0) {
        void *handle;

        SCOPE_SET(SCOPE_PUBLIC);
        handle = dln_load(RSTRING(fname)->ptr);
        rb_ary_push(ruby_dln_librefs, LONG2NUM((long)handle));
    }
    POP_TAG();
    SCOPE_SET(old_vmode);
}
if (state) JUMP_TAG(state);

(eval.c)

By now, there is very little here which is novel. The tags are used only in the idiomatic way, and saving/restoring the visibility scope is done in the way we are used to seeing. All that remains is dln_load(). What on earth is that for? For the answer, continue to the next section.

Brush up about links

dln_load() is loading an extension library, but what does loading an extension library mean? To talk about it, we need to dramatically roll the discussion back to the physical world, and start with links.

I think compiling C programs is, of course, not a new thing for you. Since I’m using gcc on Linux, I can create a runnable program in the following manner.

% gcc hello.c

According to the file name, this is probably a “Hello, World!” program. On UNIX, gcc outputs a program into a file named a.out by default, so you can subsequently execute it in the following way:

% ./a.out
Hello, World!

It is created properly.

By the way, what is gcc actually doing here? Usually we just say “compile”, but actually there are these four steps:

1. preprocess (cpp)
2. compile C into assembly (cc)
3. assemble the assembly language into machine code (as)
4. link (ld)

Among them, preprocessing, compiling and assembling are described in a lot of places, but the description often ends without clearly describing the linking phase. It is like a history class in school which never reaches the “modern age”. Therefore, in this book, to fill in the missing part, I’ll briefly summarize what linking is.

A program that has finished the assembling phase becomes an “object file” in some format. The following are some of the major formats.

ELF, Executable and Linking Format (recent UNIX)
a.out, assembler output (relatively old UNIX)
COFF, Common Object File Format (Win32)

It might go without saying that a.out as an object file format and a.out as the default output file name of cc are totally different things. For example, on modern Linux, when we create it ordinarily, an a.out file in ELF format is created.

And how these object file formats differ from each other is not important now. What we have to recognize now is that all of these object files can be considered as “a set of names”. For example, the function names and the variable names which exist in the file.

And the sets of names contained in an object file are of two types.

the set of necessary names (for instance, the external functions called internally, e.g. printf)

the set of providing names (for instance, the functions defined internally, e.g. hello)

And linking is, when gathering multiple object files, checking that “the set of providing names” entirely contains “the set of necessary names”, and connecting them to each other. In other words, drawing a line from every one of “the necessary names”, each line must be connected to one of “the providing names” of a particular object file (Figure 2). To put this in technical terms, it is resolving undefined symbols.
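This “set of names” model is simple enough to write down directly. The sketch below is only an illustration of the logical check, with invented names (ObjectFile, undefined_symbols) and made-up symbol lists; a real linker of course does far more.

```ruby
require "set"

# Each object file is modelled as two sets of names (all names are illustrative).
ObjectFile = Struct.new(:name, :needs, :provides)

# Returns the needed names that no object file provides (the undefined symbols).
def undefined_symbols(objects)
  provided = objects.map(&:provides).reduce(Set.new, :|)
  needed   = objects.map(&:needs).reduce(Set.new, :|)
  needed - provided
end

hello = ObjectFile.new("hello.o", Set["printf"], Set["main", "hello"])
libc  = ObjectFile.new("libc",    Set[],         Set["printf", "puts"])

p undefined_symbols([hello, libc]).to_a   # every needed name is provided
p undefined_symbols([hello]).to_a         # printf stays unresolved: a link error
```

Linking succeeds, in this model, exactly when the returned set is empty.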

Figure 2: object files and linking

Logically this is how it is, but in reality a program can’t run with this alone. At least, C programs cannot run without converting the names to addresses (numbers).

So, after the logical conjunctions, the physical conjunctions become necessary. We have to map object files into the real memory space and substitute all the names with numbers. Concretely speaking, for instance, the addresses to jump to on function calls are adjusted here.

And, based on when these two conjunctions are done, linking is divided into two types: static linking and dynamic linking. Static linking finishes all the phases at compile time. On the other hand, dynamic linking defers some of the conjunctions to execution time, and linking is finally completed when executing.

However, what is explained here is a very simple idealistic model, and in some respects it distorts reality a lot. Logical conjunctions and physical conjunctions are not so completely separated, and “an object file is a set of names” is too naive. But since the behavior around this differs considerably from platform to platform, describing it seriously would end up as one more book. To obtain a realistic level of knowledge, I additionally recommend reading “Expert C Programming: Deep C Secrets” by Peter van der Linden and “Linkers and Loaders” by John R. Levine.

Linking that is truly dynamic

And finally we get into our main topic. The “dynamic” in “dynamic linking” naturally means it “occurs at execution time”, but what people usually refer to as “dynamic linking” is pretty much decided already at compile time. For example, the names of the needed functions, and which library they can be found in, are already known. For instance, if you need cos(), you know it’s in libm, so you use gcc -lm. If you didn’t specify the correct library at compile time, you’d get a link error.

But extension libraries are different. Neither the names of the needed functions nor the name of the library which defines them are known at compile time. We need to construct a string at execution time and load and link. It means that even “the logical conjunctions” in the sense described above must be done entirely at execution time. In order to do that, another mechanism that is a little different from ordinary dynamic linking is required.

This manipulation, linking that is entirely decided at runtime, is usually called “dynamic load”.

Dynamic load API

I’ve finished explaining the concept. The rest is how to do that dynamic loading. This is not a difficult thing. Usually there’s a specific API prepared in the system, and we can accomplish it by merely calling it.

For example, what is relatively widespread on UNIX is the API named dlopen. However, I can’t say “it is always available on UNIX”. For example, a slightly older HP-UX has a totally different interface, and a NeXT-flavor API is used on Mac OS X. And even for the same dlopen, it is included in libc on BSD-derived OSes, whereas it is attached from outside as libdl on Linux. Therefore, it is desperately not portable. Since it differs even among UNIX-based platforms, it is obviously completely different on other operating systems. It is unlikely that the same API is used.

So what ruby does, in order to absorb these totally different interfaces, is to prepare the file named dln.c. dln is probably an abbreviation of “dynamic link”. dln_load() is one of the functions of dln.c.

Where dynamic loading APIs are totally different from each other, the only saving grace is that the usage pattern of the API is completely the same. Whichever platform you are on, it consists of these three steps:

1. map the library to the address space of the process
2. take the pointers to the functions contained in the library
3. unmap the library

For example, if it is the dlopen-based API, the correspondences are:

1. dlopen
2. dlsym
3. dlclose

If it is the Win32 API, the correspondences are:

1. LoadLibrary (or LoadLibraryEx)
2. GetProcAddress
3. FreeLibrary

At last, I’ll talk about what dln_load() does by using these APIs. It is, in fact, calling Init_xxxx(). Having reached this point, we are finally able to illustrate the entire process of ruby from invocation to completion without any gaps. In other words, when ruby is invoked, it initializes the evaluator and starts evaluating a program passed in some way. If require or load occurs during the process, it loads the library and transfers control. Transferring control means parsing and evaluating if it is a Ruby library, and it means loading, linking and finally calling Init_xxxx() if it is an extension library.

dln_load()

Finally, we’ve reached the content of dln_load(). dln_load() is also a long function, but its structure is simple for a certain reason. Take a look at the outline first.

▼ dln_load() (outline)

void*
dln_load(file)
    const char *file;
{
#if defined _WIN32 && !defined __CYGWIN__
    load with Win32 API
#else
    initialization depending on each platform
#ifdef each platform
    ……routines for each platform……
#endif
#endif
#if !defined(_AIX) && !defined(NeXT)
  failed:
    rb_loaderror("%s - %s", error, file);
#endif
    return 0;                   /* dummy return */
}

This way, the part connecting to the main body is completely separated for each platform. When thinking, we only have to think about one platform at a time. The supported APIs are as follows:

dlopen (most UNIX)
LoadLibrary (Win32)
shl_load (a bit old HP-UX)
a.out (very old UNIX)
rld_load (before NeXT4)
dyld (NeXT or Mac OS X)
get_image_symbol (BeOS)
GetDiskFragment (Mac OS 9 and before)
load (a bit old AIX)

dln_load() - dlopen()

First, let’s start with the API code for the dlopen series.

▼ dln_load() - dlopen()

void*
dln_load(file)
    const char *file;
{
    const char *error = 0;
#define DLN_ERROR() (error = dln_strerror(),\
                     strcpy(ALLOCA_N(char, strlen(error) + 1), error))
    char *buf;
    /* write a string "Init_xxxx" to buf (the space is allocated with alloca) */
    init_funcname(&buf, file);

    {
        void *handle;
        void (*init_fct)();

#ifndef RTLD_LAZY
# define RTLD_LAZY 1
#endif
#ifndef RTLD_GLOBAL
# define RTLD_GLOBAL 0
#endif

        /* (A) load the library */
        if ((handle = (void*)dlopen(file, RTLD_LAZY | RTLD_GLOBAL))
                                                                 == NULL) {
            error = dln_strerror();
            goto failed;
        }
        /* (B) get the pointer to Init_xxxx() */
        init_fct = (void(*)())dlsym(handle, buf);
        if (init_fct == NULL) {
            error = DLN_ERROR();
            dlclose(handle);
            goto failed;
        }
        /* (C) call Init_xxxx() */
        (*init_fct)();

        return handle;
    }

  failed:
    rb_loaderror("%s - %s", error, file);
}

(dln.c)

(A) RTLD_LAZY as the argument of dlopen() indicates “resolve the undefined symbols when the functions are actually demanded”. The return value is the mark (handle) that identifies the library, and we always need to pass it when using dl*().

(B) dlsym() gets a function pointer from the library specified by the handle. If the return value is NULL, it means failure. Here, the pointer to Init_xxxx() is obtained and called.

dlclose() is not called here. Since pointers to the functions of the loaded library may be returned inside Init_xxxx(), it would be troublesome if dlclose() were called, because the entire library would become unusable. Thus, we can’t call dlclose() until the process finishes.
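The one step not shown in the code is how the string "Init_xxxx" is derived from the file name. The real work is done in C by init_funcname() in dln.c; the following Ruby rendering is only a guess at its behaviour, assuming that the name is the file’s base name cut at the first '.', which is consistent with an extension library nkf.so having to define Init_nkf(). The path used is hypothetical.

```ruby
# Hypothetical Ruby rendering; the real init_funcname() is a C function in dln.c.
def init_funcname(file)
  base = File.basename(file)   # drop the directory part
  "Init_" + base[/\A[^.]*/]    # keep only the text before the first '.'
end

puts init_funcname("/usr/lib/ruby/1.7/i686-linux/nkf.so")
```

This is why the file name of an extension library and the name of its initialization function must agree: the only thing the loader knows at execution time is the path.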

dln_load() - Win32

As for Win32, LoadLibrary() and GetProcAddress() are used. It is a very general Win32 API which also appears on MSDN.

▼ dln_load() - Win32

void*
dln_load(file)
    const char *file;
{
    HINSTANCE handle;
    char winfile[MAXPATHLEN];
    void (*init_fct)();
    char *buf;

    if (strlen(file) >= MAXPATHLEN) rb_loaderror("filename too long");

    /* write the "Init_xxxx" string to buf (the space is allocated with alloca) */
    init_funcname(&buf, file);

    strcpy(winfile, file);

    /* load the library */
    if ((handle = LoadLibrary(winfile)) == NULL) {
        error = dln_strerror();
        goto failed;
    }

    if ((init_fct = (void(*)())GetProcAddress(handle, buf)) == NULL) {
        rb_loaderror("%s - %s\n%s", dln_strerror(), buf, file);
    }

    /* call Init_xxxx() */
    (*init_fct)();
    return handle;

  failed:
    rb_loaderror("%s - %s", error, file);
}

(dln.c)

Doing LoadLibrary() then GetProcAddress(). The pattern is so equivalent that nothing is left to say, so I decided to end this chapter.

The original work is Copyright © 2002 - 2004 Minero AOKI.
Translated by Vincent ISAMBART and Clifford Escobar CAOILE
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License

Chapter 19: Threads

Outline

Ruby Interface

Come to think of it, I feel I have not yet introduced any actual code that uses Ruby threads. It is nothing special, but I’ll introduce it here just in case.

Thread.fork {
    while true
      puts 'forked thread'
    end
}
while true
  puts 'main thread'
end

When executing this program, a lot of "forked thread" and "main thread" lines are printed, properly mixed together.

Of course, other than just creating multiple threads, there are also various ways to control them. There is no synchronize reserved word as in Java, but common primitives such as Mutex, Queue or Monitor are of course available, and the APIs below can be used to control a thread itself.

▼ Thread API

Thread.pass       transfers the execution to any other thread
Thread.kill(th)   terminates the th thread
Thread.exit       terminates the thread itself
Thread.stop       temporarily stops the thread itself
Thread#join       waits for the thread to finish
Thread#wakeup     wakes up the temporarily stopped thread
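As a quick illustration of some of these methods working together, here is a small sketch. It is written against a current Ruby, so it uses Thread.new; the variable names are arbitrary and nothing here is specific to the implementation examined in this chapter.

```ruby
log = []

# Thread#join: wait for a thread to finish. Thread.pass only hints at a switch.
worker = Thread.new do
  3.times { |i| log << i; Thread.pass }
  :done
end
worker.join

# Thread.stop / Thread#wakeup: a thread parks itself until someone wakes it.
sleeper = Thread.new { Thread.stop; log << :woken }
Thread.pass until sleeper.stop?   # wait until it is really stopped
sleeper.wakeup
sleeper.join

p log
```

Because each thread is joined before the next step, the log comes out in a fixed order even though two extra threads were involved.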

ruby Thread

Threads are supposed to “run all together”, but actually they each run for a little time in turns. To be precise, with some effort on a multi-CPU machine, it’s possible that, for instance, two of them are running at the same time. But still, if there are more threads than CPUs, they have to run in turns.

In other words, in order to create threads, someone has to switch the threads somewhere. There are roughly two ways to do it: kernel-level threads and user-level threads. As the names suggest, they respectively create a thread in the kernel or at user level. If it is kernel-level, by making use of multiple CPUs, multiple threads can run at the same time.

Then, how about the thread of ruby? It is a user-level thread. And (therefore), the number of threads that are runnable at the same time is limited to one.

Is it preemptive?

I’ll describe the traits of ruby threads in more detail. As an alternative point of view on threads, there’s the question “is it preemptive?”.

When we say “a thread (system) is preemptive”, the threads are switched automatically without being explicitly switched by the user. Looking at this from the opposite direction, the user can’t control the timing of switching threads.

On the other hand, in a non-preemptive thread system, until the user explicitly says “I can pass the control right to the next thread”, threads will never be switched. Looking at this from the opposite direction, when and where there’s the possibility of switching threads is obvious.

This distinction also applies to processes, and in that case preemptive is considered “superior”. For example, if a program had a bug and entered an infinite loop, the processes would never be able to switch. This means a user program can halt the whole system, which is not good. Process switching was non-preemptive on Windows 3.1 because its base was MS-DOS, but Windows 95 is preemptive. Thus, the system is more robust. Hence, it is said that Windows 95 is “superior” to 3.1.

Then, how about the ruby thread? It is preemptive at Ruby level, and non-preemptive at C level. In other words, when you are writing C code, you can determine almost certainly the timings of switching threads.

Why is it designed this way? Threads are indeed convenient, but their user also needs to be prepared. It means that the code must be compatible with threads (it must be multi-thread safe). In other words, in order to make it preemptive also at C level, all the C libraries would have to be thread safe.

But in reality, there are also a lot of C libraries that are still not thread safe. A lot of effort was made to make extension libraries easy to write, so it would defeat the purpose if the number of usable libraries were decreased by requiring thread safety. Therefore, being non-preemptive at C level is a reasonable choice for ruby.

Management System

We’ve understood that the ruby thread is non-preemptive at C level. It means that after it runs for a while, it voluntarily lets go of the control right. Then, I’d like you to suppose that a currently executing thread is about to quit the execution. Who will receive the control right next? But before that, it’s impossible to guess without knowing how threads are expressed inside ruby in the first place. Let’s look at the variables and the data types used to manage threads.

▼ the structure to manage threads

typedef struct thread * rb_thread_t;
static rb_thread_t curr_thread = 0;
static rb_thread_t main_thread;

struct thread {
    struct thread *next, *prev;

(eval.c)

Since struct thread is very huge for some reason, this time I narrowed it down to only the important part. That is why there are only two members. These next and prev are member names, and their types are rb_thread_t, thus we can expect that rb_thread_t is connected by a doubly linked list. And actually it is not an ordinary doubly linked list: both ends are connected. It means it is circular. This is a big point. Adding the static main_thread and curr_thread variables to it, the whole data structure would look like Figure 1.

Figure 1: the data structures to manage threads

main_thread (main thread) means the thread that existed at the time when the program started, meaning the “first” thread. curr_thread is obviously the current thread, meaning the thread currently running. The value of main_thread will never change while the process is running, but the value of curr_thread will change frequently.

In this way, because the list is a circle, the procedure to choose “the next thread” becomes easy. It can be done by merely following the next link. By this alone, we can run all threads equally to some extent.
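The convenience of the circular shape is easy to see in a toy model. In the sketch below only the next and prev links come from struct thread; the class ThreadNode, its add method and the choice to insert new nodes just before the main thread are assumptions made for the illustration.

```ruby
# A node of the circular doubly linked list; only next/prev come from struct thread.
class ThreadNode
  attr_accessor :name, :next, :prev

  def initialize(name)
    @name = name
    @next = @prev = self   # a list of one thread is already a circle
  end

  # Link a new node right before this one, i.e. at the "end" of the circle.
  def add(node)
    node.prev = @prev
    node.next = self
    @prev.next = node
    @prev = node
    node
  end
end

main_thread = ThreadNode.new("main")
main_thread.add(ThreadNode.new("th1"))
main_thread.add(ThreadNode.new("th2"))

curr_thread = main_thread
5.times do
  curr_thread = curr_thread.next   # choosing "the next thread"
  print curr_thread.name, " "
end
puts
```

There is no end-of-list case to handle: following next from the last node simply arrives back at the first one, and a lone thread is its own next, which is exactly what a check like curr_thread == curr_thread->next tests for.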

What does switching threads mean?

By the way, what is a thread in the first place? Or, what makes us say threads are switched?

These are very difficult questions. Similar to what a program is or what an object is, when asked about things that are usually understood intuitively, it’s hard to answer clearly. In particular, “what is the difference between threads and processes?” is a good question.

Still, within a realistic range, we can describe it to some extent. What is necessary for threads is the context of execution. As for the context of ruby, as we’ve seen so far, it consists of ruby_frame, ruby_scope, ruby_class and so on. And ruby allocates the substance of ruby_frame on the machine stack, and there is also the stack space used by extension libraries, therefore the machine stack is also necessary as part of the context of a Ruby program. And finally, the CPU registers are indispensable. These various contexts are the elements that enable threads, and switching them means switching threads. This is also called a “context switch”.

The way of context-switching

The rest of the discussion is how to switch contexts. ruby_scope and ruby_class are easy to replace: allocate space for them somewhere such as the heap and set them aside one by one. For the CPU registers, we can manage because we can save and write them back by using setjmp(). The spaces for both purposes are respectively prepared in rb_thread_t.

▼ struct thread (partial)

struct thread {
    struct thread *next, *prev;
    jmp_buf context;

    struct FRAME *frame;        /* ruby_frame */
    struct SCOPE *scope;        /* ruby_scope */
    struct RVarmap *dyna_vars;  /* ruby_dyna_vars */
    struct BLOCK *block;        /* ruby_block */
    struct iter *iter;          /* ruby_iter */
    struct tag *tag;            /* prot_tag */
    VALUE klass;                /* ruby_class */
    VALUE wrapper;              /* ruby_wrapper */
    NODE *cref;                 /* ruby_cref */

    int flags;  /* scope_vmode / rb_trap_immediate / raised */

    NODE *node;                 /* rb_current_node */

    int tracing;                /* tracing */
    VALUE errinfo;              /* $! */
    VALUE last_status;          /* $? */
    VALUE last_line;            /* $_ */
    VALUE last_match;           /* $~ */

    int safe;                   /* ruby_safe_level */

(eval.c)

As shown above, there are members that seem to correspond to ruby_frame and ruby_scope. There’s also a jmp_buf to save the registers.
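“Setting aside” this part of the context amounts to copying global variables into the thread structure and back. The following toy model shows the idea with three stand-in globals; ThreadCtx, save_context and restore_context are names invented for the sketch, and the real code naturally handles many more members, the registers and the machine stack.

```ruby
# Stand-ins for a few of the evaluator's global variables.
$ruby_frame, $ruby_scope, $ruby_class = :frame_a, :scope_a, :Object

# Only three of the many members of struct thread are modelled here.
ThreadCtx = Struct.new(:frame, :scope, :klass)

# Copy the globals into the thread structure (set the context aside).
def save_context(th)
  th.frame, th.scope, th.klass = $ruby_frame, $ruby_scope, $ruby_class
end

# Copy the saved values back into the globals (bring the context in).
def restore_context(th)
  $ruby_frame, $ruby_scope, $ruby_class = th.frame, th.scope, th.klass
end

a = ThreadCtx.new
b = ThreadCtx.new(:frame_b, :scope_b, :Foo)

save_context(a)      # set the running thread's context aside
restore_context(b)   # ...and bring the other thread's context in
p [$ruby_frame, $ruby_scope, $ruby_class]
restore_context(a)   # switching back restores the original values
p [$ruby_frame, $ruby_scope, $ruby_class]
```

From the evaluator’s point of view nothing else changes: it keeps reading the same global variables, which now simply hold another thread’s values.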

Then, the problem is the machine stack. How can we substitute it?

The most straightforward way for the mechanism is directly overwriting the pointer to the position (end) of the stack. Usually, it is in the CPU registers. Sometimes it is a specific register, and it is also possible that a general-purpose register is allocated for it. Anyway, it is somewhere. For convenience, we’ll call it the stack pointer from now on. It is obvious that a different space can be used as the stack by modifying it. But it is also obvious that in this way we have to deal with it for each CPU and for each OS, thus it is really hard to preserve portability.

Therefore, ruby uses a very violent way to implement the substitution of the machine stack. That is, if we can’t modify the stack pointer, let’s modify the place the stack pointer points to. We know the stack can be directly modified, as we’ve seen in the description of the garbage collection; the rest is slightly changing what to do. The place to store the stack properly exists in struct thread.

▼ struct thread (partial)

int   stk_len;      /* the stack length */
int   stk_max;      /* the size of memory allocated for stk_ptr */
VALUE*stk_ptr;      /* the copy of the stack */
VALUE*stk_pos;      /* the position of the stack */

(eval.c)

How the explanation goes

So far, I’ve talked about various things, but the important points can be summarized into three:

When
To which thread
How

to switch context. These are also the points of this chapter. Below, I’ll describe them using a section for each of the three points respectively.

Trigger

To begin with, it’s the first point, when to switch threads. In other words, what is the cause of switching threads.

Waiting I/O

For example, when trying to read in something by calling IO#gets or IO#read, since we can expect it will take a lot of time to read, it’s better to run the other threads in the meantime. In other words, a forcible switch becomes necessary here. Below is the interface of getc.

▼ rb_getc()

int
rb_getc(f)
    FILE *f;
{
    int c;

    if (!READ_DATA_PENDING(f)) {
        rb_thread_wait_fd(fileno(f));
    }
    TRAP_BEG;
    c = getc(f);
    TRAP_END;

    return c;
}

(io.c)

READ_DATA_PENDING(f) is a macro that checks whether the file’s buffer still has content. If the buffer has content, the read can proceed without any waiting, so it reads immediately. If the buffer is empty, the read would take some time, so it calls rb_thread_wait_fd(). This is an indirect cause of switching threads.


If rb_thread_wait_fd() is “indirect”, there should also be a “direct” cause. What is it? Let’s look inside rb_thread_wait_fd().

▼ rb_thread_wait_fd()

void
rb_thread_wait_fd(fd)
    int fd;
{
    if (rb_thread_critical) return;
    if (curr_thread == curr_thread->next) return;
    if (curr_thread->status == THREAD_TO_KILL) return;

    curr_thread->status = THREAD_STOPPED;
    curr_thread->fd = fd;
    curr_thread->wait_for = WAIT_FD;
    rb_thread_schedule();
}

(eval.c)

There’s rb_thread_schedule() at the last line. This function is the “direct cause”. It is the heart of the implementation of ruby’s threads: it selects the next thread and switches to it.

What let me see that this function has such a role is that I already knew the word “scheduling” in connection with threads. Even if you didn’t know it, now that you remember it, you’ll be able to notice it next time.

And, in this case, it does not merely pass control to another thread; it also stops itself. Moreover, it has an explicit deadline, namely “until the descriptor becomes readable”. Therefore, this request has to be conveyed to rb_thread_schedule(). This is what the assignments to the members of curr_thread are for. The reason for stopping is stored in wait_for, and the information to be used when waking up is stored in fd.

Waiting the other thread

Having understood that threads are switched when rb_thread_schedule() is called, we can conversely find the places where threads are switched by looking for where rb_thread_schedule() appears. Scanning for it, I found it in the function named rb_thread_join().

▼ rb_thread_join() (partial)

static int
rb_thread_join(th, limit)
    rb_thread_t th;
    double limit;
{

    curr_thread->status = THREAD_STOPPED;
    curr_thread->join = th;
    curr_thread->wait_for = WAIT_JOIN;
    curr_thread->delay = timeofday() + limit;
    if (limit < DELAY_INFTY) curr_thread->wait_for |= WAIT_TIME;
    rb_thread_schedule();

(eval.c)

This function is the substance of Thread#join, and Thread#join is a method that waits until the receiver thread ends. Indeed, since there’s time to wait, running the other threads is economical. With this, the second reason to switch is found.

Waiting For Time

Moreover, rb_thread_schedule() was also found in the function named rb_thread_wait_for(). This is the substance of (Ruby’s) sleep and the like.

▼ rb_thread_wait_for (simplified)

void
rb_thread_wait_for(time)
    struct timeval time;
{
    double date;

    date = timeofday() + (double)time.tv_sec + (double)time.tv_usec*1e-6;
    curr_thread->status = THREAD_STOPPED;
    curr_thread->delay = date;
    curr_thread->wait_for = WAIT_TIME;
    rb_thread_schedule();
}

(eval.c)

timeofday() returns the current time. Because the value of time is added to it, date indicates the time at which the wait is over. In other words, this is the order “I’d like to stop until this specific time”.
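The deadline arithmetic can be tried in isolation. The following is only a sketch: now_seconds() is a hypothetical stand-in for ruby’s timeofday() built on gettimeofday(), and wake_up_date() and wait_is_over() are names made up for this illustration. The comparison in wait_is_over() is an assumption about how a scheduler would use the stored date, not a quotation from eval.c.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Current wall-clock time in seconds as a double
   (hypothetical stand-in for ruby's timeofday()). */
double now_seconds(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec + (double)tv.tv_usec * 1e-6;
}

/* Absolute wake-up time for a thread that starts waiting at `now`
   and wants to sleep for `wait`. */
double wake_up_date(double now, struct timeval wait)
{
    return now + (double)wait.tv_sec + (double)wait.tv_usec * 1e-6;
}

/* Assumed check: the wait is over once the clock has reached the date. */
int wait_is_over(double date, double now)
{
    return date <= now;
}
```

Storing an absolute date rather than a remaining duration means the scheduler does not need to know when the wait began; it only compares the date with the current time.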

Switch by expirations


In all the above cases, threads are switched as a consequence of some operation performed at the Ruby level. In other words, so far, the Ruby level is also non-preemptive. With only this, if a program kept calculating single-mindedly, one particular thread would continue to run forever. Therefore, we need to make a thread voluntarily give up control after it has run for a while. How long a thread can run before it has to stop is what I’ll talk about next.

setitimer

It is the same thing every time, so I feel I lack the skill to entertain, but I searched further for the places that call rb_thread_schedule(). This time it was found in a strange place. Here it is.

▼ catch_timer()

static void
catch_timer(sig)
    int sig;
{
#if !defined(POSIX_SIGNAL) && !defined(BSD_SIGNAL)
    signal(sig, catch_timer);
#endif
    if (!rb_thread_critical) {
        if (rb_trap_immediate) {
            rb_thread_schedule();
        }
        else rb_thread_pending = 1;
    }
}

(eval.c)


This seems to be something related to signals. What is it? I followed the places where this catch_timer() function is used, and it was used around here:

▼ rb_thread_start_0() (partial)

static VALUE
rb_thread_start_0(fn, arg, th_arg)
    VALUE (*fn)();
    void *arg;
    rb_thread_t th_arg;
{

#if defined(HAVE_SETITIMER)
    if (!thread_init) {
#ifdef POSIX_SIGNAL
        posix_signal(SIGVTALRM, catch_timer);
#else
        signal(SIGVTALRM, catch_timer);
#endif

        thread_init = 1;
        rb_thread_start_timer();
    }
#endif

(eval.c)

This means catch_timer is a signal handler for SIGVTALRM.

Here, the question becomes “what kind of signal is SIGVTALRM?”. It is actually the signal sent when using the system call named setitimer. That’s why there’s a check of HAVE_SETITIMER just before it. setitimer is an abbreviation of “SET Interval TIMER”, a system call that tells the OS to send signals at a certain interval.

Then, where is setitimer called? It is in rb_thread_start_timer(), which happens to be located at the end of this listing.

To sum it all up, we get the following scenario. setitimer is used to send signals at a certain interval. The signals are caught by catch_timer(). There, rb_thread_schedule() is called and threads are switched. Perfect.

However, signals can occur at any time, so if this were the whole story, threads would also be preemptive at the C level. So I’d like you to look at the code of catch_timer() again.

if (rb_trap_immediate) {
    rb_thread_schedule();
}
else rb_thread_pending = 1;

There’s a condition that rb_thread_schedule() is done only when rb_trap_immediate is set. This is the point. rb_trap_immediate expresses, as the name suggests, “whether or not to process signals immediately”, and it is usually false. It becomes true only for limited periods, such as while doing I/O on a single thread. In the source code, this is the part between TRAP_BEG and TRAP_END.

On the other hand, since rb_thread_pending is set when it is false, let’s follow that. This variable is used in the following place.


▼ CHECK_INTS − HAVE_SETITIMER

#if defined(HAVE_SETITIMER) && !defined(__BOW__)
EXTERN int rb_thread_pending;
# define CHECK_INTS do {\
    if (!rb_prohibit_interrupt) {\
        if (rb_trap_pending) rb_trap_exec();\
        if (rb_thread_pending && !rb_thread_critical)\
            rb_thread_schedule();\
    }\
} while (0)

(rubysig.h)

This way, inside CHECK_INTS, rb_thread_pending is checked and rb_thread_schedule() is called. That is, when SIGVTALRM is received, rb_thread_pending becomes true, and the thread is switched the next time execution passes through CHECK_INTS.
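The combination of an interval timer, a handler that only raises a flag, and a check at a safe point can be reproduced in a few lines. This is a minimal sketch under several assumptions: switch_pending, on_timer(), check_ints(), set_timer() and run_until_first_switch() are hypothetical names, the 10 ms interval is chosen only for illustration, and counting stands in for the real thread switch.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>

static volatile sig_atomic_t switch_pending = 0; /* role of rb_thread_pending */
static int switches = 0;                         /* number of "thread switches" */

/* Signal handler: only records the request, like the else branch
   of catch_timer(). */
static void on_timer(int sig)
{
    (void)sig;
    switch_pending = 1;
}

/* Safe point: a pending switch is actually carried out here. */
static void check_ints(void)
{
    if (switch_pending) {
        switch_pending = 0;
        switches++;          /* a real scheduler would switch threads here */
    }
}

/* Start (interval_usec > 0) or stop (interval_usec == 0) the timer that
   counts the CPU time consumed by this process. */
static int set_timer(long interval_usec)
{
    struct itimerval tv;
    tv.it_interval.tv_sec = 0;
    tv.it_interval.tv_usec = interval_usec;
    tv.it_value = tv.it_interval;
    return setitimer(ITIMER_VIRTUAL, &tv, NULL);
}

/* Compute busily, passing a safe point on every iteration, until the
   first switch has happened. Returns the number of switches, or -1. */
int run_until_first_switch(void)
{
    struct sigaction sa;
    volatile unsigned long work = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_timer;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGVTALRM, &sa, NULL) != 0) return -1;
    if (set_timer(10000) != 0) return -1;

    while (switches == 0 && work < 4000000000UL) {
        work++;
        check_ints();
    }
    set_timer(0);
    return switches;
}
```

Because the handler does nothing but set a flag, the switch can only happen where check_ints() is called, which is what keeps the C level non-preemptive.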

This CHECK_INTS has appeared at various places so far, for example in rb_eval(), rb_call0() and rb_yield_0(). CHECK_INTS would be meaningless if it were not located in places that are passed through frequently. Therefore, it is natural for it to exist in the important functions.

tick

We understood the case when there’s setitimer. But what if setitimer does not exist? Actually, the answer is in CHECK_INTS, which we’ve just seen. It is the definition on the #else side.

▼ CHECK_INTS − not HAVE_SETITIMER

EXTERN int rb_thread_tick;
#define THREAD_TICK 500
#define CHECK_INTS do {\
    if (!rb_prohibit_interrupt) {\
        if (rb_trap_pending) rb_trap_exec();\
        if (!rb_thread_critical) {\
            if (rb_thread_tick-- <= 0) {\
                rb_thread_tick = THREAD_TICK;\
                rb_thread_schedule();\
            }\
        }\
    }\
} while (0)

(rubysig.h)

Every time execution passes through CHECK_INTS, rb_thread_tick is decremented. When it has reached 0, rb_thread_schedule() is called. In other words, the mechanism is that the thread is switched after passing through CHECK_INTS about THREAD_TICK (=500) times.
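The countdown is easy to check by hand. The sketch below strips the macro down to the counter; thread_tick, fake_schedule() and schedules_after() are hypothetical names, and starting the counter at THREAD_TICK is an assumption. Since the test uses a post-decrement, the call actually happens on pass THREAD_TICK + 1.

```c
#include <assert.h>

#define THREAD_TICK 500

static int thread_tick = THREAD_TICK;  /* assumed initial value */
static int schedule_calls = 0;

/* Stand-in for rb_thread_schedule(): only counts the calls. */
static void fake_schedule(void)
{
    schedule_calls++;
}

/* The counting part of CHECK_INTS, without the interrupt checks. */
static void check_ints(void)
{
    if (thread_tick-- <= 0) {
        thread_tick = THREAD_TICK;
        fake_schedule();
    }
}

/* Pass through the check `passes` times from a fresh counter and
   report how many times the scheduler was called. */
int schedules_after(int passes)
{
    int i;

    thread_tick = THREAD_TICK;
    schedule_calls = 0;
    for (i = 0; i < passes; i++) check_ints();
    return schedule_calls;
}
```

With these definitions, 500 passes cause no call, the 501st pass causes the first one, and the 1002nd pass causes the second.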

Scheduling

The second point is which thread to switch to. What is solely responsible for this decision is rb_thread_schedule().

rb_thread_schedule()

The important functions of ruby are always huge. This rb_thread_schedule() has more than 220 lines. Let’s thoroughly divide it into portions.

▼ rb_thread_schedule() (outline)

void
rb_thread_schedule()
{
    rb_thread_t next;           /* OK */
    rb_thread_t th;
    rb_thread_t curr;
    int found = 0;

    fd_set readfds;
    fd_set writefds;
    fd_set exceptfds;
    struct timeval delay_tv, *delay_ptr;
    double delay, now;  /* OK */
    int n, max;
    int need_select = 0;
    int select_timeout = 0;

    rb_thread_pending = 0;
    if (curr_thread == curr_thread->next                /* (A) */
        && curr_thread->status == THREAD_RUNNABLE)
        return;

    next = 0;                                           /* (B) */
    curr = curr_thread;         /* starting thread */

    while (curr->status == THREAD_KILLED) {
        curr = curr->prev;
    }

    /* …… prepare the variables used at select …… */
    /* …… select if necessary …… */
    /* …… decide the thread to invoke next …… */
    /* …… context-switch …… */
}

(eval.c)


(A) When there’s only one thread, this does nothing and returns immediately. Therefore, the discussion after this can assume that there are always multiple threads.

(B) Next comes the initialization of the variables. We can regard the part up to and including the while as the initialization. Since curr follows prev, it ends up set to the last thread that is still alive (status != THREAD_KILLED). It is not “the first” one because there are many loops of the form “start with the thread after curr, deal with curr last, and end”.

After that, we can see the statements related to select. Since ruby’s thread switching depends considerably on select, let’s first study select here in advance.

select

select is a system call that waits until a certain file becomes ready for reading or writing. Its prototype is this:

int select(int max,
           fd_set *readset, fd_set *writeset, fd_set *exceptset,
           struct timeval *timeout);

A variable of type fd_set stores the set of fds that we want to check. The first argument max is “(the maximum value of the fds in the fd_sets) + 1”. timeout is the maximum waiting time of select. If timeout is NULL, it waits forever. If the timeout value is 0, it does not wait at all; it only checks and returns immediately. As for the return value, I’ll talk about it when we use it.

I’ll talk about fd_set in detail. fd_set can be manipulated by using the macros below:

▼ fd_set manipulation

fd_set set;

FD_ZERO(&set)       /* initialize */
FD_SET(fd, &set)    /* add a file descriptor fd to the set */
FD_ISSET(fd, &set)  /* true if fd is in the set */

fd_set is typically a bit array, and when we want to check the n-th file descriptor, the n-th bit is set (Figure 2).

Figure 2: fd_set

I’ll show a simple usage example of select.

▼ a usage example of select

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/time.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
    char buf[1024];
    fd_set readset;

    FD_ZERO(&readset);              /* initialize readset */
    FD_SET(STDIN_FILENO, &readset); /* put stdin into the set */
    select(STDIN_FILENO + 1, &readset, NULL, NULL, NULL);
    read(STDIN_FILENO, buf, 1024);  /* success without delay */
    exit(0);
}

This code assumes that the system calls always succeed, so there are no error checks at all. I’d like you to look only at the flow FD_ZERO → FD_SET → select. Since the fifth argument timeout of select is NULL here, this select call waits indefinitely for stdin to become readable. And since this select has completed, the following read does not have to wait at all. By putting a print in the middle, you will get a better understanding of its behavior. A slightly more detailed example is on the attached CD-ROM {see also doc/select.html}.
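The other extreme, a timeout of 0, turns select into a pure check. The helper below is a sketch written for this explanation; readable_now() is a hypothetical name and does not appear in ruby.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Return 1 if fd can be read without blocking, 0 if not, -1 on error. */
int readable_now(int fd)
{
    fd_set readset;
    struct timeval zero;

    zero.tv_sec = 0;                 /* zero timeout: check and return */
    zero.tv_usec = 0;
    FD_ZERO(&readset);
    FD_SET(fd, &readset);
    if (select(fd + 1, &readset, NULL, NULL, &zero) < 0) return -1;
    return FD_ISSET(fd, &readset) ? 1 : 0;
}
```

Called on the read end of a freshly created pipe it reports 0, and after one byte has been written to the other end it reports 1. This is the kind of check a scheduler can afford to do even when some other thread is ready to run.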

Preparations for select

Now, we’ll go back to the code of rb_thread_schedule(). Since this code branches based on the reason why each thread is waiting, I’ll show the content in shortened form.

▼ rb_thread_schedule() − preparations for select

  again:
    /* initialize the variables relating to select */
    max = -1;
    FD_ZERO(&readfds);
    FD_ZERO(&writefds);
    FD_ZERO(&exceptfds);
    delay = DELAY_INFTY;
    now = -1.0;

    FOREACH_THREAD_FROM(curr, th) {
        if (!found && th->status <= THREAD_RUNNABLE) {
            found = 1;
        }
        if (th->status != THREAD_STOPPED) continue;
        if (th->wait_for & WAIT_JOIN) {
            /* …… join wait …… */
        }
        if (th->wait_for & WAIT_FD) {
            /* …… I/O wait …… */
        }
        if (th->wait_for & WAIT_SELECT) {
            /* …… select wait …… */
        }
        if (th->wait_for & WAIT_TIME) {
            /* …… time wait …… */
        }
    }
    END_FOREACH_FROM(curr, th);

(eval.c)

Whether intended or not, what stands out are the macros named FOREACH-something. These two are defined as follows:

▼ FOREACH_THREAD_FROM

#define FOREACH_THREAD_FROM(f,x) x = f; do { x = x->next;
#define END_FOREACH_FROM(f,x) } while (x != f)

(eval.c)

Let’s expand them for better understandability.

th = curr;
do {
    th = th->next;
    {
        .....
    }
} while (th != curr);

This means: follow the circular list of threads starting from the one after curr, process curr last, and then end; meanwhile the variable th is used. This makes me think of Ruby’s iterators… is my imagination running away with me?
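The visiting order can be confirmed on a toy ring. In this sketch the two macros are copied from the listing above, while struct thread (reduced to an id and a link) and visit_order() are made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* A toy thread node: just an id and the link of the circular list. */
struct thread {
    int id;
    struct thread *next;
};

#define FOREACH_THREAD_FROM(f,x) x = f; do { x = x->next;
#define END_FOREACH_FROM(f,x) } while (x != f)

/* Record the ids in visiting order; returns the number of threads
   visited (at most `max` ids are stored in `out`). */
int visit_order(struct thread *curr, int *out, int max)
{
    struct thread *th;
    int n = 0;

    FOREACH_THREAD_FROM(curr, th) {
        if (n < max) out[n] = th->id;
        n++;
    }
    END_FOREACH_FROM(curr, th);
    return n;
}
```

For a ring 1 → 2 → 3 → 1 with curr pointing at 1, the ids come out as 2, 3, 1: the loop starts after curr and handles curr last. A ring of a single thread is visited exactly once.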

Now back to the rest of the code: it uses this slightly strange loop to check whether there’s any thread that needs select. As we’ve seen previously, since select can wait for reading/writing/exception/time all at once, you can probably understand that I/O waits and time waits can be centralized into a single select. And though I didn’t describe it in the previous section, select waits are also possible. There’s a method named IO.select in Ruby’s library, and you can use rb_thread_select() at the C level. Therefore, we need to execute those selects at the same time. By merging the fd_sets, multiple selects can be done at once.

The rest is only join wait. As for its code, let’s look at it just in case.


▼ rb_thread_schedule() − select preparation − join wait

        if (th->wait_for & WAIT_JOIN) {
            if (rb_thread_dead(th->join)) {
                th->status = THREAD_RUNNABLE;
                found = 1;
            }
        }

(eval.c)

The meaning of rb_thread_dead() is obvious from its name. It determines whether or not the thread given as the argument has finished.

Calling select

By now, we’ve figured out whether select is necessary or not, and if it is necessary, its fd_sets have already been prepared. Even if there’s an immediately invocable thread (THREAD_RUNNABLE), we need to call select beforehand. It’s possible that there’s a thread whose I/O wait finished a while ago and which has a higher priority. In that case, though, we tell select to return immediately and let it only check whether I/O has completed.

▼ rb_thread_schedule() − select

    if (need_select) {
        /* convert delay into timeval */
        /* if there are immediately invocable threads, do only I/O checks */
        if (found) {
            delay_tv.tv_sec = 0;
            delay_tv.tv_usec = 0;
            delay_ptr = &delay_tv;
        }
        else if (delay == DELAY_INFTY) {
            delay_ptr = 0;
        }
        else {
            delay_tv.tv_sec = delay;
            delay_tv.tv_usec = (delay - (double)delay_tv.tv_sec)*1e6;
            delay_ptr = &delay_tv;
        }

        n = select(max+1, &readfds, &writefds, &exceptfds, delay_ptr);
        if (n < 0) {
            /* …… being cut in by signal or something …… */
        }
        if (select_timeout && n == 0) {
            /* …… timeout …… */
        }
        if (n > 0) {
            /* …… properly finished …… */
        }
        /* In a somewhere thread, its I/O wait has finished.
           roll the loop again to detect the thread */
        if (!found && delay != DELAY_INFTY)
            goto again;
    }

(eval.c)

The first half of the block is as written in the comment. Since delay is the time, in seconds as a double, until the next thread becomes invocable, it is converted into the timeval form.
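The three cases of the conversion can be isolated into one small function. This is a sketch, not ruby’s code: select_timeout_for() is a hypothetical name, and the value used for DELAY_INFTY here is an assumption that only serves as a “no deadline” marker.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

#define DELAY_INFTY 1E30   /* assumed marker for "no deadline" */

/* Build the timeout argument for select().
   Returns NULL when select() should wait indefinitely. */
struct timeval *select_timeout_for(int found, double delay, struct timeval *tv)
{
    if (found) {                       /* someone can run now: only poll */
        tv->tv_sec = 0;
        tv->tv_usec = 0;
        return tv;
    }
    if (delay == DELAY_INFTY) return NULL;

    tv->tv_sec = (time_t)delay;        /* whole seconds */
    tv->tv_usec = (suseconds_t)((delay - (double)tv->tv_sec) * 1e6);
    return tv;                         /* fraction became microseconds */
}
```

For example, a delay of 1.5 becomes tv_sec = 1 and tv_usec = 500000.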

In the second half, it actually calls select and branches based on its result. Since this code is long, I divided it again. When interrupted by a signal, it either goes back to the beginning and processes again, or becomes an error. The meaningful ones are the remaining two.


Timeout

When select times out, a thread in time wait or select wait may have become invocable. Check for this and search for runnable threads. If one is found, set it to THREAD_RUNNABLE.

Completing normally

If select completes normally, it means either that some I/O is ready or that a select wait has ended. Search for the threads that are no longer waiting by checking the fd_sets. If one is found, set it to THREAD_RUNNABLE.

Decide the next thread

Taking all this information into consideration, finally decide the next thread to invoke. Since everything that was invocable and everything that had finished waiting has become RUNNABLE, you can arbitrarily pick one of them.

▼ rb_thread_schedule() − decide the next thread

    FOREACH_THREAD_FROM(curr, th) {
        if (th->status == THREAD_TO_KILL) {              /* (A) */
            next = th;
            break;
        }
        if (th->status == THREAD_RUNNABLE && th->stk_ptr) {
            if (!next || next->priority < th->priority)  /* (B) */
               next = th;
        }
    }
    END_FOREACH_FROM(curr, th);

(eval.c)

(A) If there’s a thread that is about to finish, give it high priority and let it finish.

(B) Find what seems runnable. However, it seems to take the value of priority into account. This member can also be modified from the Ruby level by using Thread#priority and Thread#priority=. ruby itself does not especially modify it.
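The selection rule can be written out on a reduced thread structure. The following is a sketch: the status names, the has_stack flag (standing in for a non-NULL stk_ptr) and pick_next() are hypothetical, and only the two conditions (A) and (B) are modeled.

```c
#include <assert.h>
#include <stddef.h>

enum status { RUNNABLE, TO_KILL, STOPPED };

struct thread {
    enum status status;
    int priority;
    int has_stack;            /* role of th->stk_ptr being non-NULL */
    struct thread *next;      /* circular list */
};

/* Scan the ring starting after curr and ending with curr.
   Returns NULL when no thread can run. */
struct thread *pick_next(struct thread *curr)
{
    struct thread *th = curr, *next = NULL;

    do {
        th = th->next;
        if (th->status == TO_KILL) return th;             /* (A) */
        if (th->status == RUNNABLE && th->has_stack) {
            if (!next || next->priority < th->priority)   /* (B) */
                next = th;
        }
    } while (th != curr);
    return next;
}
```

Because the comparison in (B) is a strict “less than”, among runnable threads of equal priority the one met first wins, and since the scan starts after curr, the current thread is the last candidate. That gives a rough round-robin among equals.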

If, after all this, the next thread could not be found, in other words if next was not set, what happens? Since select has already been done, at least one of the threads in time wait or I/O wait should have finished waiting. If there is none, all that remains are waits for other threads, and moreover there are no runnable threads, so these waits will never end. This is a deadlock.

Of course, a deadlock can happen for other reasons as well, but in general it’s very hard to detect a deadlock. Especially in the case of ruby, since Mutex and the like are implemented at the Ruby level, perfect detection is nearly impossible.

Switching Threads

The next thread to invoke has been determined. The I/O and select checks have also been done. The rest is transferring control to the target thread. However, for the last part of rb_thread_schedule() and the code to switch threads, I’ll start a new section.

Context Switch

The third and last point is thread-switch, that is, context-switch. This is the most interesting part of ruby’s threads.

The Base Line

We’ll start with the tail of rb_thread_schedule(). Since the story of this section is very complex, I’ll go with a significantly simplified version.

▼ rb_thread_schedule() (context switch)

if (THREAD_SAVE_CONTEXT(curr)) {
    return;
}
rb_thread_restore_context(next, RESTORE_NORMAL);

As for the THREAD_SAVE_CONTEXT() part, we need to expand its content in several places in order to understand it.

▼ THREAD_SAVE_CONTEXT()

#define THREAD_SAVE_CONTEXT(th) \
    (rb_thread_save_context(th),thread_switch(setjmp((th)->context)))

static int
thread_switch(n)
    int n;
{
    switch (n) {
      case 0:
        return 0;
      case RESTORE_FATAL:
        JUMP_TAG(TAG_FATAL);
        break;
      case RESTORE_INTERRUPT:
        rb_interrupt();
        break;
      /* …… process various abnormal things …… */
      case RESTORE_NORMAL:
      default:
        break;
    }
    return 1;
}

(eval.c)

If I merge the three and expand them, here is the result:

rb_thread_save_context(curr);
switch (setjmp(curr->context)) {
  case 0:
    break;
  case RESTORE_FATAL:
    ....
  case RESTORE_INTERRUPT:
    ....
  /* …… process abnormals …… */
  case RESTORE_NORMAL:
  default:
    return;
}
rb_thread_restore_context(next, RESTORE_NORMAL);


RESTORE_NORMAL appears both as a return value of setjmp() and as an argument of rb_thread_restore_context(); this is clearly suspicious. Since rb_thread_restore_context() does longjmp(), we can expect a correspondence between setjmp() and longjmp(). And if we also guess the meaning from the function names,

save the context of the current thread
setjmp
restore the context of the next thread
longjmp

The rough main flow would probably look like this. However, what we have to be careful about here is that this pair of setjmp() and longjmp() is not completed within one thread. setjmp() is used to save the context of this thread, and longjmp() is used to restore the context of the next thread. In other words, there’s a chain of setjmp()/longjmp() as follows (Figure 3).


Figure 3: the backstitch by chaining of setjmp
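How the value handed to longjmp() reappears as the return value of setjmp() can be seen in a single function. This sketch deliberately stays inside one thread and one stack frame, which is much simpler than what ruby does, where the longjmp() is issued on behalf of another thread after the stack has been written back. The RESTORE_* values and resume_with() are illustrative assumptions.

```c
#include <assert.h>
#include <setjmp.h>

/* Illustrative values only; they are not taken from eval.c. */
enum { RESTORE_FATAL = 1, RESTORE_INTERRUPT = 2, RESTORE_NORMAL = 3 };

static jmp_buf context;

/* Save a context, jump back to it at once with `reason`, and report
   which branch of the switch was taken after the jump. */
int resume_with(int reason)
{
    switch (setjmp(context)) {
      case 0:                       /* first return: context just saved */
        longjmp(context, reason);   /* never returns */
      case RESTORE_FATAL:
        return RESTORE_FATAL;
      case RESTORE_INTERRUPT:
        return RESTORE_INTERRUPT;
      case RESTORE_NORMAL:
      default:
        return RESTORE_NORMAL;
    }
}
```

The first pass through the switch sees 0, which corresponds to “the context has just been saved, go on and switch away”. The second pass sees the reason, which corresponds to “this thread has just been resumed”.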

We can restore more or less the CPU registers with setjmp()/longjmp(), so the remaining context is the Ruby stacks plus the machine stack. rb_thread_save_context() saves it, and rb_thread_restore_context() restores it. Let’s look at each of them in order.

rb_thread_save_context()

Now, we’ll start with rb_thread_save_context(), which saves a context.

▼ rb_thread_save_context() (simplified)

static void
rb_thread_save_context(th)
    rb_thread_t th;
{
    VALUE *pos;
    int len;
    static VALUE tval;

    len = ruby_stack_length(&pos);
    th->stk_len = 0;
    th->stk_pos = (rb_gc_stack_start<pos)?rb_gc_stack_start
                                         :rb_gc_stack_start - len;
    if (len > th->stk_max) {
        REALLOC_N(th->stk_ptr, VALUE, len);
        th->stk_max = len;
    }
    th->stk_len = len;
    FLUSH_REGISTER_WINDOWS;
    MEMCPY(th->stk_ptr, th->stk_pos, VALUE, th->stk_len);

    /* ………… omission ………… */
}

(eval.c)

The second half just keeps assigning global variables such as ruby_scope into th, so it is omitted because it is not interesting. The rest, the part shown above, copies the entire machine stack into the place th->stk_ptr points to.

First, ruby_stack_length() writes the head address of the stack into the parameter pos and returns its length. The range of the stack is determined from these values, and the address of its lower end is set to th->stk_pos. We can see a branch; it is there because both a stack that extends toward higher addresses and a stack that extends toward lower addresses are possible (Figure 4).


Figure 4: a stack extending above and a stack extending below
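The branch on the direction of growth can be exercised on an ordinary array that plays the role of the machine stack. This is a sketch under assumptions: VALUE is replaced by unsigned long, stack_range() and save_stack() are hypothetical names, and the question of exactly which boundary words belong to the stack is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef unsigned long VALUE;   /* stand-in for ruby's VALUE */

/* Given where the stack started and where its end is now, return its
   length in VALUEs and store the lowest address of the region in *low. */
size_t stack_range(VALUE *start, VALUE *pos, VALUE **low)
{
    if (start < pos) {             /* grows toward higher addresses */
        *low = start;
        return (size_t)(pos - start);
    }
    *low = pos;                    /* grows toward lower addresses */
    return (size_t)(start - pos);
}

/* Copy the used region into buf; returns the number of VALUEs copied,
   or 0 if buf is too small (the real code enlarges its buffer). */
size_t save_stack(VALUE *buf, size_t buf_len, VALUE *start, VALUE *pos)
{
    VALUE *low;
    size_t len = stack_range(start, pos, &low);

    if (len > buf_len) return 0;
    memcpy(buf, low, len * sizeof(VALUE));
    return len;
}
```

Whichever way the stack grows, the copy always starts from the lowest address, so the saved image can later be written back with a single copy in the opposite direction.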

After that, the rest is allocating memory where th->stk_ptr points and copying the stack: the memory is reallocated if its current size th->stk_max is smaller than len, and then len elements of the stack are copied.

FLUSH_REGISTER_WINDOWS was described in Chapter 5: Garbage collection, so its explanation might no longer be necessary. This is a macro (whose substance is written in assembler) that writes the cached part of the stack space out to memory. It must be called when the target is the entire stack.


rb_thread_restore_context()

And finally, rb_thread_restore_context(), which is the function that restores a thread.

▼ rb_thread_restore_context()

static void
rb_thread_restore_context(th, exit)
    rb_thread_t th;
    int exit;
{
    VALUE v;
    static rb_thread_t tmp;
    static int ex;
    static VALUE tval;

    if (!th->stk_ptr) rb_bug("unsaved context");

    if (&v < rb_gc_stack_start) {
        /* the machine stack extending lower */
        if (&v > th->stk_pos) stack_extend(th, exit);
    }
    else {
        /* the machine stack extending higher */
        if (&v < th->stk_pos + th->stk_len) stack_extend(th, exit);
    }

    /* omission …… back the global variables */

    tmp = th;
    ex = exit;
    FLUSH_REGISTER_WINDOWS;
    MEMCPY(tmp->stk_pos, tmp->stk_ptr, VALUE, tmp->stk_len);

    tval = rb_lastline_get();
    rb_lastline_set(tmp->last_line);
    tmp->last_line = tval;
    tval = rb_backref_get();
    rb_backref_set(tmp->last_match);
    tmp->last_match = tval;

    longjmp(tmp->context, ex);
}

(eval.c)

The th parameter is the thread to give execution back to. The MEMCPY() and longjmp() in the second half are the heart. The closer MEMCPY() is to the end, the better, because after this operation the stack is in a destroyed state until longjmp().

Nevertheless, rb_lastline_set() and rb_backref_set() come after it. They restore $_ and $~. Since these two variables are not only local variables but also thread-local variables, even a single local variable slot needs as many instances as there are threads. This must be done here because the place actually written back to is the stack. Because they are local variables, their slot space is allocated with alloca().

That’s it for the basics. But if we merely wrote the stack back, then in the case where the stack of the current thread is shorter than the stack of the thread to switch to, the stack frame of the very function currently executing (rb_thread_restore_context itself) would be overwritten. This means the content of the th parameter would be destroyed. Therefore, to prevent this, we first need to extend the stack. This is done by the stack_extend() in the first half.

▼ stack_extend()

static void
stack_extend(th, exit)
    rb_thread_t th;
    int exit;
{
    VALUE space[1024];

    memset(space, 0, 1);        /* prevent array from optimization */
    rb_thread_restore_context(th, exit);
}

(eval.c)

By allocating a local variable of 1024 VALUEs (which is placed in the machine stack space), the stack is forcibly extended. However, as a matter of course, returning from stack_extend() means the extended stack shrinks again immediately. This is why rb_thread_restore_context() is called again right there.

By the way, when rb_thread_restore_context() completes its task, it has reached the call to longjmp(), and once that is called it never returns. Obviously, the call to stack_extend() will also never return. Therefore, rb_thread_restore_context() does not have to think about what to do after returning from stack_extend().

Issues

This is the implementation of the ruby thread switch. We can hardly call it lightweight. Plenty of malloc() and realloc(), plenty of memcpy(), setjmp() and longjmp(), and on top of that calling functions just to extend the stack. There’s no problem in saying “it is deadly heavy”. But in return, there is no system call that depends on a particular OS, and there is only a little assembly, just for the register windows of SPARC. Indeed, this seems to be highly portable.

There’s another problem: because the stacks of all threads are placed at the same address, code that uses pointers into the stack space may not work. Actually, Tcl/Tk fits this situation exactly; to work around it, Ruby’s Tcl/Tk interface reluctantly chose to access it only from the main thread.

Of course, this does not get along with native threads. It would be necessary to restrict ruby threads to run only on one particular native thread in order to let them work properly. On UNIX, libraries that make heavy use of threads are still few. But on Win32, because threads are used all over the place, we need to be careful about it.

The original work is Copyright © 2002 - 2004 Minero AOKI. Translated by Vincent ISAMBART and Clifford Escobar CAOILE. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License.


Ruby Hacking Guide

Final Chapter: Ruby’s future

Issues to be addressed

ruby isn’t ‘completely finished’ software. It’s still being developed, and there are still a lot of issues. First, we want to try removing the inherent problems in the current interpreter.

The order of the topics is mostly the same as the order of the chapters of this book.

Performance of GC

The performance of the current GC might be “not notably bad, but not notably good”. “Not notably bad” means “it won’t cause trouble in our daily life”, and “not notably good” means “its downside will be exposed under heavy load”. For example, an application that creates plenty of objects and keeps holding them would slow down radically. Every time GC runs, it needs to mark all of the objects, and furthermore GC ends up being invoked more often because it can’t collect them. To counter this problem, Generational GC, which was mentioned in Chapter 5, should be effective. (At least, it is said so in theory.)

Also regarding its response speed, there is still room for improvement. With the current GC, the entire interpreter stops while it is running. Thus, when the program is an editor or a GUI application, it sometimes freezes and stops reacting. Even if it’s just 0.1 second, stopping while the user is typing gives a very bad impression. Currently, few such applications have been created, or even where they exist, they may be small enough not to expose this problem. However, if such applications are actually created in the future, it may become necessary to consider Incremental GC.

Implementation of parser

As we saw in Part 2, the implementation of ruby’s parser already uses yacc’s abilities almost to the limit, so I don’t think it can endure further expansion. It’s all right if no expansion is planned, but a big name, “keyword arguments”, is planned next, and it would be sad if we could not express some other demanded grammar because of the limitations of yacc.

Reuse of parser

Ruby’s parser is very complex. In particular, dealing seriously with everything around lex_state is very hard. Because of this, embedding a Ruby program or creating a program that deals with Ruby programs themselves is quite difficult.

For example, I’m developing a tool named racc, which is prefixed with R because it is a Ruby version of yacc. With racc, the syntax of grammar files is almost the same as yacc’s, but we can write actions in Ruby. To do so, it cannot determine the end of an action without parsing the Ruby code properly, but “properly” is very difficult. Since there’s no other choice, currently I’ve compromised at the level where it can parse “almost all” of it.

As other examples that require analyzing Ruby programs, I can list tools like indent and lint, but creating such tools also requires a lot of effort. It would be hopeless for something as complex as a refactoring tool.

Then, what can we do? If we can’t recreate the same thing, what if ruby’s original parser could be used as a component? In other words, making the parser itself a library. This is a feature we want by all means.

However, the problem here is that, as long as yacc is used, we cannot make the parser reentrant. This means, say, we cannot call yyparse() recursively, and we cannot call it from multiple threads. Therefore, it has to be implemented in such a way that control does not return to Ruby while parsing.

Hiding Code

Current ruby cannot run a program without its source code. Thus, people who don’t want others to read their source code might have trouble.

Interpreter Object

Currently a process cannot have multiple ruby interpreters; this was discussed in Chapter 13. If having multiple interpreters were practically possible, it would seem better, but is it possible to implement such a thing?

The structure of evaluator

The current eval.c is, above all, too complex. Embedding Ruby’s stack frames in the machine stack can occasionally become a source of trouble, and using setjmp() and longjmp() aggressively makes it harder to understand and slows it down. Particularly on RISC machines, which have many registers, using setjmp() aggressively can easily cause slowdowns because setjmp() saves everything in the registers.

The performance of evaluator

ruby is already fast enough for ordinary use. But aside from that, for a language processor, faster is definitely better. To achieve better performance, in other words to optimize, what can we do? In such a case, the first thing we have to do is profiling. So I profiled.

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 20.25      1.64     1.64  2638359     0.00     0.00  rb_eval
 12.47      2.65     1.01  1113947     0.00     0.00  ruby_re_match
  8.89      3.37     0.72  5519249     0.00     0.00  rb_call0
  6.54      3.90     0.53  2156387     0.00     0.00  st_lookup
  6.30      4.41     0.51  1599096     0.00     0.00  rb_yield_0
  5.43      4.85     0.44  5519249     0.00     0.00  rb_call
  5.19      5.27     0.42   388066     0.00     0.00  st_foreach
  3.46      5.55     0.28  8605866     0.00     0.00  rb_gc_mark
  2.22      5.73     0.18  3819588     0.00     0.00  call_cfunc

This is a profile taken when running a certain application, but it is approximately the profile of a general Ruby program. rb_eval() is at the top with an overwhelming percentage; after that, in addition to GC functions and the evaluator core, functions specific to the program are mixed in. For example, in the case of this application, regular expression matching (ruby_re_match) takes a lot of time.

However, even if we understand this, the question is how to improve it. Thinking simply, it can be achieved by making rb_eval() faster. That said, as for the ruby core, there is almost no room left that can be easily optimized. For instance, the “tail recursion → goto conversion” used for NODE_IF and others has apparently already been applied to almost every place it can be applied. In other words, without fundamentally changing the way of thinking, there’s no room to improve.

The implementation of thread

This was also discussed in Chapter 19. There are really a lot of issues with the current implementation of ruby’s threads. In particular, it mixes very badly with native threads. The two great advantages of ruby’s threads, (1) high portability and (2) the same behavior everywhere, are definitely hard to match, but that implementation is probably something we cannot keep using forever, is it?

ruby 2

Next, on the other hand, I’ll introduce the direction the original ruby is taking to counter these issues.

Rite

At the present time, ruby’s leading edge is 1.6.7 as the stable version and 1.7.3 as the development version, but the next stable version 1.8 will perhaps come out in the near future. At that point, the next development version 1.9.0 will start at the same time. And after that, this is a little irregular, but 1.9.1 will be the next stable version.


stable    development    when to start
1.6.x     1.7.x          1.6.0 was released on 2000-09-19
1.8.x     1.9.x          probably it will come out within 6 months
1.9.1~    2.0.0          maybe about 2 years later

And the next-to-next generation development version is ruby 2, whose code name is Rite. Apparently this name pays homage to the shortcoming that Japanese speakers cannot distinguish the sounds of L and R.

What will be changed in 2.0 is, in short, almost the entire core. Thread, evaluator, parser: all of them will be changed. However, nothing has been written as code yet, so what is written here is entirely just a “plan”. If you expect too much, you may end up disappointed. Therefore, for now, let’s keep our expectations modest.

The language to write

First, the language to use. It will definitely be C. Mr. Matsumoto said on ruby-talk, which is the English mailing list for Ruby,

I hate C++.

So, C++ is most unlikely. Even if all the parts are recreated, it is reasonable to expect that the object system will remain almost the same, so it is necessary not to add extra effort around this. However, chances are good that it will be ANSI C next time.


GC

Regarding the implementation of GC, a good starting point would be Boehm GC (http://www.hpl.hp.com/personal/Hans_Boehm/gc). Boehm GC is a conservative, incremental and generational GC; furthermore, it can mark the stack spaces of all threads even while native threads are running. It’s a really impressive GC. Even if it is introduced once, it’s hard to tell whether it will be used permanently, but in any case things will proceed in a direction in which we can expect some improvement in speed.

Parser

Regarding the specification, it’s very likely that nested method calls without parentheses will be forbidden. As we’ve seen, command_call has a great influence all over the grammar. If this is simplified, both the parser and the scanner will also be simplified a lot. However, the ability to omit parentheses itself will never be removed.

And regarding its implementation, whether we continue to use yacc is still under discussion. If we don’t use it, that would mean writing the parser by hand, but is it possible to implement such a complex thing by hand? That worry remains. Whichever way we choose, the path will be thorny.

Evaluator

The evaluator will be completely recreated. Its aims are mainly to improve speed and to simplify the implementation. There are two main viewpoints:

remove recursive calls like rb_eval()
switch to a bytecode interpreter

First, removing recursive calls of rb_eval(). Perhaps the most intuitive way to explain how is that it is like the “tail recursion → goto conversion”. Inside a single rb_eval(), we loop around by using goto. That decreases the number of function calls and removes the need for the setjmp() that is used for return or break. However, when a function defined in C is called, calling a function is inevitable, and at that point setjmp() will still be required.
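To make the idea concrete, here is a minimal sketch in Ruby rather than C. The node format ([:lit, n], [:add, l, r], [:mul, l, r]) and both method names are assumptions made up for this illustration and have nothing to do with the real rb_eval(). The first method recurses once per node; the second evaluates the same tree inside a single loop with an explicit work list, which is the role goto would play inside one rb_eval().

```ruby
# Recursive evaluator: one method call per node, like the current rb_eval().
def eval_recursive(node)
  case node[0]
  when :lit then node[1]
  when :add then eval_recursive(node[1]) + eval_recursive(node[2])
  when :mul then eval_recursive(node[1]) * eval_recursive(node[2])
  else raise ArgumentError, "unknown node: #{node[0].inspect}"
  end
end

# Loop-based evaluator: no recursion, only an explicit list of pending work.
def eval_iterative(root)
  work = [[:visit, root]] # tasks still to be done (used as a stack)
  values = []             # intermediate results
  until work.empty?
    kind, payload = work.pop
    if kind == :visit
      case payload[0]
      when :lit
        values.push(payload[1])
      when :add, :mul
        # Pushed in reverse order: left is visited first, the operator last.
        work.push([:apply, payload[0]])
        work.push([:visit, payload[2]])
        work.push([:visit, payload[1]])
      else
        raise ArgumentError, "unknown node: #{payload[0].inspect}"
      end
    else # :apply
      right = values.pop
      left = values.pop
      values.push(payload == :add ? left + right : left * right)
    end
  end
  values.pop
end

tree = [:add, [:lit, 1], [:mul, [:lit, 2], [:lit, 3]]]
recursive_result = eval_recursive(tree)
iterative_result = eval_iterative(tree)
```

Because the loop version does not grow the call stack, very deeply nested trees cost only heap memory for the work list.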

Bytecode is, in short, something like a program written in machine language. It became famous because of the virtual machine of Smalltalk-80, and it is called bytecode because each instruction is one byte. For those who usually work at a more abstract level, a byte would seem a natural unit of size to deal with, but in many cases each instruction in a machine language is made up of bits. For example, in Alpha, within a 32-bit instruction code, the first 6 bits represent the instruction type.

The advantage of bytecode interpreters is mainly speed. There are two reasons: Firstly, unlike syntax trees, there’s no need to traverse pointers. Secondly, it’s easy to do peephole optimization.
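Both points can be seen in a toy example. The sketch below is not a design for Rite; the node format, the four-instruction set (push, load, add, mul), and all method names are assumptions invented for this illustration. A tree is flattened into a linear instruction array, a peephole pass folds every “push, push, operator” window into a single push, and the interpreter is one loop over the array.

```ruby
# Flatten a tree into a linear instruction sequence (post-order).
# Nodes: [:lit, n], [:var, name], [:add, l, r], [:mul, l, r].
def compile(node, code = [])
  case node[0]
  when :lit then code << [:push, node[1]]
  when :var then code << [:load, node[1]]
  when :add, :mul
    compile(node[1], code)
    compile(node[2], code)
    code << [node[0]]
  else
    raise ArgumentError, "unknown node: #{node[0].inspect}"
  end
  code
end

# Peephole optimization: whenever the output ends with
# "push a, push b, add/mul", replace those three instructions by one push.
def peephole(code)
  out = []
  code.each do |insn|
    out << insn
    next if out.size < 3
    a, b, op = out[-3], out[-2], out[-1]
    next unless a[0] == :push && b[0] == :push && [:add, :mul].include?(op[0])
    out.pop(3)
    out << [:push, op[0] == :add ? a[1] + b[1] : a[1] * b[1]]
  end
  out
end

# The interpreter: one loop over a flat array, no tree pointers to follow.
def run(code, env = {})
  stack = []
  pc = 0
  while pc < code.size
    op, arg = code[pc]
    case op
    when :push then stack.push(arg)
    when :load then stack.push(env.fetch(arg))
    when :add, :mul
      right = stack.pop
      left = stack.pop
      stack.push(op == :add ? left + right : left * right)
    else
      raise ArgumentError, "unknown instruction: #{op.inspect}"
    end
    pc += 1
  end
  stack.pop
end

tree = [:add, [:var, :x], [:mul, [:lit, 2], [:lit, 3]]]
code = compile(tree)       # five instructions
optimized = peephole(code) # three instructions: load x, push 6, add
result = run(optimized, { x: 1 })
```

The optimizer only ever looks at a window of three adjacent instructions, which is why this kind of rewriting is easy on a flat sequence and awkward on a tree.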

And in the case where bytecode is saved and read in later, because there’s no need to parse, we can naturally expect better performance. However, parsing is done only once at the beginning of a program, and even currently it does not take much time. Therefore, its influence will not be great.

If you’d like to know what a bytecode evaluator could look like, regex.c is worth a look. As another example, Python is a bytecode interpreter.

Thread

Regarding threads, the main topic is native thread support. The environment around threads has improved significantly compared with the situation in 1994, the year of Ruby’s birth. So it may be judged that we can get along with native threads now.

Using native threads means being preemptive at the C level as well, thus the interpreter itself must be multi-thread safe, but it seems this point is going to be solved by using a global lock for the time being.
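The global-lock approach can be pictured with a deliberately tiny model. The ToyInterpreter class below is hypothetical and says nothing about how Rite will actually be written; it only shows the principle that every access to shared interpreter state goes through one single lock, so the state stays consistent no matter how the threads are scheduled.

```ruby
# Hypothetical toy "interpreter": all shared state is guarded by one lock.
class ToyInterpreter
  def initialize
    @global_lock = Mutex.new
    @globals = Hash.new(0)
  end

  # A read-modify-write on shared state, done entirely under the lock.
  def increment(name)
    @global_lock.synchronize do
      current = @globals[name]
      @globals[name] = current + 1
    end
  end

  def read(name)
    @global_lock.synchronize { @globals[name] }
  end
end

interp = ToyInterpreter.new
threads = Array.new(8) do
  Thread.new { 1000.times { interp.increment(:counter) } }
end
threads.each(&:join)
total = interp.read(:counter)
```

The price of this simplicity is that only one thread at a time can be inside the locked region, which is why the text treats it as a solution “for the time being”.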

Additionally, that somewhat arcane “continuation” seems likely to be removed. ruby’s continuation depends heavily on the implementation of threads, so naturally it will disappear if threads are switched to native threads. That feature exists because “it can be implemented”, and it is rarely actually used. Therefore there should be no problem.

M17N

In addition, I’d like to mention a few things about class libraries. This is about multi-lingualization (M17N for short). What it means exactly in the context of programming is being able to deal with multiple character encodings.

ruby with multi-lingualization support has already been implemented and you can obtain it from the ruby_m17m branch of the CVS repository. It has not been absorbed yet because its specification is judged to be immature. If good interfaces are designed, it will be absorbed at some point in the middle of 1.9.
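What “dealing with multiple character encodings” means in practice can be shown with a short sketch. Note the assumption: it uses the String encoding methods found in released Ruby versions from 1.9 on (encode, encoding, bytesize), which are not necessarily the interfaces of the branch mentioned above. The same three characters are held in three encodings; the character count stays the same while the byte count differs.

```ruby
# "\u65E5\u672C\u8A9E" is the three-character Japanese word for "Japanese".
utf8 = "\u65E5\u672C\u8A9E"
euc  = utf8.encode("EUC-JP")
sjis = utf8.encode("Shift_JIS")

[utf8, euc, sjis].each do |s|
  puts "#{s.encoding}: #{s.length} characters, #{s.bytesize} bytes"
end

# Converting back yields the original string again.
round_trip = sjis.encode("UTF-8")
```

Each string carries its own encoding, so length can count characters correctly even though the underlying byte sequences are different.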

IO

The IO class in current Ruby is a simple wrapper of stdio, but in this approach,

there are too many slight differences between the various platforms.
we’d like to have finer control over buffers.

these two points cause complaints. Therefore, it seems Rite will have its own stdio.

Ruby Hacking Guide

So far, we’ve always acted as observers who look at ruby from outside. But, of course, ruby is not a product displayed in a showcase. This means we can influence it if we take action. In the last section of this book, I’ll introduce the suggestions and activities for ruby from the community, as a farewell gift for Ruby hackers both present and future.

Generational GC

First, as also mentioned in Chapter 5, the generational GC made by Mr. Kiyama Masato. As described before, with the current patch,

it is less fast than expected.
it needs to be updated to fit the edge ruby

these points are problems, but here I’d like to highly value it because, more than anything else, it was the first large non-official patch.

Oniguruma

The regular expression engine used by current Ruby is a remodeled version of GNU regex. GNU regex was originally written for Emacs. It was then remodeled so that it could support multi-byte characters, and then Mr. Matsumoto remodeled it so that it is compatible with Perl. As we can easily imagine from this history, its construction is really intricate and spooky. Furthermore, because of the LGPL license of this GNU regex, the license of ruby is very complicated, so replacing this engine has been an issue for a long time.

What suddenly emerged here is the regular expression engine “Oniguruma” by Mr. K. Kosako. I heard it is written really well, and it is likely to be absorbed as soon as possible.

You can obtain Oniguruma from ruby’s CVS repository in the following way.

% cvs -d :pserver:[email protected]:/src co oniguruma

ripper

Next, ripper is my product. It is an extension library made by remodeling parse.y. It is not a change applied to ruby’s main body, but I introduce it here as one possible direction for making the parser a component.

It is implemented with a kind of streaming interface, and it can pick up things such as token scans or the parser’s reductions as events. It is on the attached CD-ROM\footnote{ripper: archives/ripper-0.0.5.tar.gz of the attached CD-ROM}, so I’d like you to give it a try. Note that the supported grammar is a little different from the current one because this version is based on ruby 1.7 from almost half a year ago.
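To give a feel for such an event interface, here is a sketch that assumes the library named Ripper bundled with later Ruby releases; its API may well differ from ripper 0.0.5 on the CD-ROM. EventLogger is a hypothetical class written for this illustration. It overrides one scanner event handler (on_int, called when an integer token is scanned) and one parser event handler (on_binary, called when a binary expression is reduced) and records both kinds of event.

```ruby
require 'ripper'

# Hypothetical subclass that records selected events while parsing.
class EventLogger < Ripper
  def events
    @events ||= []
  end

  # Scanner event: an integer literal token was scanned.
  def on_int(token)
    events << [:scan, token]
    token
  end

  # Parser event: a binary expression "left op right" was reduced.
  def on_binary(left, operator, right)
    events << [:reduce, operator]
    [left, operator, right]
  end
end

logger = EventLogger.new("1 + 2 * 3")
logger.parse
logger.events.each { |kind, value| puts "#{kind}: #{value.inspect}" }
```

The value returned by each handler is passed to the handlers of the enclosing constructs, so a tool can build whatever structure it needs without touching the parser itself.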

I created this just because “I happened to come up with this idea”; taking that into account, I think it is constructed well. It took only three days or so to implement, really just a piece of cake.

A parser alternative

This product has not yet appeared in a clear form, but there’s a person who is writing a Ruby parser in C++ which can be used totally independently of ruby ([ruby-talk:50497]).

JRuby

More aggressively, there’s an attempt to rewrite the entire interpreter. For example, a Ruby written in Java, JRuby\footnote{JRuby http://jruby.sourceforge.net}, has appeared. It seems it is being implemented by a large group of people, Mr. Jan Arne Petersen and many others.

I tried it a little, and my impressions are as follows:

the parser is written really well. It precisely handles even finer behaviors such as spaces or here documents.
instance_eval seems not to be in effect (probably it couldn’t be helped).
it has just a few built-in libraries yet (couldn’t be helped as well).
we can’t use extension libraries with it (naturally).
because Ruby’s UNIX-centric parts are all cut out, there’s little possibility that we can run already-existing scripts without any change.
slow

perhaps I could say at least these things. Regarding the last one, “slow”, the degree is that the execution time is 20 times longer than that of the original ruby. This is too slow. It is not expected to run fast because that Ruby VM runs on the Java VM. Waiting for machines to become 20 times faster seems to be the only way.

However, the overall impression I got was that it’s way better than I imagined.

NETRuby

If it can run with Java, it should also run with C#. Therefore, a Ruby written in C# appeared, “NETRuby\footnote{NETRuby http://sourceforge.jp/projects/netruby/}”. The author is Mr. arton.

Because I don’t have any .NET environment at hand, I checked only the source code, but according to the author,

more than anything, it’s slow
it has only a few class libraries
the compatibility of exception handling is not good

such things are the problems. But instance_eval is in effect (astounding!).

How to join ruby development

ruby’s developer is really Mr. Matsumoto as an individual; regarding the final decision about the direction ruby will take, he has the definitive authority. But at the same time, ruby is open source software, so anyone can join the development. Joining means that you can offer your opinions or send patches. Below, I describe concretely how to join.

In ruby’s case, the mailing list is at the center of the development, so it’s good to join the mailing list. There are currently three mailing lists at the center of the community: ruby-list, ruby-dev, and ruby-talk. ruby-list is a mailing list for “anything relating to Ruby” in Japanese. ruby-dev is for the development version of ruby; this is also in Japanese. ruby-talk is an English mailing list. The way to join is shown on the page “mailing lists” at Ruby’s official site\footnote{Ruby’s official site: http://www.ruby-lang.org/ja/}. For these mailing lists, read-only members are also welcome, so I recommend just joining first and watching the discussions to get a feel for them.

Though Ruby’s activity started in Japan, recently it is sometimes said that “the main authority now belongs to ruby-talk”. But the center of the development is still ruby-dev. Because the people who have commit rights to ruby (e.g. core members) are mostly Japanese, the difficulty of and reluctance toward using English naturally lead them to ruby-dev. If there are more core members who prefer to use English, the situation could change, but meanwhile the core of ruby’s development will probably remain ruby-dev.

However, it’s bad if people who cannot speak Japanese cannot join the development, so currently the summary of ruby-dev is translated once a week and posted to ruby-talk. I also help with that summarizing, but only three people do it in turn now, so the situation is really harsh. People to help summarize are always in demand. If you think you’re a person who can help, I’d like you to say so on ruby-list.

And as a last note, source code alone is not enough for a piece of software. It’s necessary to prepare various documents and maintain web sites. And people who take care of these kinds of things are always in short supply. There’s also a mailing list for document-related activities, but as a first step you just have to propose “I’d like to do something” on ruby-list. I’ll answer as much as possible, and other people will respond too.

Finale

The long journey of this book is going to end now. As there was a limit on the number of pages, explaining all of the parts comprehensively was impossible; however, I told everything I could tell about ruby’s core. I won’t add anything more here. If there are still things you didn’t understand, I’d like you to investigate them by reading the source code yourself as much as you want.

The original work is Copyright © 2002 - 2004 Minero AOKI.

Translated by Vincent ISAMBART and Clifford Escobar CAOILE

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License