A Robust Partitioning Scheme for Ad-Hoc Query Workloads
Anil Shanbhag, MIT
J/W Alekh Jindal, Sam Madden, Jorge Quiane, Aaron J. Elmore
Microsoft, MIT, QCRI, Univ. Chicago
Today
Data collection is cheap => Lots of data!
Data Partitioning
Find average order size for all orders between Sept 10 and Sept 11, 2017
Data Skipping - Skip data blocks that are not needed
10% selectivity query => 10x faster if data is partitioned on the selection predicate
Order date
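The data-skipping idea above can be sketched in a few lines. This is only an illustration, not the system's code: the `Block` class, its `min_date`/`max_date` metadata fields, and the one-block-per-day layout are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Block:
    # Hypothetical per-block metadata: the range of order dates stored in the block.
    min_date: date
    max_date: date
    rows: list  # list of (order_date, order_size) tuples

def avg_order_size(blocks, lo, hi):
    """Average order size for lo <= order_date <= hi, skipping blocks by metadata."""
    total, count, scanned = 0.0, 0, 0
    for b in blocks:
        # Skip the block when its date range cannot overlap the query range.
        if b.max_date < lo or b.min_date > hi:
            continue
        scanned += 1
        for d, size in b.rows:
            if lo <= d <= hi:
                total += size
                count += 1
    return (total / count if count else None), scanned

# Assumed layout: 20 blocks, one per day, partitioned on order date.
days = [date(2017, 9, d) for d in range(1, 21)]
blocks = [Block(d, d, [(d, 10.0 * d.day)]) for d in days]

avg, scanned = avg_order_size(blocks, date(2017, 9, 10), date(2017, 9, 11))
print(avg, scanned, len(blocks))
```

With this layout the query touches 2 of 20 blocks, which is the 10% selectivity case from the slide.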
The Problem
Analytics = Ad-Hoc/Exploratory Analysis + Recurring Workloads
Focus of existing work:
Give workload => Return partitioning layout
Problems:
1. Tedious to collect workload
2. May not be known upfront
3. Changes over time
How to get the benefits of partitioning in this case?
Our Approach
Do everything adaptively!
Two step process:
1. Upfront, load the dataset partitioned
2. As users query, incrementally improve the partitioning of the data
In distributed storage systems like HDFS, files are broken into blocks (128 MB chunks)
A <= 5 and B <= 7
Upfront Partitioning
> Instead of partitioning by size, partition by attributes.
> Same number of blocks created as in HDFS. Each block now has additional metadata
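A minimal sketch of how a partitioning tree over attributes might route tuples to blocks and prune blocks for a query. The `Node`/`Leaf` classes, the `route` and `blocks_for` helpers, and the tree shape (root on A <= 5, left child on B <= 7) are assumptions for illustration; the slides do not show the actual data structure.

```python
import math
from dataclasses import dataclass
from typing import Union

@dataclass
class Leaf:
    block_id: int

@dataclass
class Node:
    attr: str
    cut: float
    left: "Tree"   # tuples with attr <= cut
    right: "Tree"  # tuples with attr > cut

Tree = Union[Leaf, Node]

def route(tree, row):
    """Return the block id a tuple is written to at load time."""
    while isinstance(tree, Node):
        tree = tree.left if row[tree.attr] <= tree.cut else tree.right
    return tree.block_id

def blocks_for(tree, pred):
    """Block ids that may hold matches; pred maps attr -> (lo, hi), inclusive."""
    if isinstance(tree, Leaf):
        return [tree.block_id]
    lo, hi = pred.get(tree.attr, (-math.inf, math.inf))
    out = []
    if lo <= tree.cut:   # query range overlaps the "<= cut" side
        out += blocks_for(tree.left, pred)
    if hi > tree.cut:    # query range overlaps the "> cut" side
        out += blocks_for(tree.right, pred)
    return out

# Assumed tree: A <= 5 at the root, B <= 7 below its left branch.
tree = Node("A", 5, Node("B", 7, Leaf(0), Leaf(1)), Leaf(2))

print(route(tree, {"A": 3, "B": 9}))
print(blocks_for(tree, {"A": (-math.inf, 3)}))
print(blocks_for(tree, {"B": (8, 10)}))
```

A query on A <= 3 skips block 2 entirely, while a query on B alone still has to read block 2 because that branch is not partitioned on B.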
Adaptive Re-Partitioning
When a user submits a query, the optimizer tries to improve the partitioning by reorganizing the partitioning tree
Here, if queries ask A <= 3 many times, replace B7 by A3
Done on datasets which are O(1 TB) with ~8000-node partition trees.
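The "replace B7 by A3" step can be illustrated on a toy tree. This is a sketch under assumptions: trees are nested tuples `(attr, cut, left, right)`, leaf blocks are plain lists of rows, and `rows_under` and `replace_node` are hypothetical helpers that collapse whatever sits below the replaced node into two new leaves.

```python
def rows_under(tree):
    """Collect all rows stored in the leaf blocks below `tree`."""
    if isinstance(tree, list):
        return list(tree)
    _attr, _cut, left, right = tree
    return rows_under(left) + rows_under(right)

def replace_node(tree, attr, cut):
    """Re-split the data below `tree` on a new predicate (attr <= cut)."""
    rows = rows_under(tree)
    left = [r for r in rows if r[attr] <= cut]
    right = [r for r in rows if r[attr] > cut]
    return (attr, cut, left, right)

# Toy data: A in 1..10, B in {5, 9}.
rows = [{"A": a, "B": b} for a in range(1, 11) for b in (5, 9)]
low_a = [r for r in rows if r["A"] <= 5]
tree = ("A", 5,
        ("B", 7,
         [r for r in low_a if r["B"] <= 7],
         [r for r in low_a if r["B"] > 7]),
        [r for r in rows if r["A"] > 5])

# Replace the B7 node under the root's left branch by A3.
new_tree = (tree[0], tree[1], replace_node(tree[2], "A", 3), tree[3])

# Rows read by a query A <= 3: both B7 leaves before, one A3 leaf after.
rows_before = len(tree[2][2]) + len(tree[2][3])
rows_after = len(new_tree[2][2])
print(rows_before, rows_after)
```

Only the data below the replaced node is rewritten; the right branch of the root is untouched.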
System Architecture
Predicated Scan Query Example:
FIND employees WITH Age < 30 AND 20k < Salary < 40k
1. Upfront Partitioner
Goal: Generate a partitioning tree WITHOUT an upfront query workload
> Generates a tree with heterogeneous branching
> Balances the partitioning benefit across all attributes
Allocation
Goal: Balance partitioning benefit across attributes
Allocation of attribute j ~ average partitioning on attribute j:
$\mathrm{Allocation}_j = \sum_{\text{all nodes } i} n_{ij}\, c_{ij}$
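A small worked example of the allocation sum. The slide does not define the two factors, so the reading used here is an assumption: n_ij is taken as the number of ways node i splits on attribute j (2 for a binary node), and c_ij as the fraction of the dataset that reaches node i. The `allocations` helper and the example tree are hypothetical.

```python
from collections import defaultdict

def allocations(nodes):
    """Sum n * c per attribute over all tree nodes.

    nodes: iterable of (attribute, n, c) where, by assumption,
    n = fan-out of the node and c = fraction of the data it partitions.
    """
    alloc = defaultdict(float)
    for attr, n, c in nodes:
        alloc[attr] += n * c
    return dict(alloc)

# Assumed tree: root splits all data on A; its two children split
# half of the data each, one on B and one on C.
nodes = [("A", 2, 1.0), ("B", 2, 0.5), ("C", 2, 0.5)]
alloc = allocations(nodes)
print(alloc)
```

Under this reading, A receives twice the allocation of B or C because it partitions the whole dataset rather than half of it.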
Upfront Partitioning Algorithm
Attribute Allocations => Partitioning Tree
Allocations: Uniform if no workload information, Weighted if we have prior workload information
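One way to turn attribute allocations into a partitioning tree is a greedy build: at each node, split on the attribute with the most allocation left, then charge that attribute for the split. The sketch below is not the authors' algorithm; the `build` and `leaves` functions, the median split, the `2 * frac` charge, and the `min_rows` stopping rule are all assumptions for illustration.

```python
import statistics

def build(rows, remaining, frac=1.0, min_rows=2):
    """Greedy sketch: returns a leaf (list of rows) or (attr, cut, left, right)."""
    if len(rows) < 2 * min_rows:
        return rows  # too small to split further: this is a leaf block
    # Try attributes in order of remaining allocation (ties keep dict order).
    for attr in sorted(remaining, key=remaining.get, reverse=True):
        cut = statistics.median_low(r[attr] for r in rows)
        left = [r for r in rows if r[attr] <= cut]
        right = [r for r in rows if r[attr] > cut]
        if left and right:
            # Charge the attribute: binary fan-out times the data fraction here.
            remaining[attr] -= 2 * frac
            return (attr, cut,
                    build(left, remaining, frac / 2, min_rows),
                    build(right, remaining, frac / 2, min_rows))
    return rows  # no attribute separates these rows

def leaves(tree):
    """Flatten a tree into its list of leaf blocks."""
    if isinstance(tree, list):
        return [tree]
    return leaves(tree[2]) + leaves(tree[3])

rows = [{"A": a, "B": b} for a in range(4) for b in range(4)]
uniform = {"A": 3.0, "B": 3.0}  # uniform allocation: no workload information
tree = build(rows, uniform)
print(tree[0], tree[2][0], [len(b) for b in leaves(tree)])
```

Passing a weighted dictionary instead of `uniform` (for example, a larger starting value for an attribute known to be queried often) makes that attribute win more splits, which is the sense in which prior workload information can be used.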
2. Adaptive Query Executor
Goal: Return matching tuples + check if the partitioning layout can be improved
Alternatives found via transformations on the partitioning tree
1. Swap Rule
2. Pushup Rule
3. Rotate Rule
Getting a plan
Cost Model
The system maintains a window W of past queries
Compute Benefit and Repartitioning Cost for the best plan
Repartitioning ONLY happens when the reduction in the total cost of the query workload is greater than the re-partitioning cost.
This avoids constant re-partitioning due to random query sequences and bounds the worst-case impact.
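The repartitioning rule above can be sketched as a comparison over the query window. The `QueryWindow` class, the cost functions, and the numbers are hypothetical; cost is assumed here to be the number of blocks a query reads under a layout.

```python
from collections import deque

class QueryWindow:
    """Keeps the last `size` queries and decides whether a new layout pays off."""

    def __init__(self, size):
        self.queries = deque(maxlen=size)  # old queries fall out automatically

    def add(self, query):
        self.queries.append(query)

    def should_repartition(self, cost_now, cost_new, repartition_cost):
        # Benefit = reduction in total cost of the window under the new layout.
        benefit = sum(cost_now(q) - cost_new(q) for q in self.queries)
        return benefit > repartition_cost

# Assumed costs (blocks read): the candidate layout helps A-queries, hurts B-queries.
current = {"A": 10, "B": 4}
candidate = {"A": 4, "B": 10}
REPARTITION_COST = 20  # assumed cost of rewriting the affected blocks

steady = QueryWindow(size=5)
for q in ["A", "A", "A", "A"]:
    steady.add(q)

mixed = QueryWindow(size=5)
for q in ["A", "B", "A", "B"]:
    mixed.add(q)

print(steady.should_repartition(current.get, candidate.get, REPARTITION_COST))
print(mixed.should_repartition(current.get, candidate.get, REPARTITION_COST))
```

A steady run of A-queries accumulates enough benefit to justify the rewrite, while an alternating sequence nets out to zero benefit and leaves the layout alone.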
Performance
4 metrics
1) Load time
2) Time taken by first query
3) Aggregate runtime over a workload
4) Incremental improvement with workload hints
Load Time
TPC-H: Scale Factor 200 + De-normalized. Data size: 1.4 TB
Loading performance: 1.38 times slower than HDFS
Load time scales almost linearly with data size and is independent of the number of columns
Time taken by first query
On average: 45% better than full scan, 20% better than k-d tree
Aggregate Workload Runtime
[Figure: Time taken (in s) vs. Query no. (0 to 200), one panel each for full scan, range, range2, Amoeba]
Workload: 200 queries generated from random initialization of 8 query templates of the TPC-H benchmark
full scan: Baseline
range: partitions on orderdate (1 per date), 1.88x better
range2: partitions on orderdate (64), r_name (4), c_mktsegment (4), quantity (8), 3.48x better
Amoeba: 3.84x better than baseline
Workload Hints
[Figure: Time taken (in s) vs. Query no. (0 to 200), one panel each for default and better init]
Better Init: Starts with custom allocation to mimic range2
6.67x better than full scan
Filtering ratio: default: 0.81, better init: 0.9
Conclusion
• Amoeba is a distributed storage system based on an adaptive data partitioning scheme
• Low loading overhead
• Improved first query performance
• Adapts to changes and significantly improves workload runtime
• Can exploit workload hints
• Allows analysts to get started right away and reap the benefits of partitioning without an upfront workload