WWW2009 MADRID! Query Processing


Post on 23-Feb-2016


DESCRIPTION

WWW2009 MADRID! Query Processing. Presented at the WWW paper reading group (webkai) by Naoaki Okazaki, University of Tokyo (Tsujii Laboratory). The slides for this talk are available at http://www.chokkan.org/www2009/. First paper covered: H. Yan, S. Ding, and T. Suel (NYU & Yahoo! Research), "Inverted index compression and query processing with optimized document ordering".

TRANSCRIPT

WWW2009 MADRID! Query Processing

WWW paper reading group (webkai)
URL: http://www.chokkan.org/www2009/

H. Yan, S. Ding, and T. Suel (NYU & Yahoo! Research). Inverted index compression and query processing with optimized document ordering. WWW2009, pp. 401-410.

Inverted list (posting list), where each entry is [docID, frequency, [positions]]:
  [[56, 1, [34]], [198, 2, [14,23]], [1034, 1, [43]]]

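Reducing d-gaps by reassigning docIDs

docID assignment: assign IDs in URL order, so that pages of the same site get neighboring IDs.
  www.chokkan.org: [[56, 1, [34]], [57, 2, [14,23]], [60, 1, [43]]]
docID to d-gap (each d-gap is the difference from the previous docID, minus 1):
  [[55, 1, [34]], [0, 2, [14,23]], [2, 1, [43]]]
When docIDs are close together, the d-gaps are close to 0.

Var-Byte Coding

Each byte has the form fddddddd (f: flag; d: data). When f is 1, the value continues in the next 8 bits.
  142 = (10001110)b -> 10000001 00001110
  2 = (10)b -> 00000010

Rice Coding

Code form: q ones, then a 0, then r in m bits, where q = int(n / 2^m) and r = n % 2^m.
With m = 4:
  142 (q = 8, r = 14) -> 1111111101110
  2 (q = 0, r = 2) -> 00010

Improvements over Var-Byte Coding (1/3)

Simple9 (S9): ssssdddd dddddddd dddddddd dddddddd
The 4-bit status s selects how the 28 data bits d are divided:
  0000: 28 x 1 bit
  0001: 14 x 2 bit
  0010: 9 x 3 bit
  0011: 7 x 4 bit
  0100: 5 x 5 bit
  0101: 4 x 7 bit
  0110: 3 x 9 bit
  0111: 2 x 14 bit
  1000: 1 x 28 bit
Example: (142, 2, 17) in S9 -> 01100100 01110000 00001000 00100010
S16: a variant with 16 cases.

Improvements over Var-Byte Coding (2/3)

PForDelta: compresses N numbers at a time (e.g., N = 128). Choose b so that 90% of the numbers are below 2^b; numbers of 2^b or more are stored as exceptions.
Example: (23, 41, 8, 12, 30, 68, 18, 45, 21, 9, ...)
90% of the numbers are below 2^5 = 32, so m = 5:
  |1 23 3 8 12 30 1 18 2 21 9 ... | 45 68 41|
The leading 1 is the position of the first exception, each exception's slot holds the distance to the next exception, and the exception values are stored at the end in reverse order.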
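The worked examples on these slides (d-gaps, Var-Byte, Rice, and the Simple9 word for (142, 2, 17)) can be reproduced with a short Python sketch. This is only an illustration, not the code from the paper: the function names are hypothetical, the d-gap convention (difference minus 1, starting from 0) is inferred from the slide's numbers, and the Simple9 bit layout (status first, values left-aligned, zero padding at the end) is assumed from the slide's example word.

```python
# Illustrative sketch; all function names are hypothetical.

# Simple9 cases as listed on the slide: (how many values, bits per value).
S9_CASES = [(28, 1), (14, 2), (9, 3), (7, 4), (5, 5), (4, 7), (3, 9), (2, 14), (1, 28)]


def to_dgaps(doc_ids):
    """Turn sorted docIDs into d-gaps, assuming gap = id - previous - 1."""
    gaps, prev = [], 0
    for d in doc_ids:
        gaps.append(d - prev - 1)
        prev = d
    return gaps


def varbyte_encode(n):
    """Var-Byte: 7 data bits per byte; flag bit 1 on every byte but the last."""
    groups = [n & 0x7F]
    n >>= 7
    while n > 0:
        groups.append(n & 0x7F)
        n >>= 7
    groups.reverse()  # most significant 7-bit group first
    out = [g | 0x80 for g in groups[:-1]] + [groups[-1]]
    return " ".join(format(b, "08b") for b in out)


def rice_encode(n, m):
    """Rice: q ones, a terminating 0, then r written in m bits."""
    q, r = n >> m, n & ((1 << m) - 1)
    return "1" * q + "0" + format(r, "0{}b".format(m))


def simple9_pack_word(values):
    """Pack `values` into one 32-bit Simple9 word (layout assumed from the slide)."""
    for status, (count, width) in enumerate(S9_CASES):
        if len(values) == count and all(0 <= v < (1 << width) for v in values):
            bits = format(status, "04b")
            bits += "".join(format(v, "0{}b".format(width)) for v in values)
            bits = bits.ljust(32, "0")  # unused trailing bits are zero
            return " ".join(bits[i:i + 8] for i in range(0, 32, 8))
    raise ValueError("values do not fill exactly one Simple9 case")


if __name__ == "__main__":
    print(to_dgaps([56, 57, 60]))
    print(varbyte_encode(142), "|", varbyte_encode(2))
    print(rice_encode(142, 4), "|", rice_encode(2, 4))
    print(simple9_pack_word([142, 2, 17]))
```

For simplicity, simple9_pack_word only accepts a list that exactly fills one case; a full encoder would greedily pick the case that packs the most of the remaining values.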

Interpolative Coding (3/3)
  ... 5 bit ... 4 ...

Topics: PForDelta (PFD); move-to-front coding; docID assignment; TREC GOV2; 50%

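Data: TREC GOV2, a 2004 crawl of the .gov domain (about 25 million documents).

NewPFD

A variant of PFD: for a number of 2^b or more, the slot keeps the low b bits, and the exception information is stored in separate arrays.
Example: (23, 41, 8, 12, 30, 68, 18, 45, 21, 9, ...)
  m = 5: (23 9 8 12 30 4 18 13 21 9 ...)
  Exception offsets: (1 3 1 ...)
  Low 5 bits of the exceptions 41, 68, 45: (9 4 13 ...)

OptPFD

Chooses m for each block instead of fixing it in advance.

Methods compared: PForDelta, S16, NewPFD, OptPFD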
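The NewPFD example can be checked with a small Python sketch. It is a hedged illustration rather than the authors' implementation: the function names are hypothetical, and it assumes that decoding needs the bits above the low b bits of each exception, so it keeps those high bits (1, 2, 1 for 41, 68, 45) in a side array next to the offsets. The slide's tuple (9 4 13) corresponds to the low 5 bits, which this sketch leaves in the slots. Compressing the two side arrays is omitted.

```python
# Illustrative sketch of a NewPFD-style split; names are hypothetical.

def newpfd_split(values, b):
    """Split a block into b-bit slots, exception offsets, and exception high bits."""
    mask = (1 << b) - 1
    slots, offsets, high_bits = [], [], []
    prev = -1  # index of the previous exception (-1 before the first one)
    for i, v in enumerate(values):
        slots.append(v & mask)  # every slot keeps only the low b bits
        if v > mask:  # v >= 2^b, so it is an exception
            offsets.append(i - prev - 1)  # slots skipped since the last exception
            high_bits.append(v >> b)  # assumed: bits above the low b bits
            prev = i
    return slots, offsets, high_bits


def newpfd_join(slots, offsets, high_bits, b):
    """Rebuild the original values from the three arrays."""
    values = list(slots)
    pos = -1
    for off, hi in zip(offsets, high_bits):
        pos += off + 1
        values[pos] |= hi << b
    return values


if __name__ == "__main__":
    block = [23, 41, 8, 12, 30, 68, 18, 45, 21, 9]
    slots, offsets, high_bits = newpfd_split(block, 5)
    print(slots, offsets, high_bits)
    print(newpfd_join(slots, offsets, high_bits, 5) == block)
```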

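Compressing frequencies when docIDs follow URL order

Move-To-Front (MTF): the coding also used with the Burrows-Wheeler transform.
Most-Likely-Next (MLN): uses a Q x Q table; in the row for value i, entry j holds the (j+1)-th most likely value to follow i, and the next value is coded as j.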

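Move-To-Front example

  #   Input   List after the step   Output so far
  0           (1, 2, 3, 4, 5)       ()
  1   5       (5, 1, 2, 3, 4)       (5)
  2   5       (5, 1, 2, 3, 4)       (5, 1)
  3   5       (5, 1, 2, 3, 4)       (5, 1, 1)
  4   3       (3, 5, 1, 2, 4)       (5, 1, 1, 4)
  5   2       (2, 3, 5, 1, 4)       (5, 1, 1, 4, 4)
  6   2       (2, 3, 5, 1, 4)       (5, 1, 1, 4, 4, 1)

The input (5, 5, 5, 3, 2, 2) is coded by MTF as (5, 1, 1, 4, 4, 1).

MTF, MLN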
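The Move-To-Front trace for the input (5, 5, 5, 3, 2, 2) can be reproduced with the following Python sketch. It is a minimal illustration with hypothetical function names, assuming 1-based positions and an initial list (1, 2, 3, 4, 5), as in the example.

```python
# Illustrative sketch of Move-To-Front coding; names are hypothetical.

def mtf_encode(values, alphabet):
    """Replace each value by its 1-based position in the list, then move it to the front."""
    table = list(alphabet)
    codes = []
    for v in values:
        pos = table.index(v)
        codes.append(pos + 1)  # 1-based position, as in the example
        table.insert(0, table.pop(pos))
    return codes


def mtf_decode(codes, alphabet):
    """Invert mtf_encode by replaying the same list updates."""
    table = list(alphabet)
    values = []
    for c in codes:
        v = table.pop(c - 1)
        values.append(v)
        table.insert(0, v)
    return values


if __name__ == "__main__":
    codes = mtf_encode([5, 5, 5, 3, 2, 2], [1, 2, 3, 4, 5])
    print(codes)
    print(mtf_decode(codes, [1, 2, 3, 4, 5]))
```

Runs of the same value turn into runs of 1s, which is why MTF suits frequency values that repeat locally.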

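RuralCafe

J. Chen, L. Subramanian, J. Li (NYU). RuralCafe: Web Search in the Rural Developing World. WWW2009, pp. 411-420.

Web access in developing regions:
  100-1000 users share 128 Kbps
  BPO: 50-100 users share 64 Kbps
  SMS, WiFi

News from 10 September 2009:
  BBC News: http://news.bbc.co.uk/2/hi/africa/8248056.stm
  ITMedia News: http://www.itmedia.co.jp/news/articles/0909/10/news075.html
  An 11-month-old pigeon carried 4 GB of data over 80 km: 1 hour 8 minutes in flight, 2 hours 6 minutes 57 seconds including the data transfer. In the same time, the network line had transferred 4% of the data.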

RuralCafe

OR
RuralCafe

N-gram

GPU

S. Ding, J. He, H. Yan, T. Suel (NYU & Yahoo! Research). Using Graphics Processors for High Performance IR Query Processing. WWW2009, pp. 421-430.

GPU, join
GPU: NVIDIA GeForce 8800 GTS (640MB; 32 threads)
Compute Unified Device Architecture (CUDA)

Inclusive parallel prefix sum: [a0, a1, ..., an-1] -> [a0, a0+a1, ..., a0+a1+...+an-1]
Exclusive parallel prefix sum: [a0, a1, ..., an-1] -> [0, a0, a0+a1, ..., a0+a1+...+an-2]
for (i =1;i