
Experiments with the Negotiated Boolean Queries of the TREC 2007 Legal Discovery Track

Stephen Tomlinson

Open Text Corporation

2007 Nov 8

Overview

• who won the boolean query “negotiations”?
• can dropping the boolean operators improve on the boolean run’s Recall@B?
• did the boolean keywords (synonyms) improve on the natural language request text?
• can just relaxing the proximity constraints improve Recall@B?
• can blind feedback improve Recall@B?
• can a fusion of vector and boolean approaches improve Recall@B?

3 Boolean Queries

• Defendant – initial boolean query proposed by the defendant
• Plaintiff – rejoinder boolean query from the plaintiff
• Final – final negotiated boolean query

Topic 74: “All scientific studies expressly referencing health effects tied to indoor air quality.”

Defendant: "health effect!" w/10 "air quality"

Plaintiff: (scien! OR stud! OR research) AND ("air quality" OR health)

Final: (scien! OR stud! OR research) AND ("air quality" w/15 health)

Topic 74 Boolean Results

Defendant: "health effect!" w/10 "air quality"
– 2691 matches, 82% precision, 3% recall

Plaintiff: (scien! OR stud! OR research) AND ("air quality" OR health)
– 858,700 matches, 64% precision@25000 (ranked), 25% recall@25000 (ranked)

Final: (scien! OR stud! OR research) AND ("air quality" w/15 health)
– 20,516 matches, 77% precision, 22% recall
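The measures quoted above can be sketched as follows. This is a minimal illustration on made-up document ids, not the track's evaluation code (the track figures are estimates). It assumes that B in Recall@B is the number of documents matched by the reference boolean query, so Recall@B is the recall of the top B documents of a ranked run; the function names are hypothetical.

```python
def precision(retrieved, relevant):
    """Fraction of retrieved documents that are relevant."""
    retrieved = list(retrieved)
    if not retrieved:
        return 0.0
    return sum(1 for d in retrieved if d in relevant) / len(retrieved)


def recall(retrieved, relevant):
    """Fraction of relevant documents that were retrieved."""
    if not relevant:
        return 0.0
    return len(set(retrieved) & set(relevant)) / len(relevant)


def recall_at(ranked, relevant, k):
    """Recall of the first k documents of a ranked list."""
    return recall(ranked[:k], relevant)


# Toy data (invented for illustration only).
relevant = {"d1", "d2", "d3", "d4"}
boolean_matches = ["d1", "d2", "d9"]          # unranked boolean result set
ranked_run = ["d3", "d1", "d7", "d2", "d4"]   # ranked (e.g. vector) run

B = len(boolean_matches)                      # assumed definition of B
print(precision(boolean_matches, relevant))   # precision of the boolean set
print(recall(boolean_matches, relevant))      # recall of the boolean set
print(recall_at(ranked_run, relevant, B))     # Recall@B of the ranked run
```

Cutting the ranked run at B gives it the same number of documents as the boolean set, so the two recall values are comparable.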

Topic 74: Missed Relevant Documents

Final Boolean: (scien! OR stud! OR research) AND ("air quality" w/15 health)

Passages in Missed Relevant Documents:
• “… Lowrey A.H. (1980). Indoor air pollution …”
• “assessment … entitled “Respiratory Health Effects of Passive Smoking …”
• “study … funded by the Center for Indoor Air Research”
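To see why such passages fail the final query, a small matcher can be sketched. It is illustrative only and rests on assumptions not stated in the slides: "!" is read as a prefix wildcard, "w/N" as "start positions at most N words apart", and tokens are lowercase alphanumeric runs. It is applied to the quoted fragments only, not to the full documents.

```python
import re


def tokens(text):
    """Lowercase alphanumeric tokens (assumed tokenisation)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def word_matches(token, word):
    """A trailing '!' is treated as a prefix wildcard (assumption)."""
    if word.endswith("!"):
        return token.startswith(word[:-1])
    return token == word


def positions(toks, pattern):
    """Start positions where a term or multi-word phrase occurs."""
    words = pattern.lower().split()
    return [
        i
        for i in range(len(toks) - len(words) + 1)
        if all(word_matches(toks[i + j], w) for j, w in enumerate(words))
    ]


def within(toks, a, b, n):
    """True if some occurrence of a starts within n words of some occurrence of b."""
    return any(abs(i - j) <= n for i in positions(toks, a) for j in positions(toks, b))


def final_topic74(text):
    """(scien! OR stud! OR research) AND ("air quality" w/15 health)"""
    toks = tokens(text)
    left = any(positions(toks, p) for p in ("scien!", "stud!", "research"))
    return left and within(toks, "air quality", "health", 15)


print(final_topic74("A research study on indoor air quality and its impact on health"))
print(final_topic74("study funded by the Center for Indoor Air Research"))
```

The second fragment satisfies the left-hand clause but never contains the phrase "air quality", so the proximity clause rejects it.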

Defendant vs. Final Boolean: Precision

[Figure: per-topic differences; x-axis: 1 to 43; y-axis: Prec, -1 to 1]

• Def. Boolean won 20
• Boolean won 22
• (1 tied)

Mean in (-0.09, 0.15)

Topic 63: 1.00 vs. 0.02 (sugar contract)

Topic 69: 0.00 vs. 0.97 (indoor smoke ventilation)
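The won/tied counts and the "Mean in (a, b)" interval on these comparison slides can be reproduced in outline from per-topic scores. The slides do not say how the interval was computed; the sketch below assumes a normal-approximation 95% interval on the mean per-topic difference, and the scores are invented.

```python
from math import sqrt
from statistics import mean, stdev


def compare(run_a, run_b, z=1.96):
    """Per-topic comparison of two runs.

    Returns (wins for a, wins for b, ties, (low, high)), where the interval is a
    normal-approximation interval for the mean difference a - b (an assumption;
    the original method is not stated).
    """
    diffs = [a - b for a, b in zip(run_a, run_b)]
    wins_a = sum(d > 0 for d in diffs)
    wins_b = sum(d < 0 for d in diffs)
    ties = sum(d == 0 for d in diffs)
    m = mean(diffs)
    half_width = z * stdev(diffs) / sqrt(len(diffs))
    return wins_a, wins_b, ties, (m - half_width, m + half_width)


# Invented per-topic scores for two runs over five topics.
run_a = [0.9, 0.1, 0.5, 0.4, 0.7]
run_b = [0.2, 0.8, 0.5, 0.3, 0.6]
print(compare(run_a, run_b))
```

An interval that contains 0 means the mean difference is not distinguishable from zero at that level, even when one run wins more topics than the other.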

Defendant vs. Final Boolean: Recall

[Figure: per-topic differences; x-axis: 1 to 43; y-axis: R@B, -1 to 0]

• Def. Boolean won 0
• Boolean won 42
• (1 tied)

Mean in (-0.27, -0.11)

Topic 77: 0.00 vs. 0.00 (smoke NOT tobacco)

Topic 52: 0.00 vs. 0.98 (boosting crop yields)

Plaintiff vs. Final Boolean: Recall@25000

[Figure: per-topic differences; x-axis: 1 to 43; y-axis: R25000, -0.8 to 0.8]

• Pl. Boolean won 35
• Boolean won 6
• (2 tied)

Mean in (0.03, 0.19)

Topic 59: 0.76 vs. 0.01 (limestone treatment)

Topic 58: 0.24 vs. 0.94 (phosphates and health)

Plaintiff vs. Final Boolean: Recall@B

[Figure: per-topic differences; x-axis: 1 to 43; y-axis: R@B, -0.8 to 0.6]

• Pl. Boolean won 15
• Boolean won 27
• (1 tied)

Mean in (-0.09, 0.04)

Topic 63: 0.73 vs. 0.27 (sugar contract)

Topic 58: 0.18 vs. 0.94 (phosphates and health)

Vector vs. Boolean (Example)

Boolean: (scien! OR stud! OR research) AND ("air quality" w/15 health)

Vector: scien! OR stud! OR research OR air OR quality OR health
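The conversion shown in this example can be sketched as a small rewrite: drop the boolean and proximity operators, split phrases into words, and OR the remaining terms together. The function name is hypothetical, and how the original run treated NOT is not stated; here NOT is simply dropped, which is an assumption.

```python
import re

OPERATORS = {"and", "or", "not"}  # dropping NOT is an assumption


def to_vector_query(boolean_query):
    """Flatten a boolean query into an OR of its terms."""
    # Remove proximity operators such as w/15.
    text = re.sub(r"\bw/\d+\b", " ", boolean_query, flags=re.IGNORECASE)
    # Keep words, with an optional trailing '!' wildcard; quotes and parentheses fall away.
    words = re.findall(r"[A-Za-z0-9]+!?", text)
    terms = []
    for word in words:
        if word.lower() in OPERATORS:
            continue
        if word not in terms:  # keep first occurrence only
            terms.append(word)
    return " OR ".join(terms)


boolean = '(scien! OR stud! OR research) AND ("air quality" w/15 health)'
print(to_vector_query(boolean))
```

Applied to the Topic 74 final query, this yields the same term list as the Vector line above.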

Relevance Ranking

• term frequency dampening (BM25)
  – wildcard variants treated as same term
  – for boolean proximity constraints, only count term occurrences satisfying proximity
  – metadata + OCR included in document length
• inverse document frequency (log)
  – based on most common variant for wildcards
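A term weight along these lines might look like the sketch below. The slides give no formula, so this uses the textbook BM25 term-frequency component with conventional defaults (k1 = 1.2, b = 0.75) and a plain log(N / df) for the inverse document frequency; all of these, and the function name, are assumptions. Wildcard variants are pooled into one term frequency, and the document frequency is taken from the most common variant, as the bullets describe.

```python
from math import log


def wildcard_weight(variant_tfs, variant_dfs, doc_len, avg_doc_len, n_docs,
                    k1=1.2, b=0.75):
    """Weight of one (possibly wildcarded) query term in one document.

    variant_tfs: occurrences of each matching variant in the document
    variant_dfs: collection document frequency of each variant
    doc_len:     document length (per the slide, metadata + OCR text)
    """
    tf = sum(variant_tfs.values())          # variants treated as the same term
    if tf == 0:
        return 0.0
    df = max(variant_dfs.values())          # most common variant
    norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    dampened_tf = tf * (k1 + 1) / (tf + norm)
    return dampened_tf * log(n_docs / df)


# Invented counts for the wildcard stud! in one document.
w = wildcard_weight(
    variant_tfs={"study": 2, "studies": 1},
    variant_dfs={"study": 1000, "studies": 400},
    doc_len=500, avg_doc_len=500, n_docs=100000,
)
print(round(w, 3))
```

For a proximity clause, only the occurrences that satisfy the proximity constraint would be counted in `variant_tfs`. The dampening means repeated occurrences add less and less: the term-frequency factor can never exceed k1 + 1.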

Vector vs. Boolean: Recall@B

[Figure: per-topic differences; x-axis: 1 to 43; y-axis: R@B, -1 to 0.6]

• Vector won 16
• Boolean won 26
• (1 tied)

Mean in (-0.13, 0.02)

Topic 63: 0.79 vs. 0.27 (sugar contract)

Topic 58: 0.08 vs. 0.94 (phosphates and health)

Topic 58: “… health problems caused by HPF …”

Vector R@B=0.08, Boolean R@B=0.94

• (B=8183, estRel = 1151)

Phosphat! w/75 (caus! OR relat! OR assoc! OR derive! OR correlat!) w/75 (health OR disorder! OR toxic! OR "chronic fatigue" OR dysfunction! OR irregular OR memor! OR immun! OR myopath! OR liver! OR kidney! OR heart! OR depress! OR loss OR lost)

• vector matches often didn’t mention “Phosphat!”

Topic 72: “… chemical process(es) which result in onions … making persons cry”

Vector R@B=0.03, Boolean R@B=0.78

• (B=119, estRel = 98)

((scien! OR research! OR chemical) w/25 onion!) AND (cries OR cry! OR tear!)

• proximity clause found some long documents with just one reference to onions’ effects

Topic 63: “… exclusivity clause in a sugar contract …”

Vector R@B=0.79, Boolean R@B=0.27

• (B=294, estRel = 18)

(Sugar w/20 (contract! OR agreement! OR deal!)) AND exclusiv!

• boolean missed “U.S. sugar quota law”

Request vs. Vector: R@25000

[Figure: per-topic differences; x-axis: 1 to 43; y-axis: R25000, -0.4 to 1]

• Req. Vector won 21
• Vector won 22
• (0 tied)

Mean in (0.00, 0.13)

Topic 87: 1.00 vs. 0.13 (SEC reporting)

Topic 84: 0.64 vs. 0.91 (1960s films)

Impact of Doubling Proximity Distances: Recall@B

[Figure: per-topic differences; x-axis: 1 to 43; y-axis: R@B, -0.4 to 0.1]

• 2x-Prox Boolean won 14
• Boolean won 8
• (21 tied)

Mean in (-0.03, 0.02)

Topic 61: 0.49 vs. 0.44 (waste treatment)

Topic 72: 0.39 vs. 0.78 (onions effect)
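Doubling the proximity distances is a mechanical rewrite of the query string. A minimal sketch, assuming the only proximity syntax is w/N (the function name is hypothetical):

```python
import re


def double_proximity(query):
    """Rewrite every w/N operator as w/2N; other parts of the query are untouched."""
    return re.sub(
        r"\bw/(\d+)\b",
        lambda m: f"w/{2 * int(m.group(1))}",
        query,
        flags=re.IGNORECASE,
    )


topic72 = "((scien! OR research! OR chemical) w/25 onion!) AND (cries OR cry! OR tear!)"
print(double_proximity(topic72))
```

A query with no proximity operator is returned unchanged by this rewrite.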

Impact of Blind Feedback: Recall@B

[Figure: per-topic differences; x-axis: 1 to 43; y-axis: R@B, -1 to 0.6]

• Boolean+BF won 16
• Boolean won 21
• (6 tied)

Mean in (-0.12, 0.03)

Topic 90: 0.64 vs. 0.10 (sales in England)

Topic 58: 0.01 vs. 0.94 (phosphates and health)
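The slides do not describe the blind feedback method used. As a generic illustration of the idea only, the sketch below expands a query with the terms that occur in the most top-ranked documents of an initial run, treating those documents as if they were relevant. The function name, parameters and data are all hypothetical.

```python
from collections import Counter


def expand_query(query_terms, ranked_docs, top_docs=2, new_terms=2):
    """Add the terms found in the most top-ranked documents to the query.

    ranked_docs: documents (lists of tokens) in rank order from an initial run.
    """
    counts = Counter()
    for doc in ranked_docs[:top_docs]:
        # Count each candidate term once per document.
        counts.update(t for t in set(doc) if t not in query_terms)
    # Sort by count (descending), then alphabetically for a deterministic result.
    candidates = sorted(counts, key=lambda t: (-counts[t], t))
    return list(query_terms) + candidates[:new_terms]


# Invented example.
initial_ranking = [
    ["sales", "england", "export", "tariff"],
    ["england", "export", "london"],
    ["sugar"],
]
print(expand_query(["sales", "england"], initial_ranking))
```

If the top-ranked documents are off-topic, the added terms pull the query further away from the request, which is one way feedback can lower recall on a topic.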

Fusion of Boolean, Request and Vector: Recall@B

[Figure: per-topic differences; x-axis: 1 to 43; y-axis: R@B, -1 to 0.4]

• Fusion won 20
• Boolean won 20
• (3 tied)

Mean in (-0.08, 0.03)

Topic 65: 0.88 vs. 0.67 (candy packaging)

Topic 58: 0.10 vs. 0.94 (phosphates and health)
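How the three runs were fused is not stated in the slides. To illustrate the general idea of combining several rankings, the sketch below uses reciprocal rank fusion, a generic rank-based technique swapped in for illustration and not the method behind these results. The constant k = 60 is a common default, and the document ids are invented.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists: each document scores the sum of 1 / (k + rank) over the lists."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Invented rankings for the three runs.
boolean_run = ["d1", "d2", "d3"]
request_run = ["d2", "d1", "d4"]
vector_run = ["d2", "d5", "d1"]

fused = reciprocal_rank_fusion([boolean_run, request_run, vector_run])
print(fused)
```

Documents ranked highly by several runs rise to the top, while a document found by only one run is kept but ranked lower.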

Conclusions

• final negotiated boolean query often had substantially lower recall than the plaintiff boolean query

• boolean operators (AND, proximity) often have value

• blind feedback and fusion did not improve the boolean run’s Recall@B (on average)
