Searching Political Data by Strategy

Roberto Cornacchia, Jaap Kamps, Wouter Alink, Arjen P. de Vries

Upload: arjen-de-vries

Post on 27-Jun-2015


DESCRIPTION

Presentation on the Exposé demonstrator, which enables search of Dutch parliamentary proceedings (the Political Mashup data collection).

TRANSCRIPT

Page 1: Searching Political Data by Strategy

Searching Political Data by Strategy

Roberto Cornacchia, Jaap Kamps, Wouter Alink, Arjen P. de Vries

[email protected]

Page 2: Searching Political Data by Strategy

Search by Strategy

An iterative 2-stage search process
    Express domain knowledge as high-level search strategies
    Generate search engine from the strategy
        A dynamic REST API
        UI controls for unspecified parameters

Separate search strategy definition (the how) from actual searching and browsing of data collections (the what)
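To make the split between definition and use concrete, here is a minimal sketch in Python. It is not Spinque's implementation: the `Block` and `Strategy` classes, the block names, and the `language` parameter are hypothetical, and the `/p/<name>/<value>` path layout is only an assumption loosely modelled on the demo URLs shown in this deck. The idea is that every parameter the strategy designer leaves unspecified becomes something the searcher fills in at query time.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """One step of a strategy; a parameter set to None is left unspecified."""
    name: str
    params: dict = field(default_factory=dict)


@dataclass
class Strategy:
    """A named sequence of blocks (hypothetical, for illustration only)."""
    name: str
    blocks: list

    def open_parameters(self):
        """Names of the parameters the strategy designer left unspecified."""
        return [key for block in self.blocks
                for key, value in block.params.items() if value is None]

    def endpoint(self, **values):
        """Build a request path by filling in every open parameter."""
        missing = [p for p in self.open_parameters() if p not in values]
        if missing:
            raise ValueError(f"missing parameters: {missing}")
        # Assumed path layout: one /p/<name>/<value> segment per parameter.
        parts = "".join(f"/p/{p}/{values[p]}" for p in self.open_parameters())
        return f"/{self.name}{parts}"


# The "how": defined once by someone with domain knowledge.
strategy = Strategy("demo", [
    Block("rank_text", {"topic": None, "language": "nl"}),
    Block("stemming", {"emphasis_stemming": None}),
])

# The "what": supplied while searching and browsing.
print(strategy.open_parameters())
print(strategy.endpoint(topic="mokken", emphasis_stemming=0))
```

In this sketch the open parameters are exactly what a generated user interface would need controls for, while fixed parameters such as `language` stay hidden inside the strategy.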

Page 3: Searching Political Data by Strategy
Page 4: Searching Political Data by Strategy

https://devel.spinque.com/ExPoSeApp-20130116/?config=demo#dashboard/demo04:/p/topic/Mokken

Page 5: Searching Political Data by Strategy

Search by Strategy captures:

Arbitrary retrieval unit types (not just documents)
    E.g., expert finding, entity search
“Semantic” search
    The building blocks operate on scored triples
Semi-structured search
    Data objects may be structured in hierarchies
Exploratory search
    Use facets as preferences
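One possible reading of "facets as preferences" is that a chosen facet value boosts matching results instead of filtering out the rest. The sketch below illustrates that reading only; the function name, the `boost` factor, and the toy results are all made up for this example and do not come from the Exposé system.

```python
def rerank_with_preference(results, facet, preferred, boost=2.0):
    """Boost results whose facet value matches the preference.

    results: list of (item, score) pairs, where item is a dict of facet values.
    Non-matching results keep their score and stay in the list.
    """
    rescored = [
        (item, score * boost if item.get(facet) == preferred else score)
        for item, score in results
    ]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Made-up results: utterances with a party facet and a relevance score.
results = [
    ({"id": "u1", "party": "A"}, 0.9),
    ({"id": "u2", "party": "B"}, 0.6),
    ({"id": "u3", "party": "A"}, 0.3),
]

reranked = rerank_with_preference(results, facet="party", preferred="B")
print([item["id"] for item, _ in reranked])
```

Under this reading, expressing a preference for party B moves `u2` to the top, but `u1` and `u3` remain available further down the list, which keeps exploration open.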

Page 6: Searching Political Data by Strategy

Exposé

Searching the parliamentary proceedings of the Dutch parliament
    Complete transcripts of everything said in parliament
    Organized by parliamentary session
    Detailing who says what, in what role and context

Page 7: Searching Political Data by Strategy

Exposé

The original data is PDF, transformed into XML by the award-winning Political Mashup project (http://politicalmashup.nl/)

Page 8: Searching Political Data by Strategy

In Politics…

The essence is not only what is said, but also by whom, to whom, and why

Concrete example: Wilders says “knettergek” in parliament (in 2007). Is this remarkable?

Page 9: Searching Political Data by Strategy
Page 10: Searching Political Data by Strategy

“Knettergek” case

The word “knettergek” has been used many times in parliament…

… but never to address a member of the government

Page 11: Searching Political Data by Strategy

Varying result types

Utterances

Person / Party / …

Page 12: Searching Political Data by Strategy

Flexibility

Concrete case: Maarten: “I cannot find Prof. Mokken, who I know has been spoken about in parliament multiple times!”

Page 13: Searching Political Data by Strategy

Flexibility

Default indexing uses stemming and normalization

But… searching for people’s names (and, for that matter, much other domain-specific terminology) can be negatively affected by stemming

“Mokken” is stemmed to “mok”, which leads us to the geographic locations “Mook” and “De Mok”, but not to the famous professor!
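The effect can be reproduced with a small sketch. The stemmer below is deliberately crude and hypothetical; it is not the stemmer or normalizer used by Exposé, and the three documents are made up. The `use_stemming` switch is an assumption standing in for a strategy parameter that turns stemming off.

```python
import re


def toy_stem(token):
    """Crude illustrative stemmer (an assumption, not the real one)."""
    t = token.lower()
    if t.endswith("en") and len(t) > 4:
        t = t[:-2]                            # strip an "-en" suffix
    t = re.sub(r"([aeiou])\1", r"\1", t)      # collapse doubled vowels: "oo" -> "o"
    t = re.sub(r"([^aeiou])\1$", r"\1", t)    # undouble a final consonant: "kk" -> "k"
    return t


def search(query, docs, use_stemming=True):
    """Return ids of documents containing the query term after normalization."""
    normalize = toy_stem if use_stemming else str.lower
    q = normalize(query)
    return [doc_id for doc_id, text in docs.items()
            if q in {normalize(tok) for tok in re.findall(r"\w+", text)}]


# Made-up documents, for illustration only.
DOCS = {
    "u1": "Professor Mokken was mentioned in the debate",
    "u2": "A question about Mook",
    "u3": "A question about De Mok",
}

print(search("Mokken", DOCS, use_stemming=True))   # ['u1', 'u2', 'u3']
print(search("Mokken", DOCS, use_stemming=False))  # ['u1']
```

With stemming on, all three documents collapse onto the same index term, so the professor is buried among place names; with stemming off, only the exact name matches.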

Page 14: Searching Political Data by Strategy
Page 15: Searching Political Data by Strategy
Page 16: Searching Political Data by Strategy

https://devel.spinque.com/ExPoSeApp-20130116/?config=demo#dashboard/demo05:/p/topic/mokken/p/emphasis_stemming/0

Page 17: Searching Political Data by Strategy

Joins to the rescue!

Which house speakers from the Rotterdam harbour say what about “Amsterdam”?

Page 18: Searching Political Data by Strategy

Semantic Search

Diagram labels: biographies, describes, person, utterance
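The question about speakers from the Rotterdam harbour can be sketched as a join across the entities named on this slide. Everything below is an illustrative assumption: the data is invented, `text_score` is a stand-in for a real ranking function, and multiplying scores is just one simple way to combine evidence from the two sides of the join.

```python
import re

# Invented toy data: biographies, "describes" links, and utterances.
BIOGRAPHIES = {
    "bio1": "Worked for years in the Rotterdam harbour",
    "bio2": "Former teacher from Utrecht",
}
DESCRIBES = [("bio1", "p1"), ("bio2", "p2")]   # biography describes person
UTTERANCES = {                                  # utterance id -> (speaker, text)
    "utt1": ("p1", "Amsterdam should invest in rail"),
    "utt2": ("p2", "Amsterdam needs more housing"),
    "utt3": ("p1", "The budget is too tight"),
}


def text_score(query, text):
    """Fraction of query terms found in the text (stand-in for real ranking)."""
    terms = query.lower().split()
    words = set(re.findall(r"\w+", text.lower()))
    return sum(term in words for term in terms) / len(terms)


def speakers_matching(query):
    """Score persons through the biographies that describe them."""
    persons = {}
    for bio_id, person in DESCRIBES:
        score = text_score(query, BIOGRAPHIES[bio_id])
        if score > 0:
            persons[person] = max(persons.get(person, 0.0), score)
    return persons


def utterances_by(persons, topic):
    """Join scored persons with their utterances on the given topic."""
    hits = []
    for utt_id, (speaker, text) in UTTERANCES.items():
        score = persons.get(speaker, 0.0) * text_score(topic, text)
        if score > 0:
            hits.append((utt_id, speaker, score))
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


results = utterances_by(speakers_matching("Rotterdam harbour"), "Amsterdam")
print(results)
```

Only `utt1` survives in this toy example: `utt2` mentions Amsterdam but its speaker has no matching biography, and `utt3` comes from the right speaker but is about something else.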

Page 19: Searching Political Data by Strategy

Advantages

Define and execute custom-built search strategies
    Specialized to the task, or even to the search at hand
Search multiple data sources at once
Explore and refine results interactively
“Search provenance”
    Complete transparency on how search results were obtained

Page 20: Searching Political Data by Strategy

Position Statement

Search professionals think in terms of search strategies already

Let them design their own strategies, and thereby tailor their search engines

So they learn to trust the information retrieval techniques that we claim are effective!