computer assisted research and reporting
Data Mining and Data-Driven Storytelling
Computer Assisted Research and Reporting
Peter Verweij
School of Journalism, Utrecht, The Netherlands
Wikileaks
• Started as a citizen reporters' site
• A whistleblowers' site
• Investigative journalists have always depended on outside help
• 300,000 documents from US diplomatic mail, edited by journalists, give a picture of foreign policy
• Julian Assange on Interpol's list
• What is new is the volume: a result of digitization
• Title: data mining, data-driven storytelling
Nick Davies
Copy/paste or new possibilities
Facts, accuracy and credibility:
Nick Davies: book, interview, sites
http://www.humedia.nl/profiles/blogs/nick-davies-over
Review: http://extra.volkskrant.nl/select/boeken/artikel.php?id=843
Book website: http://www.flatearthnews.net/
Dutch research: http://www.volkskrant.nl/multimedia/article1135829.ece/Eenderde_nieuws_is_voorverpakt
Nick Davies 2
What is the conclusion? Is research or investigative journalism no longer possible?
But: news is everywhere, and the media focus on their own production, e.g. nrcnext and parliamentary reporting
But: digitization offers more possibilities
Example: Twitter networks
Traffic accidents in Utrecht
Some classic US examples
School buses and drunk drivers: drunk-driving convictions; driver's licence numbers; school bus drivers
Hurricane Andrew: damage map related to wind strength; building construction fraud
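The school bus example comes down to matching two record sets on a shared key. Below is a minimal sketch of that matching step, assuming both sets carry a driver's licence number. All records, field names and the function name `match_on_licence` are hypothetical illustrations, not data or code from the original story.

```python
# Hypothetical records for illustration only; not real data.
convictions = [
    {"licence": "A100", "offence": "drunk driving"},
    {"licence": "B200", "offence": "drunk driving"},
    {"licence": "C300", "offence": "drunk driving"},
]
bus_drivers = [
    {"licence": "B200", "name": "Driver 1"},
    {"licence": "D400", "name": "Driver 2"},
]


def match_on_licence(convictions, drivers):
    """Return (driver name, offence) pairs that share a licence number."""
    # Index convictions by licence number so each driver is looked up once.
    by_licence = {}
    for conviction in convictions:
        by_licence.setdefault(conviction["licence"], []).append(conviction)

    matches = []
    for driver in drivers:
        for conviction in by_licence.get(driver["licence"], []):
            matches.append((driver["name"], conviction["offence"]))
    return matches


matches = match_on_licence(convictions, bus_drivers)
print(matches)
```

A match found this way is only a lead: shared keys can contain typing errors, so each hit still needs checking against the original records.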
What’s in a name?
Phil Meyer (Precision Journalism):
Some practitioners of the "new journalism" took to making up their facts in order to keep up with the deadline pressures. Others stopped short of making things up, but combined facts from different cases to write composite portrayals of reality that they passed off as real cases. Despite the problems, the new nonfiction remains an interesting effort at coping with information complexity and finding a way to communicate essential truth. It pushes journalism toward art. Its problem is that journalism requires discipline, and the discipline of art may not be the most appropriate kind. A better solution is to push journalism toward science, incorporating both the powerful data-gathering and analysis tools of science and its disciplined search for verifiable truth.
After the introduction of the internet and spreadsheets: CARR, computer-assisted research and reporting
Now, because of the analysis of databases: data mining
Some other examples
NY Times: gap in life expectancy
USA Today: delegate tracker
Volkskrant: top salaries
NRC: food prices
NRC: WOZ value (assessed property value)
World Food Prices
Simple story: World Bank echoes food cost alarm
Research and background: Food price crisis; Costs of food
Continuum for reporting:
Reactive reporting
Proactive reporting
From a one-column press-release story to a full investigative scoop
Examples
How to follow the story about food prices on the web?
• Find leading media (FT, BBC, Economist) and subscribe to their RSS feeds
• Search newspaper archives: LexisNexis
• Search the web with Google
• Use more keywords; use quotation marks
• Look for different source types: doc, xls, ppt
• Use Google News and create RSS feeds
• Find institutions and their databases, or use Google Public Data
• Bloggers: use Technorati
• Use Twitter search: hashtags (#) related to food prices
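The search tips above (more keywords, quotation marks, different source types) can be combined into one query string. Below is a minimal sketch, assuming Google's `filetype:` search operator and the standard `https://www.google.com/search?q=` URL form. The function names `build_query` and `search_url` and the example keywords are hypothetical.

```python
from urllib.parse import urlencode


def build_query(phrase, keywords=(), filetype=None):
    """Combine an exact phrase, extra keywords and an optional file type."""
    parts = ['"{}"'.format(phrase)]  # quotation marks force an exact phrase
    parts.extend(keywords)           # extra keywords narrow the result set
    if filetype:
        parts.append("filetype:" + filetype)  # restrict to doc, xls, ppt, ...
    return " ".join(parts)


def search_url(query):
    """Turn a query string into a Google search URL."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# One query per source type mentioned on the slide.
queries = [
    build_query("food prices", ["world", "index"], filetype)
    for filetype in ("doc", "xls", "ppt")
]
for query in queries:
    print(query)
    print(search_url(query))
```

Searching for `xls` files is often the quickest way to find the spreadsheet behind a report, while `ppt` and `doc` tend to surface presentations and background papers.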
What has changed in reporting?
Internet:
• More sources, in number and in full text
• Wider geographical range
• Direct access
• Multimedia: including audio/video/graphics
• Google indexes about 10 billion pages, but that is 20% of the information on the web
Databases: more data
• How do you find databases?
• Institutional approach to searching: CBS, World Bank, IMF, FAO, UN, Eurostat
What has changed in reporting? (2)
Tools for handling data from databases:
• Spreadsheet: Excel
• Database: Access
• GIS (geographic information systems), mapping: ArcGIS
How do you store your data?
• Create your own database or spreadsheets to store your data
• askSam
• Google notes
New tools
• Google Public Data: direct analysis of databases
• Google Forms: create an online survey
• Google Fusion Tables: link data to maps
• Google Maps mashups: adding data to Google Maps
• Links: memeburn and blog
• WordPress plugin for polls
Maps mashups
• Web 2.0 and mashups: merging data on the web
• http://projects.latimes.com/homicide/map/
• Using the Google API to create POIs: FCJ Utrecht
• Maps in slideshow with audio
http://www.fao.org/hunger/en/
http://www.gapminder.org/world/
Asylum applications by country of nationality
(Share = share of the Sep-98 total)

Country                   Sep-97  Sep-98  Increase (%)  Share (%)
Afghanistan                  794     820          3.27      16.06
Bosnia-Herzegovina           182     526        189.01      10.30
Iraq                        1154     904        -21.66      17.70
Iran                         106     159         50.00       3.11
Serbia and Montenegro        192     537        179.69      10.51
Liberia                       11      12          9.09       0.23
Sudan                         89     248        178.65       4.86
Somalia                      109     294        169.72       5.76
Sri Lanka                    138     121        -12.32       2.37
Turkey                       127     158         24.41       3.09
Other                        856    1328         55.14      26.00
Total                       3758    5107         35.90
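The increase and share columns of the asylum table can be recomputed from the two monthly counts. Below is a minimal sketch that uses the figures from the table, with country names rendered in English. The function names `pct_increase` and `pct_share` are hypothetical, and the share is assumed to be taken over the Sep-98 total, which is what reproduces the published numbers.

```python
# Applications per nationality: (Sep-97, Sep-98), copied from the table.
applications = {
    "Afghanistan": (794, 820),
    "Bosnia-Herzegovina": (182, 526),
    "Iraq": (1154, 904),
    "Iran": (106, 159),
    "Serbia and Montenegro": (192, 537),
    "Liberia": (11, 12),
    "Sudan": (89, 248),
    "Somalia": (109, 294),
    "Sri Lanka": (138, 121),
    "Turkey": (127, 158),
    "Other": (856, 1328),
}

total_97 = sum(sep97 for sep97, _ in applications.values())
total_98 = sum(sep98 for _, sep98 in applications.values())


def pct_increase(old, new):
    """Percentage change from old to new, rounded to two decimals."""
    return round((new - old) / old * 100, 2)


def pct_share(part, total):
    """Part as a percentage of total, rounded to two decimals."""
    return round(part / total * 100, 2)


for country, (sep97, sep98) in applications.items():
    print(f"{country:<24}{sep97:>6}{sep98:>6}"
          f"{pct_increase(sep97, sep98):>9.2f}{pct_share(sep98, total_98):>8.2f}")
print(f"{'Total':<24}{total_97:>6}{total_98:>6}"
      f"{pct_increase(total_97, total_98):>9.2f}")
```

Note how the two columns tell different stories: Bosnia-Herzegovina shows the largest increase, while Iraq still holds the largest single share despite a decrease.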
Pie chart: Nationalities (share of applications per nationality, in percent)
Bar chart: Nationalities (share per nationality, in percent)
Elections 1998/2003: largest party per municipality
www.nederlandkiest.nl
Municipal data
• Municipality of Utrecht: http://www.utrecht.nl/smartsite.dws?id=86964
• Interactive databank: http://utrecht.buurtmonitor.nl/
What can we do with these tools?
• Calculations: averages
• Graphs: bar, line, pie
• Maps
• Interactive graphs: UNDP data by Gapminder
What is the objective?
In journalism:
• Graphs are analysis, not illustrations
• Cooperation between programmers, designers and journalists
• The aim is better journalism: better storytelling, informing the public
What do you need?
• Knowledge of statistics
• How to handle spreadsheets, graphs, maps
Other techniques
Social network analysis: from IRE
Terrorist Network
Valdis Krebs published "Uncloaking Terrorist Networks," an analysis of the Sept. 11, 2001, terrorist network in the April 2002 issue of First Monday, a peer-reviewed Internet journal. This article explains how Krebs was able to construct a visual representation of the network as well as what this visualization can tell us about the network that was previously unknown. Other papers Krebs has authored, including information on InFlow software, can be found at the researcher's Web site: www.orgnet.com
527 Committee Donors
In the 2004 presidential election "huge donations of a handful of wealthy liberals named Linda Pritzker, Stephen L. Bing, Peter B. Lewis and George Soros could determine the outcome. Together, they have given more than $26 million to help finance the most extensive get-out-the-vote operation in history, the goal of which is to make John F. Kerry president." These donations were made to 527 organizations. "Named after a section of the tax code, the 527 groups are doing much of the advertising and field work traditionally left to party organizations." Included with this story is a diagram displaying contributions to Democratic 527s and a list of the biggest donors to these groups.
They Rule
They Rule is a Web site that allows you to create maps of the interlocking directories of the top 100 companies in the United States in 2001. The data is static, so it is fast becoming out of date, as companies merge and disappear and directors shift boards. A new version of this site is being developed.
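The idea behind a map like They Rule, linking companies through shared board members, can be sketched in a few lines. This is a minimal illustration of the technique, not the method used by that site. The companies, directors and the function name `interlocks` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical board memberships: company -> directors. Not real data.
boards = {
    "Company A": ["Director 1", "Director 2", "Director 3"],
    "Company B": ["Director 2", "Director 4"],
    "Company C": ["Director 4", "Director 5"],
    "Company D": ["Director 6"],
}


def interlocks(boards):
    """Link two companies when they share at least one director."""
    links = defaultdict(set)
    for (company_1, board_1), (company_2, board_2) in combinations(boards.items(), 2):
        if set(board_1) & set(board_2):
            links[company_1].add(company_2)
            links[company_2].add(company_1)
    return links


links = interlocks(boards)

# Degree: the number of other companies a company is linked to.
degree = {company: len(links[company]) for company in boards}
most_connected = max(degree, key=degree.get)

print(degree)
print(most_connected)
```

The degree count is the simplest network measure: a company with a high degree sits at a junction where several boards meet, which is usually where a reporter starts asking questions.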
Overview from NRC
Networks in journalism 2
Jury membership of literary prizes
Twitter networks between politicians and journalists
Other techniques 2
Collect your own data
• From secondary to primary data
• Design your own survey and collect data online using Google Forms, a WordPress plugin or SurveyMonkey
• Or content analysis: for example NRC; talk show and party
Analysis:
• Online
• Importing into a spreadsheet
• Data matrix using SPSS
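A data matrix has one row per respondent and one column per variable. Below is a minimal sketch of importing survey output and counting answers, assuming the survey tool exports CSV. The column names and responses are hypothetical, and Python stands in here for the spreadsheet or SPSS step.

```python
import csv
import io
from collections import Counter

# Hypothetical CSV export of an online survey: one row per respondent.
raw = """respondent,age_group,reads_news_online
1,18-24,yes
2,25-34,no
3,18-24,yes
4,35-44,yes
"""

# Data matrix: a list of rows, each row mapping variable name to value.
rows = list(csv.DictReader(io.StringIO(raw)))

# Frequency table for one variable.
counts = Counter(row["reads_news_online"] for row in rows)

# Cross-tabulation of two variables.
crosstab = Counter((row["age_group"], row["reads_news_online"]) for row in rows)

print(counts)
print(crosstab)
```

The same two steps, a frequency table and then a cross-tabulation, also apply to content analysis: each coded item becomes a row and each coded property a column.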