Data journalism (City Online Journalism, week 8)

Post on 27-Jan-2015

DESCRIPTION

Week 8 lecture to students on the 8 MAs at City University

TRANSCRIPT

Online Journalism
City University
Paul Bradshaw

Data journalism

1. What is it?
2. Where to get it
3. How to get it

Themes

“Each weekday, my computer program goes to the Chicago Police Department's website and gathers all crimes reported in Chicago.”

Adrian Holovaty

Times film genres

• Times Data Blog

”QUOTE”

Now is a good time.

“The Tribune’s more than three dozen interactive databases, collectively have drawn three times as many page views as the site’s stories. [75% of traffic]”

http://bit.ly/dj2dmz

What is data?

• Numbers
• Text
• Live data
• Behavioural data
• Images, audio, video

Anything that a computer can work with

Start with the data and look for the stories? (MPs’ expenses)
Or start with a lead and look for the data?

Passive vs active data journalism

Data Journalism Continuum

• Data.gov.uk
• Guardian datastore
• Openlylocal, Open Corporates, Open Charities, Who's Lobbying etc.
• FOI requests (WDTK), disclosure logs
• Books - British Political Facts

Finding

• GetTheData.org
• WDMMG forums
• MySociety mailing lists
• Open Data Cookbook
• Wolfram Alpha forum

Finding – data communities

• Government - national and local
• 'Monitors' - regulators & other bodies
• Charities, pressure groups
• Institutions - academic, scientific, health
• Business, finance
• Media, entertainment, sport

Other secondary sources

• site:gov.uk (etc)
• filetype:pdf (etc)
• Imagine the page you hope to find, including jargon etc.
• Database contents are invisible to search engines
• Google News alerts: report OR review

Advanced search

• "quotes search for exact phrases"
• "disclosure logs" site:nhs.uk
• + ensures page contains word: +logs
• - omits results with word: -wooden
• * wildcard, e.g. "deaths * custody"
• ~ synonyms, e.g. ~deaths

Advanced search
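The operators above can be combined into a single query string. The following is a minimal Python sketch of that idea, not part of the lecture: the helper name build_query and its parameters are made up for illustration, and the only external detail assumed is the familiar https://www.google.com/search?q= URL form.

```python
from urllib.parse import urlencode


def build_query(phrase=None, site=None, filetype=None, exclude=()):
    """Combine search operators into one query string (hypothetical helper)."""
    parts = []
    if phrase:
        parts.append(f'"{phrase}"')          # quotes: exact phrase
    if site:
        parts.append(f"site:{site}")         # restrict to one domain
    if filetype:
        parts.append(f"filetype:{filetype}") # restrict to one file type
    parts.extend(f"-{word}" for word in exclude)  # minus: omit results with word
    return " ".join(parts)


query = build_query(phrase="disclosure logs", site="nhs.uk", filetype="pdf")
search_url = "https://www.google.com/search?" + urlencode({"q": query})

print(query)
print(search_url)
```

Building the query in code is mainly useful when the same search has to be repeated across many sites or file types.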

Tip: use overseas sources

• US medicine databases
• EU subsidy databases
• Swedish people data
• International police agency correspondence with UK

• RSS, XML, JSON, RDF - and APIs
• ScraperWiki
• OutWit Hub
• Yahoo! Pipes
• Spreadsheet formulae
(look them up)

Feeds and scrapers
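As a rough illustration of why feeds are convenient, here is a minimal Python sketch that reads items out of an RSS document and a JSON document using only the standard library. Both sample documents are invented placeholders, not real feeds; a real feed would be downloaded first.

```python
import json
import xml.etree.ElementTree as ET

# Made-up sample feed, standing in for a downloaded RSS file.
SAMPLE_RSS = (
    '<rss version="2.0"><channel><title>Example feed</title>'
    "<item><title>First item</title><link>http://example.com/1</link></item>"
    "<item><title>Second item</title><link>http://example.com/2</link></item>"
    "</channel></rss>"
)

# Made-up sample JSON, standing in for an API response.
SAMPLE_JSON = '{"results": [{"title": "First item"}, {"title": "Second item"}]}'

# RSS is XML: every <item> holds one entry with predictable child tags.
root = ET.fromstring(SAMPLE_RSS)
rss_items = [
    {"title": item.findtext("title"), "link": item.findtext("link")}
    for item in root.iter("item")
]

# JSON maps directly onto Python dicts and lists.
json_titles = [entry["title"] for entry in json.loads(SAMPLE_JSON)["results"]]

print(rss_items)
print(json_titles)
```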

Format? Table? Pattern? URL?

'Structured' data

http://www.eib.org/projects/pipeline/?start=2009&end=2010&status=&region=&country=united+kingdom&sector=
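The URL above is itself structured: everything after the ? is a set of name=value pairs. A minimal Python sketch, assuming nothing beyond the URL shown, splits those pairs out and rebuilds the address with different years. The helper name with_params is hypothetical, and whether the site accepts other values is not checked here.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

url = (
    "http://www.eib.org/projects/pipeline/"
    "?start=2009&end=2010&status=&region=&country=united+kingdom&sector="
)

# Read the query string into a dict, keeping the empty parameters.
params = dict(parse_qsl(urlsplit(url).query, keep_blank_values=True))
print(params)


def with_params(url, **changes):
    """Return the same URL with some query parameters replaced (hypothetical helper)."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query.update(changes)
    return urlunsplit(parts._replace(query=urlencode(query)))


next_year = with_params(url, start="2010", end="2011")
print(next_year)
```

Changing one parameter at a time in a loop is the usual way to step through every page or year of a dataset like this.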

'Structured' HTML? (Use Firebug)

<p>      <strong>Case Ref: FS50295557 <br />Date: 04/11/2010 <br />Public Authority: London Borough of Southwark <br />Summary: </strong>The complainant requested a copy of the authorities approved business plan  [...]<br /><strong>Section of Act/EIR &amp; Finding: </strong>FOI 1 - Complaint Upheld , FOI 10 - Complaint Upheld <br /><a title="Opens in new window" href="~/media/documents/decisionnotices/2010/fs_50295557.ashx" target="_blank">View PDF of Decision Notice FS50295557</a></p>
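The fragment above repeats the same labels (Case Ref, Date, Public Authority) in the same order, which is the kind of pattern a scraper relies on. A minimal Python sketch, using regular expressions on that one fragment only, might look like this. It assumes the other entries on the page follow the same layout, which has not been verified.

```python
import re

# The fragment from the slide, with the long run of spaces after <p> removed.
FRAGMENT = (
    "<p><strong>Case Ref: FS50295557 <br />Date: 04/11/2010 <br />"
    "Public Authority: London Borough of Southwark <br />Summary: </strong>"
    "The complainant requested a copy of the authorities approved business plan  [...]<br />"
    "<strong>Section of Act/EIR &amp; Finding: </strong>"
    "FOI 1 - Complaint Upheld , FOI 10 - Complaint Upheld <br />"
    '<a title="Opens in new window" '
    'href="~/media/documents/decisionnotices/2010/fs_50295557.ashx" '
    'target="_blank">View PDF of Decision Notice FS50295557</a></p>'
)

# Each field is "Label: value <br />", so capture the label and the value.
FIELD = re.compile(r"(Case Ref|Date|Public Authority):\s*(.*?)\s*<br\s*/>")

record = dict(FIELD.findall(FRAGMENT))

# The PDF link is the only href in the fragment.
link = re.search(r'href="([^"]+)"', FRAGMENT)
record["pdf"] = link.group(1) if link else None

print(record)
```

Regular expressions are fine for a small, regular fragment like this one; for whole pages a proper HTML parser is usually more robust.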

=ImportHTML("http://bob.com/mytable", "table", 1)
=ImportXML("http://backtweets.com/search.xml?itemsperpage=100&...")
=ImportFeed("http://search.twitter.com/search.atom?rpp=20&page=1&q="&A2)

Spreadsheet formulae
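For comparison, here is a rough Python counterpart to the ImportHTML formula: a sketch that pulls the Nth table out of a page using the standard-library HTML parser. The function name import_html and the sample table are invented for illustration, the sketch does not handle nested tables, and it is not how the spreadsheet function is implemented.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect every <table> on a page as a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def import_html(html, index=1):
    """Return table number `index` (counting from 1, as in the spreadsheet formula)."""
    parser = TableParser()
    parser.feed(html)
    return parser.tables[index - 1]


# Made-up sample page, standing in for the HTML of a downloaded web page.
SAMPLE_PAGE = """
<table>
  <tr><th>Council</th><th>Requests</th></tr>
  <tr><td>Southwark</td><td>12</td></tr>
</table>
"""

rows = import_html(SAMPLE_PAGE, 1)
print(rows)
```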

• Fetch Page module
• Regex

Yahoo! Pipes

"A problem for sites who want to provide privacy while allowing new users to join easily. Scraping services may constitute a violation of terms of service; tactics often resemble a denial-of-service attack or a security exploit."

Ethics
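One practical response to the concern quoted above is to check a site's robots.txt and to pause between requests, so that a scraper does not look like an attack. The Python sketch below uses the standard-library robots.txt parser on an invented robots.txt file. The user-agent name, the example.com URLs and the one-second fallback delay are all assumptions, and robots.txt does not replace reading a site's terms of service.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content; a real scraper would download the site's own file.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""

USER_AGENT = "student-scraper"  # hypothetical name for our scraper

rules = RobotFileParser()
rules.parse(SAMPLE_ROBOTS.splitlines())


def allowed(url):
    """True if the robots.txt rules permit this scraper to fetch the URL."""
    return rules.can_fetch(USER_AGENT, url)


def polite_delay(default=1.0):
    """Seconds to wait between requests: the site's Crawl-delay, else a fallback."""
    return rules.crawl_delay(USER_AGENT) or default


print(allowed("http://example.com/data/table.html"))
print(allowed("http://example.com/private/members.html"))
print(polite_delay())
```

In a real scraper, time.sleep(polite_delay()) would go between each request.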

Questions?

Links

• OnlineJournalismClasses.tumblr.com
• Delicious.com/paulb/cityoj08
• Delicious.com/paulb/datajournalism
• Delicious.com/paulb/visualisation
• Delicious.com/paulb/data

- Use advanced search to find data
- Use tools to scrape data
- Visualise a politician's speeches using Wordle or Many Eyes
- Read up on some of the tools or technologies before the lab

Lab
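Word-cloud tools such as Wordle size each word by how often it appears. As preparation for the speech exercise, the Python sketch below counts word frequencies in a text. The speech text and the short stop-word list are invented placeholders, and this is not how Wordle or Many Eyes work internally.

```python
import re
from collections import Counter

# A very short, made-up stop-word list; real lists are much longer.
STOP_WORDS = {"the", "and", "of", "to", "a", "in", "we", "is", "that", "for", "our"}


def word_frequencies(text, top=3):
    """Return the `top` most frequent words, ignoring case and stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS).most_common(top)


# Placeholder text, not a real speech.
speech = (
    "We will cut the deficit. We will protect the NHS. "
    "The deficit is our priority, and the NHS is our promise."
)

frequencies = word_frequencies(speech)
print(frequencies)
```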

Books

• Darrell Huff - How To Lie With Statistics
• Blastland & Dilnot - The Tiger That Isn't
• Donna Wong - The WSJ Guide to Information Graphics
• Brian Suda - A Practical Guide to Designing with Data

Assignments

Enough time?

• 10 credits = 100 hours
• Lectures = 15 hours
• Group blog = 60 hours (75%)
• Strategy = 20 hours (25%)
• (Some in labs) + 5 hours on other issues

Enough time? Blog

Just an example:
• 10 posts ranging from simple links to interviews, analysis, experiment
• 5.5 hours average per week x 10 weeks = 55 hours
• + 5 hours to write evaluation

Enough time? Strategy

Just an example:
• 12.5 hours researching community
• 30 mins per week x 10 weeks with community (2.5 hours)
• 5 hours analysis & write up

Group blogs

8 areas:
1. Online video
2. Online audio
3. Data
4. UGC
5. Community management
6. Mobile
7. Social media
8. Infographics and photography

Criteria

Assignment 1:
• Newsgathering/research
• Production
• Law, ethics and strategy
Assignment 2:
• Research
• Analysis
• Execution
