Data Journalism (City Online Journalism wk8)

Post on 27-Jan-2015




Week 8 lecture to students on the 8 MAs at City University


<ul><li> 1. Online Journalism; City University; Paul Bradshaw; Data journalism </li></ul> <p> 2.
3. Themes: 1. What is it? 2. Where to get it 3. How to get it
4. 5. 6. 7. 8. 9.
10. "Each weekday, my computer program goes to the Chicago Police Department's website and gathers all crimes reported in Chicago." (Adrian Holovaty)
11. 12.
13. Times film genres
14. </p> <ul><li><ul><li>Times Data Blog </li></ul></li></ul> <p> 15. 16.
17. QUOTE Now is a good time.
18. The Tribune's more than three dozen interactive databases collectively have drawn three times as many page views as the site's stories. [75% of traffic]
19. What is data?
20. Numbers; Text; Live data; Behavioural data; Images, audio, video; Anything that a computer can work with
21.
22. Passive vs active data journalism: Start with the data and look for the stories? (MPs' expenses) Or start with a lead and look for the data?
23. Data Journalism Continuum
24. Finding: Guardian datastore; Openlylocal, Open Corporates, Open Charities, Who's Lobbying etc.; FOI requests (WDTK), disclosure logs; Books - British Political Facts
25. Finding data communities: WDMMG forums; MySociety mailing lists; Open Data Cookbook; Wolfram Alpha forum
26.
27. Other secondary sources: Government - national and local; 'Monitors' - regulators &amp; other bodies; Charities, pressure groups; Institutions - academic, scientific, health; Business, finance; Media, entertainment, sport
28. Advanced search: (etc); Filetype:pdf (etc); Imagine the page you hope to find, including jargon etc.; Database contents are invisible; Google News alerts: report OR review
29. Advanced search: "quotes search for exact phrases": "disclosure logs"; + ensures page contains word: +logs; - omits results with word: -wooden; * wildcard, e.g. "deaths * custody"; ~ synonyms, e.g. ~deaths
30. 31. 
Tip: use overseas sources </p> <ul><li><ul><li>US medicine databases </li></ul></li></ul> <ul><li><ul><li>EU subsidy databases</li></ul></li></ul> <ul><li><ul><li>Swedish people data </li></ul></li></ul> <ul><li><ul><li>International police agency correspondence with UK </li></ul></li></ul> <p> 32. Feeds and scrapers: RSS, XML, JSON, RDF - and APIs; Scraperwiki; Outwit Hub; Yahoo! Pipes; Spreadsheet formulae (look them up)
33. 'Structured' data: Format? Table? Pattern? URL?
34.;end=2010&amp;status=&amp;region=&amp;country=united+kingdom&amp;sector=
35. 'Structured' HTML? (Use Firebug) </p> <ul><li>Case Ref: FS50295557 <br />Date: 04/11/2010 <br />Public Authority: London Borough of Southwark <br />Summary: </li></ul> <ul><li>The complainant requested a copy of the authorities approved business plan [...]<br />Section of Act/EIR &amp; Finding: FOI 1 - Complaint Upheld, FOI 10 - Complaint Upheld <br /></li></ul> <ul><li>View PDF of Decision Notice FS50295557 </li></ul> <p> 36. Spreadsheet formulae: =ImportHTML("", "table", 1) =ImportXML(";...) =ImportFeed(";page=1&amp;q="&amp;A2)
37. Yahoo! Pipes: Fetch Page module; Regex
38. Ethics: "A problem for sites who want to provide privacy while allowing new users to join easily. Scraping services may constitute a violation of terms of service; tactics often resemble a denial-of-service attack or a security exploit."
39. Questions?
40. Links
41. </p> <ul><li>Use advanced search to find data </li></ul> <ul><li>Use tools to scrape data </li></ul> <ul><li>Visualise a politician's speeches using Wordle or Many Eyes </li></ul> <ul><li>Read up on some of the tools or technologies before the lab </li></ul> <p> Lab
42. Books: Darrell Huff - How To Lie With Statistics; Blastland &amp; Dilnot - The Tiger That Isn't; Dona Wong - The WSJ Guide to Information Graphics; Brian Suda - A Practical Guide to Designing with Data
43. Assignments
44. Enough time? 
10 credits = 100 hours; Lectures = 15 hours; Group blog = 60 hours (75%); Strategy = 20 hours (25%) (Some in labs); + 5 hours on other issues
45. Enough time? Blog. Just an example: 10 posts ranging from simple links to interviews, analysis, experiment; 5.5 hours average per week x 10 weeks = 55 hours; + 5 hours to write evaluation
46. Enough time? Strategy. Just an example: 12.5 hours researching community; 30 mins per week x 10 weeks with community (2.5 hours); 5 hours analysis &amp; write-up
47. Group blogs </p> <ul><li>8 areas: </li></ul> <ul><li>1. Online video; 2. Online audio </li></ul> <ul><li>3. Data; 4. UGC </li></ul> <ul><li>5. Community management </li></ul> <ul><li>6. Mobile; 7. Social media </li></ul> <ul><li>8. Infographics and photography </li></ul> <p> 48. Criteria. Assignment 1: Newsgathering/research; Production; Law, ethics and strategy. Assignment 2: Research; Analysis; Execution </p>
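<p> The 'Structured' HTML slide (35) shows a decision-notice record whose fields follow a repeating "Label: value" pattern separated by line-break tags. As a rough illustration of why that pattern makes a page scrapable, here is a minimal Python sketch that pulls those fields out of the markup quoted on the slide. The names SAMPLE and extract_fields are hypothetical, introduced only for this example; the regular-expression approach is an assumption suited to a small fragment like this one, not the method used in the lecture, and a real page would usually call for a proper HTML parser. </p>

```python
import re
from html import unescape

# Markup copied from the 'Structured' HTML slide; SAMPLE is an illustrative name.
SAMPLE = (
    "<ul><li>Case Ref: FS50295557 <br />Date: 04/11/2010 <br />"
    "Public Authority: London Borough of Southwark <br />Summary: </li></ul>"
)


def extract_fields(html: str) -> dict:
    """Return the 'Label: value' pairs found between <br> tags.

    Hypothetical helper: it assumes every field sits on its own
    <br>-separated chunk and that the first colon ends the label.
    """
    fields = {}
    # Split the fragment wherever a <br>, <br/> or <br /> tag appears.
    for chunk in re.split(r"<br\s*/?>", html):
        # Remove any remaining tags, decode entities, trim whitespace.
        text = unescape(re.sub(r"<[^>]+>", "", chunk)).strip()
        if ":" not in text:
            continue  # not a labelled field
        label, value = text.split(":", 1)
        fields[label.strip()] = value.strip()
    return fields


if __name__ == "__main__":
    for label, value in extract_fields(SAMPLE).items():
        print(f"{label}: {value}")
```

<p> Splitting on the line-break tags first and stripping the remaining tags afterwards keeps each field separate; the empty Summary value on the slide comes back as an empty string rather than being dropped. </p>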