github as transparency device in data journalism, open data and data activism

Post on 14-Aug-2015

GitHub as Transparency Device in Data Journalism, Open Data and Data Activism

Digital Methods Initiative Summer School 2015
Liliana Bounegru, Jonathan Gray & Stefania Milan

Part of a broader research collaboration:
• Data Journalism (Liliana Bounegru)
• Open Data (Jonathan Gray)
• Data Activism (Stefania Milan)
• Digital Methods (Richard Rogers & Erik Borra)

How is GitHub reconfiguring… data journalism? open data? data activism?

Our work on data journalism includes…

Witnessing and Auditing Journalism in the Making with GitHub

1. What is open data journalism and what does GitHub have to do with it? (the advocates)

2. How has openness been studied as a political concept? (the critics)

3. Research design: Mapping open data journalism with GitHub (our project)

An example of the role of GitHub in open data journalism.

New York Times (2014) “War Gear Flows to Police Departments” http://www.nytimes.com/2014/06/09/us/war-gear-flows-to-police-departments.html

New York Times (2014) “Mapping the Spread of the Military’s Surplus Gear” http://www.nytimes.com/interactive/2014/08/15/us/surplus-military-equipment-map.html

New York Times (2014) “What Military Gear Your Local Police Department Bought” http://www.nytimes.com/2014/08/20/upshot/data-on-transfer-of-military-gear-to-police-departments.html

The Upshot on GitHub: https://github.com/TheUpshot/Military-Surplus-Gear

http://earino.shinyapps.io/Military-Surplus/

https://github.com/cinquemb/1033-program-quick-drill-down

Charleston Daily Mail (2014) “Federal program sends military equipment to WV law enforcement” http://www.charlestondailymail.com/article/20140819/DM01/140819135/1420

GitHub as a device for multiplying witnessing around police acquisition of military equipment.

What does openness mean in the context of journalism?

–Alex Howard, The Art and Science of Data-Driven Journalism

“The embrace of open source software and agile development practices, coupled with a growing open data movement, have breathed new life into traditional computer-assisted reporting.”

Advocates discuss openness in terms of:
• Transparency
• Collaboration and participation

–Simon Rogers, “Hey Wonk Reporters, Liberate Your Data!”

“Data journalism only matters when it's transparent.”

–Mathew Ingram, “Open journalism also means opening up your data, so others can use and improve it”

“Open journalism … means opening up your data, so others can use and improve it.”

–Simon Rogers, “Journalist datastores: where can you find them? A list”

“It’s a pretty core tenet of open journalism that you share your sources; i.e., you write a story about data then you make numbers available to download”

–Simon Rogers, “Hey Wonk Reporters, Liberate Your Data!”

“Journalism today is at least as much about working with the community as it is telling the world what you think happened. The ethos of open journalism is that reporting becomes better by gathering the expertise of the world and helping to curate it.”

Openness in the service of what?

Advocates associated openness with:
• Trust, credibility and accountability
• Fact-checking and optimisation (“many eyes make shallow bugs”)
• Innovation and reusability
• Democratising data and levelling the playing field

– Nicolas Kayser-Bril in Scott Nesbitt’s “Is open data living up to the hype? One data journalist weighs in”

“Open source makes an organization more transparent and, therefore, more trustworthy. Newsrooms are moving towards open source; just look at the number of journalists using GitHub now!”

–Mathew Ingram, “Open journalism also means opening up your data, so others can use and improve it”

“As with the code behind software programs — the original use for things like GitHub — there are a host of benefits to opening up the data that provides the foundation for news stories, including the fact that more eyeballs on the data means a greater likelihood of finding errors and/or misinterpretations of that data.”

–Alex Plough, “The Evolution of Data Journalism: from CAR to fivethirtyeight”

“GitHub lets users duplicate others’ code and re-purpose it for their own needs. This feature lets data journalism teams across the world quickly replicate each other’s projects, spurring innovation with increasingly sophisticated news applications.”

–Alex Salkever, “Open Source Journalism: Data and the New News”

“Open source journalism levels the playing field. Every neighborhood blogger in California or New York or London can now post visualization using the very same data that the biggest news organizations in the world have use. And the blogger can focus that data down on the local impact.”

What is the role of GitHub in open data journalism?

–Alex Plough, “The Evolution of Data Journalism: from CAR to fivethirtyeight”

“Another trend is the use of software code-hosting platform Github by news organizations. Typically used by the open-source software development community to store and share their code online (in “repositories”), GitHub lets users duplicate others’ code and re-purpose it for their own needs.”

–Tom Giratikanon, Erin Kissane, Jeremy Singer-Vine, “When the news calls for raw data”

“Why post it on GitHub? … As journalists marshall more data than ever, collect it from a wider range of sources, and analyze it in increasingly complex ways, it’s important (and interesting!) to be transparent about those processes. I think about it in three ways: verifiability …, reproducibility …, reusability.”

– Emily Ferber, “Getting GitHub: Why journalists should know and use the social coding site”

“As more journalists embrace GitHub as a way to improve stories, they’ll develop a new kind of news community, centered around collaboration and code – truly a news nerd’s nirvana.”

1. What is open data journalism and what does GitHub have to do with it? (the advocates)

2. How has openness been studied as a political concept? (the critics)

3. Research design: Mapping open data journalism with GitHub (our project)

To understand what is at stake we turn to studies of openness and transparency in the context of government and activism.

Some points raised by this research about the openness and transparency programmes it studied:
• Uncoupling of data and code from politics
• Witnessing data publics/subjects
• Presumption of absence of trust
• Anticipation of moral failings

Clare Birchall in “Data.gov-in-a-box” on Obama’s data-driven transparency programme:
• “post-political solution”
• “data in lieu of politics”
• emphasis on individual rather than collective political agency
• “only reveals that which is conducive of maintaining the status-quo.”

–Clare Birchall, “‘Data.gov-in-a-box’: Delimiting transparency”

“The openness of all this data is obviously meaningless until it is witnessed.”

–Clare Birchall, “‘Data.gov-in-a-box’: Delimiting transparency”

“What kind of publics, subjects, and indeed, politics it [data-driven transparency model] will produce?”

The “auditor–entrepreneurial–consumer subjectivity”

–Clare Birchall, “‘Data.gov-in-a-box’: Delimiting transparency”

“The data subject is therefore called upon to be auditor (to monitor the granular transactions of the state in the name of accountability), entrepreneur (to make data profitable through apps and visualizations) and consumer (as the market for such apps and visualizations).”

Clare Birchall, “Data.gov-in-a-box”:

• The burden of monitoring the state moves from the state to the citizens.

• “A subject who is monitored while being asked to monitor; acted upon as data while being asked to act on data.”

• Agency is reliant on technological competence.

Because visibility is about gaining trust, a transparency device presumes that there is an absence of trust in the first place (Harvey, Reeves & Ruppert, 2012).

–Penny Harvey, Madeleine Reeves & Evelyn Ruppert, “Anticipating failure”

“It is to past moral failures of wrongdoing, conflict or corruption that these [transparency] devices react and consequently it is the anticipation of future moral failings towards which they are then oriented.”

–Penny Harvey, Madeleine Reeves & Evelyn Ruppert, “Anticipating failure”

“As such rather than alleviating uncertainty they come to amplify it.”

1. What is open data journalism and what does GitHub have to do with it? (the advocates)

2. How has openness been studied as a political concept? (the critics)

3. Research design: Mapping open data journalism with GitHub (our project)

Questions.

How can we use these studies to make sense of the move to make journalism more trustworthy and accountable through the opening up of data and code?

To be meaningful, journalistic data and code need to be witnessed.

What kinds of publics are mobilised around open journalism data and code through GitHub?

What forms of trust and accountability are produced by the opening up of data and code?

How does GitHub mobilise and format engagement with journalism and with what effects?

Research design

How to map “open data journalism” with GitHub?

GitHub not only as a device for witnessing and auditing journalism in the making

But also as a source of data about such practices

Over 200 programmer-journalist GitHub accounts. Over 60 journalism organisation GitHub accounts.

Five studies:

1. Situating GitHub in the journalism ecology
2. Mapping journalism data publics with GitHub
3. Profiling journalism practices and product repertoires through the “distant reading” of code
4. Mapping open data on GitHub
5. Mapping data activism on GitHub

1. Situating GitHub in the journalism ecology

This study will locate GitHub in the data journalism space in terms of its resonance. It will trace the issues associated with it, particularly exemplary projects, programming languages, tools, analytical techniques, visions and values.

The data journalism space will be demarcated through a three-year collection of tweets containing related keywords and hashtags, as well as through associated mailing lists and events.
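One way the demarcation and resonance steps of study 1 might be prototyped is sketched below. It is an illustration only, not the project's actual pipeline: the query terms in `SPACE_TERMS` are hypothetical (the slides do not list the keywords or hashtags used), and tweets are assumed to be available as plain text strings.

```python
import re
from collections import Counter

# Hypothetical query terms; the slides do not state the real keyword list.
SPACE_TERMS = {"#ddj", "#datajournalism", "data journalism"}
HASHTAG = re.compile(r"#\w+")


def in_space(text, terms=SPACE_TERMS):
    """True if the tweet text contains any of the demarcating terms."""
    lowered = text.lower()
    return any(term in lowered for term in terms)


def github_resonance(tweets, terms=SPACE_TERMS):
    """Return (tweets in the space, those mentioning GitHub, co-occurring hashtags)."""
    space = [t for t in tweets if in_space(t, terms)]
    mentions = [t for t in space if "github" in t.lower()]
    # Hashtags that appear alongside GitHub hint at the issues associated with it.
    co_tags = Counter(tag.lower() for t in mentions for tag in HASHTAG.findall(t))
    return len(space), len(mentions), co_tags
```

Calling `github_resonance(list_of_tweet_texts)` gives a rough share of the space that mentions GitHub, plus the hashtags it travels with. Simple substring matching is a deliberate simplification here and would need refining for a real collection.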

2. Mapping journalism data publics with GitHub

This study profiles the journalism publics, practices and product repertoires active on GitHub. The focus is on functions, modes of engagement, as well as trust and accountability mechanisms and how they are mediated and reconfigured through GitHub, open code and data.

To do so it uses custom-made GitHub scrapers to extract data about users and repositories, and analyses such data manually and by means of network analysis tools.
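A minimal sketch of what such a scraper and network step could look like follows. It is not the project's custom scraper. It assumes the public GitHub REST API endpoint `GET /repos/{owner}/{repo}/forks`, reads only the first page of results, and treats forking as one possible trace of engagement. The function names are illustrative.

```python
import json
import urllib.request
from collections import defaultdict
from itertools import combinations

API = "https://api.github.com"


def fetch_forks(full_name):
    """Return the first page of fork records for a repository such as 'owner/name'.

    Unauthenticated requests are rate-limited; pagination is not handled here.
    """
    req = urllib.request.Request(
        f"{API}/repos/{full_name}/forks",
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def forker_edges(forks_by_repo):
    """Turn {repo: [fork records]} into a sorted list of (user, repo) edges."""
    edges = set()
    for repo, forks in forks_by_repo.items():
        for fork in forks:
            edges.add((fork["owner"]["login"], repo))
    return sorted(edges)


def shared_forkers(edges):
    """Project the user-repo network onto repos: count the users two repos share."""
    repos_by_user = defaultdict(set)
    for user, repo in edges:
        repos_by_user[user].add(repo)
    weights = defaultdict(int)
    for repos in repos_by_user.values():
        for a, b in combinations(sorted(repos), 2):
            weights[(a, b)] += 1
    return dict(weights)
```

For example, `forker_edges({name: fetch_forks(name) for name in repo_names})` yields an edge list that can be loaded into a network analysis tool, and `shared_forkers` gives a weighted repo-to-repo network of overlapping publics.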

3. Profiling journalism practices and product repertoires through the “distant reading” of code

This study scopes out possibilities for using digital traces of code from GitHub to inform a “distant reading” of the ideals and practices of emergent data publics in journalism and civil society.

In addition to the tracing of actor networks and their modes of engagement through the analysis of GitHub metadata in study 2, this study enquires into the possibilities and methods for undertaking an analysis of the actual code in the journalism repositories to examine the epistemological commitments, horizons, styles of reasoning and action repertoires of journalism data publics.
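As one possible entry point to such a “distant reading”, the sketch below tallies which libraries are imported across repositories, as a crude proxy for tool repertoires. It is an assumption-laden illustration rather than the study's method: it only looks at Python and R files, uses simple regular expressions instead of real parsers, and expects file contents to have been downloaded already into a `{repo: {path: source}}` mapping (a hypothetical structure).

```python
import re
from collections import Counter

# First module name in "import x" / "from x import y" at the start of a line.
PY_IMPORT = re.compile(r"^\s*(?:import|from)\s+([A-Za-z_]\w*)", re.MULTILINE)
# Package name in library(x) / require(x) calls, with optional quotes.
R_LIBRARY = re.compile(
    r"^\s*(?:library|require)\(\s*[\"']?([A-Za-z][\w.]*)", re.MULTILINE
)


def libraries_used(path, source):
    """Return the set of libraries a single Python or R file appears to load."""
    if path.endswith(".py"):
        return set(PY_IMPORT.findall(source))
    if path.lower().endswith(".r"):
        return set(R_LIBRARY.findall(source))
    return set()


def repertoire(files_by_repo):
    """Count in how many repositories each library appears at least once."""
    counts = Counter()
    for files in files_by_repo.values():
        seen = set()
        for path, source in files.items():
            seen |= libraries_used(path, source)
        counts.update(seen)
    return counts
```

Counting per repository rather than per file keeps one large project from dominating the profile. Whether library use is a meaningful indicator of “styles of reasoning” is an open question for the study itself.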

The same approach will be used to study data activism and open data.

The Team

Facilitators:
• Liliana Bounegru (@bb_liliana / lilianabounegru.org)
• Jonathan Gray (@jwyg / jonathangray.org)
• Stefania Milan (@annliffey / stefaniamilan.net)

Programmer-analyst:
• Sam Leon (@noel_mas)

Who should join us?
• Anyone active around or interested in data journalism, data activism and/or open data.
• GitHub users or people familiar with the platform.
• Designers and programmers.

Join Us!
