2013 Legislative Data and Transparency Conference

Upload: nancy-banks

Post on 29-Dec-2015


2013

Legislative Data and Transparency Conference

Conference Schedule

• Legislative Process Overview
• Legislative Branch Update
• Bulk Data Task Force update on provisioning legislative data
• Library of Congress and GPO Electronic Access Plans and Developments
• Official Tools Demo
– Administrative Interface (docs.house.gov)
– Democratic Caucus Intranet
– Committee Roll Call Vote Utility
• International Update
• Electronic Legislative Archiving: Panel of legislative archivists discuss how to preserve and curate electronic legislative records
• Extending Legislative XML for and by third parties: Address XML data standards and how to extend them for new applications
• Under-digitized legislative data: What are the evolving standards and practices for integration and use of legislative data?

Legislative Overview

Kirsten Gullickson

Office of the Clerk

Legislative Process Overview

Kirsten Gullickson, Sr. Systems Analyst

Office of the Clerk

Rep. Ludlow placing bill into hopper, 12/30/1936
http://www.loc.gov/pictures/item/hec2009008605/

The Challenge

• Legislative documents and related data must be
– prepared,
– managed,
– distributed, and
– archived.

• This includes paper and electronic means for handling the official documents.

How a bill becomes a law. After the vote has been taken, the result is noted in the Journal of Action by Louis Sirkey, House Journal Clerk. If the bill receives a passing vote, it is sent to the other chamber for action. If the bill failed to pass, it must be reintroduced unless it is voted to refer it back to the committee for reconsideration.

The Challenge (cont’d)

Government data should be
– Public
– Accessible
– Described
– Reusable
– Complete
– Timely
– Managed Post-Release

White House M-13-13, Open Data Policy, Managing Information as an Asset

Where are the documents? Data?

• GOVERNMENT PRINTING OFFICE
– www.gpo.gov
• LIBRARY OF CONGRESS
– Thomas.loc.gov
– Beta.congress.gov
• THE HOUSE
– Clerk.house.gov
– Docs.house.gov
– www.house.gov
– Committee websites
• THE SENATE
– www.senate.gov
– Committee websites

General Document Flow

Introduction and Referral to Committee

Doc. 110-49, page 8 (How Our Laws Are Made)

http://history.house.gov/Collection/Listing/2004/2004-019-000/

The Hopper

Consideration by Committee

Doc. 110-49, page 11 (How Our Laws Are Made)

Reported to House and Placed on Calendar

Doc. 110-49, page 15 (How Our Laws Are Made)

Consideration on House Floor

Doc. 110-49, page 20 (How Our Laws Are Made)

Senate Consideration

Doc. 110-49, page 36 (How Our Laws Are Made)

Enrollment and Presidential Actions

Doc. 110-49, page 36 (How Our Laws Are Made)

Slip laws and U.S. Code

Doc. 110-49, page 53 (How Our Laws Are Made)

Questions and Answers

Until Jurgensen, Jr., a tally clerk, designed this electric voting machine, it took at least three months, using the old rubber stamp system, to compile the voting records of the 435 members of the House. Recording the yeas and nays, absent and present, paired for and paired against votes of each individual member, the machine, which is similar to an adding machine, does the same job in less than two weeks. Greater accuracy is assured in counting votes with the Jurgensen-designed machine.

New time saving voting machine, 05/10/1938
http://www.loc.gov/pictures/item/hec2009015711/

Bulk Data Task Force Update

Robert Reeves
Office of the Clerk

Bulk Data Task Force and Transparency Updates

Since our last meeting on January 30, 2013, here’s what we’ve been up to:


Other projects:
• Bulk Data Bill Summaries
• House Modernization Project
• Data Challenge
• Data Dashboard
• Clerk Twitter Account
• Clerk/History Arts & Archives YouTube

Library of Congress and GPO Plans and Developments

Tammie Nelson
Library of Congress

Matt Landgraf
Government Printing Office

LIBRARY OF CONGRESS
Tammie Nelson

GPO/LOC Collaboration: Digitization of Core Legislative Materials

Matt Landgraf - GPO

May 22, 2013

Background

Joint Committee on Printing approved collaboration on digitization of:

Statutes at Large

Bound Congressional Record

Roles and Responsibilities

Library of Congress:

Performs digitization

Provides files to GPO

GPO:

Creates access copies

Creates metadata

Statutes at Large Status

All work for volumes from 1951-2002 has been completed

Currently available via FDsys

Access files and metadata have been provided to LOC (to be available on congress.gov in the future)

Bound Congressional Record Status

LOC Digitization (1873-1998) to be completed by the end of calendar year 2013

FDsys development underway

Resources being identified for metadata creation

Content will be released on an iterative basis via FDsys, beginning in FY 2014

Bound Congressional Record: Key Issues

Size of collection

Large effort required to create descriptive metadata and access files at the article level

Official Tools Demonstration Panel

Michael Baker
House Committee on Ways and Means

Stephen Dwyer
Office of the Democratic Whip

Kathleen Swiatek
Government Printing Office

The Official Intranet for House Democratic Staff

Presentation by Steve Dwyer, Office of the Democratic Whip

HISTORY & ORIGIN

• Originally launched in early 2009
• We recently launched our 3rd major iteration
• Private: only House Democratic staffers have access
• Why did we build it?
• Why Democrats-only?

ORGANIZATION

• Over 120,000 nodes and counting
• How do we organize content?
– Primarily by legislation
– General issue tags
– “Specific Topics” for big non-bill items
– Authoring office and staffer

DATA SOURCES UTILIZED

• GovTrack for legislative information
• House LDAP for permissions and credentials
• Housenet’s e-Dear Colleague system
• DemocraticWhip.gov for House Floor schedule

DATA SOURCES UTILIZED (CONTINUED)

• Docs.house.gov for Committee schedules
• POPVOX for organization letters and public sentiment
• Staffer data from a commercial vendor
• Significant private listservs are auto-consumed


International Update

Gherardo Casini

Global Center for ICT

Electronic Legislative Archiving Panel

James Jacobs
Government Information Librarian, Stanford Univ.

Lisa LaPlant
Government Printing Office

Marc Levitt
Byrd Center for Legislative Studies

Preserving Electronic Legislative Information in FDsys

Legislative Data Transparency Conference
May 22, 2013

Lisa LaPlant
GPO

GPO’s Mission

Keeping America Informed by producing, protecting, preserving, and distributing the official publications and information products of the Federal Government.


Legislative Publications
• Bills and Resolutions
• Committee Materials
• Congressional Calendars
• Congressional Directory
• Congressional Record
• United States Code
• Journal of the House of Representatives
• Procedural and Precedential Materials


Digital Preservation

Combination of the policies, strategies, and actions that ensure access to reformatted and born digital content regardless of the challenges of media failure and technology change.


Preservation Goal

Accurately render authenticated content over time.


Preservation Objectives
• Safeguard digital content along with all relevant metadata.
• Assess the condition and needs of collections of digital information.
• Meaningfully render content despite continuously changing technology.
• Manage processes which are auditable, replicable, and that build the basis for trust.

OAIS Reference Model

[Diagram: the OAIS functional entities (Ingest, Archival Storage, Data Management, Access, Preservation Planning, and System Administration) sitting between the Producer and the Consumer.]

Package Based Approach

[Diagram: Package 1 holds mods.xml, aip.xml, premis.xml, and two renditions (Rendition 1 and Rendition 2), each with its own content files.]
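The package layout named on this slide (mods.xml, aip.xml, and premis.xml alongside renditions that each hold content files) can be sketched as a small data structure. This is only an illustration: the class names, field names, and file names such as `bill.pdf` are assumptions, not the actual FDsys schema, and treating the three XML files as package-level metadata is an inference from the diagram.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Rendition:
    """One rendition of the content (hypothetical model)."""
    name: str
    content_files: List[str] = field(default_factory=list)


@dataclass
class Package:
    """A package: metadata files plus one or more renditions (hypothetical model)."""
    package_id: str
    renditions: List[Rendition] = field(default_factory=list)
    # Metadata file names taken from the slide; their placement at the
    # package level is an assumption.
    metadata_files: Tuple[str, ...] = ("mods.xml", "aip.xml", "premis.xml")

    def manifest(self) -> List[str]:
        """List every file path in the package, metadata first."""
        paths = [f"{self.package_id}/{m}" for m in self.metadata_files]
        for rendition in self.renditions:
            paths += [
                f"{self.package_id}/{rendition.name}/{f}"
                for f in rendition.content_files
            ]
        return paths


pkg = Package(
    "package-1",
    [
        Rendition("rendition-1", ["bill.pdf"]),
        Rendition("rendition-2", ["bill.xml"]),
    ],
)
for path in pkg.manifest():
    print(path)
```

Keeping the descriptive and preservation metadata next to every rendition means a package can be copied or audited as one self-describing unit.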


PREMIS

Record each significant event in the lifecycle of content in PREMIS metadata. Record the content source, changes that have occurred since the content was created or acquired, and who has custody of the content.

Events Recorded in PREMIS

Software Activities: Digest Calculation Ingest Fixity Check Rendition Creation ACP Creation Digital Signing Parsing

User Activities: Rendition Upload Rendition Deletion Submission Replacement AIP Deletion
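Two of the software activities listed above, digest calculation and fixity checking, can be sketched together with a simple event log. This is a minimal illustration, not GPO's implementation: the slides do not name a hash algorithm (SHA-256 is assumed here), and the event dictionaries are a simplified stand-in for real PREMIS XML records.

```python
import hashlib
from datetime import datetime, timezone


def sha256_digest(data: bytes) -> str:
    """Digest calculation: hash the content bytes (algorithm is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def record_event(log: list, event_type: str, outcome: str) -> None:
    """Append a simplified lifecycle event; not the real PREMIS schema."""
    log.append({
        "type": event_type,
        "outcome": outcome,
        "when": datetime.now(timezone.utc).isoformat(),
    })


def fixity_check(data: bytes, stored_digest: str, log: list) -> bool:
    """Fixity check: recompute the digest and compare it with the stored one."""
    ok = sha256_digest(data) == stored_digest
    record_event(log, "fixity check", "pass" if ok else "fail")
    return ok


events = []
content = b"<bill>example</bill>"

# At ingest, compute and store the digest.
stored = sha256_digest(content)
record_event(events, "digest calculation", "pass")

# Later, verify an unchanged copy and a copy with one extra byte.
intact = fixity_check(content, stored, events)
damaged = fixity_check(content + b" ", stored, events)
```

Because every check appends an event, the log doubles as the audit trail that the preservation objectives call for.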


Preservation Strategies

• Refreshment (bit-level preservation): Content is transferred from one physical medium to another.
• Migration: Content is converted or transformed into a more recent version or a more widely used format.

FDsys Primary and COOP


More Information

Lisa LaPlant
Office of Programs, Strategy, and Technology, [email protected]

GPO’s FDsys: www.fdsys.gov

Preservation in FDsys: www.gpo.gov/preservation

Archiving Senator Byrd’s E-Records

Marc Levitt

Director of Archives

Robert C. Byrd Center for Legislative Studies

Records Received & Migrated

• Early Petitions (1790-1817): PDFs with OCR
• Byrd Migration Projects:
– Photographs: TIFF
– A/V Material: Outside Vendor
– Microfilm: PDFs, then OCR (in-house)
• Byrd Capture Projects:
– CSPAN floor speeches
– Congressional Record PDFs
• Byrd Office Files Received:
– Hard drive with files from the shared drive
– Constituent Services System (CSS) data on 2 DVDs

Case Study: CSS Processing

• Hired a contractor
• Script to automate ingestion of data
• CSV tables cleaned and optimized with Google Refine
• SQL database created
• Waiting for installation
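The ingestion steps in this case study (read CSV tables, clean them, load them into a SQL database) can be sketched with Python's standard library. This is an assumption-laden illustration rather than the contractor's script: the table name, column names, and sample rows are hypothetical, simple whitespace trimming stands in for the Google Refine cleanup, and SQLite stands in for whatever SQL database was actually built.

```python
import csv
import io
import sqlite3

# Hypothetical CSS export; the real tables and columns are not described.
RAW_CSV = """case_id,constituent,topic
1, Jane Doe ,Veterans
2,John Roe,  highways
"""


def load_csv(conn: sqlite3.Connection, table: str, text: str) -> int:
    """Create a table from the CSV header and insert the trimmed rows."""
    rows = list(csv.DictReader(io.StringIO(text)))
    columns = list(rows[0].keys())
    col_sql = ", ".join(f'"{c}" TEXT' for c in columns)
    conn.execute(f'CREATE TABLE "{table}" ({col_sql})')
    placeholders = ", ".join("?" for _ in columns)
    # Minimal cleanup: strip stray whitespace from every value.
    cleaned = [[row[c].strip() for c in columns] for row in rows]
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', cleaned)
    return len(cleaned)


conn = sqlite3.connect(":memory:")
count = load_csv(conn, "cases", RAW_CSV)
topics = [r[0] for r in conn.execute("SELECT topic FROM cases ORDER BY case_id")]
names = [r[0] for r in conn.execute("SELECT constituent FROM cases ORDER BY case_id")]
```

Loading every column as text keeps the sketch faithful to the source data; typing and normalization decisions are exactly the kind of choice that affects authenticity, as the later slides note.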

What the Office Uses:

Senator Byrd confers with President Jimmy Carter at the White House. (August 23, 1977). Official White House Photo.

What is Archived by the Vendor:

• <A color photograph of Senator Byrd (left) and President Carter discuss issues in an office.>
• <Senator Byrd is seated on a floral print couch.>
• <President Carter is seated on a blue chair.>
• <Flower curtains hang behind the men.>
• <A white lamp sits on a brown table between them.>

The Reconstructed Result:

Not the Same:

Full picture and functionality in original record

Loss of information and context through 3 phases of data migration

Issues

• Authenticity and Reliability
• Standardization
• Organization Schema
• What to Save (and why it’s okay to do so)

Third Party Extensions of Legislative XML Panel

Daniel Bennett

eCitizen

Jim Harper

CATO Institute

Eric Mill

Sunlight Foundation

Extending Congressional XML:

Transparency, Soup to Nuts

Extending XML

“Soup to Nuts”

- American English idiom conveying the meaning of "from beginning to end"
- Derived from the description of a full course dinner, in which courses progress from soup to a dessert of nuts

Extending XML

“Deepbills” Project

CatoXML

http://www.cato.org/resources/data


Extending XML

What can YOU build?


Under-digitized Legislative Data Panel

Anne Washington

George Mason University

Grant Vergottini

Xcential, Inc.

Josh Tauberer

GovTrack

Why Digitize?

Anne L. Washington, PhD

George Mason University, School of Public Policy

May 2013

Legislative Data Standards Conference
US House of Representatives

Political Informatics

Poli-Informatics
• Computational science & "big data"
– Data visualization
– Machine learning
• Study of politics and government

http://poliinformatics.org

Poli-Informatics could…

• Visualize complex policy solutions.
• Predict procedural progress through language.
• View nested organizational hierarchies impacted by a policy.
• Gather single policy idea across multiple ideological discourses.
• Track policy developments over time.

Joint PI-net

• George Mason University
• University of Washington
• Northwestern University
• Cornell University
• Carnegie Mellon University
• Pennsylvania State University
• & YOU !

http://poliinformatics.org

Anne L. Washington, PhD
http://washington.gmu.edu

[email protected] Professor

School of Public Policy

Organizational Development & Knowledge Management

George Mason University, Arlington VA

Digitizing Legislative Data
From documents to data to information and beyond

Grant Vergottini

May 22, 2013

Digitizing Legislative Data
From documents to data to information and beyond

[Diagram: Past (Data Scraping, Proprietary XML), Now (XML Download, Open XML), Future (Web Services, Standards-Based XML, Akoma Ntoso)]

Step 1: Legislative Documents Online
Putting the documents online

[Diagram: Past (Data Scraping, Proprietary XML)]

• Simple systems
• Geared towards people rather than programs

• Data Scraping for programs
• Roll your own XML
• Maintain your own repository

Step 2: Legislative Data Sources
Improving data accuracy

[Diagram: Past (Data Scraping, Proprietary XML) and Now (XML Download, Open XML)]

• Authentic data
• More sophisticated Web Sites
• Download XML directly
• Open Gov. data formats
• Still need your own repository
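The difference between scraping pages and downloading XML directly is that downloaded XML can be read with an ordinary parser instead of page-specific scraping rules. The sketch below is illustrative only: the `BILL_XML` fragment and its element names are hypothetical, not a published House, Senate, or GPO schema, and the string stands in for a file that would normally be downloaded into a local repository.

```python
import xml.etree.ElementTree as ET

# Hypothetical bill fragment; element and attribute names are illustrative.
BILL_XML = """<bill number="H.R. 1">
  <title>Example Act</title>
  <section id="s1"><heading>Short title</heading></section>
  <section id="s2"><heading>Definitions</heading></section>
</bill>"""


def summarize(xml_text: str) -> dict:
    """Pull the bill number, title, and section headings out of the XML."""
    root = ET.fromstring(xml_text)
    return {
        "number": root.get("number"),
        "title": root.findtext("title"),
        "sections": [
            (s.get("id"), s.findtext("heading")) for s in root.iter("section")
        ],
    }


summary = summarize(BILL_XML)
print(summary["number"], summary["title"])
for section_id, heading in summary["sections"]:
    print(section_id, heading)
```

The structure is read from the markup itself, so nothing here depends on how a web page happens to be laid out, which is the fragility that scraping introduces.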

Next: Legislative Information Services

[Diagram: Past (Proprietary XML), Now (XML Download, Open XML), Future (Web Services, Standards-Based XML, Akoma Ntoso)]

Step 3: Legislative Information Services
Connecting the information

[Diagram: Future (Web Services, Standards-Based XML, Akoma Ntoso)]

• More reliable data
– Authentic HTML & XML
• More useful data
– Consumer rather than producer oriented
– Simpler standards-based information models
– Linked citations & other metadata
– Microformats & Microdata for HTML
• More timely data
– Web services rather than download
– Link services stitch data together
– Robust repository services: search, query

Step 4: The Vision
Connecting the world

• State & Federal Laws

• Regulations to Legislation

• Treaties & Trade Agreements

So what’s left to do?

Joshua Tauberer (@JoshData)
GovTrack.us

Legislative Data & Standards
May 22, 2013

All legislative events are recorded in structured data.

All legislative artifacts are publicly available.

(How hard could that be, right?)

Legislative Data

• Bill Summary & Status
• Amendment Status & Text
• List of Members
• Committee Artifacts
• Historical Bill Text, Statutes, and so on.

http://opengovdata.io/maturity/

Wrapping Up

Reynold Schweickhardt

Director of Technology

Committee on House Administration

Thank you for participating!

Legislative Data and Transparency Conference