Getting to Know Our Numerical Selves

Upload: carl-thorne

Post on 03-Apr-2018






  • 7/28/2019 Getting to Know Our Numerical Selves


    Bloomberg Businessweek

    Politics & Policy

    Balancing Security and Liberty in the Age of Big Data

    By Paul Ford on June 13, 2013

    A very large Internet company once had the noble impulse to share some of its data with the research

    community. It made three months of log files from its search service available to all. The company took many

    steps to preserve privacy, removing personal information and randomizing ID numbers in the belief that this

    would make it impossible to identify any of the more than 650,000 customers who'd used the service. But

    Internet hobbyists, professional researchers, and journalists were able to ferret out many of the users. No. 4417749, for example, was a Georgia widow. Another user appeared to be planning a murder. Today, the

    AOL (AOL) Search Log Scandal is remembered as one of the weirdest missteps in Internet history.

    That took place an epoch ago, way back in 2006. Now anyone with a few dollars and a knack for computers

    can rent some cloud capacity and set up a stack of totally free technologies to deal with enormous amounts of

    data. Managing this data is a key part of functioning as a large Internet company. If you're the intelligence

    apparatus of a global superpower, and your job is to keep an eye on people who are contemplating terrible

    acts, this data is incredibly valuable. You're going to do what you can to get your hands on it. Once you do,

    you can employ beautiful, supple pieces of software, some with point-and-click interfaces and little

    icons, to help you understand what you're seeing. It's powerful stuff.

    That's essentially what's been going on. From a series of leaks to the Guardian newspaper we've learned that

    Verizon (VZ) turns over logs of all its calls (the numbers, locations, and other metadata, but not audio

    from calls themselves) to the National Security Agency every day. Thanks to a 29-year-old Booz Allen

    Hamilton (BAH) consultant named Edward Snowden, we also learned of a program called Prism. The details are uncertain: At first it seemed that companies such as Google (GOOG), Apple (AAPL), Facebook (FB),

    and Microsoft (MSFT) had given the NSA open access to all of their user information. Now it seems that

    these companies are merely streamlining the way that Foreign Intelligence Surveillance Act requests work,

    setting up a secured drop-point service for the NSA to use. Of course, this implies that there are so many

    requests that a special expediting system is needed, and since we don't know how much data is being

    shared, nor which is domestic or international, the leak about the NSA program has become a global


    Public debate about how to strike a balance between security and liberty in the age of global terrorism and

    Big Data is long overdue. And despite the insistence of the country's elected leaders that the NSA's activities

    pose no threat to law-abiding citizens, we can't merely shrug off the details. The total absorption of our

    telecommunications system into the national security apparatus should give all of us pause. But it shouldn't

    be shocking. That's because the vast digital trove of secrets amassed by the government isn't a secret.

    Amid all the fury over the Snowden leak, it's easy to forget that when asked in a March 12 hearing if the

    NSA collects any type of data at all on millions of Americans, the director of National Intelligence, James

    Clapper, said, "No sir, not wittingly." Later, on NBC, Clapper offered this explanation: "What I was thinking

    of is looking at the Dewey Decimal numbers of those books in the metaphorical library." Collecting data, he

    said, "would mean taking the books off the shelf, opening it up, and reading it."



    In other words, the NSA is building a giant card catalog of human beings, many of whom are Americans.

    They're not actually collecting their conversations, or the people themselves, but all the data about their

    conversations: not the data but the metadata.

    It's entirely possible that Clapper believes he has drawn a sensible ethical line here. Yet as that AOL case in

    2006 made clear, metadata can be revealing. Search histories or call logs like those the NSA ingests and

    presumably stores are hardly the same as Dewey Decimal numbers. They're more like the index in the back

    of a book. What the NSA seems to be doing is treating hundreds of millions of people like open books and

    indexing them: Who are they, who do they know, where have they been, and so forth.

    Data that's well-defined and cleanly organized can be connected to whole other swaths of data. So once you

    build big organized indexes of human beings (one of their search terms, one of their phone calls, for

    example), you can merge them into one mega-index. And you can combine that mega-index with other

    mega-indexes. There's enormous power in linking things. Google famously got its start by judging the way

    one page connected to another on the Web. Facebook has its social graph of interconnected people and

    organizations. Person A connects to person B, and by inference to all of person B's friends, too.

    This capability is so powerful and compelling that it gets hard not to link things. That a midlevel external

    consultant such as Snowden could have access to so many poorly designed PowerPoint slides doesn't necessarily demonstrate that the NSA is bad at keeping secrets. It could imply that these programs are so

    typical inside the organization that they're almost taken for granted. Perhaps people like Director Clapper

    really do believe, or have chosen to believe, that building a huge index to the invisible library of humanity is

    essentially a clerical act that doesn't fall under the same moral category as surveillance. He might argue that a

    program like Prism doesn't involve digging up secrets so much as combining ways of seeing the world.

    But there's a weird side effect for the rest of us. We are not just ourselves anymore. Each one of us has a new,

    statistical self living in databases around the world. It's those selves, uniquely identified bundles of behavior,

    that marketers target and companies try to reach. These are remarkable, distributed portraits of what we read,

    what we eat, and where we sleep. When it comes to our statistical selves, the difference between the NSA and

    private companies such as Facebook or Google or Amazon (AMZN) lies in what the government can do

    with the data it collects. It's building that giant index so that, if it needs to, it can actively cross the line

    between your statistical self and your real, physical self. It's the difference between "would you like to

    receive local coupons for businesses you love?" and "why is there a van in front of our house?"

    Do we have a choice? Not much of one, not yet. It's possible but very burdensome to encrypt all of your data

    and become less snoopable. Americans, according to polls, just don't care that much about this sort of

    privacy. As long as the line between the statistical self and the real self isn't crossed, why worry? Full

    participation in modern culture, one could argue, requires us to continually leave these data trails, to build

    these other selves, all bound to be indexed inside the NSA's secret empire at Fort Meade, Md.

    Will that change, now that the extent of surveillance has been revealed? It's hard to imagine the president

    doing much, since the executive branch has become an executive dashboard: a world of online petitions and

    spreadsheets and briefings produced from the very data under discussion. The legislature could act, but no

    one ever went broke underestimating the technical savvy of the U.S. Congress.

    There are, however, some basic questions an informed citizenry can and should ask. Where is this data being

    collected? Where does it come from? How long is it stored? Which databases are linked? And another one:

    Can I see? To its credit, the NSA does allow you to request your own file and see the information it has on

    record. You can mail the request to Fort Meade, fax it, or e-mail it (with a special digital signature). It takes a



    while to process; obviously the NSA would prefer not to share this information with you. The irony is that

    the NSA is very likely the organization that best understands the digital self-portraits we've painted over the

    Internet years. Searching for terrorists, it's built an unbelievably large index of human events.

    As the conversation unfolds, it's likely that the media will focus on individuals: Edward Snowden and his

    motives, or the terrorists caught out. But don't forget all of that data, all those captured moments. We deserve

    to know this database's shape and how it protects us. As the weeks go on and people in power talk about

    needles, keep your eye on the haystack.

    © 2013 Bloomberg L.P. All Rights Reserved. Made in NYC

