Getting to Know Our Numerical Selves
7/28/2019 Getting to Know Our Numerical Selves
Bloomberg Businessweek
Politics & Policy
Balancing Security and Liberty in the Age of Big Data
By Paul Ford on June 13, 2013
http://www.businessweek.com/articles/2013-06-13/balancing-security-and-liberty-in-the-age-of-big-data
A very large Internet company once had the noble impulse to share some of its data with the research
community. It made three months of log files from its search service available to all. The company took many
steps to preserve privacy, removing personal information and randomizing ID numbers in the belief that this
would make it impossible to identify any of the more than 650,000 customers who'd used the service. But
Internet hobbyists, professional researchers, and journalists were able to ferret out many of the users.
No. 4417749, for example, was a Georgia widow. Another user appeared to be planning a murder. Today, the
AOL (AOL) Search Log Scandal is remembered as one of the weirdest missteps in Internet history.
That took place an epoch ago, way back in 2006. Now anyone with a few dollars and a knack for computers
can rent some cloud capacity and set up a stack of totally free technologies to deal with enormous amounts of
data. Managing this data is a key part of functioning as a large Internet company. If you're the intelligence
apparatus of a global superpower, and your job is to keep an eye on people who are contemplating terrible
acts, this data is incredibly valuable. You're going to do what you can to get your hands on it. Once you do,
you can employ beautiful, supple pieces of software (some with point-and-click interfaces and little
icons) to help you understand what you're seeing. It's powerful stuff.
That's essentially what's been going on. From a series of leaks to the Guardian newspaper we've learned that
Verizon (VZ) turns over logs of all its calls (the numbers, locations, and other metadata, but not audio
from calls themselves) to the National Security Agency every day. Thanks to a 29-year-old Booz Allen
Hamilton (BAH) consultant named Edward Snowden, we also learned of a program called Prism. The details
are uncertain: At first it seemed that companies such as Google (GOOG), Apple (AAPL), Facebook (FB),
and Microsoft (MSFT) had given the NSA open access to all of their user information. Now it seems that
these companies are merely streamlining the way that Foreign Intelligence Surveillance Act requests work,
setting up a secured drop-point service for the NSA to use. Of course, this implies that there are so many
requests that a special expediting system is needed, and since we don't know how much data is being
shared, nor which is domestic or international, the leak about the NSA program has become a global
sensation.
Public debate about how to strike a balance between security and liberty in the age of global terrorism and
Big Data is long overdue. And despite the insistence of the country's elected leaders that the NSA's activities
pose no threat to law-abiding citizens, we can't merely shrug off the details. The total absorption of our
telecommunications system into the national security apparatus should give all of us pause. But it shouldn't
be shocking. That's because the vast digital trove of secrets amassed by the government isn't a secret.
Amid all the fury over the Snowden leak, it's easy to forget that when asked in a March 12 hearing if the
NSA collects any type of data at all on millions of Americans, the director of National Intelligence, James
Clapper, said, "No sir, not wittingly." Later, on NBC, Clapper offered this explanation: "What I was thinking
of is looking at the Dewey Decimal numbers of those books in the metaphorical library." Collecting data, he
said, would mean "taking the books off the shelf, opening it up, and reading it."
ncing Security and Liberty in the Age of Big Data - Businessweek http://www.businessweek.com/printer/articles/125430-balancing-
6/17/2013
In other words, the NSA is building a giant card catalog of human beings, many of whom are Americans.
They're not actually collecting their conversations, or the people themselves, but all the data about their
conversations: not the data but the metadata.
It's entirely possible that Clapper believes he has drawn a sensible ethical line here. Yet as that AOL case in
2006 made clear, metadata can be revealing. Search histories or call logs like those the NSA ingests and
presumably stores are hardly the same as Dewey Decimal numbers. They're more like the index in the back
of a book. What the NSA seems to be doing is treating hundreds of millions of people like open books and
indexing them: Who are they, who do they know, where have they been, and so forth.
Data that's well-defined and cleanly organized can be connected to whole other swaths of data. So once you
build big organized indexes of human beings (one of their search terms, one of their phone calls, for
example), you can merge them into one mega-index. And you can combine that mega-index with other
mega-indexes. There's enormous power in linking things. Google famously got its start by judging the way
one page connected to another on the Web. Facebook has its social graph of interconnected people and
organizations. Person A connects to person B, and by inference to all of person B's friends, too.
This capability is so powerful and compelling that it gets hard not to link things. That a midlevel external
consultant such as Snowden could have access to so many poorly designed PowerPoint slides doesn't
necessarily demonstrate that the NSA is bad at keeping secrets. It could imply that these programs are so
typical inside the organization that they're almost taken for granted. Perhaps people like Director Clapper
really do believe, or have chosen to believe, that building a huge index to the invisible library of humanity is
essentially a clerical act that doesn't fall under the same moral category as surveillance. He might argue that a
program like Prism doesn't involve digging up secrets so much as combining ways of seeing the world.
But there's a weird side effect for the rest of us. We are not just ourselves anymore. Each one of us has a new,
statistical self living in databases around the world. It's those selves, uniquely identified bundles of behavior,
that marketers target and companies try to reach. These are remarkable, distributed portraits of what we read,
what we eat, and where we sleep. When it comes to our statistical selves, the difference between the NSA and
private companies such as Facebook or Google or Amazon.com (AMZN) lies in what the government can do
with the data it collects. It's building that giant index so that, if it needs to, it can actively cross the line
between your statistical self and your real, physical self. It's the difference between "would you like to
receive local coupons for businesses you love?" and "why is there a van in front of our house?"
Do we have a choice? Not much of one, not yet. It's possible but very burdensome to encrypt all of your data
and become less snoopable. Americans, according to polls, just don't care that much about this sort of
privacy. As long as the line between the statistical self and the real self isn't crossed, why worry? Full
participation in modern culture, one could argue, requires us to continually leave these data trails, to build
these other selves, all bound to be indexed inside the NSA's secret empire at Fort Meade, Md.
Will that change, now that the extent of surveillance has been revealed? It's hard to imagine the president
doing much, since the executive branch has become an executive dashboard: a world of online petitions and
spreadsheets and briefings produced from the very data under discussion. The legislature could act, but no
one ever went broke underestimating the technical savvy of the U.S. Congress.
There are, however, some basic questions an informed citizenry can and should ask. Where is this data being
collected? Where does it come from? How long is it stored? Which databases are linked? And another one:
Can I see? To its credit, the NSA does allow you to request your own file and see the information it has on
record. You can mail the request to Fort Meade, fax it, or e-mail it (with a special digital signature). It takes a
while to process; obviously the NSA would prefer not to share this information with you. The irony is that
the NSA is very likely the organization that best understands the digital self-portraits we've painted over the
Internet years. Searching for terrorists, it's built an unbelievably large index of human events.
As the conversation unfolds, it's likely that the media will focus on individuals: Edward Snowden and his
motives, or the terrorists caught out. But don't forget all of that data, all those captured moments. We deserve
to know this database's shape and how it protects us. As the weeks go on and people in power talk about
needles, keep your eye on the haystack.
© 2013 Bloomberg L.P. All Rights Reserved. Made in NYC