News, documents and analysis on violent extremism

Saturday, June 8, 2013

Understanding NSA Data Collection and Locations

The latest in the Guardian's series of eye-opening stories on the NSA's data collection displays a heat map of the geographic location of IP addresses collected by the agency -- billions of them -- and uses it to make the following claim:
The Boundless Informant documents show the agency collecting almost 3 billion pieces of intelligence from US computer networks over a 30-day period ending in March 2013. [...] Other documents seen by the Guardian further demonstrate that the NSA does in fact break down its surveillance intercepts which could allow the agency to determine how many of them are from the US. The level of detail includes individual IP addresses.
IP address is not a perfect proxy for someone's physical location but it is rather close, said Chris Soghoian, the principal technologist with the Speech Privacy and Technology Project of the American Civil Liberties Union. "If you don't take steps to hide it, the IP address provided by your internet provider will certainly tell you what country, state and, typically, city you are in," Soghoian said.
That approximation has implications for the ongoing oversight battle between the intelligence agencies and Congress.
This might be true, but it is by no means certain. IP addresses apply not only to people -- the targets of the investigation -- but also to the servers of Web sites that are tracked by the NSA and used by its actual targets. Here's an example of how this might work.

Abu Joe is a global terrorist. He lives in Mali. He's an avid user of the Shamikh jihadist forum, which he accesses from an Internet connection in his home. Let's say Joe is not a particularly careful terrorist, so he doesn't do anything to hide his IP address. So Joe uses his computer's IP address, which is associated with Mali, to access Shamikh. This results in a simple network that looks like this (all IP addresses are fictional here):

All nice and simple, and Joe's IP address does indeed correspond, roughly, to his physical location. Now let's say you're the NSA and you're building a dossier on Joe. So every time Joe posts on Shamikh, you have automated software that "scrapes" a copy of the forum page.

The automated software is set up to get every scrap of information from the page, but for now we'll just look at the IP addresses. In addition to Joe's IP address, the server that hosts the page also has an IP address. This is USEFUL information if you're the NSA, for various reasons, so by default their hypothetical software scrapes the server's address in addition to Joe's. So in addition to Joe's location, you also have the location of the building where Shamikh's servers are probably located, which happens to be in Ukraine. The data record now looks like this:
So now you have two IP addresses and two locations, but Joe is only sitting at his computer in one of them.
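For readers who think in code, here is a minimal sketch of what such a record might look like. Everything in it is an assumption made for illustration: the `PostRecord` class, the `GEO` lookup table and the addresses themselves (taken from ranges reserved for documentation) are hypothetical, and nothing here describes the NSA's actual software.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical lookup table standing in for a real IP geolocation database.
# Both addresses are fictional and come from documentation-only ranges.
GEO = {
    "192.0.2.10": "Mali",        # Joe's home connection (assumed)
    "198.51.100.20": "Ukraine",  # server hosting the forum (assumed)
}


@dataclass
class PostRecord:
    """One scraped forum page: the poster's IP (if exposed) and the server's IP."""
    user_ip: Optional[str]
    server_ip: str

    def locations(self) -> List[str]:
        # Geolocate every address attached to the record, skipping a hidden user IP.
        ips = [ip for ip in (self.user_ip, self.server_ip) if ip is not None]
        return [GEO[ip] for ip in ips]


record = PostRecord(user_ip="192.0.2.10", server_ip="198.51.100.20")
print(record.locations())  # ['Mali', 'Ukraine']
```

One record, two locations, and only the first one tells you anything about where Joe is sitting.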

Of course, the NSA is also interested in Shamikh, where Joe hangs out, so let's say they use some kind of automated software to scrape every page on the Shamikh forum along with some basic useful information about the page -- date created, software used to make it, and the IP address of the server hosting the page.

To keep it as simple as possible, we'll say there are 5,000 posts on Shamikh's forum, each one a post created by a different user.

Eighty percent of the users have been sloppy and exposed their IP addresses, while 1,000 kept their IPs hidden. Six of the sloppy users live in Ukraine.

What you end up with, in this very simplified example, are 5,000 records for pages containing 9,000 IP addresses -- 4,000 for users spread all over the world, and 5,000 for the Web pages on which those users posted a comment, which are associated with an IP address based in Ukraine.

If you made a heat map of all the IP addresses you had collected, more than half would be located in Ukraine. Ukraine would be red hot on the map, and every other location would be much cooler. But only six of the 5,000 users actually *live* in Ukraine.
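The arithmetic behind that heat map can be checked with a short Python tally. This is only a sketch of the simplified example above; the variable names are mine, and all the users outside Ukraine are lumped into a single "elsewhere" bucket for brevity, even though in the example they are spread all over the world.

```python
from collections import Counter

TOTAL_POSTS = 5000        # one page per post, each by a different user
EXPOSED_USERS = 4000      # 80 percent of users did not hide their IP
USERS_IN_UKRAINE = 6      # sloppy users who actually live in Ukraine

counts = Counter()
counts["Ukraine"] += TOTAL_POSTS                         # server IP on every page
counts["Ukraine"] += USERS_IN_UKRAINE                    # users located in Ukraine
counts["elsewhere"] += EXPOSED_USERS - USERS_IN_UKRAINE  # all other exposed users

total_ips = sum(counts.values())
ukraine_share = counts["Ukraine"] / total_ips

print(total_ips)                                 # 9000
print(f"{ukraine_share:.1%}")                    # 55.6%
print(f"{USERS_IN_UKRAINE / TOTAL_POSTS:.2%}")   # 0.12%
```

So about 56 percent of the collected addresses geolocate to Ukraine, while roughly a tenth of one percent of the forum's users are actually there.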

So when the Guardian reports that:
A snapshot of the Boundless Informant data, contained in a top secret NSA "global heat map" seen by the Guardian, shows that in March 2013 the agency collected 97 billion pieces of intelligence from computer networks worldwide.
You can't look at the map and jump to conclusions about where the targets of surveillance are based. The map does, to some extent, reflect the locations of the targets, but it almost certainly also includes the location of the infrastructure they use.

According to the heat map, an "orange" amount of the "pieces of intelligence" (theoretically based on IP addresses and other infrastructure) has been geolocated to the United States. Orange is not a very specific amount, and we don't know if the data has been rendered on a logarithmic scale, which is often used to visualize data spanning a very wide range of values by compressing (i.e., distorting) the differences between them.
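To see how much a logarithmic scale can change the picture, here is a rough Python sketch. The three region counts are hypothetical numbers I made up for illustration, not figures from the NSA map, and dividing by the log of the largest count is just one common way to normalize; we don't know what method, if any, was used on the real chart.

```python
import math

# Hypothetical counts for three unnamed regions; NOT figures from the NSA map.
counts = {
    "region_a": 30_000_000,
    "region_b": 3_000_000_000,
    "region_c": 14_000_000_000,
}


def linear_intensity(n: int, peak: int) -> float:
    """Color intensity from 0 to 1, proportional to the raw count."""
    return n / peak


def log_intensity(n: int, peak: int) -> float:
    """Color intensity from 0 to 1, proportional to log10 of the count."""
    return math.log10(n) / math.log10(peak)


peak = max(counts.values())
for name, n in counts.items():
    print(f"{name}: linear={linear_intensity(n, peak):.3f} "
          f"log={log_intensity(n, peak):.3f}")
```

On the linear scale, the smallest region sits at well under 1 percent of the peak and would barely register. On the log scale it lands at roughly three-quarters of the peak, so it could easily be drawn in a color not far from the hottest spot on the map.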

For all these reasons, you can't reasonably infer that some plurality of targets of surveillance are located in the United States. Many, many, many global extremists use Web sites hosted in the U.S., but very few of them live here themselves.

On the flip side, none of this *precludes* the possibility that many targets are based in the United States. But when you're looking at a PowerPoint map chart (for God's sake) designed to visualize 97 billion pieces of information scraped from the world every month, you have to recognize that a lot of detail is dropping out for the sake of visualization.

This is complicated stuff, and it's important to understand what mountains of information and complexity lie behind an extremely simple graphic. What I've outlined here is likely only the tiniest slice of that complexity.

We know a whole lot more about the NSA's programs than we did last week, but the information we lack vastly outweighs the information we have. We should be cautious in interpreting data summaries we don't fully understand.

Buy J.M. Berger's book, Jihad Joe: Americans Who Go to War in the Name of Islam


Views expressed on INTELWIRE are those of the author alone.



INTELWIRE is a web site edited by J.M. Berger, a researcher, analyst and consultant covering extremism, with a special focus on extremist activities in the U.S. and extremist use of social media. He is a non-resident fellow with the Brookings Institution, Project on U.S. Relations with the Islamic World, and author of the critically acclaimed Jihad Joe: Americans Who Go to War in the Name of Islam, the only definitive history of the U.S. jihadist movement, and co-author of ISIS: The State of Terror with Jessica Stern.

