How the Military Sifts Through All the Data on Earth

It's a no-brainer that the military uses big data, but the scope is even greater than you may realize.

"Have you ever read The Art of War by Sun Tzu?"

It's a question I didn't expect to be asked by a data scientist who spends her days improving ways to extract information online. But then again, she was trying to explain how big data is used in defense planning.

"Tzu says that if you know the enemy and you know yourself, you need not fear the result of 100 battles," Olfa Nasraoui, a data scientist at the University of Louisville and an expert in big data, told me over the phone. "From that you should be able to get a lot of ideas about how big data is exploited in war."

Big data has become a buzzword of late. It's used so often, it's almost lost all meaning. Every day, there are stories about how big data can be harnessed and used by companies, governments, and even individuals. It's pretty much a no-brainer that the military is also taking advantage of big data, but exactly what data is it using, and how? It turns out the military is using pretty much any kind of big dataset you can imagine, for everything from the most basic daily tasks to the most complicated long-term planning.

"Large data sets have already existed for a long time, it's just that they have gotten larger and we have the computing power and technology to process it now," said Vicki Barbur, chief technical officer at Concurrent Technologies Corporation, a nonprofit applied research and development organization that helps clients (including the military) find the right data analysis tools. She said big data in defense planning starts with very basic information.

"The kind of info we all have access to on a day to day basis is information defense groups would be interested in: GPS, climate, environment, traffic, vegetation growth," Barbur told me in a phone interview. "This is all culled together and interrogated with specific questions in mind."

It's easy to imagine how this information might be useful for the military. If you're planning a mission, you'd want to have a program that can collect all of the vital data on weather, traffic, and environment, and allow you to plot out the best course of action. It's sort of like using a map app on your phone: the app is pulling in data about weather, road conditions, construction, traffic, and more, combining it with the specifications you laid out (no toll roads, a more scenic route, whatever), and spitting out step-by-step directions. That's big data in action, though on a relatively small scale.
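To make the map-app comparison concrete, here is a minimal sketch of that kind of route planning. It assumes a tiny, made-up road network (the `ROADS` list, its travel times, and its weather delays are all invented for illustration) and uses Dijkstra's shortest-path algorithm, which is one common way to do this and not necessarily what any real map app or military tool runs.

```python
import heapq

# Hypothetical road network, invented for illustration:
# (from, to, drive_minutes, has_toll, weather_delay_minutes)
ROADS = [
    ("A", "B", 10, False, 0),
    ("B", "D", 10, True, 0),
    ("A", "C", 15, False, 5),
    ("C", "D", 10, False, 0),
]

def best_route(roads, start, goal, avoid_tolls=False):
    """Return (total_minutes, path) via Dijkstra's algorithm, or None if unreachable."""
    graph = {}
    for src, dst, minutes, has_toll, delay in roads:
        if avoid_tolls and has_toll:
            continue  # apply the traveler's preference before searching
        cost = minutes + delay  # fuse road data with weather data
        graph.setdefault(src, []).append((dst, cost))
        graph.setdefault(dst, []).append((src, cost))  # roads run both ways

    queue = [(0, start, [start])]
    visited = set()
    while queue:
        total, node, path = heapq.heappop(queue)
        if node == goal:
            return total, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, cost in graph.get(node, []):
            if neighbor not in visited:
                heapq.heappush(queue, (total + cost, neighbor, path + [neighbor]))
    return None
```

With no restrictions, `best_route(ROADS, "A", "D")` picks the toll road through B. Pass `avoid_tolls=True` and it falls back to the slower, rain-delayed route through C.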

But big data in defense planning goes much deeper than rain and gridlock. Along with all the data we use on a daily basis, defense groups are pulling in human-generated data. And there's a lot of it.

"There are 7 billion people on Earth, and 6 billion of those people have a cell phone. One billion people logged in to Facebook on Monday, August 24. That's one in seven people on Earth," Nasraoui said. "The biggest challenge is going to be how to filter through the noise because most of the data is not going to be useful."

All of that human activity generates gargantuan amounts of data, only a tiny sliver of which might be useful for intelligence or surveillance purposes. The military needs to be able to comb through the data, pull out the necessary info, and do it all quickly. To do that, it relies on data scientists and researchers who have developed faster algorithms to analyze data, fusion tools that combine different datasets, new indexing techniques that organize data, improved cloud services to store and process data, and new software such as Memex, DARPA's "Google for the dark web."

In the real world, this allows military organizations to combine huge sets of human-produced data, everything from tweets to phone calls to GPS locations, and suss out necessary information. Nasraoui pointed to tracking ISIS as an example.

"ISIS actually uses social media to gain supporters and to boast about their activities," Nasraoui said. "Most of the data on social media is just innocuous, innocent data, but there is maybe just that one tweet from one of these armies like ISIS that has information we can use."

Obviously, no human can read through every word on Twitter to look for information about ISIS activity, so we rely on algorithms to comb through it and pull out salient data, she said.
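As a rough illustration of what "combing through" can mean at its simplest, the sketch below scores each post by how many watch-list terms it contains and keeps only the ones that match. The function name, the sample posts, and the watch terms are all hypothetical, and real systems rely on far more sophisticated language analysis than keyword counting.

```python
import re

def salient_posts(posts, watch_terms, min_hits=1):
    """Return (hits, post) pairs for posts that mention watch terms, most hits first."""
    terms = {t.lower() for t in watch_terms}
    results = []
    for post in posts:
        # Lowercase and split into simple word tokens.
        words = re.findall(r"[a-z0-9']+", post.lower())
        hits = sum(1 for w in words if w in terms)
        if hits >= min_hits:
            results.append((hits, post))
    # Stable sort: posts with equal scores keep their original order.
    results.sort(key=lambda pair: pair[0], reverse=True)
    return results

# Made-up sample data for illustration only.
posts = [
    "Lunch was great today",
    "Convoy leaving the checkpoint at dawn",
    "Checkpoint closed",
]
flagged = salient_posts(posts, ["convoy", "checkpoint"])
```

Here the lunch post is discarded as noise, and the post matching two terms is ranked ahead of the post matching one.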

Biometric technology has also introduced an entirely new realm of human-generated data, which the military is using not only for intelligence purposes (like scanning the irises of persons of interest in the field) but also to monitor US soldiers, CTC's Barbur said. Something as simple as the high-tech equivalent of a Nike FuelBand could provide reams of data to help analyze soldiers' performance.

"An active mission is not all that different from a football game," Barbur said. "You have players on the field or you have soldiers in the field, and you need to understand how they're performing, so you can know if someone is in harm's way before they actually collapse."
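The kind of early warning Barbur describes could, in its simplest form, look something like the sketch below: flag a wearer when several consecutive readings cross a limit. The function name, the heart-rate limit of 180, and the three-reading window are arbitrary placeholders chosen for illustration, not medical guidance or anything the military is known to use.

```python
def at_risk(readings, limit=180, sustained=3):
    """Return True if `sustained` consecutive readings exceed `limit`."""
    streak = 0
    for bpm in readings:
        # Extend the streak on a high reading; reset it on a normal one.
        streak = streak + 1 if bpm > limit else 0
        if streak >= sustained:
            return True
    return False
```

Requiring a run of high readings, not a single spike, is a simple way to avoid raising an alarm over one noisy measurement.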

Once it's collected and sorted, all of this information also needs to be presented in a way that allows defense planners to make sense of it. Some of the big data tools in our lives make decisions for us, Nasraoui explained, but in defense planning that wouldn't work. Take Netflix: It analyzes huge amounts of data on what you watch and uses an algorithm to suggest other movies and TV shows for you to watch. That's fine for Netflix, but when it comes to planning a war, the stakes are obviously way too high. Instead, the "suggested titles" part of the equation would be left up to human minds.

And volume isn't the only challenge in defense planning, especially with human-generated data. Analyzing human speech, for example, is simple for an actual human, but tricky for algorithms, Nasraoui said, though more research is being dedicated to this task. Then there's the question of ethics.

This challenge became all too apparent in the wake of Edward Snowden's leaks, and the question of how to source and analyze human-generated data legally and ethically is the most important one the military needs to consider, Nasraoui said, especially when many people are still in the dark on just how much data they generate every day.

"Unfortunately, I doubt that the majority of people actually are that aware of the data they produce, just by looking at the way people interact with websites and apps and games. They reveal so much of themselves," Nasraoui said. "I would doubt most people are aware of every possible way that their data gets used. And really every piece of data can be used if you think about it. Whether or not it's used ethically and legally, that's the big question."

All Fronts is a series about technology and forever war.