It is clear to me that the CIA needs big data on the scale of zettabytes (10 to the 21st power bytes), along with the ability to find and connect the “terrorist dots” in it. As of 2009, the entire Internet was estimated to contain close to 500 exabytes, which is half a zettabyte.

Recently I have listened to three senior CIA officials — two former and one current — talk about this and the need for data science and data scientists to make sense of it.

Gen. Michael Hayden, former director of the CIA and the National Security Agency and former Principal Deputy Director of National Intelligence, and Bob Flores, former chief technology officer at the CIA, spoke about this at the MarkLogic Government Summit. Gus Hunt, current CTO at the CIA, spoke about it at the Amazon Web Services Summit that I wrote about recently.

General Hayden framed the problem as follows. In the Cold War era, it was easy to find the enemy but hard to stop them (e.g. Soviet tanks in East Germany). In the Global War on Terrorism, it is hard to find the terrorists but easy to stop them once they're found (e.g. the underwear bomber on the airplane). He said we live in an era where the problem is not a failure to share data, but the difficulty of processing the sheer volume, variety, and velocity of data that results from sharing.

He described meeting with former Egyptian President Mubarak before the recent Arab awakening, which was fueled by social media and resulted in Mubarak's overthrow, and then meeting with the President of Twitter, Jack Dorsey, whom he asked: How does it feel to overthrow a government, something the CIA, when Hayden was director, was never able to do?

Hayden also said we need tools to predict the future from social media and data scientists to use them.

I told him about my work with Recorded Future, which was also the subject of a Breaking Gov story.

Bob Flores, former CIA CTO, said that Recorded Future was a fantastic new technology and that the old model of collect, winnow, and disseminate fails spectacularly in the big data world we live in now. He used the recent movie “Moneyball” as an example of how sabermetrics, the new field of baseball analytics, has shown there is no more rigorous test (of a business plan) than empirical evidence.

He said that in this time of budget cuts and downsizing, the cream will rise to the top: the people and organizations that can solve real problems with data will survive. Flores agreed with Gen. Hayden that while all budgets are on a downslope (including those for defense, intelligence, and cyber), cyber is on the gentlest downslope of all, because it is recognized that limiting the analysis of big data would be equivalent to disarmament in the Cold War era.