Amr Awadallah

The fact that the Department of Defense got its budget cut and the Intelligence Community got its budget increased in the White House’s 2013 budget request of Congress is indicative of more than the need to roll back a decade of military growth. It’s also indicative of a shift in IT focus–and a reflection that DoD’s network-centric focus is being overtaken by the IC’s big data-centric focus.

There are probably many reasons for such a shift. One is the world’s population. The U.S. Census Bureau estimates the world population passed 7 billion mark this past weekend. The rapidly growing number of people who will eventually have smartphones with multiple sensors (your iPhone has them now for GPS position, etc.) promises a future where there will be massive streams of real-time data that the IC will want to mine, looking for lone-wolf terrorists (who are relatively but easy to stop) who I have written about previously.

For companies like Google and Facebook, big data is big business, and for other companies big data is becoming their business as they mine their large swaths of data to improve their services and develop new business activities. The IC may not come out and say it, but it has to love the fact that Facebook will soon have 1/7th of the world’s population using it’s platform to share what’s going on. Or that Google is almost everyone’s favorite search engine because they can keep track of what people are posting and searching for much easier than many in government can.

The IC also has to love big data, and the rapid evolution of systems used to ingest and process it, because it helps push the technology wave, as Gus Hunt, CIA chief technology officer (pictured above), described it at the recent Government Big Data Forum.

Hunt said that in every aspect of their workflow at the CIA, from sensors to finished intelligence, massive, multiple, real-time sensor data streams cause bottlenecks on current networks that swamp current storage devices and overwhelm current query, analytics, and visualization tools, that are needed to produce finished intelligence.

So he wants his cake and to eat it too: He wants real-time analytics and visualizations that he says a few start-ups are trying to achieve. He also wants the Federal Cloud Computing Initiative to add two more services to Platform-, Software-, and Infrastructure-as-a-Service, namely, Data-as-a-Service and Security-as-a-Service.

Part of the solution is emerging from Google’s MapReduce, which is a parallel data processing framework that has been commercialized as Apache Hadoop (developed by Doug Cutting who named it after his son’s toy elephant) by Cloudera so one can store and compute big data at the same time.

Amr Awadallah, founder and CTO of Cloudera, calls Apache Hadoop a data operating system in contrast to Windows and Linux, which are essentially file operating systems (they store and manage all the files you create and are needed for your software applications). He points out that Apache Hadoop provides the three essential things: velocity, scalability, and economics, that are needed to handle big data.

So the IC, Gus Hunt, Amr Awadalla, and others at the Government Big Data Forum are leading the next technology wave and gave us a glimpse of both the technology infrastructure and the business organization with chief data officers and data scientists that will be needed to implement and succeed with big data.

More details about what was said can be found at CTOVision and at my wiki document, Data Science Visualizations Past Present and Future.