Lately it seems that everyone is talking about “big data,” and for good reason – the potential to gain greater insight into the way decisions are made has implications for businesses, governments and societies the world over. Capitol Hill just took a deep dive into the big data pool to examine what this relatively new concept really means and how we can leverage it to address the greatest challenges of our day.
Last week, IBM joined government leaders on Capitol Hill to discuss how we can apply analytics technologies to big data to make critical decisions that improve the lives of the citizens we serve.
Big data is an unparalleled resource. Every day, we create 2.5 quintillion bytes of data – a rate growing so rapidly that 90% of the world’s data today has been created in the last two years. This unstructured data comes from everywhere, including sensors for climate information, posts to social media sites, digital pictures and videos, and smart power meters.
Big data is defined as the digital convergence of this unstructured data and the structured data found in databases. Big data promises a treasure trove of information that we as a society can tap to improve many facets of life – from energy to health care to transportation to public safety.
We now have the capacity to understand, with greater precision than ever before, how our world actually works – to see patterns unfolding in real time; to model possible outcomes; and to take informed action. As an example, the Police Department of Charleston, South Carolina, which joined us on Capitol Hill last week, is piloting a predictive analytics system that will give its police officers a more holistic view of historical crime statistics and patterns in order to prevent crime. The project – like those in New York, Las Vegas, Memphis and Los Angeles – demonstrates how local governments can make sense of data to better protect citizens while also lowering costs.
By applying analytics technologies to big data, we can do more than manage information; we can manage vast information supply chains. They’re made up of not only the tables of structured data that traditional computers love, but streams of unstructured text, images, sounds, sensor-generated impulses and more.
We can parse the real languages of commerce, processes and natural systems, as well as conversations from the growing universe of tweets, blogs and social media. We can also draw on advanced technologies such as stream computing to analyze gigabytes of flowing data “on the fly” and decide on an appropriate action, such as a real-time alert or capturing an insight for later analysis. Efforts like NYU’s Center for Urban Science and Progress (CUSP) will take Urban Informatics to a whole new level.
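The stream-computing idea described above – analyzing data as it flows past and deciding, event by event, whether to act immediately or set an insight aside for later – can be illustrated with a minimal sketch. The readings, window size and threshold here are hypothetical stand-ins; production stream-computing platforms operate on far larger volumes, but the decision logic is the same in spirit.

```python
from collections import deque

def analyze_stream(readings, window=5, threshold=100.0):
    """Scan a stream of numeric readings, raising a real-time alert when
    the rolling average exceeds a threshold, and archiving the rest for
    later analysis. All names and values are illustrative assumptions."""
    window_buf = deque(maxlen=window)   # only the recent window is kept in memory
    alerts, archive = [], []
    for value in readings:
        window_buf.append(value)
        rolling_avg = sum(window_buf) / len(window_buf)
        if rolling_avg > threshold:
            alerts.append((value, rolling_avg))   # decide "on the fly": act now
        else:
            archive.append(value)                 # capture an insight for later
    return alerts, archive
```

The key design point is that nothing requires the full stream to be stored first: each event is examined once, in order, with only a small rolling window held in memory.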
But we can only do all of this if our computing systems are up to the task. The advent of technologies like IBM Watson – the Jeopardy!-playing system that instantly analyzes natural human language – together with the massive amounts and varieties of big data flowing from sensors, mobile devices and the Web, is helping today’s data pioneers find answers to tough questions. These systems sift through the data, identify patterns and trends on the fly, and then present them in a way that’s easy for people to understand.
And when analytics technologies are coupled with the speed of a supercomputer, applications can very quickly deliver insights and value. The United States has long recognized the important role of high performance computing, modeling and simulation, with Congress and the administration expressing their continued commitment to ensuring our nation does not cede its leadership in heavy-duty computing.
Just two weeks ago, an IBM supercomputer named “Sequoia,” which is at the National Nuclear Security Administration’s (NNSA) Lawrence Livermore National Lab, was named the fastest supercomputer in the world by the TOP500 list. Argonne National Lab’s IBM supercomputer, “Mira,” was ranked third. Supported by experts, we can use these types of supercomputing resources to solve research and industrial challenges in cancer and genetic research, medical imaging and informatics, advanced manufacturing, environmental and climate research, materials science and more.
Moreover, public-private collaborations such as IBM’s with Lawrence Livermore, Argonne and other national labs and universities help to drive economic competitiveness through investments in heavy-duty computing and new analytics technology. And along with applying analytics to big data comes the opportunity to create jobs.
Forget the image of a scientist in a white lab coat; data scientists are our modern day explorers in the business world. The role of data scientist is causing shifts inside organizations and across business cultures, putting the job in high demand: currently, there are 10,000 openings in the U.S. from a broad variety of companies, ranging from deal-of-the-day websites to traditional retailers to global consumer goods distributors.
The era of big data has arrived, with implications for economic growth, job creation and improving society. It’s time to unleash the power of these breakthrough analytics technologies and heavy-duty supercomputing power to leverage the immense potential of big data.
Dave McQueeney is the vice president of software at IBM Research. Steven E. Koonin, recently under secretary for science in the Department of Energy, is director of the Center for Urban Science and Progress (CUSP) in New York City.