Big data, which has been the hot topic for conferences this year, has also received a good deal of attention on Capitol Hill in recent weeks, most notably with two recent events:


As one who represents a population of data scientists, a group for which TechAmerica says there is growing demand, I have seen quite a few articles about recent big data conferences, and written a number of them:
For those who contributed to the ACT-IAC discussion with Congressional staff members, “Big Data at the Hill – Defining and Understanding Policy Implications,” I offer some specific ideas on three suggestions in their report:

What Congress should do to help big data

  • Allow access to confidential data such as the Census data centers
  • Allow sharing between statistical agencies
  • Have a chief data officer who promotes a federal data science community of data scientists and statisticians
The federal government should first focus on the value of big data
  • Hadoop projects are costing 50 times more than expected
  • DHS failed with a big data in the cloud project, but did so quickly and at less cost
  • Semantic Medline on the Cray Graph Computer is an example of a federal data science team project with value
The federal government should foster real innovation with government data
  • Encourage private industry to add value to government data
  • Consider having the federal government’s chief statistician be the chief data officer
  • Empower the government’s data scientists and statisticians to analyze big data and statistical data
I also provided Congressional staff members with more detail on nine topics of special interest to them, including my ideas on trends, issues, and comments.
See the full table at the end of this article.

My hope is that my suggestions will actually move decision makers to productive action.

I agree with the TechAmerica Big Data Commission’s report, “Demystifying Big Data: A Practical Guide to Transforming the Business of Government,” when it says:

“Government agencies should think about Big Data not as an IT solution to solve reporting and analytical information challenges but rather as a strategic asset that can be used to achieve better mission outcomes, and conceptualized in the strategic planning, enterprise architecture, and human capital of the agency. Through this lens, government agencies should create an ownership structure for the data, treating it like any other asset – one that is valued and secured.

“Ultimately, agencies should strive to address the following two questions – ‘How will the business of government change to leverage Big Data?’ and ‘How will legacy business models and systems be disrupted?’”

Note that my recent story, “Open Government Data and Statistical Data: Haven’t We Been Here Before?”, describes how senior government statisticians view much of what has been done with open government data as “IT projects” rather than solid statistics and data science for decision makers.

I also agree with the report when it says:

“Because of the importance of data in the digital economy, the Commission encourages each agency to follow the FCC’s decision to name a Chief Data Officer. To generate and promulgate a government-wide data vision, to coordinate activities, and to minimize duplication, we recommend appointing a single official within the OMB to bring cohesive focus and discipline to leveraging the government’s data assets to drive change, improve performance, and increase competitiveness.”

I have made the same suggestion to Congressional staff members recently as well, emphasizing that many individuals in government, such as the CIA’s Chief Technology Officer, are essentially functioning in multiple roles that include data science, computer science, and big data entrepreneurship.

But we need more of them, not just speaking at conferences, writing reports, and evangelizing big data, but leading teams that produce best-practice examples of big data for the business and science of government in cooperation with industry and academia.

We also need concrete examples that deliver return on investment soon and often for big data to gain and retain taxpayer support in these tight budget times.

Nine Big Data Topics of Special Interest to Congressional Staffers:

Topics | Trends | Issues | Comments
Myth vs. Realities | Big Data Solves Everything | Hype Without Demonstrated Business and Scientific Value | See Data Evolution in the Government Enterprise: Will It Still Be Big Data Next Year?
Privacy: Who knows what? | The Intelligence Community Knows Everything | Who Knows Everything the Intelligence Community Is Doing? | See Intelligence Community Loves Big Data
Cloud: Where Big Data belongs? | Terabytes to Zettabytes | Bandwidth Limitations | Amazon: FedEx Your Storage Devices To Us to Upload Your Big Data
Mobility – of you and your data | Bring Your Own Device (BYOD) | Conventional Web Sites and Databases Are Not Mobile-Enabled | Your Mobile Device Has Access To a Supercomputer
Storage and technology | Scalable single level storage | Collapses the Server, Network, and Storage by removing software and replacing them with memory system primitives | Panève’s ZettaLeaf & ZettaTree Products
Data Analytics – hidden gems and spurious conclusions | Data Science | Too Few Data Scientists – Need a Government Data Science Community | See my Data Journalism Articles
Opportunities and risks in data aggregation | Aggregate Before Analysis To Reduce Size | Needles Could Be Lost | See Data Evolution in the Government Enterprise: Will It Still Be Big Data Next Year?
Security concerns for large data sets | Integrate Classified and Unclassified Data Sources | Different Security Levels | Need To Specify/Protect Security at the Row and Element Level
Financial Implications | Hadoop for Everything with Big Data | Costs 50 Times Higher Than Expected | Big Data In Memory Could Be More Cost-Effective