I was recently involved in a discussion about the successes of Open Government.


Some of the participants felt that the success of the Open Government initiative was the creation of Data.gov. I disagreed, saying that it is only a data catalog and that even the featured data sets are difficult to use and understand.

What we really need are data apps and data stories that the public and decision-makers could use to justify funding and claim success.
The 25 most popular apps at Data.gov were mostly XML feeds; only 5 were Excel files that I could easily make into data apps.
As an example of Data.gov’s limitations, I used a data story (forthcoming) that I am working on for the Energy Usage Analysis System of GSA’s Public Buildings Service, which tracks energy usage and trends from a variety of energy sources in every GSA-managed facility.

At Data.gov, one immediately sees the graphics for the featured data sets (8):
  • General Land Office Records System
  • Visit the Energy Community
  • Energy Usage Analysis System
  • Famine Early Warning System Network
  • Small Business Loans and Grants Program
  • Environmental Compliance and Enforcement Data
  • Federal Data Center Consolidation
  • RadNet

One also sees the list of the latest data sets (10). Below these, the site says there are 1,119 government apps.

I looked at the apps first to be sure I was not duplicating what had already been done. Looking at the 25 most popular apps (in terms of views), I found they were just external data sets. All Data.gov had added was another layer of links between the user and the external data set, with very little metadata or other value-added information: essentially a clearinghouse without much functionality or new content.


I had reported previously on the Federal Data Center Consolidation data (which had limitations) and plan to work next with the RadNet data.
Specifically for the Energy Usage Analysis System, the metadata and download page at Data.gov says only that the dataset is available for download. It then describes the EUAS application as a web-based system that serves the Energy Center of Expertise, under the Office of Facilities Management and Service Programs, and that is used for tracking energy details for various energy sources, namely electricity, natural gas, oil, chilled water, steam and renewable energy. It also points to the Agency Program Page.

So there is no real metadata or data dictionary for the 214 data elements.
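When no data dictionary is published, one fallback is to profile the columns yourself and treat the result as a provisional dictionary. The following is only a minimal sketch of that idea using Python's standard library. The `profile_columns` helper, the sample rows and the column names (`State`, `Zip`, `kWh`) are all hypothetical and are not taken from the actual EUAS files.

```python
import csv
import io


def profile_columns(rows):
    """Summarize each column: inferred type, non-empty count, distinct values."""
    profile = {}
    for row in rows:
        for name, raw in row.items():
            if name is None:
                # DictReader stores surplus fields under the key None; skip them.
                continue
            info = profile.setdefault(
                name, {"non_empty": 0, "numeric": 0, "values": set()}
            )
            value = (raw or "").strip()
            if not value:
                continue
            info["non_empty"] += 1
            info["values"].add(value)
            try:
                float(value)
                info["numeric"] += 1
            except ValueError:
                pass

    summary = {}
    for name, info in profile.items():
        if info["non_empty"] == 0:
            kind = "empty"
        elif info["numeric"] == info["non_empty"]:
            kind = "numeric"  # note: codes such as ZIP Codes also land here
        else:
            kind = "text"
        summary[name] = {
            "type": kind,
            "non_empty": info["non_empty"],
            "distinct": len(info["values"]),
            "examples": sorted(info["values"])[:3],
        }
    return summary


# Hypothetical sample standing in for one exported sheet; not real EUAS data.
SAMPLE = """State,Zip,kWh
VA,22030,1200.5
MD,20850,
VA,22031,980
"""

summary = profile_columns(csv.DictReader(io.StringIO(SAMPLE)))
for name, info in summary.items():
    print(name, info)
```

A profile like this does not say what a data element means, but it does show which of the 214 elements are populated, which are numeric, and which look like codes or place names worth mapping.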

I searched the Agency Program Page but did not find any more information. All I could do then was exploratory data analysis to see if I could find any relationships in the data (Weather Bank; see scatterplot elsewhere) and consider mapping it by state, city and ZIP Code.

I did a simple scatterplot of the largest of the 17 EUAS data sets, Energy Utilization 2, but I do not know what it means.
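With undocumented columns, one way to decide which pairs are worth a scatterplot is to rank them by correlation first. The sketch below is an assumption about how that screening could be done, not a description of the analysis above. The `pearson` and `rank_pairs` helpers are hypothetical, and the column names and numbers are made up for illustration.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists; None if undefined."""
    n = len(xs)
    if n < 2:
        return None
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # a constant column has no defined correlation
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def rank_pairs(columns):
    """Rank column pairs by absolute correlation, skipping missing values."""
    results = []
    for a, b in combinations(columns, 2):
        pairs = [
            (x, y)
            for x, y in zip(columns[a], columns[b])
            if x is not None and y is not None
        ]
        r = pearson([p[0] for p in pairs], [p[1] for p in pairs])
        if r is not None:
            results.append((a, b, r))
    return sorted(results, key=lambda item: abs(item[2]), reverse=True)


# Made-up values under hypothetical column names; not real EUAS data.
columns = {
    "degree_days": [1.0, 2.0, 3.0, 4.0],
    "gas_use": [2.0, 4.0, 6.0, 8.0],
    "floor_area": [5.0, 1.0, 4.0, 2.0],
}

ranked = rank_pairs(columns)
for a, b, r in ranked:
    print(f"{a} vs {b}: r = {r:.3f}")
```

A high correlation only flags a pair for a closer look. Without a data dictionary it still cannot say what the relationship means.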

If I, as a data scientist, have a hard time discerning what the data means, how would the average citizen fare? That’s why I have a hard time saying Data.gov is a successful outcome of the Open Government initiative.