It’s no surprise that recent reports of NASA’s shift from OpenStack — and open source cloud computing — in favor of a commercial platform stirred some chatter in and around the federal space.
After all, NASA officials foresaw open source cloud computing’s potential and invested in the new phenomenon back in 2008, when it was still just a shimmer on the horizon.
So when NASA CIO Linda Cureton casually mentioned in her June 8 blog post that the agency was shifting to Amazon Web Services and its cloud-based enterprise infrastructure, news stories erupted claiming the agency was abandoning OpenStack, and open source cloud computing, in favor of a commercial platform.
While Cureton’s blog statement was completely true, the stories were completely wrong. OpenStack is still very much a part of the big picture at NASA, particularly when it comes to big data.
Data hosting solutions company Rackspace and NASA “co-founded the OpenStack project to provide a fully open alternative for building public and private clouds,” said Jim Curry, General Manager of Rackspace Cloud Builders, in a written statement for Breaking Gov. “At the time, the cloud market was dominated by proprietary technology and vendors. We wanted to create a standard platform that would prevent customers from suffering the same license lock-in they have faced with similar technology shifts in the past.”
The agency launched the Nebula Cloud Computing Project in 2008, in which it began developing code for this new platform. In 2010, Nebula led to OpenStack, a new open source cloud computing initiative founded with Rackspace. NASA’s involvement greatly boosted both OpenStack’s visibility in the open source community and the notion of open source cloud computing.
“We thought cloud computing had a close affinity with NASA requirements,” Cureton said. “It was new on the government side, and there were barriers to adoption, such as security and how it works. With Nebula, we could get in the cloud and do work without concerns about these barriers.”
One of NASA’s primary goals in its involvement in OpenStack was — and still is — determining how the cloud environment matches up with the agency’s enormous data storage and processing needs. Goddard Space Flight Center in Greenbelt, Md., receives more than four terabytes (4,000 gigabytes) of data every day from the Hubble Space Telescope (pictured above), NASA’s fleet of Earth Observing System satellites, and the new Suomi National Polar-orbiting Partnership satellite. In 2012, the total annual data volume generated by the satellites and Hubble is approximately 300 terabytes. By 2018, when the James Webb Space Telescope is expected to launch and add its observations to the rest, the annual data volume streaming down to GSFC is projected to reach nearly 800 terabytes.
“Right now, all that data sits next to the computing resources,” said GSFC CIO Adrian Gardner. “This is expensive, both in terms of computing and storage.”
Today, big data on the scale GSFC handles universally sits beside computing resources because it takes too long to transmit over conventional networks. It’s not just a matter of sitting around drinking coffee and playing Angry Birds while you wait: as anyone who has tried to download a big video file over a slow connection can attest, connections time out. That can mean starting all over again, or giving up. That’s a big drawback for using the cloud, because everything moves to and from clouds over networks.
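To see why the network is the bottleneck, a rough back-of-the-envelope calculation helps. The 4-terabyte daily figure comes from GSFC; the link speeds below are illustrative assumptions, not NASA network specifications:

```python
def transfer_hours(terabytes: float, link_gbps: float) -> float:
    """Hours to move `terabytes` of data over a link running at `link_gbps`."""
    bits = terabytes * 1e12 * 8           # decimal terabytes -> bits
    seconds = bits / (link_gbps * 1e9)    # ideal case: no overhead, no retries
    return seconds / 3600

# One day's worth of GSFC downlink data over three hypothetical link speeds.
for gbps in (0.1, 1, 10):
    print(f"4 TB over a {gbps:>4} Gbps link: {transfer_hours(4, gbps):6.1f} hours")
```

Even on a dedicated gigabit link with zero protocol overhead, a single day’s data takes the better part of a workday to move, which is why the data has historically stayed next to the computers.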
Finding ways around this problem was part of NASA’s original OpenStack engagement goals, and it still is.
“We are looking at cloud as part of a computing framework, focused on the computational science aspect,” said Gardner. “We are looking for a way to provide access to these big data sources via the cloud to see how we can optimize performance.”
One possible way to use cloud computing, even when network transmission is an issue, is to reduce the data load.
“We’re trying to look at whether researchers need the whole data set or part of it, like what’s changed,” Gardner said. “Maybe we just move the changes into the cloud and then move it back.” Changes in the data would represent a small subset of the total, so they may not bog down a conventional network. “Without the network issue, it doesn’t matter where the data is,” Gardner said. In any event, Gardner expects that networks will cease to be an issue in three to five years as bigger, speedier pipes get built.
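The approach Gardner describes, shipping only what has changed rather than the whole data set, is essentially delta transfer. A minimal sketch of the idea, assuming fixed-size chunks fingerprinted by hash (the function names and chunk size are invented for illustration, not NASA code):

```python
import hashlib

CHUNK = 1024 * 1024  # 1 MiB; chunk size is an arbitrary choice for illustration

def chunk_digests(data: bytes) -> list[str]:
    """Split data into fixed-size chunks and fingerprint each one."""
    return [hashlib.sha256(data[i:i + CHUNK]).hexdigest()
            for i in range(0, len(data), CHUNK)]

def changed_chunks(old: bytes, new: bytes) -> list[int]:
    """Indices of chunks in `new` that differ from `old` and so must be shipped."""
    old_d = chunk_digests(old)
    return [i for i, d in enumerate(chunk_digests(new))
            if i >= len(old_d) or d != old_d[i]]

# Flip one byte in a 4 MiB data set: only one of four chunks needs to move.
old = bytes(4 * CHUNK)
new = bytearray(old)
new[CHUNK] = 1                           # the change falls in chunk index 1
print(changed_chunks(old, bytes(new)))   # -> [1]
```

When changes are sparse, the network carries one chunk instead of the whole set, which is exactly why a conventional pipe might suffice.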
OpenStack may also help NASA address its computational needs in another, perhaps more significant, way.
“I want to look at the attributes of different clouds and find the best ones for staging data,” Gardner said. “So we look at OpenStack and get a list of its attributes and those of other clouds.” Once that’s done, various commercial or open source cloud offerings and their capabilities could be added to an entire list of computing services, including high performance computing. Gardner calls the list a “storefront” where scientists and engineers can select desired computing services from a menu, much like making online purchases.
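Gardner’s “storefront” amounts to matching a job’s requirements against the recorded attributes of each offering. A toy sketch of that matching step, with a hypothetical catalog (the offering names and attributes are invented for illustration, not an actual NASA service list):

```python
# Hypothetical catalog mapping each offering to its attributes.
CATALOG = {
    "openstack-private": {"public_data_ok": True, "sensitive_ok": True,  "hpc": False},
    "commercial-cloud":  {"public_data_ok": True, "sensitive_ok": False, "hpc": False},
    "supercomputer":     {"public_data_ok": True, "sensitive_ok": True,  "hpc": True},
}

def storefront(**needs: bool) -> list[str]:
    """Return the offerings whose attributes satisfy every requested need."""
    return [name for name, attrs in CATALOG.items()
            if all(attrs.get(k, False) for k, wanted in needs.items() if wanted)]

print(storefront(sensitive_ok=True))              # the two in-house options
print(storefront(public_data_ok=True, hpc=True))  # only the supercomputer
```

The point is the menu model: the researcher states needs, and the storefront filters the catalog rather than the researcher evaluating each platform by hand.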
For example, if security is important, “I may not want the cloud at all,” he said. Where the data is public, however, the cloud might be the right menu choice. Gardner poses the problem of scientists and engineers who want to process a large quantity of data on a supercomputer: the job might take two days to run, but the machine is not available for 60 days. Processing the data in the cloud might take two weeks, but the results would still arrive far sooner. By using the cloud, “we’re creating options for the scientists,” he said.
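The tradeoff Gardner poses is simple arithmetic: time-to-result is queue wait plus run time, so the “slower” option can still finish first. Using the figures from his example (in days):

```python
def time_to_result(queue_wait_days: float, run_days: float) -> float:
    """Elapsed time to get results: waiting for the resource plus the run itself."""
    return queue_wait_days + run_days

# Gardner's example: the supercomputer runs the job in 2 days but has a
# 60-day backlog; the cloud takes 14 days but is available immediately.
supercomputer = time_to_result(queue_wait_days=60, run_days=2)
cloud = time_to_result(queue_wait_days=0, run_days=14)
print(f"Supercomputer: {supercomputer} days; cloud: {cloud} days")  # 62 vs 14
```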
The cloud engagement has a more mundane side to it as well. “We’re using cloud as part of our data center consolidation strategy,” Gardner said. “If we can move a lot of computational jobs to the cloud, it will help the data center consolidation.”
Contrary to the reports that NASA was leaving open source cloud computing and OpenStack, the agency intends to remain involved in both. “We are staying with OpenStack,” Gardner said. “It will provide lots of insight into the cloud computing platform and how it will mature. Staying involved in it will help us see how it will meet our needs.” For data storage, GSFC has also investigated Eucalyptus, another open source cloud platform.
However, the agency’s role will change because OpenStack itself is changing. “OpenStack is being handed over to a Foundation,” Curry said. “Now with 183 member companies, 200,000+ downloads of the OpenStack software, and 100+ deployments, the technology has lots of momentum behind it. We are seeing a very strong adoption of OpenStack in the market, and we want to contribute to its rapid growth by evolving our commercial offerings around it.”
Commercialization is not part of NASA’s mission. “Our role in space technology is to develop it and transfer it to commercial organizations where they can develop it and commercialize it,” Cureton said. “Nebula and OpenStack aren’t any different.”
And what about NASA’s supposed departure from open source cloud computing and OpenStack to Amazon’s commercial cloud platform?
A key clue was evident in Cureton’s original blog, in which she stated, “NASA shifted to a new web services model …”
Amazon Web Services will host the www.nasa.gov portion of NASA’s websites. The Amazon arrangement involves no big data and none of the platform learning that NASA’s continuing OpenStack engagement provides. “When our business needs are clear,” Cureton said, “commercial providers are best.”