The idea of replacing hard disk drives with flash memory has been gaining steam in the IT industry. But a research group at Stanford University is going even further: they say the goal should be to replace hard disks with DRAM.
The work is still in the prototype phase, but the Stanford group is trying to make the idea a reality with a project called RAMCloud, which can aggregate memory from thousands of commodity servers to dramatically speed up data access. Hard disks, and perhaps flash, would still be used for backup, a crucial consideration because when DRAM loses power it also loses data. But in daily operations, all the information applications access would come directly from DRAM.
Project leader and computer science professor John Ousterhout doesn't downplay the potential roadblocks RAMCloud faces. For one, its success depends upon the development of extremely low-latency networking, he tells Ars. "We're building RAMCloud on the assumption that networking is going to get dramatically better over the next three to five years," he says. "There is evidence that is starting to happen. We think that's likely, but there's a bet there."
But if the project succeeds, the benefits will be enormous. DRAM is five to ten times faster than flash memory and 100 to 1,000 times faster than hard disks, he says. Of course, DRAM is also 50 to 100 times more expensive than disk when measured per bit of storage. But with mechanical disk, businesses rarely use anything close to their full storage capacity, and are limited in how quickly data can be accessed. If you measure by the cost of each read or write operation, DRAM is actually cheaper, he argues.
"DRAM does not have to be cheaper than disk," Ousterhout says. "It only has to be cheap enough that people will use it because of the performance benefits."
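The difference between cost per bit and cost per operation can be made concrete with a little arithmetic. The sketch below is only an illustration: the `StorageOption` class and every price, capacity, and throughput figure in it are placeholder assumptions chosen to fall within the ranges quoted above, not numbers supplied by Ousterhout or the RAMCloud papers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageOption:
    """One unit of storage, described by hypothetical round numbers."""
    name: str
    capacity_gb: float   # usable capacity of one unit
    price_usd: float     # purchase price of one unit (assumed)
    ops_per_sec: float   # small random reads/writes one unit can serve (assumed)

    def cost_per_gb(self) -> float:
        # Cost measured by capacity: dollars per gigabyte stored.
        return self.price_usd / self.capacity_gb

    def cost_per_op_per_sec(self) -> float:
        # Cost measured by throughput: dollars per operation/second served.
        return self.price_usd / self.ops_per_sec


# Placeholder figures for illustration only, not measurements.
disk = StorageOption("disk", capacity_gb=1000, price_usd=100, ops_per_sec=100)
dram = StorageOption("dram", capacity_gb=24, price_usd=240, ops_per_sec=1_000_000)

for option in (disk, dram):
    print(f"{option.name}: ${option.cost_per_gb():.2f} per GB, "
          f"${option.cost_per_op_per_sec():.5f} per op/s")
```

Under these assumed figures, DRAM costs far more per gigabyte but far less per operation served, which is the shape of the argument Ousterhout is making. Real prices and throughput will differ.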
Some vendors have, in fact, already recognized the major performance benefits. VoltDB and other companies have built in-memory databases that rely on main memory for specialized, transaction-heavy applications. These systems have the advantage of actually being shipped today, although Ousterhout believes the use of DRAM can be greatly extended in the future as technology evolves and prices come down.
A new home for data
Papers on RAMCloud published in December 2009 and October 2011 describe it as scaling across thousands of servers and hundreds of terabytes of data. "All information is kept in DRAM at all times," the 2009 paper explains. "DRAM is the permanent home for data. Disk is used only for backup. Second, a RAMCloud must scale automatically to support thousands of storage servers; applications see a single storage system, independent of the actual number of storage servers." A multi-core storage server in a RAMCloud-powered network should be able to service at least 1 million small requests per second, the paper says.
Two years on, Ousterhout and his team of graduate students now have a prototype cluster of 80 servers with 24GB of DRAM on each, for a total of roughly two terabytes. The prototype system has some gaps, but it is capable of recovering from crashes and performs basic read operations in five microseconds. The team started coding a year and a half ago, but is still six to 12 months away from a "1.0" level system that would be ready for business use. Even then, "these would be avant-garde, risk-taking people for starters," Ousterhout says.
Ousterhout sees the first wave of adopters as being teams building cutting-edge Web applications that have hit a wall in their storage systems, and can't rely on traditional databases for real-time access to data. Facebook is a good example of a website that could benefit from RAMCloud, and in fact Ousterhout says he's had discussions with executives at the social network. It may not be apparent to a casual user, but Facebook is limited in the ways it can display content by the amount of data it can access in the time it takes to put together a webpage. "They are very constrained right now because they don't have a fast enough storage system," Ousterhout says.
That's not to say Facebook could replace its storage systems overnight with RAMCloud, even if the technology were ready. But Ousterhout says his group is also getting interest from flash device vendors and storage system companies. In the long term, Ousterhout believes RAMCloud could power cloud networks like Amazon's Elastic Compute Cloud or Windows Azure, and perhaps even make an impact in the enterprise data center.
Disk is not the future
Enterprise Strategy Group founder and senior analyst Steve Duplessie argues that future storage systems will not be mechanical in any way, but whether that happens in five years or 40 is unclear. "It's probably more an issue of economics than it is an issue of technology at this point," he says. "It's just a matter of when we can drive down the costs in order to make it realistic for most people."
The SSD industry has been around a few dozen years, yet "we're just finally starting to take off," Duplessie says. Replacing all disks with SSD would be prohibitively expensive, but coupling traditional disk with flash in a tiered system that takes advantage of compression and deduplication is both effective and feasible today, he says. "It's really just begun, but that is the way the world is going to go."
The RAMCloud idea, while innovative, is similar to one proposed by a company called RNA Networks, which was recently bought by Dell, Duplessie says. "What RNA was trying to do was say they could pool up all the DRAM in all servers and create one big, virtual DRAM megapool that any server could call to and from," Duplessie says. "It hasn't really been commercialized yet."
While DRAM is expensive, Duplessie notes that there is plenty of underutilized DRAM sitting in data centers today. "If you add up all the DRAM in all machines in a data center, there might be terabytes but at any given time you're not using all of it," he says. "You've already paid for it, you might as well use it."
Within a couple of years, the kinds of servers RAMCloud is designed for will have as much as 256GB of DRAM, Ousterhout says. He promises RAMCloud is not the "typical research project," in which a couple of papers are written, a crude research prototype is built and then thrown away.
"I like to build systems that are production-quality," he says. "We build these to be used by other people. We release them as open source. If the system really works we hope it will become widely used."