Facebook already built its own data center and its own servers. And now the social-networking giant is building its own storage hardware — hardware for housing all the digital stuff uploaded by its more than 845 million users.
“We store a few photos here and there,” says Frank Frankovsky, the ex-Dell man who oversees hardware design at Facebook. That would be an understatement. According to some estimates, the company stores over 140 billion digital photographs — and counting.
Like the web’s other leading players — including Google and Amazon — Facebook runs an online operation that’s well beyond the scope of the average business, and that translates to unprecedented hardware costs — and hardware complications. If you’re housing 140 billion digital photos, you need a new breed of hardware.
In building its own data center on the Oregon high desert, Facebook did away with electric chillers, uninterruptible power supplies, and other terribly inefficient gear. And in working with various hardware manufacturers to build its own servers, the company not only reduced power consumption, it stripped the systems down to the bare essentials, making them easier to repair and less expensive. Frankovsky and his team call this “vanity free” engineering, and now, they’ve extended the philosophy to storage hardware.
“We’re taking the same approach we took with servers: Eliminate anything that’s not directly adding value. The really valuable part of storage is the disk drive itself and the software that controls how the data gets distributed to and recovered from those drives. We want to eliminate any ancillary components around the drive — and make it more serviceable,” Frankovsky says during a chat at the new Facebook headquarters in Menlo Park, California, which also happens to be the former home of onetime hardware giant Sun Microsystems.
“Break fixes are an ongoing activity in the data center. Unfortunately, disk drives are still mechanical items, and they do fail. In fact, they’re one of the higher failure-rate items. So [we want to] be able to quickly identify which disk has failed and replace it, without going through a lot of mechanical hoops.”
As with its data center and server creations, Facebook intends to “open source” its storage designs, sharing them with anyone who wants them. The effort is part of the company’s Open Compute Project, which seeks to further reduce the cost and power consumption of data center hardware by facilitating collaboration across the industry. As more companies contribute to the project, the thinking goes, the designs will improve, and as more outfits actually use the designs for servers and other gear — which are manufactured by Facebook partners in Taiwan and China — prices will drop even more.
When Facebook first introduced the project last spring, many saw it as a mere PR stunt. But some big-name outfits — including some outside the web game — are already buying Open Compute servers. No less a name than Apple has taken interest in Facebook’s energy-conscious data-center design. And according to Frankovsky, fifty percent of the contributions to the project’s open source designs now come from outside Facebook.
For Peter Krey — who helped build a massive computing grid for one of Wall Street’s largest financial institutions and now advises the CIOs and CTOs of multiple Wall Street firms as they build “cloud” infrastructure inside their data centers — Facebook’s project is long overdue. While building that computing grid, Krey says, he and his colleagues would often ask certain “tier one” server sellers to strip proprietary hardware and unnecessary components from their machines in order to conserve power and cost. But the answer was always no. “And we weren’t buying just a few servers,” he says. “We were buying thousands of servers.”
Now, Facebook has provided a new option for these big-name Wall Street outfits. But Krey also says that even among traditional companies that could probably benefit from this new breed of hardware, the project isn’t always met with open arms. “These guys have done things the same way for a long time,” he tells Wired.
Hardware by Committee
Facebook will release its new storage designs in early May at the next Open Compute Summit, a mini-conference where project members congregate to discuss this experiment in open source hardware. Such names as Intel, Dell, Netflix, Rackspace, Japanese tech giant NTT Data, and motherboard maker Asus are members, and this past fall, at the last summit, Facebook announced the creation of a not-for-profit foundation around the project, vowing to cede control to the community at large.
The project began with Facebook’s data center and server designs. But it has since expanded to various other sub-projects, and the contributors include more than just web companies. Rackspace contributes, but so does financial giant Goldman Sachs.
Rackspace is leading an effort to build a “virtual I/O” protocol, which would allow companies to physically separate various parts of today’s servers. You could have your CPUs in one enclosure, for instance, your memory in another, and your network cards in a third. This would let you, say, upgrade your CPUs without touching other parts of the traditional system. “DRAM doesn’t [change] as fast as CPUs,” Frankovsky says. “Wouldn’t it be cool if you could actually disaggregate the CPUs from the DRAM complex?”
With a sister project, project members are also working to create a new rack design that can accommodate this sort of re-imagined server infrastructure. A traditional server rack houses several individual machines, each with its own chassis. But the Open Rack project seeks to do away with the server chassis entirely and turn the rack into the chassis.
Meanwhile, Goldman Sachs is running an effort to build a common means for managing servers spread across your data center. Part of the appeal of the Open Compute Project, says Peter Krey, is that the project takes a “holistic approach” to the design of data center hardware. Members aren’t designing the data center separately from the servers, and the servers separately from the storage gear. They’re designing everything to work in tandem. “The traditional data center design…is Balkanized,” Krey says. “[But] the OCP guys have designed and created all the components to efficiently integrate and work together.”
This began with Facebook designing servers specifically for use with the revamped electrical system built for its data center in Prineville, Oregon. And soon, the effort will extend to the storage gear as well. Frankovsky provides few details about the new storage designs. But he says his team has rethought the “hot-plug drive carriers” that let you install and remove hard drives without powering a system down.
“I’ve never understood why hot-plug drive carriers have to come with these plastic handles on them,” he explains. “And if you’ve actually mounted a drive inside one of those drive carriers, there are these little bitty screws that you inevitably lose — and you’ll likely lose one onto a board that’s live and powered. That’s not a good thing.”
He says that the new design will eliminate not only the screws but the carriers themselves. “It’s a completely tool-less design,” Frankovsky says. “Our techs will be able to grab hold of a ‘slam latch,’ pull it up, and the act of pulling it up will pop the drive out.”
Frankovsky calls it “small stuff.” And that’s what it is. But if you’re running an operation the size of Facebook, that small stuff becomes very big indeed. In making one small change after another, Facebook is overhauling its infrastructure. And in sharing its designs with the rest of the world, it hopes to overhaul much more.