IBM has launched an ambitious initiative, called Project Kittyhawk, aimed at building “a global-scale shared computer capable of hosting the entire Internet as an application.” Forget Thomas Watson’s apocryphal remark that the world may need only five computers. Maybe it needs just one.
The Register’s Ashlee Vance points to a fascinating white paper about the IBM program. The effort focuses on expanding the company’s Blue Gene supercomputer to handle web-scale applications of every imaginable stripe – to create a “generic” computing platform, incorporating millions of processors, that can essentially run anything you throw at it. The authors argue that the reigning, Google-style model of web-scale computing – big clusters of cheap servers – was born of necessity, rather than choice, and has fundamental flaws:
“At present, almost all of the companies operating at web-scale are using clusters of commodity computers, an approach that we postulate is akin to building a power plant from a collection of portable generators. That is, commodity computers were never designed to be efficient at scale, so while each server seems like a low-price part in isolation, the cluster in aggregate is expensive to purchase, power and cool in addition to being failure-prone. Despite the inexpensive network interface cards in commodity computers, the cost to network them does not scale linearly with the number of computers. The switching infrastructure required to support large clusters of computers is not a commodity component, and the cost of high-end switches does not scale linearly with the number of ports. Because of the power and cooling properties of commodity computers many datacenter operators must leave significant floor space unused to fit within the datacenter power budget, which then requires the significant investment of building additional datacenters.
“Many web-scale companies start in a graduate lab or a garage, which limits their options to the incremental purchase of commodity computers even though they can recognize the drawbacks listed above and the value of investing in the construction of an integrated, efficient platform designed for the scale that they hope to reach. Once these companies reach a certain scale they find themselves in a double bind. They can recognize that their commodity clusters are inefficient, but they have a significant investment in their existing infrastructure and do not have the in-house expertise for the large research and development investment required to design a more efficient platform.”
IBM believes that it has a better way, a means of building a computing platform that “is an order of magnitude more efficient to purchase and operate than the commodity clusters in use today”:
“Companies such as IBM have invested years in gaining experience in the design and implementation of large-scale integrated computer systems built for organizations such as national laboratories and the aerospace industry. As the demands of these customers for scale increased, IBM was forced to find a design point in our Blue Gene supercomputer technology that allowed dense packaging of commodity processors with highly specialized interconnects and cooling components … We postulate that efficient, balanced machines with high-performance internal networks such as Blue Gene are not only significantly better choices for web-scale companies but can form the building blocks of one global-scale shared computer. Such a computer would be capable of hosting not only individual web-scale workloads but the entire Internet.”
The researchers report that early tests of the platform running Web 2.0 programs, other web apps, and related software “are promising and show that it is indeed feasible to construct flexible services on top of a system such as Blue Gene.”
IBM sees the future in IBM terms, while Google sees the future in Google terms. But the IBM paper makes for fascinating reading for anyone interested in the future of computing. It underscores the fact that the design of the shared computer that we’ll all be using in the future remains up for grabs. In the long run, this is a battle that will prove of far greater import than Microsoft and Google’s tussle over Yahoo.