The old, Google model: lots of little servers in a network designed for redundancy in case of failure.
The new old IBM model: a few centralized servers that are more efficient and reliable, engineered not only to avoid wasting cycles but also to last for decades. IBM is designing them around an old mainframe strength, internal bandwidth, and using it as the basis for a new kind of mainframe computing built heavily on flexible parallelism, or “cell” computing.
“We have been running multitenancy [running multiple customers on a single machine with a single application instance] for decades and decades,” he told me.
“It’s a mainframe model where things run together but in isolation. The issue is whether the machines will bear up under the load of diverse work or will they grind down and you’ll need to provision another machine. You need reliability, security, auditing, privacy, data integrity, automation and full isolation. You need to have a lot of layers in the environment.”
In 2000, IBM resurrected the mainframe by bringing Linux and WebSphere to the platform and lowering the price of entry, according to William Zeitler, senior vice president of IBM’s Systems & Technology Group. “You can build out a thousand smaller servers that need to be connected to ports and a fabric. You end up with a complexity crisis that has to be rationalized,” Zeitler said. ^
The more small servers you have talking to each other, the more complex the communication becomes. As with CPUs, the real question is how fast they can move information, not how fast they can calculate. Calculation speed is the model of old personal computing, from the early 1980s. Now we’re talking about moving massive amounts of data around and avoiding the latency and internal correction that slow the process down.
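To put a rough number on that complexity, here is a back-of-the-envelope sketch. It assumes the worst case, where any server may need to talk to any other (a full mesh); real fabrics route through switches, so treat this as a count of possible pairwise relationships rather than cables. The function name is made up for illustration.

```python
def full_mesh_links(n: int) -> int:
    """Count the distinct server-to-server pairs among n servers.

    Assumption: every server may talk to every other one (full mesh).
    """
    return n * (n - 1) // 2


# The count grows roughly with the square of the server count.
for n in (4, 10, 100, 1000):
    print(f"{n:>5} servers -> {full_mesh_links(n):>7} possible pairs")
```

Under that assumption, the thousand small servers in Zeitler's example have 499,500 possible pairs to reason about, while a handful of big boxes have only a few.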
A UC Berkeley paper [PDF] recently submitted to the IEEE International Parallel and Distributed Processing Symposium manages to highlight two common and seemingly unrelated themes that have come up a number of times over the past few years in my reporting on the high-performance computing (HPC) space: 1) IBM’s Cell is really good at HPC workloads when you invest the time to write custom code for it, and 2) Intel’s Xeon platform is perennially bandwidth starved and not very power-efficient. ^
IBM’s solution: Use processors, like the Cell, that emphasize moving information between each other and working collaboratively. Build a few giant boxes and over-engineer them so they’re reliable and efficient. This is in contrast to the PC/server market, where a new design comes out every six months and is under-engineered to avoid introducing potential conflicts and to get it out the door on time. IBM has also found that these products are nice and green, since it’s easier to engineer efficiency into a few specialty designs than to impose it on general-purpose ones.
International Business Machines Corp (IBM.N) on Wednesday launched tools to reduce computer energy consumption as IBM hopes to boost its business of selling power-saving technologies.
{ deletia }
“Energy efficiency has become a critical business metric, like product reliability and customer satisfaction,” William Zeitler, head of IBM’s systems and technology group, said in an interview with Reuters. ^
After having seen too many servers fail over the past few years, and having heard service technicians refer to two-year-old machines as “antiques,” I think this is a positive possibility. We’ve reached a possibly temporary plateau in processor power; programmers are still finding ways to take advantage of multiple cores, and it’s likely we’ll need to redesign how we write code and operating systems. Then again, the mainframe guys have been doing it all along.