On Monday, Oracle officially launched the Sparc T4 microprocessor and a line of servers based on the new SPARC CPU. Oracle Systems Executive Vice President John Fowler claimed at the rollout event that early customers using T4 servers have seen "up to five times [the] performance improvements across a range of Oracle and third-party applications, and are already placing orders to replace outdated systems from our competitors."
For those who are still members of the Sparc/Solaris installed base (those who haven't already headed for x86 or Itanium), the T4 is potentially good news. It provides a way to preserve investments in existing Solaris skills and software while getting a significant performance boost over the year-old T3. The T4 will likely stop some defections, buy Oracle time as it prepares its next generation of processors, and reduce the company's dependence on reselling Fujitsu SPARC64 systems to run its own database.
But at the same time, the T4 isn't going to win back customers from Intel, or convert IBM Power users. Despite the dump truck full of benchmark pronouncements that Oracle delivered along with the official T4 launch (most of which were aimed at comparing Oracle's new SPARC T4-4 servers with IBM's Power line and HP's Itanium-based systems), the T4 is more important as evidence that Oracle really does intend to invest in continuing Sun's hardware and operating system business.
Back to the future
The T4 is, more than anything, a course correction from Sun's previous efforts in that it finally brings out-of-order execution (OOE) to the SPARC platform. OOE, which was engineered into the Power, x86 and Fujitsu SPARC64 processors back in the 1990s, allows instructions in a thread to run rather than wait for those in front of them in the queue to complete.
The previous Sun SPARC processors ran threads in order: an instruction loads, and if its inputs are ready, it is sent to the execution unit of the CPU for processing; if the operands aren't ready, the processor stalls. The T3's architecture tried to dance around the in-order processing issue by boosting multithreaded performance, which made it run well for tasks like running a Web server. But that didn't make it a spectacular platform for Oracle's databases, particularly on parallel processing tasks. In an interview with Ars Technica, Real World Technologies' David Kanter said that Sun's SPARC processors "were really good for, even great for, many apps. But Sun's product line was hamstrung by the fact that their single-thread performance was atrocious."
That's where the T4's architecture has paid off. Because of the addition of OOE, the T4 can run significantly faster than the T3 despite the fact that the T4 has half the processor cores (and half the threads) per CPU that the T3 has—eight cores with eight threads each, instead of sixteen cores. Even so, the T4 lags behind the x86 platforms. Kanter said that while it's a good first step on OOE for Sparc, the T4 is "a two-issue out-of-order processor. If you look at Intel and AMD, they're doing 4 issue out-of-order."
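The difference between in-order and out-of-order issue can be illustrated with a toy model. The sketch below is not a model of the T4's actual pipeline: it assumes a single instruction issued per cycle, each register written only once, and made-up latencies. The names `Instr` and `cycles_to_finish` are hypothetical, chosen only for this illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    dest: str        # register this instruction writes (assumed written once)
    srcs: tuple      # registers or inputs it reads
    latency: int     # cycles until the result is available (made-up values)


def cycles_to_finish(program, out_of_order):
    """Return the cycle at which the last result becomes available."""
    # Results of instructions that have not issued yet are "never ready".
    # Names that no instruction writes are treated as inputs ready at cycle 0.
    ready_at = {ins.dest: math.inf for ins in program}
    pending = list(program)
    cycle = 0
    last_done = 0
    while pending:
        # In-order: only the oldest instruction may issue.
        # Out-of-order: any pending instruction may issue, oldest first.
        window = pending if out_of_order else pending[:1]
        for ins in window:
            if all(ready_at.get(src, 0) <= cycle for src in ins.srcs):
                ready_at[ins.dest] = cycle + ins.latency
                last_done = max(last_done, cycle + ins.latency)
                pending.remove(ins)
                break  # toy model: at most one issue per cycle
        cycle += 1
    return last_done


program = [
    Instr("r1", ("a",), 10),      # slow load, e.g. a cache miss
    Instr("r2", ("r1",), 1),      # depends on the slow load
    Instr("r3", ("b", "c"), 1),   # independent of the load
    Instr("r4", ("r3",), 1),      # depends only on r3
]

in_order = cycles_to_finish(program, out_of_order=False)
ooo = cycles_to_finish(program, out_of_order=True)
print(in_order, ooo)  # prints "13 11"
```

In this toy run, the in-order version sits idle behind the slow load, while the out-of-order version issues the two independent instructions during the wait. Real processors add register renaming, a limited-size instruction window, and multiple issue ports, none of which is modeled here.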
During the presentation, Oracle CEO Larry Ellison claimed the T4 had up to five times faster single-thread performance than the T3. The performance of the T4-4 servers is so good, Ellison said, that a Sparc SuperCluster of four T4-4 servers matches the performance of Oracle's Exadata integrated database system and Exalogic integrated application servers, systems pretuned for their tasks. That's partially because the SuperCluster system includes the same storage hardware and InfiniBand networking used by the Exadata and Exalogic machines.
Oracle claimed a legion of record-breaking benchmark performances for the T4-4. Ellison repeatedly compared the performance of the T4-based Sparc SuperCluster to IBM's Power line—and the Power 795 in particular. A one-rack T4 SuperCluster "is twice as fast as IBM's fastest computer, at half the cost," he claimed.
But the benchmarks that Oracle cited were mostly internal ones. Those may carry some weight for many Oracle customers, but there were only two that really hint at the T4-4's performance beyond software that has been tuned for that processor. One of those third-party benchmarks was the TPC-H benchmark for a 1,000 GB load, in which the T4-4 beat the IBM Power 780 and Itanium-based HP Superdome 2 on price/performance, raw performance, and throughput.
But the T4's score puts it tenth in the top 10 for performance on the TPC-H benchmark (a benchmark whose value Kanter questions). The T4 is still outperformed on Oracle Database 11g by HP's BladeSystem RAC configuration running Oracle Linux, and edged out by HP's ProLiant DL980 G7 running Microsoft SQL Server 2008 and Windows Server 2008 on both price/performance and raw power. Both are x86 systems.
Additionally, some of the claims Oracle made about its performance on third-party benchmarks were based on selective interpretations of the data, which drew catcalls from IBM Systems and Technology chief technical strategist Elisabeth Stahl. "Oracle claimed nine T4 world records. 7 of the 9 are not industry standard benchmarks but Oracle’s own benchmarks, most based on internal testing," Stahl blogged. "Oracle’s SPECjEnterprise2010 Java T4 benchmark result, which was highlighted, needed four times the number of app nodes, twice the number of cores, almost four times the amount of memory and significantly more storage than the IBM POWER7 result."
Staying on the Sparc path
Kanter said he thinks comparing the T4 with IBM's Power 7 processors is like comparing apples to cantaloupes. "The objective of the T4 is not to beat the Power 7. The point is to give SPARC customers an attractive path to keep them from defecting to Intel and Linux. If you view it through that lens, it's a really good step."
It's also a processor that was within Oracle's ability to execute now, with more ambitious plans on the roadmap ahead. While the T4 is based on 40nm lithography, Oracle has also announced that a 28nm version, the T5, is a year ahead of schedule. Kanter said word that the next chip is so close means that "if anyone had any doubts about Oracle continuing to invest in hardware, those should be gone."