Intel today announced the first Optane-branded product using its new 3D XPoint memory: the catchily named Intel Optane SSD DC P4800X. It's a 375GB SSD on a PCIe card. Initial limited availability starts today, for $1,520, with broad availability in the second half of the year. A 750GB PCIe model and a 375GB model in the U.2 form factor will follow in the second quarter, and a 1.5TB PCIe card and 750GB and 1.5TB U.2 sticks are planned for the second half of the year.
3D XPoint is a new kind of persistent solid state memory devised by Intel and Micron. Details on how the memory actually works remain scarce—it's generally believed to use some kind of change in resistance to record data—but its performance characteristics and technical capabilities make it appealing for a wide range of applications.
When 3D XPoint was first announced in 2015, Intel claimed it would be 1,000 times faster than NAND flash, 10 times denser than DRAM, and have 1,000 times the endurance of NAND, though without saying "faster at what" or "what kind of NAND" or anything like that. With the shipping product, these comparisons are now clearer, as one of Intel's slides shows: 3D XPoint has about one thousandth the latency of NAND flash (or about ten times the latency of DRAM), and ten times the density of DRAM.
The raw specs for the P4800X leaked in February. To summarize: it's a datacenter-oriented part, built for applications that have high read/write loads and need low latency. The sequential transfer rates of 2400MB/s read and 2000MB/s write are good, but some of the fastest NAND flash can pull slightly ahead. Where the P4800X excels is its ability to sustain high I/O loads, courtesy of those low latencies.
SSD manufacturers often quote huge numbers of I/O operations per second (IOPS), but there's always a footnote: the figures are typically generated with queue depths of 32, which is to say, the drive is bombarded with read or write requests (depending on what is being measured) so that there are always 32 outstanding operations. With these deep queues, NAND flash SSDs can achieve 300,000 to 400,000 IOPS.
The P4800X can do 550,000 read IOPS and 500,000 write IOPS, but critically, Intel says it achieves this even at low queue depths. The spec sheet figure is quoted at a queue depth of 16, and the company says that a queue depth of about 8 tends to be the limit seen in the real world.
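To see why queue depth matters so much, it helps to turn the quoted figures into time per operation. The sketch below uses Little's law (operations in flight = throughput × time per operation), which is a steady-state approximation rather than anything Intel has published. The function names are ours, and the constant-latency extrapolation at the end is an assumption that real drives only approximate.

```python
def implied_mean_latency_us(iops: float, queue_depth: int) -> float:
    """Mean time per operation implied by Little's law (QD = IOPS * latency)."""
    return queue_depth / iops * 1_000_000


def iops_at_queue_depth(queue_depth: int, mean_latency_us: float) -> float:
    """Throughput if queue_depth operations are always in flight."""
    return queue_depth * 1_000_000 / mean_latency_us


# Figures quoted in the article: Optane reads at QD16, fast NAND at QD32.
optane_us = implied_mean_latency_us(550_000, queue_depth=16)
nand_us = implied_mean_latency_us(400_000, queue_depth=32)

print(f"Optane at QD16: about {optane_us:.0f} us per read")
print(f"NAND at QD32:   about {nand_us:.0f} us per operation")

# Assumption for illustration only: per-operation latency stays constant
# as the queue gets shallower.
for qd in (1, 8, 16):
    nand_iops = iops_at_queue_depth(qd, nand_us)
    optane_iops = iops_at_queue_depth(qd, optane_us)
    print(f"QD{qd}: NAND ~{nand_iops:,.0f} IOPS, Optane ~{optane_iops:,.0f} IOPS")
```

Under that assumption, a NAND drive that needs 32 outstanding requests to reach 400,000 IOPS would manage only around 100,000 at a queue depth of 8, which is why headline IOPS numbers taken at deep queues can flatter a drive.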
Moreover, Intel says that the latency of each I/O operation remains low even under heavy load. 99.999 percent of operations have a read or write latency below 60 or 100 microseconds (respectively) with a queue depth of 1, rising to 150 or 200 microseconds with a queue depth of 16. Under a comparable load, Intel's own P3700 NAND SSD can only serve 99 percent of operations with a latency below about 2,800 microseconds.
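A 99.999 percent figure is a much stricter claim than a 99 percent one: it allows only one operation in every 100,000 to exceed the stated latency. The following is a minimal sketch of how such a tail percentile can be computed with the nearest-rank method. The sample data is invented for illustration and is not taken from Intel's measurements, and the helper name is our own.

```python
import math
from fractions import Fraction


def percentile_nearest_rank(samples, percent: str):
    """Smallest sample such that at least `percent` percent of samples are <= it.

    `percent` is passed as a string such as "99.999" so that Fraction can
    represent it exactly and the rank is not disturbed by float rounding.
    """
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(Fraction(percent) / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


# Synthetic latencies in microseconds, invented for illustration.
one_slow = [10] * 99_999 + [500]
two_slow = [10] * 99_998 + [500, 500]

print(percentile_nearest_rank(one_slow, "99.999"))  # one outlier is tolerated
print(percentile_nearest_rank(two_slow, "99.999"))  # two outliers are not
```

With 100,000 samples, a single 500-microsecond outlier leaves the 99.999th percentile at 10 microseconds, while a second outlier pushes it to 500. A 99 percent figure, by contrast, would ignore up to 1,000 slow operations in the same run.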
Likewise, under sustained write workloads, the P4800X retains its low latency for reads, whereas the read latency of the P3700 NAND steadily deteriorates as the write bandwidth increases.
This already makes the Optane drive interesting for applications like caching, but Intel is aiming at more than just that. 3D XPoint is byte addressable; that is to say, each individual byte can be overwritten. This sets it apart from NAND flash. NAND is typically arranged in pages of 512, 2048, or 4096 bytes. Pages are arranged into blocks, typically of 16, 128, 256, or 512 kilobytes. Reading and writing takes place at page granularity, but each page can only be written once. To write it again, it must first be erased, and erasure takes place not at page granularity, but at block granularity. Spinning hard disks can perform reads and writes at the granularity of a sector, typically either 512 bytes (or 512 bytes plus some extra bookkeeping space), or 4096 bytes (or 4096 plus some extra). With 3D XPoint, the reads and writes can occur on individual bytes.
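The page and block rules are easier to appreciate with a toy model. The sketch below is deliberately naive: it uses one 256-kilobyte block of 4096-byte pages and changes a single byte by erasing and reprogramming the whole block. Real SSD controllers avoid this by writing the updated page elsewhere and remapping it, so treat the class and its method names as hypothetical illustrations rather than a description of any actual drive.

```python
class NandBlock:
    """Toy NAND block: pages are programmed once; erase clears the whole block."""

    def __init__(self, pages_per_block: int = 64, page_size: int = 4096):
        self.page_size = page_size
        self.pages = [None] * pages_per_block  # None means erased
        self.erase_count = 0
        self.bytes_programmed = 0

    def program(self, index: int, data: bytes) -> None:
        if self.pages[index] is not None:
            raise ValueError("page already programmed; erase the block first")
        if len(data) != self.page_size:
            raise ValueError("a program operation writes one full page")
        self.pages[index] = bytes(data)
        self.bytes_programmed += len(data)

    def erase(self) -> None:
        self.pages = [None] * len(self.pages)
        self.erase_count += 1

    def overwrite_byte_in_place(self, index: int, offset: int, value: int) -> None:
        """Naive read-modify-write: copy out, erase, reprogram every page."""
        if self.pages[index] is None:
            raise ValueError("page is erased; nothing to overwrite")
        snapshot = list(self.pages)
        page = bytearray(snapshot[index])
        page[offset] = value
        snapshot[index] = bytes(page)
        self.erase()
        for i, data in enumerate(snapshot):
            if data is not None:
                self.program(i, data)


block = NandBlock()
for i in range(64):
    block.program(i, bytes(4096))  # fill the block with zeroes

before = block.bytes_programmed
block.overwrite_byte_in_place(index=3, offset=100, value=0xFF)
print("bytes programmed to change one byte:", block.bytes_programmed - before)
print("block erases:", block.erase_count)
```

In this model, changing one byte costs one block erase and 262,144 bytes of programming. A byte-addressable medium would, in principle, write just the one byte.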
Unlike flash, which physically wears out under the stress of repeated erases, 3D XPoint can be written non-destructively. This gives the drives much greater endurance than NAND of a comparable density, with Intel saying that Optane SSDs can safely sustain 30 whole-drive writes per day, compared to a typical 0.5 to 10 whole-drive writes per day for NAND.
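Drive writes per day translate directly into a daily write budget: multiply the rating by the capacity. The sketch below does that arithmetic for the 375GB model and for the two ends of the typical range quoted above. The lifetime calculation takes a `years` parameter that is purely hypothetical here, since the rated service period is not stated in this article.

```python
def daily_write_budget_gb(capacity_gb: float, dwpd: float) -> float:
    """Gigabytes that can be written per day at a given drive-writes-per-day rating."""
    return capacity_gb * dwpd


def lifetime_writes_pb(capacity_gb: float, dwpd: float, years: float) -> float:
    """Total petabytes written over `years` (a hypothetical period), using decimal units."""
    return capacity_gb * dwpd * 365 * years / 1_000_000


CAPACITY_GB = 375  # the launch P4800X model

for label, dwpd in (("Optane, 30 DWPD", 30), ("typical, 0.5 DWPD", 0.5), ("typical, 10 DWPD", 10)):
    budget = daily_write_budget_gb(CAPACITY_GB, dwpd)
    print(f"{label}: {budget:,.1f} GB per day")

# Hypothetical three-year period, chosen only to show the calculation.
print(f"{lifetime_writes_pb(CAPACITY_GB, 30, years=3):.2f} PB over three years")
```

At 30 drive writes per day, the 375GB drive can absorb 11,250GB (about 11TB) of writes daily, against 187.5GB for a drive of the same size rated at 0.5.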
The low latency and high endurance make Optane a good fit for applications like caching and database servers. But Intel has also developed a new way of using Optane that takes further advantage of these two properties. The P4800X can be used as a regular PCIe-attached SSD, but with something Intel calls "Memory Drive Technology," the P4800X can also be used as if it were RAM when paired with an appropriate chipset and processor (which means you'll have to use a Xeon processor). Optane's latency and bandwidth are both worse than those of DRAM, but the density is higher, and the price substantially lower.
Memory Drive Technology uses a middleware layer that boots before, and is transparent to, the operating system, and it combines regular DRAM with the SSD to make a single large pool of volatile memory. For most workloads, this will be slightly slower than if the same amount of DRAM were being used, but the cost should be substantially lower, and the power consumption modestly improved. Intel even claims that some workloads will go faster; although Optane "memory" is slower than regular RAM, the middleware layer that manages the memory can move data around so that it is closer to the processor that is using it, which can assist when using NUMA configurations.
The biggest benefit may come from substantially increasing the amount of physical memory in a server: two-socket Xeon systems can hold up to 3TB of RAM, but 24TB of Optane, and four-socket systems support up to 12TB of RAM, but 48TB of Optane. This could be a huge boost for applications that need truly enormous quantities of memory.
If using the PCIe bus to attach storage and then using it as memory seems a bit awkward, next year Intel plans to release Optane DIMMs.