Flash
Kioxia eyes GPU memory stack with high-bandwidth flash push
Kioxia has recently mentioned developing a high-IOPS SSD, high-bandwidth flash (HBF) and an optical SSD but with no details. We spoke with Toshiba Europe’s Axel Stoermann, VP and CTO for SSD and memory products, to find out more.
The high-IOPS SSD was actually announced a few days ago, after we first contacted Kioxia about these topics, as the GP Series. It is intended to be a faster flash drive for GPU servers, getting data to them faster than existing TLC (3 bits/cell) SSDs. It is built from Kioxia’s Gen 2 XL-Flash and is packaged in an E1.S enclosure with E3.S as a future option, we understand.
Blocks & Files: How does the GP Series connect to high bandwidth memory?
Axel Stoermann: This is just based on regular PCIe 5. It's based on XL-Flash. It can have very, very high interleave. So that means the performance can be very high on read and also write... It is not directly connected to the GPU bus... but has to go through the CPU, which is connected to the GPU.
The GP Series is an XL-Flash-based storage with super-high IOPS, and it is targeting, let me see, 10 million IOPS in 2026, and also considering 100 million random read IOPS in 2027.
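For scale, those IOPS targets can be turned into an implied read bandwidth with a back-of-envelope calculation. The 4 KiB random read size used below is our assumption for illustration only; Kioxia has not stated the transfer size behind its figures.

```python
# Back-of-envelope: read bandwidth implied by the quoted IOPS targets.
# The 4 KiB I/O size is an assumption for illustration; Kioxia has not
# specified the transfer size behind these numbers.
IO_SIZE_BYTES = 4 * 1024  # assumed 4 KiB random read

def implied_bandwidth_gbps(iops: int, io_size: int = IO_SIZE_BYTES) -> float:
    """Bandwidth in GB/s implied by a given IOPS figure at a fixed I/O size."""
    return iops * io_size / 1e9

print(implied_bandwidth_gbps(10_000_000))   # 2026 target: ~41 GB/s
print(implied_bandwidth_gbps(100_000_000))  # 2027 target: ~410 GB/s
```

For comparison, a PCIe 5.0 x4 link delivers roughly 16 GB/s, so a 100 million IOPS figure at 4 KiB would imply smaller transfer sizes, wider or faster links, or an aggregate multi-drive number.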
Blocks & Files: Do you know how many layers it has?
Axel Stoermann: It depends on the generation. The generation right now is generation eight, and... has 218 layers. And this is valid for all Kioxia’s TLC, for the QLC, and also for the XL.
Blocks & Files: Do you have any idea of its capacity?
Axel Stoermann: (By mail) 800 GB of SLC + 1,600 GB of MLC (2 bits/cell), making 2.4 TB in total.
Blocks & Files: Could you talk about Kioxia and High-Bandwidth Flash (HBF) please?
Axel Stoermann: The idea is here to have an even higher performance range and also try to focus on applications which are no longer only training, but also inferencing applications, especially focusing on edge server areas, where you do the execution of the AI... This means the form factor is small, but the performance level has to be very high, and the power consumption has to be very low. So it's considered to put the high performance flash even closer to the GPU bus... The GPU is next to the HBM (High-Bandwidth Memory) and then next to HBF, all connected to one and the same interposer.
It’s key to manage the HBM and the HBF individually, as flash can run at high performance on read access, but there are certain concerns around write. As you know, write takes a far longer time. There's a lot of refresh necessary and also housekeeping and stuff like that. So therefore, if we focus on flash and high performance, then it should be for read: random read or sequential read or whatever. If we take the XL-Flash, then random read or sequential read is fine. So that is under investigation.
We have the possibility to stack up to 32 dies, special dies with Through Silicon Vias (TSVs), with dies of flash on top of one another. They are connected by Through Silicon Via connections and then connected to the GPU bus. How it is done is confidential under the role of NVIDIA, and they are leading all these discussions with the relevant suppliers and partners.
Blocks & Files: Could Kioxia HBF use XL-Flash, TLC, or QLC?
Axel Stoermann: We cannot disclose anything on this. Most probably it will be either TLC, for cost saving, or... based on XL, as basically the read performance level is much higher, but that is not decided yet.
I cannot tell you anything around Sandisk, but I know that they have... a clear target to come up with such products soon... And as we are in a joint venture with Sandisk, basically they are using the same silicon.
Blocks & Files: And Nvidia is deeply involved in this?
Axel Stoermann: Yes... You see the GPU, the HBM, the HBF, everything is around non-idling, putting a lot of efficiency, maximum efficiency on this GPU node... I think the key, besides the individual components or constructions of the memory and the storage and the GPU, will be the interconnect or the interposer.
That means the file, also the transformation to a suitable protocol, how to multiplex or how to connect to the GPU bus. And all this is not disclosed. I'm pretty sure that will be something which will be discussed between the individual parties and the collaboration partners. And we are one of those, of course.
[One of] our concerns on the HBF definitely is the cost. And also then efficiency of production, because you can imagine Through Silicon Vias and also 32 stacks is quite sophisticated. On the other hand, you can argue who cares because HBM is already super expensive, so nobody will care about another HBF, which is in... a similar range in terms of construction. As flash is definitely much cheaper than DRAM, then it might be more affordable, but still very expensive compared to high IOPS products like the GP Series.
Blocks & Files: The HBF, if it's going to have up to 32 dies stacked one above the other, with TSVs going from each die through the lower dies down to the interposer, there'll have to be space for those TSVs. So will the resulting stacked die have a very much larger surface area than a single die?
Axel Stoermann: For sure. If we talk about a Through Silicon Via construction, then obviously, and definitely, we need Through Silicon Vias on the die. This is not the regular standard die. And also the XL-Flash is not a regular standard flash. This is a different die... it depends on which generation, how many layers; now we have 218 (with BiCS 8 and 9). In the future on BiCS 10, we will have over 300 layers, and definitely the conditions will change in terms of space. The lateral shrink will strongly depend on this because the drill holes might be smaller, the diameters will be smaller, and then in any case, the footprint will also be smaller.
Blocks & Files: Turning to the optical SSD, how is that going to work? Will the optical connection replace the PCIe connection?
Axel Stoermann: It's definitely not a product [and] it is more or less driven from high concerns about high performance over a long distance... So, for example, if you have a GPU rack with node-related HBMs and HBFs in future, then on the other hand, you can consider a bigger scale of racks in different housings, but you have to handle at least a couple of metres of distance [between them].
But you don't want to lose any performance. So you want to keep the PCIe 5, 6, 7 performance range. Copper-based or typical Ethernet-based or even fabric-based connections are limited in terms of this kind of performance range. And for everything that is above PCIe 6, it makes sense, step by step, to consider different communication [methods].
And here we talk about the almost infinite optical connection, which is right now running as a kind of POC (proof of concept). This is actually sponsored by big government projects in Japan.
For us, one specific thing, talking about the optical SSD, is how to prepare storage for the needs of, let's say, future performance levels like PCIe 7 and beyond. And here we have the optical link, and we have prototypes already, and also demos which are running. They already offer high bandwidth at a range of several metres, or let's say up to more than 100 metres.
There's several things to be considered... the integration of the controller, and also the transformation of digital signals, let's say from PCIe to optical, and this should then be integrated on an SSD in future. But so far we are in an experimental phase. And this will be considered, as I said, to be launched as a product, I would say, from a level of PCIe 7 and beyond.
Blocks & Files: Would Kioxia be using a partner to produce the bridging silicon between PCIe and the optical network?
Axel Stoermann: Yes. There are several partners. As I said, there is a Japanese national project behind it, and of course there are also optical specialists in the background, Japanese companies, and collaboration... The integration of the drive itself will be done by Kioxia.
Blocks & Files: I have the impression that Kioxia and Sandisk are perhaps collaborating and partnering better than Kioxia and Western Digital when Western Digital still owned Sandisk.
Axel Stoermann: Yes, I think this is maybe due to the fact that Western Digital was originally coming from HDD. And so when we saw this merging of Western Digital and Sandisk years ago, then we also identified a lot of, let's say, discussions and maybe also from time to time misunderstandings or misalignments on the capabilities and the physics of a spinning magnetic disc compared to a very sophisticated nanometre-technology flash cell. And I think now it's much more separated, where Western Digital can focus again on their core expertise and Sandisk the same... now it's much clearer.
And we [Toshiba then] already had a long, long history also before this merger of Western Digital and Sandisk. So basically we had tens of years of closeness after we invented the flash cell. So this history is very long, and we have just now expanded our joint venture beyond 2030. So that's good.