Comment by jauntywundrkind

8 hours ago

The potential here with High-Bandwidth Flash (HBF) is super cool. Going from eight or a dozen flash channels to a hundred or even hundreds of channels would be amazing:

> The KAIST professor discussed an HBF unit having a capacity of 512 GB and a 1.638 TBps bandwidth.

One weird thing about this is that it's still NAND flash, and NAND flash has a limited number of program/erase cycles, often measured in the thousands (drive endurance is usually quoted as Drive Writes Per Day, DWPD, over a 5-year warranty). If you can load a model and just keep querying it, that's not a problem, since reads don't wear the cells the way writes do. Maybe the context is small enough that it wouldn't be so bad, but my gut is that writing context here too might present difficulty.
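To put rough numbers on the endurance worry, here's a back-of-envelope sketch. Only the 512 GB capacity comes from the quote above. The 3,000 P/E cycles, the write amplification of 1.0, and the 10 GB/s sustained context-write rate are all my own assumptions for illustration, not HBF specs.

```python
# Back-of-envelope NAND endurance estimate.
# CAPACITY_GB is the quoted HBF figure; everything else is an assumption.
CAPACITY_GB = 512        # from the quoted HBF unit
PE_CYCLES = 3000         # ASSUMPTION: "thousands" of program/erase cycles
WARRANTY_YEARS = 5       # ASSUMPTION: typical SSD warranty window


def total_writable_tb(capacity_gb, pe_cycles, write_amplification=1.0):
    """Total data (in TB) that can be written before the cells wear out."""
    return capacity_gb * pe_cycles / write_amplification / 1000


def dwpd(pe_cycles, years, write_amplification=1.0):
    """Full-device writes per day sustainable over the warranty period."""
    return pe_cycles / write_amplification / (years * 365)


def lifetime_days(capacity_gb, pe_cycles, write_gb_per_s,
                  write_amplification=1.0):
    """Days until wear-out at a constant sustained write rate."""
    total_gb = capacity_gb * pe_cycles / write_amplification
    return total_gb / write_gb_per_s / 86400


if __name__ == "__main__":
    print(f"writable: {total_writable_tb(CAPACITY_GB, PE_CYCLES):.0f} TB")
    print(f"DWPD over {WARRANTY_YEARS}y: {dwpd(PE_CYCLES, WARRANTY_YEARS):.2f}")
    # ASSUMPTION: 10 GB/s of sustained context writes, a made-up rate
    print(f"lifetime at 10 GB/s: "
          f"{lifetime_days(CAPACITY_GB, PE_CYCLES, 10):.2f} days")
```

Under those assumptions you get about 1.5 PB of total writes, which sustained 10 GB/s writes would burn through in under two days. That's the gut feeling in number form: fine for load-once, read-many weights, much less fine for anything written continuously.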

I assume the use case is that you are an inference provider, and you put a bunch of models you might want to serve in the HBF to be able to quickly swap them in and out on demand.
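For a sense of how quick that swap could be, here's a minimal sketch using the quoted 1.638 TBps figure. It assumes the HBF read bandwidth is the only bottleneck and that units are decimal (1 TB = 1000 GB). The 140 GB model size is a hypothetical example (roughly 70B parameters at 2 bytes each), not something from the article.

```python
# Rough model swap-in time from HBF, assuming read bandwidth is the bottleneck.
HBF_BANDWIDTH_TB_PER_S = 1.638   # from the quoted HBF figure
HBF_CAPACITY_GB = 512            # from the quoted HBF figure


def load_seconds(model_gb, bandwidth_tb_per_s=HBF_BANDWIDTH_TB_PER_S):
    """Seconds to stream model_gb of weights at the given bandwidth."""
    return model_gb / (bandwidth_tb_per_s * 1000)


if __name__ == "__main__":
    # HYPOTHETICAL model size: ~70B parameters at 2 bytes per parameter
    print(f"140 GB model: {load_seconds(140) * 1000:.0f} ms")
    print(f"entire 512 GB unit: {load_seconds(HBF_CAPACITY_GB) * 1000:.0f} ms")
```

So even reading the whole unit end to end would take roughly a third of a second, if the rest of the pipeline could keep up.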