Comment by bhouston

1 day ago

> "Dual 25 Gigabit SFP28 ports and redundant power supplies for resilience"

Can you actually saturate the links with the spinning drives?

I've had the hardest time making my TrueNAS ZFS server fast when it was filled with spinning disks. I initially also had 12 of them, trying to get maximum speed. I have 128GB RAM and a 10G ethernet connection. I tried all kinds of optimizations, like L2ARC on NVMe, but they weren't very effective and I spent too much time tweaking and testing.

Instead I just threw up my hands and replaced all the spinning disks with NVMe drives for the data I actually share (8x 4TB NVMe drives). Now it is very usable, with no need for L2ARC, etc. Random and streaming access are equally fast.

Best choice I made. Now I did do this over a year ago so I skipped the NVMe price inflation.

I still keep 4 spinning disks but it is for archival data that I expect to never access unless something bad happens. It is slow and I use it like a tape drive.

It does have a dual NVMe cache; those in RAID-0 will saturate (e.g. I believe just one Samsung 990 Pro can write at just over 50Gbps).
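As a rough check on that figure, converting a drive's MB/s rating to line rate takes one line. The 6,900 MB/s value below is an assumption based on the advertised peak sequential write spec; sustained writes are usually lower, so treat this as a ceiling and not a measurement.

```python
def mb_per_s_to_gbps(mb_per_s: float) -> float:
    """Convert decimal megabytes per second to gigabits per second."""
    return mb_per_s * 8 / 1000


# Assumption: advertised peak sequential write, not a sustained rate.
ASSUMED_990_PRO_WRITE_MB_S = 6900

single_drive_gbps = mb_per_s_to_gbps(ASSUMED_990_PRO_WRITE_MB_S)
striped_pair_gbps = 2 * single_drive_gbps  # ideal RAID-0 scaling, also an assumption

print(f"one drive:         {single_drive_gbps:.1f} Gbps")
print(f"two drives RAID-0: {striped_pair_gbps:.1f} Gbps (dual 25G link = 50 Gbps)")
```

On paper the cache drives are not the limit; whether the rest of the box keeps up is a separate question.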

The bigger risk is the CPU. This is an issue with the Ubiquiti UNAS Pro 8, their ~$800 USD 8 bay NAS. In theory it has 10gig networking. In practice the CPU cannot move bits fast enough, because it's a dinky underpowered ARM CPU that they clearly chose to hit that quite affordable price point. It's a decent trade-off, because similar units from Synology are more like $1600, and you can meaningfully hit somewhere between 2.5gig and 10gig; but saturating 10gig is out of the question.

The ENAS has a beefier CPU so it might keep up with 25gig (could this do 50gig bonded?). But only testing will tell.

  • You can hit 10 gig aggregate on an A57 quite easily, given standard memory bandwidth (I've done it). They must be doing something stupid on the software side, like too many copies. Or if you're trying to shove 10 gig into one flow at 1500 MTU, yeah, that might be painful.

    • As I recall there were some people on reddit who got the UNAS Pro 8 up to 10gig, but yeah it was only through some level of software tweaks or network stack config or something. From the factory my understanding is that it struggles.

I have a backup node with a 40G NIC & a ZFS pool of just 8x HDDs set up as a pool of two RAIDZ1 vdevs striped together (i.e. 4x drives in raidz1-0 & 4x drives in raidz1-1 make up the "backup" pool). Restoring full backup images to another server I usually get ~11-12 Gbps over NFS, no flash cache or anything involved.
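For a sense of what those numbers imply per disk, here is a back-of-the-envelope sketch. It assumes a streaming load spreads evenly over the non-parity drives in each RAIDZ1 vdev and ignores NFS and ZFS overhead, so it is a simplification and not a description of how ZFS actually schedules I/O. The function name is made up for illustration.

```python
def per_data_drive_mb_s(pool_gbps: float, vdevs: int,
                        drives_per_vdev: int, parity_per_vdev: int = 1) -> float:
    """Implied MB/s per data drive if throughput is spread evenly (assumption)."""
    data_drives = vdevs * (drives_per_vdev - parity_per_vdev)
    pool_mb_s = pool_gbps * 1000 / 8  # Gbps -> decimal MB/s
    return pool_mb_s / data_drives


# Layout from the comment: two striped RAIDZ1 vdevs of 4 drives each.
for observed_gbps in (11, 12):
    rate = per_data_drive_mb_s(observed_gbps, vdevs=2, drives_per_vdev=4)
    print(f"{observed_gbps} Gbps over NFS -> about {rate:.0f} MB/s per data drive")
```

Both results land in the range a modern HDD can sustain on sequential reads, which fits the claim that no flash cache is needed for this kind of workload.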

Honestly, outside of random access/small file access, my primary NVMe ZFS server isn't all that much faster in raw throughput - despite being 22x NVMe drives going direct to the CPU instead of 8 HDDs going through a SATA controller. I think it's easier to hit other bottlenecks with ZFS/network transfers well before the disk throughput itself. E.g., enabling jumbo frames for NFS did still give me a decent perf/efficiency bonus.
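One way to see why jumbo frames help is to count packets. The sketch below estimates frames per second at line rate for a given MTU. It assumes 38 bytes of per-frame Ethernet overhead (preamble, header, FCS, inter-frame gap) with no VLAN tag, and it ignores IP/TCP headers, so the numbers are approximate.

```python
# Assumption: untagged Ethernet. 8 preamble/SFD + 14 header + 4 FCS + 12 inter-frame gap.
ETHERNET_OVERHEAD_BYTES = 38


def packets_per_second(link_gbps: float, mtu: int) -> float:
    """Approximate full-size frames per second needed to fill the link."""
    wire_bits_per_frame = (mtu + ETHERNET_OVERHEAD_BYTES) * 8
    return link_gbps * 1e9 / wire_bits_per_frame


for mtu in (1500, 9000):
    pps = packets_per_second(10, mtu)
    print(f"10 Gbps at MTU {mtu}: about {pps:,.0f} packets/s")

ratio = packets_per_second(10, 1500) / packets_per_second(10, 9000)
print(f"MTU 1500 needs roughly {ratio:.1f}x as many packets as MTU 9000")
```

Every packet costs some fixed CPU work, so a roughly sixfold drop in packet rate is a plausible source of the efficiency gain, especially on a weak CPU or a single flow.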

> Can you actually saturate the links with the spinning drives?

I can mostly saturate my toy 100gbit link with it on read (to memory, since the other side also needs to not be the problem), but only for as long as the data is already in the ZFS cache (which can be huge, given the hundreds of GB of RAM servers generally have). Not in practice, since you take a massive penalty once you hit the disks, but then again, it can be done.
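How quickly that penalty bites can be sketched with a simple blended-rate model: a fraction of the data is served at cache speed and the rest at disk speed. The 100 Gbps and 12 Gbps figures are illustrative assumptions, not measurements from this setup, and real ARC behaviour (prefetch, eviction) is more complicated.

```python
def blended_gbps(hit_ratio: float, cache_gbps: float, disk_gbps: float) -> float:
    """Effective throughput when hit_ratio of the bytes come from cache.

    Time per gigabit is the weighted sum of the two per-gigabit times,
    so the result is a weighted harmonic mean of the two rates.
    """
    seconds_per_gbit = hit_ratio / cache_gbps + (1 - hit_ratio) / disk_gbps
    return 1 / seconds_per_gbit


ASSUMED_CACHE_GBPS = 100  # assumption: RAM-backed reads fill the link
ASSUMED_DISK_GBPS = 12    # assumption: streaming rate of an HDD pool

for hit_ratio in (1.0, 0.9, 0.5):
    rate = blended_gbps(hit_ratio, ASSUMED_CACHE_GBPS, ASSUMED_DISK_GBPS)
    print(f"hit ratio {hit_ratio:.0%}: about {rate:.0f} Gbps")
```

Under these assumptions even a 90% hit ratio falls well short of line rate, because the slow 10% dominates the total time.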

> Can you actually saturate the links with the spinning drives?

There can easily be a bottleneck depending on how you set up the SATA/SAS side, but if you can get sustained sequential reads or writes, 16x drives at 6 Gbps SATA should be able to saturate 2x 25 Gbps ethernet. The store link shows two expansion ports as well, which should help get bandwidth to the point where 25 Gbps is useful.

Less likely with random reads/writes or mixed use.
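The arithmetic behind this can be sketched two ways: by the SATA interface limit and by what the platters can actually deliver. The 250 MB/s per-drive figure is an assumption for a modern HDD on sequential reads; parity, filesystem, and controller overhead are all ignored.

```python
SATA3_LINK_GBPS = 6.0
# SATA uses 8b/10b encoding, so only 8 of every 10 bits on the wire are payload.
SATA3_PAYLOAD_GBPS = SATA3_LINK_GBPS * 8 / 10

ASSUMED_HDD_MB_S = 250  # assumption: sustained sequential rate of one HDD
ASSUMED_HDD_GBPS = ASSUMED_HDD_MB_S * 8 / 1000

DRIVES = 16
interface_limit_gbps = DRIVES * SATA3_PAYLOAD_GBPS
platter_limit_gbps = DRIVES * ASSUMED_HDD_GBPS

for label, limit in (("SATA interface", interface_limit_gbps),
                     ("HDD platters", platter_limit_gbps)):
    print(f"{label}: {limit:.1f} Gbps "
          f"(1x25G: {'yes' if limit >= 25 else 'no'}, "
          f"2x25G: {'yes' if limit >= 50 else 'no'})")
```

With these assumed numbers the interfaces have room for both links, while the spinning drives themselves cover one 25G link but not two. SSDs in the same bays would move the limit back toward the interface figure.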

Made me think of this:

I got a 10G ethernet network card for my NAS only to realize it has to overlap with my modem's supported bandwidths (IIRC 2.5G, 5G).

Knowing nothing about the space, I had assumed it would use max(node1, node2), but instead it negotiated a 1G link. So it was faster to use the mobo's built-in 2.5G port.

  • The 2.5g/5g 'multigig' standard came out a million years after 10g-baseT. Cheap ex-enterprise 10g cards don't know how to do the middle speeds.
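The negotiation behaviour described here amounts to "highest speed both ends support", not the maximum of either end. A minimal sketch, with the supported-speed sets as assumptions chosen to match the story (an older 10G card without 2.5G/5G, a modem port that tops out at 5G):

```python
from typing import Iterable, Optional


def negotiated_speed_mbps(a: Iterable[int], b: Iterable[int]) -> Optional[int]:
    """Return the fastest speed both link partners advertise, or None if no overlap."""
    common = set(a) & set(b)
    return max(common) if common else None


# Assumed capability sets in Mbps; real hardware may differ.
OLD_10G_CARD = {100, 1000, 10000}       # no multigig speeds
MODEM_PORT = {100, 1000, 2500, 5000}
MOBO_PORT = {100, 1000, 2500}

print("10G card <-> modem:", negotiated_speed_mbps(OLD_10G_CARD, MODEM_PORT), "Mbps")
print("mobo port <-> modem:", negotiated_speed_mbps(MOBO_PORT, MODEM_PORT), "Mbps")
```

With these sets the 10G card falls back to 1G while the onboard 2.5G port links at 2.5G, matching what happened.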

You can fill it with SSDs, and SFP28 is so common the prices are cheap:

https://www.fs.com/c/25g-sfp28-3215

But no, spinning disks won't saturate it, even if you were doing 100% sequential reads.

(I originally said fill it with NVMe - I was wrong)

  • It looks like you can put 2 nvme drives in it, for caching.

    • While that's the ARC, I would be surprised if they blocked you from building vdevs with SSDs.

      Looking at the specs: https://store.ui.com/us/en/category/network-storage/products...

      Hard Drive Capacity

      (16) 2.5/3.5" HDD / SSD support

      (2) M.2 NVMe SSD support

      (2) Expansion ports support

      I think you're right that we only get two NVMe SSDs as the cache, but it looks like we can run the other 16 bays as SATA SSDs, which is often fine if you primarily care about random IOPS and capacity over pure throughput.

      Would you consider that a dealbreaker?

With NVMe-oF and RDMA I can easily saturate a 25G link with around 16 spinning disks.

With the ZIL/SLOG on NVMe, yes -- you would want redundant power, a UPS and a RAID of NVMe drives, but with all that in place the data would get securely written to flash media before being flushed to spinning rust.