
`madvise(2)` doesn't matter _that_ much in my experience with [1] on modern Linux kernels. An SSD just can't read _quite_ as quickly as memory in my testing. Sure, the SSD can read ahead a lot into RAM, analogous to how memory reads rapidly prefetch into L1.

I get ~30 GiB/s for threaded sequential memory reads, but ~4 GiB/s for SSD. However, I think the SSD number is single-threaded and not even with io_uring—so I need to regenerate those numbers. It's possible it could be 2-4x better.

[1]: https://github.com/sirupsen/napkin-math



I think the effects of madvise primarily crop up in extremely I/O-saturated scenarios, which are rare. Reads primarily incur latency; with a good SSD it's hard to actually run into IOPS limits, and in this scenario you're not likely to run out of RAM for caching either. MADV_RANDOM is usually a pessimization; MADV_SEQUENTIAL may help if you are truly reading sequentially, but may also worsen performance because pages don't linger as long.
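To make the two hints concrete, here's an illustrative sketch (not a benchmark) of where they plug in, assuming Linux and Python 3.8+, where the mmap module exposes madvise:

```python
import mmap
import os
import tempfile

SIZE = 8 * mmap.PAGESIZE  # tiny scratch file, illustrative only
fd, path = tempfile.mkstemp()
try:
    os.write(fd, bytes(SIZE))
    with mmap.mmap(fd, SIZE, prot=mmap.PROT_READ) as m:
        # MADV_SEQUENTIAL: kernel may read ahead aggressively and drop
        # pages behind you sooner (the "don't linger as long" effect).
        m.madvise(mmap.MADV_SEQUENTIAL)
        data = m[:]  # sequential scan

        # MADV_RANDOM: disables readahead, which is why it's often a
        # pessimization for any workload with locality.
        m.madvise(mmap.MADV_RANDOM)
        byte = m[SIZE // 2]  # point lookup
finally:
    os.close(fd)
    os.remove(path)
```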

But as I mentioned, there's caching upon caching, plus protocol-level optimizations and hardware-level considerations (the physical block size may be quite large but is generally unknown).

It's nearly impossible to benchmark this stuff in a meaningful way. Or rather, it's nearly impossible to know what you are benchmarking, as there are a lot of nontrivially stateful parts all the way down that have real impact on your performance.

There are so many moving parts that I think the only meaningful disk benchmark is whatever application you want to make go faster. Make the change. Is it faster? Great. Is it not? Well, at least you learned something.


> I get ~30 GiB/s for threaded sequential memory reads, but ~4 GiB/s for SSD. However, I think the SSD number is single-threaded and not even with io_uring—so I need to regenerate those numbers. It's possible it could be 2-4x better.

Assuming you ran the experiments on an NVMe SSD attached via PCIe 3.0, where the theoretical maximum is around 1 GB/s per lane, I'm not sure how you expect to go faster than 4 GiB/s on a typical x4 link. Isn't that already the theoretical maximum of what you can achieve?
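The lane arithmetic, for reference (this accounts for 128b/130b line encoding only; real throughput is a bit lower still due to packet/protocol overhead):

```python
# Per-lane raw signaling rate (GT/s) and line-encoding efficiency per
# PCIe generation. Gens 3.0-5.0 all use 128b/130b encoding.
gens = {
    "3.0": (8.0, 128 / 130),
    "4.0": (16.0, 128 / 130),
    "5.0": (32.0, 128 / 130),
}
lanes = 4  # typical NVMe SSD link width

for gen, (gt_per_s, enc) in gens.items():
    # GT/s * encoding efficiency = payload Gbit/s per lane; /8 -> GB/s
    gb_per_s = gt_per_s * enc / 8 * lanes
    print(f"PCIe {gen} x{lanes}: ~{gb_per_s:.2f} GB/s")
```

This gives roughly 3.9 GB/s for 3.0 x4, 7.9 GB/s for 4.0 x4, and 15.8 GB/s for 5.0 x4 before protocol overhead, which lines up with both the ~4 GB/s figure above and the ~7 GB/s cited below for 4.0 drives.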


PCIe 4.0 SSDs are pretty common nowadays and are basically limited to what PCIe 4.0 x4 can do (around 7 GB/s net throughput).


I don't think they're that common. You would need a fairly recent motherboard and CPU that both support PCIe 4.0.

And I'm pretty sure the parent commenter doesn't own such a machine, because otherwise I'd expect a 7-8 GB/s figure to be reported in the first place.


I really doubt they’re that common. They only became available on motherboards fairly recently, and are quite expensive.

I’d guess that they’re a small minority of devices at the moment.


PCIe 5.0 has just recently started showing up on consumer motherboards.

4.0 might not be common, but surprisingly it is now the previous generation!


You might be very right about that! It's been a while since I did the SSD benchmarks. Glad to hear it's most likely entirely accurate at 4 GiB/s then!


How'd you measure the maximum memory bandwidth? In Algorithmica's benchmark, the maximum observed bandwidth was about 42 GB/s: https://en.algorithmica.org/hpc/cpu-cache/sharing/

I'm not sure how they calculated the theoretical limit of 42.4 GB/s, but they have multiple measurements higher than 30 GB/s.
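For a crude ballpark of your own machine (nothing like Algorithmica's SIMD benchmarks, and single-threaded, so it will undershoot the numbers above), a large buffer copy moves at close to memory speed:

```python
import time

N = 256 * 1024 * 1024  # 256 MiB buffer
src = bytearray(N)

t0 = time.perf_counter()
dst = bytes(src)  # one full copy of the buffer
elapsed = time.perf_counter() - t0

# A copy both reads and writes N bytes, so total traffic is ~2*N.
gib_per_s = (2 * N / 2**30) / elapsed
print(f"single-threaded copy bandwidth: ~{gib_per_s:.1f} GiB/s")
```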



