Comment by menaerus

3 hours ago

Yes, I understand that. A high TLB miss rate is implied. However, I'm wondering whether the penalty, which we can quantify as up to 4 memory accesses per miss for a 4-level page table (~20 cycles if the page-table entries are already in L1 cache, or ~60-200 cycles if they are in L2/L3), would be noticeable in workloads that are IO bound. In other words, would such workloads benefit from switching to huge pages when the CPU spends most of its time waiting for data to arrive from storage anyway?
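To put rough numbers on that question, here is a back-of-envelope sketch. The per-access latencies (5, 15, and 50 cycles) are simply the comment's ~20 and ~60-200 cycle figures divided by four levels; the request shape (100 TLB misses, about 100 µs of storage wait at 3 GHz, 50k compute cycles) is entirely hypothetical, and the function names are made up for illustration.

```python
# Back-of-envelope model; every number here is an assumption, not a measurement.
LEVELS = 4  # 4-level page table, as in the comment


def walk_cycles(cycles_per_access, levels=LEVELS):
    """Worst-case page walk cost: one memory access per page-table level."""
    return levels * cycles_per_access


def tlb_share_of_request(misses, cycles_per_access, io_wait_cycles, compute_cycles):
    """Fraction of a request's wall-clock cycles spent in page walks."""
    walk_total = misses * walk_cycles(cycles_per_access)
    return walk_total / (walk_total + io_wait_cycles + compute_cycles)


if __name__ == "__main__":
    # Assumed per-access latencies, derived from the comment's ~20 / ~60-200 figures.
    for name, latency in [("L1", 5), ("L2", 15), ("L3", 50)]:
        print(name, walk_cycles(latency), "cycles per walk")

    # Hypothetical request: 100 misses served from L3, ~100 us of IO wait
    # at 3 GHz (300,000 cycles), and 50,000 cycles of actual compute.
    print(tlb_share_of_request(100, 50, 300_000, 50_000))
```

Under these assumed inputs the page walks are a small slice of wall-clock time, which is the intuition behind the question; real workloads would need measured miss counts (for example from hardware performance counters) rather than guesses.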

jeffbee 2 hours ago

In a multi-tenant environment, yes. The faster they can get off the CPU and yield to some other tenant, the better it is.
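A rough way to see this: IO wait does not occupy the core, so what matters to other tenants is on-CPU cycles, and page walks are part of those. The sketch below is only illustrative; the miss counts and cycle figures are hypothetical, the function names are made up, and it ignores that a huge-page walk is also shorter (a 2 MiB mapping on x86-64 ends one level earlier).

```python
# Illustrative only: all inputs are hypothetical, not measured.

def on_cpu_cycles(compute_cycles, tlb_misses, cycles_per_walk):
    """Cycles a request actually occupies the core (IO wait excluded)."""
    return compute_cycles + tlb_misses * cycles_per_walk


def cpu_freed(compute_cycles, misses_small_pages, misses_huge_pages, cycles_per_walk):
    """Fraction of on-CPU time given back when huge pages reduce TLB misses."""
    before = on_cpu_cycles(compute_cycles, misses_small_pages, cycles_per_walk)
    after = on_cpu_cycles(compute_cycles, misses_huge_pages, cycles_per_walk)
    return (before - after) / before


if __name__ == "__main__":
    # Hypothetical: 50k compute cycles, 100 misses with 4 KiB pages,
    # 10 misses with huge pages, 200 cycles per walk.
    print(cpu_freed(50_000, 100, 10, 200))
```

Even when page walks are negligible against wall-clock time, they can be a visible share of the time the request holds the core, which is the quantity a co-tenant cares about.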