The Latency of Remote Memory Paging



Based on the measurements above, we can compute the paging latency. For example, the elapsed time of FFT on 28 Mbytes of input is 208 seconds, while the user time is 78.5 seconds. The remaining 129.5 seconds should be attributed to paging overhead induced by 6520 page-in requests and 7791 page-out requests. Thus, the average latency per request is 129.5 s / (6520 + 7791) requests, or 9.05 ms. Of this, 7.2 ms were spent transferring each page over the Ethernet, and the remaining 1.85 ms were the average software latency per paging request.
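The arithmetic behind these figures can be reproduced with a short sketch. It uses only the numbers quoted in this section; the function and variable names are illustrative, and it assumes, as the text does, that all elapsed time beyond user time is paging overhead.

```python
# Figures quoted in the text for FFT on 28 Mbytes of input.
ELAPSED_S = 208.0            # elapsed (wall-clock) time, seconds
USER_S = 78.5                # user time, seconds
PAGE_INS = 6520              # page-in requests
PAGE_OUTS = 7791             # page-out requests
ETHERNET_MS_PER_PAGE = 7.2   # transfer time per page on the Ethernet, ms


def paging_latency_ms(elapsed_s, user_s, page_ins, page_outs):
    """Average latency per paging request, in milliseconds.

    Assumes every second not spent in user time is paging overhead.
    """
    overhead_s = elapsed_s - user_s
    requests = page_ins + page_outs
    return overhead_s / requests * 1000.0


latency_ms = paging_latency_ms(ELAPSED_S, USER_S, PAGE_INS, PAGE_OUTS)
software_ms = latency_ms - ETHERNET_MS_PER_PAGE

print(f"average latency per request: {latency_ms:.2f} ms")
print(f"software latency per request: {software_ms:.2f} ms")
```

Rounded to two decimals, the results agree with the 9.05 ms and 1.85 ms stated above.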

Previous measurements have reported that each page-in of an 8 KByte page takes about 45 ms over an Ethernet [14]. Of those 45 ms, 19 ms were spent on TCP overhead, 4 ms on Mach IPC overhead, 7.2 ms on the Ethernet, and the rest on the computer's I/O bus. The total software latency of our implementation is only 1.85 ms. There are three reasons for this significant difference in performance:

The I/O bus of the DEC Alpha 3000 model 300 we use is fast enough that it does not limit performance.
The processor we use is a DEC Alpha, which is 3-4 times faster than the 386 processor used in [14].
Finally, our pager is implemented as a block device driver, while in [14] it was implemented as a user-level memory manager on top of Mach. Although user-level memory management offers greater flexibility, it incurs a large overhead.
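To put the two sets of figures side by side, the following sketch derives the I/O bus share of the 45 ms reported in [14] and compares the non-Ethernet time of both systems. Grouping everything except the Ethernet transfer into one "non-Ethernet" figure is an assumption made here for illustration, not a breakdown given in [14]; all names are illustrative.

```python
# Per-page-in breakdown reported in [14], in milliseconds.
PREV_TOTAL_MS = 45.0
PREV_TCP_MS = 19.0
PREV_MACH_IPC_MS = 4.0
ETHERNET_MS = 7.2

# Software latency per paging request measured in this section, ms.
OUR_SOFTWARE_MS = 1.85

# "The rest" of the 45 ms is attributed to the I/O bus.
prev_io_bus_ms = PREV_TOTAL_MS - PREV_TCP_MS - PREV_MACH_IPC_MS - ETHERNET_MS

# Everything in [14] that is not Ethernet transfer time (assumed grouping).
prev_non_ethernet_ms = PREV_TOTAL_MS - ETHERNET_MS

ratio = prev_non_ethernet_ms / OUR_SOFTWARE_MS

print(f"I/O bus time in [14]: {prev_io_bus_ms:.1f} ms")
print(f"non-Ethernet time in [14]: {prev_non_ethernet_ms:.1f} ms")
print(f"ratio to our software latency: {ratio:.1f}x")
```

Since the 45 ms figure is itself approximate, the derived values should be read as rough estimates.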

In general, although our approach may be less flexible than a full-fledged user-level pager, it performs much better. Moreover, our device-driver implementation provides better performance than traditional disk paging, while user-level implementations have not reported performance results to support similar claims [14].


Evangelos Markatos
Fri Mar 24 14:41:51 EET 1995