BSDCan2009 - Final Release

BSDCan 2009
The Technical BSD Conference

Luigi Rizzo
Day Talks - 2 - 2009-05-09
Room MNT 202
Start time 10:00
Duration 01:00
ID 122
Event type Lecture
Track Hacking
Language used for presentation English

GEOM based disk schedulers for FreeBSD

The high cost of seek operations makes the throughput of disk devices very sensitive to the offered workload. A disk scheduler can then help reorder requests to improve the overall throughput of the device, or improve the service guarantees for individual users, or both.

Research results in recent years have introduced, and proven the effectiveness of, a technique called "anticipatory scheduling". The basic idea behind this technique is that, in some cases, requests that cause a seek should not be served immediately; instead, the scheduler should wait for a short period of time in case other requests arrive that do not require a seek to be served. With many common workloads, dominated by sequential synchronous requests, the potential loss of throughput caused by the disk idling times is more than balanced by the overall reduction of seeks.

While a fair amount of research on disk scheduling has been conducted on FreeBSD, the results were never integrated in the OS, perhaps because the various prototype implementations were very device-specific and operated within the device drivers. Ironically, anticipatory schedulers are instead a standard part of Linux kernels.

This talk has two major contributions:

First, we will show how, thanks to the flexibility of the GEOM architecture, an anticipatory disk scheduling framework has been implemented in FreeBSD with little or no modification to a GENERIC kernel. While these schedulers operate slightly above the layer where one would naturally put a scheduler, they can still achieve substantial performance improvements over the standard disk scheduler; in particular, even the simplest anticipatory schedulers can prevent the complete trashing of the disk performance that often occurs in presence of multiple processes accessing the disk.

Secondly, we will discuss how the basic anticipatory scheduling technique can be used not only to improve the overall throughput of the disk, but also to give service guarantees to individual disk clients, a feature that is extremely important in practice e.g., when serving applications with pseudo-real-time constraints such as audio or video streaming ones.

A prototype implementation of the scheduler that will be covered in the presentation is available at