BSDCan 2013
The Technical BSD Conference

Speakers
Luigi Rizzo
Schedule
Day Talks - Day 1 - 2013-05-17
Room MNT 202
Start time 15:00
Duration 01:00
Info
ID 391
Track Hacking
Language used for presentation English

Lightning fast networking in your virtual machine

High speed network communication is challenging on bare metal, and even more so in virtual machines, where we must also deal with expensive I/O instruction emulation, data format manipulation, and handing data off through multiple threads, device drivers and virtual switches.

Common solutions to the problem rely on hardware support (such as PCI passthrough) that makes portions of the NIC directly accessible to the guest operating system, or on specialized drivers (virtio-net, vmxnet, xenfront) built around a device model that is easier to emulate.

These solutions can reach 10 Gbit/s and higher speeds (with suitably large frames), one order of magnitude faster than emulated conventional NICs (e.g. Intel e1000).

Contrary to popular belief, NIC emulation is not inherently slow. In this paper we show how we achieved VM-to-VM throughputs of 4 Mpps and latencies as low as 100 us with only minimal modifications to an e1000 device driver and its frontend running on KVM.

Our work relies on four main components, which can be applied independently:

1) proper emulation of certain NIC features, such as interrupt mitigation, which greatly reduces the emulation overhead;
2) modified device drivers that reduce the number of I/O instructions, which are much more expensive on virtual machines than on real hardware (see the sketch after this list);
3) a small extension of the device model, which permits shared-memory communication with the hypervisor without requiring a completely new device driver;
4) a fast network backend (VALE), based on the netmap framework, which can sustain multiple millions of packets per second.
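
As a concrete illustration of point #2, the following minimal sketch shows the idea: fill several descriptors in guest memory, then issue a single write to the trapping tail register for the whole burst. The types and the write_tdt() helper below are illustrative stand-ins, not the actual em(4)/e1000 driver code.

/* Illustrative sketch of technique #2: one doorbell write per burst.
 * On a VM every write to a device register causes an exit to the
 * hypervisor, so batching removes most of the I/O emulation cost.
 * Types and names below are stand-ins, not the real e1000 driver. */
#include <stddef.h>
#include <stdint.h>

struct tx_ring {
	uint32_t tail;      /* next free descriptor slot */
	uint32_t ndesc;     /* number of descriptors in the ring */
};

/* Stand-in for a write to the e1000 TDT (transmit tail) register;
 * in a guest this is the expensive, trapping operation. */
static void
write_tdt(volatile uint32_t *tdt_reg, uint32_t val)
{
	*tdt_reg = val;
}

/* Queue n packets, touching only guest memory, then notify the NIC once. */
static void
xmit_batch(struct tx_ring *r, volatile uint32_t *tdt_reg,
    const void **pkts, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		/* fill descriptor r->tail from pkts[i]: memory only, no trap */
		(void)pkts[i];
		r->tail = (r->tail + 1) % r->ndesc;
	}
	write_tdt(tdt_reg, r->tail);  /* one VM exit per batch, not per packet */
}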

With the combination of these techniques, our VM-to-VM throughput (two FreeBSD guests running on top of QEMU-KVM) went from 80 Kpps to almost 1 Mpps with socket-based applications, and to 4 Mpps with netmap clients running in the guests. Similarly, latency was reduced more than fivefold, to less than 100 us.
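
For reference, a netmap client running in the guest can be as simple as the following receive loop. This is a minimal sketch assuming the nm_open()/nm_nextpkt() helpers from netmap's net/netmap_user.h; the interface name em0 is only an example.

#define NETMAP_WITH_LIBS    /* enable the nm_open()/nm_nextpkt() helpers */
#include <net/netmap_user.h>
#include <poll.h>
#include <stdio.h>

int
main(void)
{
	struct nm_desc *d;
	struct nm_pkthdr h;
	struct pollfd pfd;
	unsigned long count = 0;

	d = nm_open("netmap:em0", NULL, 0, NULL); /* attach to the emulated NIC */
	if (d == NULL) {
		perror("nm_open");
		return 1;
	}
	pfd.fd = NETMAP_FD(d);
	pfd.events = POLLIN;
	for (;;) {
		poll(&pfd, 1, 1000);              /* wait for received packets */
		while (nm_nextpkt(d, &h) != NULL) /* drain all RX rings */
			count++;
		printf("%lu packets\n", count);
	}
	/* not reached: nm_close(d); */
}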

Importantly, these techniques can be applied independently, depending on the circumstances. In particular, #1 and #4 modify the hypervisor but do not require any change in the guest operating system. #2 introduces a minuscule change in the guest device driver, but does not touch the hypervisor. #3 relies on both device driver and hypervisor changes, but these are limited to a few hundred lines of code, compared to the 3-5K lines needed to implement a new device driver and its corresponding frontend in the hypervisor.
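
To make the hypervisor-side nature of #1 concrete, the following sketch shows the gist of emulating interrupt mitigation: raise at most one guest interrupt per mitigation interval (as programmed through the e1000 ITR register) rather than one per packet. All types and helper names are illustrative stand-ins, not actual QEMU/KVM code.

#include <stdbool.h>
#include <stdint.h>

/* Illustrative device-model state; not the actual QEMU e1000 emulator. */
struct e1000_mitigation {
	uint64_t itr_ns;      /* mitigation interval, from the guest's ITR setting */
	uint64_t next_irq_ns; /* earliest time the next interrupt may be raised */
	bool     irq_pending; /* packets arrived while mitigation was active */
};

/* Called by the emulator for every packet placed on the guest RX ring. */
static void
rx_notify(struct e1000_mitigation *m, uint64_t now_ns,
    void (*raise_irq)(void), void (*arm_timer)(uint64_t when_ns))
{
	if (now_ns >= m->next_irq_ns) {
		raise_irq();               /* interrupt right away */
		m->next_irq_ns = now_ns + m->itr_ns;
	} else {
		m->irq_pending = true;     /* coalesce with later packets */
		arm_timer(m->next_irq_ns); /* one interrupt when the window ends */
	}
}

/* Mitigation timer callback: one interrupt covers the whole burst. */
static void
mitigation_expired(struct e1000_mitigation *m, uint64_t now_ns,
    void (*raise_irq)(void))
{
	if (m->irq_pending) {
		m->irq_pending = false;
		raise_irq();
		m->next_irq_ns = now_ns + m->itr_ns;
	}
}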

More results and code will be made available at http://info.iet.unipi.it/~luigi/vale/