BSDCan2007 - Confirmed Schedule

BSDCan 2007
The Technical BSD Conference

John Baldwin
Day 3
Room SITE A0150
Start time 11:30
Duration 01:00
ID 2
Event type Lecture
Track Advanced
Language English

PCI Interrupts for x86 Machines under FreeBSD

Pardon me. Excuse Me.

An important element in computers with multiple autonomous devices is the ability of a device to notify the CPU, via an interrupt, that it needs attention. The OS-visible mechanics of interrupts for PCI devices are quite convoluted, especially on x86 PC systems. This paper will cover the various ways that PCI INTx interrupts have been implemented on x86, as well as the methods used by the system BIOS to communicate the implementation to operating systems. It will also cover the newer Message Signalled Interrupts (MSI), which address some of the limitations of INTx interrupts. Finally, the paper will provide an overview of FreeBSD's implementation of both INTx and MSI interrupts on the x86 platform.

  • Legacy PCI INTx interrupts
- Each device/slot has 4 pins, A# through D#, shared by the functions that are hooked up to something. The pins are wired via side-band signals.
- Figuring out how various things are hooked together, so that when an interrupt comes in we know which device(s) triggered it, is called interrupt routing.
  • Interrupts on x86
- Each interrupt includes an IDT index (vector), and the OS maps a handler for each of the 256 possible vectors. Vectors 0-31 are reserved for faults.
  • Interrupt Controllers (Stuff in the Middle)
- 8259A Master and Slave PICs (PC-AT): 8 pins each, resulting in 16 ISA IRQs. IRQ 2 is effectively lost, as it is used to chain the slave onto the master. IRQs 0, 1, 8, and 13 are always reserved for non-PCI devices. IRQs 3, 4, 6, 7, 12, 14, and 15 are usually reserved for non-PCI devices. That leaves IRQs 5, 9, 10, and 11, and 5 is sometimes used as well (e.g. sound cards often used it). Each PIC is given a range of 8 contiguous (and aligned) IDT vectors.
- Programmable Interrupt Routers (PCI Link Devices): Route PCI interrupts to pins on an interrupt controller. A router provides N input pins, and 0 or more PCI interrupt lines can be tied together on an input pin. Each input pin can then be independently routed (or steered) to a specific pin on an interrupt controller.
- I/O APICs: A system can have any number of I/O APICs, which tend to have 16, 24, or 32 pins. The first 16 pins for the I/O APICs are reserved for ISA devices. Typically PCI interrupt lines are tied to dedicated pins, but they are sometimes still shared. A PCI link can also be tied to an input pin. Each pin has its own IDT vector value.
- PCI interrupt routing on x86 thus consists of finding which (controller, pin) a (bus, slot, pin) maps to and making sure the IDT vector for (controller, pin) will trigger the interrupt handler for (bus, slot, pin).
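The 8259A arithmetic above can be sketched in a few lines. This is only an illustration, not FreeBSD code: the function and constant names are hypothetical, and the vector bases 32 and 40 are an assumption chosen to match the "32 through 47" range that the outline assigns to the 8259As.

```python
# Hypothetical sketch of the 8259A IRQ budget and IRQ -> IDT vector mapping.
MASTER_BASE = 32   # assumption: master PIC covers vectors 32-39
SLAVE_BASE = 40    # assumption: slave PIC covers vectors 40-47

CASCADE_IRQ = 2                               # chains the slave onto the master
ALWAYS_RESERVED = {0, 1, 8, 13}               # never available to PCI
USUALLY_RESERVED = {3, 4, 6, 7, 12, 14, 15}   # usually taken by non-PCI devices


def pic_vector(irq, master_base=MASTER_BASE, slave_base=SLAVE_BASE):
    """Return the IDT vector for an ISA IRQ on a master/slave 8259A pair."""
    if not 0 <= irq <= 15:
        raise ValueError("ISA IRQs are 0-15")
    for base in (master_base, slave_base):
        if base % 8 != 0:
            raise ValueError("each PIC needs an 8-aligned vector range")
    if irq < 8:
        return master_base + irq
    return slave_base + (irq - 8)


def pci_candidate_irqs():
    """IRQs that are typically left over for PCI devices."""
    taken = ALWAYS_RESERVED | USUALLY_RESERVED | {CASCADE_IRQ}
    return [irq for irq in range(16) if irq not in taken]
```

With these assumptions, `pci_candidate_irqs()` yields the four IRQs named in the outline, which is why PCI interrupt sharing is so common on systems limited to the 8259As.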
  • Figuring out the Mapping
- For devices behind a PCI-PCI bridge on an add-in card, interrupts are routed via a swizzle onto the interrupt pins on the bridge's upstream interface. The same swizzle is used for PCI-PCI bridges in the system when no other interrupt routing information is provided for that bus.
- On really old systems, the BIOS just wrote the ISA IRQ into the interrupt line PCI config register.
- For systems with a programmable interrupt router, the $PIR table was added. It maps (bus, slot, pin) tuples to a link value. Each link value represents a programmable input pin on the router. Thus, to route an interrupt you first look up the right link value, then see which IRQ that link is routed to. If the link isn't routed yet, you have to route it to an IRQ.
- When I/O APICs were introduced, the MP Table was added. This table not only enumerates the I/O APICs in the system, it also tells you which device interrupts are hooked up to which pins on the I/O APICs. In some of the oldest systems with I/O APICs, there was only a single I/O APIC with 16 intpins that provided the 16 ISA IRQs, and interrupts were still routed to the ISA IRQs using a programmable interrupt router. For most systems, however, individual PCI interrupts are routed to individual pins on I/O APICs.
- ACPI reduces the two tables ($PIR and MP Table) down to one. With ACPI, each PCI bus has its own routing table (_PRT). The OS tells ACPI up front whether it is using the APICs or the 8259As. To route an interrupt, the OS finds the parent PCI bus object in the ACPI namespace and looks up the (device, pin) tuple in the _PRT to find the destination. The destination can be either a hard-wired interrupt number (typical for the APIC case) or a link device. ACPI treats individual pins on the programmable interrupt router as PCI link devices and provides methods for programming them. Note that ACPI lets you use programmable links with APICs, which the MP Table does not. At least some NVidia amd64 motherboards are known to do this.
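A minimal sketch of two of the lookups described above: the PCI-PCI bridge swizzle and a $PIR-style link lookup. The swizzle uses the conventional rotation (parent pin = (slot + pin) mod 4, counting pins from zero); the outline does not spell out the formula, so treat it as an assumption here. The table layout, the `route_via_link` function, and the `pick_irq` callback are hypothetical simplifications, not the real $PIR format.

```python
# Hypothetical sketch; pins are numbered 1-4 for INTA#-INTD#.

def swizzle(slot, pin):
    """Map a device's pin behind a PCI-PCI bridge to the bridge's upstream pin."""
    if pin not in (1, 2, 3, 4):
        raise ValueError("pin must be 1 (INTA#) through 4 (INTD#)")
    return ((slot + (pin - 1)) % 4) + 1


def route_via_link(bus, slot, pin, pir_table, link_irqs, pick_irq):
    """Resolve (bus, slot, pin) to an IRQ through a programmable router link.

    pir_table: dict mapping (bus, slot, pin) -> link value (simplified $PIR)
    link_irqs: dict mapping link value -> IRQ for links that are already routed
    pick_irq:  callback choosing an IRQ for a link that is not routed yet
    """
    link = pir_table[(bus, slot, pin)]
    irq = link_irqs.get(link)
    if irq is None:
        # The link is unrouted, so the OS must program it to some IRQ.
        irq = pick_irq(link)
        link_irqs[link] = irq
    return irq
```

For example, with `pir_table = {(0, 3, 1): 0x60}` and an empty `link_irqs`, the first call to `route_via_link(0, 3, 1, ...)` programs the link through `pick_irq`, and later calls for any pin on the same link return the same IRQ without reprogramming it.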
  • FreeBSD's implementation
- Interrupts are routed to an IRQ cookie value by the PCI bridge drivers. Different PCI bridge drivers use different methods (ACPI _PRT, $PIR, MP Table, PCI-PCI swizzle) of routing IRQs. On x86, each interrupt source (i.e., pin on an APIC or 8259A) is assigned a unique IRQ. When a bridge driver locates the interrupt source a PCI interrupt is connected to, it determines that source's IRQ value and returns it. When an interrupt is then activated via bus_setup_intr(), the nexus driver looks up the interrupt source for the specified IRQ cookie value and asks it to register the driver's interrupt handler.
- IDT vectors
    - 0-31 are reserved for exceptions
    - 240-255 are used for IPIs and the spurious vector
    - 128 (0x80) is used for system calls
    - 239 is used for the local APIC timer
    - 32 through 47 are statically assigned to the 8259As
    - That leaves 48 through 127 and 129 through 238 for device interrupts. I/O APIC pins allocate an IDT vector on demand when they are assigned their first interrupt handler.
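The vector layout above can be illustrated with a small on-demand allocator. This is a sketch under stated assumptions, not FreeBSD's allocator: the class and method names are hypothetical, and the lowest-free-vector-first policy is an assumption made for simplicity.

```python
# Hypothetical sketch of on-demand IDT vector allocation for I/O APIC pins.
FIRST_DEVICE_VECTOR = 48
LAST_DEVICE_VECTOR = 238
SYSCALL_VECTOR = 0x80   # 128, reserved for system calls


class VectorAllocator:
    def __init__(self):
        # Device vectors: 48 through 127 and 129 through 238.
        self._free = [
            v
            for v in range(FIRST_DEVICE_VECTOR, LAST_DEVICE_VECTOR + 1)
            if v != SYSCALL_VECTOR
        ]
        self._by_pin = {}

    def free_count(self):
        return len(self._free)

    def vector_for(self, controller, pin):
        """Return the pin's vector, allocating one on first use."""
        key = (controller, pin)
        if key not in self._by_pin:
            if not self._free:
                raise RuntimeError("out of device IDT vectors")
            # Assumption: hand out the lowest free vector first.
            self._by_pin[key] = self._free.pop(0)
        return self._by_pin[key]
```

The ranges leave 190 vectors for devices, so a pin that never gets a handler never consumes one.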
  • PCI Message Signalled Interrupts
- Interrupts are signalled via memory writes instead of sideband signals, which reduces the number of physical signal lines needed. Architectures and/or chipsets define the format of the data to write as well as the address to write to. On x86, the message data includes the IDT vector directly and is parsed by the chipset (either a Host-PCI or HT-PCI bridge) and dispatched directly to the CPUs. Interrupt controllers are no longer involved in interrupt delivery, and interrupt routing is no longer required. Instead, the OS assigns IDT vectors directly to messages. Also, each PCI function can have multiple messages.
- FreeBSD implementation
    - The PCI bus asks the PCI bridge driver for IRQ cookie values. PCI bridges pass the request up the tree.
    - On x86, the nexus0 device eventually gets the request and allocates some free IRQ cookie values in the range 256 to 383. It also assigns IDT vectors to those IRQs. When the interrupt is activated, the appropriate address and data values are generated and written into the MSI control registers in the PCI device.
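As a rough illustration of "the appropriate address and data values", the sketch below composes an x86 MSI message. The outline does not give the bit layout; the one used here (address base 0xFEE00000 with the destination APIC ID in bits 19-12, and the vector in the low 8 bits of the data word) is the commonly documented Intel format and should be read as an assumption. It covers only fixed delivery, edge-triggered, physical-destination messages, and the function name is hypothetical.

```python
# Hypothetical sketch of composing an x86 MSI address/data pair.
MSI_ADDR_BASE = 0xFEE00000   # assumption: Intel-documented MSI address window


def msi_message(vector, dest_apic_id):
    """Return (address, data) for a fixed, edge-triggered MSI message."""
    if not 32 <= vector <= 255:
        raise ValueError("vectors 0-31 are reserved for faults")
    if not 0 <= dest_apic_id <= 255:
        raise ValueError("destination APIC ID must fit in 8 bits")
    # Destination APIC ID goes in address bits 19-12.
    address = MSI_ADDR_BASE | (dest_apic_id << 12)
    # Fixed delivery and edge trigger leave the upper data fields at zero,
    # so the data word is just the IDT vector.
    data = vector
    return address, data
```

Because the vector travels in the message itself, no (controller, pin) lookup is needed: the OS picks a vector, writes the pair into the device's MSI registers, and the chipset delivers the interrupt straight to the target CPU.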