SWIOTLB by Morpheus

Stefano Stabellini

Aug 14, 2013 — 6 min read

The following monologue explains how Linux drivers are able to program a device when running in a Xen virtual machine on ARM.
The problem that needs to be solved is that guests on Xen on ARM run with second stage translation enabled in hardware. That means that what the Linux kernel sees as a physical address doesn’t actually correspond to a machine address. An additional translation layer is set up by the hypervisor to do the conversion in hardware.
Many devices use DMA to read or write buffers in main memory and they need to be programmed with the addresses of the buffers. In the absence of an IOMMU, DMA requests don’t go through the same physical to machine translation set up by the hypervisor for virtual machines, so devices need to be programmed with machine addresses rather than physical addresses. Hence the problem we are trying to solve.
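The mismatch can be sketched with a toy model in Python. The table contents and the function names below are invented for illustration; they are not Xen or Linux interfaces, and a real second stage translation is done by the MMU, not by a dictionary.

```python
PAGE_SIZE = 4096

# Hypothetical second stage table: guest physical frame -> machine frame.
STAGE2 = {0x10: 0x8A, 0x11: 0x3C, 0x12: 0x8B}

def phys_to_machine(paddr, stage2=STAGE2):
    """Translate a guest physical address as the second stage would."""
    frame, offset = divmod(paddr, PAGE_SIZE)
    return stage2[frame] * PAGE_SIZE + offset

def cpu_access(paddr):
    """CPU accesses always go through the second stage translation."""
    return phys_to_machine(paddr)

def device_access(dma_addr):
    """Without an IOMMU, the device uses the programmed address as is."""
    return dma_addr

paddr = 0x10 * PAGE_SIZE + 0x20  # what Linux calls a physical address

# Programming the device with the physical address reaches the wrong memory.
assert device_access(paddr) != cpu_access(paddr)
# Programming it with the machine address reaches the same memory as the CPU.
assert device_access(phys_to_machine(paddr)) == cpu_access(paddr)
```

The two assertions are the whole point: the CPU and the device only agree on the buffer when the device is given the machine address.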
Definitions of some of the technical terms used in this article are available at the bottom of the page.
Given the complexity of the topic, we decided to ask for help from somebody with hands-on experience in teaching others to recognize the difference between “virtual” and “real”.

Morpheus

At last.
Please. Come. Sit.

Do you realize that everything running on Xen is a virtual machine — that Dom0, the OS from which you control the rest of the system, is just the first virtual machine created by the hypervisor? Usually Xen assigns all the devices on the platform to Dom0, which runs the drivers for them.
I imagine, right now, you must be feeling a bit like Alice, tumbling down the rabbit hole?
Let me tell you why you are here.
You are here because you want to know how to program a device in Linux on Xen on ARM.
Do you want to know what the Matrix is, Neo?
It’s that feeling you have had all your life. That feeling that something was wrong with the world. The Matrix is everywhere, it’s all around us, here even in this room.
Hold out your hands.
You take the blue pill and the story ends. You wake in your bed and you believe whatever you want to believe. You take the red pill and you stay in Wonderland and I show you how deep the rabbit-hole goes.
I know you would make the right choice.
Follow me. Just relax.
On x86, Dom0 is just a regular PV guest, running without nested paging, using hypercalls to issue pagetable manipulations.
You know what that means, don’t you?
Virtual addresses translate directly into machine addresses.
What you see as a physical address, a pfn, doesn’t actually exist.
It’s all part of an elaborate fabrication. The Matrix. That is what this world really is.
Listen.
Linux on x86 knows this too. He can resolve pseudo-physical addresses into machine addresses if he wants to. Generic Linux functions reason in terms of non-existent physical addresses that get translated into machine addresses afterwards if necessary.
For you it’s going to be harder, Neo.
The Matrix is stronger for autotranslate guests like Xen on ARM virtual machines. Linux on autotranslate guests uses physical addresses more frequently. He writes a set of pagetables that translate virtual addresses to intermediate physical addresses. The hypervisor writes a second set of pagetables to translate intermediate physical addresses to machine addresses. This is hardware-assisted paging. Autotranslate guests know very little about the underlying machine addresses: it’s all taken care of by the hardware.
Welcome to the real world, Neo.
I see that you have finally woken up.
This is a sparring program. Consider this your first lesson.
Attack me.
Good. Adaptation. Improvisation.
But your weakness isn’t your technique.
Do you think my being faster, stronger has anything to do with my muscles in this place?
If you want to use your full potential you need to program a device to do DMA. It’s not enough to translate a physical address into a machine address right before issuing a DMA request, because DMA requests can involve multiple contiguous pages. If you allocate a contiguous buffer in physical address space, it won’t actually be contiguous in machine address space.
Don’t look so stunned. It’s not magic.
Using the swiotlb-xen driver, Linux on x86 can ask Xen to make the allocated buffer really contiguous. Xen accomplishes the task by reclaiming the old buffer and allocating another buffer, this time contiguous in machine address space, changing the mapping of the buffer passed as an argument. Suddenly the virtual address of the buffer in Linux points to a different buffer that is safe to use for DMA. swiotlb-xen returns the machine address of the buffer as the DMA address.
You need to work harder than that.
Xen ARM guests are all autotranslate guests. For them, physical addresses actually exist. They don’t know what we know, Neo. What we have come to accept.
You must empty yourself to free your mind.
You need to learn to use swiotlb-xen on Xen on ARM.
Device drivers call one of the dma_map_ops functions to allocate a DMA-capable buffer, as usual. These functions are implemented by swiotlb-xen. They ask Xen to make the buffer really contiguous, the same as Linux on x86.
However this time Xen changes the physical to machine mappings of the buffer. The hypervisor also ensures that the mappings won’t be modified as long as Linux needs them. Xen returns the machine address of the first page of the buffer to Linux. Unlike the Linux on x86 case, the virtual address of the buffer in Linux still points to the same intermediate physical address after the hypercall returns. However the second stage pagetables have been changed and now point to a contiguous buffer in machine memory. swiotlb-xen returns the new machine address of the buffer as the DMA address. This is one of the few cases where the DMA address (machine address) is actually different from the CPU address (intermediate physical address) in Linux on recent x86 and ARM platforms.
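A toy Python model of that exchange, under loud assumptions: make_contiguous is a made-up stand-in for the hypervisor side of the request, not a real hypercall or swiotlb-xen function, and the frame numbers are arbitrary. It only shows which table changes and which one does not.

```python
PAGE_SIZE = 4096

def make_contiguous(stage2, first_pfn, nr_pages, new_first_mfn):
    """Hypothetical stand-in for the hypervisor side of the exchange.

    Rewrites the second stage entries of nr_pages guest frames so that they
    point at the machine frames new_first_mfn, new_first_mfn + 1, and so on.
    Returns the machine address of the first page.
    """
    for i in range(nr_pages):
        stage2[first_pfn + i] = new_first_mfn + i
    return new_first_mfn * PAGE_SIZE

# First stage, written by Linux: virtual frame -> intermediate physical frame.
stage1 = {0x7000: 0x10, 0x7001: 0x11}
# Second stage, written by the hypervisor: intermediate physical -> machine.
stage2 = {0x10: 0x8A, 0x11: 0x3C}
stage1_before = dict(stage1)

dma_addr = make_contiguous(stage2, first_pfn=0x10, nr_pages=2,
                           new_first_mfn=0x200)
cpu_addr = stage1[0x7000] * PAGE_SIZE  # intermediate physical address

assert stage1 == stage1_before               # Linux's own pagetables untouched
assert stage2 == {0x10: 0x200, 0x11: 0x201}  # now contiguous in machine memory
assert dma_addr != cpu_addr                  # DMA address differs from CPU address
```

In the x86 PV case the picture would be reversed: there is no second table to rewrite, so it is the mapping behind the virtual address that ends up pointing at different machine frames.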
You can’t always rely on hardware IOMMUs to set up physical to machine address translations for all the devices on the platform.
Do you believe that’s air you are breathing now?

Definitions

virtual address
An address in memory that needs to be translated by the MMU into a guest physical address in order to be used to access memory.
physical address
When running in a virtual machine, a physical address needs to be translated into a machine address in order to be used to access memory. The translation can happen in hardware, transparently to the operating system (for example using EPT or second stage translation), or in software, usually with collaboration from the OS. In ARM lingo, this is called intermediate physical address.
machine address
A number that points to a particular cell in real unvirtualized memory. It can be used to access memory.
Confusingly enough, in ARM lingo this is called physical address.
nested paging
A hardware-based mechanism to translate physical addresses into machine addresses. Intel provides EPT, AMD provides NPT, ARM provides second stage translation.
PV guests
A type of Xen virtual machine on x86. They are fully aware that they are running in a virtualized environment. They run without nested paging. They use hypercalls to issue pagetable modifications. The pagetables end up containing machine addresses, therefore the physical address space is just an internal representation of the kernel.
autotranslate guests
A type of Xen virtual machine. An autotranslate guest knows that it is running in a virtualized environment but at the same time makes use of nested paging. Xen virtual machines on ARM are autotranslate guests.
DMA
Direct memory access from a device. The device is programmed by the operating system to access a buffer (read or write) in main memory. If the buffer spans multiple pages, it needs to be contiguous.
bus and device address
The address used by a device to access memory. In the absence of an IOMMU device, device addresses correspond to machine addresses (on x86 and ARM).
IOMMU
Input/Output Memory Management Unit. It translates addresses used by a device for DMA into machine addresses.
DMA and Xen virtual machines: the address space
When the operating system running in a virtual machine programs a device to do DMA, it can use physical addresses only if an IOMMU is available on the platform and the hypervisor has programmed it correctly. Otherwise the operating system needs to use machine addresses.
DMA and Xen virtual machines: contiguousness of the buffer
If the buffer used for DMA spans multiple pages, it needs to be contiguous in machine address space. When the operating system kernel running in a virtual machine allocates a contiguous buffer, it is usually contiguous only in physical address space.
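The last two definitions can be turned into a small check, sketched here in Python. The p2m table and the helper names are hypothetical and exist only for this example; a real guest on ARM cannot simply read its physical to machine mappings like this.

```python
PAGE_SIZE = 4096

def is_machine_contiguous(first_pfn, nr_pages, p2m):
    """True if consecutive guest frames map to consecutive machine frames."""
    mfns = [p2m[first_pfn + i] for i in range(nr_pages)]
    return all(nxt == cur + 1 for cur, nxt in zip(mfns, mfns[1:]))

def dma_safe(paddr, length, p2m):
    """True if the buffer is one contiguous range in machine address space."""
    first_pfn = paddr // PAGE_SIZE
    last_pfn = (paddr + length - 1) // PAGE_SIZE
    return is_machine_contiguous(first_pfn, last_pfn - first_pfn + 1, p2m)

# Hypothetical physical frame -> machine frame mappings.
p2m = {0x10: 0x8A, 0x11: 0x3C, 0x20: 0x50, 0x21: 0x51}

# A buffer within a single page is always fine.
assert dma_safe(0x10000, PAGE_SIZE, p2m)
# Two pages, contiguous in physical space, scattered in machine space.
assert not dma_safe(0x10000, 2 * PAGE_SIZE, p2m)
# Two pages that happen to be contiguous in machine space as well.
assert dma_safe(0x20000, 2 * PAGE_SIZE, p2m)
# Even a small buffer is unsafe if it straddles a page boundary.
assert not dma_safe(0x10F00, 512, p2m)
```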