Tuesday, May 5, 2015

VFIO GPU How To series, part 2 - Expectations

From part 1 we learned some basic guidelines for the hardware necessary to support GPU assignment, but let's take a moment to talk about what we're actually trying to accomplish.  There's no point in proceeding further if the solution doesn't meet your goals and expectations.

First things first, PCI device assignment is a means to exclusively assign a device to a VM.  Devices cannot be shared among multiple guests and cannot be shared between host and guest.  The solution we're discussing here is not vGPU, VGX, or any other means of multiplexing a single GPU among multiple users.  Furthermore, VFIO works on an isolation unit known as an IOMMU Group.  Endpoints within an IOMMU group follow the same rule; they're either owned by a single guest or the host, not both.  As referenced in part 1, a previous article on IOMMU groups attempts to explain this relationship.
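
If you're curious which devices share an IOMMU group on your system, the grouping is visible directly in sysfs. The loop below is just a quick sketch, assuming a standard Linux sysfs layout and the lspci utility from pciutils:

  # List each IOMMU group and the devices it contains; everything in a
  # group must be owned by the same guest or left to the host.
  for group in /sys/kernel/iommu_groups/*; do
    echo "IOMMU group ${group##*/}:"
    for dev in "$group"/devices/*; do
      lspci -nns "${dev##*/}"
    done
  done

If a group turns out to contain more than just the GPU and its companion audio function, the IOMMU group article mentioned above is the place to go before continuing.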

Next, be aware that Linux graphics drivers, both open source and proprietary, are rather poor at dynamically binding and unbinding devices.  This means that hot-unplugging a graphics adapter from the host configuration, assigning it to a guest for some task, and then re-plugging it into the host desktop is not really achievable just yet.  It may become possible eventually, but graphics drivers need to reach the same level of robustness around binding and unbinding devices that NIC drivers already have before it's really practical.
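
For reference, the bind/unbind cycle in question happens through sysfs. A rough sketch, run as root, with 0000:01:00.0 as a placeholder PCI address and nouveau standing in for whatever host driver you use:

  # Detach the GPU from the host driver that currently owns it
  echo 0000:01:00.0 > /sys/bus/pci/devices/0000:01:00.0/driver/unbind
  # ... assign the device to a guest, do some work, shut the guest down ...
  # Re-attach the GPU to the host driver (nouveau here is only an example)
  echo 0000:01:00.0 > /sys/bus/pci/drivers/nouveau/bind

In theory that's all it takes; in practice it's exactly this unbind and re-bind cycle that today's graphics drivers tend to handle badly.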

Probably the primary incorrect expectation users have around GPU assignment is the idea that the out-of-band VM graphics display, via Spice or VNC, will still be available and will somehow just be accelerated by the assigned GPU.  The misconception is reinforced by YouTube videos that show accelerated graphics running within a window on the host system.  In those videos, the remote display is actually accomplished using TightVNC running within the guest.  Don't get me wrong, TightVNC is a great solution for some use cases, and local software bridges for virtio networking provide enormous amounts of bandwidth to make this a bit more practical than going across a physical wire, but it's not a full replacement for a console screen within virt-manager or other VM management tools.  TightVNC is a server running within the guest: it's only available once the guest is booted, it's rather CPU intensive in the guest, and it's only moderately OK for the remote display of 3D graphics.

When using GPU assignment, the only fully accelerated guest output is through the monitor connectors on the physical graphics card itself.  We currently have no ability to scrape the framebuffer from the physical device, driven by proprietary drivers, and feed those images into the QEMU remote graphics protocols.  There are commercial solutions to this problem; NICE DCV and HP RGS are both software solutions that provide better remote 3D capabilities.  It's even possible to co-assign PCoIP cards to the VM to achieve high-performance remote 3D.  In my use case, I enable TightVNC for 2D interaction with the VM and use the local monitor or software stacks like Steam In-Home Streaming for remote use of the VM.  Tools like Synergy are useful with local monitors to seamlessly combine mouse and keyboard across multiple desktops.

Another frequent misconception is that integrated graphics devices, like Intel IGD graphics, are just another graphics device and should work with GPU assignment.  Unfortunately, IGD is not only non-discrete in the sense that it's integrated into the processor, it's also non-discrete in that its drivers depend on various registers, operation regions, and firmware tables spread across the chipset.  Thus, while we can attach the IGD GPU to the guest, it doesn't work due to these driver requirements.  This issue is being worked on and will hopefully have a solution in the near future.  As I understand it, AMD APUs are much more similar to their discrete counterparts, but the device topology still makes them difficult to work with.  We rely fairly heavily on PCI bus resets to put GPUs into a clean state for the guest and between guest reboots, but this is not possible for IGD or APUs because they reside on the host PCIe root complex.  Resetting the host bus is not only non-standard, it would reset every PCI device in the system.  We really need Function Level Reset (FLR) support in the device to make this work.  For now, IGD assignment doesn't work, and I have no experience with APU assignment, but I expect it to work poorly due to the lack of reset support.
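
If you want to see whether a given device even advertises a function-level reset, both lspci and sysfs will tell you. A quick sketch, using 0000:00:02.0 (a typical IGD address, substitute your own) as a placeholder:

  # FLReset+ in the DevCap line means the device supports Function Level Reset
  sudo lspci -vvv -s 0000:00:02.0 | grep -i flreset
  # The kernel exposes a 'reset' file only if it has some way to reset the device
  ls /sys/bus/pci/devices/0000:00:02.0/reset

If neither check turns anything up, there's no clean way to return the device to a pristine state between VM boots, which is exactly the problem described above.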

Ok, I think that tackles the top misconceptions regarding GPU assignment.  Hopefully there are still plenty of interesting use cases for your application.  In the next part we'll start to configure our host system for device assignment.  Stay tuned.
