[libvirt] RFC: setting mac address on network devices being assigned to a guest via PCI passthrough (<hostdev>)

Roopa Prabhu roprabhu at cisco.com
Tue Jan 31 01:16:56 UTC 2012




On 1/30/12 11:14 AM, "Laine Stump" <laine at laine.org> wrote:

> On 01/23/2012 09:08 AM, Paolo Bonzini wrote:
>> 
>> <devices>
>> <interface type='hostdev'>
>> <source>
>> <address type='pci' bus='0x06' slot='0x02' function='0x0'/>
>> </source>
>> <mac address='00:16:3e:5d:c7:9e'/>
>> <address type='pci' .../>
>> </interface>
>> </devices>
>> 
> 
> This is the model that I'm now following.
> 
> Looking further into it, I've found that there are lots of places in the
> libvirt code that scan through all the <hostdev> entries, and call
> functions that expect a virDomainHostdevDef as an argument. Of course
> all those same places will need to be visited with devices that are
> assigned via <interface> (virDomainNetDef) as well (and this will also
> apparently be needed for <controller> devices in the near future).
> 
> What I'm thinking of doing now, is changing virDomainHostdevDef in the
> following way:
> 
> typedef virDomainDeviceSourceInfo *virDomainDeviceSourceInfoPtr;
> struct _virDomainDeviceSourceInfo {
>      int mode; /* enum virDomainHostdevMode */
>      bool managed;
>      union {
>          virDomainDeviceSubsysAddress subsys; /* USB or PCI */
>          struct {
>              int dummy;
>          } caps;
>      };
>      virDomainHostdevOrigStates origstates;
> };
> 
> typedef struct _virDomainHostdevDef virDomainHostdevDef;
> typedef virDomainHostdevDef *virDomainHostdevDefPtr;
> struct _virDomainHostdevDef {
>      virDomainDeviceDefPtr     parent; /* specific device containing
> this def */
>      virDomainDeviceSourceInfo source; /* host address info */
>      virDomainDeviceInfoPtr    info;   /* guest address info */
> };
> 
> 
> (note that "info" is now a separate object, rather than simply being
> contained in the HostdevDef!)
> 
> This new HostdevDef can then be included directly in higher level device
> types, e.g:
> 
> struct _virDomainNetDef {
>      enum virDomainNetType type;
>      unsigned char mac[VIR_MAC_BUFLEN];
>       ...
>      union {
>           ...
>          struct {
>              char *linkdev;
>              int mode; /* enum virMacvtapMode from util/macvtap.h */
>              virNetDevVPortProfilePtr virtPortProfile;
>          } direct;
> **      struct {
> **          virDomainHostdevDef def;
>          } hostdev;
>      } data;
>      struct {
>          bool sndbuf_specified;
>          unsigned long sndbuf;
>      } tune;
>       ...
>      char *ifname;
>      virDomainDeviceInfo info;
>       ...
> };
> 
> 
> for <interface type='hostdev'>, the hostdev would be populated like this:
> 
>       (interface->data.hostdev.def.source will already be filled in from
> Parse)
> 
>       interface->data.hostdev.def.info = &interface->info;
>       interface->data.hostdev.def.parent.type = VIR_DOMAIN_DEVICE_NET;
>       interface->data.hostdev.def.parent.data.net = interface;
> 
> At this point, &interface->data.hostdev.def can be sent as a parameter
> to any function that's expecting a virDomainHostdevDef. Beyond that, I'm
> thinking it can *even be added to the hostdevs list in the DomainDef*.
> This would work in the following way:
> 
> 0) If a parent device (in our example a virDomainNetDef) is
> type='hostdev', in addition to be included in its normal higher level
> device list (e.g. domain->nets), parent->data.hostdev will be filled in
> as indicated above, and &parent->data.hostdev will be added to the
> domain's hostdevs list.
> 
> 1) When a function is scanning through all the hostdevs to do device
> management, it will act on this higher level device just as any generic
> device, except that there may be callouts to setup functions based on
> the value of hostdevs[n]->parent.type (e.g. to setup a MAC address or
> virtual port profile).
> 
> 2) When an entry from the hostdevs list is being freed, any hostdev that
> has a non-NULL parent will simply be removed from the list (and a
> callout made to the equivalent function to remove the hostdev's parent
> from its own list).
> 
> 3) When an entry from the higher level device list is being freed, it
> will also be removed from the hostdev list.
> 
> 4) when one of these "intelligent hostdevs" is attached/detached,
> depending on hostdev->parent.type, it may callout to a device-specific
> function,
> 
> By doing things this way, we assure that these new higher level devices
> will always be included in any scans of hostdevs, while avoiding the
> necessity to add a new loop to every one of the functions that scans
> them each time we add support for PCI passthrough of another
> higher-level device type.
> 
> 
> Does this sound reasonable? (I'm making a proof-of-concept now, but
> figured I'd solicit input in the meantime).
> 
> -------------------
> 
> The next problem: We will need to be able to configure everything that's
> in a <hostdev> from within an <interface>, but there are a few things we
> haven't discussed yet:
> 
> 1) "type='pci'" vs. "type='usb'"
> 
> <hostdev> has one of these directly as an attribute of the toplevel
> element, so it isn't given in the source <address> element. In the case
> of <interface>, type is already used for something else in the toplevel
> element, but it can instead be given as part of the <address>. So which
> do you think is better:
> 
> <interface type='hostdev'>
> <source>
> <address type='pci' domain='0' ... />
> </source>
>       ...
> 
> or:
> 
> <interface type='pci'>
> <source>
> <address domain='0' .... />
> 
> ?? In either case, "type='pci'" could be replaced with "type='usb'".
> Note that if we use the first option, it will be possible to do
> something like:
> 
> <interface type='hostdev'>
> <source dev='eth22'/>
> 
> and have libvirt determine at attachtime whether eth22 is a usb or pci
> device (I'm sure 99 44/100% of all uses of this will be with pci
> devices, but still...).
> 
> 2) "managed='yes'"
> 
> This obviously needs to go *somewhere*. Does this look okay?
> 
> <interface type='hostdev' managed='yes'>
>      ...
> 
> 
> 3) "mode='subsystem'"
> 
> Since the other mode "capability" has never been implemented, and
> apparently won't be, I don't see any need to give this a place in the
> <interface> XML. For now it will always be subsystem, and if we ever
> need to add a mode attribute, "subsystem" will just be the default.
> 
> 
> So what I end up with is this:
> 
> <interface type='hostdev' managed='yes'>
> <source dev='eth22'/>
>     ...
> 
> and
> 
> <interface type='hostdev' managed='yes'>
> <source>
> <address type='pci' domain='0' .... />
> </source>
>     ...
> 
> Note that when "dev='eth22'" is given, an <address> element will be
> added at attach time (I haven't decided yet if it's best for this
> element to persist (with appropriate checks to make sure it continues to
> match the named network device), or should be erased and re-learned each
> time.
> 
Laine, I haven't gone through your whole email yet. Was just curious about
one quick thing,

For sriov VF's, are we expecting that a net device (eth interface) be
present on the host if its being used as a hostdev ?. If yes, then libvirt
will need to do an unbind of the driver on the VF before assigning it to the
VM. Which today it does not do (correct me if I am wrong). Which is still
ok. Just wanted to call that out.
Plus ideally it would be nice to not have an expectation that a vf netdevice
be present on the host. Because for sriov vf's, it would mean that the vf
driver has to be loaded on the host. Which is really not required for vfs
because mac and port profile can be set via the pf with the vf index as
argument.

Basically, 
For non-sriov network devices on host,
    - find netdevice
    - set mac on netdevice
    - If required set port profile on the netdevice
    - unbind netdevice driver
    - assign net pci device to guest


For sriov vf network devices on host,
    - find pf netdevice
    - set mac for vf via the pf netdevice
    - If required set port profile on the vf via the pf netdevice
    - unbind vf driver if its loaded on the vf /* not mandatory */
    - assign vf pci device to guest

Thanks!
-Roopa




More information about the libvir-list mailing list