[libvirt] [PATCH] Added capability to set and restore the MAC address of a NIC when using PASSTHROUGH mode with DIRECT type NICs. This is required to support SRIOV capable NICs with firmware implemented on-card switches.

Eric Blake eblake at redhat.com
Mon Jun 13 18:34:50 UTC 2011


On 06/10/2011 05:40 AM, Dirk Herrendoerfer wrote:
> ---

...see my earlier mail about the cover letter containing useful commit
information.  Also, your subject line is way long; typically the subject
should be < 60 characters, then a blank line, then you can give detailed
information about the patch.  'git shortlog -30' will give you an idea
of why short subject lines are useful.

> +++ b/src/libvirt_macvtap.syms
> @@ -6,5 +6,7 @@
>  # macvtap.h
>  delMacvtap;
>  openMacvtapTap;
> +get_macaddr;
> +set_macaddr;

Let's keep this list sorted, and stick with camelCase (that is,
getMacaddr comes before openMacvtapTap).  Also, you're not the first,
but this file isn't very namespace clean; I'm wondering if we should
first clean things up to use a virFunction prefix style for all the
internally-exported functions in our macvtap wrapper file.

> @@ -125,6 +126,53 @@ qemuPhysIfaceConnect(virDomainDefPtr def,
>          net->model && STREQ(net->model, "virtio"))
>          vnet_hdr = 1;
>  
> +    /** Note: When using PASSTHROUGH mode with MACVTAP devices the link 
> +     * devices's MAC address must be set to the VMs MAC address. In 

s/devices's/device's/

trailing space; please make sure your patch passes 'make syntax-check'
first.

> +     * order to not confuse the first switch or bridge in line this MAC 
> +     * address must be reset when the VM is shut down.
> +     * This is especially important when using SRIOV capable cards that
> +     * emulate their switch in firmware.
> +     */
> +    if ( net->data.direct.mode == VIR_DOMAIN_NETDEV_MACVTAP_MODE_PASSTHRU ) {

Style: we usually don't put space immediately inside the ():

if (net->data.direct.mode == VIR_DOMAIN_NETDEV_MACVTAP_MODE_PASSTHRU) {

> +	rc = get_macaddr(&old_macaddr, 6, net->data.direct.linkdev);
> +        if (rc) {
> +            virReportSystemError(rc,
> +                                 _("Getting MAC address from '%s' "
> +                                 "to '%02x:%02x:%02x:%02x:%02x:%02x' failed."),
> +                                 net->data.direct.linkdev,
> +                                 old_macaddr[0],old_macaddr[1],old_macaddr[2],

Style - space after comma:

old_macaddr[0], old_macaddr[1], ...

> +                                 old_macaddr[3],old_macaddr[4],old_macaddr[5]);
> +        } else {
> +            char oldmacname[256], newmacname[256];

Why so big, when you know that you have a fixed-width address that only
takes sizeof("00:00:00:00:00:00") bytes?

> +
> +            sprintf(oldmacname,"%02x:%02x:%02x:%02x:%02x:%02x",
> +                    old_macaddr[0],old_macaddr[1],old_macaddr[2],
> +                    old_macaddr[3],old_macaddr[4],old_macaddr[5]);
> +
> +            sprintf(newmacname,"/tmp/%s@%02x:%02x:%02x:%02x:%02x:%02x",
> +                    net->data.direct.linkdev,
> +                    net->mac[0],net->mac[1],net->mac[2],
> +                    net->mac[3],net->mac[4],net->mac[5]);

For that matter, since it appears that you are trying to create a file
name, it would be more appropriate to use virAsprintf (to malloc the
name, rather than doing it on the stack), so that the directory prefix
can then be variable sized according to configure arguments.  For
example, see how daemon/libvirtd.c computes /var/run/libvirt if
base_dir_prefix is "/var", but is flexible to being run by non-root
($HOME/.libvirt/run instead of /var/run).

> +
> +	    rc = symlink (oldmacname, newmacname);
> +            if (rc) {
> +                virReportSystemError(rc,

Here, and in several other places after a failed syscall, you want to
report the value of errno (which is a positive value such as EACCES),
not rc (which is often -1, and which makes no sense in
virReportSystemError).

> +                                     _("MAC link file creation failed for %s."),
> +                                     net->data.direct.linkdev);
> +            }
> +        }
> +
> +	rc = set_macaddr(net->mac, 6, net->data.direct.linkdev);

Why the hardcoded 6?  MAC addresses are fixed length; there's no need to
pass that length as a magic number parameter.

> +++ b/src/qemu/qemu_process.c
> @@ -2663,6 +2663,51 @@ void qemuProcessStop(struct qemud_driver *driver,
>          if (net->type == VIR_DOMAIN_NET_TYPE_DIRECT) {
>              delMacvtap(net->ifname, net->mac, net->data.direct.linkdev,
>                         &net->data.direct.virtPortProfile);
> +
> +            if ( net->data.direct.mode == VIR_DOMAIN_NETDEV_MACVTAP_MODE_PASSTHRU ) {
> +                char newmacname[256], linkdata[64];
> +                unsigned int old_macaddr[6];
> +
> +                sprintf(newmacname,"/tmp/%s@%02x:%02x:%02x:%02x:%02x:%02x",

Same style issues as above.  Same question about newmacname being
stack-allocated.  Another reason to use virAsprintf is to avoid
potential stack overflow issues if net->data.direct.linkdev is really large.


> +                        net->data.direct.linkdev,
> +                        net->mac[0],net->mac[1],net->mac[2],
> +                        net->mac[3],net->mac[4],net->mac[5]);
> +
> +                ret = readlink (newmacname, linkdata, 64);

'make syntax-check' will call you on this one.  Use virFileResolveLink
instead.

> +                if ( ret == 17 ) {
> +                    linkdata[17] = 0;
> +
> +                    ret = sscanf(linkdata,"%x:%x:%x:%x:%x:%x",
> +                                 &old_macaddr[0],&old_macaddr[1],&old_macaddr[2],
> +                                 &old_macaddr[3],&old_macaddr[4],&old_macaddr[5]);

I'm not necessarily a fan of sscanf (it misses out on some integer
overflow issues), although I think this particular usage is probably
safe enough.

> +                    if ( ret == 6 ) {
> +                        unsigned char oldmac[6];
> +			oldmac[0] = (unsigned char)(old_macaddr[0] & 0xFF) ;

Odd indentation, and poor style with space before ;.  Unnecessary mask
and cast since oldmac is already the right type.  You can simply use:

oldmac[0] = old_macaddr[0];

for properly converting the int (necessary for sscanf) down to an
unsigned char.  Even nicer would be to use the POSIX-mandated
sscanf("%hhx", &oldmac[0]) in the first place, if that were portable
(unfortunately, it isn't portable, and gnulib intentionally doesn't
provide a fixed sscanf implementation because sscanf suffers from silent
overflow issues in general).  It would probably be nicer to rewrite this
whole thing to avoid sscanf, and parse directly into unsigned char
without going through intermediate ints.  In fact, it seems like the
conversion from a string into a MAC address ought to be an already
supported common routine... </me searches>

Yep - virParseMacAddr looks like your friend.


> +			oldmac[1] = (unsigned char)(old_macaddr[1] & 0xFF) ;
> +			oldmac[2] = (unsigned char)(old_macaddr[2] & 0xFF) ;
> +			oldmac[3] = (unsigned char)(old_macaddr[3] & 0xFF) ;
> +			oldmac[4] = (unsigned char)(old_macaddr[4] & 0xFF) ;
> +			oldmac[5] = (unsigned char)(old_macaddr[5] & 0xFF) ;
> +
> +                        VIR_WARN(("Resetting MAC address on '%s' "
> +                                  "to '%02x:%02x:%02x:%02x:%02x:%02x'."),
> +                                  net->data.direct.linkdev, 
> +                                  oldmac[0],oldmac[1],oldmac[2],
> +                                  oldmac[3],oldmac[4],oldmac[5]);
> +                        /*reset mac and remove link file-ignore restults*/

s/restults/results/

> +int
> +get_macaddr(const unsigned char *macaddress, int macaddrsize,
> +            const char *srcdev )
> +{
> +    int sockfd; 
> +    int io; 
> +    char buffer[1024]; 
> +    struct ifreq ifr;
> +
> +    strcpy(ifr.ifr_name, srcdev);

Another 'make syntax-check' violation.

> + 
> +    sockfd = socket(AF_INET, SOCK_STREAM, 0); 
> +    if(sockfd < 0){ 
> +        return -1; 
> +    } 
> + 
> +    io = ioctl(sockfd, SIOCGIFHWADDR, (char *)&ifr); 
> +    if(io < 0){ 
> +        return -1; 
> +    } 
> +
> +    bcopy( ifr.ifr_ifru.ifru_hwaddr.sa_data, macaddress, macaddrsize);

bcopy is dead.  Use memcpy.

I ran out of time to review the rest of this; the overall idea has
merit, but there are a lot of changes needed in v2 before anything can
be applied.

-- 
Eric Blake   eblake at redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 619 bytes
Desc: OpenPGP digital signature
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20110613/8142092d/attachment-0001.sig>


More information about the libvir-list mailing list