[External] Re: [PATCH] util: support PCI passthrough net device stats collection
zhenwei pi
pizhenwei at bytedance.com
Tue Sep 22 10:23:55 UTC 2020
On 9/22/20 9:48 AM, Laine Stump wrote:
> (Please don't Cc individual developers in patch submissions unless
> they've specifically asked you to do so)
>
> On 9/21/20 8:25 AM, zhenwei pi wrote:
>> Collect PCI passthrough net device stats from kernel by netlink
>> API.
>>
>> Currently, libvirt can not get PCI passthrough net device stats,
>> run command:
>> #virsh domifstat instance --interface=52:54:00:2d:b2:35
>> error: Failed to get interface stats instance 52:54:00:2d:b2:35
>> error: internal error: Interface name not provided
>>
>> The PCI device(usually SR-IOV virtual function device) is detached
>> while it's used in PCI passthrough mode. And we can not parse this
>> device from /proc/net/dev any more.
>>
>> In this patch, libvirt check net device is VF of not firstly, then
>> query virNetDevVFInterfaceStats(new API).
>> virNetDevVFInterfaceStats parses VFs info of all PFs, compares MAC
>> address until the two MAC addresses match.
>> '#ip -s link show' can get the same result. Instead of parsing the
>> output result, implement this feature by libnl API.
>>
>> Notice that this feature deponds on driver of PF.
>> Test on Mellanox ConnectX-4 Lx, it works well.
>> Also test on Intel Corporation 82599ES, it works, but only get 0.
>> (ip-link command get the same result).
>>
>> Signed-off-by: zhenwei pi <pizhenwei at bytedance.com>
>> ---
>> meson.build | 4 ++
>> src/libvirt_private.syms | 1 +
>> src/qemu/qemu_driver.c | 3 +
>> src/util/virnetdev.c | 158
>> +++++++++++++++++++++++++++++++++++++++++++++++
>> src/util/virnetdev.h | 5 ++
>> 5 files changed, 171 insertions(+)
>>
>> diff --git a/meson.build b/meson.build
>> index 24535a403c..e17da9e2b9 100644
>> --- a/meson.build
>> +++ b/meson.build
>> @@ -1392,6 +1392,10 @@ if not get_option('virtualport').disabled()
>> endif
>> endif
>> +if cc.has_header_symbol('linux/if_link.h', 'IFLA_VF_STATS')
>> + conf.set('WITH_VF_STATS', 1)
>> +endif
>> +
>
>
> (a bit of digression here...)
>
>
> I understand why you put this chunk in, but I'm fairly certain that it
> isn't needed.
>
>
> In the case of the few other things from if_link.h that have their own
> "WITH_BLAH" compile-time flag (e.g. WITH_VIRTUALPORT, which checks for
> existence of IFLA_PORT_MAX in if_link.h), they were added in the before
> before time, when the feature in question had been recently added to the
> kernel, and so there were some supported distro versions that had the
> new feature, and some that didn't yet. In order to compile properly on
> all supported platforms (though lacking these new-at-the-time features
> on some) we had to gate them on the presence of some sentinel in
> if_link.h. Those WITH_BLAH flags were then forgotten about, even though
> the list of supported platform/versions has changed over the years, and
> they will now work on *any* supported Linux platform.
>
>
> Now that this patch has pointed them out, I think it's time for some
> house cleaning. IFLA_PORT_MAX, for example, has been in the upstream
> kernel since 2.6.35, and was even backported to the RHEL*6* 2.6.32
> kernel. Even router firmware these days has a newer kernel than that, so
> it's a safe bet that any system with __linux_ and WITH_LIBNL can enable
> all of that code, i.e. we can just get rid of the extra
> WITH_VIRTUALPORT. I just added this to my personal todo list; let's see
> how long it takes until I actually accomplish it (betting starts now!)
>
>
> Back to the topic of *this* patch: I looked up IFLA_VF_STATS for the
> upstream kernel, and it first appeared in kernel 4.2. RHEL7 (which is
> the oldest RHEL now supported by libvirt) is based on kernel 3.10, but
> has *a lot* of backports, and so when I checked I found that it also
> supports IFLA_VF_STATS. I don't have the details handy for the oldest
> version of other Linuxes we support, but I tried eliminating the extra
> check for IFLA_VF_STATS, and the full CI suite on gitlab builds without
> error (there is an unrelated error in ubuntu-18.04 in libxl, and
> different errors in the freebsd, mingw, and macos builds that *are*
> caused by different aspects of this patch, but IFLA_VF_STATS does exist
> on all Linux platforms included in the libvirt upstream CI testing).
>
>
> So support for this feature can be gated on "defined(__linux__) &&
> WITH_LIBNL (as should just about anything else using netlink)
>
>
>> if host_machine.system() == 'windows'
>> ole32_dep = cc.find_library('ole32')
>> oleaut32_dep = cc.find_library('oleaut32')
>> diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
>> index bdbe3431b8..bcc40b8d69 100644
>> --- a/src/libvirt_private.syms
>> +++ b/src/libvirt_private.syms
>> @@ -2585,6 +2585,7 @@ virNetDevSetRcvMulti;
>> virNetDevSetupControl;
>> virNetDevSysfsFile;
>> virNetDevValidateConfig;
>> +virNetDevVFInterfaceStats;
>> # util/virnetdevbandwidth.h
>> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
>> index ae715c01d7..f554010c40 100644
>> --- a/src/qemu/qemu_driver.c
>> +++ b/src/qemu/qemu_driver.c
>> @@ -10196,6 +10196,9 @@ qemuDomainInterfaceStats(virDomainPtr dom,
>> if (virDomainNetGetActualType(net) ==
>> VIR_DOMAIN_NET_TYPE_VHOSTUSER) {
>> if (virNetDevOpenvswitchInterfaceStats(net->ifname, stats) < 0)
>> goto cleanup;
>> + } else if (virDomainNetGetActualType(net) ==
>> VIR_DOMAIN_NET_TYPE_HOSTDEV) {
>> + if (virNetDevVFInterfaceStats(&net->mac, stats) < 0)
>> + goto cleanup;
>> } else {
>> if (virNetDevTapInterfaceStats(net->ifname, stats,
>>
>> !virDomainNetTypeSharesHostView(net)) < 0)
>> diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c
>> index e1a4cc2bef..be9b8ce4a9 100644
>> --- a/src/util/virnetdev.c
>> +++ b/src/util/virnetdev.c
>> @@ -1489,6 +1489,9 @@ static struct nla_policy
>> ifla_vf_policy[IFLA_VF_MAX+1] = {
>> .maxlen = sizeof(struct ifla_vf_mac) },
>> [IFLA_VF_VLAN] = { .type = NLA_UNSPEC,
>> .maxlen = sizeof(struct ifla_vf_vlan) },
>> +#if defined(WITH_VF_STATS)
>
>
> Due to being a nested #if, the above should have been "# if defined...."
> (note the space after #). Of course that #if will be removed, but to
> catch things like this in the future, you should install the cppi
> package (which is still optional for libvirt, since it is (was?)
> unavailable on some supported platforms) and run "ninja -C build test",
> which will run all the tests, including syntax / code style checks.
>
>
>> + [IFLA_VF_STATS] = { .type = NLA_NESTED },
>> +#endif
>> };
>> @@ -2265,6 +2268,151 @@ virNetDevSetNetConfig(const char *linkdev, int
>> vf,
>> return 0;
>> }
>> +#if defined(WITH_VF_STATS)
>> +static struct nla_policy ifla_vfstats_policy[IFLA_VF_STATS_MAX+1] = {
>> + [IFLA_VF_STATS_RX_PACKETS] = { .type = NLA_U64 },
>> + [IFLA_VF_STATS_TX_PACKETS] = { .type = NLA_U64 },
>> + [IFLA_VF_STATS_RX_BYTES] = { .type = NLA_U64 },
>> + [IFLA_VF_STATS_TX_BYTES] = { .type = NLA_U64 },
>> + [IFLA_VF_STATS_BROADCAST] = { .type = NLA_U64 },
>> + [IFLA_VF_STATS_MULTICAST] = { .type = NLA_U64 },
>> +};
>> +
>> +static int
>> +virNetDevParseVfStats(struct nlattr **tb, virMacAddrPtr mac,
>> + virDomainInterfaceStatsPtr stats)
>> +{
>> + int ret = -1, len;
>> + struct ifla_vf_mac *vf_lladdr;
>> + struct nlattr *nla, *t[IFLA_VF_MAX+1];
>> + struct nlattr *stb[IFLA_VF_STATS_MAX+1];
>> +
>> + if (tb == NULL || mac == NULL || stats == NULL) {
>> + return -1;
>> + }
>> +
>> + if (!tb[IFLA_VFINFO_LIST])
>> + return -1;
>> +
>> + len = nla_len(tb[IFLA_VFINFO_LIST]);
>> +
>> + for (nla = nla_data(tb[IFLA_VFINFO_LIST]); nla_ok(nla, len);
>> + nla = nla_next(nla, &len)) {
>> + ret = nla_parse(t, IFLA_VF_MAX, nla_data(nla), nla_len(nla),
>> + ifla_vf_policy);
>> + if (ret < 0)
>> + return -1;
>> +
>> + if (t[IFLA_VF_MAC] == NULL) {
>> + continue;
>> + }
>> +
>> + vf_lladdr = nla_data(t[IFLA_VF_MAC]);
>> + if(virMacAddrCmpRaw(mac, vf_lladdr->mac)) {
>
>
> The above line failed syntax checks because there must be a space after
> "if" before "("
>
>
>> + continue;
>> + }
>> +
>> + if (t[IFLA_VF_STATS]) {
>> + ret = nla_parse_nested(stb, IFLA_VF_STATS_MAX,
>> + t[IFLA_VF_STATS],
>> + ifla_vfstats_policy);
>> + if (ret < 0)
>> + return -1;
>> +
>> + stats->rx_bytes = nla_get_u64(stb[IFLA_VF_STATS_RX_BYTES]);
>> + stats->tx_bytes = nla_get_u64(stb[IFLA_VF_STATS_TX_BYTES]);
>> + stats->rx_packets =
>> nla_get_u64(stb[IFLA_VF_STATS_RX_PACKETS]);
>> + stats->tx_packets =
>> nla_get_u64(stb[IFLA_VF_STATS_TX_PACKETS]);
>> + }
>> + return 0;
>> + }
>> +
>> + return ret;
>> +}
>> +
>> +/**
>> + * virNetDevVFInterfaceStats:
>> + * @mac: MAC address of the VF interface
>> + * @stats: returns stats of the VF interface
>> + *
>> + * Get the VF interface from kernel by netlink.
>> + * Returns 0 on success, -1 on failure.
>> + */
>> +int
>> +virNetDevVFInterfaceStats(virMacAddrPtr mac,
>> + virDomainInterfaceStatsPtr stats)
>> +{
>> + FILE *fp;
>> + char line[256], *colon, *ifname;
>> + int rc = -1;
>> + void *nlData = NULL;
>> + struct nlattr *tb[IFLA_MAX + 1] = {NULL, };
>> + char *sysfsDevicePath = NULL;
>> +
>> + fp = fopen("/proc/net/dev", "r");
>> + if (!fp) {
>> + virReportSystemError(errno, "%s",
>> + _("Could not open /proc/net/dev"));
>> + return -1;
>> + }
>> +
>> + /* get all PCI net devices, and parse VFs list from netlink API.
>> + * compare MAC address, collect device stats if matching.
>> + */
>> + while (fgets(line, sizeof(line), fp)) {
>> + /* The line looks like:
>> + * " eth0:..."
>> + * Split it at the colon. and strip blank from head.
>> + */
>> + colon = strchr(line, ':');
>> + if (!colon)
>> + continue;
>> + *colon = '\0';
>> + ifname = line;
>> + while((*ifname == ' ') && (ifname < colon))
>
>
> This line also failed syntax check - needs a space after "while"
>
>
>> + ifname++;
>> +
>> + if (virNetDevSysfsFile(&sysfsDevicePath, ifname, "device") < 0)
>> + break;
>> +
>> + if (virNetDevIsPCIDevice(sysfsDevicePath)) {
>> + rc = virNetlinkDumpLink(ifname, -1, &nlData, tb, 0, 0);
>> + if (rc < 0) {
>> + rc = -1;
>> + goto cleanup;
>> + }
>> +
>> + rc = virNetDevParseVfStats(tb, mac, stats);
>> + VIR_FREE(nlData);
>> + if (rc == 0)
>> + goto cleanup;
>> + }
>> + VIR_FREE(sysfsDevicePath);
>> + }
>> +
>> +cleanup:
>
>
> This also failed syntax checks - labels must start in column 2, not
> column 1 (apparently this makes some simple-minded text editors happier
> or something...)
>
>
>> + VIR_FREE(sysfsDevicePath);
>> + VIR_FORCE_FCLOSE(fp);
>> + return rc;
>> +}
>> +
>> +#else /* #if defined(WITH_VF_STATS) */
>> +
>> +int
>> +virNetDevVFInterfaceStats(virMacAddrPtr mac,
>> + virDomainInterfaceStatsPtr stats)
>> +{
>> + virReportSystemError(ENOSYS, "%s",
>> + _("Unable to get VF net device stats on this
>> kernel, please upgrade kernel at lease linux-4.2"));
>> +
>> + /* no need to do anything, just fix compling error here */
>> + if (mac == NULL || stats == NULL) {
>> + return -1;
>> + }
>
>
> Follow the pattern of other platforms to eliminate build error - declare
> the parameters with G_GNUC_UNUSED.
>
>> +
>> + return -1;
>> +}
>> +#endif /* #if defined(WITH_VF_STATS) */
>> #else /* defined(__linux__) && defined(WITH_LIBNL) &&
>> defined(IFLA_VF_MAX) */
>> @@ -2309,6 +2457,16 @@ virNetDevSetNetConfig(const char *linkdev
>> G_GNUC_UNUSED,
>> }
>> +int
>> +virNetDevVFInterfaceStats(virMacAddrPtr mac,
>> + virDomainInterfaceStatsPtr stats)
>> +{
>> + virReportSystemError(ENOSYS, "%s",
>> + _("Unable to get VF net device stats on this
>> platform"));
>> + return -1;
>
>
> Again, you need G_GNUC_UNUSED on the parameters. (This failed the CI
> tests for FreeBSD and MacOS).
>
>
>
>> +}
>> +
>> +
>> #endif /* defined(__linux__) && defined(WITH_LIBNL) &&
>> defined(IFLA_VF_MAX) */
>> VIR_ENUM_IMPL(virNetDevIfState,
>> diff --git a/src/util/virnetdev.h b/src/util/virnetdev.h
>> index 5f581323ed..ff59d9d341 100644
>> --- a/src/util/virnetdev.h
>> +++ b/src/util/virnetdev.h
>> @@ -312,4 +312,9 @@ int virNetDevSysfsFile(char **pf_sysfs_device_link,
>> int virNetDevRunEthernetScript(const char *ifname, const char *script)
>> G_GNUC_NO_INLINE;
>> +int virNetDevVFInterfaceStats(virMacAddrPtr mac,
>> + virDomainInterfaceStatsPtr stats)
>> +ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2);
>> +
>> +
>> G_DEFINE_AUTOPTR_CLEANUP_FUNC(virNetDevRxFilter,
>> virNetDevRxFilterFree);
>
>
> I didn't have time to look at the code itself yet. In the meantime,
> fixing it up so that this is just another one of the "#if
> defined(__linux__) && WITH_LIBNL" functions, and eliminating the CI
> build/test errors would be useful; possibly that's all that will be
> needed, and even if not you'll have a head start :-)
>
>
Thanks a lot, I'll fix all the above problems and push a v2 version.
--
zhenwei pi
More information about the libvir-list
mailing list