[augeas-devel] improving performance of aug_get() and aug_match() with large datasets
Laine Stump
laine at redhat.com
Tue Sep 22 19:18:14 UTC 2015
It was bound to happen eventually. Someone created a host with 514 vlan
interfaces each connected to a host bridge, then started up
virt-manager. virt-manager likes to learn the status of all the network
interfaces on a host by calling libvirt (the equivalent of "virsh
iface-list --all" followed by "virsh iface-dumpxml bobloblaw" for each
interface). libvirt makes some calls to the netcf library, which queries
the interface config on disk using augeas (what amounts to aug_get() and
aug_match() calls). Too bad that when you have 514 vlan+bridge combos,
this operation takes ~20 minutes on good hardware!
I looked into the libvirt part of it and there were some obvious
inefficiencies (the function netcfConnectListAllInterfaces() ends up
calling ncf_if_mac_string() and ncf_if_name() multiple times for each
interface, when it could 1) call ncf_if_mac_string() once, and 2) never
call ncf_if_name() at all), but even fixing those only eliminates about
20% of the total time. I then looked at removing all of the ncf_* calls
in the libvirt function (after the first call to receive a simple list
of interfaces) and found that we're still left with about 40% of the
total time. So there is a lot that can be done in libvirt, but 40% of
the time is still spent in netcf, with the majority of that in calls to
aug_get() and aug_match().
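
To illustrate the first libvirt fix mentioned above (one MAC lookup per interface, no name lookup at all), here is a minimal sketch. It is not the actual libvirt code or patch. FakeNetcfIf, list_naive(), list_hoisted() and make_ifaces() are hypothetical names made up for the illustration, and the two methods only stand in for ncf_if_mac_string() and ncf_if_name() so that the number of calls can be counted.

```python
class FakeNetcfIf:
    """Hypothetical stand-in for a netcf interface handle; counts lookups."""

    def __init__(self, name, mac):
        self._name = name
        self._mac = mac
        self.calls = 0

    def mac_string(self):
        # Stands in for ncf_if_mac_string(); assumed to be expensive.
        self.calls += 1
        return self._mac

    def name(self):
        # Stands in for ncf_if_name(); assumed to be expensive.
        self.calls += 1
        return self._name


def list_naive(ifaces):
    """Looks up the MAC twice and the name once for every interface."""
    out = []
    for _known_name, iface in ifaces.items():
        if iface.mac_string() is not None:
            out.append((iface.name(), iface.mac_string()))
    return out


def list_hoisted(ifaces):
    """Looks up the MAC once; the name comes from the list we already hold."""
    out = []
    for known_name, iface in ifaces.items():
        mac = iface.mac_string()
        if mac is not None:
            out.append((known_name, mac))
    return out


def make_ifaces(count):
    """Builds `count` fake bridge interfaces keyed by name."""
    return {
        f"br{i}": FakeNetcfIf(f"br{i}", f"52:54:00:00:{i // 256:02x}:{i % 256:02x}")
        for i in range(count)
    }
```

Both functions return the same list, but the second makes one lookup per interface instead of three. In this toy the saving is larger than the roughly 20% measured above, because the real function does other work as well.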
I have two questions based on this:
1) Has anyone thought about or looked into changing the data
structure used to store nodes in augeas so that it scales better with
larger datasets (execution time seems to grow faster than linearly)?
2) I recall that a long time ago augeas added code to re-read and
re-parse files only if they had been modified. netcf (and thus libvirt)
could take advantage of that modification info if it were available in
the augeas API: the first time it retrieved the info for an interface it
would take a hit, but all subsequent retrievals could be much quicker.
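
To make the idea in question 2 concrete, here is a minimal sketch of a caller-side cache that re-parses a file only when it appears to have changed. It does not use augeas or netcf. ParsedFileCache, parse_ifcfg() and demo() are hypothetical names, and the change check (modification time plus size from os.stat()) is my assumption about what "modified" would mean, not a description of what augeas does internally.

```python
import os
import tempfile


class ParsedFileCache:
    """Caches parsed file contents, keyed by path, until the file changes."""

    def __init__(self, parse):
        self._parse = parse
        self._entries = {}  # path -> (signature, parsed result)
        self.parse_count = 0

    @staticmethod
    def _signature(path):
        # Assumption: a changed mtime or size means the file was modified.
        st = os.stat(path)
        return (st.st_mtime_ns, st.st_size)

    def get(self, path):
        sig = self._signature(path)
        cached = self._entries.get(path)
        if cached is not None and cached[0] == sig:
            return cached[1]  # unchanged: skip the expensive parse
        with open(path, encoding="utf-8") as fh:
            parsed = self._parse(fh.read())
        self.parse_count += 1
        self._entries[path] = (sig, parsed)
        return parsed


def parse_ifcfg(text):
    """Toy KEY=value parser standing in for the real (expensive) parse."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        result[key.strip()] = value.strip().strip('"')
    return result


def demo():
    """Reads a file twice, rewrites it, reads again; returns parse counts."""
    cache = ParsedFileCache(parse_ifcfg)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "ifcfg-br0")
        with open(path, "w", encoding="utf-8") as fh:
            fh.write('DEVICE=br0\nTYPE="Bridge"\n')
        cache.get(path)
        cache.get(path)
        after_two_reads = cache.parse_count
        with open(path, "w", encoding="utf-8") as fh:
            fh.write('DEVICE=br0\nTYPE="Bridge"\nONBOOT=yes\n')
        cache.get(path)
        return after_two_reads, cache.parse_count
```

The first get() for a path pays for the parse and later calls cost one stat() each. A cache like this in netcf would still have to stat every file itself, which is why having the information available from augeas would be the cleaner option.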