[augeas-devel] improving performance of aug_get() and aug_match() with large datasets

David Lutterkort lutter at watzmann.net
Fri Oct 2 18:32:12 UTC 2015


On Thu, Oct 1, 2015 at 11:44 AM, Laine Stump <laine at redhat.com> wrote:

> On 09/22/2015 03:18 PM, Laine Stump wrote:
>
>> It was bound to happen eventually. Someone created a host with 514 vlan
>> interfaces each connected to a host bridge, then started up virt-manager.
>> [blah blah boring blah removed]
>>
> To update those not included in a separate thread on the topic in
> netcf-devel (I'll try to keep all discussion here from now on):
>
> Dan Berrange pointed out that netcf was calling aug_load() on each entry
> to a public netcf API, and libvirt was calling netcf APIs multiple times
> for each interface. Even though aug_load() checks the mtime of files it has
> already loaded, and avoids re-loading those that haven't been modified (in
> this case none have been modified), it turns out that just doing a stat()
> of 1100 files takes a significant amount of time. So I modified netcf to
> only call aug_load() to do this check if it has been at least 1 second
> since the last time it was called. This made a very large improvement,
> especially when running the upstream versions of all involved packages
> (virt-manager --> libvirt --> netcf --> augeas). But when running the
> versions that are included in RHEL6, it wasn't so rosy. A test setup of 514
> bridge+vlan interfaces which took around 30 minutes (!!) to complete a full
> startup of virt-manager (which calls netcf/augeas to list all interfaces,
> then get the XML config for them) now takes 13 minutes with netcf modified
> to call aug_load() only once per second. (the same operation takes "only" 8
> minutes using all upstream code).
>
> But 13 (or even 8) minutes is still a very long time, so I played around a
> bit in gdb and found that most of the time now seems to be spent in one
> call to aug_match():
>
>
>   r = aug_match(aug, path, "/files/etc/sysconfig/network-scripts/*[ DEVICE = 'br1' or BRIDGE = 'br1' or MASTER = 'br1' or MASTER = ../*[BRIDGE = 'br1']/DEVICE ]/DEVICE");
>
> (this is the result of a call to netcf's aug_fmt_match() in the netcf
> function aug_get_xml_for_nif())
>
> When I step over that call to aug_match(), there is a very noticeable
> pause before the gdb prompt comes back, while continuing from that point
> all the way through virt-manager's "get all interfaces" loop back to the
> next call to aug_get_xml_for_nif() (including several other calls to
> aug_match() that have much simpler search expressions) seems to happen
> instantly.
>
> So apparently doing a match against all ifcfg files based on this complex
> match expression is really slowing us down. Any ideas on how to either make
> this expression simpler, or alternately how to get augeas doing the search
> more quickly?
>

Was that with the performance work I did a few days ago? (You'd need
Augeas HEAD for that.)

Alternatively, can you send me your /etc/sysconfig/network-scripts? (Fair
warning: I will have no time to look into this next week.)
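
For reference, the once-per-second throttle on aug_load() described above
could look roughly like the sketch below. This is not netcf's actual code:
struct reload_throttle, reload_fn, count_reload and maybe_reload are all
hypothetical names, and the callback merely stands in for a call such as
aug_load(aug).

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical throttle state; these names are not taken from netcf. */
struct reload_throttle {
    bool   primed;        /* false until the first reload has run */
    time_t last_run;      /* time of the most recent reload */
    time_t min_interval;  /* minimum number of seconds between reloads */
};

/* Stand-in for a reload such as aug_load(aug). */
typedef int (*reload_fn)(void *ctx);

/* Example callback that only counts how often it was invoked. */
static int count_reload(void *ctx)
{
    ++*(int *) ctx;
    return 0;
}

/*
 * Invoke reload(ctx) unless a reload already ran less than min_interval
 * seconds ago. Returns true if the reload ran, false if it was skipped.
 * The callback's return value is ignored in this sketch.
 */
static bool maybe_reload(struct reload_throttle *t, time_t now,
                         reload_fn reload, void *ctx)
{
    assert(t != NULL && reload != NULL);
    if (t->primed && now - t->last_run < t->min_interval)
        return false;
    t->primed = true;
    t->last_run = now;
    (void) reload(ctx);
    return true;
}
```

A caller would pass time(NULL) as now on entry to each public API function.
The cost is the one mentioned in the thread: for up to min_interval seconds
the tree can lag behind what is on disk.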


>> I have two questions based on this:
>>
>> 1) has anyone thought about/looked into optimizing/changing the data
>> structure used to store nodes in augeas to scale better with larger
>> datasets (execution time seems to increase at > linear)?
>>
>
From what Dominic turned up, the problem doesn't seem to be so much the
data structure for the tree as the fact that there was some O(n^2)
behavior in building intermediate data structures.


>> 2) I recall that a long time ago augeas put in code to re-read/parse files
>> only if they had been modified. netcf (and thus libvirt) could take
>> advantage of this info if it was available in the augeas API - the first
>> time it retrieved the info for an interface it would take a hit, but all
>> subsequent times could be much quicker.
>>
>
> About this one - I'm wondering how well it would work out for augeas to
> use inotify to learn about modifications to files (including the directory
> that the ifcfg files live in, in case a new file is created). It works okay
> for netcf to avoid calling aug_load() (as mentioned above), but it does
> make me a bit uncomfortable that we sometimes have a mistaken view of the
> config.
>

It would definitely be a possibility. We would still need to queue
notifications from inotify and act on them only when the user calls
aug_load, to avoid things like destroying changes the user made; in other
words, it still needs to stay predictable when the tree changes in response
to changes in the filesystem. It's been a while since I've looked at
inotify, but I think it would also introduce a Linux dependency; we could
work around that by using it only where available and falling back to
today's behavior elsewhere.
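
To make the queueing idea concrete, here is a minimal, Linux-only sketch. It
is not Augeas code, and watch_dir and drain_pending are hypothetical names.
The kernel queues events on a non-blocking inotify descriptor, and nothing
is read from it until the load path asks, so the tree only ever changes at a
point the caller chose.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Start watching a directory; returns a non-blocking inotify fd, or -1. */
static int watch_dir(const char *dir)
{
    assert(dir != NULL);
    int fd = inotify_init1(IN_NONBLOCK | IN_CLOEXEC);
    if (fd < 0)
        return -1;
    if (inotify_add_watch(fd, dir, IN_CREATE | IN_DELETE | IN_CLOSE_WRITE |
                                   IN_MOVED_FROM | IN_MOVED_TO) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Drain every event queued since the previous call. Intended to be called
 * only from the load path. Returns the number of events seen, or -1 on a
 * read error. A real caller would mark the named file for re-parsing
 * instead of printing it.
 */
static int drain_pending(int fd)
{
    char buf[4096]
        __attribute__((aligned(__alignof__(struct inotify_event))));
    int count = 0;

    for (;;) {
        ssize_t len = read(fd, buf, sizeof buf);
        if (len < 0) {
            if (errno == EINTR)
                continue;
            if (errno == EAGAIN)
                break;          /* queue is empty */
            return -1;
        }
        if (len == 0)
            break;
        for (char *p = buf; p < buf + len; ) {
            const struct inotify_event *ev = (const struct inotify_event *) p;
            if (ev->len > 0)
                printf("changed: %s\n", ev->name);
            count++;
            p += sizeof(struct inotify_event) + ev->len;
        }
    }
    return count;
}
```

If drain_pending() returns 0, the load path can skip the stat() pass
entirely. On platforms without inotify, watch_dir() would simply not be
used and the existing mtime check would remain the only mechanism.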

David