[augeas-devel] improving performance of aug_get() and aug_match() with large datasets

David Lutterkort lutter at watzmann.net
Sat Oct 3 22:27:07 UTC 2015


On Sat, Oct 3, 2015 at 3:05 PM, David Lutterkort <lutter at watzmann.net>
wrote:

> On Thu, Oct 1, 2015 at 11:44 AM, Laine Stump <laine at redhat.com> wrote:
>>
>> But 13 (or even 8) minutes is still a very long time, so I played around a
>> bit in gdb and found that most of the time now seems to be spent in one
>> call to aug_match():
>>
>>
>>   r = aug_match(aug, path, "/files/etc/sysconfig/network-scripts/*[
>> DEVICE = 'br1' or BRIDGE = 'br1' or MASTER = 'br1' or MASTER = ../*[BRIDGE
>> = 'br1']/DEVICE ]/DEVICE");
>>
>
> Whoever wrote that code must have thought they were incredibly clever with
> this query ;)
>
> There are a few ways in which I think this can be sped up: for one, rather
> than use 'or', we can build an intermediate nodeset for the first three
> nodesets by matching
>
> (1) /files/etc/sysconfig/network-scripts/*[(DEVICE|BRIDGE|MASTER) =
> 'br1']/DEVICE
>
> The last term in that 'or' is very expensive since it constitutes a nested
> loop, with "/files/etc/sysconfig/network-scripts/*" being the outer loop
> ("for each ifcfg file") and "../*[BRIDGE = 'br1']/DEVICE" being the inner
> loop ("for each ifcfg file see if it is a BRIDGE and return its DEVICE").
> That can be made a little more targeted by using
>
> (2) /files/etc/sysconfig/network-scripts/*/MASTER[ . = ../*[BRIDGE =
> 'br1']/DEVICE ]
>
> so that we only trigger the inner loop for ifcfg files that actually have
> a MASTER entry. This helps if you don't have bonds - I suspect, if there
> are any bonds on the system, the query will still be very expensive.
>

I shouldn't computer on weekends: this part is total nonsense. It needs to
be

 /files/etc/sysconfig/network-scripts/*[MASTER][MASTER = ../*[BRIDGE =
'brvlan42']/DEVICE ]

The first '[MASTER]' makes sure we only run the second predicate "[MASTER =
../*[BRIDGE = 'brvlan42']/DEVICE ]" with the expensive loop on files that
actually have a MASTER entry.
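
To make the effect of the extra '[MASTER]' predicate concrete, here is a rough
sketch in Python. It is only a toy model: plain dicts stand in for the ifcfg
subtrees, the file list (bond0, eth0, eth1) is made up, and the function names
are hypothetical. It does not reproduce how Augeas evaluates path expressions
internally; it just counts how often the expensive inner scan would run under
the assumption that each predicate is evaluated once per candidate file.

```python
# Toy model only: plain dicts stand in for the ifcfg-* subtrees under
# /files/etc/sysconfig/network-scripts. All data below is hypothetical.
FILES = [
    {"DEVICE": "brvlan42"},
    {"DEVICE": "p14p1.42", "BRIDGE": "brvlan42"},
    {"DEVICE": "bond0", "BRIDGE": "brvlan42"},
    {"DEVICE": "eth0", "MASTER": "bond0"},
    {"DEVICE": "eth1"},
]


def bridge_devices(files, bridge, stats):
    """Model of ../*[BRIDGE = bridge]/DEVICE: one full pass over all files."""
    stats["inner_scans"] += 1
    return {f["DEVICE"] for f in files
            if f.get("BRIDGE") == bridge and "DEVICE" in f}


def match_unguarded(files, bridge):
    """Model of *[MASTER = ../*[BRIDGE = bridge]/DEVICE]."""
    stats = {"inner_scans": 0}
    # The inner scan runs for every file, whether or not it has a MASTER.
    hits = [f for f in files
            if f.get("MASTER") in bridge_devices(files, bridge, stats)]
    return hits, stats["inner_scans"]


def match_guarded(files, bridge):
    """Model of *[MASTER][MASTER = ../*[BRIDGE = bridge]/DEVICE]."""
    stats = {"inner_scans": 0}
    hits = [f for f in files
            if "MASTER" in f  # cheap first predicate, short-circuits
            and f["MASTER"] in bridge_devices(files, bridge, stats)]
    return hits, stats["inner_scans"]


if __name__ == "__main__":
    for fn in (match_unguarded, match_guarded):
        hits, scans = fn(FILES, "brvlan42")
        print(fn.__name__, [f["DEVICE"] for f in hits], "inner scans:", scans)
```

With these five made-up files, both variants select the same file (eth0), but
the unguarded model runs the inner scan five times and the guarded one only
once, because only one file has a MASTER entry.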

Attached updated augcmds.txt and output - the timings are roughly in the
same ballpark as mentioned in my previous email.
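
As a sanity check on the rewritten query in the attachment, the sketch below
models both the original 'or' query and the union form in Python and compares
their results. This is an illustration under assumptions, not Augeas code: the
file list is invented, the helper names are hypothetical, and it assumes that
"(DEVICE|BRIDGE|MASTER) = 'x'" is true when any of those children has the
value 'x'.

```python
# Toy model only; the ifcfg data below is hypothetical.
FILES = [
    {"DEVICE": "brvlan42"},
    {"DEVICE": "p14p1.42", "BRIDGE": "brvlan42"},
    {"DEVICE": "bond0", "BRIDGE": "brvlan42"},
    {"DEVICE": "eth0", "MASTER": "bond0"},
    {"DEVICE": "eth1"},
]


def bridged_devices(files, name):
    """Model of ../*[BRIDGE = name]/DEVICE."""
    return {f["DEVICE"] for f in files
            if f.get("BRIDGE") == name and "DEVICE" in f}


def original_query(files, name):
    """Model of *[DEVICE = n or BRIDGE = n or MASTER = n or MASTER = ...]/DEVICE."""
    devs = bridged_devices(files, name)
    return [f["DEVICE"] for f in files
            if "DEVICE" in f and (
                f.get("DEVICE") == name
                or f.get("BRIDGE") == name
                or f.get("MASTER") == name
                or f.get("MASTER") in devs)]


def rewritten_query(files, name):
    """Model of (*[(DEVICE|BRIDGE|MASTER) = n] | *[MASTER][MASTER = ...])/DEVICE."""
    first = {i for i, f in enumerate(files)
             if name in (f.get("DEVICE"), f.get("BRIDGE"), f.get("MASTER"))}
    devs = bridged_devices(files, name)
    second = {i for i, f in enumerate(files)
              if "MASTER" in f and f["MASTER"] in devs}
    # Union of both node sets, reported in file order for easy comparison.
    return [files[i]["DEVICE"] for i in sorted(first | second)
            if "DEVICE" in files[i]]


if __name__ == "__main__":
    print(original_query(FILES, "brvlan42"))
    print(rewritten_query(FILES, "brvlan42"))
```

Both functions return the same list of DEVICE values for this data, which is
the property the rewrite relies on; the ordering of a real Augeas node set is
not modeled here.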
-------------- next part --------------
citron:augeas (master)>./src/try -e -r /var/tmp/bridges-for-vlans/root/
augtool> # Original
augtool> match /files/etc/sysconfig/network-scripts/*[ DEVICE = 'brvlan42' or BRIDGE = 'brvlan42' or MASTER = 'brvlan42' or MASTER = ../*[BRIDGE = 'brvlan42']/DEVICE ]/DEVICE
aug_match(/files/etc/sysconfig/network-scripts/*[ DEVICE = 'brvlan42' or BRIDGE = 'brvlan42' or MASTER = 'brvlan42' or MASTER = ../*[BRIDGE = 'brvlan42']/DEVICE ]/DEVICE) = 2
Time: 674ms
/files/etc/sysconfig/network-scripts/ifcfg-p14p1.42/DEVICE = p14p1.42
/files/etc/sysconfig/network-scripts/ifcfg-brvlan42/DEVICE = brvlan42
augtool> #
augtool> #
augtool> # Turn first three 'or' terms into an intermediate nodeset, and trigger
augtool> # the inner loop for MASTER only if there actually are bonds
augtool> match (/files/etc/sysconfig/network-scripts/*[(DEVICE|BRIDGE|MASTER) = 'brvlan42']|/files/etc/sysconfig/network-scripts/*[MASTER][MASTER = ../*[BRIDGE = 'brvlan42']/DEVICE ])/DEVICE
aug_match((/files/etc/sysconfig/network-scripts/*[(DEVICE|BRIDGE|MASTER) = 'brvlan42']|/files/etc/sysconfig/network-scripts/*[MASTER][MASTER = ../*[BRIDGE = 'brvlan42']/DEVICE ])/DEVICE) = 2
Time: 36ms
/files/etc/sysconfig/network-scripts/ifcfg-p14p1.42/DEVICE = p14p1.42
/files/etc/sysconfig/network-scripts/ifcfg-brvlan42/DEVICE = brvlan42
augtool> #
augtool> #
augtool> # Assuming we have two bonds on the system, using 'or'
augtool> match (/files/etc/sysconfig/network-scripts/*[(DEVICE|BRIDGE|MASTER) = 'brvlan42']|/files/etc/sysconfig/network-scripts/*[MASTER = 'bond0' or MASTER = 'bond1' ])/DEVICE
aug_match((/files/etc/sysconfig/network-scripts/*[(DEVICE|BRIDGE|MASTER) = 'brvlan42']|/files/etc/sysconfig/network-scripts/*[MASTER = 'bond0' or MASTER = 'bond1' ])/DEVICE) = 2
Time: 5ms
/files/etc/sysconfig/network-scripts/ifcfg-p14p1.42/DEVICE = p14p1.42
/files/etc/sysconfig/network-scripts/ifcfg-brvlan42/DEVICE = brvlan42
augtool> #
augtool> #
augtool> # Assuming there are no bonds on the system
augtool> match (/files/etc/sysconfig/network-scripts/*[(DEVICE|BRIDGE|MASTER) = 'brvlan42'])/DEVICE
aug_match((/files/etc/sysconfig/network-scripts/*[(DEVICE|BRIDGE|MASTER) = 'brvlan42'])/DEVICE) = 2
Time: 3ms
/files/etc/sysconfig/network-scripts/ifcfg-p14p1.42/DEVICE = p14p1.42
/files/etc/sysconfig/network-scripts/ifcfg-brvlan42/DEVICE = brvlan42
augtool>
-------------- next part --------------
# Original
match /files/etc/sysconfig/network-scripts/*[ DEVICE = 'brvlan42' or BRIDGE = 'brvlan42' or MASTER = 'brvlan42' or MASTER = ../*[BRIDGE = 'brvlan42']/DEVICE ]/DEVICE
#
#
# Turn first three 'or' terms into an intermediate nodeset, and trigger
# the inner loop for MASTER only if there actually are bonds
match (/files/etc/sysconfig/network-scripts/*[(DEVICE|BRIDGE|MASTER) = 'brvlan42']|/files/etc/sysconfig/network-scripts/*[MASTER][MASTER = ../*[BRIDGE = 'brvlan42']/DEVICE ])/DEVICE
#
#
# Assuming we have two bonds on the system, using 'or'
match (/files/etc/sysconfig/network-scripts/*[(DEVICE|BRIDGE|MASTER) = 'brvlan42']|/files/etc/sysconfig/network-scripts/*[MASTER = 'bond0' or MASTER = 'bond1' ])/DEVICE
#
#
# Assuming there are no bonds on the system
match (/files/etc/sysconfig/network-scripts/*[(DEVICE|BRIDGE|MASTER) = 'brvlan42'])/DEVICE

