[augeas-devel] [PATCH 0 of 4] Some performance improvements
David Lutterkort
dlutter at redhat.com
Fri Aug 8 22:46:22 UTC 2008
As twoerner and raphink noticed on IRC, Augeas is pretty slow when
processing large files, e.g. an /etc/hosts with 10000 lines.
These patches address some of the slowness by eliminating some quadratic
behavior and reducing general overhead in a few places.
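As a rough illustration of the kind of quadratic behavior meant here (a sketch only, not the code from these patches; the names node, list, append_slow and append_fast are made up for this example): appending to a singly linked list by walking to its end costs O(n) per append and O(n^2) for n appends, while keeping a tail pointer makes each append O(1).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node type for illustration; not an Augeas structure. */
struct node {
    struct node *next;
    int value;
};

/* List header that remembers the last node so appends take O(1). */
struct list {
    struct node *head;
    struct node *tail;
    size_t length;
};

/* Quadratic pattern: each append walks the whole list to find its end,
 * so building a list of n nodes costs O(n^2) in total. */
int append_slow(struct node **head, int value) {
    struct node *n = calloc(1, sizeof(*n));
    if (n == NULL)
        return -1;
    n->value = value;
    while (*head != NULL)
        head = &(*head)->next;
    *head = n;
    return 0;
}

/* Linear pattern: the tail pointer gives the end of the list directly,
 * so building a list of n nodes costs O(n) in total. */
int append_fast(struct list *l, int value) {
    struct node *n;

    assert(l != NULL);
    n = calloc(1, sizeof(*n));
    if (n == NULL)
        return -1;
    n->value = value;
    if (l->tail == NULL)
        l->head = n;        /* first node in an empty list */
    else
        l->tail->next = n;  /* link after the current last node */
    l->tail = n;
    l->length++;
    return 0;
}

/* Free every node reachable from head. */
void free_nodes(struct node *head) {
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}
```

Both functions build the same list; only the cost per append differs, which is what shows up as the steep growth in the larger test files.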
To measure the effect, I ran two tests: 'augtool quit', and 'set
/files/etc/hosts/10/alias[1] newalias' followed by 'save'. The first tests
the speed of parsing; the second tests the speed of a complete roundtrip,
including writing a changed file out.
Each test was done on /etc/hosts files of varying sizes; the first column
in the tables below is the number of lines in those files, where each line
had an IP address, a canonical name and one alias.
Before applying these patches, I got the following times on my laptop
(T60). Note that I built with -O2 for the tests - optimization seems to
double the performance of Augeas in general.
  lines   parse only   parse + save
     64      0.06s         0.09s
    128      0.04s         0.11s
    256      0.09s         0.18s
    512      0.07s         0.34s
   1024      0.11s         0.61s
   2048      0.19s         1.18s
   4096      0.41s         2.60s
   8192      0.97s         7.49s
  16384      2.65s        36.25s
  32768      8.80s       > 200s
After applying them, I get:
  lines   parse only   parse + save
     64      0.06s         0.10s
    128      0.05s         0.07s
    256      0.05s         0.09s
    512      0.08s         0.15s
   1024      0.09s         0.25s
   2048      0.15s         0.51s
   4096      0.28s         1.13s
   8192      0.53s         2.93s
  16384      1.03s        11.72s
  32768      2.13s       108.02s
That's still not as good as I would like (especially for saving). There
are still two fairly obvious ways to optimize further:
(1) The internal 'dict' data structure needs to be turned into a hash
table (instead of being a linked list)
(2) The regexp matcher is called way too often - we throw away a lot of
information and regenerate that later by calling the matcher
again. It would be much cleaner to change the internals so that they
first construct an explicit parse tree and then process it, rather
than the current way of interleaving the two
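To make point (1) concrete, here is a minimal sketch of a chained hash table keyed by strings. It is not the Augeas 'dict' and makes several assumptions for brevity: the names (dict_new, dict_set, dict_get, dict_free), the fixed bucket count, the FNV-1a hash and the one-value-per-key layout are all invented for this example. The point is only that a lookup scans one short bucket chain instead of one long list.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Fixed bucket count for brevity; a real table would likely resize. */
#define DICT_BUCKETS 1024

/* Hypothetical entry: one value per key, chained within a bucket. */
struct dict_entry {
    struct dict_entry *next;
    char *key;
    void *value;
};

struct dict {
    struct dict_entry *buckets[DICT_BUCKETS];
};

/* 32-bit FNV-1a string hash. */
uint32_t dict_hash(const char *key) {
    uint32_t h = 2166136261u;
    for (; *key != '\0'; key++) {
        h ^= (unsigned char) *key;
        h *= 16777619u;
    }
    return h;
}

struct dict *dict_new(void) {
    return calloc(1, sizeof(struct dict));
}

/* Return the entry for key, or NULL; scans only one bucket's chain. */
struct dict_entry *dict_find(const struct dict *d, const char *key) {
    struct dict_entry *e = d->buckets[dict_hash(key) % DICT_BUCKETS];
    while (e != NULL && strcmp(e->key, key) != 0)
        e = e->next;
    return e;
}

/* Insert key or overwrite its value; returns 0 on success, -1 on OOM. */
int dict_set(struct dict *d, const char *key, void *value) {
    struct dict_entry *e;
    size_t len, b;

    assert(d != NULL && key != NULL);
    e = dict_find(d, key);
    if (e != NULL) {
        e->value = value;
        return 0;
    }
    e = malloc(sizeof(*e));
    if (e == NULL)
        return -1;
    len = strlen(key) + 1;
    e->key = malloc(len);
    if (e->key == NULL) {
        free(e);
        return -1;
    }
    memcpy(e->key, key, len);   /* the table owns its copy of the key */
    e->value = value;
    b = dict_hash(key) % DICT_BUCKETS;
    e->next = d->buckets[b];    /* push onto the front of the chain */
    d->buckets[b] = e;
    return 0;
}

/* Return the value stored for key, or NULL if the key is absent. */
void *dict_get(const struct dict *d, const char *key) {
    struct dict_entry *e = dict_find(d, key);
    return e == NULL ? NULL : e->value;
}

/* Free all entries and the table itself; stored values are not freed. */
void dict_free(struct dict *d) {
    size_t i;
    for (i = 0; i < DICT_BUCKETS; i++) {
        struct dict_entry *e = d->buckets[i];
        while (e != NULL) {
            struct dict_entry *next = e->next;
            free(e->key);
            free(e);
            e = next;
        }
    }
    free(d);
}
```

With a linked list, n lookups over n keys cost O(n^2) comparisons; with a table like this, they cost roughly O(n) as long as the chains stay short.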
David
5 files changed, 98 insertions(+), 38 deletions(-)
src/get.c | 53 +++++++++++++++++++++++++++++++++--------------------
src/list.h | 30 ++++++++++++++++++++++++++++++
src/put.c | 41 +++++++++++++++++++++++------------------
src/regexp.c | 8 ++++++++
src/syntax.h | 4 ++++