[PATCH] schema: Re-structure schema for <filesystem> to avoid broken validation

Peter Krempa pkrempa at redhat.com
Thu Oct 13 15:05:51 UTC 2022


On Thu, Oct 13, 2022 at 14:25:30 +0100, Daniel P. Berrangé wrote:
> On Thu, Oct 13, 2022 at 02:57:33PM +0200, Peter Krempa wrote:
> > The validation of a '<filesystem type='mount'>' device fails if the
> > elements inside are not ordered in the order in the schema despite using
> > <interleave>. This is a bug in libxml2's validator as removing the
> > '<optional>' property from the definition of the 'type' attribute with
> > 'mount' variable fixes the problem.
> > 
> > I've reported it as another instance of a seemingly related issue:
> > 
> >   https://gitlab.gnome.org/GNOME/libxml2/-/issues/131
> > 
> > Meanwhile libvirt can re-arrange the schema by extracting the common
> > bits into a new definition and referencing them from each of the choice
> > groups explicitly.
> > 
> > Resolves: https://gitlab.com/libvirt/libvirt/-/issues/392
> > Signed-off-by: Peter Krempa <pkrempa at redhat.com>
> > ---
> >  src/conf/schemas/domaincommon.rng | 365 +++++++++++++++---------------
> >  1 file changed, 186 insertions(+), 179 deletions(-)
> 
> Reviewed-by: Daniel P. Berrangé <berrange at redhat.com>
> 
> 
> We've had many of these schema ordering problems, and I wondered if
> we've got them all yet.  To the surprise of absolutely no one, the
> answer is no.

I'm not surprised about that. What did surprise me, though, is the case
this patch fixes, as the schema is actually correct there.

[...]

> reverse(root)
> print(etree.tostring(tree, pretty_print=True).decode('utf8'))
> 
> 
> $ for i in `find -name '*.xml' `
>   do
>     echo $i
>     ./xmlrngfuzz.py < $i >  ${i%%xml}rev.xml
>   done

This also includes stuff like:
 - '*-invalid.xml' files
    - these are excluded from validation as they are intentionally
      invalid, but no longer match exclusion pattern after reversal

 - output-only XML such as capabilities/domcapabilities
    - we are strictly defining the order there, I'm not sure if it makes
      sense to allow interleaving
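
To illustrate the first point, here is a small sketch of how the renaming
in the quoted loop interacts with a glob-style exclusion. The pattern
'*-invalid.xml' is taken from the naming above; whether the test suite
matches it exactly this way is an assumption, and the file name is made up.

```python
from fnmatch import fnmatch

# Assumed exclusion pattern, based on the '*-invalid.xml' naming mentioned above.
EXCLUDE_PATTERN = "*-invalid.xml"


def reversed_name(path: str) -> str:
    """Mimic the shell expansion ${i%%xml}rev.xml from the quoted loop."""
    if not path.endswith("xml"):
        raise ValueError(f"not an XML file name: {path}")
    # Strip the trailing "xml" and append "rev.xml", as the loop does.
    return path[: -len("xml")] + "rev.xml"


original = "disk-foo-invalid.xml"  # hypothetical file name
renamed = reversed_name(original)

print(renamed)
print(fnmatch(original, EXCLUDE_PATTERN))  # original is excluded
print(fnmatch(renamed, EXCLUDE_PATTERN))   # reversed copy is not
```

The reversed copy ends in '-invalid.rev.xml', so it slips past the
exclusion and is validated even though it is invalid on purpose.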

Disregarding the above, it's not that bad. I've seen ~40 instances, e.g.
in qemuxml2argvdata, boiling down to 7 actual mistakes in the schema.

I'll have a look at some more input XMLs, but I don't feel like we
should touch the schema for output-only ones.
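
For anyone wanting to repeat the check on other input XMLs, the
order-reversal idea can be approximated roughly as below. This is NOT the
script from the quoted message (most of which is elided above); it is a
stand-in using only Python's standard library, and the sample document is
purely illustrative.

```python
import xml.etree.ElementTree as ET


def reverse_children(elem: ET.Element) -> None:
    """Recursively reverse the order of child elements in place."""
    for child in elem:
        reverse_children(child)
    # Slice assignment replaces all children with the reversed sequence.
    elem[:] = list(reversed(list(elem)))


# Hypothetical input; element names are only for illustration.
doc = "<filesystem type='mount'><source dir='/a'/><target dir='b'/></filesystem>"
root = ET.fromstring(doc)
reverse_children(root)

print(ET.tostring(root, encoding="unicode"))
```

Whitespace following an element travels with it, so the indentation of a
reversed file may look odd; that should not matter for validation.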

