[RFC 01/21] build-aux: Add a tool to generate xml parse/format functions

Shi Lei shi_lei at massclouds.com
Tue Jun 30 16:06:36 UTC 2020


>On Wed, Jun 10, 2020 at 09:20:29AM +0800, Shi Lei wrote:
>> This tool is used to generate parsexml/formatbuf functions for structs.
>> It is based on libclang and its python-binding.
>> Some directives (such as genparse, xmlattr, etc.) need to be added on
>> the declarations of structs to direct the tool.
>>
>> Signed-off-by: Shi Lei <shi_lei at massclouds.com>
>> ---
>>  build-aux/generator/directive.py | 839 +++++++++++++++++++++++++++++++
>>  build-aux/generator/go           |  14 +
>>  build-aux/generator/main.py      | 416 +++++++++++++++
>>  build-aux/generator/utils.py     | 100 ++++
>
>FWIW, we've tried to standardize on scripts/ directory for any helper
>scripts used during the build process. Calling it "generator" is probably
>too broad, as we've lots of diffferent generators.
>
>I think this would would be a good fit as scripts/xmlgen/ 

Okay. I move it there.

>
>>  po/POTFILES.in                   |   1 +
>>  5 files changed, 1370 insertions(+)
>>  create mode 100644 build-aux/generator/directive.py
>>  create mode 100755 build-aux/generator/go
>>  create mode 100755 build-aux/generator/main.py
>>  create mode 100644 build-aux/generator/utils.py
>
>
>> +T_CLEAR_FUNC_IMPL = '''
>> +void
>> +${typename}Clear(${typename}Ptr def)
>> +{
>> +    if (!def)
>> +        return;
>> +
>> +    ${body}
>> +}
>> +'''
>> +
>> +T_CLEAR_FUNC_DECL = '''
>> +void
>> +${typename}Clear(${typename}Ptr def);
>
>In general I think we have a preference for ${typename}Free
>functions, rather than ${typename}Clear.
>
>A Clear() func is needed when a struct is often stack allocated,
>or when the method allocating & free'ing the struct is different
>from the method clearing it. The latter case is generally considered
>bad practice - it is safer and easier to verify code when clearing
>and freeing is done at the same same.
>
>
>> +def clearMember(member):
>> +    mtype = TypeTable().get(member['type'])
>> +
>> +    refname = 'def->%s' % member['name']
>> +    if member.get('array'):
>> +        refname += '[i]'
>> +
>> +    code = ''
>> +    if mtype['meta'] == 'Struct':
>> +        if member['pointer'] and not mtype.get('external'):
>> +            code = '%sClear(%s);' % (mtype['name'], refname)
>> +            code += '\nVIR_FREE(%s);' % refname
>
>We should use g_free() for any new code. 

Okay.

>
>> +        else:
>> +            code = '%sClear(&%s);' % (mtype['name'], refname)
>> +    elif mtype['name'] == 'String':
>> +        code = 'VIR_FREE(%s);' % refname
>> +    elif mtype['name'] in ['Chars', 'UChars']:
>> +        code = 'memset(%s, 0, sizeof(%s));' % (refname, refname)
>> +    elif not member.get('array'):
>> +        code = '%s = 0;' % refname
>> +
>> +    if member.get('specified'):
>> +        assert not member.get('array'), "'specified' can't come with 'array'."
>> +        code += '\n%s_specified = false;' % refname
>> +
>> +    if member.get('array') and code:
>> +        counter = counterName(member['name'])
>> +        if singleline(code):
>> +            code = render(T_LOOP_SINGLE, counter=counter, body=code)
>> +        else:
>> +            code = render(T_LOOP_MULTI,
>> +                          counter=counter, body=indent(code, 2))
>> +        code += '\nVIR_FREE(def->%s);' % member['name']
>> +        code += '\ndef->%s = 0;' % counter
>> +
>> +    return code
>> +
>> +
>> +T_CLEAR_NAMESPACE = '''
>> +if (def->namespaceData && def->ns.free)
>> +    (def->ns.free)(def->namespaceData);
>> +'''
>> +
>> +
>> +def makeClearFunc(writer, atype):
>> +    if 'genparse' not in atype:
>> +        return
>> +
>> +    blocks = BlockAssembler()
>> +    for member in atype['members']:
>> +        blocks.append(clearMember(member))
>> +
>> +    if 'namespace' in atype:
>> +        blocks.append(T_CLEAR_NAMESPACE.strip())
>> +
>> +    body = indent(blocks.output('\n\n'), 1)
>> +
>> +    impl = render(T_CLEAR_FUNC_IMPL, typename=atype['name'], body=body)
>> +    writer.write(atype, 'clearfunc', '.c', impl)
>> +
>> +    decl = render(T_CLEAR_FUNC_DECL, typename=atype['name'])
>> +    writer.write(atype, 'clearfunc', '.h', decl)
>
>Bearing in mind the above, I'd say we should create Free() funcs,
>not Clear() funcs here.
>
>> +
>> +
>> +#
>> +# Templates for parsing member block
>> +#
>> +T_READ_ATTR_BY_PROP = '${mdvar} = virXMLPropString(node, "${tname}");'
>> +T_READ_ELEM_BY_PROP = '${mdvar} = virXMLChildNode(node, "${tname}");'
>> +T_READ_ELEM_CONTENT = '${mdvar} = virXMLChildNodeContent(node, "${tname}");'
>> +
>> +T_PARSE_MEMBER_MORE = '''
>> +${number} = virXMLChildNodeSet(node, "${tname}", &nodes);
>> +if (${number} > 0) {
>> +    size_t i;
>> +
>> +    if (VIR_ALLOC_N(def->${name}, ${number}) < 0)
>> +        goto error;
>> +
>> +    for (i = 0; i < ${number}; i++) {
>> +        xmlNodePtr tnode = nodes[i];
>> +        ${item}
>> +    }
>> +    def->${counter} = ${number};
>> +    VIR_FREE(nodes);
>> +} else if (${number} < 0) {
>> +    virReportError(VIR_ERR_XML_ERROR, "%s",
>> +                   _("Invalid ${tname} element found."));
>
>For error messages we should avoid using python string substitution.
>Instead we want todo
>
>    virReportError(VIR_ERR_XML_ERROR,
>                   _("Invalid %s element found.").
>	   "${tname}");
>
>This means that when we repeat this code with many different values
>of ${tname}, we're only getting 1 string for the translation team
>to deal with.
>
>The same point for other virReportError error callsin this file,
>I won't mention them all individually. 

Got it. Thanks!

>
>> +T_ALLOC_MEMORY = '''
>> +if (VIR_ALLOC(def->${name}) < 0)
>> +    goto error;
>> +'''
>
>We prefer  g_new0 for new code now.
>
>As a general rule for any code, if there is a GLib API that works
>then we should use that, instead of a libvirt API.
> 

Okay.

>
>
>> +T_PARSE_FUNC_DECL = '''
>> +int
>> +${funcname}(${args});
>> +'''
>> +
>> +T_PARSE_FUNC_IMPL = '''
>> +int
>> +${funcname}(${args})
>> +{
>> +    ${declare_vars}
>> +    VIR_USED(instname);
>> +    VIR_USED(opaque);
>
>There should not be any need for this macro. Instead change
>the function declaration to add G_GNUC_UNUSED for "instname"
>and "opaque". This annotation is harmless if the variable
>really is used. 

Okay.

>
>> +
>> +    if (!def)
>> +        goto error;
>> +
>> +    ${body}
>> +
>> +    return 0;
>> +
>> + error:
>> +    ${cleanup_vars}
>> +    ${typename}Clear(def);
>
>IIUC, the caller of this method is responsible for allocating
>'def', and so the caller should be responsible for running
>${typename}Free(def). IOW, we should not need to call Clear
>for this. 

I suggest we can retain it.
Sometimes the caller doesn't call Free, so this clear function ensures that
members of 'def' will be cleared or freed on error.
And it will not hurt performance, since it is only called on error.  

>
>> +    return -1;
>> +}
>
>> +def _handleTmpVars(tmpvars):
>> +    heads, tails = [], []
>> +    tmpvars = dedup(tmpvars)
>> +    for var in tmpvars:
>> +        if var == 'nodes':
>> +            heads.append('xmlNodePtr *nodes = NULL;')
>> +            tails.append('VIR_FREE(nodes);')
>> +        elif var.endswith('Str'):
>> +            heads.append('g_autofree char *%s = NULL;' % var)
>> +        elif var.endswith('Node'):
>> +            heads.append('xmlNodePtr %s = NULL;' % var)
>> +        else:
>> +            assert var.endswith('Nodes') and var.startswith('n')
>> +            heads.append('int %s = 0;' % var)
>> +
>> +    return '\n'.join(heads), '\n'.join(tails)
>
>One thing I noticed in the generated code is that we;re
>duplicating the strings twice:
>
>Consider this snippet of generated code:
>
>    g_autofree char *nameStr = NULL;
>
>    nameStr = virXMLPropString(node, "name");
>    if (nameStr == NULL) {
>        virReportError(VIR_ERR_XML_ERROR,
>                       _("Missing 'name' setting in '%s'"),
>                       instname);
>        goto error;
>    }
>
>    def->name = g_strdup(nameStr);
>
>
>virXMLPropString already  duplicates the string, and then
>we g_strdup again. So we should just get rid of the
>g_autofree and directly assign def->name = nameStr.
> 

Okay. I would consider how to modify it.

>
>
>> diff --git a/build-aux/generator/go b/build-aux/generator/go
>> new file mode 100755
>> index 0000000..9e30e08
>> --- /dev/null
>> +++ b/build-aux/generator/go
>> @@ -0,0 +1,14 @@
>> +# This is a command-line tool
>> +
>> +libclang_line=`ldconfig -p | grep libclang`
>
>This needs to be libclang.so, otherwise it can match on the
>wrong library:
>
>   $ ldconfig -p | grep libclang
>	libclang.so.10 (libc6,x86-64) => /lib64/libclang.so.10
>	libclang.so (libc6,x86-64) => /lib64/libclang.so
>	libclang-cpp.so.10 (libc6,x86-64) => /lib64/libclang-cpp.so.10
>	libclang-cpp.so (libc6,x86-64) => /lib64/libclang-cpp.so
> 

Okay.

>> +export libclang_path=`expr "$libclang_line" : '.* => \(.*\)$'`
>> +if test -z "$libclang_path"; then
>> +    echo "libclang is required by libvirt\n"
>> +    exit -1
>> +fi
>
>> diff --git a/build-aux/generator/main.py b/build-aux/generator/main.py
>> new file mode 100755
>> index 0000000..78a90f1
>> --- /dev/null
>> +++ b/build-aux/generator/main.py
>> @@ -0,0 +1,416 @@
>> +#!/usr/bin/env python3
>> +#
>> +# Copyright (C) 2020 Shandong Massclouds Co.,Ltd.
>> +#
>> +# This library is free software; you can redistribute it and/or
>> +# modify it under the terms of the GNU Lesser General Public
>> +# License as published by the Free Software Foundation; either
>> +# version 2.1 of the License, or (at your option) any later version.
>> +#
>> +# This library is distributed in the hope that it will be useful,
>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +# Lesser General Public License for more details.
>> +#
>> +# You should have received a copy of the GNU Lesser General Public
>> +# License along with this library.  If not, see
>> +# <http://www.gnu.org/licenses/>.
>> +#
>
>> +if __name__ == "__main__":
>
>> +
>> +    libclang_path = os.environ.get('libclang_path')
>> +    assert libclang_path, 'No libclang library.'
>> +    Config.set_library_file(libclang_path)
>
>I'm wondering if we really need this at all ? AFAICT, the libclang
>python library works fine without it, so this is only required if
>trying to use a non-standard libclang, which I think ought to be
>possible already by setting LD_LIBRARY_PATH
> 

I'm working on Ubuntu. On Ubuntu 20.04 LTS and 18.04, I find:

# ldconfig -p | grep libclang
        libclang-10.so.1 (libc6,x86-64) => /lib/x86_64-linux-gnu/libclang-10.so.1
        libclang-10.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libclang-10.so

There's no libclang.so!

If I install python3 bindings from the standard apt repository, it works!
By default, this version just find 'libclang-10.so'.

But if I use 'pip3' to install python3 bindings, it doesn't work.
Because it is another verison. This version find 'libclang.so' and it can't find it!

So I think we can retain these lines to adapt to the distinction among platforms.

Regards,
Shi Lei

>
>Regards,
>Daniel
>--
>|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
>|: https://libvirt.org -o- https://fstop138.berrange.com :|
>|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
>




More information about the libvir-list mailing list