[FYI PATCH 5/5] util: open code virXMLNodeContentString to access the node object directly

Laine Stump laine at redhat.com
Mon Jul 20 20:48:43 UTC 2020

(I am *NOT* advocating that we apply this patch. Just providing it for
informational purposes, since we had previously discussed this
possibility on the list)

Since it's impossible to determine whether xmlNodeContent has returned
a NULL due to OOM, or due to badly formed / evil XML, this patch open
codes virXMLNodeContentString to get the content string directly from
the node.

This turns out to not be so easy as it seemed at first glance when it
was suggested - the "content" member of the element node itself does
not contain the content string for the node. The content string that
we want can be found (at least for our uses of libxml) by looking for
a child node of the element node - if that child node is of type
XML_TEXT_NODE, then the content member of *that* node is the string
we're looking for. If there is no child node, then the element has no
content, so we return "". Likewise, if the child node is type
XML_TEXT_NODE but has no content, we also return "". In all other
cases, we log an error and return because this is some case that
hasn't been encountered in our test cases, so either someone is
sending bad XML, or our assumptions about the layout of the XML node
object list are incorrect.

Note that while calling virXMLNodeContentString() would return NULL
from an OOM situation, this new code will exit the process on OOM
(since it is calling glib for memory allocation).

Signed-off-by: Laine Stump <laine at redhat.com>
 src/util/virxml.c | 43 ++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 38 insertions(+), 5 deletions(-)

diff --git a/src/util/virxml.c b/src/util/virxml.c
index 5315d4ff6f..b2298d74c8 100644
--- a/src/util/virxml.c
+++ b/src/util/virxml.c
@@ -538,7 +538,17 @@ virXMLPropStringLimit(xmlNodePtr node,
 char *
 virXMLNodeContentString(xmlNodePtr node)
-    char *ret = (char *)xmlNodeGetContent(node);
+    /* We specifically avoid using virXMLNodeContentString() here, because
+     * when NULL is returned, it is difficult/impossible to
+     * distinguish between 1) OOM, 2) NULL content, 3) some other error.
+     */
+    /* for elements used the way libvirt uses them, the xmlNode object
+     * for an element will have a type of XML_ELEMENT_NODE, and if the
+     * node has any content, it will be in the content field of a
+     * child node of that object which is itself of type
+     * XML_TEXT_NODE.
+     */
     if (node->type !=  XML_ELEMENT_NODE) {
@@ -547,15 +557,38 @@ virXMLNodeContentString(xmlNodePtr node)
         return NULL;
-    if (!ret) {
+    /* no children --> empty element node */
+    if (!node->children)
+        return g_strdup("");
+    /* if the child isn't text, or there is more than a single node
+     * hanging off "children", our assumptions have been wrong
+     */
+    if (node->children->type != XML_TEXT_NODE) {
+        virReportError(VIR_ERR_INTERNAL_ERROR,
+                       _("child of element node '%s' has unexpected name '%s', type %d"),
+                       node->name, node->children->name, node->children->type);
+        return NULL;
+    }
+    if (node->children->next) {
-                       _("node '%s' has unexpected NULL content. This could be caused by malformed input, or a memory allocation failure"),
+                       _("child of element node '%s' is type XML_TEXT_NODE, but is a list"),
+                       node->name);
+        return NULL;
+    }
+    if (node->children->children) {
+        virReportError(VIR_ERR_INTERNAL_ERROR,
+                       _("child of element node '%s' is type XML_TEXT_NODE, but has children"),
         return NULL;
-    return ret;
+    /* if content is NULL, return "" instead */
+    if (!node->children->content)
+        return g_strdup("");
+    return g_strdup((char *)node->children->content);
+ }

More information about the libvir-list mailing list