[Libguestfs] [nbdkit PATCH v2 4/8] api: Add nbdkit_str[n]dup_intern helper

Eric Blake eblake at redhat.com
Thu Aug 27 02:16:48 UTC 2020


Implementing .default_export with its 'const char *' return is tricky
in the sh plugin: we must return dynamic memory, but must avoid a
use-after-free.  And we don't want to change the return type of
.default_export to 'char *', because that would make our choice of
malloc()/free() part of the API, preventing either nbdkit or a plugin
from experimenting with an alternate allocator implementation.
Elsewhere, we have done things like nbdkit_extents_new(), so that even
though the client is directing allocation, the actual call to malloc()
is done by nbdkit proper, so that nbdkit later calling free() does not
tie the plugin's hands, and nbdkit could change its underlying
allocation without breaking ABI.  (Note that nbdkit_realpath() and
nbdkit_absolute_path() require the caller to use free(), but as they
are used during .config, it's less of a burden for plugins to take
care of that in .unload.)

We could document that the burden is on the plugin to avoid the memory
leak, by making sh track all strings it returns, then finally clean
them up during .unload.  But this is still a memory leak in the case
of .default_export, which can be called multiple times when there are
multiple clients, and presumably with different values per client
call; better would be freeing the memory at .close when each client
goes away.  Except that .default_export is called before .open, and
therefore before we have access to the handle struct that will
eventually make it to .close.

So, the solution is to let nbdkit track an alternate string on our
behalf, where nbdkit then cleans it up as the client goes away.

There are several places in the existing code base that can also take
advantage of this copying functionality, including in filters that
want to pass altered strings to .config while still obeying lifetime
rules.

Signed-off-by: Eric Blake <eblake at redhat.com>
---
 docs/nbdkit-filter.pod  |  9 +++++---
 docs/nbdkit-plugin.pod  | 33 +++++++++++++++++++++++++---
 include/nbdkit-common.h |  4 ++++
 server/internal.h       |  7 +++++-
 server/connections.c    |  2 +-
 server/main.c           | 26 ++++++----------------
 server/nbdkit.syms      |  2 ++
 server/plugins.c        | 11 +++-------
 server/public.c         | 48 +++++++++++++++++++++++++++++++++++++++++
 filters/log/log.c       | 10 +++------
 10 files changed, 110 insertions(+), 42 deletions(-)

diff --git a/docs/nbdkit-filter.pod b/docs/nbdkit-filter.pod
index dd667c0c..7d72d138 100644
--- a/docs/nbdkit-filter.pod
+++ b/docs/nbdkit-filter.pod
@@ -183,7 +183,8 @@ You can modify parameters when you call the C<next> function.  However
 be careful when modifying strings because for some methods
 (eg. C<.config>) the plugin may save the string pointer that you pass
 along.  So you may have to ensure that the string is not freed for the
-lifetime of the server.
+lifetime of the server; you may find C<nbdkit_strdup_intern> helpful
+for avoiding a memory leak while still obeying lifecycle constraints.

 Note that if your filter registers a callback but in that callback it
 doesn't call the C<next> function then the corresponding method in the
@@ -450,8 +451,10 @@ requires write access to the underlying device in case a journal needs
 to be replayed for consistency as part of the mounting process.

 The C<exportname> string is only guaranteed to be available during the
-call.  If the filter needs to use it (other than immediately passing
-it down to the next layer) it must take a copy.  The C<exportname> and
+call (different than the lifetime for the return of C<nbdkit_export_name>
+used by plugins).  If the filter needs to use it (other than immediately
+passing it down to the next layer) it must take a copy, although
+C<nbdkit_strdup_intern> is useful for this task.  The C<exportname> and
 C<is_tls> parameters are provided so that filters do not need to use
 the plugin-only interfaces of C<nbdkit_export_name> and
 C<nbdkit_is_tls>.
diff --git a/docs/nbdkit-plugin.pod b/docs/nbdkit-plugin.pod
index 50cf991d..056b1450 100644
--- a/docs/nbdkit-plugin.pod
+++ b/docs/nbdkit-plugin.pod
@@ -748,7 +748,10 @@ leak unintended information.
 If the plugin returns C<NULL> or an invalid string (such as longer
 than 4096 bytes), the client is not permitted to connect to the
 default export.  However, this is not an error in the protocol, so it
-is not necessary to call C<nbdkit_error>.
+is not necessary to call C<nbdkit_error>.  If the callback will
+not be returning a compile-time constant string, you may find
+C<nbdkit_strdup_intern> helpful for returning a value that avoids a
+memory leak.

 =head2 C<.open>

@@ -1504,8 +1507,9 @@ content.

 Return the optional NBD export name if one was negotiated with the
 current client (this uses thread-local magic so no parameter is
-required).  The returned string is only valid while the client is
-connected, so if you need to store it in the plugin you must copy it.
+required).  The returned string is valid at least through the
+C<.close> of the current connection, but if you need to store it
+in the plugin for use by more than one client you must copy it.

 The export name is a free-form text string, it is not necessarily a
 path or filename and it does not need to begin with a C<'/'>
@@ -1516,6 +1520,29 @@ client data, be cautious when parsing it.>

 On error, C<nbdkit_error> is called and the call returns C<NULL>.

+=head1 STRING LIFETIME
+
+Some callbacks are specified to return C<const char *>, even when a
+plugin may not have a suitable compile-time constant to return.
+Returning dynamically-allocated memory for such a callback would
+induce a memory leak or otherwise complicate the plugin to perform
+additional bookkeeping.  For these cases, nbdkit provides a
+convenience function for creating a copy of a string for better
+lifetime management.
+
+ const char *nbdkit_strdup_intern (const char *str);
+ const char *nbdkit_strndup_intern (const char *str, size_t n);
+
+Returns a copy of C<str>, possibly limited to a maximum of C<n> bytes,
+so that the caller may reclaim str and use the copy in its place.  If
+the copy is created outside the scope of a connection (such as during
+C<.load> or C<.config>), the lifetime of the copy will last at least
+through C<.unload>; if it was created within a connection (such as
+during C<.preconnect>, C<.default_export>, or C<.open>), the lifetime
+will last at least through C<.close>.
+
+On error, C<nbdkit_error> is called and the call returns C<NULL>.
+
 =head1 AUTHENTICATION

 A server may use C<nbdkit_is_tls> to limit which export names work
diff --git a/include/nbdkit-common.h b/include/nbdkit-common.h
index 5e8c0b6f..e9d31421 100644
--- a/include/nbdkit-common.h
+++ b/include/nbdkit-common.h
@@ -120,6 +120,10 @@ NBDKIT_EXTERN_DECL (int, nbdkit_nanosleep, (unsigned sec, unsigned nsec));
 NBDKIT_EXTERN_DECL (int, nbdkit_peer_name,
                     (struct sockaddr *addr, socklen_t *addrlen));
 NBDKIT_EXTERN_DECL (void, nbdkit_shutdown, (void));
+NBDKIT_EXTERN_DECL (const char *, nbdkit_strdup_intern,
+                    (const char *str));
+NBDKIT_EXTERN_DECL (const char *, nbdkit_strndup_intern,
+                    (const char *str, size_t n));

 struct nbdkit_extents;
 NBDKIT_EXTERN_DECL (int, nbdkit_add_extent,
diff --git a/server/internal.h b/server/internal.h
index e2a68513..f11f897e 100644
--- a/server/internal.h
+++ b/server/internal.h
@@ -237,6 +237,7 @@ reset_handle (struct handle *h)
   h->can_cache = -1;
 }

+DEFINE_VECTOR_TYPE(string_vector, char *);
 struct connection {
   pthread_mutex_t request_lock;
   pthread_mutex_t read_lock;
@@ -258,8 +259,9 @@ struct connection {
   bool structured_replies;
   bool meta_context_base_allocation;

+  string_vector interns;
   char *exportname_from_set_meta_context;
-  char *exportname;
+  const char *exportname;

   int sockin, sockout;
   connection_recv_function recv;
@@ -298,6 +300,9 @@ extern int protocol_recv_request_send_reply (void);
  */
 #define base_allocation_id 1

+/* public.c */
+extern void free_interns (void);
+
 /* crypto.c */
 #define root_tls_certificates_dir sysconfdir "/pki/" PACKAGE_NAME
 extern void crypto_init (bool tls_set_on_cli);
diff --git a/server/connections.c b/server/connections.c
index 67a68469..d9f685c9 100644
--- a/server/connections.c
+++ b/server/connections.c
@@ -360,7 +360,7 @@ free_connection (struct connection *conn)
   pthread_mutex_destroy (&conn->status_lock);

   free (conn->exportname_from_set_meta_context);
-  free (conn->exportname);
+  free_interns ();

   /* This is needed in order to free a field in struct handle. */
   for_each_backend (b)
diff --git a/server/main.c b/server/main.c
index 17c4c324..ae552447 100644
--- a/server/main.c
+++ b/server/main.c
@@ -631,15 +631,9 @@ main (int argc, char *argv[])
    *
    * Keys must live for the life of nbdkit.  Since we want to avoid
    * modifying argv (so that /proc/PID/cmdline remains sane) but we
-   * need to create a key from argv[i] = "key=value" we must save the
-   * keys in an array which is freed at the end of main().
+   * need to create a key from argv[i] = "key=value" we must intern
+   * the keys, which are then freed at the end of main().
    */
-  char **keys = calloc (argc, sizeof (char *));
-  if (keys == NULL) {
-    perror ("calloc");
-    exit (EXIT_FAILURE);
-  }
-
   magic_config_key = top->magic_config_key (top);
   for (i = 0; optind < argc; ++i, ++optind) {
     size_t n;
@@ -647,12 +641,10 @@ main (int argc, char *argv[])
     p = strchr (argv[optind], '=');
     n = p - argv[optind];
     if (p && is_config_key (argv[optind], n)) { /* Is it key=value? */
-      keys[optind] = strndup (argv[optind], n);
-      if (keys[optind] == NULL) {
-        perror ("strndup");
+      const char *key = nbdkit_strndup_intern (argv[optind], n);
+      if (key == NULL)
         exit (EXIT_FAILURE);
-      }
-      top->config (top, keys[optind], p+1);
+      top->config (top, key, p+1);
     }
     else if (magic_config_key == NULL) {
       if (i == 0)               /* magic script parameter */
@@ -677,9 +669,7 @@ main (int argc, char *argv[])
   if (dump_plugin) {
     top->dump_fields (top);
     top->free (top);
-    for (i = 1; i < argc; ++i)
-      free (keys[i]);
-    free (keys);
+    free_interns ();
     exit (EXIT_SUCCESS);
   }

@@ -717,9 +707,7 @@ main (int argc, char *argv[])
   crypto_free ();
   close_quit_pipe ();

-  for (i = 1; i < argc; ++i)
-    free (keys[i]);
-  free (keys);
+  free_interns ();

   /* Note: Don't exit here, otherwise this won't work when compiled
    * for libFuzzer.
diff --git a/server/nbdkit.syms b/server/nbdkit.syms
index 212e36aa..d17878b7 100644
--- a/server/nbdkit.syms
+++ b/server/nbdkit.syms
@@ -74,6 +74,8 @@
     nbdkit_set_error;
     nbdkit_shutdown;
     nbdkit_stdio_safe;
+    nbdkit_strdup_intern;
+    nbdkit_strndup_intern;
     nbdkit_vdebug;
     nbdkit_verror;

diff --git a/server/plugins.c b/server/plugins.c
index da28b2f1..7425857e 100644
--- a/server/plugins.c
+++ b/server/plugins.c
@@ -328,17 +328,13 @@ plugin_open (struct backend *b, int readonly, const char *exportname,
    * will still need to save the export name in the handle because of
    * the lifetime issue.
    */
-  conn->exportname = strdup (exportname);
-  if (conn->exportname == NULL) {
-    nbdkit_error ("strdup: %m");
+  conn->exportname = nbdkit_strdup_intern (exportname);
+  if (conn->exportname == NULL)
     return NULL;
-  }

   r = p->plugin.open (readonly);
-  if (r == NULL) {
-    free (conn->exportname);
+  if (r == NULL)
     conn->exportname = NULL;
-  }
   return r;
 }

@@ -367,7 +363,6 @@ plugin_close (struct backend *b, void *handle)

   if (handle && p->plugin.close)
     p->plugin.close (handle);
-  free (conn->exportname);
   conn->exportname = NULL;
 }

diff --git a/server/public.c b/server/public.c
index d10d466e..512e9caa 100644
--- a/server/public.c
+++ b/server/public.c
@@ -726,3 +726,51 @@ nbdkit_peer_name (struct sockaddr *addr, socklen_t *addrlen)

   return 0;
 }
+
+/* Functions for manipulating intern'd strings. */
+
+static string_vector global_interns;
+
+void
+free_interns (void)
+{
+  struct connection *conn = threadlocal_get_conn ();
+  string_vector *list = conn ? &conn->interns : &global_interns;
+
+  string_vector_iter (list, (void *) free);
+  string_vector_reset (list);
+}
+
+const char *
+nbdkit_strndup_intern (const char *str, size_t n)
+{
+  struct connection *conn = threadlocal_get_conn ();
+  string_vector *list = conn ? &conn->interns : &global_interns;
+  char *copy;
+
+  if (str == NULL) {
+    nbdkit_error ("nbdkit_strndup_intern: no string given");
+    errno = EINVAL;
+    return NULL;
+  }
+
+  copy = strndup (str, n);
+  if (copy == NULL) {
+    nbdkit_error ("strndup: %m");
+    return NULL;
+  }
+
+  if (string_vector_append (list, copy) == -1) {
+    nbdkit_error ("malloc: %m");
+    free (copy);
+    return NULL;
+  }
+
+  return copy;
+}
+
+const char *
+nbdkit_strdup_intern (const char *str)
+{
+  return nbdkit_strndup_intern (str, SIZE_MAX);
+}
diff --git a/filters/log/log.c b/filters/log/log.c
index 71c21211..4525f362 100644
--- a/filters/log/log.c
+++ b/filters/log/log.c
@@ -134,7 +134,7 @@ log_get_ready (nbdkit_next_get_ready *next, void *nxdata, int thread_model)
 struct handle {
   uint64_t connection;
   uint64_t id;
-  char *exportname;
+  const char *exportname;
   int tls;
 };

@@ -305,9 +305,8 @@ log_open (nbdkit_next_open *next, void *nxdata,
    * it in log_prepare.  We must take a copy because this string has a
    * short lifetime.
    */
-  h->exportname = strdup (exportname);
+  h->exportname = nbdkit_strdup_intern (exportname);
   if (h->exportname == NULL) {
-    nbdkit_error ("strdup: %m");
     free (h);
     return NULL;
   }
@@ -322,10 +321,7 @@ log_open (nbdkit_next_open *next, void *nxdata,
 static void
 log_close (void *handle)
 {
-  struct handle *h = handle;
-
-  free (h->exportname);
-  free (h);
+  free (handle);
 }

 static int
-- 
2.28.0




More information about the Libguestfs mailing list