[Libguestfs] [libnbd PATCH] python: Speed up pread
Richard W.M. Jones
rjones at redhat.com
Sat May 28 17:16:21 UTC 2022
On Fri, May 27, 2022 at 04:44:16PM -0500, Eric Blake wrote:
> Instead of malloc'ing a C buffer, nbd_pread()ing into it, then copying
> it into an immutable Python bytes object, we can instead pre-create a
> correctly-sized Python bytearray object, then nbd_pread() directly
> into that object's underlying bytes.
>
> While the data copying might not be the bottleneck compared to the
> networking costs, it does have noticeable results; on my machine:
>
> $ export script='
> m=1024*1024
> size=h.get_size()
> for i in range(size // m):
> buf = h.pread(m, m*i)
> '
> $ time ./run nbdkit -U - pattern 10G --run 'nbdsh -u $uri -c "$script"'
>
> sped up from about 7.8s pre-patch to about 7.1s post-patch,
> approximately a 10% speedup.
>
> Note that this slightly changes the python API: h.pread[_structured]
> now returns a mutable 'bytearray' object, rather than an immutable
> 'bytes' object. This makes it possible to modify the just-read string
> in place, rather than having to create yet another memory buffer for
> any modifications, which offers even more speedups when writing a
> read-modify-write paradigm in python. But the change is
> backwards-compatible - python already states that a read-write buffer
> may be returned instead of readonly for any client that already
> operated only on a buffer in a readonly manner.
Although this is not an API change, in general we have no obligation
to maintain backwards compat for the Python API. (The C API is quite
a different matter of course.) Nevertheless, it's nice not to break
things.
> Note that h.pread is more like Python read() semantics in creating a
> buffer, while h.aio_pread is more like Python readinto() semantics in
> modifying a passed-in buffer. But now that both code paths have a
> python object prior to calling into the C API, my next task is to
> improve the h.*pread_structured callback function to pass its buffer
> as a slice of the Python input buffer, rather than doing yet another
> round of pointless memcpy from C into python objects.
You might want to have a look at the nbdkit Python bindings for
inspiration because Nir made those either zero- or low-copy (I forget
exactly which).
> generator/Python.ml | 17 +++++++++--------
> 1 file changed, 9 insertions(+), 8 deletions(-)
>
> diff --git a/generator/Python.ml b/generator/Python.ml
> index 4ab18f6..1c4446e 100644
> --- a/generator/Python.ml
> +++ b/generator/Python.ml
> @@ -1,6 +1,6 @@
> (* hey emacs, this is OCaml code: -*- tuareg -*- *)
> (* nbd client library in userspace: Python bindings
> - * Copyright (C) 2013-2021 Red Hat Inc.
> + * Copyright (C) 2013-2022 Red Hat Inc.
> *
> * This library is free software; you can redistribute it and/or
> * modify it under the terms of the GNU Lesser General Public
> @@ -293,7 +293,7 @@ let
> | BytesIn (n, _) ->
> pr " Py_buffer %s = { .obj = NULL };\n" n
> | BytesOut (n, count) ->
> - pr " char *%s = NULL;\n" n;
> + pr " PyObject *%s = NULL;\n" n;
> pr " Py_ssize_t %s;\n" count
> | BytesPersistIn (n, _)
> | BytesPersistOut (n, _) ->
> @@ -432,8 +432,8 @@ let
> | Bool _ -> ()
> | BytesIn _ -> ()
> | BytesOut (n, count) ->
> - pr " %s = malloc (%s);\n" n count;
> - pr " if (%s == NULL) { PyErr_NoMemory (); goto out; }\n" n
> + pr " %s = PyByteArray_FromStringAndSize (NULL, %s);\n" n count;
> + pr " if (%s == NULL) goto out;\n" n
> | BytesPersistIn (n, _) | BytesPersistOut (n, _) ->
> pr " %s_buf = nbd_internal_py_get_aio_buffer (%s);\n" n n;
> pr " if (!%s_buf) goto out;\n" n;
> @@ -479,7 +479,7 @@ let
> function
> | Bool n -> pr ", %s" n
> | BytesIn (n, _) -> pr ", %s.buf, %s.len" n n
> - | BytesOut (n, count) -> pr ", %s, %s" n count
> + | BytesOut (n, count) -> pr ", PyByteArray_AS_STRING (%s), %s" n count
> | BytesPersistIn (n, _)
> | BytesPersistOut (n, _) -> pr ", %s_buf->data, %s_buf->len" n n
> | Closure { cbname } -> pr ", %s" cbname
> @@ -524,8 +524,9 @@ let
> let use_ret = ref true in
> List.iter (
> function
> - | BytesOut (n, count) ->
> - pr " py_ret = PyBytes_FromStringAndSize (%s, %s);\n" n count;
> + | BytesOut (n, _) ->
> + pr " py_ret = %s;\n" n;
> + pr " %s = NULL;\n" n;
> use_ret := false
> | Bool _
> | BytesIn _
> @@ -572,7 +573,7 @@ let
> | BytesIn (n, _) ->
> pr " if (%s.obj)\n" n;
> pr " PyBuffer_Release (&%s);\n" n
> - | BytesOut (n, _) -> pr " free (%s);\n" n
> + | BytesOut (n, _) -> pr " Py_XDECREF (%s);\n" n
> | BytesPersistIn _ | BytesPersistOut _ -> ()
> | Closure { cbname } ->
> pr " free_user_data (%s_user_data);\n" cbname
Looks good,
Reviewed-by: Richard W.M. Jones <rjones at redhat.com>
Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW
More information about the Libguestfs
mailing list