[Libguestfs] [PATCH libnbd] nbdfuse: New tool to present a network block device in a FUSE filesystem.
Eric Blake
eblake at redhat.com
Mon Oct 14 21:06:27 UTC 2019
On 10/12/19 9:21 AM, Richard W.M. Jones wrote:
> This program allows you to turn a network block device source into a
> FUSE filesystem containing a virtual file:
>
> $ nbdkit memory 128M
> $ mkdir mp
> $ nbdfuse mp/ramdisk nbd://localhost &
> $ ls -l mp
> total 0
> -rw-rw-rw-. 1 rjones rjones 134217728 Oct 12 15:09 ramdisk
> $ dd if=/dev/urandom bs=1M count=128 of=mp/ramdisk conv=notrunc,nocreat
> 128+0 records in
> 128+0 records out
> 134217728 bytes (134 MB, 128 MiB) copied, 3.10171 s, 43.3 MB/s
> $ fusermount -u mp
>
Cool!
> There are still some shortcomings, such as lack of zero and trim
> support. These are documented in the TODO file.
>
> +++ b/README
> @@ -82,6 +82,8 @@ Optional:
>
> * Python >= 3.3 to build the Python 3 bindings and NBD shell (nbdsh).
>
> + * FUSE to build the nbdfuse program.
Minimum version?
> +++ b/docs/libnbd.pod
> @@ -840,6 +840,7 @@ L<https://github.com/NetworkBlockDevice/nbd/blob/master/doc/uri.md>.
>
> L<libnbd-security(3)>,
> L<nbdsh(1)>,
> +L<nbdfuse(1)>,
Worth sorting these two alphabetically?
> L<qemu(1)>.
>
> +++ b/fuse/nbdfuse.c
> +
> +#define FUSE_USE_VERSION 26
> +
> +#include <fuse.h>
> +#include <fuse_lowlevel.h>
> +
> +#include <libnbd.h>
> +
> +#define MAX_REQUEST_SIZE (64 * 1024 * 1024)
Although this works with nbdkit, qemu-nbd doesn't like more than 32M.
(We really should find time to teach nbdkit/libnbd about block size
reporting, but that's a bigger project...)
> +
> +static struct nbd_handle *nbd;
> +static bool readonly = false;
Looks funny to initialize a static variable to 0 in a block of static
variables with no initializers (C guarantees 0-initialization even if
you aren't explicit).
> +static char *mountpoint, *filename;
> +static const char *pidfile;
> +static char *fuse_options;
> +static struct fuse_chan *ch;
> +static struct fuse *fuse;
> +static struct timespec start_t;
> +static uint64_t size;
> +
> +static void __attribute__((noreturn))
> +usage (FILE *fp, int exitcode)
> +{
> + fprintf (fp,
> +" nbdfuse [-r] MOUNTPOINT[/FILENAME] URI\n"
Do we want to use any #ifdefs to avoid advertising URI support on the
command line when libnbd is compiled without libxml2?
> +"Other modes:\n"
> +" nbdfuse MOUNTPOINT[/FILENAME] --command CMD [ARGS ...]\n"
> +" nbdfuse MOUNTPOINT[/FILENAME] --socket-activation CMD [ARGS ...]\n"
> +" nbdfuse MOUNTPOINT[/FILENAME] --fd N\n"
> +" nbdfuse MOUNTPOINT[/FILENAME] --tcp HOST PORT\n"
> +" nbdfuse MOUNTPOINT[/FILENAME] --unix SOCKET\n"
No mention of nbdfuse -o or -P.
> +"\n"
> +"Please read the nbdfuse(1) manual page for full usage.\n"
> +);
> + exit (exitcode);
nbdfuse --help > /dev/full
exits with status 0 because we didn't check for error on stdout/stderr.
That's a corner case, and many programs don't care about it, but it's
worth deciding if we want to care.
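If we do decide to care, a minimal sketch of the check might look like
this. The helper name exit_status_after_output is made up for
illustration and is not part of the patch; the idea is that usage()
would pass its result to exit().

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not in the patch): flush FP and downgrade the
 * requested exit code to EXIT_FAILURE if the output could not be
 * written, e.g. when stdout is redirected to /dev/full.
 */
static int
exit_status_after_output (FILE *fp, int exitcode)
{
  if (fflush (fp) == EOF || ferror (fp))
    return EXIT_FAILURE;
  return exitcode;
}
```

With that, the tail of usage() could become
exit (exit_status_after_output (fp, exitcode)).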
> +}
> +
> +static void
> +display_version (void)
> +{
> + printf ("%s %s\n", PACKAGE_NAME, PACKAGE_VERSION);
> +}
> +
> +static void
> +fuse_help (const char *prog)
> +{
> + static struct fuse_operations null_operations;
> + const char *tmp_argv[] = { prog, "--help", NULL };
> + fuse_main (2, (char **) tmp_argv, &null_operations, NULL);
> + exit (EXIT_SUCCESS);
> +}
> +
> +static bool
> +is_directory (const char *path)
> +{
> + struct stat statbuf;
> +
> + if (stat (path, &statbuf) == -1)
> + return false;
> + return S_ISDIR (statbuf.st_mode);
Accepts a symlink-to-directory, but that's fine by me.
> +}
> +
> +int
> +main (int argc, char *argv[])
> +{
> + enum {
> + MODE_URI,
> + MODE_COMMAND,
> + MODE_FD,
> + MODE_SOCKET_ACTIVATION,
> + MODE_TCP,
> + MODE_UNIX,
> + } mode = MODE_URI;
> + enum {
> + HELP_OPTION = CHAR_MAX + 1,
> + FUSE_HELP_OPTION,
> + };
> + /* Note the "+" means we stop processing as soon as we get to the
> + * first non-option argument (the mountpoint) and then we parse the
> + * rest of the command line without getopt.
> + */
> + const char *short_options = "+o:P:rV";
> + const struct option long_options[] = {
> + { "fuse-help", no_argument, NULL, FUSE_HELP_OPTION },
> + { "help", no_argument, NULL, HELP_OPTION },
> + { "pidfile", required_argument, NULL, 'P' },
> + { "pid-file", required_argument, NULL, 'P' },
> + { "readonly", no_argument, NULL, 'r' },
> + { "read-only", no_argument, NULL, 'r' },
> + { "version", no_argument, NULL, 'V' },
Worth a long-option synonym for -o?
> + /* The next parameter is either a URI or a mode switch. */
> + if (strcmp (argv[optind], "--command") == 0 ||
> + strcmp (argv[optind], "--cmd") == 0) {
> + mode = MODE_COMMAND;
> + optind++;
> + }
Is it worth using getopt_long() in this section for allowing unambiguous
prefix spellings (--c for example) and/or a short option (-c for example)?
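A rough sketch of what I have in mind, assuming a second getopt_long()
pass over the arguments after the mountpoint. The function name
parse_mode, the MODE_ERROR value and the short letters are all invented
here, and I left out the --cmd and --systemd-socket-activation synonyms
to keep it short:

```c
#include <getopt.h>
#include <stddef.h>

enum mode {
  MODE_URI, MODE_COMMAND, MODE_FD, MODE_SOCKET_ACTIVATION,
  MODE_TCP, MODE_UNIX,
  MODE_ERROR,                   /* hypothetical: unknown mode switch */
};

/* Hypothetical helper: the caller sets optind to the index of the
 * argument following the mountpoint.  On return optind points at the
 * first argument belonging to the selected mode.
 */
static enum mode
parse_mode (int argc, char *argv[])
{
  static const struct option mode_options[] = {
    { "command",           no_argument, NULL, 'c' },
    { "fd",                no_argument, NULL, 'f' },
    { "socket-activation", no_argument, NULL, 's' },
    { "tcp",               no_argument, NULL, 't' },
    { "unix",              no_argument, NULL, 'u' },
    { NULL,                0,           NULL, 0 }
  };

  /* "+" stops at the first non-option, which here is the URI. */
  switch (getopt_long (argc, argv, "+cfstu", mode_options, NULL)) {
  case -1:  return MODE_URI;
  case 'c': return MODE_COMMAND;
  case 'f': return MODE_FD;
  case 's': return MODE_SOCKET_ACTIVATION;
  case 't': return MODE_TCP;
  case 'u': return MODE_UNIX;
  default:  return MODE_ERROR;  /* getopt_long already printed a message */
  }
}
```

Since getopt_long() accepts any unambiguous prefix of a long option,
spellings such as --comm or --socket would resolve without separate
table entries.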
> + else if (strcmp (argv[optind], "--socket-activation") == 0 ||
> + strcmp (argv[optind], "--systemd-socket-activation") == 0) {
> + mode = MODE_SOCKET_ACTIVATION;
> + optind++;
> + }
On the same theme, '--socket' as a synonym is easier to type than
--socket-activation.
> + else if (strcmp (argv[optind], "--fd") == 0) {
> + mode = MODE_FD;
> + optind++;
> + }
> + else if (strcmp (argv[optind], "--tcp") == 0) {
> + mode = MODE_TCP;
> + optind++;
> + }
> + else if (strcmp (argv[optind], "--unix") == 0) {
> + mode = MODE_UNIX;
> + optind++;
> + }
> + else if (argv[optind][0] == '-') {
> + fprintf (stderr, "%s: unknown mode: %s\n\n", argv[0], argv[optind]);
> + usage (stderr, EXIT_FAILURE);
> + }
> +
> + /* Check there are enough parameters following given the mode. */
> + switch (mode) {
> + case MODE_URI:
> + case MODE_FD:
> + case MODE_UNIX:
> + if (argc - optind != 1)
> + usage (stderr, EXIT_FAILURE);
> + break;
> + case MODE_TCP:
> + if (argc - optind != 2)
> + usage (stderr, EXIT_FAILURE);
> + break;
> + case MODE_COMMAND:
> + case MODE_SOCKET_ACTIVATION:
> + if (argc - optind < 1)
> + usage (stderr, EXIT_FAILURE);
> + break;
> + }
> + /* At this point we know the command line is valid, and so can start
> + * opening FUSE and libnbd.
> + */
> +
> + /* Create the libnbd handle. */
> + nbd = nbd_create ();
> + if (nbd == NULL) {
> + fprintf (stderr, "%s\n", nbd_get_error ());
> + exit (EXIT_FAILURE);
> + }
> +
> + /* Connect to the NBD server synchronously. */
> + switch (mode) {
> +
> + case MODE_FD:
> + if (sscanf (argv[optind], "%d", &fd) != 1) {
Overflow is undetected.
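One way to catch it is strtol() plus a range check. A minimal sketch
follows; the helper name parse_fd is invented for illustration. Unlike
the sscanf() call, this version also rejects trailing garbage ("7x")
and negative values, which may or may not be what we want:

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper (not in the patch): parse STR as a non-negative
 * decimal file descriptor.  Returns 0 and stores the value in *FD on
 * success, or -1 on empty input, trailing junk, or out-of-range values.
 */
static int
parse_fd (const char *str, int *fd)
{
  char *end;
  long v;

  errno = 0;
  v = strtol (str, &end, 10);
  if (errno != 0 || end == str || *end != '\0')
    return -1;                  /* overflow of long, or not a number */
  if (v < 0 || v > INT_MAX)
    return -1;                  /* does not fit a file descriptor */
  *fd = (int) v;
  return 0;
}
```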
> + fprintf (stderr, "%s: could not parse file descriptor: %s\n\n",
> + argv[0], argv[optind]);
> + exit (EXIT_FAILURE);
> + }
> + if (nbd_connect_socket (nbd, fd) == -1) {
> + fprintf (stderr, "%s\n", nbd_get_error ());
> + exit (EXIT_FAILURE);
> + }
> + break;
> +
> +
> + /* Create the FUSE args. */
> + if (fuse_opt_add_arg (&fuse_args, argv[0]) == -1) {
> + fuse_opt_error:
> + perror ("fuse_opt_add_arg");
> + exit (EXIT_FAILURE);
> + }
> +
> + if (fuse_options) {
> + if (fuse_opt_add_arg (&fuse_args, "-o") == -1 ||
> + fuse_opt_add_arg (&fuse_args, fuse_options) == -1)
> + goto fuse_opt_error;
> + }
> +
> + /* Create the FUSE mountpoint. */
> + ch = fuse_mount (mountpoint, &fuse_args);
> + if (ch == NULL) {
> + fprintf (stderr,
> + "%s: fuse_mount failed: see error messages above", argv[0]);
> + exit (EXIT_FAILURE);
> + }
> +
> + /* Set F_CLOEXEC on the channel. Some versions of libfuse don't do
> + * this.
> + */
> + fd = fuse_chan_fd (ch);
> + if (fd >= 0) {
> + int flags = fcntl (fd, F_GETFD, 0);
> + if (flags >= 0)
> + fcntl (fd, F_SETFD, flags & ~FD_CLOEXEC);
Doesn't check for (unlikely) error.
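A checked version could look roughly like the sketch below (set_cloexec
is a made-up name). Note that it ORs FD_CLOEXEC in, which is what the
comment above describes; the quoted hunk masks the flag out with
'& ~FD_CLOEXEC' instead, so that is worth a second look too.

```c
#include <fcntl.h>

/* Hypothetical helper (not in the patch): set the close-on-exec flag
 * on FD.  Returns 0 on success, or -1 with errno set by fcntl().
 */
static int
set_cloexec (int fd)
{
  int flags = fcntl (fd, F_GETFD);

  if (flags == -1)
    return -1;
  if (fcntl (fd, F_SETFD, flags | FD_CLOEXEC) == -1)
    return -1;
  return 0;
}
```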
> + }
> +
> + /* Create the FUSE handle. */
> + fuse = fuse_new (ch, &fuse_args,
> + &fuse_operations, sizeof fuse_operations, NULL);
> + if (!fuse) {
> + perror ("fuse_new");
> + exit (EXIT_FAILURE);
> + }
> + fuse_opt_free_args (&fuse_args);
> +
> + /* Catch signals since they can leave the mountpoint in a funny
> + * state. To exit the program callers must use ‘fusermount -u’. We
> + * also must be careful not to call exit(2) in this program until we
> + * have unmounted the filesystem below.
> + */
> + memset (&sa, 0, sizeof sa);
> + sa.sa_handler = SIG_IGN;
> + sa.sa_flags = SA_RESTART;
> + sigaction (SIGPIPE, &sa, NULL);
> + sigaction (SIGINT, &sa, NULL);
> + sigaction (SIGQUIT, &sa, NULL);
> +
> + /* Ready to serve, write pidfile. */
> + if (pidfile) {
> + fp = fopen (pidfile, "w");
> + if (fp) {
> + fprintf (fp, "%ld", (long) getpid ());
> + fclose (fp);
> + }
> + }
> +
> + /* Enter the main loop. */
> + r = fuse_loop (fuse);
> + if (r != 0)
> + perror ("fuse_loop");
> +
> + /* Close FUSE. */
> + fuse_unmount (mountpoint, ch);
> + fuse_destroy (fuse);
> +
> + /* Close NBD handle. */
> + nbd_close (nbd);
> +
> + free (mountpoint);
> + free (filename);
> + free (fuse_options);
> +
> + exit (r == 0 ? EXIT_SUCCESS : EXIT_FAILURE);
Looks deceptively simple :)
> +}
> +
> +/* Wraps calls to libnbd functions and automatically checks for a
> + * returns errors in the format required by FUSE. It also prints out
Missing a word or two after 'checks for a'
> + * the full error message on stderr, so that we don't lose it.
> + */
> +#define CHECK_NBD_ERROR(CALL) \
> + do { if ((CALL) == -1) return check_nbd_error (); } while (0)
> +static int
> +check_nbd_error (void)
> +{
> + int err;
> +
> + fprintf (stderr, "%s\n", nbd_get_error ());
> + err = nbd_get_errno ();
> + if (err != 0)
> + return -err;
> + else
> + return -EIO;
> +}
> +
> +static int
> +nbdfuse_getattr (const char *path, struct stat *statbuf)
> +{
> + const int mode = readonly ? 0444 : 0666;
> +
> + memset (statbuf, 0, sizeof (struct stat));
> +
> + /* We're probably making some Linux-specific assumptions here, but
> + * this file is not compiled on non-Linux systems.
> + */
> + statbuf->st_atim = start_t;
> + statbuf->st_mtim = start_t;
> + statbuf->st_ctim = start_t;
> + statbuf->st_uid = geteuid ();
> + statbuf->st_gid = getegid ();
Comment is interesting if true. However, a google search for 'man
fuse_main' pulls up https://man.openbsd.org/fuse_main.3 as its first
hit, so I think FUSE has graduated to non-Linux systems, so we may have
to revisit this later.
> +static int
> +nbdfuse_readdir (const char *path, void *buf,
> + fuse_fill_dir_t filler,
> + off_t offset, struct fuse_file_info *fi)
> +{
> + if (strcmp (path, "/") != 0)
> + return -ENOENT;
> +
> + filler (buf, ".", NULL, 0);
> + filler (buf, "..", NULL, 0);
> + filler (buf, filename, NULL, 0);
> +
Does FUSE have a way to populate d_type during readdir (DT_DIR for '.',
'..', DT_REG for filename)?
> +static int
> +nbdfuse_write (const char *path, const char *buf,
> + size_t count, off_t offset,
> + struct fuse_file_info *fi)
> +{
> + /* Probably shouldn't happen because of nbdfuse_open check. */
> + if (readonly)
> + return -EACCES;
Is EROFS any better here?
> +++ b/fuse/nbdfuse.pod
> @@ -0,0 +1,262 @@
> +=head1 NAME
> +
> +nbdfuse - present a network block device in a FUSE filesystem
> +
> +=head1 SYNOPSIS
> +
> + nbdfuse [-o FUSE-OPTION] [-P PIDFILE] [-r]
> + MOUNTPOINT[/FILENAME] URI
This synopsis looks better than the one in usage().
> +
> +The NBD device itself can be local or remote and is specified by an
> +NBD URI (like C<nbd://localhost>, see L<nbd_connect_uri(3)>) or
> +various other modes.
> +
> +Use C<fusermount -u MOUNTPOINT> to unmount the filesystem after you
> +have used it.
Does umount(8) call into fusermount correctly?
> +
> +This program is similar in concept to L<nbd-client(8)> (which turns
> +NBD into F</dev/nbdX> device nodes), except:
Is it worth mentioning qemu-nbd(8) alongside nbd-client(8)?
> +
> +=over 4
> +
> +=item *
> +
> +nbd-client is faster because it uses a special kernel module
> +
> +=item *
> +
> +nbd-client requires root, but nbdfuse can be used by any user
> +
> +=item *
> +
> +nbdfuse virtual files can be mounted anywhere in the filesystem
> +
> +=item *
> +
> +nbdfuse uses libnbd to talk to the NBD server
> +
> +=item *
> +
> +nbdfuse requires FUSE support in the kernel
> +
> +=back
Decent list.
> +
> +=head1 EXAMPLES
> +
> +=head2 Present a remote NBD server as a local file
> +
> +If there is a remote NBD server running on C<example.com> at the
> +default NBD port number (10809) then you can turn it into a local file
> +by doing:
> +
> + $ mkdir dir
> + $ nbdfuse dir nbd://example.com &
> + $ ls -l dir/
> + total 0
> + -rw-rw-rw-. 1 nbd nbd 1073741824 Jan 1 10:10 nbd
> +
> +The file is called F<dir/nbd> and you can read and write to it as if
> +it is a normal file. Note that writes to the file will write to the
> +remote NBD server. After using it, unmount it:
> +
> + $ fusermount -u dir
> + $ rmdir dir
> +
> +=head2 Use nbdkit to create a file backed by a temporary RAM disk
> +
> +L<nbdkit(1)> has an I<-s> option allowing it to serve over
> +stdin/stdout. You can combine this with nbdfuse as follows:
> +
> + $ mkdir dir
> + $ nbdfuse dir/ramdisk --command nbdkit -s memory 1G &
> + $ ls -l dir/
> + total 0
> + -rw-rw-rw-. 1 nbd nbd 1073741824 Jan 1 10:10 ramdisk
> + $ dd if=/dev/urandom bs=1M count=100 of=mp/ramdisk conv=notrunc,nocreat
> + 100+0 records in
> + 100+0 records out
> + 104857600 bytes (105 MB, 100 MiB) copied, 2.08319 s, 50.3 MB/s
> +
> +When you have finished with the RAM disk, you can unmount it as below
> +which will cause nbdkit to exit and the RAM disk contents to be
> +discarded:
> +
> + $ fusermount -u dir
> + $ rmdir dir
What a fun way to use memory :)
> +
> +=head2 Use qemu-nbd to read and modify a qcow2 file
> +
> +L<qemu-nbd(8)> cannot serve over stdin/stdout, but it can use systemd
> +socket activation. You can combine this with nbdfuse and use it to
> +open any file format which qemu understands:
> +
> + $ mkdir dir
> + $ nbdfuse dir/file.raw \
> + --socket-activation qemu-nbd -f qcow2 file.qcow2 &
> + $ ls -l dir/
> + total 0
> + -rw-rw-rw-. 1 nbd nbd 1073741824 Jan 1 10:10 file.raw
> +
> +File F<dir/file.raw> is in raw format, backed by F<file.qcow2>. Any
> +changes made to F<dir/file.raw> are reflected into the qcow2 file. To
> +unmount the file do:
> +
> + $ fusermount -u dir
> + $ rmdir dir
> +
The real power shines through - we have used the FUSE kernel module for
user-space mounting of a qcow2 image, instead of the nbd kernel module
for root-only mounting of a qcow2 image ;)
> +Some potentially useful FUSE options:
> +
> +=over 4
> +
> +=item B<-o> B<allow_other>
> +
> +Allow other users to see the filesystem. This option has no effect
> +unless you enable it globally in F</etc/fuse.conf>.
> +
> +=item B<-o> B<kernel_cache>
> +
> +Allow the kernel to cache files (reduces the number of reads that have
> +to go through the L<libnbd(3)> API). This is generally a good idea if
> +you can afford the extra memory usage.
> +
> +=item B<-o> B<uid=>N B<-o> B<gid=>N
> +
> +Use these options to map UIDs and GIDs.
Does this line up with the stats we reported earlier in getattr()?
> +
> +=back
> +
> +=item B<-P> PIDFILE
> +
> +=item B<--pidfile> PIDFILE
> +
> +When nbdfuse is ready to serve, write the nbdfuse process ID (PID) to
> +F<PIDFILE>. This can be used in scripts to wait until nbdfuse is
> +ready. Note you mustn't try to kill nbdfuse. Use C<fusermount -u> to
> +unmount the mountpoint which will cause nbdfuse to exit cleanly.
> +
> +=item B<-r>
> +
> +=item B<--readonly>
> +
> +Access the network block device read-only. The virtual file will have
> +read-only permissions, and any writes will return errors.
> +
> +=item B<--socket-activation> CMD [ARGS ...]
> +
> +Select systemd socket activation mode. This is similar to
> +I<--command>, but is used for servers like L<qemu-nbd(8)> which
> +support systemd socket activation. See L</EXAMPLES> above and
> +L<nbd_connect_systemd_socket_activation(3)>.
> +
> +=item B<--tcp> HOST PORT
> +
> +Select TCP mode. Connect to an NBD server on a host and port over an
> +unencrypted TCP socket. See also L<nbd_connect_tcp(3)>.
How hard would it be to support encryption? Obviously, the fuse-mounted
file will be unencrypted, but having libnbd connect to an encrypted NBD
server could prove useful.
> +
> +=item B<--unix> SOCKET
> +
> +Select Unix mode. Connect to an NBD server on a Unix domain socket.
> +See also L<nbd_connect_unix(3)>.
> +
> +=item B<-V>
> +
> +=item B<--version>
> +
> +Display the package name and version and exit.
> +
> +=back
> +
> +=head1 NOTES
> +
> +=head2 Loop mounting
> +
> +It is tempting (and possible) to loop mount the file. However this
> +will be very slow and may sometimes deadlock. Better alternatives are
> +to use either L<nbd-client(8)>, or more securely L<libguestfs(3)>,
Worth mentioning qemu-nbd(8) alongside nbd-client(8)?
> +L<guestfish(1)> or L<guestmount(1)> which can all access NBD servers.
> +
> +=head2 As a way to access NBD servers
> +
> +You can use this to access NBD servers, but it is usually better (and
> +definitely much faster) to use L<libnbd(3)> directly instead. To
> +access NBD servers from the command line, look at L<nbdsh(1)>.
> +
Overall looks like a fun wrapper, to demonstrate how many layers we can
shuffle data through to produce/consume it in the format of interest ;)
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org