[virt-tools-list] [virt-bootstrap 4/4] docker: Add support for whiteout files

Cole Robinson crobinso at redhat.com
Tue Aug 6 18:21:39 UTC 2019


On 8/4/19 2:06 PM, Radostin Stoyanov wrote:
> A whiteout is an empty file with a special name that signifies
> a path that should not be present in upper layers.
> 
> When container image layer contains whiteouts (described in [2])
> virt-bootstrap should handle them appropriately:
> 
> .wh.PATH     : PATH should be deleted
> .wh..wh..opq : all children (including sub-directories and all
> 	       descendants) of the folder containing this file
> 	       should be removed.
> 
> [1] https://github.com/opencontainers/image-spec/blob/master/layer.md#whiteouts
> [2] https://github.com/moby/moby/blob/d1f470946/pkg/archive/whiteouts.go
> 
> Signed-off-by: Radostin Stoyanov <rstoyanov1 at gmail.com>
> ---
>  src/virtBootstrap/utils.py    | 12 ++++-
>  src/virtBootstrap/whiteout.py | 93 +++++++++++++++++++++++++++++++++++
>  2 files changed, 103 insertions(+), 2 deletions(-)
>  create mode 100644 src/virtBootstrap/whiteout.py
> 
> diff --git a/src/virtBootstrap/utils.py b/src/virtBootstrap/utils.py
> index d6031f1..245d8e0 100644
> --- a/src/virtBootstrap/utils.py
> +++ b/src/virtBootstrap/utils.py
> @@ -35,6 +35,7 @@ import logging
>  import shutil
>  
>  import passlib.hosts
> +from virtBootstrap import whiteout
>  
>  try:
>      import guestfs
> @@ -279,8 +280,12 @@ def safe_untar(src, dest):
>      # Note: Here we use --absolute-names flag to get around the error message
>      # "Cannot open: Permission denied" when symlynks are extracted, with the
>      # qemu:/// driver. This flag must not be used outside virt-sandbox.
> -    params = ['--', '/bin/tar', 'xf', src, '-C', '/mnt', '--exclude', 'dev/*',
> -              '--overwrite', '--absolute-names']
> +    params = ['--', '/bin/tar', 'xf', src,
> +              '-C', '/mnt',
> +              '--exclude', 'dev/*',
> +              '--exclude', '*/%s*' % whiteout.PREFIX,
> +              '--overwrite',
> +              '--absolute-names']
>      # Preserve file attributes following the specification in
>      # https://github.com/opencontainers/image-spec/blob/master/layer.md
>      if os.geteuid() == 0:
> @@ -341,6 +346,9 @@ def untar_layers(layers_list, dest_dir, progress):
>          tar_file, tar_size = layer
>          log_layer_extract(tar_file, tar_size, index + 1, nlayers, progress)
>  
> +        # Apply whiteout changes with respect to parent layers
> +        whiteout.apply_whiteout_changes(tar_file, dest_dir)
> +
>          # Extract layer tarball into destination directory
>          safe_untar(tar_file, dest_dir)
>  
> diff --git a/src/virtBootstrap/whiteout.py b/src/virtBootstrap/whiteout.py
> new file mode 100644
> index 0000000..1f9efd4
> --- /dev/null
> +++ b/src/virtBootstrap/whiteout.py
> @@ -0,0 +1,93 @@
> +# -*- coding: utf-8 -*-
> +# Authors: Radostin Stoyanov <rstoyanov1 at gmail.com>
> +#
> +# Copyright (c) 2019 Radostin Stoyanov
> +#
> +# This program is free software: you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation, either version 3 of the License, or
> +# (at your option) any later version.
> +
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +"""
> +Whiteouts are files with a special meaning for a layered filesystem.
> +The should not be extracted in the destination directory.
> +

* They should

> +Whiteout prefix (.wh.) followed by a filename means that the file has
> +been removed from.
> +

Ending with 'from' sounds a bit weird here. Maybe 'means that the file
has been removed', but I'm not sure that's totally accurate.

> +Whiteout meta prefix (.wh..wh.) has special meaning which is not for
> +removing a file.
> +"""

This reads a little weird too. Maybe it's fine to just drop this text
and add a link to the file(s) you mention in the commit message. The
process_whiteout function below already describes the behavior well, so
I'm not sure it needs to be duplicated here.


> +
> +import logging
> +import os
> +import shutil
> +import tarfile
> +
> +
> +PREFIX = ".wh."
> +METAPREFIX = PREFIX + PREFIX
> +OPAQUE = METAPREFIX + ".opq"
> +
> +# pylint: disable=invalid-name
> +logger = logging.getLogger(__name__)
> +
> +
> +def apply_whiteout_changes(tar_file, dest_dir):
> +    """
> +    Process files with whiteout prefix and apply
> +    changes in destination folder.
> +    """
> +    for path in get_whiteout_files(tar_file):
> +        basename = os.path.basename(path)
> +        dirname = os.path.dirname(path)
> +        dirname = os.path.join(dest_dir, dirname)
> +
> +        process_whiteout(dirname, basename)
> +
> +
> +def get_whiteout_files(filepath):
> +    """
> +    Return a list of whiteout files from tar file
> +    """
> +    whiteout_files = []
> +    with tarfile.open(filepath) as tar:
> +        for path in tar.getnames():
> +            if os.path.basename(path).startswith(PREFIX):
> +                whiteout_files.append(path)
> +    return whiteout_files
> +
> +
> +def process_whiteout(dirname, basename):
> +    """
> +    Process a whiteout file:
> +
> +    .wh.PATH     : PATH should be deleted
> +    .wh..wh..opq : all children (including sub-directories and all
> +                   descendants) of the folder containing this file
> +                   should be removed
> +
> +    When a folder is first created in a layer an opq file will be
> +    generated. In such case, there is nothing to remove, so we can simply
> +    ignore the opaque whiteout file.
> +    """
> +    if basename == OPAQUE:
> +        if os.path.isdir(dirname):
> +            shutil.rmtree(dirname)
> +            os.makedirs(dirname)
> +    elif not basename.startswith(METAPREFIX):
> +        target = os.path.join(dirname, basename[len(PREFIX):])
> +        if os.path.isfile(target):
> +            os.remove(target)
> +        elif os.path.isdir(target):
> +            shutil.rmtree(target)
> +        else:
> +            logger.error("%s is not a file or directory", target)
> 
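For anyone who wants to exercise the whiteout rules from the docstring
above outside of virt-bootstrap, here is a minimal, self-contained sketch.
It mirrors the logic of the patch but is not the patch itself: the names
whiteout_paths, apply_whiteouts, make_layer and demo are made up for this
example, and the sample paths (etc/motd, var/cache) are arbitrary. It also
differs from the patch in one place, flagged in a comment: it uses
os.path.lexists so that a dangling symlink is removed too.

```python
import io
import os
import shutil
import tarfile
import tempfile

PREFIX = ".wh."
METAPREFIX = PREFIX + PREFIX
OPAQUE = METAPREFIX + ".opq"


def whiteout_paths(tar_path):
    """Return member names whose basename starts with the whiteout prefix."""
    with tarfile.open(tar_path) as tar:
        return [name for name in tar.getnames()
                if os.path.basename(name).startswith(PREFIX)]


def apply_whiteouts(tar_path, dest_dir):
    """Apply one layer's whiteouts to an already-extracted rootfs."""
    for path in whiteout_paths(tar_path):
        dirname = os.path.join(dest_dir, os.path.dirname(path))
        basename = os.path.basename(path)
        if basename == OPAQUE:
            # Opaque whiteout: drop everything below the directory, keep it.
            if os.path.isdir(dirname):
                shutil.rmtree(dirname)
                os.makedirs(dirname)
        elif not basename.startswith(METAPREFIX):
            target = os.path.join(dirname, basename[len(PREFIX):])
            if os.path.isdir(target) and not os.path.islink(target):
                shutil.rmtree(target)
            elif os.path.lexists(target):
                # Assumption of this sketch (not in the patch): lexists
                # also catches symlinks whose target no longer exists.
                os.remove(target)


def make_layer(tar_path, names):
    """Hypothetical helper: write empty regular files into a tarball."""
    with tarfile.open(tar_path, "w") as tar:
        for name in names:
            tar.addfile(tarfile.TarInfo(name), io.BytesIO(b""))


def demo():
    """Create a fake lower rootfs, apply an upper layer, report what is left."""
    work = tempfile.mkdtemp()
    try:
        rootfs = os.path.join(work, "rootfs")
        os.makedirs(os.path.join(rootfs, "etc"))
        os.makedirs(os.path.join(rootfs, "var", "cache", "old"))
        open(os.path.join(rootfs, "etc", "motd"), "w").close()
        open(os.path.join(rootfs, "etc", "hosts"), "w").close()

        layer = os.path.join(work, "layer.tar")
        make_layer(layer, ["etc/.wh.motd", "var/cache/.wh..wh..opq"])
        apply_whiteouts(layer, rootfs)

        return {
            "motd": os.path.exists(os.path.join(rootfs, "etc", "motd")),
            "hosts": os.path.exists(os.path.join(rootfs, "etc", "hosts")),
            "cache_entries": os.listdir(os.path.join(rootfs, "var", "cache")),
        }
    finally:
        shutil.rmtree(work)


if __name__ == "__main__":
    print(demo())
```

As in the patch, the whiteouts are applied before the layer itself would be
extracted, so they only affect content that came from the parent layers.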

I applied it but didn't test it; the code looks fine to me.

Reviewed-by: Cole Robinson <crobinso at redhat.com>

- Cole