[dm-devel] dm overlaybd: targets mapping OverlayBD image

Gao Xiang hsiangkao at linux.alibaba.com
Wed May 24 11:06:46 UTC 2023



On 2023/5/24 03:48, Giuseppe Scrivano wrote:
> Gao Xiang <hsiangkao at linux.alibaba.com> writes:
> 
>> Hi Giuseppe,
>>
>> On 2023/5/24 01:11, Giuseppe Scrivano wrote:
>>> Gao Xiang <hsiangkao at linux.alibaba.com> writes:
>>>
>>
>> ...
>>
>>>> Agreed, I hope you guys could actually sit down and evaluate a proper
>>>> solution for the next OCI v2.  Currently I know there are:
>>>>
>>>>    - Composefs
>>>>    - (e)stargz   https://github.com/containerd/stargz-snapshotter
>>>>    - Nydus       https://github.com/containerd/nydus-snapshotter
>>>>    - OverlayBD   https://github.com/containerd/accelerated-container-image
>>>>    - SOCI        https://github.com/awslabs/soci-snapshotter
>>>>    - Tarfs
>>>>    - (maybe even more..)
>>>>
>>>> Honestly, I do think OSTree/Composefs is the best approach for now for
>>>> deduplication and page cache sharing (due to the kernel limitation on
>>>> page cache sharing and the overlayfs copyup limitation).  I'm too tired
>>>> of container image stuff honestly.  Too much manpower is wasted on it.
>>> for a file-based storage model, I am not sure a new format would really
>>> buy us much or that it can be significantly different.
>>> Without proper support from the kernel, a new format would still need
>>> to create the layout overlay expects, so it won't be much different than
>>> what we have now.
>>
>> I've seen a lot of effort on this, for example,
>> https://docs.google.com/presentation/d/1lBKVrYzm9JEYuw-gIEsrcePSK0jL1Boe/edit#slide=id.p22
>>
>> Merging the writable layer and read-only layers with overlayfs is
>> feasible. I mean, at least for composefs model on backing XFS/btrfs, we
>> could merge these layers with overlayfs so that I guess reflink could
>> be done to avoid full copyup as well?  I do think that's a net win.
>>
>>> The current OCI format, with some tweaks like (e)stargz or zstd:chunked,
>>> already makes its content addressable and a client can retrieve only the
>>> subset of the files that are needed.  At the same time we maintain the
>>> simplicity of a tarball and it won't break existing clients.
>>
>> (e)stargz or zstd:chunked still needs to be converted by the publisher
>> and not all existing OCI images are stored in this way.  But apart from
>> a detailed comparison, disk-mapping image approaches really seem to be
>> a drawback, at least on my side.
> 
> these images can be treated as if all their files are missing and the
> checksum is calculated on the receiver side.  They will still be stored
> locally indexed by their checksum.  We lose the possibility to pull only
> the missing files but we maintain the other advantages at runtime.  In
> this way moving to a new format can be done incrementally without
> breaking what we have now.

Yeah, that is on-demand loading stuff (another story), but my
point was that one win I can see in the composefs model is that
you could use EROFS + overlayfs + XFS/btrfs to do partial copyup
by using clone_file_range() to copy up within the same fs (since
all layers, including the writable layer, actually land in the
same fs, overlayfs will just clone_file_range()).

In principle, we could do some hack to do clone_file_range()
across different fses which are actually backed by the same
fs for other approaches, but that approach cannot be easily
landed upstream TBH.
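
To illustrate the "clone if possible, otherwise copy" behaviour described
above, here is a rough userspace sketch.  It is not the overlayfs code path
(which clones in-kernel during copyup); it only uses the FICLONE ioctl from
<linux/fs.h> to show that a reflink works when source and destination live
on the same reflink-capable fs (e.g. XFS or btrfs) and fails otherwise.
The helper name clone_or_copy() and the file paths are made up for this
example, and Linux is assumed.

```python
import fcntl
import shutil

# FICLONE from <linux/fs.h>, i.e. _IOW(0x94, 9, int)
FICLONE = 0x40049409


def clone_or_copy(src_path, dst_path):
    """Try to reflink src into dst; fall back to a full data copy.

    Returns "reflink" if the clone succeeded, "copy" otherwise.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        try:
            # Share the source extents with the destination file.
            # This only works within one reflink-capable filesystem.
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
            return "reflink"
        except OSError:
            # Simplification for this sketch: any failure (different fs,
            # no reflink support, ...) is treated as "cannot clone".
            pass
    # Full copy, analogous to a copyup that has to move all the data.
    shutil.copyfile(src_path, dst_path)
    return "copy"
```

For example, clone_or_copy("lower/file", "upper/file") would be expected to
take the reflink path when both directories sit on the same XFS or btrfs
instance, and the copy path when they are on different filesystems.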

Thanks,
Gao Xiang


