[dm-devel] [LSF/MM/BFP ATTEND] [LSF/MM/BFP TOPIC] Storage: Copy Offload

joshi.k at samsung.com joshi.k at samsung.com
Thu Feb 13 05:11:28 UTC 2020


I am very keen on this topic.
I've been doing some work on "NVMe simple copy", and would like to discuss
the following and solicit the community's opinion:

- Simple-copy, unlike XCOPY and P2P, is limited to copying within a single
namespace. Some of the problems that the original XCOPY work [3] faced may not
apply to simple-copy, e.g. the splitting of a single copy due to differing
device-specific limits.
I hope I'm not missing something in thinking so?

- [Block I/O] Async interface (through io-uring or AIO) so that multiple
copy operations can be queued.

- [File I/O to user-space] I think it may make sense to extend the
copy_file_range API to do in-device copy as well.

- [F2FS] F2FS GC may leverage the interface. Currently it goes through the
page cache, which is fair. But for relatively cold/warm data (if that needs
to be garbage-collected anyway), it could bypass the host instead and avoid
evicting something useful from the cache.

- [ZNS] ZNS users (kernel or user-space) would be log-structured, and will
benefit from internal copy. But failure scenarios (partial copy,
write-pointer position) need to be discussed.
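
For reference on the copy_file_range point above, here is a minimal user-space
sketch of how the existing call is driven today, written in Python around
os.copy_file_range (Linux, Python 3.8+). The helper name copy_range, the chunk
size, and the choice of errno values that trigger a read/write fallback are
assumptions of this sketch. Whether the kernel could route such a call to an
in-device copy is exactly the open question, so nothing here implies that it
does.

```python
import errno
import os

# Assumption of this sketch: these errno values are treated as "the kernel or
# filesystem cannot do this here", and we fall back to read()/write().
_FALLBACK_ERRNOS = {errno.ENOSYS, errno.EXDEV, errno.EINVAL, errno.EOPNOTSUPP}


def copy_range(src_fd, dst_fd, length, chunk=1 << 20):
    """Copy up to `length` bytes from src_fd to dst_fd, starting at the
    current file offsets of both descriptors. Returns the bytes copied."""
    copied = 0
    use_cfr = hasattr(os, "copy_file_range")
    while copied < length:
        want = min(chunk, length - copied)
        if use_cfr:
            try:
                # The kernel may copy fewer bytes than requested.
                n = os.copy_file_range(src_fd, dst_fd, want)
            except OSError as e:
                if e.errno not in _FALLBACK_ERRNOS:
                    raise
                use_cfr = False  # retry this chunk through the host
                continue
        else:
            buf = os.read(src_fd, want)
            view = memoryview(buf)
            while view:  # handle short writes
                view = view[os.write(dst_fd, view):]
            n = len(buf)
        if n == 0:  # end of source file
            break
        copied += n
    return copied
```

A caller opens both files, positions the offsets with os.lseek, and calls
copy_range(src_fd, dst_fd, nbytes). The return value can be smaller than
nbytes when the source ends early, which is a small-scale version of the
partial-copy question raised for ZNS above.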

Thanks,
Kanchan

> -----Original Message-----
> From: linux-nvme [mailto:linux-nvme-bounces at lists.infradead.org] On Behalf
> Of Chaitanya Kulkarni
> Sent: Tuesday, January 7, 2020 11:44 PM
> To: linux-block at vger.kernel.org; linux-scsi at vger.kernel.org;
> linux-nvme at lists.infradead.org; dm-devel at redhat.com;
> lsf-pc at lists.linux-foundation.org
> Cc: axboe at kernel.dk; msnitzer at redhat.com; bvanassche at acm.org; Martin K.
> Petersen <martin.petersen at oracle.com>; Matias Bjorling
> <Matias.Bjorling at wdc.com>; Stephen Bates <sbates at raithlin.com>;
> roland at purestorage.com; mpatocka at redhat.com; hare at suse.de; Keith Busch
> <kbusch at kernel.org>; rwheeler at redhat.com; Christoph Hellwig <hch at lst.de>;
> frederick.knight at netapp.com; zach.brown at ni.com
> Subject: [LSF/MM/BFP ATTEND] [LSF/MM/BFP TOPIC] Storage: Copy Offload
> 
> Hi all,
> 
> * Background :-
> -----------------------------------------------------------------------
> 
> Copy offload is a feature that allows file-systems or storage devices to be
> instructed to copy files/logical blocks without requiring involvement of the
> local CPU.
> 
> With reference to the RISC-V summit keynote [1], single-threaded performance
> is limited by the end of Dennard scaling, and multi-threaded performance is
> slowing down due to Moore's law limitations. With the rise of the SNIA
> Computational Storage Technical Work Group (TWG) [2], offloading
> computations to the device or over the fabrics is becoming popular, as
> several solutions are available [2]. One of the common operations that is
> popular in the kernel but not merged yet is copy offload over the fabrics or
> on to the device.
> 
> * Problem :-
> -----------------------------------------------------------------------
> 
> The original work, done by Martin, is available at [3]. The latest work,
> posted by Mikulas [4], is not merged yet. These two approaches are totally
> different from each other. Several storage vendors discourage mixing copy
> offload requests with regular READ/WRITE I/O. Also, the operation fails if a
> copy request ever needs to be split as it traverses the stack, which has the
> unfortunate side-effect of preventing copy offload from working in pretty
> much every common deployment configuration out there.
> 
> * Current state of the work :-
> -----------------------------------------------------------------------
> 
> With [3], it is hard to handle arbitrary DM/MD stacking without splitting the
> command in two, one for copying IN and one for copying OUT. [4] demonstrates
> this, and why [3] is not a suitable candidate. With [4], in turn, there is an
> unresolved problem with the two-command approach: how to handle changes to
> the DM layout between the IN and OUT operations.
> 
> * Why Linux Kernel Storage System needs Copy Offload support now ?
> -----------------------------------------------------------------------
> 
> Several developments make this timely: the rise of the SNIA Computational
> Storage TWG and solutions [2], existing SCSI XCopy support in the protocol,
> recent advancement in the Linux Kernel File System for Zoned devices
> (Zonefs [5]), and Peer to Peer DMA support in the Linux Kernel, mainly for
> NVMe devices [7]. Eventually NVMe devices and subsystems (NVMe PCIe/NVMeOF)
> will benefit from a Copy offload operation as well.
> 
> With this background we have a significant number of use-cases which are
> strong candidates waiting for outstanding Linux Kernel Block Layer Copy
> Offload support, so that the Linux Kernel Storage subsystem can address the
> previously mentioned problems [1] and allow efficient offloading of
> data-related operations (such as move/copy etc.).
> 
> For reference, following is the list of the use-cases/candidates waiting for
> Copy Offload support :-
> 
> 1. SCSI-attached storage arrays.
> 2. Stacking drivers supporting XCopy DM/MD.
> 3. Computational Storage solutions.
> 4. File systems :- Local, NFS and Zonefs.
> 5. Block devices :- Distributed, local, and Zoned devices.
> 6. Peer to Peer DMA support solutions.
> 7. Potentially NVMe subsystem both NVMe PCIe and NVMeOF.
> 
> * What we will discuss in the proposed session ?
> -----------------------------------------------------------------------
> 
> I'd like to propose a session to go over this topic to understand :-
> 
> 1. What are the blockers for Copy Offload implementation ?
> 2. Discussion about having a file system interface.
> 3. Discussion about having the right system call for user-space.
> 4. What is the right way to move this work forward ?
> 5. How can we help to contribute and move this work forward ?
> 
> * Required Participants :-
> -----------------------------------------------------------------------
> 
> I'd like to invite block layer, device driver and file system developers
> to :-
> 
> 1. Share their opinion on the topic.
> 2. Share their experience and any other issues with [4].
> 3. Uncover additional details that are missing from this proposal.
> 
> Required attendees :-
> 
> Martin K. Petersen
> Jens Axboe
> Christoph Hellwig
> Bart Van Assche
> Stephen Bates
> Zach Brown
> Roland Dreier
> Ric Wheeler
> Trond Myklebust
> Mike Snitzer
> Keith Busch
> Sagi Grimberg
> Hannes Reinecke
> Frederick Knight
> Mikulas Patocka
> Matias Bjørling
> 
> [1] https://content.riscv.org/wp-content/uploads/2018/12/A-New-Golden-Age-for-Computer-Architecture-History-Challenges-and-Opportunities-David-Patterson-.pdf
> [2] https://www.snia.org/computational
>     https://www.napatech.com/support/resources/solution-descriptions/napatech-smartnic-solution-for-hardware-offload/
>     https://www.eideticom.com/products.html
>     https://www.xilinx.com/applications/data-center/computational-storage.html
> [3] git://git.kernel.org/pub/scm/linux/kernel/git/mkp/linux.git xcopy
> [4] https://www.spinics.net/lists/linux-block/msg00599.html
> [5] https://lwn.net/Articles/793585/
> [6] https://nvmexpress.org/new-nvmetm-specification-defines-zoned-namespaces-zns-as-go-to-industry-technology/
> [7] https://github.com/sbates130272/linux-p2pmem
> [8] https://kernel.dk/io_uring.pdf
> 
> Regards,
> Chaitanya