[dm-devel] Integrity discard/trim extremely slow on NVMe SSD storage (~10GiB/minute)

Mikulas Patocka mpatocka at redhat.com
Mon Apr 26 15:33:32 UTC 2021



On Mon, 19 Apr 2021, Melvin Vermeeren wrote:

> Note: This was originally posted on cryptsetup GitLab.
> Note: Reposting here for better visibility as it appears to be a kernel bug.
> Ref: https://gitlab.com/cryptsetup/cryptsetup/-/issues/639
> 
> Issue description
> -----------------
> 
> With a Seagate FireCuda 520 2TB NVMe SSD running in PCIe 3.0 x4 mode (my 
> motherboard does not have PCIe 4.0), discards through `dm-integrity` 
> layer are extremely slow to the point of being almost unusable or in 
> some cases fully unusable.
> 
> This is so slow that having the `discard` option on swap is not 
> possible, as it takes around 3 minutes to complete for 32GiB swap, 
> causing timeouts during boot, which in turn causes various other services 
> to fail, resulting in a drop to the emergency shell.
> 
> `blkdiscard` directly to NVMe device takes I think 10 sec or so for the 
> entire 2TB, but through `dm-integrity` the rate is approx 10GiB per 
> minute, meaning over 3 hours to discard the entire 2TB. Normal read and 
> write operations are not affected and are high performance, easily 
> reaching 2GiB/s through the entire layer: `disk dm-integrity mdadm luks 
> lvm ext4`.
> 
> Checking the kernel thread usage in htop, quite a few 
> `dm-integrity-offload` threads are in the `D` state with `0.0` CPU usage 
> when discarding, which is rather odd. No integrity threads are actually 
> working and read-write disk usage measured with `dstat` is not even 
> 1MiB/s.
> 
> To detail the above, `dstat` shows extremely clear timings: 2 seconds 0k 
> write, 1 second 512k write, repeat. Possible timeout in locks somewhere 
> or other problematic lock situation?
> 
> Steps for reproducing the issue
> -------------------------------
> 
> 1. Create two 10G partitions on SSD.
> 2. Set up `dm-integrity` on one of these and open the device with
>    `--allow-discards`.
> 3. `blkdiscard` both partitions.
> 	* Raw partition is done instantly.
> 	* Integrity partition takes around a minute.
> 
> Additional info
> ---------------
> 
> The NVMe device is formatted to native 4096 byte sectors and the
> `dm-integrity` layer also uses 4096 byte sectors.
> 
> Debian bullseye (testing), kernel 5.10.0-6-rt-amd64 5.10.28-1. Same issue 
> occurred during testing with Arch Linux liveiso which is kernel 5.11.x. 
> Cryptsetup package version 2.3.5.
> 
> On another server system (IBM POWER9, ppc64le) with SAS 3.0 SSD discard is 
> working properly at more than acceptable speeds, showing significant CPU usage 
> while discarding. In this case it is a regular Intel amd64 desktop system.
> 
> Debug log
> ---------
> 
> Nothing really fails, dmesg and syslog show no issues/warnings at all, not 
> sure what to include.
> 
> Only appears to affect NVMe
> ---------------------------
> 
> Further tests on the same machine show that a SATA SSD is not affected by this 
> issue and discards at high performance. Appears to be an NVMe-specific bug:
> Ref: https://gitlab.com/cryptsetup/cryptsetup/-/issues/639#note_555208783

I tried it on my nvme device (Samsung SSD 960 EVO 500GB) and I could 
discard 32GB in 5 seconds.

I assume that it is specific to the nvme device you are using. The device 
is perhaps slow due to the mix of discard+read+write requests that 
dm-integrity generates.
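
To put the two figures side by side, here is a back-of-the-envelope sketch in
Python. It assumes that "32GB" above means 32 GiB and that the drive's "2TB"
is decimal; the function and variable names are made up for illustration and
do not come from the kernel or cryptsetup.

```python
GIB = 1024 ** 3

def gib_per_minute(size_bytes, seconds):
    """Convert a discard of size_bytes that took `seconds` into GiB/minute."""
    return size_bytes / GIB / (seconds / 60)

# Figure from this reply: 32 GiB discarded in 5 seconds (assuming GiB).
evo_rate = gib_per_minute(32 * GIB, 5)

# Figure from the report: roughly 10 GiB per minute through dm-integrity.
reported_rate = 10.0

# Time to discard a full 2 TB (decimal) drive at the reported rate.
full_drive_hours = (2 * 10**12 / GIB) / reported_rate / 60

print(f"960 EVO:             {evo_rate:.0f} GiB/min")
print(f"FireCuda (reported): {reported_rate:.0f} GiB/min")
print(f"ratio:               {evo_rate / reported_rate:.1f}x")
print(f"full 2 TB at reported rate: {full_drive_hours:.1f} hours")
```

Under those assumptions the two devices differ by a factor of about 38, and
the reported rate matches the "over 3 hours" estimate in the report.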

> If there is anything I can do to help feel free to let me know.
> Note that I am not subscribed to dm-devel, please CC me directly.
> 
> Thanks,

Could you try it on other nvme disks?
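
For comparing disks, something along these lines could be used to get a
number in the same unit as the report. This is only a sketch: it assumes the
`blkdiscard` and `blockdev` utilities from util-linux are installed, that it
runs as root, and that the device passed in is either a scratch partition or
an already opened dm-integrity mapping. The helper names are hypothetical.
Note that `blkdiscard` throws away all data on the device.

```python
import subprocess
import time

GIB = 1024 ** 3

def device_size_bytes(device):
    """Return the size of a block device in bytes via `blockdev --getsize64`."""
    out = subprocess.run(["blockdev", "--getsize64", device],
                         check=True, capture_output=True, text=True)
    return int(out.stdout.strip())

def timed_run(argv):
    """Run a command to completion and return elapsed wall-clock seconds."""
    start = time.monotonic()
    subprocess.run(argv, check=True)
    return time.monotonic() - start

def discard_rate(device):
    """Discard the WHOLE device (destroys its data); return GiB per minute."""
    size = device_size_bytes(device)
    seconds = timed_run(["blkdiscard", device])
    return size / GIB / (seconds / 60)

# Example call (the device path is a placeholder, not taken from the report):
# print(f"{discard_rate('/dev/mapper/integrity-test'):.1f} GiB/min")
```

Running it once against the raw partition and once against the dm-integrity
device on each disk would give directly comparable rates.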

Mikulas



