[dm-devel] dm-crypt: flush crypt_queue on suspend? on REQ_FLUSH?

Eric Wheeler dm-devel at lists.ewheeler.net
Sat Sep 3 06:26:50 UTC 2016

On Fri, 2 Sep 2016, Eric Wheeler wrote:
> We have a KVM => dm-crypt => dm-thin stack and a snapshot may have 
> partially completed queued IO out-of-order.  EXT4 is giving errors like 
> this after mounting a snapshot, but only on files recently modified near 
> the snapshot time.  This might imply out-of-order writes since, 
> presumably, the ext4 journal would handle deletions in journal order and 
> snapshots should be safe at any time:
> 	kernel: EXT4-fs error (device dm-2): ext4_lookup:1441: inode #1196093: comm rsync: deleted inode referenced: 1188710 
> Notably, the ext4 error did not present prior to the snapshot.  We have 
> rsync logs from hours before that didn't report this message, but all 
> later rsyncs do.
> Perhaps related, or perhaps not, crypt_map() notes the following:
> 	1914          * If bio is REQ_FLUSH or REQ_DISCARD, just bypass crypt queues.
> 	1915          * - for REQ_FLUSH device-mapper core ensures that no IO is in-flight
> 	1916          * - for REQ_DISCARD caller must use flush if IO ordering matters
> However, crypt_map returns DM_MAPIO_SUBMITTED after calling 
> kcryptd_queue_crypt(io) which processes asynchronously on 
> cc->crypto_queue.  In crypt_ctr(), alloc_workqueue sets max_active to 
> num_online_cpus() unless 'same_cpu_crypt' is set in the dm table (mine is 
> not), so if I understand correctly, completion could happen out of order 
> with CPUs >1.
> Does the dm stack know that the bio (dm_crypt_io) hasn't been completed 
> [since] it was put on a work queue [and then a write_thread]?  If so, I 
> would like to understand that dm bio-tracking mechanism, so please point 
> me at documentation or code.
> If not, then does crypt_map() need a workqueue_flush(cc->crypto_queue) on 
> I'm new to the dm-crypt code, so please educate me if I'm missing 
> something here.

On second look, simply adding workqueue_flush isn't enough since I've not 
set 'submit_from_crypt_cpus'; it's the cc->write_thread's job to get the 
bio's submitted via kcryptd_io_write() after pulling them from a red-black 
tree that kcryptd_crypt_write_io_submit() inserted into.  Is the red-black 
tree guaranteed to pop off in order, or at least to flush in order when a 
REQ_FLUSH comes through?

If not, then I think this means I need to set both 'same_cpu_crypt' 
(DM_CRYPT_SAME_CPU) and 'submit_from_crypt_cpus' (DM_CRYPT_NO_OFFLOAD) to 
guarantee ordering by restricting the workqueue to serial work 
(DM_CRYPT_SAME_CPU) and force immediate generic_make_request inside 
kcryptd_crypt_write_io_submit (DM_CRYPT_NO_OFFLOAD) instead of queuing to 
the cc->write_thread.

Your thoughts?


> Thank you for your help!
> --
> Eric Wheeler
> --
> dm-devel mailing list
> dm-devel at redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel

More information about the dm-devel mailing list