[dm-devel] [PATCH] deadlock with suspend and quotas

Alasdair G Kergon agk at redhat.com
Wed Nov 30 13:33:32 UTC 2011


On Tue, Nov 29, 2011 at 11:19:01AM +0100, Jan Kara wrote:
> On Mon 28-11-11 18:32:18, Mikulas Patocka wrote:
> > - skipping sync on frozen filesystem violates sync semantics. 
> > Applications, such as databases, assume that when sync finishes, data were 
> > written to stable storage. If we skip sync when the filesystem is frozen, 
> > we can cause data corruption in these applications (if the system crashes 
> > after we skipped a sync).

>   Here I don't agree. Filesystem must guarantee there are no dirty data on
> a frozen filesystem. Ext4 and XFS do this, ext3 would need proper
> page_mkwrite() implementation for this but that's the problem of ext3, not
> freezing code in general. If there are no dirty data, sync code (and also
> flusher thread) is free to return without doing anything.
 
Consider, during a 'create a snapshot' operation:
   I/O flow:  application -> filesystem -> LV -> disk

dm lockfs is issued by LVM.
  When this returns, the filesystem should be locked, i.e. it should not
  issue any further I/O to the LV.  (But if it did happen to issue I/O,
  it wouldn't be a problem, as it would just get queued by dm and have no
  impact on the snapshot creation operation.)

The application is still running and might still be issuing writes to
the filesystem and might itself issue 'sync'.  But a 'sync' would only
be meaningful for already-completed writes, and the lockfs process should
already have seen to it that they have hit disk.  So a sync issued while
a device is locked can always be skipped.  Have I missed something in
this reasoning, Mikulas?
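
The argument above can be sketched as a toy model. This is only an
illustration of the reasoning, not kernel or device-mapper code: the
names ToyDevice and ToyFilesystem and all of their methods are
hypothetical, and the model assumes that freezing flushes every
completed write and that writes arriving while frozen are held back
rather than completed.

```python
class ToyDevice:
    """Hypothetical stand-in for the dm device: queues I/O while suspended."""

    def __init__(self):
        self.suspended = False
        self.on_disk = []   # blocks that have reached stable storage
        self.queued = []    # blocks held back while suspended

    def submit(self, block):
        if self.suspended:
            self.queued.append(block)
        else:
            self.on_disk.append(block)

    def resume(self):
        # Release everything that was queued during the suspend.
        self.suspended = False
        pending, self.queued = self.queued, []
        self.on_disk.extend(pending)


class ToyFilesystem:
    """Hypothetical filesystem: no dirty data may exist while frozen."""

    def __init__(self, device):
        self.device = device
        self.frozen = False
        self.dirty = []     # completed writes not yet sent to the device
        self.blocked = []   # writes that arrived while frozen (not completed)

    def write(self, block):
        if self.frozen:
            self.blocked.append(block)
        else:
            self.dirty.append(block)

    def _flush(self):
        for block in self.dirty:
            self.device.submit(block)
        self.dirty = []

    def freeze(self):
        # The lockfs step: push out all completed writes, then lock.
        self._flush()
        self.frozen = True

    def thaw(self):
        # Writes held back during the freeze now complete and become dirty.
        self.frozen = False
        self.dirty.extend(self.blocked)
        self.blocked = []

    def sync(self):
        # While frozen there is no dirty data, so there is nothing to do.
        if self.frozen:
            return "skipped"
        self._flush()
        return "flushed"
```

In this model, a sync issued between freeze() and thaw() returns
"skipped" without losing anything, because every write that completed
before the freeze is already in on_disk, and writes issued during the
freeze have not completed and so are not covered by that sync.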

Alasdair



