[linux-lvm] cache IO blocking
Xen
list at xenhideout.nl
Tue Jun 14 21:50:54 UTC 2016
I am sorry if this sounds repetitive.
I have an SSD + HDD cache combination,
and I cannot rule out that the problem lies entirely with the SSD.
I do test runs of dd if=/dev/zero of=/dev/<vg>/<cached lv>, and the
system can freeze when I do so.
The cache for the specific volume I dd to is very small in relation to
the volume itself.
However, that "vault cache" is not even used (1 block out of 60800) yet.
So I am writing to the combined volume called /dev/linux/vault.
  vault                linux Cwi-aoC--- 435,27g [vault_cache] [vault_corig] 0,00 9,18 0,00
  [vault_cache]        linux Cwi---C---   3,71g                             0,00 9,18 0,00
  [vault_cache_cdata]  linux Cwi-ao----   3,71g
  [vault_cache_cmeta]  linux ewi-ao----   8,00m
  [vault_corig]        linux owi-aoC--- 435,27g
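To put the "1 block out of 60800" figure in numbers, here is a small sketch of the arithmetic. It uses only the values quoted above; the function names are made up for illustration, and the chunk size is an estimate that assumes the "g" suffix means GiB and that the rounded 3,71g is close to the real size.

```python
# Figures taken from the post: cache data LV size and block counts.
CACHE_SIZE_GIB = 3.71      # size of vault_cache_cdata as reported by lvs
TOTAL_BLOCKS = 60800       # total cache blocks
USED_BLOCKS = 1            # blocks currently in use


def cache_usage_percent(used: int, total: int) -> float:
    """Return cache utilization as a percentage."""
    return 100.0 * used / total


def approx_chunk_kib(size_gib: float, total_blocks: int) -> float:
    """Estimate the size of one cache block in KiB (assumes 'g' means GiB)."""
    return size_gib * 1024 * 1024 / total_blocks


if __name__ == "__main__":
    usage = cache_usage_percent(USED_BLOCKS, TOTAL_BLOCKS)
    chunk = approx_chunk_kib(CACHE_SIZE_GIB, TOTAL_BLOCKS)
    print(f"usage: {usage:.4f}%")
    print(f"approx. chunk size: {chunk:.1f} KiB")
```

The usage comes out well below 0.01%, which would be consistent with the 0,00 shown in the listing, and the estimated block size lands close to 64 KiB.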
When I put a little load on the system (such as a media library rescan),
processes can block for more than 2 minutes, so that a TTY shows
messages such as "Process <X> has been blocking for more than 120
seconds".
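The message paraphrased above presumably comes from the kernel's hung task detector, which reports tasks stuck in uninterruptible sleep (state "D") for longer than kernel.hung_task_timeout_secs, 120 seconds by default. A rough sketch for listing such tasks while the stall is happening follows; the helper names are hypothetical and it only reads /proc/<pid>/stat.

```python
import os


def parse_stat_line(line: str) -> tuple[int, str, str]:
    """Split a /proc/<pid>/stat line into (pid, command name, state).

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the last ')' rather than on spaces.
    """
    open_idx = line.index("(")
    close_idx = line.rindex(")")
    pid = int(line[:open_idx].strip())
    comm = line[open_idx + 1:close_idx]
    state = line[close_idx + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(proc_root: str = "/proc") -> list[tuple[int, str]]:
    """Return (pid, comm) for every process currently in state 'D'."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat_line(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or the line was unreadable
        if state == "D":
            blocked.append((pid, comm))
    return blocked


if __name__ == "__main__":
    for pid, comm in blocked_tasks():
        print(pid, comm)
```

Running this in a loop during a dd test run would show whether the blocked processes are the ones touching "root", the ones touching "vault", or both.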
It doesn't happen constantly, but it did happen during the first 2 test
runs. Without the cache it hasn't happened yet; by that I mean without
the cache attached to "vault". "root" is also cached, on the same SSD:
  root                 linux Cwi-aoC---  20,00g [root_cache] [root_corig] 64,74 11,95 0,00
  [root_cache]         linux Cwi---C---   7,42g                           64,74 11,95 0,00
  [root_cache_cdata]   linux Cwi-ao----   7,42g
  [root_cache_cmeta]   linux ewi-ao----  12,00m
  [root_corig]         linux owi-aoC---  20,00g
So basically I can get _huge IO blocking_: top shows the CPU waiting
for IO (iowait near 100%) and the entire system freezes for basically
all hard disk IO to the affected drives. This happens for a cache that
is barely utilized (as I said, 1/60800 blocks currently), yet writing to
it causes the other volume (in this case "root") to block IO.
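For anyone wanting to log the iowait figure instead of watching top, this is a minimal sketch that derives it from two samples of the aggregate "cpu" line in /proc/stat. The function names are my own, and the one second sampling interval is an arbitrary choice.

```python
import time


def parse_cpu_line(line: str) -> list[int]:
    """Return the numeric counters from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]


def iowait_percent(before: list[int], after: list[int]) -> float:
    """Share of CPU time spent in iowait between two samples, in percent.

    Counter order: user, nice, system, idle, iowait, irq, softirq, steal.
    Guest time is already included in user/nice, so only the first eight
    counters are summed.
    """
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


def read_cpu_counters(path: str = "/proc/stat") -> list[int]:
    """Read the counters from the first line of /proc/stat."""
    with open(path) as fh:
        return parse_cpu_line(fh.readline())


if __name__ == "__main__":
    first = read_cpu_counters()
    time.sleep(1.0)
    second = read_cpu_counters()
    print(f"iowait: {iowait_percent(first, second):.1f}%")
```

Sampling this before, during and after a dd run to "vault" would give timestamps for when the stall starts and ends.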
So "vault_cache" and "root_cache" are both on the SSD, and "vault_corig"
and "root_corig" are both on the HDD. Writing to "vault" using dd can
cause "root" to stop responding, in the sense of incurring long IO
stalls.
This is irrespective of cache mode (writethrough/writeback) and cache
policy (smq vs mq). And I wonder if this is just related to the SSD, or
whether I will keep seeing this behaviour when I replace it.
Regards.