[linux-lvm] cache on SSD makes system unresponsive

Oleg Cherkasov o1e9 at member.fsf.org
Sat Oct 21 14:10:36 UTC 2017

On 21. okt. 2017 04:55, Mike Snitzer wrote:
> On Thu, Oct 19 2017 at  5:59pm -0400,
> Oleg Cherkasov <o1e9 at member.fsf.org> wrote:
>> On 19. okt. 2017 21:09, John Stoffel wrote:
> So aside from SAR output: you don't have any system logs?  Or a vmcore
> of the system (assuming it crashed?) -- in it you could access the
> kernel log (via the 'log' command in the crash utility).

Unfortunately, no logs.  I tried to recover the dmesg output, but had no 
luck: all dmesg logs except the one from the latest boot are zeroed.  Of 
course there are messages, secure and other logs, but I do not see any 
valuable information in them.

The system did not crash.  The OOM killer was going wild, but I did 
manage to send Ctrl-Alt-Del from the main console via iLO, so eventually 
it rebooted with a clean disk umount.

> More specifics on the workload would be useful.  Also, more details on
> the LVM cache configuration (block size?  writethrough or writeback?
> etc).

No extra parameters, apart from specifying writethrough mode initially.  
The hardware RAID1 on the cache disk is 64k and the hardware RAID5 on the 
main array is 128k.

I followed the documentation from the RHEL doc site precisely: lvcreate, 
lvconvert to change the type, and then lvconvert to add the cache.
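
For reference, the sequence can be sketched roughly as below.  This is 
only an illustration: a small Python helper that builds and prints the 
command lines without running them.  The VG name vg0, the LV name data, 
the device /dev/sdb and the sizes are hypothetical placeholders, and 
passing --cachemode on the final lvconvert is an assumption, not 
necessarily the exact invocation used on this system.

```python
import shlex


def cache_setup_commands(vg, origin_lv, fast_pv, cache_size, meta_size,
                         cache_mode="writethrough"):
    """Build (but do not run) the LVM commands for a cache setup.

    All names and sizes are supplied by the caller; nothing here is
    taken from the real system discussed in this thread.
    """
    cache_lv = f"{origin_lv}_cache"        # hypothetical naming scheme
    meta_lv = f"{origin_lv}_cache_meta"    # hypothetical naming scheme
    return [
        # 1. Create the cache data LV and cache metadata LV on the SSD.
        ["lvcreate", "-L", cache_size, "-n", cache_lv, vg, fast_pv],
        ["lvcreate", "-L", meta_size, "-n", meta_lv, vg, fast_pv],
        # 2. Convert the pair into a cache pool.
        ["lvconvert", "--type", "cache-pool",
         "--poolmetadata", f"{vg}/{meta_lv}", f"{vg}/{cache_lv}"],
        # 3. Attach the cache pool to the origin LV (assumed place for
        #    the cache mode option).
        ["lvconvert", "--type", "cache", "--cachemode", cache_mode,
         "--cachepool", f"{vg}/{cache_lv}", f"{vg}/{origin_lv}"],
    ]


if __name__ == "__main__":
    # Dry run only: print the commands so they can be reviewed first.
    for argv in cache_setup_commands("vg0", "data", "/dev/sdb", "100G", "1G"):
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Printing instead of executing is deliberate: the commands need root and 
change the volume layout, so they should be checked against the real VG 
and device names before anything is run.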

Afterwards I decided to try writeback and switched the cachemode to it with 

> I'll be looking very closely for any sign of memory leaks (both with
> code inspection and testing while kmemleak is enabled).
> But the more info you can provide on the workload the better.

According to SAR there are no records for about 20 minutes before I 
rebooted, so I suspect the SAR daemon fell victim to the OOM killer.

