[linux-lvm] cache on SSD makes system unresponsive

John Stoffel john at stoffel.org
Mon Oct 23 20:45:56 UTC 2017

>>>>> "Oleg" == Oleg Cherkasov <o1e9 at member.fsf.org> writes:

Oleg> On 21. okt. 2017 04:55, Mike Snitzer wrote:
>> On Thu, Oct 19 2017 at  5:59pm -0400,
>> Oleg Cherkasov <o1e9 at member.fsf.org> wrote:
>>> On 19. okt. 2017 21:09, John Stoffel wrote:
>> So aside from SAR output: you don't have any system logs?  Or a vmcore
>> of the system (assuming it crashed?) -- in it you could access the
>> kernel log (via the 'log' command in the crash utility).

Oleg> Unfortunately no logs.  I tried to see whether I could recover dmesg, 
Oleg> but no luck.  All logs except the latest boot's dmesg are zeroed.  Of 
Oleg> course there are messages, secure and the other logs, but I do not see 
Oleg> any valuable information there.

Oleg> The system did not crash; the OOM killer was going wild, but I did 
Oleg> manage to Ctrl-Alt-Del from the main console via iLO, so eventually it 
Oleg> rebooted with a clean disk unmount.

Bummer.  Maybe you can set up a remote syslog server to capture verbose
kernel logs, including the OOM messages?
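A minimal forwarding rule for rsyslog (the default syslogd on RHEL/CentOS 7)
might look like the fragment below; the server address 192.0.2.10 is a
placeholder, not something from this thread:

```
# /etc/rsyslog.d/remote.conf on the affected host
# Forward kernel messages (including OOM killer output) to a remote server.
kern.*  @192.0.2.10:514
# A single @ means UDP; use @@ for TCP, which is more reliable under load:
# kern.*  @@192.0.2.10:514
```

The receiving server then needs the imudp (or imtcp) input module loaded and
listening on port 514.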

>> More specifics on the workload would be useful.  Also, more details on
>> the LVM cache configuration (block size?  writethrough or writeback?
>> etc).

Oleg> No extra params except specifying writethrough mode initially.
Oleg> The hardware RAID1 stripe size on the cache disk is 64k, and on
Oleg> the main hardware RAID5 array it is 128k.

Oleg> I followed the documentation from the RHEL doc site precisely: lvcreate, 
Oleg> then lvconvert to update the type, then lvconvert to add the cache.

Oleg> Afterwards I decided to try writeback, and shifted the cachemode to it 
Oleg> with lvcache.

>> I'll be looking very closely for any sign of memory leaks (both with
>> code inspection and testing while kmemleak is enabled).
>> But the more info you can provide on the workload the better.

Oleg> According to SAR there are no records for roughly 20 minutes before the 
Oleg> reboot, so I suspect the SAR daemon fell victim to the OOM killer.

Maybe you could take a snapshot of all the processes on the system
before you run the test, and also run 'vmstat 1' to a log file while
the test runs?
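Concretely, a capture script along these lines (the /tmp/cache-test path,
the 1-second interval, and the counters sampled are arbitrary choices)
would preserve that state even if SAR itself dies:

```shell
# Capture a process snapshot and periodic memory stats around a test run.
# Log location and sampling interval are arbitrary example choices.
LOG=/tmp/cache-test
mkdir -p "$LOG"
ps auxww > "$LOG/processes-before.txt"      # full process list before the test
# Sample memory-pressure counters once a second in the background.
( while sleep 1; do
    date '+%s' >> "$LOG/meminfo.log"
    grep -E '^(MemFree|MemAvailable|Dirty|Writeback):' /proc/meminfo \
        >> "$LOG/meminfo.log"
  done ) &
SAMPLER=$!
sleep 3    # placeholder: run the real workload here instead
kill "$SAMPLER"
```

Because the samples land on disk as they are taken, the last entries survive
a forced reboot even when the sampler is killed by the OOM killer.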

As a weird thought... maybe it's the 1GB metadata LV that's causing
problems?  Maybe you need to just accept the default metadata size.

It might also be instructive to make the cache only half the SSD's
size and see if that helps.  It *might* be, as other people have
mentioned, that your SSD's performance drops off a cliff when it's
mostly full.  So reducing the cache size, even to only 80% of the size
of the disk, might give it enough spare empty blocks to stay fast.
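That spare room can be left simply by sizing the cache LV at creation time;
this one-liner is a sketch with placeholder VG and device names, and again
needs root and a real PV to run:

```shell
# Allocate only 80% of the SSD PV to the cache data LV, leaving the rest
# unallocated so the drive keeps spare blocks for wear leveling.
lvcreate -l 80%PVS -n lv_cache vg0 /dev/sdb
```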


More information about the linux-lvm mailing list