[dm-devel] Using device-mapper with many targets

Tom Parker palfrey at tevp.net
Tue Jul 31 21:21:20 UTC 2007


Summary: ~600 RAID 1 targets (or ~50 RAID 5 targets) trigger an OOM on a 
machine with 320MB of RAM, and I don't know why.

First off, thanks to everyone for their hard work on this lovely piece 
of software. It's very nice to be able to mess around with things at 
this level without having to delve into kernel space, and to be able to 
manipulate everything from the command line and through libraries :-)

I've been trying some experiments recently, mainly as a precursor to 
some ideas I had, and I've run into an issue with device-mapper. 
Specifically, when I create *lots* of targets, the kernel has a 
tendency to head into OOM territory. I've been using a 2.6.22 kernel 
(with Ubuntu's basic patches, but I don't think those are the cause) in 
a VMware virtual machine with 320MB of RAM. It's a default text-mode 
Ubuntu 7.04 setup, probably with a few more things loaded than it 
really needs, but not that many.

With RAID 1 targets, I can get up to about 600 targets before the whole 
thing locks hard (dmsetup itself gets OOM-killed, and I can't get back 
to the shell or to any of the other virtual terminals). While scaling 
up to that point, I find that each RAID 1 target costs device-mapper 
roughly 0.5MB of RAM, which would account for the lockup: 600 targets 
at ~0.5MB apiece is ~300MB, essentially all of the machine's 320MB. The 
targets themselves are currently very small (1MB, I think), as this was 
a test of many targets rather than of large targets.

I was also interested in the RAID 5 patch, so I'd earlier tried a kernel 
with the 20070720 version of the patch (marked as 2.6.22-rc7, but it 
applied cleanly against the aforementioned 2.6.22 kernel, and it was the 
latest version available when I downloaded it). That also OOMs, but at 
only ~50 targets. In this case there's a substantial delay between 
telling dmsetup to load the table and the OOM (the post-mortem dmesg 
puts it at ~50 seconds), and the shell stays responsive, so I was able 
to capture a dmesg log (attached).
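
For anyone reproducing this, sampling MemFree in a second shell should 
show where that ~50 seconds goes (assuming a stock /proc; this is just 
a convenience for watching the memory drain, not part of the test 
itself):

    # print free memory once a second while the table is loading
    while true; do grep MemFree /proc/meminfo; sleep 1; done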

I've also attached the genlist.py that I hacked together for generating 
many targets, as well as the 100-target "test" table file I started with 
for RAID 5 before narrowing the problem down to ~50. genlist.py should 
be fairly self-explanatory (the command line for the example test file 
was "python genlist.py -l 100 sda sdb sdc").

Any thoughts on what's causing this, and anything that can be done to 
reduce or remove the problem? I'm guessing this isn't a typical target 
configuration, as most people with 50 RAID 5 targets probably have GBs 
of RAM...

I can reproduce all of this easily, so if there are any further 
questions about my configuration, or recommendations for things to 
upgrade, I'd love to hear them!

Thanks,

Tom Parker
-------------- next part --------------
Attachments (scrubbed by the archive; plain text, charset unspecified):

  test:       <http://listman.redhat.com/archives/dm-devel/attachments/20070731/125a1646/attachment.ksh>
  dmesg-log:  <http://listman.redhat.com/archives/dm-devel/attachments/20070731/125a1646/attachment-0001.ksh>
  genlist.py: <http://listman.redhat.com/archives/dm-devel/attachments/20070731/125a1646/attachment-0002.ksh>

