[dm-devel] mirrored device with thousands of mapping table entries
Mike Snitzer
snitzer at redhat.com
Mon Mar 7 20:10:27 UTC 2011
On Sun, Mar 06 2011 at 9:59pm -0500,
Martin K. Petersen <martin.petersen at oracle.com> wrote:
> >>>>> "Zdenek" == Zdenek Kabelac <zkabelac at redhat.com> writes:
>
> Zdenek> My finding seems to show that BIP-256 slabtop segment grows by
> Zdenek> ~73KB per each device (while dm-io is about ~26KB)
>
> Ok, I see it now that I tried with a bunch of DM devices.
>
> DM allocates a bioset per volume. And since each bioset has an integrity
> mempool you'll end up with a bunch of memory locked down. It seems like
> a lot but it's actually the same amount as we reserve for the data path
> (bio-0 + biovec-256).
>
> Since a bioset is not necessarily tied to a single block device we can't
> automatically decide whether to allocate the integrity pool or not. In
> the DM case, however, we just set up the integrity profile so the
> information is available.
>
> Can you please try the following patch? This will change things so we
> only attach an integrity pool to the bioset if the logical volume is
> integrity-capable.
Hey Martin,
I just took the opportunity to review DM's blk_integrity code a bit more
closely -- with an eye towards stacking devices. I found an issue that
I think we need to fix that has to do with a DM device's limits being
established during do_resume() and not during table_load().
Unfortunately, a DM device's blk_integrity gets preallocated during
table_load(). dm_table_prealloc_integrity()'s call to
blk_integrity_register() establishes the blk_integrity's block_size.
But a DM device's queue_limits aren't stacked until a DM device is
resumed -- via dm_calculate_queue_limits().
For some background please see the patch header of this commit:
http://git.kernel.org/linus/754c5fc7ebb417
The final blk_integrity for the DM device isn't fully established until
do_resume()'s eventual call to dm_table_set_integrity() -- by passing a
template to blk_integrity_register(). dm_table_set_integrity() does
validate the 'block_size' of each DM device's blk_integrity to make sure
they all match. So the code would catch the inconsistency should it
arise.
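
To make the ordering concrete, here is a minimal userspace sketch of the two stages described above. It is a model, not kernel code: every struct and function named model_* is hypothetical, and it only shows how a block size captured at table_load() time can go stale once the stacked limits are applied at resume. It does not reproduce the comparison done in dm_table_set_integrity().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these are kernel types. */
struct model_limits {
    unsigned int logical_block_size;
};

struct model_integrity {
    unsigned int block_size;
    bool registered;
};

struct model_dm_dev {
    struct model_limits limits;       /* limits currently on the queue */
    struct model_integrity integrity; /* preallocated integrity profile */
};

/* table_load() stage: preallocate, capturing the block size seen right now. */
static void model_prealloc_integrity(struct model_dm_dev *md)
{
    md->integrity.block_size = md->limits.logical_block_size;
    md->integrity.registered = true;
}

/* do_resume() stage: the stacked limits only reach the queue here. */
static void model_apply_stacked_limits(struct model_dm_dev *md,
                                       unsigned int stacked_lbs)
{
    md->limits.logical_block_size = stacked_lbs;
}

/* True if the preallocated block size no longer matches the queue limits. */
static bool model_integrity_is_stale(const struct model_dm_dev *md)
{
    assert(md->integrity.registered);
    return md->integrity.block_size != md->limits.logical_block_size;
}

/* Run load then resume; report whether the integrity block size went stale. */
static bool model_load_then_resume(unsigned int lbs_at_load,
                                   unsigned int lbs_at_resume)
{
    struct model_dm_dev md = { .limits = { lbs_at_load } };

    model_prealloc_integrity(&md);
    model_apply_stacked_limits(&md, lbs_at_resume);
    return model_integrity_is_stale(&md);
}
```

In this model, model_load_then_resume(512, 4096) reports a stale value, while model_load_then_resume(512, 512) does not.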
All I'm saying is: it's possible for a table_load() to be unaware that
a newly added device's queue_limits will cause the DM device's final
queue_limits to be increased (say a 4K device was added to dm_device2,
and dm_device2 is now being added to another DM device, dm_device1).
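
The nested case can be sketched as follows, assuming the stacked logical block size of a device is the largest logical block size among the devices beneath it. The model_dev structure, the helper names, and the 512-byte disk are illustrative assumptions; only the dm_device1/dm_device2 naming and the 4K device come from the example above.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_CHILDREN 4

/* Hypothetical model of a device in a stack; not a kernel structure. */
struct model_dev {
    unsigned int own_lbs; /* logical block size of a leaf; 0 for a DM device */
    const struct model_dev *children[MODEL_MAX_CHILDREN];
    size_t nr_children;
};

/* Stacked logical block size: the largest value found anywhere below dev. */
static unsigned int model_stacked_lbs(const struct model_dev *dev)
{
    unsigned int lbs = dev->own_lbs;

    assert(dev->nr_children <= MODEL_MAX_CHILDREN);
    for (size_t i = 0; i < dev->nr_children; i++) {
        unsigned int child = model_stacked_lbs(dev->children[i]);

        if (child > lbs)
            lbs = child;
    }
    return lbs;
}

/* dm_device1 sits on dm_device2; optionally add a 4K disk under dm_device2. */
static unsigned int model_nested_example(int add_4k_disk)
{
    struct model_dev disk_512 = { .own_lbs = 512 };
    struct model_dev disk_4k = { .own_lbs = 4096 };
    struct model_dev dm_device2 = { .children = { &disk_512 }, .nr_children = 1 };
    struct model_dev dm_device1 = { .children = { &dm_device2 }, .nr_children = 1 };

    if (add_4k_disk) {
        dm_device2.children[1] = &disk_4k;
        dm_device2.nr_children = 2;
    }
    return model_stacked_lbs(&dm_device1);
}
```

The point of the sketch is that dm_device1's own table does not change when the 4K disk is added, yet its stacked value does, because the change arrives through dm_device2.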
So it seems we need to establish bi->sector_size during the final stage
of blk_integrity_register(), e.g. when a template is passed. Not sure
if you'd agree with that change in general but it'll work for DM because
the queue_limits are established before dm_table_set_integrity() is called.
Maybe revalidate/change the 'block_size' during the final stage in case
it changed?
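
A rough userspace sketch of that idea, to show the intended behaviour rather than an actual patch: model_integrity_register() below is a hypothetical stand-in for blk_integrity_register(), and the structs and the profile name are made up for illustration. The size is left unset on the preallocation call and is only established (or re-established) when a template is passed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's structures. */
struct model_integrity {
    unsigned int sector_size; /* 0 means "not established yet" */
    const char *name;
    int allocated;
};

struct model_disk {
    unsigned int logical_block_size; /* current queue limit */
    struct model_integrity integrity;
};

/*
 * tmpl == NULL: preallocation only (the table_load() stage).
 * tmpl != NULL: final stage; copy the profile and take sector_size from
 * the limits as they stand now, overriding any earlier value.
 */
static int model_integrity_register(struct model_disk *disk,
                                    const struct model_integrity *tmpl)
{
    assert(disk != NULL);

    if (!disk->integrity.allocated) {
        disk->integrity.allocated = 1;
        disk->integrity.sector_size = 0;
        disk->integrity.name = NULL;
    }

    if (tmpl != NULL) {
        disk->integrity.name = tmpl->name;
        disk->integrity.sector_size = disk->logical_block_size;
    }
    return 0;
}

/* Load, let the limits change at resume, then do the final registration. */
static unsigned int model_fixed_flow(unsigned int lbs_at_load,
                                     unsigned int lbs_at_resume)
{
    struct model_integrity tmpl = { .name = "hypothetical-profile" };
    struct model_disk disk = { .logical_block_size = lbs_at_load };

    model_integrity_register(&disk, NULL);   /* table_load() stage */
    disk.logical_block_size = lbs_at_resume; /* limits stacked at resume */
    model_integrity_register(&disk, &tmpl);  /* final stage with template */
    return disk.integrity.sector_size;
}
```

With this ordering, model_fixed_flow(512, 4096) ends up with a size of 4096, matching the limits in effect at the final stage.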
Thanks,
Mike