[dm-devel] Newbie device mapper questions
Johannes Bauer
dfnsonfsduifb at gmx.de
Tue Jun 16 18:54:33 UTC 2015
On 15.06.2015 21:52, Doug Dumitru wrote:
>> Sounds pretty easy and I also got surprisingly far with my little kernel
>> module. I've so far implemented ctr, dtr, map and status.
>
> Congratulations, you are actually a long way there.
Thanks, but I think the mountain is still ahead of me. Even so, I would
really like to figure out the nitty-gritty.
> You have to allocate a bio, populate it, allocate pages for buffer,
> populate the bvec, and call make_request (or generic make request). You
> will get the completion from the bio on the bottom half of the interrupt
> handler, so how much work you can do there is debatable. You cannot start
> a new IO from there, which you need to do. You will probably want to start a
> helper thread and have the completion routine schedule itself onto your
> thread. Once you are back on your thread, you can do just about anything.
>
> Because you need to do IO, you will not be able to do a simple bio "bounce
> redirect". You will need to do the IO yourself (i.e., call another make
> request), but you can use the caller's bvec for this, so there is no data
> copy required. Once the request completes, you can then finish the caller.
Oh, wow. This sounds truly terrifying. Let's dive in!
I tried to read your hints one word at a time. So here's the somewhat
pseudocodish solution to my homework:
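struct bio *b = bio_alloc(GFP_NOIO, 1);
b->bi_size = 8 * 512;   /* bi_size counts bytes: 8 sectors = 4096 */
bio_alloc_pages(b, GFP_NOIO);
b->bi_sector = 1234;
b->bi_bdev = lc->metadev->bdev;
b->bi_rw = READ;
b->bi_private = local_ctx;
b->bi_end_io = read_complete_callback;
generic_make_request(b);

static void read_complete_callback(struct bio *b, int error) {
    // ???
    /* bi_io_vec[0] is a struct bio_vec and bv_page is a struct page *,
       so the page has to be mapped before a byte can be read from it */
    u8 *data = kmap_atomic(b->bi_io_vec[0].bv_page);
    printk(KERN_INFO "First read byte: %02x\n",
           data[b->bi_io_vec[0].bv_offset]);
    kunmap_atomic(data);
}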
So I hope this is even remotely close to what I should end up with.
This will allocate a new bio with, as I understand it, one page buffer in
b->bi_io_vec. This buffer is then allocated with bio_alloc_pages to 8
sectors in size (i.e. exactly one page of 4096 bytes). Then the read
address, block device and read mode are set. I pass some kind of local
context so I can do something meaningful in the callback and specify the
callback function. Then I execute the request.
As I understand it, this executes asynchronously. So this is where
threading comes into play, right? Just pseudocode (because I can't judge how
far I'm off here), but let's say this is map():
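The size arithmetic in the paragraph above can be checked with a small userspace sketch. It is not kernel code: the names sectors_to_bytes and pages_for_bytes are made up for illustration, and it assumes 512-byte sectors and 4096-byte pages as stated in the text.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512u          /* bytes per sector, as used in the text */
#define ASSUMED_PAGE_SIZE 4096u   /* assumption: 4 KiB pages */

/* Convert a sector count into a byte count. */
static uint64_t sectors_to_bytes(uint64_t sectors)
{
    /* guard against overflow of the multiplication */
    assert(sectors <= UINT64_MAX / SECTOR_SIZE);
    return sectors * SECTOR_SIZE;
}

/* Number of whole pages needed to hold `bytes` bytes, rounding up. */
static uint64_t pages_for_bytes(uint64_t bytes)
{
    return (bytes + ASSUMED_PAGE_SIZE - 1) / ASSUMED_PAGE_SIZE;
}
```

Under these assumptions, pages_for_bytes(sectors_to_bytes(8)) yields a single page, which matches the one-entry bio described above.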
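static void read_complete_callback(struct bio *b, int error) {
    // local_ctx is recovered from b->bi_private
    up(&local_ctx->semaphore);      // wake up the waiter in map()
}
void map() {
    sema_init(&local_ctx->semaphore, 0);
    // Issue read as above
    generic_make_request(b);
    down(&local_ctx->semaphore);    // sleep until the callback has run
    // Now the concurrent async IO has finished and we interpret the data
    [...]
}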
Oh boy I really don't know if this is even remotely close. Any hints, as
easy as they may seem to you guys, are really greatly appreciated. I've
never worked with this stuff.
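For comparison, the alternative from the quoted advice (the completion routine does almost nothing and hands the request to a helper that is allowed to start new IO) can be simulated in plain userspace C. This is only a single-threaded model of the hand-off, not kernel code: struct request_ctx, submit_io, read_complete and worker_drain are all hypothetical names, and the in_completion flag merely stands in for the restriction that new IO must not be started from the completion path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 16

/* Hypothetical per-request context, standing in for bi_private. */
struct request_ctx {
    int id;
    bool data_ready;      /* set by the completion callback */
    bool followup_sent;   /* set by the worker */
};

static struct request_ctx *pending[MAX_PENDING];
static size_t pending_count;
static bool in_completion;    /* true while a completion callback runs */
static int ios_submitted;

/* Stand-in for submitting IO; must never run inside a completion. */
static void submit_io(struct request_ctx *ctx)
{
    assert(!in_completion);
    (void)ctx;
    ios_submitted++;
}

/* Completion callback: record the result and queue the context, nothing more. */
static void read_complete(struct request_ctx *ctx)
{
    in_completion = true;
    ctx->data_ready = true;
    assert(pending_count < MAX_PENDING);
    pending[pending_count++] = ctx;
    in_completion = false;
}

/* Worker: runs later, outside the completion, and may start new IO. */
static size_t worker_drain(void)
{
    size_t handled = 0;
    for (size_t i = 0; i < pending_count; i++) {
        struct request_ctx *ctx = pending[i];
        if (ctx->data_ready) {
            submit_io(ctx);           /* follow-up IO is allowed here */
            ctx->followup_sent = true;
            handled++;
        }
    }
    pending_count = 0;
    return handled;
}
```

The point of the split is that map() never has to sleep: the callback only queues, and everything that needs to issue further IO happens in worker_drain. In a real module the queue would need locking and the worker would be an actual thread, both of which this sketch leaves out.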
> If you cannot continue because devices are not present or the right size,
> yes you should fail the ctr routine.
Alright!
> If you want to setup /proc or other monitoring stuff, you can use the init
> routine, probably plus some statics, to setup "views" into your module. If
> you want to support multiple instances (and you should), setup a
> /proc/{yourname} directory on the init and then populate it with
> sub-directories every time you create a device.
Okay, I'll try to do this (want to make statistics available via procfs
later on), but one construction site at a time for me.
>> - Can I determine the size the bio in map() will have already in ctr()
>> somehow? Can I assume it will never change if it was once determined?
>> The reason is that for my example I need to make sure the chunk size is
>> an integer multiple of the bio size and I would only like to check this
>> once (in ctr) and not every time (in map).
>
> Block size will not change. The size of requests to you is limited by the
> setup of ti->max_io_len. If you don't set this with recent kernels, you
> will only get 4K, which is not all that efficient. This is actually part
> of another big topic of "stacked limits", which someone could write a book
> on (and I would read it).
So if I wanted to do a large I/O operation (say, write one megabyte
of data to a block device somewhere within my driver), I'd have to make
lots of calls to generic_make_request?
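How many calls that would be depends only on how large a single request may be. A userspace sketch of the splitting arithmetic, assuming 512-byte sectors and a per-request byte limit passed in by the caller (struct chunk and split_io are hypothetical names, not kernel API):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512u

struct chunk {
    uint64_t sector;   /* starting sector of this piece */
    uint32_t bytes;    /* length of this piece in bytes */
};

/*
 * Split an IO of total_bytes starting at start_sector into pieces of at
 * most max_bytes each. Stores up to `cap` pieces in `out` and returns the
 * total number of pieces required (which may exceed cap).
 */
static size_t split_io(uint64_t start_sector, uint64_t total_bytes,
                       uint32_t max_bytes, struct chunk *out, size_t cap)
{
    assert(max_bytes > 0 && max_bytes % SECTOR_SIZE == 0);
    assert(total_bytes % SECTOR_SIZE == 0);

    size_t n = 0;
    uint64_t done = 0;
    while (done < total_bytes) {
        uint64_t left = total_bytes - done;
        uint32_t len = left < max_bytes ? (uint32_t)left : max_bytes;
        if (n < cap) {
            out[n].sector = start_sector + done / SECTOR_SIZE;
            out[n].bytes = len;
        }
        n++;
        done += len;
    }
    return n;
}
```

With a 4096-byte limit, one megabyte (1048576 bytes) splits into 256 pieces; with a limit of one megabyte it is a single piece. Which limit actually applies inside the driver is exactly the open question here.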
Thank you so much for your help,
Best regards,
Johannes