[dm-devel] [mdadm PATCH 4/4] Create: tell udev device is not ready when first created.

Wed May 3 14:27:22 UTC 2017

On 05/02/2017 03:40 PM, Jes Sorensen wrote:
> On 05/02/2017 07:40 AM, Peter Rajnoha wrote:
>> On 05/01/2017 06:35 AM, NeilBrown wrote:
>>> On Fri, Apr 28 2017, Peter Rajnoha wrote:
>>>> Then mdadm opens the device, clears any old content/signatures the data
>>>> area may contain, then closes it - this generates the third event -
>>>> which is the "synthetic change" event (as a result of the inotify WATCH
>>>> rule). And this one would drop the "not initialized" flag in udev
>>>> db, and the scans in udev are now enabled.
>>>
>>> Nope, it would be incorrect for mdadm to clear any old content.
>>> Sometimes people want to ignore old content.  Sometimes they very
>>> definitely want to use it.  It would be wrong for any code to try to
>>> guess what is wanted.
>>
>> mdadm is not going to guess - it can have an option to enable/disable
>> the wiping on demand directly on the command line (which is also what
>> is actually done in LVM).
> 
> I know the anaconda team keeps pushing for the nonsense of being able to
> wipe drives on creation. It is wrong, it is broken, and it is not going
> to happen.
> 

I'm not thinking about anaconda at the moment at all. It's just one of
the many users of mdadm. I'm thinking about a fix in general for all the
users which expect the device to be initialized properly when mdadm
--create returns.

>> Otherwise, if mdadm is not going to wipe/initialize the device itself,
>> then it needs to wait for the external tool to do it (based on external
>> configuration or on some manual wipefs-like call). So the sequence
>> would be:
>>
>>  1) mdadm creating the device
>>  2) mdadm setting up the device, marking it as "not initialized yet"
>>  4a) mdadm waiting for the external decision to be made about wiping part
>>  4b) external tool doing the wiping (or not) based on configuration or
>> user's will
>>  5) mdadm continuing and finishing when the wiping part is complete
>>
>> I expect mdadm to return only if the device is properly initialized;
>> otherwise it's harder for other tools called after mdadm to start
>> working with the device - they need to poll for the state laboriously
>> and check for readiness.
> 
> What defines readiness? Some believe a raid1 array must be fully
> assembled with all members, other setups are happy to have one running
> drive in place.....
> 

By "ready" I mean the point at which it's safe to do a scan without
seeing old data (garbage) that may confuse udev hooks and udev event
listeners. That scan happens when the activating "change" uevent arrives
and the rule ATTR{md/array_state}=="|clear|inactive" no longer matches
(while it still matches, the device is not scanned).
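
To make that condition concrete, here is a minimal sketch in Python of
the same test the rule expresses. The helper name scan_allowed() is
made up for illustration; only the three "not ready" values come from
the rule quoted above.

```python
# Hypothetical helper: mirrors the udev match
#   ATTR{md/array_state}=="|clear|inactive"
# which matches an empty value, "clear", or "inactive".
NOT_READY_STATES = {"", "clear", "inactive"}


def scan_allowed(array_state: str) -> bool:
    """Return True once the state no longer matches the 'not ready' rule."""
    # sysfs attribute reads end with a newline, so strip before comparing.
    return array_state.strip() not in NOT_READY_STATES


if __name__ == "__main__":
    for state in ("", "clear", "inactive", "clean", "active"):
        print(repr(state), "->", scan_allowed(state))
```

The string passed in would normally be the content of
/sys/block/<mdX>/md/array_state; reading it is left out here to keep
the sketch independent of a real array.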

> 4a/4b in your list here once again has no place in mdadm. Please kindly
> tell the anaconda team to go back and handle this properly instead.

mdadm is creating the device, so it should be primarily responsible for
providing a device which is cleared and ready for use without causing
any confusion on an event-based system where various scans are executed
based on incoming udev events.

Alternatively, if mdadm is not going to be the place where the wiping
happens, I'd expect at least the sequence above (which is more complex,
yes, and that's why I think having the wiping directly in mdadm is a
much easier solution).

If you don't wipe the data and you don't give others time to hook in
and do that, you make it harder for them (they need to deactivate all
the stack/garbage that is found in the data area after previous use).

Also, we can't reliably call wiping on the underlying components first,
because once they become MD components, the data area for the MD device
has an offset, and a new data range is revealed from the underlying
devices. That range may expose old signatures which were not visible on
those underlying devices before the MD device got created.
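
A small worked example of the offset problem, as a sketch in Python. It
assumes the simplest case, a mirror where byte N of the MD device maps
to byte data_offset + N of each component. The numbers (a data offset
of 2048 sectors and a signature probed at byte 1024) are illustrative
assumptions only, not values taken from any particular array.

```python
SECTOR = 512  # bytes per sector


def component_offset(md_offset: int, data_offset_sectors: int) -> int:
    """Byte on a mirror component that backs byte md_offset of the MD device."""
    return data_offset_sectors * SECTOR + md_offset


# Illustrative assumptions, not values from a real array.
DATA_OFFSET_SECTORS = 2048   # where the MD data area starts on the component
PROBE_OFFSET = 1024          # fixed offset at which a scanner looks for a signature

# Scanning the bare component checks component byte PROBE_OFFSET.
seen_on_component = PROBE_OFFSET

# Scanning the MD device checks MD byte PROBE_OFFSET, which is a
# different component byte, shifted by the data offset.
seen_via_md = component_offset(PROBE_OFFSET, DATA_OFFSET_SECTORS)

if __name__ == "__main__":
    print("component scan looks at byte", seen_on_component)
    print("MD scan looks at component byte", seen_via_md)
```

A stale signature sitting at the second location is invisible to a wipe
run on the component beforehand, yet becomes visible as soon as the MD
device exists.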

-- 
Peter