Reinstall and keep data LV

Greg Oliver oliver.greg at gmail.com
Fri Dec 31 00:24:34 UTC 2021


On Wed, Sep 1, 2021 at 1:32 PM David Lehman <dlehman at redhat.com> wrote:

> On Tue, Aug 31, 2021 at 4:37 PM Markus Falb <wnefal at gmail.com> wrote:
>
>>
>>
>> > On 30.08.2021, at 23:26, Brian C. Lane <bcl at redhat.com> wrote:
>> >
>> > On Thu, Aug 26, 2021 at 06:11:49PM +0200, Markus Falb wrote:
>> >
>> >> The solution is to activate the LVs in %pre.
>> >> It turns out that /dev/sda is present, but not the device files
>> >> for /dev/sdaX.
>> >>
>> >> …snip
>> >> %pre
>> >> mknod /dev/sda2 b 8 2
>> >> pvscan
>> >> vgchange -ay
>> >> %end
>> >> snap…
>> >>
>> >> alternatively this oneliner is working too, interestingly
>> >>
>> >> …snip
>> >> %pre
>> >> parted /dev/sda unit MiB print
>> >> %end
>> >> snip…
>> >>
>> >> Note that with the parted command it is not necessary to run vgchange
>> >> afterwards.
>> >>
>> >> Is there a builtin kickstart command that accomplishes the same
>> instead of some %pre?
>> >> If not, why is %pre necessary? %pre was not necessary with RHEL7. Is
>> this by design or is it a bug?
>> >
>> > This is a bug of some sort, as David said. The fact that parted print
>> > fixed it makes me think that your storage is slow, since all parted does
>> > is open/close the device and tell the kernel to rescan the partitions --
>> > which should have already happened at boot time when the device
>> > appeared.
>>
>> I am testing with a KVM VM created with Virtual Machine Manager on
>> CentOS 7. The VM has a SCSI disk (changing to IDE or SATA does not
>> change the behaviour).
>>
>> I remember trying “udevadm settle” in %pre; it returned quickly, which
>> is why I thought it was not waiting on some slow udev event.
>>
>> I had another look.
>> I added a sleep 600 and removed the parted command from %pre (600s
>> should be plenty of time for detection).
>>
>> Here is my interpretation:
>>
>> The kernel *did* detect the partitions in early initramfs
>>
>>>> Aug 31 14:19:50 localhost kernel:  sda: sda1 sda2
>> Aug 31 14:19:50 localhost kernel: sd 0:0:0:0: [sda] Attached SCSI disk
>>>>
>> If I add rd.break kernel parameter I can see that the devices are there.
>> But after switching root (pivoting) they are gone. I do not know if this
>> is expected or not.
>>
>
> The LVs being gone is expected, but the partitions being gone is not
> expected.
>
>
>>
>> So while %pre is running, sda1 and sda2 are not present, given that I
>> did not trigger udev with parted or similar.
>>
>> After %pre finishes, it detects sda1 and sda2 again and finds the VG
>> and the LVs, but then it stops the VG (which is what I find strange)
>> and throws the error
>
>
> Stopping the VG is a normal part of the process. The installer makes a
> model of the storage and then deactivates it until it is needed. The
> partitions should still be visible, however.
>
>
>>
>>
>>>> initramfs
>>>> Aug 31 14:19:50 localhost kernel:  sda: sda1 sda2
>> Aug 31 14:19:50 localhost kernel: sd 0:0:0:0: [sda] Attached SCSI disk
>>>> pivot root filesystem
>>>> running pre (10 minutes sleep)
>>>> Aug 31 14:30:14 test kernel:  sda: sda1 sda2
>>>> Aug 31 14:30:15 test org.fedoraproject.Anaconda.Modules.Storage[1903]:
>> DEBUG:blivet:                 DeviceTree.handle_device: name: lvm.vg1-root
>> ; info: {'DEVLINKS':
>> '/dev/disk/by-uuid/9c60e33e-03e0-42c4-a583-868f4fd1b2b4 '
>>>> Aug 31 14:30:15 test org.fedoraproject.Anaconda.Modules.Storage[1903]:
>> INFO:program:Running [36] lvm vgchange -an lvm.vg1 --config= devices {
>> preferred_names=["^/dev/mapper/", "^/dev/md/", "^/dev/sd"] } log {level=7
>> file=/tmp/lvm.log syslog=0} …
>>>> Aug 31 14:30:21 test org.fedoraproject.Anaconda.Modules.Storage[1903]:
>> Logical volume "root" given in logvol command does not exist.
>>>>
>> If someone is interested, I created a gist with the kickstarts and logs at
>> https://gist.github.com/610acf7379f48d0e5c38f4edb9cda176
>> (you can clone it with git)
>>
>> I found no obvious error, but there is a lot of stuff and I could have
>> missed something easily.
>>
>
> I see in storage.log that it successfully looks up (get_device_by_name)
> the lvm.vg1-root LV in its model shortly before the error occurs, which is
> strange. I also do not see any failing calls to get_device_by_name for
> the root LV once the initial scan has completed.
>
>
>>
>> Given that anaconda sees the LVs, do you still think that it is a
>> kernel problem, or that the storage is too slow?
>>
>> Best Regards, Markus
>> and thanks to all who took the time to answer.
>
>
I know this is very old, but I am just getting back to reading list emails,
and compared to other lists, this one does not have much traffic - which is
good, because it means kickstart is doing what it is supposed to.

To the OP: I had this same issue when using %pre not only to create arrays
on storage controllers, but then to try to read them. The only solution I
ever found that always worked was "partprobe" sprinkled through my
scripts. The good thing about partprobe is that it blocks and does not
return until the kernel says it is OK. In my case, creating the arrays
manually through the BMC, etc. would also fix it, but who wants to click
all of those buttons in the slow-ass BMC GUIs..

Anyhow, putting a partprobe before every new disk call fixed it for me,
and all device nodes were properly populated by udev afterward.
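
For anyone hitting this later, a minimal sketch of that pattern (untested
here; it assumes a single disk at /dev/sda and an LVM layout like the
OP's, so adjust device names to your setup):

%pre
# Ask the kernel to re-read the partition table on /dev/sda;
# partprobe does not return until the kernel has accepted it.
partprobe /dev/sda
# Wait for udev to finish creating /dev/sda1, /dev/sda2, ...
udevadm settle
# With the partition nodes in place, LVM can find its PVs and
# activate the volume group so the installer sees the existing LVs.
pvscan
vgchange -ay
%end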

-Greg