
Software RAID install on previously used mdadm disks #107

Open
olivierlambert opened this issue Dec 5, 2018 · 20 comments
@olivierlambert (Member)
IIRC, we are already using mdadm --zero-superblock /dev/sdX to clean all the selected disks from previous mdadm superblocks.

However, if mdadm was used at the partition level (e.g. sda2), our command won't clean it, and the install will fail.

Ideally, we should loop over every partition and remove the superblock. Maybe there is a better way (superblock detection?) to find and remove superblocks only where one was actually stored.
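A rough sketch of that loop (hypothetical: `mdadm` must be present, the disk list would come from the installer's selection, and `list_disk_devs`/`wipe_md_superblocks` are illustrative helper names, not installer code):

```shell
# list_disk_devs DISK: read /proc/partitions-style input on stdin and print
# the whole disk plus each of its partitions (e.g. sda sda1 sda2).
list_disk_devs() {
    grep -o "${1}[0-9]*" | sort -u
}

# wipe_md_superblocks DISK...: zero the md superblock on each given disk
# and on every one of its partitions, but only where one is detected.
wipe_md_superblocks() {
    for disk in "$@"; do
        for dev in $(list_disk_devs "$disk" < /proc/partitions); do
            # mdadm -E exits non-zero when no md superblock is present,
            # so only devices that really carry one get touched
            if mdadm --examine "/dev/$dev" >/dev/null 2>&1; then
                echo "Zeroing md superblock on /dev/$dev"
                mdadm --zero-superblock "/dev/$dev"
            fi
        done
    done
}

# e.g. (destructive): wipe_md_superblocks sda sdb
```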

@randadinata

If we already have user consent for destroying data, can't we just nuke the first and last 2 MiB with dd, followed by partprobe? 🤣 Everything in between doesn't matter anymore.
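A sketch of that idea, hedged: `wipe_ends` is an illustrative helper, not something from the installer; 2 MiB is 4096 sectors of 512 bytes, and on a real disk the sector count would come from `blockdev --getsz`, with `partprobe` run afterwards.

```shell
# wipe_ends FILE SECTORS: zero the first and last 2 MiB (4096 x 512-byte
# sectors) of FILE. conv=notrunc keeps a regular file's size unchanged
# (irrelevant for block devices, but lets the helper be tested on a file).
wipe_ends() {
    dev=$1
    sectors=$2
    dd if=/dev/zero of="$dev" bs=512 count=4096 conv=notrunc 2>/dev/null
    dd if=/dev/zero of="$dev" bs=512 seek=$((sectors - 4096)) count=4096 conv=notrunc 2>/dev/null
}

# On a real disk it would be used as (destructive!):
#   wipe_ends /dev/sdX "$(blockdev --getsz /dev/sdX)"
#   partprobe /dev/sdX
```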

@olivierlambert (Member, Author)

That's an option (but same idea: it has to run on each partition), so it doesn't simplify the equation much (we still need to loop over each partition).

@oallart commented Dec 11, 2018

We have a similar approach, a script that nukes the md's and other bits.
It can be passed as <script stage="installation-start" type="url"> from a remote server on an unattended install

The script runs

mdadm --zero-superblock
wipefs --all
sgdisk -Z

on all drives and/or partitions
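For reference, the answer-file hook looks like this fragment (assuming the stage/type attributes quoted above; the URL is a placeholder):

```
<script stage="installation-start" type="url">http://pxe-server.example/wipe.sh</script>
```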

@olivierlambert (Member, Author)

Can you describe each step in more detail (and in which order)? Then maybe we can do that instead of just zeroing the whole disk (and missing partitions).

@oallart commented Dec 11, 2018

Yes, I'm working on refining that right now; it's not quite world-ready yet. It works well via PXE on a rescue boot, but not quite as an integrated step with the XS answerfile method.

Basically it does

  1. identify, activate and destroy LVM
  2. identify, activate and destroy md's
  3. wipefs (erase filesystem, raid or partition-table signatures)
  4. zap the GPT and MBR data structures
  5. dd zero some parts just in case

Some of these steps are probably redundant, but they work well. We use the script to zero drives for reinstall much faster than DBAN or a full dd zero can.
I'll get something a bit cleaner and will share.
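Step 1 of that list isn't covered by the mdadm/wipefs/sgdisk commands alone. A hedged sketch of the LVM part (`lvm_teardown` is an illustrative name, not part of the script above; the `-ff`/`-y` flags force removal without prompting):

```shell
# Deactivate and destroy all LVM state before wiping the disks.
lvm_teardown() {
    vgchange -an                          # deactivate every volume group
    for vg in $(vgs --noheadings -o vg_name); do
        vgremove -ff "$vg"                # drop the VG and all its LVs
    done
    for pv in $(pvs --noheadings -o pv_name); do
        pvremove -ff -y "$pv"             # clear the LVM2 label on the PV
    done
}
```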

@oallart commented Dec 12, 2018

OK, so here's something I tested a bit; it does work when supplied from an answer file as <script stage="installation-start" type="url">

Still a bit crude, but it works well.
Output is redirected to /tmp/prescript.log.
It also includes a bit that prevents the package installation delay caused by the md resync.

#!/bin/bash
# O. Allart - 2018/12
# To be executed at the very first stage of a fresh XenServer install:
# - dban-lite style wipe
# - disable md resync
{
# identify partitions and md devices,
# map partitions to md devices
echo "md devices found:"
if ! grep '^md' /proc/mdstat; then
	echo "No software RAID md device found in /proc/mdstat, no MD to destroy"
else
	for DEVICE in $(sed -n 's/\(md[0-9]\+\).*\(sd[a-f][1-9]\?\).*\(sd[a-f][1-9]\?\).*/\1:\2:\3/p' /proc/mdstat); do
		# extract the md device and its two member devices
		MD=$(echo "$DEVICE" | cut -d: -f1)
		DEV1=$(echo "$DEVICE" | cut -d: -f2)
		DEV2=$(echo "$DEVICE" | cut -d: -f3)

		# test these are valid (PIPESTATUS[0] checks mdadm, not head)
		mdadm --detail "/dev/$MD" | head -5
		if [[ ${PIPESTATUS[0]} -ne 0 ]]; then
			echo "Reported device /dev/$MD invalid"
			exit 6
		fi

		mdadm -E "/dev/$DEV1" | head -10
		if [[ ${PIPESTATUS[0]} -ne 0 ]]; then
			echo "Reported partition /dev/$DEV1 invalid"
			exit 7
		fi
		mdadm -E "/dev/$DEV2" | head -10
		if [[ ${PIPESTATUS[0]} -ne 0 ]]; then
			echo "Reported partition /dev/$DEV2 invalid"
			exit 7
		fi

		echo "Stopping device"
		if ! mdadm --stop "/dev/$MD"; then echo "Device $MD could not be stopped"; exit 8; fi

		echo "Zeroing superblock on /dev/$DEV1"
		if ! mdadm --zero-superblock "/dev/$DEV1"; then
			# retry once before giving up
			if ! mdadm --zero-superblock "/dev/$DEV1"; then
				echo "CRITICAL: Partition /dev/$DEV1 could not be zeroed - drive is NOT ready for reuse"
				exit 9
			fi
		fi

		echo "Zeroing superblock on /dev/$DEV2"
		if ! mdadm --zero-superblock "/dev/$DEV2"; then
			if ! mdadm --zero-superblock "/dev/$DEV2"; then
				echo "CRITICAL: Partition /dev/$DEV2 could not be zeroed - drive is NOT ready for reuse"
				exit 9
			fi
		fi
		echo "-------------------------------------------------------------"

	done
fi

# Finishing touch: wipe FS signatures, zap partition tables.
for DRIVE in $(grep -o "sd[a-z]$" /proc/partitions); do
	echo "Finishing $DRIVE"
	wipefs --all "/dev/$DRIVE"
	sgdisk -Z "/dev/$DRIVE"
done

# delay resync to speed up install in RAID1 md configs
echo 0 > /proc/sys/dev/raid/speed_limit_max
echo 0 > /proc/sys/dev/raid/speed_limit_min
} > /tmp/prescript.log 2>&1

@olivierlambert (Member, Author)

Forwarding this info to @nraynaud, who did the software RAID work, for potential inclusion directly in the installer 👍

@gdelafond commented Dec 12, 2018

@olivierlambert @nraynaud if you include it in the installer, beware that disk names will not always match sd[a-z]$.
As far as I know, the Linux disk naming scheme is the following:

  • SATA/SAS: sd[a-z]+$
  • inside a VM (to make tests), usually: xvd[a-z]+$
  • NVMe: nvme[0-9]+n[0-9]+$ (the bare nvme[0-9]+ node is the controller, not the disk)

Some rules are defined in /lib/udev/rules.d/60-persistent-storage.rules

Maybe the information should be taken not from /proc/partitions but from something like: lsblk | awk '$6 == "disk" {print $1}'?
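A slightly more robust variant of that idea (a sketch, not installer code: `-d` skips partitions, `-n` drops the header, and `-o NAME,TYPE` pins the columns so the awk filter doesn't depend on lsblk's default layout):

```shell
# Enumerate whole disks by their TYPE field instead of guessing at name
# patterns, so sdX, xvdX and nvmeXnY are all covered.
list_disks() {
    lsblk -d -n -o NAME,TYPE | awk '$2 == "disk" {print $1}'
}
```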

@gdelafond

Instead of erasing all available drives, maybe the installer should ask which disks have to be erased, or only erase the disks that have been chosen for the XCP installation.

@olivierlambert (Member, Author)

Yes, this is already what we do (only the selected disks get their magic blocks zeroed). What we're missing is doing that on all partitions too.

@oallart commented Dec 12, 2018

Good points.
As said earlier, it is a bit crude and more specific to our use. But glad to see the ball rolling and hoping for the feature to be included someday. It's nice that xcp-ng has the tools available to perform the various tasks (sgdisk, wipefs etc.).
We work a lot with answer files (see my posts on upgrading too) so we can build the logic around drives in there. Until the feature is built in, there is an avenue for people to use the feature externally. Those script stage entries are incredibly useful.

@olivierlambert (Member, Author)

@oallart feel free to create a dedicated entry in the Wiki with a "how to", this could be useful for all XCP-ng users 👍

@oallart commented Dec 14, 2018

@olivierlambert yep I have already started and taken over some sections 😄

@gdelafond

5. dd zero some parts just in case

You can wipe all fs information with:

DISK=sda
# /sys/block/*/size is always in 512-byte sectors, whatever the physical sector size
LBAS=$(cat /sys/block/$DISK/size)
# zero the first 1024 sectors (MBR/GPT headers, superblocks at the start)
dd if=/dev/zero of=/dev/$DISK bs=512 count=1024
# zero the last 1024 sectors (backup GPT header, md 0.90/1.0 superblocks at the end)
dd if=/dev/zero of=/dev/$DISK bs=512 seek=$(($LBAS-1024)) count=1024

@nraynaud (Member) commented Dec 20, 2018

Hi all, I am working on the issue. The UI side of things is a bit complicated.

I worked with the installer yesterday.

  1. Here is what I have:
  • some RAID array devices (/dev/md127) can be hidden in the UI because they expose less than 46GB, but their underlying members could represent more than that and be recycled in a new configuration for XCP-ng.
  • if a RAID array exists but is hidden, modifying it will simply not happen; there is a guard in the code, but there is no user feedback.
  2. I am thinking of various UI solutions:
  • add a screen between "EULA" and "Select Primary disk" that would show everything (disks, partitions, RAIDs, and maybe LVM) and allow some destructive actions on those (delete RAIDs, partitions, boot bits, FS markers, RAID member markers). The workflow would then continue to the "Select Primary disk" screen.
  • or somehow show what has been filtered out of the "Select Primary disk" screen and allow interaction with it (I am still unclear on this).
  3. As for the partitions (e.g. /dev/sda2), should we keep them as they exist, or destroy them and use full disks all the time?

@olivierlambert (Member, Author)

IMHO, when the user selects their disks, the installer should destroy everything on them, with no other option. XCP-ng is a kind of "Xen appliance", not a "normal" Linux distro: partitioning is done by XCP-ng, not by the user.

@stormi (Member) commented Jan 9, 2019

For those willing to test the pull request or even help developing the feature, here's a guide that explains how to build a modified ISO image with a modified installer:
https://github.com/xcp-ng/xcp/wiki/Modifying-the-installer

@klou commented Jan 21, 2019

I'm not in a position to try this, but a few months ago we upgraded from XS 7.0 (RAID 1 on individual partitions) to XCP 7.5 (RAID 1 on whole disks), and I'm trying to figure out why my I/O is so bad.

Anyway, the dmesg output below may help as an additional example of what gets left behind by a 7.5 conversion.

[    3.011900] GPT:Primary header thinks Alt. header is not at the end of the disk.
[    3.011903] GPT:1465148799 != 1465149167
[    3.011905] GPT:Alternate GPT header not at the end of the disk.
[    3.011906] GPT:1465148799 != 1465149167
[    3.011907] GPT: Use GNU Parted to correct GPT errors.
[    3.011921]  sdb: sdb1 sdb2 sdb3 sdb4 sdb5 sdb6
[    3.012711] sd 2:0:0:0: [sdb] Attached SCSI disk
[    3.014863] GPT:Primary header thinks Alt. header is not at the end of the disk.
[    3.014865] GPT:1465148799 != 1465149167
[    3.014867] GPT:Alternate GPT header not at the end of the disk.
[    3.014868] GPT:1465148799 != 1465149167
[    3.014870] GPT: Use GNU Parted to correct GPT errors.
[    3.014882]  sda: sda1 sda2 sda3 sda4 sda5 sda6
[    3.015562] sd 0:0:0:0: [sda] Attached SCSI disk
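Those dmesg warnings mean the backup GPT header is stranded before the actual end of the disk. If the disks are to be kept rather than wiped, sgdisk can relocate the backup structures; a sketch (`gpt_fix` is an illustrative helper name, and this changes on-disk metadata, so verify with `-v` first):

```shell
# gpt_fix DEVICE: report GPT problems, then move the backup GPT data
# structures to the actual end of the disk.
gpt_fix() {
    sgdisk -v "$1"    # verify: report mismatched/misplaced GPT structures
    sgdisk -e "$1"    # relocate backup data structures to the end of the disk
}

# e.g. gpt_fix /dev/sdb
```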

@ydirson (Contributor) commented Dec 8, 2022

Let's describe the problem differently: we're installing an appliance, not a general-purpose OS, so we should not care at all about whatever RAID/LVM setup was on the disks we're going to overwrite anyway. The problem is that when booting the ISO, some udev rules react to the presence of software-RAID signatures on some disks/partitions and assemble them, which is exactly what we don't want. In fact, that udev rules file from CentOS (/lib/udev/rules.d/65-md-incremental.rules) already has a special case to neutralize it when the Anaconda installer is running.

So we're left with a few actions to take:

  • inform udev that an installer is running, so it won't auto-assemble RAID arrays
  • clear the partition table on the disks selected for assembling a new RAID, for good measure
  • add special support to detect a previous installation of XCP-ng on RAID, since this is the one case where we may want to activate a preexisting RAID ("may", because we still don't want to activate it if we're going to overwrite the disks with a new install)
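For reference, the Anaconda special case mentioned above works by bailing out of the incremental-assembly rules early; the relevant line in /lib/udev/rules.d/65-md-incremental.rules looks roughly like this (paraphrased from memory; check the file shipped on a CentOS system):

```
# Skip md incremental assembly while the Anaconda installer is running
ENV{ANACONDA}=="?*", GOTO="md_end"
```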

ydirson assigned ydirson and unassigned nraynaud on Dec 13, 2022
@ydirson (Contributor) commented Dec 13, 2022

A test image is now available here. Please let us know if it works for you!
It is based on the 8.3-alpha2 install image, with installer changes detailed here.

8 participants