mdadm replace failed hard drive RAID1
If you need to replace a failed drive
I had a few hard drives fail in software RAID1 arrays and decided to write this article so that I don't have to search for the procedure the next time this happens.
Detect mdadm failed hard drive
A failing drive usually shows up as a lot of error messages in /var/log/messages, and you will probably also get a mail like this one from mdadm monitoring:
This is an automatically generated mail message from mdadm
running on <host>

A DegradedArray event had been detected on md device /dev/md3.

Faithfully yours, etc.

P.S. The /proc/mdstat file currently contains the following:

Personalities : [raid1]
md3 : active raid1 sdb4[1]
      1847608639 blocks super 1.2 [2/1] [_U]

md2 : active raid1 sdb3[1]
      1073740664 blocks super 1.2 [2/1] [_U]

md1 : active raid1 sdb2[1]
      524276 blocks super 1.2 [2/1] [_U]

md0 : active (auto-read-only) raid1 sdb1[1]
      8387572 blocks super 1.2 [2/1] [_U]

unused devices: <none>
The main thing to look for when you cat /proc/mdstat is the status field at the end of each array: you will see [_U] or [U_], depending on which hard drive has failed. If everything is OK you will see [UU].
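The check described above can be scripted. The following is only a minimal sketch: the function name degraded_arrays is made up for this example, and it assumes the usual mdstat layout in which the status field ([UU], [_U], ...) sits on the line after the array name.

```shell
#!/bin/sh
# Print the name of every md array whose status field contains "_",
# i.e. an array with a missing or failed member.
# Reads mdstat-formatted text from the file in $1 (default: /proc/mdstat).
# degraded_arrays is a hypothetical helper name, not part of mdadm.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }          # remember the current array name
        /\[[U_]+\]/ && name != "" {          # line carrying the [UU]-style field
            if (match($0, /\[[U_]*_[U_]*\]/))
                print name                   # at least one "_" means degraded
            name = ""
        }
    ' "${1:-/proc/mdstat}"
}
```

Calling degraded_arrays with no argument reads the live /proc/mdstat; with the mail shown above it would list md3, md2, md1 and md0.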
Removing the mdadm failed hard drive
In my case /dev/sda had failed, so I had to mark /dev/sda1, /dev/sda2, /dev/sda3 and /dev/sda4 as failed and remove them from their respective RAID arrays:
mdadm --manage /dev/md0 --fail /dev/sda1
mdadm --manage /dev/md0 --remove /dev/sda1
mdadm --manage /dev/md1 --fail /dev/sda2
mdadm --manage /dev/md1 --remove /dev/sda2
mdadm --manage /dev/md2 --fail /dev/sda3
mdadm --manage /dev/md2 --remove /dev/sda3
mdadm --manage /dev/md3 --fail /dev/sda4
mdadm --manage /dev/md3 --remove /dev/sda4
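With four arrays this is eight nearly identical commands, so a small loop can generate them. This is just a sketch: fail_and_remove and the RUN variable are names invented here, and the array:partition pairs have to match your own layout. By default it only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# For each "array:partition" argument, print the --fail and --remove
# commands. With RUN=1 in the environment the commands are executed
# instead of printed. fail_and_remove and RUN are hypothetical names.
fail_and_remove() {
    for pair in "$@"; do
        md=${pair%%:*}      # text before the colon, e.g. /dev/md0
        part=${pair#*:}     # text after the colon, e.g. /dev/sda1
        for action in --fail --remove; do
            if [ "${RUN:-0}" = 1 ]; then
                mdadm --manage "$md" "$action" "$part" || return 1
            else
                echo "mdadm --manage $md $action $part"
            fi
        done
    done
}

# Example (dry run, prints the eight commands from this article):
# fail_and_remove /dev/md0:/dev/sda1 /dev/md1:/dev/sda2 \
#                 /dev/md2:/dev/sda3 /dev/md3:/dev/sda4
```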
Now shut down the server and replace the hard drive:
shutdown -h now
Add the new hard drive
After replacing the failed /dev/sda disk, boot the system and copy the partition table from the surviving /dev/sdb drive, which still has the data, to the new /dev/sda.
The simple command is:
sfdisk -d /dev/sdb | sfdisk /dev/sda
Then use fdisk -l to check it.
If you get a message like this:
WARNING: GPT (GUID Partition Table) detected on '/dev/sdb'! The util fdisk doesn't support GPT. Use GNU Parted.
you need to install gdisk:
apt-get install gdisk
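Now copy the partition table from disk /dev/sdb to /dev/sda and then give the new disk its own random GUIDs. Mind the argument order: the disk given to -R (/dev/sda) is the destination, and the last argument (/dev/sdb) is the source.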
sgdisk -R /dev/sda /dev/sdb
sgdisk -G /dev/sda
You should get "The operation has completed successfully." as output from both commands.
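The choice between the two copy methods can be written down as a small helper. This is a sketch under assumptions: copy_table_cmds is a name made up here, and the type strings dos and gpt are the ones blkid reports in its PTTYPE tag. It only prints the commands and runs nothing.

```shell
#!/bin/sh
# Print the partition-table copy commands for the given table type.
# Usage: copy_table_cmds SOURCE_DISK NEW_DISK TYPE   (TYPE is dos or gpt)
# copy_table_cmds is a hypothetical helper name.
copy_table_cmds() {
    src=$1 dst=$2 type=$3
    case $type in
        dos)
            echo "sfdisk -d $src | sfdisk $dst"
            ;;
        gpt)
            echo "sgdisk -R $dst $src"   # replicate the table of $src onto $dst
            echo "sgdisk -G $dst"        # randomize the GUIDs on $dst
            ;;
        *)
            echo "unknown partition table type: $type" >&2
            return 1
            ;;
    esac
}
```

Assuming your blkid supports low-level probing, the type can be read with blkid -p -o value -s PTTYPE /dev/sdb and passed as the third argument, for example copy_table_cmds /dev/sdb /dev/sda gpt.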
Now add the new partitions to the RAID1 arrays:
mdadm --manage /dev/md0 --add /dev/sda1
mdadm --manage /dev/md1 --add /dev/sda2
mdadm --manage /dev/md2 --add /dev/sda3
mdadm --manage /dev/md3 --add /dev/sda4
You can monitor the rebuild with cat /proc/mdstat:
cat /proc/mdstat
Personalities : [raid1]
md3 : active raid1 sda4[2] sdb4[1]
      1847608639 blocks super 1.2 [2/1] [_U]
      [>....................] recovery = 0.0% (1575168/1847608639) finish=1119.2min speed=27488K/sec

md2 : active raid1 sda3[2] sdb3[1]
      1073740664 blocks super 1.2 [2/1] [_U]
      resync=DELAYED

md1 : active raid1 sda2[2] sdb2[1]
      524276 blocks super 1.2 [2/1] [_U]
      resync=DELAYED

md0 : active raid1 sda1[2] sdb1[1]
      8387572 blocks super 1.2 [2/1] [_U]
      resync=DELAYED

unused devices: <none>
After a few hours you should have [UU] at the end of each array:
cat /proc/mdstat
Personalities : [raid1]
md3 : active raid1 sda4[2] sdb4[1]
      1847608639 blocks super 1.2 [2/2] [UU]

md2 : active raid1 sda3[2] sdb3[1]
      1073740664 blocks super 1.2 [2/2] [UU]

md1 : active raid1 sda2[2] sdb2[1]
      524276 blocks super 1.2 [2/2] [UU]

md0 : active raid1 sda1[2] sdb1[1]
      8387572 blocks super 1.2 [2/2] [UU]

unused devices: <none>
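Instead of reading the whole file each time, the rebuild percentage can be pulled out of it. A minimal sketch, assuming the recovery line format shown earlier in this article; rebuild_progress is a name invented for this example.

```shell
#!/bin/sh
# Print "<array> <percent>" for every array that is currently rebuilding.
# Arrays waiting in the queue (resync=DELAYED) have no percentage yet
# and are skipped.
# Reads mdstat-formatted text from the file in $1 (default: /proc/mdstat).
# rebuild_progress is a hypothetical helper name.
rebuild_progress() {
    awk '
        /^md[0-9]+ :/ { name = $1 }               # remember the current array name
        /(recovery|resync) *=/ && name != "" {
            for (i = 1; i <= NF; i++)
                if ($i ~ /%$/) print name, $i     # the field ending in "%"
        }
    ' "${1:-/proc/mdstat}"
}
```

If you only want to block until a rebuild has finished, mdadm --wait /dev/md3 does that for a single array without any parsing.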
I hope this article will help you not to lose your data.
mdadm: https://en.wikipedia.org/wiki/Mdadm