Mr.Brownstone
“He won’t leave me alone…” Posts: 1,546 View profile Send message |
Taken from elsewhere for my own notes:
1. Use "mdadm --manage /dev/md0 -r /dev/sdd2" to remove the member that was marked as faulty from the array (the members of this array are partitions, sdX2).
2. Power down and replace the drive with a good drive.
3. Power up and set the partition table on the new drive to match that of the other drives in the array. Here we used "sfdisk -d /dev/sda | sfdisk /dev/sdd".
4. Add the proper partition on the new drive into the array: "mdadm --manage /dev/md0 -a /dev/sdd2".
5. Sit back and wait for the recovery to happen. You can "cat /proc/mdstat" to watch its progress; you should see something like:
Bash Script:
Personalities : [raid5]
md0 : active raid5 sdd2[4] sdc2[2] sdb2[1] sda2[0]
      731985408 blocks level 5, 256k chunk, algorithm 2 [4/3] [UUU_]
      [===>.................] recovery = 19.7% (48253056/243995136) finish=59.1min speed=55184K/sec
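If you only want the rebuild percentage rather than the whole file, it can be pulled out of the recovery line. This is a minimal sketch, assuming the "recovery = NN.N%" format shown in the output above; the function name recovery_percent is made up for illustration, and the sample line is copied from that output.

```shell
#!/bin/sh
# Hypothetical helper: print the recovery percentage found in mdstat-style
# text on stdin. Assumes the "recovery = NN.N%" format from the post and
# prints nothing when no rebuild is in progress.
recovery_percent() {
    sed -n 's/.*recovery *= *\([0-9.]*\)%.*/\1/p'
}

# Sample line copied from the post. On a live system you would run:
#   recovery_percent < /proc/mdstat
sample='[===>.................] recovery = 19.7% (48253056/243995136) finish=59.1min speed=55184K/sec'
printf '%s\n' "$sample" | recovery_percent
```

For the sample line this prints 19.7. Wrapping the live call in a loop with "sleep" gives a crude progress monitor.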
| Originally posted: 15th of Dec 2008, 11:35 pm |
To identify the failed drive, so that the right one gets removed:
Bash Script:
emerge --quiet sdparm
sdparm -i /dev/sd*
| Originally posted: 15th of Dec 2008, 11:50 pm |
If you wish to use the opportunity of a failed drive to increase the size of your array, refer to this:
http://linux-raid.osdl.org/index.php/Growing
| Originally posted: 16th of Dec 2008, 12:02 am |
Another backup of a useful thread.
How to back up your hard drive (the filesystem type doesn't matter) using dd.

Boot into rescue mode from the install media (generally "linux rescue"); otherwise enter rescue mode manually (see "Linux Recovery"). Make sure you are not booted from the hard drive and that none of its partitions are mounted.

Now use any combination of dd, ssh or rsh, and gzip or bzip2 to back up the drive. (I recommend ssh over rsh; however, ssh is generally not available in rescue mode, whereas rsh is.)

You can back up the whole drive, if you have enough space on the destination system, as follows. This method also grabs the MBR:
Bash Script:
dd if=/dev/sda | rsh user@dest "gzip -9 >20030220-backup-sda.dd.gz"

Or, if bandwidth is short:
Bash Script:
dd if=/dev/sda | gzip -c9 | rsh remuser@remhost 'cat >whateveryoulike.gz'

A restore using this method would be as follows:
Bash Script:
rsh user@dest "cat 20030220-backup-sda.dd.gz | gunzip" | dd of=/dev/sda

To back up individual partitions, be sure to grab the MBR as well as the partitions you want, because the MBR contains the partition table:
Bash Script:
dd if=/dev/sda bs=512 count=1 | rsh user@dest "cat - > 20030220-backup-mbr.dd"
dd if=/dev/sda1 | rsh user@dest "gzip -9 > 20030220-backup-sda1.dd.gz"

A restore would go as follows. Be sure to restore the MBR, reboot, then restore the other partitions:
Bash Script:
rsh user@dest "cat 20030220-backup-mbr.dd" | dd of=/dev/sda
# reboot to re-read partition table (come back into rescue mode)
rsh user@dest "cat 20030220-backup-sda1.dd.gz | gunzip" | dd of=/dev/sda1

Depending on which machine is faster and how fast your network is, you need to decide where to do the compression. You can compress before sending over the network, but if this machine is much slower than the server you are sending to, it may be better to send the data uncompressed and compress it as it arrives.
Just keep in mind that uncompressed data takes longer to transfer than compressed data, and how much longer depends on the network speed (10 Mbit vs. 100 Mbit). Use your best judgement.
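Before pointing dd at a real disk, the same pipeline can be rehearsed locally. The sketch below assumes a scratch file stands in for /dev/sda and leaves out the rsh hop; the file names (disk.img, backup-sda.dd.gz, backup-mbr.dd, restored.img) are made up for the demo. It backs up the "drive", grabs the first 512-byte sector separately, restores into a second file and compares the two.

```shell
#!/bin/sh
set -e
# Scratch directory; disk.img is a stand-in for /dev/sda (assumption for this demo).
work=$(mktemp -d)
dd if=/dev/urandom of="$work/disk.img" bs=512 count=64 2>/dev/null

# Back up the whole "drive", compressing on the sending side.
dd if="$work/disk.img" 2>/dev/null | gzip -c9 > "$work/backup-sda.dd.gz"

# Grab only the first sector (on a real disk, the MBR with the partition table).
dd if="$work/disk.img" of="$work/backup-mbr.dd" bs=512 count=1 2>/dev/null

# Restore into a second file and confirm it is byte-for-byte identical.
gunzip -c "$work/backup-sda.dd.gz" | dd of="$work/restored.img" 2>/dev/null
cmp "$work/disk.img" "$work/restored.img" && echo "restore matches original"
```

The scratch directory in $work is left in place so the files can be inspected; remove it when done. On a real backup, the same "cmp" idea (or a checksum on both ends) is a cheap way to confirm the image survived the trip.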
| Originally posted: 19th of Dec 2008, 3:22 pm |
Following the instructions in one of the links above, I managed to render parts of my RAID5 partition unreadable after growing it:
Bash Script:
mdadm /dev/md0 --grow --size=max

Bash Script:
e2fsck 1.39 (29-May-2006)
The filesystem size (according to the superblock) is 366285952 blocks
The physical size of the device is 195697024 blocks
Either the superblock or the partition table is likely to be corrupt!
Abort<y>? yes

Bash Script:
ls /mnt/raid/
ls: cannot access /mnt/raid/old_drives: Input/output error
ls: cannot access /mnt/raid/docs: Input/output error
ls: cannot access /mnt/raid/iso+vm: Input/output error
ls: cannot access /mnt/raid/music: Input/output error
... (some files and folders still accessible)

Bash Script:
e2fsck -f /dev/md0
...
Error reading block 205980086 (Invalid argument) while doing inode scan. Ignore error<y>? yes
Force rewrite<y>?
/dev/md0: e2fsck canceled.
/dev/md0: ***** FILE SYSTEM WAS MODIFIED *****
/dev/md0: ********** WARNING: Filesystem still has errors **********

Step #1, find out how big the array should be:
Bash Script:
morgul ~ # cat /proc/mdstat
Personalities : [raid5] [raid4]
md0 : active raid5 sda1[0] sdc1[2] sdb1[1]
      2930271744 blocks level 5, 256k chunk, algorithm 2 [3/3] [UUU]
unused devices: <none>
morgul ~ # mdadm -D /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Mon Sep 11 02:00:29 2006
     Raid Level : raid5
     Array Size : 782788096 (746.52 GiB 801.58 GB)
  Used Dev Size : 1465135872 (1397.26 GiB 1500.30 GB)
   Raid Devices : 3
  Total Devices : 3
Preferred Minor : 0
    Persistence : Superblock is persistent
    Update Time : Wed Dec 24 15:43:29 2008
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0
         Layout : left-symmetric
     Chunk Size : 256K
           UUID : e2c34576:3fcb9849:534480be:98638fa7
         Events : 0.15459491
    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1
       2       8       33        2      active sync   /dev/sdc1

Step #2, set the size explicitly:
Bash Script:
morgul ~ # mdadm /dev/md0 --grow --size=782788096
morgul ~ # cat /proc/mdstat
Personalities : [raid5] [raid4]
md0 : active raid5 sda1[0] sdc1[2] sdb1[1]
      1565576192 blocks level 5, 256k chunk, algorithm 2 [3/3] [UUU]
unused devices: <none>

Weirdly, once it was fixed I could see that the mdadm detail report for the problem partition had "backwards" numbers for Array Size and Used Dev Size. Here is the output of the "healthy" array, with the "unhealthy" one already printed above:
Bash Script:
morgul ~ # mdadm -D /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Mon Sep 11 02:00:29 2006
     Raid Level : raid5
     Array Size : 1565576192 (1493.05 GiB 1603.15 GB)
  Used Dev Size : 782788096 (746.52 GiB 801.58 GB)
   Raid Devices : 3
  Total Devices : 3
Preferred Minor : 0
    Persistence : Superblock is persistent
    Update Time : Wed Dec 24 16:06:53 2008
          State : clean
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0
         Layout : left-symmetric
     Chunk Size : 256K
           UUID : e2c34576:3fcb9849:534480be:98638fa7
         Events : 0.15459622
    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1
       2       8       33        2      active sync   /dev/sdc1
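The numbers above can be sanity-checked with a little arithmetic. A RAID5 array gives up one member's worth of space to parity, so the array size should be (members - 1) times the per-device size. The sketch below is only that check; the function name raid5_array_size is made up for illustration, and the sizes are the KiB figures from the mdadm and /proc/mdstat output in this post.

```shell
#!/bin/sh
# Hypothetical helper: expected RAID5 array size given the member count ($1)
# and the per-device size ($2, in the same units mdadm reports).
# One device's worth of space goes to parity, hence (n - 1).
raid5_array_size() {
    echo $(( ($1 - 1) * $2 ))
}

raid5_array_size 3 782788096     # Used Dev Size after the fix
raid5_array_size 3 1465135872    # Used Dev Size reported before the fix
```

The first call gives 1565576192, which matches both the healthy Array Size and the block count in /proc/mdstat after the fix. The second gives 2930271744, which matches the /proc/mdstat block count before the fix even though mdadm -D was reporting an Array Size of 782788096 at that point, so the two reports disagreed with each other while the array was broken.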
| Originally posted: 24th of Dec 2008, 4:08 pm |