RAID6-2017
Root (talk | contribs): no edit summary
Version as of 21 October 2019, 19:02
- host "server"
- RAID-6 made of 5+1 2 TB partitions (1x 3 TB drive and 5x 2 TB drives)
- created on "Wed Jan 28 22:47:57 2015"
- 4x 512 GByte in RAID 6, migrated to
- 4x 2 TB in RAID 6 (-> 4 TB), migrated to
- 5x 2 TB in RAID 6 (-> 6 TB)
Maximum
- Maximum full build-out (limited by 32-bit ext4):
- 7x 3 TB in RAID 6 (-> 15 TB)
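The capacity figures above all follow from the RAID6 rule that two drives' worth of space goes to parity. A minimal sketch (the helper function name is mine, not from the original notes):

```shell
# RAID6 usable capacity: (number of drives - 2) * per-drive size.
raid6_capacity() {
  local disks=$1 size_tb=$2
  echo $(( (disks - 2) * size_tb ))
}
raid6_capacity 5 2   # current array: 5x 2 TB  -> 6
raid6_capacity 7 3   # maximum build-out: 7x 3 TB -> 15
```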
RAID drives
RaidDevice ; Partition ; SATA connector ; Capacity ; Model ; Serial
0 ; /dev/sdd1 ; 01.1-7 ; 2TB ; ST2000VX004-1RU1 ; Z4Z1NATB
1 ; /dev/sde1 ; 01.1-9 ; 2TB ; WD20EFRX-68E ; WD-WCC4M1ZJ8ZKZ
1 ; /dev/sdf1 ;        ; 2TB ; WD20PURX-64P ; WD-WCC4M0XEZ7CH (B2)
2 ; /dev/sdb1 ; 17.0-4 ; 2TB ; ST2000VX004-1RU1 ; Z4Z1NB4W
3 ; /dev/sda1 ; 17.0-3 ; 2TB ; WD20EFRX-68E ; WD-WCC4M5ZDK1AE
4 ; /dev/sdc1 ; 17.0-5 ; 3TB ; TOSHIBA DT01ACA3 ; 554468NGS
Unused drives
TOSHIBA DT01ACA3 5544688GS (A3) sdd
Details
Filesystem
- ext4 (without the 64-bit feature!)
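ext4 without the 64bit feature uses 32-bit block numbers, which caps the filesystem at 16 TiB; that is why the maximum build-out above stops at 15 TB. On the live system the flag could be verified with `tune2fs -l /dev/md127 | grep 'Filesystem features'`. A tiny check over such a feature line (the sample line below is illustrative, not captured from this host):

```shell
# Report whether an ext4 feature list contains the 64bit flag.
check_64bit() {
  case "$1" in
    *64bit*) echo "64bit enabled" ;;
    *)       echo "64bit missing: 16 TiB size limit" ;;
  esac
}
check_64bit "has_journal ext_attr resize_inode dir_index filetype extent sparse_super large_file"
```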
Drive positions
- A1|A2|A3|A4|B1|B2|B3|B4 (bay labels from left to right)
Bay ; Cable ; Color ; Serial ; /dev/disk/by-id/ ; RAID role ; Device
A1 ; 1 ; red ; WD-WCC4M1ZJ8ZKZ ; "ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M1ZJ8ZKZ" ; 1 ; sdg
A2 ; 2 ; red ; Z4Z1NATB ; "ata-ST2000VX004-1RU164_Z4Z1NATB" ; 0 ; sde
A2 ; 2 ; red ; -- empty --
A3 ; 3 ; yellow ; 5544688GS ; "ata-TOSHIBA_DT01ACA300_5544688GS" ; Spare ; sdd
A4 ; 4 ; red ; 554468NGS ; "ata-TOSHIBA_DT01ACA300_554468NGS" ; 4 ; sdc
B1 ; 5 ; red ; Z4Z1NB4W ; "ata-ST2000VX004-1RU164_Z4Z1NB4W" ; 2 ; sdb
B2 ; 6 ; yellow ; WD-WCC4M0XEZ7CH ; "ata-WDC_WD20PURX-64P6ZY0_WD-WCC4M0XEZ7CH" ; 0 ; sdf
B3 ; 7 ; red ; WD-WCC4M5ZDK1AE ; "ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M5ZDK1AE" ; 3 ; sda
B4 ; 8 ; red ; -- empty --
A1 | A2 | A3 | A4 | B1 | B2 | B3 | B4
(one bay marked "defekt"/defective)
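The serial-to-device mapping in the table above can be re-derived from the persistent ids udev creates under /dev/disk/by-id/. A sketch using one of this array's ids (the id will only resolve on the original host; elsewhere the fallback branch runs):

```shell
# Resolve a persistent disk id to its current kernel device name.
id=/dev/disk/by-id/ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M1ZJ8ZKZ
if [ -e "$id" ]; then
  readlink -f "$id"          # on the original host: the current name, e.g. /dev/sdg
else
  echo "device not present"
fi
```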
Details
/dev/md127:
        Version : 1.2
  Creation Time : Wed Jan 28 22:47:57 2015
     Raid Level : raid6
     Array Size : 5860147200 (5588.67 GiB 6000.79 GB)
  Used Dev Size : 1953382400 (1862.89 GiB 2000.26 GB)
   Raid Devices : 5
  Total Devices : 7
    Persistence : Superblock is persistent
  Intent Bitmap : Internal
    Update Time : Tue Oct 16 13:08:39 2018
          State : clean
 Active Devices : 5
Working Devices : 7
 Failed Devices : 0
  Spare Devices : 2
         Layout : left-symmetric
     Chunk Size : 512K
           Name : raib23:0
           UUID : a9b9721a:7da8602e:313975c3:10fa337e
         Events : 13372

    Number   Major   Minor   RaidDevice   State
       5       8       65        0        active sync   /dev/sde1
       6       8       97        1        active sync   /dev/sdg1
       4       8       17        2        active sync   /dev/sdb1
       7       8        1        3        active sync   /dev/sda1
       8       8       33        4        active sync   /dev/sdc1
       9       8       81        -        spare         /dev/sdf1
      10       8       49        -        spare         /dev/sdd1
Events
26.07.2019 Failure of "sde"
[9159727.049398] md/raid:md127: Disk failure on sde1, disabling device.
md/raid:md127: Operation continuing on 4 devices.
- AUTOMATICALLY, "sdf" was promoted from "spare" into the array
[9159732.583652] md: recovery of RAID array md127
[9186393.380680] md: md127: recovery done.
[9186394.116871] RAID conf printout:
[9186394.116878]  --- level:6 rd:5 wd:5
[9186394.116881]  disk 0, o:1, dev:sdf1
[9186394.116884]  disk 1, o:1, dev:sdg1
[9186394.116886]  disk 2, o:1, dev:sdb1
[9186394.116888]  disk 3, o:1, dev:sda1
[9186394.116890]  disk 4, o:1, dev:sdc1
- the array then looked like this:
    Number   Major   Minor   RaidDevice   State
       9       8       81        0        active sync   /dev/sdf1
       6       8       97        1        active sync   /dev/sdg1
       4       8       17        2        active sync   /dev/sdb1
       7       8        1        3        active sync   /dev/sda1
       8       8       33        4        active sync   /dev/sdc1
       5       8       65        -        faulty        /dev/sde1
      10       8       49        -        spare         /dev/sdd1
- after removing the faulty drive:
    Number   Major   Minor   RaidDevice   State
       9       8       81        0        active sync   /dev/sdf1
       6       8       97        1        active sync   /dev/sdg1
       4       8       17        2        active sync   /dev/sdb1
       7       8        1        3        active sync   /dev/sda1
       8       8       33        4        active sync   /dev/sdc1
      10       8       49        -        spare         /dev/sdd1
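The remove step could be done with mdadm roughly as follows. A sketch only, assuming the array and device names from this event; these commands modify the array state and are listed here for reference, not run:

```shell
# Mark the drive as failed (the kernel had already done that here),
# then remove it from md127:
mdadm /dev/md127 --fail /dev/sde1
mdadm /dev/md127 --remove /dev/sde1
# Afterwards `mdadm --detail /dev/md127` no longer lists sde1.
```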
21.10.2019 Failure of sdc
- This evening the shares suddenly became extremely slow
- Some of the shares were even "offline"
- In the server room, this picture: one drive's LED was lit continuously, as if under constant fire
- The board's HDD LED was also on solid with no flickering, so something was seriously wrong
- I simply pulled the "guilty" drive; after that the spare was taken into the array
- Operation continued without disruption, with recovery running off the spare
[Mon Oct 21 19:50:35 2019] ata5.00: exception Emask 0x50 SAct 0x0 SErr 0x4090800 action 0xe frozen
[Mon Oct 21 19:50:35 2019] ata5.00: irq_stat 0x00400040, connection status changed
[Mon Oct 21 19:50:35 2019] ata5: SError: { HostInt PHYRdyChg 10B8B DevExch }
[Mon Oct 21 19:50:35 2019] ata5.00: failed command: READ DMA EXT
[Mon Oct 21 19:50:35 2019] ata5.00: cmd 25/00:08:f0:4f:fa/00:00:43:00:00/e0 tag 26 dma 4096 in
                                    res 50/00:00:ef:4f:fa/00:00:43:00:00/e3 Emask 0x50 (ATA bus error)
[Mon Oct 21 19:50:35 2019] ata5.00: status: { DRDY }
[Mon Oct 21 19:50:35 2019] ata5: hard resetting link
[Mon Oct 21 19:50:36 2019] ata5: SATA link down (SStatus 0 SControl 310)
[Mon Oct 21 19:50:41 2019] ata5: hard resetting link
[Mon Oct 21 19:50:41 2019] ata5: SATA link down (SStatus 0 SControl 310)
[Mon Oct 21 19:50:46 2019] ata5: hard resetting link
[Mon Oct 21 19:50:47 2019] ata5: SATA link down (SStatus 0 SControl 310)
[Mon Oct 21 19:50:47 2019] ata5.00: disabled
[Mon Oct 21 19:50:47 2019] sd 4:0:0:0: [sdc] tag#26 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[Mon Oct 21 19:50:47 2019] sd 4:0:0:0: [sdc] tag#26 Sense Key : Illegal Request [current]
[Mon Oct 21 19:50:47 2019] sd 4:0:0:0: [sdc] tag#26 Add. Sense: Unaligned write command
[Mon Oct 21 19:50:47 2019] sd 4:0:0:0: [sdc] tag#26 CDB: Read(16) 88 00 00 00 00 00 43 fa 4f f0 00 00 00 08 00 00
[Mon Oct 21 19:50:47 2019] blk_update_request: I/O error, dev sdc, sector 1140477936
[Mon Oct 21 19:50:47 2019] sd 4:0:0:0: rejecting I/O to offline device
[Mon Oct 21 19:50:47 2019] sd 4:0:0:0: [sdc] killing request
[Mon Oct 21 19:50:47 2019] sd 4:0:0:0: rejecting I/O to offline device
...
[Mon Oct 21 19:50:47 2019] sd 4:0:0:0: rejecting I/O to offline device
[Mon Oct 21 19:50:47 2019] ata5: EH complete
[Mon Oct 21 19:50:47 2019] sd 4:0:0:0: [sdc] FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[Mon Oct 21 19:50:47 2019] sd 4:0:0:0: [sdc] CDB: Read(16) 88 00 00 00 00 00 43 fa 4f f8 00 00 00 08 00 00
[Mon Oct 21 19:50:47 2019] blk_update_request: I/O error, dev sdc, sector 1140477944
[Mon Oct 21 19:50:47 2019] sd 4:0:0:0: rejecting I/O to offline device
[Mon Oct 21 19:50:47 2019] ata5.00: detaching (SCSI 4:0:0:0)
[Mon Oct 21 19:50:47 2019] sd 4:0:0:0: [sdc] Stopping disk
[Mon Oct 21 19:50:47 2019] sd 4:0:0:0: [sdc] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[Mon Oct 21 19:50:51 2019] md/raid:md127: Disk failure on sdc1, disabling device.
md/raid:md127: Operation continuing on 4 devices.
[Mon Oct 21 19:51:01 2019] RAID conf printout:
[Mon Oct 21 19:51:01 2019]  --- level:6 rd:5 wd:4
[Mon Oct 21 19:51:01 2019]  disk 0, o:1, dev:sde1
[Mon Oct 21 19:51:01 2019]  disk 1, o:1, dev:sdf1
[Mon Oct 21 19:51:01 2019]  disk 2, o:1, dev:sdb1
[Mon Oct 21 19:51:01 2019]  disk 3, o:1, dev:sda1
[Mon Oct 21 19:51:01 2019]  disk 4, o:0, dev:sdc1
[Mon Oct 21 19:51:01 2019] RAID conf printout:
[Mon Oct 21 19:51:01 2019]  --- level:6 rd:5 wd:4
[Mon Oct 21 19:51:01 2019]  disk 0, o:1, dev:sde1
[Mon Oct 21 19:51:01 2019]  disk 1, o:1, dev:sdf1
[Mon Oct 21 19:51:01 2019]  disk 2, o:1, dev:sdb1
[Mon Oct 21 19:51:01 2019]  disk 3, o:1, dev:sda1
[Mon Oct 21 19:51:01 2019] RAID conf printout:
[Mon Oct 21 19:51:01 2019]  --- level:6 rd:5 wd:4
[Mon Oct 21 19:51:01 2019]  disk 0, o:1, dev:sde1
[Mon Oct 21 19:51:01 2019]  disk 1, o:1, dev:sdf1
[Mon Oct 21 19:51:01 2019]  disk 2, o:1, dev:sdb1
[Mon Oct 21 19:51:01 2019]  disk 3, o:1, dev:sda1
[Mon Oct 21 19:51:01 2019]  disk 4, o:1, dev:sdd1
[Mon Oct 21 19:51:01 2019] md: unbind<sdc1>
[Mon Oct 21 19:51:01 2019] md: export_rdev(sdc1)
[Mon Oct 21 19:51:01 2019] md: recovery of RAID array md127
[Mon Oct 21 19:51:01 2019] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
[Mon Oct 21 19:51:01 2019] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
[Mon Oct 21 19:51:01 2019] md: using 128k window, over a total of 1953382400k.
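While the recovery started above is running, progress is visible in /proc/mdstat (e.g. `watch cat /proc/mdstat`). A parsing sketch over a sample mdstat progress line (the line below is illustrative, not captured from this host):

```shell
# Extract the recovery percentage from an mdstat-style progress line.
sample='[=========>...........]  recovery = 45.7% (892713216/1953382400) finish=123.4min speed=143210K/sec'
progress=$(echo "$sample" | awk '{ for (i = 1; i <= NF; i++) if ($i == "recovery") print $(i + 2) }')
echo "$progress"
```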