RAID6-2018
Latest revision as of 17:53, 11 November 2021
Info
Hardware
- Supermicro X10SRA-F
  - https://www.supermicro.com/en/products/motherboard/X10SRA-F
- 32 GB ECC Memory
Raid
           Version : 1.2
     Creation Time : Sun Nov 8 00:30:12 2015
        Raid Level : raid6
        Array Size : 46883398656 (44711.49 GiB 48008.60 GB)
     Used Dev Size : 7813899776 (7451.92 GiB 8001.43 GB)
      Raid Devices : 8
     Total Devices : 9
       Persistence : Superblock is persistent
     Intent Bitmap : Internal
       Update Time : Sat Aug 7 12:32:17 2021
             State : clean
    Active Devices : 8
   Working Devices : 9
    Failed Devices : 0
     Spare Devices : 1
            Layout : left-symmetric
        Chunk Size : 512K
              Name : Tokio:0
              UUID : 39c0b55f:74c0ab19:5f939236:16921f79
            Events : 399926

 Cage Number Major Minor RaidDevice State        dev id                                      Size Since
 A3   0      8     81    0          active sync  ata-WDC_WD4000FYYZ-01UL1B2_WD-WMC130F7EL9R
 A3                                              ata-ST8000VN0022-2EL112_ZA1CY63J            8TB  2018-12
 A4   1      8     65    1          active sync  ata-WDC_WD4000FYYZ-01UL1B2_WD-WCC134LLA4YE  4TB  2015-12
 D3                                              ata-ST8000VN004-2M2101_WKD1RM6S             8TB  2020-07
 A1   2      8     49    2          active sync  ata-WDC_WD4000FYYZ-01UL1B2_WD-WCC134HF0CFD  4TB  2015-12
 A2                                              ata-ST8000VN004-2M2101_WKD2PZZS             8TB  2020-12
 A2   3      8     33    3          active sync  ata-WDC_WD4000FYYZ-01UL1B2_WD-WCC133XLHK2N  4TB  2015-12
 A4                                              ata-WDC_WD80EFAX-68KNBN0_VDJK7XTK           8TB  2020-07
 B1   6      8     113   4          active sync  ata-HGST_HDN724040ALE640_PK2338P4HGJ6RC     4TB  2016-03
                                                 ata-ST8000VN004-2M2101_WKD3LLY8             8TB  2021-07
 B2   5      8     97    5          active sync  ata-HGST_HDN724040ALE640_PK2338P4HGDRYC     4TB  2016-03
                                                 ata-ST8000VN004-3CP101_WP001TWL             8TB  2021-08
 B4   4      8     17    6          active sync  ata-HGST_HDN724040ALE640_PK1334PCKNY90X     4TB  2016-03
 B4                                              ata-ST8000DM004-2CX188_ZCT40SH2             8TB  2021-07
 A1                                              ata-ST8000DM004-2CX188_ZCT3MVCJ             8TB  2020-12
 B3   7      8     1     7          active sync  ata-HGST_HDN724040ALE640_PK1334PEHYT9XS     4TB  2017-11
 D4   11     8     129   -          spare        ata-HGST_HDN724040ALE640_PK1334PEKDZDDS     4TB  2020-07
                                                 ata-ST8000VN004-3CP101_WP000N57             8TB  2021-08
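The Array Size above can be cross-checked against Used Dev Size: RAID 6 spends two members' worth of capacity on parity, so 8 devices leave 6 for data. The following is a minimal sketch of that arithmetic. The function name is hypothetical, and the KiB unit is an assumption that happens to match the GiB and GB figures printed next to the raw numbers.

```python
def raid6_usable_kib(raid_devices: int, used_dev_size_kib: int) -> int:
    """Usable RAID 6 capacity: every member except two parity equivalents."""
    if raid_devices < 4:
        raise ValueError("RAID 6 needs at least 4 devices")
    return (raid_devices - 2) * used_dev_size_kib


# Values copied from the array details above (assumed to be in KiB).
array_kib = raid6_usable_kib(8, 7813899776)
array_gib = round(array_kib / 1024**2, 2)      # KiB -> GiB
array_gb = round(array_kib * 1024 / 1e9, 2)    # KiB -> bytes -> GB

print(array_kib, array_gib, array_gb)
```

The spare does not count here: Total Devices is 9, but only the 8 Raid Devices contribute to the array size.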
- without SATA connection
C1 C2 C3 C4 D1 D2
- Resync takes 16 hours
 [Sat Aug 21 23:19:21 2021] md: requested-resync of RAID array md127
 [Sun Aug 22 15:17:38 2021] md: md127: requested-resync done.
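The 16 hour figure can be derived from the two kernel timestamps. This is a small sketch only: the helper name and the format string are assumptions chosen to match the human-readable timestamp style shown in these log lines.

```python
from datetime import datetime, timedelta

# Timestamp layout as printed in the log lines, e.g. "Sat Aug 21 23:19:21 2021".
FMT = "%a %b %d %H:%M:%S %Y"


def elapsed(start: str, end: str) -> timedelta:
    """Time between two kernel log timestamps given in the FMT layout."""
    return datetime.strptime(end, FMT) - datetime.strptime(start, FMT)


duration = elapsed("Sat Aug 21 23:19:21 2021", "Sun Aug 22 15:17:38 2021")
print(duration)                                   # 15:58:17
print(round(duration.total_seconds() / 3600, 1))  # hours, rounded
```

The exact duration is just under 16 hours.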
- Mainboard
 Supermicro X10SRA-F
 BIOS 1.0a
Current device names
 Number Major Minor RaidDevice State        Partition   HDD-Serial  Location
 10     8     113   0          active sync  /dev/sdh1 | ZA1CY63J  | A.3
 12     8     33    1          active sync  /dev/sdc1 | WKD1RM6S  | D.3
 14     8     65    2          active sync  /dev/sde1 | WKD2PZZS  | A.2
 13     8     97    3          active sync  /dev/sdg1 | VDJK7XTK  | A.4
 15     8     17    4          active sync  /dev/sdb1 | ZCT40SH2  | B.4
 16     8     145   5          active sync  /dev/sdj1 | WKD3LLY8  | B.1
 8      8     81    6          active sync  /dev/sdf1 | ZCT3MVCJ  | A.1
 9      8     129   7          active sync  /dev/sdi1 | WP001TWL  | B.2
 17     8     1     -          spare        /dev/sda1 | WP000N57  | B.3
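The Major and Minor columns already encode the partition names listed next to them. On Linux, block major 8 covers the first 16 SCSI/SATA disks, with 16 minors reserved per disk (minor 0 is the whole disk, 1 to 15 its partitions). A minimal sketch of that mapping follows; the function name is hypothetical, and only major 8 is handled.

```python
import string


def sd_name(major: int, minor: int) -> str:
    """Map a (major, minor) pair under block major 8 to a name such as 'sdh1'."""
    if major != 8:
        raise ValueError("only block major 8 (sda..sdp) is handled in this sketch")
    if not 0 <= minor <= 255:
        raise ValueError("minor out of range for major 8")
    disk, part = divmod(minor, 16)          # 16 minors per disk
    name = "sd" + string.ascii_lowercase[disk]
    return name + (str(part) if part else "")


# Rows from the device-name table above.
for minor in (113, 33, 65, 97, 17, 145, 81, 129, 1):
    print(minor, sd_name(8, minor))
```

Because the kernel hands out these names in detection order, they can change across reboots (see the entry for 24.07.2020 in the event log), which is why the table also records serial number and cage location.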
Disk locations
- above Tokio
 [A.1   |A.2   |A.3   |A.4   ] [B.1   |B.2   |B.3   |B.4   ]
 [HF0CFD|XLHK2N|F7EL9R|LLA4YE] [HGJ6RC|HGDRYC|HYT9XS|KNY90X]
 [HF0CFD|XLHK2N|1CY63J|LLA4YE] [HGJ6RC|HGDRYC|HYT9XS|KNY90X]
 [HF0CFD|XLHK2N|1CY63J|JK7XTK] [HGJ6RC|HGDRYC|HYT9XS|KNY90X]
 [HF0CFD|D2PZZS|1CY63J|JK7XTK] [HGJ6RC|HGDRYC|HYT9XS|KNY90X]
 [T3MVCJ|D2PZZS|1CY63J|JK7XTK] [HGJ6RC|HGDRYC|HYT9XS|KNY90X]
 [T3MVCJ|D2PZZS|1CY63J|JK7XTK] [HGJ6RC|HGDRYC|HYT9XS|T40SH2]
 [T3MVCJ|D2PZZS|1CY63J|JK7XTK] [D3LLY8|HGDRYC|HYT9XS|T40SH2]
 [T3MVCJ|D2PZZS|1CY63J|JK7XTK] [D3LLY8|001TWL|HYT9XS|T40SH2]
 [T3MVCJ|D2PZZS|1CY63J|JK7XTK] [D3LLY8|001TWL|000N57|T40SH2]
- below Tokio
 [C.1   |C.2   |C.3   |C.4   ] [D.1   |D.2   |D.3   |D.4   ]
 [      |      |      |      ] [      |      |D1RM6S|KDZDDS]
 [      |      |      |      ] [      |      |D1RM6S|KDZDDS]
 [      |      |      |      ] [      |      |D1RM6S|001CVK]
Procurement
all disks are 8 TB models
https://www.alternate.de/html/listings/1458214498740?order=ASC&lk=8323&showFilter=false&hideFilter=false&disableFilter=false&filter_-1=3500&filter_-1=142900&filter_1021=8000.0&filter_2147482612=1037
Events
 May 09 01:22:50 tokio kernel: blk_update_request: I/O error, dev sdf, sector 7010129740
 May 09 01:22:53 tokio kernel: blk_update_request: I/O error, dev sdf, sector 7010129740
 Jul 26 01:22:52 tokio kernel: blk_update_request: I/O error, dev sdf, sector 6521277045
 Jul 26 01:22:54 tokio kernel: blk_update_request: I/O error, dev sdf, sector 6521277045
 Jul 26 01:22:57 tokio kernel: blk_update_request: I/O error, dev sdf, sector 6521278897
 Jul 26 01:22:59 tokio kernel: blk_update_request: I/O error, dev sdf, sector 6521278897
 Oct 18 01:18:41 tokio kernel: blk_update_request: I/O error, dev sdf, sector 6620009643
 Oct 18 01:18:44 tokio kernel: blk_update_request: I/O error, dev sdf, sector 6620009643

 27.11.2018 9 am: failure of 2 disks, looks like a brief "power fail" event
            11 am: request about insufficient space, reduced backups from 10 to 7
            deleted backup directories 7+8+9
            noticed the failure of 2 disks
            rebuild of disk sdd
 28.11.2018 rebuild of disk sdg
            discovered past I/O errors on disk sdf
            ordered a new 8 TB disk to replace "sdf"
 30.11.2018 new disk attached as 9th disk
            started --add-spare & --replace
            + ata-ST8000VN0022-2EL112_ZA1CY63J (sdi)
            - ata-WDC_WD4000FYYZ-01UL1B2_WD-WMC130F7EL9R (sdf)
 06.06.2020 sde, ata7.00: failed command: READ FPDMA QUEUED,
            blk_update_request: I/O error, dev sde, sector 6211109969
            blk_update_request: I/O error, dev sde, sector 6211111867
            proposed installing a spare
 22.07.2020 +Spare: HGST HDN724040AL (4 TB), Serial=PK1334PEKDZDDS, installed as /dev/sdi
 24.07.2020 the system assigned new device names after a reboot
 27.07.2020 [Sat Jul 25 23:58:00 2020] md/raid:md127: read error corrected (8 sectors at 42366136 on sdf1)
            [Sat Jul 25 23:58:00 2020] md/raid:md127: read error corrected (8 sectors at 42366144 on sdf1)
            [Sat Jul 25 23:58:00 2020] md/raid:md127: read error corrected (8 sectors at 42366152 on sdf1)
            [Sat Jul 25 23:58:00 2020] md/raid:md127: read error corrected (8 sectors at 42366160 on sdf1)
            [Sat Jul 25 23:58:00 2020] md/raid:md127: read error corrected (8 sectors at 42366168 on sdf1)
            [Sat Jul 25 23:58:00 2020] md/raid:md127: read error corrected (8 sectors at 42366176 on sdf1)
            [Sat Jul 25 23:58:00 2020] md/raid:md127: read error corrected (8 sectors at 42366184 on sdf1)
            [Sat Jul 25 23:58:00 2020] md/raid:md127: read error corrected (8 sectors at 42366192 on sdf1)
            [Sat Jul 25 23:58:00 2020] md/raid:md127: read error corrected (8 sectors at 42366200 on sdf1)
            [Sat Jul 25 23:58:00 2020] md/raid:md127: read error corrected (8 sectors at 42366208 on sdf1)
            [Sat Jul 25 23:58:39 2020] md/raid:md127: read error corrected (8 sectors at 46314736 on sdf1)
            [Sat Jul 25 23:58:39 2020] md/raid:md127: read error corrected (8 sectors at 46314744 on sdf1)
            [Sat Jul 25 23:58:39 2020] md/raid:md127: read error corrected (8 sectors at 46314752 on sdf1)
            [Sat Jul 25 23:58:39 2020] md/raid:md127: read error corrected (8 sectors at 46314760 on sdf1)
            [Sat Jul 25 23:58:39 2020] md/raid:md127: read error corrected (8 sectors at 46314768 on sdf1)
            -> A4, LLA4YE is the replacement candidate
 29.07.2020 ST8000VN004-2M21 (WKD1RM6S) installed, /dev/sdj
            created a maximum-size 8 TB partition, sdj1, and added it as a spare
            # mdadm /dev/md127 --replace /dev/sdf1 --with /dev/sdj1
            mdadm: Marked /dev/sdf1 (device 1 in /dev/md127) for replacement
            mdadm: Marked /dev/sdj1 in /dev/md127 as replacement for device 1
            sdf1: superblock zeroed, partition removed
            followed by a "scrub"
 30.07.2020 +ata-WDC_WD80EFAX-68KNBN0_VDJK7XTK installed, /dev/sdf
            -ata-WDC_WD4000FYYZ-01UL1B2_WD-WCC133XLHK2N
 31.07.2020 pulled "XLHK2N", since it is marked "removed"
 11.12.2020 +ata-ST8000VN004-2M2101_WKD2PZZS installed, /dev/sdj
            -ata-WDC_WD4000FYYZ-01UL1B2_WD-WCC134HF0CFD
            replace of "HF0CFD" (sde), since it is 5.2 years old
 14.12.2020 +ata-ST8000DM004-2CX188_ZCT3MVCJ installed, /dev/sde
            -ata-HGST_HDN724040ALE640_PK1334PCKNY90X
            replace of sdb, since it runs at 54 °C and is the hottest disk
 29.07.2021 +ata-ST8000DM004-2CX188_ZCT40SH2 (B4)
            -ata-HGST_HDN724040ALE640_PK2338P4HGJ6RC (B1)
            replace of sdi, since it has been in service for over 5 years
 30.07.2021 +ata-ST8000VN004-2M2101_WKD3LLY8 (B1)
            -ata-HGST_HDN724040ALE640_PK2338P4HGDRYC (B2)
            replace of sdh, since it has been in service for over 5 years
 02.08.2021 +ata-ST8000VN004-3CP101_WP001TWL (B2)
            -ata-HGST_HDN724040ALE640_PK1334PEHYT9XS (B3)
            replace of sda, since it is 4 years old and still only 4 TB
 03.08.2021 +ata-ST8000VN004-3CP101_WP000N57 (B3)
            -ata-HGST_HDN724040ALE640_PK1334PEKDZDDS (D4)
            replace of sdd, since it is only 4 TB, although it has just 9065 power-on hours
 05.08.2021 partition grown, now 42.7 TB
            resize2fs hangs at 100%, see Linux.raid#resize2fs_100.25_CPU_Usage
 06.08.2021 [Fri Aug 6 02:49:37 2021] sd 1:0:0:0: [sdb] tag#18 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
            [Fri Aug 6 02:49:37 2021] sd 1:0:0:0: [sdb] tag#18 CDB: Read(16) 88 00 00 00 00 03 8e 3f c0 08 00 00 05 40 00 00
            [Fri Aug 6 02:49:37 2021] blk_update_request: I/O error, dev sdb, sector 15271444488
            +12x
            [Sun Aug 8 02:59:12 2021] md/raid:md127: read error corrected (8 sectors at 3934061840 on sdb1)
            +12x
            sdb had been the slowest disk so far; now it has bad sectors
 11.08.2021 [Wed Aug 11 02:41:28 2021] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
            [Wed Aug 11 02:41:28 2021] ata2.00: failed command: SMART
            [Wed Aug 11 02:41:28 2021] ata2.00: cmd b0/d0:01:00:4f:c2/00:00:00:00:00/00 tag 19 pio 512 in res 40/00:82:82:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
            [Wed Aug 11 02:41:28 2021] ata2: hard resetting link
            [Wed Aug 11 09:37:54 2021] ata2.00: failed command: FLUSH CACHE EXT
            [Wed Aug 11 09:37:54 2021] ata2: hard resetting link
            ...
            [Wed Aug 11 09:38:41 2021] ata2: hard resetting link
            [Wed Aug 11 09:38:46 2021] ata2: link is slow to respond, please be patient (ready=0)
            [Wed Aug 11 09:38:51 2021] ata2: COMRESET failed (errno=-16)
            [Wed Aug 11 09:38:51 2021] ata2: hard resetting link
            ...
            [Wed Aug 11 09:40:15 2021] blk_update_request: I/O error, dev sdb, sector 2064
            [Wed Aug 11 09:40:15 2021] md: super_written gets error=-5, uptodate=0
            [Wed Aug 11 09:40:15 2021] md/raid:md127: Disk failure on sdb1, disabling device.
                                       md/raid:md127: Operation continuing on 7 devices.
            we lost sdb; the resync onto the spare is running
 16.08.2021 running without a spare, array is clean
            ZCT40SH2 (B4) can be pulled
 25.08.2021 - T40SH2 (B4)
            + /dev/disk/by-id/ata-ST8000VN004-3CP101_WP001CVK as spare (D4)
 ??.??.2021 run resize2fs via Knoppix
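The CDB bytes logged on 06.08.2021 identify the failed read exactly. In a SCSI READ(16) command, byte 0 is the opcode 0x88, bytes 2 to 9 hold the big-endian logical block address, and bytes 10 to 13 the transfer length in blocks. The sketch below decodes the logged command. The function name is hypothetical, and equating LBAs with the kernel's 512-byte sectors is an assumption about this disk's logical block size.

```python
from typing import Tuple


def decode_read16(cdb_hex: str) -> Tuple[int, int]:
    """Return (start LBA, block count) of a READ(16) CDB given as hex bytes."""
    cdb = bytes.fromhex(cdb_hex)            # whitespace between bytes is ignored
    if len(cdb) != 16 or cdb[0] != 0x88:
        raise ValueError("not a READ(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")      # bytes 2..9: logical block address
    blocks = int.from_bytes(cdb[10:14], "big")  # bytes 10..13: transfer length
    return lba, blocks


# CDB copied from the sdb error logged on 06.08.2021.
lba, blocks = decode_read16("88 00 00 00 00 03 8e 3f c0 08 00 00 05 40 00 00")
print(lba, blocks)
```

The decoded start address equals the sector named in the `blk_update_request` line that follows the CDB in the log, so the timed-out command began at the very sector the kernel reported as failed.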