Hi All,
I have a 4-node all-flash vSAN setup. Each host has only one disk group (1 x 1 TB for cache and 2 x 1 TB for capacity).
On one host, we hit the issue "Propagated permanent disk failure in disk group".
In IPMI we are not seeing any disk failures; all three 1 TB disks look good and no errors are reported. When we delete the disk group and recreate it with the same three 1 TB disks, it fails again: after the disk group is created, it reports the same "Propagated permanent disk failure in disk group" issue while components are resyncing. Any idea how to fix this?
According to the screenshot, it looks like a physical disk issue. How could all three disks have gone bad? Any ideas?
[Screenshot: host cluster]
2019-12-18T06:20:10.373Z: [vSANCorrelator] 1414859421us: [esx.problem.vob.vsan.lsom.diskerror] vSAN device 52223d82-f6f3-940f-18d7-fb732e4f8afa is under permanent error.
2019-12-18T06:20:10.373Z: [vSANCorrelator] 1414850610us: [vob.vsan.lsom.diskerror] vSAN device 52223d82-f6f3-940f-18d7-fb732e4f8afa is under permanent error.
2019-12-18T06:20:10.373Z: [vSANCorrelator] 1414859556us: [esx.problem.vob.vsan.lsom.diskerror] vSAN device 52223d82-f6f3-940f-18d7-fb732e4f8afa is under permanent error.
2019-12-18T06:20:10.374Z: [vSANCorrelator] 1414850692us: [vob.vsan.lsom.diskpropagatedpermerror] vSAN device 5216a066-08a1-76fd-ef8b-a71795499031 is under propagated permanent error.
2019-12-18T06:20:10.374Z: [vSANCorrelator] 1414859604us: [esx.problem.vob.vsan.lsom.diskpropagatedpermerror] vSAN device 5216a066-08a1-76fd-ef8b-a71795499031 is under propagated permanent error.
2019-12-18T06:20:10.374Z: [vSANCorrelator] 1414850711us: [vob.vsan.lsom.diskpropagatedpermerror] vSAN device 52348100-2e98-e5f1-5678-0175ec7b1b37 is under propagated permanent error.
2019-12-18T06:20:10.374Z: [vSANCorrelator] 1414859674us: [esx.problem.vob.vsan.lsom.diskpropagatedpermerror] vSAN device 52348100-2e98-e5f1-5678-0175ec7b1b37 is under propagated permanent error.
2019-12-18T06:20:10.372Z cpu14:2102350)WARNING: PLOG: DDPCompleteDDPWrite:3015: Throttled: DDP write failed I/O error callback PLOGDDPCallbackFn@com.vmware.plog#0.0.0.1, diskgroup 52348100-2e98-e5f1-5678-0175ec7b1b37
2019-12-18T06:20:10.372Z cpu14:2102350)WARNING: PLOG: PLOGDDPCallbackFn:239: Throttled: DDP write failed on device 52223d82-f6f3-940f-18d7-fb732e4f8afa :I/O error
2019-12-18T06:20:10.372Z cpu29:2098599)WARNING: PLOG: PLOGPropagateError:2899: DDP: Propagating error state from original device 52223d82-f6f3-940f-18d7-fb732e4f8afa
2019-12-18T06:20:10.372Z cpu29:2098599)WARNING: PLOG: PLOGPropagateError:2941: DDP: Propagating error state to MDs in device 52348100-2e98-e5f1-5678-0175ec7b1b37
2019-12-18T06:20:10.373Z cpu29:2098599)WARNING: PLOG: PLOGPropagateErrorInt:2840: Permanent error event on 52223d82-f6f3-940f-18d7-fb732e4f8afa
2019-12-18T06:20:10.373Z cpu5:2102401)LSOM: LSOMLogDiskEvent:5668: Disk Event permanent error for MD 52223d82-f6f3-940f-18d7-fb732e4f8afa (naa.600304801c924d01243f90db170293fa:2)
2019-12-18T06:20:10.373Z cpu5:2102401)WARNING: LSOM: LSOMEventNotify:6976: vSAN device 52223d82-f6f3-940f-18d7-fb732e4f8afa is under permanent error.
2019-12-18T06:20:10.373Z cpu5:2102401)LSOM: LSOMLogDiskEvent:5668: Disk Event permanent error propagated for MD 5216a066-08a1-76fd-ef8b-a71795499031 (naa.600304801c924d0123f85a0c2533d4ad:2)
2019-12-18T06:20:10.373Z cpu5:2102401)WARNING: LSOM: LSOMEventNotify:6987: vSAN device 5216a066-08a1-76fd-ef8b-a71795499031 is under propagated permanent error.
2019-12-18T06:20:10.373Z cpu29:2098599)WARNING: PLOG: PLOGPropagateErrorInt:2856: Error/unhealthy propagate event on 5216a066-08a1-76fd-ef8b-a71795499031
2019-12-18T06:20:10.373Z cpu5:2102401)LSOM: LSOMLogDiskEvent:5668: Disk Event permanent error propagated for SSD 52348100-2e98-e5f1-5678-0175ec7b1b37 (naa.600304801c924d0123f859d621fac674:2)
2019-12-18T06:20:10.373Z cpu5:2102401)WARNING: LSOM: LSOMEventNotify:6987: vSAN device 52348100-2e98-e5f1-5678-0175ec7b1b37 is under propagated permanent error.
2019-12-18T06:20:10.373Z cpu29:2098599)WARNING: PLOG: PLOGPropagateErrorInt:2856: Error/unhealthy propagate event on 52348100-2e98-e5f1-5678-0175ec7b1b37
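To make the log excerpt easier to read, here is a minimal Python sketch that groups the vSAN device UUIDs by whether the log reports a "permanent error" or only a "propagated permanent error". It is only a reading aid, not a VMware tool: the function name `classify` and the sample list are hypothetical, and the patterns assume the exact message wording shown in the lines above. Run against these lines, it should show that only 52223d82-f6f3-940f-18d7-fb732e4f8afa carries the original error, while the other two devices are marked as propagated.

```python
import re
from collections import defaultdict

UUID = r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"

# Matches "vSAN device <uuid> is under [propagated] permanent error."
EVENT_RE = re.compile(
    rf"vSAN device (?P<uuid>{UUID}) is under "
    r"(?P<kind>propagated permanent|permanent) error"
)
# Matches "... for MD|SSD <uuid> (naa.<id>:<partition>)" in LSOMLogDiskEvent lines
NAA_RE = re.compile(
    rf"for (?P<role>MD|SSD) (?P<uuid>{UUID}) \((?P<naa>naa\.[0-9a-f]+):\d+\)"
)


def classify(lines):
    """Map each device UUID to its error state, log role label, and NAA id."""
    devices = defaultdict(lambda: {"state": None, "role": None, "naa": None})
    for line in lines:
        event = EVENT_RE.search(line)
        if event:
            state = "original" if event.group("kind") == "permanent" else "propagated"
            entry = devices[event.group("uuid")]
            # An original error is never downgraded to "propagated".
            if entry["state"] != "original":
                entry["state"] = state
        disk = NAA_RE.search(line)
        if disk:
            entry = devices[disk.group("uuid")]
            entry["role"] = disk.group("role")  # label exactly as printed in the log
            entry["naa"] = disk.group("naa")
    return dict(devices)


# Sample lines copied from the excerpt above (timestamps and CPU prefixes trimmed).
SAMPLE = [
    "LSOM: LSOMLogDiskEvent:5668: Disk Event permanent error for MD 52223d82-f6f3-940f-18d7-fb732e4f8afa (naa.600304801c924d01243f90db170293fa:2)",
    "WARNING: LSOM: LSOMEventNotify:6976: vSAN device 52223d82-f6f3-940f-18d7-fb732e4f8afa is under permanent error.",
    "LSOM: LSOMLogDiskEvent:5668: Disk Event permanent error propagated for MD 5216a066-08a1-76fd-ef8b-a71795499031 (naa.600304801c924d0123f85a0c2533d4ad:2)",
    "WARNING: LSOM: LSOMEventNotify:6987: vSAN device 5216a066-08a1-76fd-ef8b-a71795499031 is under propagated permanent error.",
    "LSOM: LSOMLogDiskEvent:5668: Disk Event permanent error propagated for SSD 52348100-2e98-e5f1-5678-0175ec7b1b37 (naa.600304801c924d0123f859d621fac674:2)",
    "WARNING: LSOM: LSOMEventNotify:6987: vSAN device 52348100-2e98-e5f1-5678-0175ec7b1b37 is under propagated permanent error.",
]

report = classify(SAMPLE)
for uuid, info in report.items():
    print(uuid, info["state"], info["role"], info["naa"])
```

Going by the PLOGPropagateError lines, the device worth checking first would be the one reported as the original, naa.600304801c924d01243f90db170293fa, rather than all three disks.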
Thanks,
Manivel RR