HPE Storage Users Group https://3parug.org/ |
3PAR 7200 replacement disk marked as Slow Drive and fails https://3parug.org/viewtopic.php?f=18&t=3748 |
Page 1 of 1 |
Author: | sivah [ Thu Feb 17, 2022 3:16 am ] |
Post subject: | 3PAR 7200 replacement disk marked as Slow Drive and fails |
Hi all, I have a weird issue with a 3PAR 7200. I have a failed disk with these specs: 900GB FC 10K 6G Encrypted HDD. Each time I replace it, servicemag resume succeeds. However, after a couple of hours the disk fails again, and the log shows that servicemag start then succeeds as well. I have tried 3 disks already, each with a different DOM (2013, 2014, 2015), and the result is still the same.

I then dug further into the logs. For each replacement, I noticed that after servicemag completes, the replacement disk is always marked as a candidate by the check_slow_disk task. The IOPS for the replaced disk is in the range of 105 to 135, while the ideal should be 140 for a 10K HDD.

This is the last extract of check_slow_disk before the disk failed for the 4th time.

2022-02-05 20:07:01 +08 Updated Executing "check_slow_disk" as 0:29843
2022-02-05 20:07:01 +08 Updated RPM 100 -> Good IOPS 2000
2022-02-05 20:07:01 +08 Updated RPM 10 -> Good IOPS 140
2022-02-05 20:07:01 +08 Updated RPM 150 -> Good IOPS 2000
2022-02-05 20:07:01 +08 Updated RPM 15 -> Good IOPS 180
2022-02-05 20:07:01 +08 Updated RPM 7 -> Good IOPS 60
2022-02-05 20:07:01 +08 Updated Running at interval 840 for 3360 seconds
2022-02-05 20:21:01 +08 Updated
2022-02-05 20:21:01 +08 Updated Starting next iteration
2022-02-05 20:21:01 +08 Updated
2022-02-05 20:21:01 +08 Updated Checking speed 7 drives
2022-02-05 20:21:01 +08 Updated Candidate:PDID: 27, adj_svct: 7.0, idle%: 99.7, iops: 0.5, kbps: 15.4, svct: 7.2
2022-02-05 20:21:01 +08 Updated Next:PDID: 19, adj_svct: 6.6, idle%: 99.8, iops: 0.4, kbps: 12.6, svct: 6.8
2022-02-05 20:21:01 +08 Updated Checking speed 10 drives
2022-02-05 20:21:01 +08 Updated Candidate:PDID: 64, adj_svct: 59.4, idle%: 7.6, iops: 109.7, kbps: 3027.2, svct: 98.3
2022-02-05 20:21:01 +08 Updated Next:PDID: 11, adj_svct: 15.3, idle%: 19.6, iops: 122.1, kbps: 3355.2, svct: 58.6
2022-02-05 20:35:01 +08 Updated
2022-02-05 20:35:01 +08 Updated Starting next iteration
2022-02-05 20:35:01 +08 Updated
2022-02-05 20:35:01 +08 Updated Checking speed 7 drives
2022-02-05 20:35:01 +08 Updated Candidate:PDID: 26, adj_svct: 4.2, idle%: 99.7, iops: 0.8, kbps: 33.2, svct: 4.6
2022-02-05 20:35:01 +08 Updated Next:PDID: 19, adj_svct: 3.9, idle%: 99.9, iops: 0.3, kbps: 11.6, svct: 4.1
2022-02-05 20:35:01 +08 Updated Checking speed 10 drives
2022-02-05 20:35:01 +08 Updated Candidate:PDID: 64, adj_svct: 113.1, idle%: 1.8, iops: 129.2, kbps: 3842.5, svct: 159.6
2022-02-05 20:35:01 +08 Updated Next:PDID: 36, adj_svct: 45.9, idle%: 10.5, iops: 143.5, kbps: 3871.3, svct: 96.7
2022-02-05 20:49:02 +08 Updated
2022-02-05 20:49:02 +08 Updated Starting next iteration
2022-02-05 20:49:02 +08 Updated
2022-02-05 20:49:02 +08 Updated Checking speed 7 drives
2022-02-05 20:49:02 +08 Updated Candidate:PDID: 19, adj_svct: 4.9, idle%: 99.8, iops: 0.4, kbps: 13.5, svct: 5.1
2022-02-05 20:49:02 +08 Updated Next:PDID: 27, adj_svct: 4.1, idle%: 99.8, iops: 0.5, kbps: 17.3, svct: 4.4
2022-02-05 20:49:02 +08 Updated Checking speed 10 drives
2022-02-05 20:49:02 +08 Updated Candidate:PDID: 64, adj_svct: 96.6, idle%: 2.0, iops: 128.5, kbps: 3936.1, svct: 143.0
2022-02-05 20:49:02 +08 Updated Next:PDID: 36, adj_svct: 29.2, idle%: 11.7, iops: 136.4, kbps: 3825.2, svct: 77.8
2022-02-05 21:03:02 +08 Updated
2022-02-05 21:03:02 +08 Updated Starting next iteration
2022-02-05 21:03:02 +08 Updated
2022-02-05 21:03:02 +08 Updated Checking speed 7 drives
2022-02-05 21:03:02 +08 Updated Candidate:PDID: 29, adj_svct: 12.5, idle%: 99.2, iops: 1.6, kbps: 289.4, svct: 13.7
2022-02-05 21:03:02 +08 Updated Next:PDID: 21, adj_svct: 12.5, idle%: 99.2, iops: 1.5, kbps: 282.6, svct: 13.6
2022-02-05 21:03:02 +08 Updated Checking speed 10 drives
2022-02-05 21:03:02 +08 Updated Candidate:PDID: 64, adj_svct: 105.8, idle%: 1.6, iops: 130.9, kbps: 4184.9, svct: 153.4
2022-02-05 21:03:02 +08 Updated Next:PDID: 35, adj_svct: 22.6, idle%: 12.0, iops: 142.0, kbps: 4152.3, svct: 73.5
2022-02-05 21:03:02 +08 Updated
2022-02-05 21:03:02 +08 Updated FOUND SLOW DRIVE: PDID: 64, adj_svct: 105.8, idle%: 1.6, iops: 130.9, kbps: 4184.9, svct: 153.4
2022-02-05 21:03:02 +08 Updated Marking slow disk 64 failed
2022-02-05 21:03:02 +08 Updated Failed PDID 64
2022-02-05 21:03:02 +08 Updated
2022-02-05 21:03:02 +08 Completed.

The latest servicemag start:

2022-02-06 00:30:36 +08 Updated Executing "sstart_pd_64" as 1:15777
2022-02-06 00:30:36 +08 Updated servicemag start -wait -pdid 64
2022-02-06 00:30:36 +08 Updated ... servicing disks in mag: 3 0
2022-02-06 00:30:36 +08 Updated ... normal disks:
2022-02-06 00:30:36 +08 Updated ... not normal disks: WWN [5000C5007F6EFABC] Id [64] diskpos [0]
2022-02-06 00:30:36 +08 Updated ... relocating chunklets to spare space...
2022-02-06 00:30:47 +08 Updated ... bypassing mag 3 0
2022-02-06 00:31:27 +08 Updated ... bypassed mag 3 0
2022-02-06 00:31:27 +08 Updated servicemag start -wait -pdid 64 -- Succeeded
2022-02-06 00:31:27 +08 Completed scheduled task.

I noticed that the replacement disk is the candidate for checking 10 consecutive times, and then the system marks it as Failed. Has anyone experienced this same issue? Is there a way to keep the disk in this specific slot from being marked as slow? |
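For anyone who wants to tally this kind of task output rather than read it by eye, here is a minimal Python sketch. It only parses the line shapes visible in the excerpt above (the "Good IOPS" thresholds, the "Checking speed N drives" headers, and the Candidate/Next entries). The function names parse_task_log and candidate_streak are made up for illustration, and the sketch does not claim to reproduce how check_slow_disk actually decides to fail a drive; it just counts what the log shows.

```python
import re

# 10K-class lines copied from the task output in the post above.
LOG = """\
2022-02-05 20:07:01 +08 Updated RPM 10 -> Good IOPS 140
2022-02-05 20:07:01 +08 Updated RPM 15 -> Good IOPS 180
2022-02-05 20:07:01 +08 Updated RPM 7 -> Good IOPS 60
2022-02-05 20:21:01 +08 Updated Checking speed 10 drives
2022-02-05 20:21:01 +08 Updated Candidate:PDID: 64, adj_svct: 59.4, idle%: 7.6, iops: 109.7, kbps: 3027.2, svct: 98.3
2022-02-05 20:21:01 +08 Updated Next:PDID: 11, adj_svct: 15.3, idle%: 19.6, iops: 122.1, kbps: 3355.2, svct: 58.6
2022-02-05 20:35:01 +08 Updated Checking speed 10 drives
2022-02-05 20:35:01 +08 Updated Candidate:PDID: 64, adj_svct: 113.1, idle%: 1.8, iops: 129.2, kbps: 3842.5, svct: 159.6
2022-02-05 20:35:01 +08 Updated Next:PDID: 36, adj_svct: 45.9, idle%: 10.5, iops: 143.5, kbps: 3871.3, svct: 96.7
2022-02-05 20:49:02 +08 Updated Checking speed 10 drives
2022-02-05 20:49:02 +08 Updated Candidate:PDID: 64, adj_svct: 96.6, idle%: 2.0, iops: 128.5, kbps: 3936.1, svct: 143.0
2022-02-05 20:49:02 +08 Updated Next:PDID: 36, adj_svct: 29.2, idle%: 11.7, iops: 136.4, kbps: 3825.2, svct: 77.8
2022-02-05 21:03:02 +08 Updated Checking speed 10 drives
2022-02-05 21:03:02 +08 Updated Candidate:PDID: 64, adj_svct: 105.8, idle%: 1.6, iops: 130.9, kbps: 4184.9, svct: 153.4
2022-02-05 21:03:02 +08 Updated Next:PDID: 35, adj_svct: 22.6, idle%: 12.0, iops: 142.0, kbps: 4152.3, svct: 73.5
"""

GOOD_RE = re.compile(r"RPM (\d+) -> Good IOPS (\d+)")
CLASS_RE = re.compile(r"Checking speed (\d+) drives")
ENTRY_RE = re.compile(r"(Candidate|Next):PDID: (\d+), (.+)")


def parse_task_log(text):
    """Return (good_iops_by_speed_class, entries) parsed from the task output."""
    good, entries, speed = {}, [], None
    for line in text.splitlines():
        m = GOOD_RE.search(line)
        if m:
            good[int(m.group(1))] = int(m.group(2))
            continue
        m = CLASS_RE.search(line)
        if m:
            speed = int(m.group(1))  # entries that follow belong to this class
            continue
        m = ENTRY_RE.search(line)
        if m:
            entry = {"speed": speed, "role": m.group(1), "pdid": int(m.group(2))}
            for part in m.group(3).split(","):
                key, value = part.split(":")
                entry[key.strip()] = float(value)
            entries.append(entry)
    return good, entries


def candidate_streak(entries, speed, pdid):
    """Count how many times in a row, up to the end of the log, pdid was the Candidate."""
    streak = 0
    for e in entries:
        if e["speed"] == speed and e["role"] == "Candidate":
            streak = streak + 1 if e["pdid"] == pdid else 0
    return streak


good, entries = parse_task_log(LOG)
streak = candidate_streak(entries, speed=10, pdid=64)

for e in entries:
    if e["speed"] == 10 and e["role"] == "Candidate":
        print(f"PD {e['pdid']}: iops {e['iops']} (good: {good[10]}), svct {e['svct']}")
print(f"PD 64 consecutive candidate iterations in this excerpt: {streak}")
```

On this excerpt, PD 64 is the 10K candidate in every iteration and its IOPS stays below the 140 "Good IOPS" figure each time, while its svct is well above that of the "Next" drive. Note that the excerpt covers one run of the task only (interval 840 over 3360 seconds), so the count here is limited to the iterations shown.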
Author: | MammaGutt [ Thu Feb 17, 2022 4:50 pm ] |
Post subject: | Re: 3PAR 7200 replacement disk marked as Slow Drive and fail |
Just asking, could the issue be the cage slot and not the PDs? Are you seeing SAS errors or anything similar on the slot? From what I see, the drive has very high svct (service time, or latency in plain English), which is probably why it is always a candidate. |
Author: | sivah [ Wed Mar 09, 2022 12:35 am ] |
Post subject: | Re: 3PAR 7200 replacement disk marked as Slow Drive and fail |
Hi, just an update on this. I searched and found that HPE has actually phased out the 900GB Encrypted HDDs that we are currently using and issued an advisory to use 1.2TB Encrypted HDDs instead.

Advisory: (Revised) HPE 3PAR StoreServ 7000 Storage And HPE 3PAR StoreServ 10000 Storage - Transitioning From HCBRE, HCEP, And Certain SLTN HDD Spare Parts To Alternate Replacement HDD Spare Parts https://support.hpe.com/hpesc/public/do ... 28695en_us

I finally ordered the 1.2TB disk instead, which has a DOM of 2018, and it has now been working for 5 days after replacement with no signs of being a "slow drive". It seems the 900GB Encrypted HDDs we were using as replacements were just old and bad, even though those parts were bought from multiple suppliers. |
Author: | MammaGutt [ Wed Mar 09, 2022 3:12 am ] |
Post subject: | Re: 3PAR 7200 replacement disk marked as Slow Drive and fail |
I was told back in the day that the 900 GB drives were discontinued because no vendor continued to make them once they released a new series of drives. |
All times are UTC - 5 hours |