We've encountered iSCSI performance issues when opening multiple sessions per server NIC. While analyzing the packet dumps, we noticed that the storage seems to drop frames, causing client-triggered retransmits.
Looking at the 3PAR, we noticed consistent errors on all our Ethernet/iSCSI interfaces: MACRxFramesDroppedCount hovers at a constant 1.9/sec per device:
Storage:
3Par 7200, Release version 3.1.2 (MU3)
NIC: QLOGIC QLE8242 58 4.11.124
1:2:2 MACTxFramesCount 795.9 329.2 27.8
1:2:2 MACTxBytesCount 732199.6 202178.3 1857515.0
1:2:2 MACTxMulticastFrameCount 0.8 1.5 1.9
1:2:2 MACTxBroadcastFrameCount 0.0 0.0 0.0
1:2:2 MACTxPauseFrameCount 0.0 0.0 0.0
1:2:2 MACTxControlFrameCount 0.0 0.0 0.0
1:2:2 MACTxDeferralsCount 0.0 0.0 0.0
1:2:2 MACTxExcessDeferralsCount 0.0 0.0 0.0
1:2:2 MACTxLateCollisionsCount 0.0 0.0 0.0
1:2:2 MACTxAbortsCount 0.0 0.0 0.0
1:2:2 MACTxSingleCollisionsCount 0.0 0.0 0.0
1:2:2 MACTxMultipleCollisionsCount 0.0 0.0 0.0
1:2:2 MACTxCollisionsCount 0.0 0.0 0.0
1:2:2 MACTxFramesDroppedCount 0.0 0.0 0.0
1:2:2 MACTxJumboFramesCount 0.0 0.0 0.0
1:2:2 MACRxFramesCount 1183.1 652.7 117.4
1:2:2 MACRxBytesCount 1908654.1 1069065.6 1289127.8
1:2:2 MACRxUnknownControlFramesCount 0.0 0.0 0.0
1:2:2 MACRxPauseFramesCount 0.0 0.0 0.0
1:2:2 MACRxControlFramesCount 0.0 0.0 0.0
1:2:2 MACRxDribbleCount 0.0 0.0 0.0
1:2:2 MACRxFrameLengthErrorCount 0.0 0.0 0.0
1:2:2 MACRxJabberCount 0.0 0.0 0.0
1:2:2 MACRxCarrierSenseErrorCount 0.0 0.0 0.0
1:2:2 MACRxFramesDroppedCount 1.6 1.0 1.9
1:2:2 MACRxCRCErrorCount 0.0 0.0 0.0
1:2:2 MACRxEncodingErrorCount 0.0 0.0 0.0
1:2:2 MACRxLengthErrorLargeCount 0.0 0.0 0.0
1:2:2 MACRxLengthErrorSmallCount 0.0 0.0 0.0
1:2:2 MACRxMulticastFrameCount 0.0 0.0 0.0
1:2:2 MACRxBroadcastFrameCount 0.0 0.0 0.0
1:2:2 IPTxPacketsCount 325.1 189.7 262.8
1:2:2 IPTxBytesCount 707851.0 191637.5 1821209.5
1:2:2 IPTxFragmentsCount 0.0 0.0 0.0
1:2:2 IPRxPacketsCount 2179.0 528.9 381.4
1:2:2 IPRxBytesCount 4329169.0 1102505.0 1247142.3
1:2:2 IPRxFragmentsCount 0.0 0.0 0.0
1:2:2 IPDatagramReassemblyCount 0.0 0.0 0.0
1:2:2 IPInvalidAddrErrorCount 0.0 0.0 0.0
1:2:2 IPErrorPacketCount 0.0 0.0 0.0
1:2:2 IPFragmentReceivedOverlapCount 0.0 0.0 0.0
1:2:2 IPFragmentReceivedOutOfOrderCount 0.0 0.0 0.0
1:2:2 IPDatagramReassemblyTimeoutCount 0.0 0.0 0.0
1:2:2 TCPTxSegmentsCount 325.1 189.7 262.8
1:2:2 TCPTxBytesCount 688217.3 182712.0 1794061.3
1:2:2 TCPRxSegmentsCount 2179.0 528.9 381.4
1:2:2 TCPRxBytesCount 4285589.8 1091926.9 1239514.5
1:2:2 TCPPersistTimerExpiredCount 0.0 0.0 0.8
1:2:2 TCPRetransmitTimerExpiredCount 0.0 0.3 0.8
1:2:2 TCPRxDuplicateACKCount 48.6 8.9 23.1
1:2:2 TCPRxPureACKCount 0.0 0.0 0.0
1:2:2 TCPTxDelayedACKCount 0.0 0.0 0.8
1:2:2 TCPTxPureACKCount 78.1 21.7 41.9
1:2:2 TCPRxSegmentErrorCount 0.0 0.0 0.0
1:2:2 TCPRxSegmentOutOfOrderCount 0.0 0.0 0.5
1:2:2 TCPRxWindowProbeCount 0.0 0.0 0.0
1:2:2 TCPRxWindowUpdateCount 1900.9 515.4 629.3
1:2:2 ECCErrorCorrectionCount 0.0 0.0 0.0
1:2:2 iSCSITxPDUCount 442.2 125.0 140.4
1:2:2 iSCSITxBytesCount 208848.9 46889.9 1776633.4
1:2:2 iSCSIRxPDUCount 881.1 240.3 191.5
1:2:2 iSCSIRxBytesCount 3468907.0 938496.7 1230323.7
1:2:2 iSCSIIOsCompletedCount 435.8 119.4 100.3
1:2:2 iSCSIRxUnexpectedIOCount 0.0 0.0 0.0
1:2:2 iSCSIFormatErrorCount 0.0 0.0 0.0
1:2:2 iSCSIHeaderDigestErrorCount 0.0 0.0 0.0
1:2:2 iSCSIDataDigestErrorCount 0.0 0.0 0.0
1:2:2 iSCSISequenceErrorCount 0.0 0.0 0.0
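To put the drop rate in perspective, here is a minimal sketch that relates MACRxFramesDroppedCount to MACRxFramesCount for port 1:2:2. The three columns are unlabeled in the output above, so treating them as three independent samples is an assumption, and the helper name `drop_percent` is ours.

```python
# Rates copied from the port 1:2:2 output above. The three columns are
# unlabeled, so they are treated here as three separate samples (assumption).
rx_frames = [1183.1, 652.7, 117.4]   # MACRxFramesCount
rx_dropped = [1.6, 1.0, 1.9]         # MACRxFramesDroppedCount


def drop_percent(dropped, received):
    """Return dropped frames as a percentage of received frames."""
    return 100.0 * dropped / received


drop_percentages = [drop_percent(d, r) for d, r in zip(rx_dropped, rx_frames)]

for pct in drop_percentages:
    print(f"{pct:.2f}% of received frames dropped")
```

The ratio stays well below 1% in the first two columns and rises above 1.5% in the third, where the receive rate is lowest while the drop rate is roughly unchanged.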
The drops are accompanied by a large number of pause frames on the switch side:
Switch: HP A5800-24G, Ver 5.20.105 Rel 1808P02
Peak value of input: 268107625 bytes/sec, at 2013-05-29 10:51:50
Peak value of output: 170180225 bytes/sec, at 2000-04-27 14:11:59
Last 300 seconds input: 811 packets/sec 112135 bytes/sec 0%
Last 300 seconds output: 438 packets/sec 1216808 bytes/sec 0%
Input (total): 42387958132 packets, 99283502231747 bytes
38095148083 unicasts, 22746 broadcasts, 1182806 multicasts, 4291604497 pauses
Input (normal): 38096353635 packets, - bytes
38095148083 unicasts, 22746 broadcasts, 1182806 multicasts, 4291604497 pauses
Input: 0 input errors, 0 runts, 0 giants, 0 throttles
0 CRC, 0 frame, - overruns, 0 aborts
- ignored, - parity errors
Output (total): 38911347477 packets, 177770882637100 bytes
38860888889 unicasts, 18314621 broadcasts, 32143967 multicasts, 0 pauses
Output (normal): 38911347477 packets, - bytes
38860888889 unicasts, 18314621 broadcasts, 32143967 multicasts, 0 pauses
Output: 0 output errors, - underruns, - buffer failures
0 aborts, 0 deferred, 0 collisions, 0 late collisions
0 lost carrier, - no carrier
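As a quick sanity check on the switch counters, the following sketch computes what share of all input packets on this port are pause frames. The numbers are copied from the output above; the variable names are ours, and the counters are cumulative, so this says nothing about when the pauses occurred.

```python
# Cumulative input counters copied from the switch port output above.
input_total = 42387958132
unicasts = 38095148083
broadcasts = 22746
multicasts = 1182806
pauses = 4291604497

normal = unicasts + broadcasts + multicasts

# Consistency check: "Input (normal)" plus pauses should equal "Input (total)".
assert normal == 38096353635
assert normal + pauses == input_total

pause_share = 100.0 * pauses / input_total
print(f"pause frames: {pause_share:.1f}% of all input packets")
```

By these counters, roughly one in ten packets received on the switch port is a pause frame, while the switch itself has sent none.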
We are a bit stumped on how to investigate this issue further. Are there any other tools or commands that could provide more insight into what's happening?
best,
Michael