This saga could also be titled “Why does my very expensive LSI 9305-24i drop disks?”.
I recently acquired (from eBay, as I cannot justify giving NZ$2000 to local scumbags) an LSI 9305-24i controller to drive a 24-bay NAS.
The reason for switching from the LSI 9211-8i with a RES2CV360 SAS2 expander is that the 9211-8i is bottlenecked by its PCIe 2.0 x8 link, which limits the throughput of a 24-disk array to about 100MB/s per disk.
The LSI 9305-24i gave a significant performance boost, with ~200MB/s per disk (maxing out the Seagate IronWolfs).
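As a rough sanity check on those per-disk figures, here is a back-of-the-envelope sketch in Python. It uses the nominal PCIe line rates (5 GT/s with 8b/10b encoding for PCIe 2.0, 8 GT/s with 128b/130b encoding for PCIe 3.0) and assumes the 9305-24i sits on a PCIe 3.0 x8 link. The function name and constants are just for illustration, and the results are theoretical ceilings, not measurements.

```python
def link_mb_per_s(gt_per_s, payload_bits, line_bits, lanes):
    """Nominal one-direction PCIe bandwidth in MB/s, before protocol overhead."""
    bits_per_s = gt_per_s * 1e9 * payload_bits / line_bits * lanes
    return bits_per_s / 8 / 1e6

DISKS = 24  # size of the array described in this post

# Assumption: nominal link parameters, not anything measured on these cards.
pcie2_x8 = link_mb_per_s(5.0, 8, 10, lanes=8)     # 9211-8i class link
pcie3_x8 = link_mb_per_s(8.0, 128, 130, lanes=8)  # assumed link for the 9305-24i

for name, total in (("PCIe 2.0 x8", pcie2_x8), ("PCIe 3.0 x8", pcie3_x8)):
    print(f"{name}: {total:.0f} MB/s total, {total / DISKS:.0f} MB/s per disk")
```

The nominal PCIe 2.0 x8 ceiling works out to roughly 167MB/s per disk across 24 disks, and the PCIe 3.0 x8 ceiling to roughly 328MB/s. The ~100MB/s actually seen on the old card also reflects protocol overhead and whatever else sits in the path, so this only shows that the older link leaves little headroom for 200MB/s disks.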
Unfortunately, because I needed SFF-8643 to SFF-8087 cables, my sourcing options were limited. I got 6x SFF-8643 to SFF-8087 cables from Amazon (HiFibre branded).
This whole setup worked wonders until random disks started dropping out of the array.
When that happened I would see errors like these:
[ 1059.345207] mpt3sas_cm0: log_info(0x31111000): originator(PL), code(0x11), sub_code(0x1000)
[ 1060.094729] mpt3sas_cm0: log_info(0x31110635): originator(PL), code(0x11), sub_code(0x0635)
[ 1065.399121] mpt3sas_cm0: log_info(0x31120311): originator(PL), code(0x12), sub_code(0x0311)
[ 1065.399128] mpt3sas_cm0: log_info(0x31120311): originator(PL), code(0x12), sub_code(0x0311)
[ 1170.818427] mpt3sas_cm0: mpt3sas_transport_port_remove: removed: sas_addr(0x300062b20299153f)
[ 1170.818429] mpt3sas_cm0: removing handle(0x0024), sas_addr(0x300062b20299153f)
[ 1170.818429] mpt3sas_cm0: enclosure logical id(0x500062b202991533), slot(23)
[ 1170.818430] mpt3sas_cm0: enclosure level(0x0000), connector name( )
[ 3758.857760] sd 0:0:49:0: [sdl] tag#4877 Add. Sense: Information unit iuCRC error detected
[ 3758.857761] sd 0:0:49:0: [sdl] tag#4877 CDB: Read(16) 88 00 00 00 00 00 00 ce 38 e0 00 00 01 00 00 00
[ 3758.857762] blk_update_request: I/O error, dev sdl, sector 13514976 op 0x0:(READ) flags 0x80700 phys_seg 17 prio class 0
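The `log_info` word in those mpt3sas lines is a packed value, and the driver already prints its originator, code and sub_code fields. For anyone wanting to count how often each value shows up in a saved kernel log, a minimal Python sketch follows. The bit layout (bus type in the top 4 bits, then originator, code and sub_code) is taken from how the values above break down, the helper names are made up for illustration, and the sketch does not try to explain what individual codes mean.

```python
import re
from collections import Counter

# Originator names as printed by the driver (PL appears in the log above).
ORIGINATORS = {0: "IOP", 1: "PL", 2: "IR"}
LOG_INFO_RE = re.compile(r"log_info\(0x([0-9a-fA-F]{8})\)")

def decode_log_info(value):
    """Split a 32-bit log_info word into its packed fields."""
    return {
        "bus_type": (value >> 28) & 0xF,
        "originator": ORIGINATORS.get((value >> 24) & 0xF, "unknown"),
        "code": (value >> 16) & 0xFF,
        "sub_code": value & 0xFFFF,
    }

def tally_log_info(text):
    """Count how many times each log_info value appears in kernel log text."""
    return Counter(int(m.group(1), 16) for m in LOG_INFO_RE.finditer(text))

# Sample lines copied from the log excerpt above.
sample = """\
[ 1059.345207] mpt3sas_cm0: log_info(0x31111000): originator(PL), code(0x11), sub_code(0x1000)
[ 1060.094729] mpt3sas_cm0: log_info(0x31110635): originator(PL), code(0x11), sub_code(0x0635)
[ 1065.399121] mpt3sas_cm0: log_info(0x31120311): originator(PL), code(0x12), sub_code(0x0311)
[ 1065.399128] mpt3sas_cm0: log_info(0x31120311): originator(PL), code(0x12), sub_code(0x0311)
"""

for value, count in tally_log_info(sample).most_common():
    fields = decode_log_info(value)
    print(f"0x{value:08x} x{count}: {fields['originator']} "
          f"code=0x{fields['code']:02x} sub_code=0x{fields['sub_code']:04x}")
```

Feeding it the text captured from `dmesg` instead of `sample` gives a quick picture of which values dominate during a bad spell.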
After numerous swaps of cables, disks and backplanes, the issue was isolated to a particular port on the card, which made me believe the card was faulty.
I needed confirmation, so I bought another cable from Amazon (CableCreation brand).
The port problem persisted. The issue was very intermittent, but while poking about I discovered that I could trigger it by gently moving the cable while it was plugged in!
Not only that, but the issue could be replicated with other cables and ports.
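When wiggling cables like this it helps to know which disks and slots actually complained. The sketch below is one way to summarize that from captured kernel log text. It only matches the two message shapes that appear in the log excerpt earlier in this post (the `I/O error, dev ...` line and the `enclosure logical id(...), slot(...)` line), and the function and variable names are hypothetical.

```python
import re
from collections import Counter

IO_ERROR_RE = re.compile(r"I/O error, dev (\w+), sector \d+")
SLOT_RE = re.compile(r"enclosure logical id\(0x[0-9a-fA-F]+\), slot\((\d+)\)")

def summarize(text):
    """Return (I/O errors per block device, mentions per enclosure slot)."""
    io_errors = Counter(m.group(1) for m in IO_ERROR_RE.finditer(text))
    slots_seen = Counter(int(m.group(1)) for m in SLOT_RE.finditer(text))
    return io_errors, slots_seen

# Sample lines copied from the log excerpt earlier in the post.
sample = """\
[ 1170.818429] mpt3sas_cm0: removing handle(0x0024), sas_addr(0x300062b20299153f)
[ 1170.818429] mpt3sas_cm0: enclosure logical id(0x500062b202991533), slot(23)
[ 3758.857762] blk_update_request: I/O error, dev sdl, sector 13514976 op 0x0:(READ) flags 0x80700 phys_seg 17 prio class 0
"""

io_errors, slots_seen = summarize(sample)
for dev, count in io_errors.most_common():
    print(f"{dev}: {count} I/O error(s)")
for slot, count in sorted(slots_seen.items()):
    print(f"slot {slot}: mentioned {count} time(s)")
```

Comparing the summary before and after moving a given cable shows whether the errors follow the cable, the port or the disk.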
Here is a clip of the issue that is causing I/O errors:
This motion could be caused by someone walking, or even thermal expansion.
Here is a position where errors are less likely:
Here is the position where it is most likely to drop a disk or two:
So, at this stage I have sunk hundreds of dollars into cables and have an awesome card I can’t use.
I decided to buy Dell cables (this time 75cm long, so there is less tension on the plugs), and will probably make some clips (or even use hot glue) to secure them in the adapter.
While I wait, I have gone back to the old configuration. I would rather have sub-optimal performance than data loss.
Did you find a solution to this problem, other than going back to your earlier setup?
Funny (not funny): I have the exact same issue. I bought new cables and decided that 2 ports were faulty, but after reading this I think it may be related.
SFF-8643 (Mini-SAS HD) is a very poorly designed solution for enterprise work.
But there must be a workaround, since we have these otherwise fine HBA cards?
I have actually shelved the adapter; waste of $700 to be honest.
I retried recently, when the 9211-8i died, with the Dell cables, and there were still errors. I tried cable ties and hot glue to hold the cables in place, with no luck either way.
I don’t see these problems in enterprise hardware, so I suspect that either the connector on the card or the card itself is faulty (it is from eBay, as there is no way in hell I am paying $2000+ for an HBA).
Most of the servers I deal with run these connectors between the card and the backplane, and across hundreds of servers there are absolutely zero issues.
Seems like a pretty cool HBA for use with SFF-8643 backplanes.