
Problems with iSCSI connection to SAN


We're having an issue with our iSCSI SAN constantly performing logical unit (LU) resets, which kills its performance and causes our Exchange databases to dismount because they lose sync. It's also causing problems when live-migrating data (VM storage) to the SAN, which is set up as a Cluster Shared Volume (CSV).

Our setup:

Three Dell R710s, all running the same Server 2012 R2 build, fully patched, with current firmware and drivers for the BIOS, the PERC 6/i controllers, and the NICs. They are set up identically, and they're all in a cluster on our domain.

Of the four NICs in each server, the first two are teamed, and Hyper-V uses that team for a virtual switch with a management vNIC; the vNICs all have unique static IPs. The other two NICs are reserved for iSCSI: each has its own static IP on a separate iSCSI subnet, with no gateway to the main subnet or the internet, since we wanted to keep the two networks completely separate. Jumbo frames are turned on and set to 4088.
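
For reference, this is roughly the PowerShell equivalent of that teaming and jumbo-frame setup. Adapter and switch names here are placeholders, and the exact jumbo-frame property name and value depend on the NIC driver, so check Get-NetAdapterAdvancedProperty on your hardware first:

    # Team the first two NICs, then bind a Hyper-V switch to the team
    New-NetLbfoTeam -Name "LANTeam" -TeamMembers "NIC1","NIC2" -TeamingMode SwitchIndependent
    New-VMSwitch -Name "vSwitch1" -NetAdapterName "LANTeam" -AllowManagementOS $true

    # Enable jumbo frames on the two dedicated iSCSI NICs (4088 to match the SAN),
    # then confirm the driver accepted the value
    Set-NetAdapterAdvancedProperty -Name "iSCSI1","iSCSI2" -DisplayName "Jumbo Packet" -DisplayValue "4088"
    Get-NetAdapterAdvancedProperty -Name "iSCSI1","iSCSI2" | Where-Object DisplayName -like "*Jumbo*"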

For the iSCSI switch, we are using a Dell PowerConnect 5424 with the latest firmware. I set up a LAG on the first 8 ports, which go to our main SAN's 8 Ethernet connections, and a matching LAG across all 8 ports on the SAN (a D-Link DSN-3200), which was assigned a static IP. The switch is running in iSCSI mode, spanning tree is turned off, and storm control is turned on. All three servers connect directly to this switch with their two dedicated NICs. Jumbo frames are turned on.
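
For what it's worth, jumbo frames can be verified end to end with a don't-fragment ping from each host to the SAN's LAG IP (placeholder address below): a 4088 MTU minus 20 bytes of IP header and 8 bytes of ICMP header leaves a 4060-byte payload, and if this fails anywhere along the path, something in between isn't passing jumbo frames.

    ping 10.10.10.100 -f -l 4060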

The SAN, like I said, is a 15-bay D-Link DSN-3200 with up-to-date firmware, its LAG connected to the LAG on the switch. To start, I created a larger 4 TB RAID 5 volume for our main VM storage and a small 1 GB RAID 5 array for our cluster witness volume. Jumbo frames for each volume are set to 4088.

Back on each of the servers, I used the iSCSI Initiator to connect them all and chose MPIO. After adding the first session, I went back in and created a second session connecting the second NIC to the SAN's IP; MPIO shows the two connections as it should. I then went to Disk Management on one of the servers, brought both of the new disks online, created a volume on each without assigning a drive letter, and took them both offline again. The disks show up on each of the servers successfully. Everything APPEARS to be set up correctly, end to end from SAN to servers, according to all the documentation I've been able to find online.
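
For reference, the PowerShell equivalent of those initiator steps looks roughly like this (the IPs are placeholders for our iSCSI subnet; one session per dedicated NIC, with multipath enabled on each):

    # Register the portal, then create one session per iSCSI NIC
    New-IscsiTargetPortal -TargetPortalAddress 10.10.10.100
    $target = Get-IscsiTarget
    Connect-IscsiTarget -NodeAddress $target.NodeAddress -IsPersistent $true -IsMultipathEnabled $true -InitiatorPortalAddress 10.10.10.11 -TargetPortalAddress 10.10.10.100
    Connect-IscsiTarget -NodeAddress $target.NodeAddress -IsPersistent $true -IsMultipathEnabled $true -InitiatorPortalAddress 10.10.10.12 -TargetPortalAddress 10.10.10.100

    # Confirm both sessions, then check the MPIO claim and load-balance policy
    Get-IscsiSession | Select-Object InitiatorNodeAddress, TargetNodeAddress, IsConnected
    mpclaim -s -d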

I then created the cluster and added the three servers. The configuration validation test passed with only a few warnings, which I can post if needed, but they were mainly about the two sets of NICs being on separate subnets, and the cluster manager successfully found and added the two SAN volumes. I made the 1 GB disk the cluster witness disk and the other a CSV.
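
For anyone following along, the PowerShell equivalent of the cluster build is roughly the following; node names, the cluster IP, and the disk resource names are placeholders, and the disk names have to match whatever Failover Cluster Manager assigned:

    # Validate, then build the three-node cluster
    Test-Cluster -Node HOST1,HOST2,HOST3
    New-Cluster -Name HVCLUSTER -Node HOST1,HOST2,HOST3 -StaticAddress 192.168.1.50

    # 1 GB disk as the witness, 4 TB disk as a Cluster Shared Volume
    Set-ClusterQuorum -NodeAndDiskMajority "Cluster Disk 1"
    Add-ClusterSharedVolume -Name "Cluster Disk 2"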

With the three servers already running Hyper-V with multiple VMs on each, I went to Failover Cluster Manager, created VM roles for high availability, and imported them all. The only warning was that the VMs' storage was local, not on the CSV, so my next task was to move the VM storage onto the CSV on the SAN. As a test, I transferred our four email servers, which run in a DAG (each on Server 2012), to the SAN one by one. The first two, each about 100 GB, live-migrated within a few minutes apiece with no problem and never went down; it went perfectly. But with the third and fourth servers, as soon as I begin transferring one of them to the CSV, it goes extremely slowly. If I open Performance Monitor and watch the two NICs dedicated to the D-Link, they mirror each other, but very little traffic is moving: it stops completely, then picks up some, then stops again.
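
The transfers themselves are plain storage moves to the CSV path; a sketch with placeholder names, in case anyone wants to reproduce the test:

    # Move one DAG member's storage onto the CSV (VM name and path are placeholders)
    Move-VMStorage -VMName "MAIL3" -DestinationStoragePath "C:\ClusterStorage\Volume1\MAIL3"

    # Watch the iSCSI NICs while the move runs
    Get-Counter -Counter '\Network Interface(*)\Bytes Total/sec' -Continuous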

Confused, I opened the D-Link management app to look at the SAN's log and saw constant "LU reset occurred" warnings. These coincide with the sudden drops in speed, and while the warnings are happening, if I try to connect to one of the VMs hosted on the SAN, the VM is non-responsive, almost as if frozen; a minute or two later it "wakes up" and starts responding normally.

Those constant LU (logical unit) resets are causing me a lot of headaches right now, and I am really trying to find the source of the issue. I have been fighting these resets for weeks: I've tried multiple configurations, rebuilt the cluster over and over, applied firmware and driver updates, tried different settings on the switch, and even updated the firmware and drivers for the onboard storage on each server, but no luck.
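
One thing that might help narrow it down: the Windows-side counterpart of those SAN log entries should show up in the System event log under the iScsiPrt source, and possibly as Event ID 129 (reset to device) from the storage port driver if the host stack is the one issuing the resets. A quick way to pull them and compare timestamps against the SAN's log, assuming the errors are being logged on the hosts at all:

    # Recent iSCSI initiator events; in my reading, IDs 9 (request timed out)
    # and 39 (task management/reset sent) are the usual suspects here
    Get-WinEvent -FilterHashtable @{LogName='System'; ProviderName='iScsiPrt'} -MaxEvents 50 |
        Format-Table TimeCreated, Id, Message -AutoSize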

Is anyone familiar with these types of errors who could give me some ideas on where else to look? The log keeps referencing initiator ID 1023, always that exact same ID number, and I assume that is one of the servers, but how can I find out which one it is?
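
Since that "initiator ID" looks like an internal index on the DSN-3200 rather than anything Windows exposes, my working plan (a guess, though the cmdlets themselves are standard) is to run the following on each of the three servers and match the IQN and source IPs against the session details in the SAN's management UI:

    # This host's initiator IQN
    Get-InitiatorPort | Select-Object NodeAddress

    # The local and remote IPs each iSCSI session on this host is using
    Get-IscsiConnection | Select-Object InitiatorAddress, TargetAddress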

I would be happy to provide more information or post screenshots; I am pretty stumped as to what the issue might be. Also, we recently upgraded to the Dell 5424 switch because the PowerConnect 2824 we were using was apparently not meant for iSCSI, but we get the same LU reset errors with the new one.

Thank you for your help!


