Hi,
We have multiple Fibre Channel cards in our servers (both Windows Server 2008 R2 and 2012 R2; MPIO looks much the same on both), and our SAN also has multiple controllers and multiple ports, so we use MPIO with the Round Robin With Subset policy. We use the Microsoft DSM for MPIO to our storage device, and left everything at the defaults to start with:
* Path Verify Period set to 30, but Path Verify not enabled so it should have no effect.
* Retry Count: 3, Retry Interval 1
* PDO Remove Period: 20
* Disk timeout: at its default of 60 seconds (see http://blogs.msdn.com/b/san/archive/2011/09/01/the-windows-disk-timeout-value-understanding-why-this-should-be-set-to-a-small-value.aspx).
Now for the test: we run a benchmark to generate disk I/O and unplug one of the Fibre Channel cables on the server side to simulate a broken link. For 30 seconds there is no disk I/O at all; after those 30 seconds, I/O resumes over the other Fibre Channel port.
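For anyone who wants to reproduce the measurement, the stall can be timed with a small probe like the one below. This is only a minimal sketch, not the benchmark we actually ran: the function names, the probe file path and the 120-second duration are all made up for illustration. It writes one small block at a time, forces each write to disk, and reports the longest gap between two completed writes.

```python
import os
import time


def longest_stall(timestamps):
    """Return (gap, start) for the largest gap between consecutive completions."""
    worst_gap, worst_start = 0.0, None
    for earlier, later in zip(timestamps, timestamps[1:]):
        gap = later - earlier
        if gap > worst_gap:
            worst_gap, worst_start = gap, earlier
    return worst_gap, worst_start


def probe_writes(path, duration_s=120.0, block_size=4096):
    """Repeatedly write and flush one block to `path`, recording completion times."""
    completions = []
    block = b"\0" * block_size
    deadline = time.monotonic() + duration_s
    with open(path, "wb", buffering=0) as handle:
        while time.monotonic() < deadline:
            handle.seek(0)
            handle.write(block)
            # Force the write out to the device, so a dead path blocks here.
            os.fsync(handle.fileno())
            completions.append(time.monotonic())
    return completions


if __name__ == "__main__":
    # Hypothetical file on the MPIO-backed volume; adjust before running.
    stamps = probe_writes(r"E:\mpio_probe.bin")
    gap, _start = longest_stall(stamps)
    print(f"longest stall: {gap:.1f} s")
```

Pull the cable while the probe is running; the reported gap should roughly match the pause seen in the benchmark.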
We want to lower this timeout to get faster failover to the other (still working, still connected) path, for instance after 10 seconds, and thereby reduce the chance of timeouts. But this doesn't seem to work. I enabled the Path Verify option and modified the values through the GUI, all without effect. I then modified them in the registry (HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\mpio\Parameters), but that has no effect either. Even stranger: the values in the registry don't match the values in the GUI. Get-MPIOSetting returns the GUI values (which differ from the registry values), but none of these changes has any effect: there are still about 30 seconds without I/O before it resumes.
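To make the registry/GUI mismatch easier to pin down, the two sets of values can be listed side by side. The sketch below is only an illustration under assumptions: it dumps whatever values exist under the registry key named above (Windows only, via Python's standard `winreg` module) and compares them with values typed in by hand from the GUI or Get-MPIOSetting. The setting names and numbers in `reported` are placeholders and have to be matched to the real registry value names manually.

```python
def read_registry_values(subkey=r"SYSTEM\CurrentControlSet\Services\mpio\Parameters"):
    """Return every value under HKLM\\<subkey> as a dict (Windows only)."""
    import winreg  # imported lazily so diff_settings also works off-Windows

    values = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, subkey) as key:
        value_count = winreg.QueryInfoKey(key)[1]
        for index in range(value_count):
            name, data, _type = winreg.EnumValue(key, index)
            values[name] = data
    return values


def diff_settings(registry, reported):
    """Map each setting whose two values differ to (registry_value, reported_value)."""
    mismatches = {}
    for name in sorted(set(registry) | set(reported)):
        reg_value = registry.get(name)
        rep_value = reported.get(name)
        if reg_value != rep_value:
            mismatches[name] = (reg_value, rep_value)
    return mismatches


if __name__ == "__main__":
    # Placeholder names and numbers, copied by hand from the GUI or Get-MPIOSetting.
    reported = {"PDORemovePeriod": 20, "RetryCount": 3, "RetryInterval": 1}
    differences = diff_settings(read_registry_values(), reported)
    for name, (reg_value, rep_value) in differences.items():
        print(f"{name}: registry={reg_value!r} reported={rep_value!r}")
```

A setting that exists on only one side shows up with `None` for the other, so registry values that were not typed into `reported` are listed as well.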
So... what's the right way to change this behaviour? Or is it impossible to modify and will I/O always wait for 30 seconds?
Many thanks in advance!
Johan