Hi all, I'm looking for some discussion and advice on a few questions regarding storage for our next cluster upgrade cycle.
A bit of background on our current system:
- 3x clustered Hyper-V servers running Server 2008 R2 (72GB RAM, dual CPUs, etc.)
- 1x Dell MD3220i iSCSI array with dual 1Gb connections to each server (24x 146GB 15K SAS drives in RAID 10) - Tier 1 storage
- 1x Dell MD1200 expansion array with 12x 2TB 7.2K drives in RAID 10 - Tier 2 storage for large VMs, files, etc.
- ~25 VMs running all manner of workloads: SQL, Exchange, WSUS, Linux web servers, etc.
- 1x DPM 2012 SP1 backup server with its own storage.
Reasons for upgrading:
- Storage throughput is becoming an issue, as we only get around 125MB/s over the dual 1Gb iSCSI connections to each physical server. (I've tried everything under the sun to improve bandwidth, but I suspect the MD3220i RAID controller is the bottleneck here.)
- Backup times for VMs (once every night) are now in the 5-6 hour range.
- Storage performance drops during backups and large file synchronisations (DPM).
- Tier 1 storage is running out of capacity and we would like to build in more IOPS for future expansion.
- Tier 2 storage is massively underused (6TB of 12TB RAID 10 space).
- We want to migrate to 10GbE server links.
- Total budget for the upgrade is in the region of £10k, so I have to make sure we get absolutely the most bang for our buck.
Current Plan:
- Upgrade the cluster to Server 2012 R2
- Install a dual-port 10GbE NIC team in each server and converge cluster, live migration, VM and management traffic onto it (with QoS, of course) - there's a rough networking sketch below this list.
- Purchase a new JBOD SAS array and leverage the new Storage Spaces SSD caching/tiering capabilities: use our existing 2TB drives for capacity and buy enough SSDs to replace the 15K SAS disks - a rough tiering sketch also follows the list.
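For the converged networking piece, this is roughly what I have in mind per host. It's only a sketch: the adapter, team and switch names are placeholders, and the bandwidth weights are a first guess to be tuned.

```powershell
# Team the two 10GbE ports - switch-independent, dynamic load balancing (2012 R2)
New-NetLbfoTeam -Name "Team10GbE" -TeamMembers "NIC1","NIC2" `
    -TeamingMode SwitchIndependent -LoadBalancingAlgorithm Dynamic

# Converged vSwitch on top of the team, using bandwidth weights for QoS
New-VMSwitch -Name "ConvergedSwitch" -NetAdapterName "Team10GbE" `
    -MinimumBandwidthMode Weight -AllowManagementOS $false

# vNICs in the management OS for each traffic class
Add-VMNetworkAdapter -ManagementOS -Name "Management"    -SwitchName "ConvergedSwitch"
Add-VMNetworkAdapter -ManagementOS -Name "Cluster"       -SwitchName "ConvergedSwitch"
Add-VMNetworkAdapter -ManagementOS -Name "LiveMigration" -SwitchName "ConvergedSwitch"

# Relative minimum bandwidth weights (shares, not hard caps); VM traffic uses the default flow
Set-VMNetworkAdapter -ManagementOS -Name "Management"    -MinimumBandwidthWeight 10
Set-VMNetworkAdapter -ManagementOS -Name "Cluster"       -MinimumBandwidthWeight 20
Set-VMNetworkAdapter -ManagementOS -Name "LiveMigration" -MinimumBandwidthWeight 30
Set-VMSwitch -Name "ConvergedSwitch" -DefaultFlowMinimumBandwidthWeight 40
```

And for the Storage Spaces side, something along these lines - again just a sketch, with made-up pool/tier/disk names, and tier sizes and write-back cache that are only illustrative until we know how many SSDs the budget stretches to:

```powershell
# Pool all eligible JBOD disks (new SSDs plus the existing 2TB 7.2K drives)
New-StoragePool -FriendlyName "VMPool" `
    -StorageSubSystemFriendlyName "Storage Spaces*" `
    -PhysicalDisks (Get-PhysicalDisk -CanPool $true)

# Define the SSD and HDD tiers
$ssdTier = New-StorageTier -StoragePoolFriendlyName "VMPool" -FriendlyName "SSDTier" -MediaType SSD
$hddTier = New-StorageTier -StoragePoolFriendlyName "VMPool" -FriendlyName "HDDTier" -MediaType HDD

# Mirrored, tiered virtual disk with a write-back cache to sit behind the CSV
New-VirtualDisk -StoragePoolFriendlyName "VMPool" -FriendlyName "CSV01" `
    -StorageTiers $ssdTier,$hddTier -StorageTierSizes 400GB,4TB `
    -ResiliencySettingName Mirror -WriteCacheSize 5GB
```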
On to the questions:
- Is it supported to use Storage Spaces directly connected to a Hyper-V cluster? I've seen that for our setup we're on the verge of needing a separate Scale-Out File Server (SOFS) for storage, but the extra cost and complexity (RDMA, extra 10GbE NICs, etc.) are out of our reach.
- When using a Storage Space in a cluster, I've seen various articles suggesting that each CSV is effectively active/passive within the cluster, causing redirected I/O on every node that isn't the current owner. Is that right?
- If CSVs are active/passive, it's suggested you should have one CSV per cluster node. In production, how do you balance VMs across three CSVs without manually moving them around to keep roughly a third of the load on each (see the sketch after this list for the kind of shuffling I mean)? Ideally I'd like a single active/active CSV for all the VMs to sit on, for ease of management.
- If the CSV is active/active, am I correct in assuming that DPM will back up VMs without causing any redirected I/O?
- Will DPM backups of VMs be incremental in terms of the data transferred from the cluster to the backup server?
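For the CSV balancing question above, this is the sort of manual shuffling I'm hoping to avoid (CSV and node names are made up for illustration):

```powershell
# See which node currently owns (coordinates) each CSV
Get-ClusterSharedVolume | Select-Object Name, OwnerNode

# Manually spread CSV ownership across the three hosts
Move-ClusterSharedVolume -Name "Cluster Disk 2" -Node "HV02"
Move-ClusterSharedVolume -Name "Cluster Disk 3" -Node "HV03"
```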
Thanks in advance to anyone who can be bothered to read through all that and help me out! I'm sure there are more questions I've forgotten, but those will certainly get us started.
Lastly, does anyone have a better suggestion for how we should proceed?
Thanks