Sizing the inter-site link for replication and multi-site environments
Hello friends. I've done a poor job this year of posting content regularly but hope this one helps make up for it. One of the frequently asked questions I get from administrators, technical sellers, and other engineers is how much bandwidth is required to perform replication or to sustain a multi-site environment. For the purposes of this conversation I am going to assume the use of Fibre Channel for the network.
The first part of this answer is very easy. Synchronous replication (and synchronous data patterns in the case of Global Mirror) requires enough bandwidth to move the peak write throughput of the production workloads. If your peak write rate is 3 gigabytes per second (GB/s) then the math works out as:
3GB/s X 8 bits per byte = 24Gbps, 2X16Gbps links or 3X8Gbps links
Each Gbps on a standard port delivers only about 100MB/s of effective throughput, because the interface averages roughly 80% link efficiency. Take a moment and factor this into the bandwidth calculation using the ratio of 100MB/s per 1Gbps.
16Gbps = 1600MB/s
3GB/s = 3072MB/s
3072 / 1600 = 1.92, round up to 2 links to meet the specification
8Gbps = 800MB/s
3GB/s = 3072MB/s
3072 / 800 = 3.84, round up to 4 links to meet the specification
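The arithmetic above is easy to script. Here is a minimal sketch (the helper name and the 100MB/s-per-Gbps constant are my own, taken from the rule of thumb above):

```python
import math

MBPS_PER_GBPS = 100  # effective MB/s per Gbps, assuming ~80% link efficiency

def links_needed(peak_write_mbps: float, link_speed_gbps: int) -> int:
    """Round up to the number of links that cover the peak write rate."""
    effective_mbps = link_speed_gbps * MBPS_PER_GBPS
    return math.ceil(peak_write_mbps / effective_mbps)

print(links_needed(3072, 16))  # 3072 / 1600 = 1.92 -> 2 links
print(links_needed(3072, 8))   # 3072 / 800  = 3.84 -> 4 links
```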
If we simply meet the minimum requirement and provide 4X8Gbps links then there is no room for link failure or workload growth. If we select 2X16Gbps there is still no room for link failure and only minimal room for workload growth. This takes us to the second part of this answer: the expected lifespan of the solution and the expected workload growth. This is a question only you can answer for your environment, as there is no one-size-fits-all value. Remember, we are talking about peak throughput and not capacity stored.
Since I have limited knowledge of any given situation, let us assume 15% growth over 5 years. The justification behind 5 years is that spinning drives and NAND flash generally have a lifespan of 5 years, after which the failure rate increases significantly. As such, I would expect to refresh the solution around that time.
For the 15% growth please recall that this is a discussion on peak write throughput and not capacity stored. This is on my part a wild guess and not representative of any statistical analysis - I just needed a value to show the math. This value should be derived from the history of the environment and any historical performance data that is available. If there is nothing in the environment tracking this today, please consider implementing IBM Storage Insights or IBM Spectrum Control to track this going forward.
3072MB/s X 1.15 (115%) = 3532.8MB/s
Following the guidance above for MB/s per Gbps we now have to recalculate for growth.
For 16Gbps: 3532.8 / 1600 = 2.21, round up to 3 links to meet the specification
For 8Gbps: 3532.8 / 800 = 4.42, round up to 5 links to meet the specification
Now we come to the third consideration: how much redundancy is required to sustain the demand in the event of a failure? Assuming you want to be tolerant of network failures and avoid an increase in latency to production systems, the answer is simply to double it, so that either the A fabric or the B fabric can single-handedly meet the demand.
For 16Gbps: 3 X 2 (A+B fabric) = 6 links to meet the demand, or 96Gbps for a 3GB/s workload
For 8Gbps: 5 X 2 (A+B fabric) = 10 links to meet the demand, or 80Gbps for a 3GB/s workload
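Putting all three considerations together (throughput, growth, and dual-fabric redundancy), the full sizing can be sketched like this. The function name and defaults are illustrative, not a standard tool; the 15% growth figure is the wild guess from above and should come from your own performance history:

```python
import math

MBPS_PER_GBPS = 100  # ~100 MB/s usable per Gbps, per the rule of thumb above

def sized_links(peak_write_mbps: float, link_speed_gbps: int,
                growth: float = 0.15, dual_fabric: bool = True) -> int:
    # Apply expected growth over the solution lifespan, then round up
    grown_mbps = peak_write_mbps * (1 + growth)
    links = math.ceil(grown_mbps / (link_speed_gbps * MBPS_PER_GBPS))
    # Double so either the A or B fabric alone can carry the full demand
    return links * 2 if dual_fabric else links

print(sized_links(3072, 16))  # 3532.8 / 1600 = 2.21 -> 3 links, X2 fabrics = 6
print(sized_links(3072, 8))   # 3532.8 / 800  = 4.42 -> 5 links, X2 fabrics = 10
```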
Of course, all of this assumes that 100% of the data is going to be replicated. In Stretched Cluster and HyperSwap environments this is the case: Stretched Clusters use Volume Mirroring in SVC to write the data to both copies, and HyperSwap environments use an active-active replication between volume copies that is similar to Metro Mirror. The scenario where this assumption may not hold is when sizing remote copy (Metro Mirror and Global Mirror), where you may replicate only a subset of volumes.
As always I hope y'all found this informative and helpful. For more on this subject my friend The Guy In the Hat is going to be posting some articles on SAN design for HyperSwap over at Inside Storage Networking. You may also want to check out the IBM Redbooks publication IBM Spectrum Virtualize HyperSwap SAN Implementation and Design Best Practices.