Why do I need isolated ports for inter-node and how many do I need?

Hello friends. I know it has been a while, but I am back now. I was having a conversation with someone about troubleshooting the performance of an SVC cluster, and the question came up of why isolated inter-node ports are needed and exactly how many - an answer I would like to share more broadly.

Recall from the IBM Spectrum Virtualize Software Overview that there are many forwarding layers in the i/o stack. These forwarding layers perform cache mirroring of all write i/o during various stages of i/o processing. Any delay in forwarding results in queuing of write i/o, because the software enforces cache mirroring before it will acknowledge the write completion to the host and before it will manipulate that data and destage it to the back-end. Failure to mirror would risk data loss and/or corruption.
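To make that ordering concrete, here is a minimal sketch in Python (hypothetical names, not actual Spectrum Virtualize code) of the dependency: the host acknowledgment waits on the cache mirror, so any delay on the inter-node hop shows up directly as host write latency and queued writes.

```python
import time

def mirror_to_partner(data, inter_node_latency_s):
    # Simulate forwarding the write to the partner node's cache.
    # Congested or slow inter-node ports inflate this delay.
    time.sleep(inter_node_latency_s)

def handle_host_write(data, inter_node_latency_s=0.0005):
    start = time.time()
    # The write cannot be acknowledged to the host (or later destaged)
    # until the partner node holds a mirrored copy -- otherwise a node
    # failure at this point could lose or corrupt the data.
    mirror_to_partner(data, inter_node_latency_s)
    return time.time() - start  # host-observed latency includes the mirror hop

print(f"host-observed write latency: {handle_host_write(b'data') * 1000:.2f}ms")
```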

This cache mirroring depends on the inter-node ports (also referred to as local node ports or intracluster ports), so any bottleneck at these ports will, by extension, bottleneck write performance. This may sound trivial, but ports frequently get congested or slowed down by older (lower-tech) hosts, slow-drain devices, and workload spikes from hosts and/or replication, all of which hurt write performance. This is why the general guidance I give is to isolate these ports on the fabric (don't zone anything else to them). Don't just take my word for it... check out section 1.3 of the best practices book.

It is also important to note that these intracluster ports are used for cluster maintenance, which includes maintaining the configuration of the system and the node leases. As a result, slow inter-node ports can slow the whole system's handling of configuration state and (more so in multi-site scenarios) can lead to node asserts from lease expiry.

Now that the why is covered, take a moment to consider how many inter-node ports you need. This is a fairly simple exercise if you know your expected write data rate. At a minimum, you will want 1 port per fabric per node. Assuming a single iogrp (node pair) and a dual-fabric topology, this means you would be using 2 ports per node for inter-node traffic. If you expect the write data rate to exceed 80% of the ports' combined expected bandwidth, you need more ports.
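If you want to script that check, a quick sizing sketch might look like the following (a hypothetical helper that just encodes the guidance above: a floor of 1 port per fabric per node, grown whenever the expected write rate exceeds 80% of combined bandwidth).

```python
def inter_node_ports_needed(write_mb_s, port_bandwidth_mb_s, fabrics=2):
    # Floor: 1 inter-node port per fabric per node.
    ports = fabrics
    # Add ports (symmetrically across fabrics) while the expected write
    # rate exceeds 80% of the combined port bandwidth.
    while write_mb_s > 0.8 * ports * port_bandwidth_mb_s:
        ports += fabrics
    return ports

# e.g. 1000MB/s of writes over ports worth 800MB/s each -> 2 ports suffice
print(inter_node_ports_needed(write_mb_s=1000, port_bandwidth_mb_s=800))  # 2
```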

The general rule is that you get about 100MB/s for each 1Gbps the port is capable of. If you have two 8Gb ports, that works out to 1600MB/s × 80%, or about 1280MB/s (call it 1.3GB/s) of write throughput. If you expect to use more than this, you can raise the ceiling by adding more ports dedicated to intracluster i/o OR by increasing the port speed. In a larger configuration - say 4 intracluster ports at 16Gb - this becomes about 5GB/s of theoretical max write throughput per node at the port level, though at that scale other bottlenecks may exist.
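As a sanity check on that arithmetic, here is the same rule as a tiny calculation (assuming the 100MB/s-per-Gbps rule and the 80% figure above):

```python
def write_ceiling_mb_s(ports, port_speed_gbps):
    # ~100MB/s of usable bandwidth per 1Gbps of port speed, derated to 80%
    return 0.8 * ports * port_speed_gbps * 100

print(write_ceiling_mb_s(2, 8))   # 1280.0 -> ~1.3GB/s
print(write_ceiling_mb_s(4, 16))  # 5120.0 -> ~5GB/s
```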

Hopefully you all found this informative and helpful. If you have any questions, please feel free to comment, follow me on Twitter @fincherjc, or connect with me on LinkedIn.
