Why do I need isolated ports for inter-node traffic, and how many do I need?

Hello, friends. I know it has been a while, but I am back now. I was recently talking with someone about troubleshooting the performance of an SVC cluster, and the question came up of why isolated ports are needed for inter-node traffic and exactly how many are needed. I would like to share the answer more broadly.

Recall from the IBM Spectrum Virtualize Software Overview that there are many forwarding layers in the I/O stack. These forwarding layers perform cache mirroring of all write I/O during various stages of I/O processing. Any delay in forwarding results in queued write I/O, because the software requires the cache mirror to complete before it acknowledges the write to the host and before it manipulates the data and destages it to the back-end. Skipping this step would risk data loss and/or corruption.
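
To make that dependency concrete, below is a minimal sketch of a mirrored write path. This is hypothetical Python, not the actual Spectrum Virtualize code; the names and the latency figure are invented purely to illustrate that the host acknowledgment waits on the inter-node mirror, so any delay on those ports lands directly on every write:

    # Hypothetical sketch only -- not the Spectrum Virtualize implementation.
    import time

    INTER_NODE_LATENCY_S = 0.0002  # assumed ~200us round trip to the partner node

    local_cache, partner_cache = {}, {}

    def mirror_to_partner(lba, data):
        # Forward the write to the partner node's cache over the inter-node ports.
        # Any congestion on those ports shows up here as queued write I/O.
        time.sleep(INTER_NODE_LATENCY_S)
        partner_cache[lba] = data

    def handle_host_write(lba, data):
        local_cache[lba] = data
        mirror_to_partner(lba, data)  # must complete BEFORE the host ack...
        return "ack"                  # ...otherwise a node failure could lose the write

    handle_host_write(0x1000, b"payload")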

This cache mirroring depends on the inter-node ports (also referred to as local node ports or intracluster ports). Any bottleneck on these ports will, by extension, bottleneck write performance. This may sound trivial, but ports are frequently congested or slowed down by older (lower-speed) hosts, slow-drain devices, and workload spikes from hosts and/or replication, all of which hurt write performance. This is why my general guidance is to isolate these ports on the fabric (don't zone anything else to them). Don't just take my word for it: check out section 1.3 of the best practices book.

It is also important to note that these intracluster ports are used for cluster maintenance, which includes maintaining the system configuration and the node leases. As a result, slow inter-node ports can slow the whole system's ability to maintain configuration state and, more so in multi-site scenarios, can lead to lease expiry asserts.

Now that the why is covered, take a moment to consider how many inter-node ports you need. This is a fairly simple exercise if you know your expected write data rate. At a minimum, you will want one port per fabric per node. Assuming a single iogrp (node pair) and a dual-fabric topology, this means two ports per node for inter-node traffic. If you expect the write data rate to exceed 80% of the ports' combined bandwidth, you need more ports.

The general rule is that you can count on about 100MB/s for each 1Gbps of port speed. With two 8Gb ports, that is 1600MB/s; 80% of that is roughly 1280MB/s, or about 1.3GB/s of write throughput. If you expect to exceed this, you can raise the ceiling by adding more ports dedicated to intracluster I/O OR by increasing the port speed. In a larger configuration, say four 16Gb intracluster ports, the theoretical maximum rises to about 5GB/s of write throughput per node at the port level, though at that scale other bottlenecks may come into play.
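
If you would rather not do that arithmetic by hand, here is a small sketch that encodes the rule of thumb above in Python. The function name and constants are just this post's guidance, not any IBM-provided tool:

    import math

    MB_PER_GBPS = 100          # ~100MB/s of usable throughput per 1Gbps of port speed
    UTILIZATION_CEILING = 0.8  # plan to stay under 80% of combined bandwidth

    def internode_ports_needed(write_mb_s, port_speed_gbps, fabrics=2):
        # Inter-node ports per node for an expected write data rate (MB/s),
        # never fewer than one port per fabric.
        usable_per_port = port_speed_gbps * MB_PER_GBPS * UTILIZATION_CEILING
        return max(math.ceil(write_mb_s / usable_per_port), fabrics)

    # Two 8Gb ports cover about 1280MB/s, so 1500MB/s of writes calls for a third:
    print(internode_ports_needed(write_mb_s=1500, port_speed_gbps=8))   # -> 3
    # Four 16Gb ports give roughly 5GB/s of headroom:
    print(internode_ports_needed(write_mb_s=5000, port_speed_gbps=16))  # -> 4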

I hope you all found this informative and helpful. If you have any questions, please feel free to comment, follow me on Twitter @fincherjc, or connect with me on LinkedIn.

Comments

  1. Interesting topic, but how do you apply these rules to FS9100s, where each node canister has only 4 ports, so 8 per cluster? Assuming no back-end storage to virtualize and no Fibre Channel devices to a Metro Mirror partner, I have all ports zoned for node-to-node and also all ports split into IOgroup pairs for host connectivity. Do you still think that in these circumstances it is possible to reserve 2 ports per node for interconnect, thereby leaving only 2 ports per node for hosts?

    1. Hello Bob. I actually did some work recently to get this revised in the Knowledge Center and the best practices documentation. If we assume a single iogrp, the node-to-node traffic takes place across the PCIe bus, and there is no real need to use fabric ports for it. With that, you could use all available FC ports for host bandwidth.

      The conversation gets more interesting if you plan to scale out to multiple iogrps, where I would generally recommend the 8-port-per-node configuration. I would encourage you to check out the general guidance we published here:

      https://www.ibm.com/support/knowledgecenter/en/STSLR9_8.3.1/com.ibm.fs9200_831.doc/svc_planning_morethanfourfabricports.html
