Category Archives: Hardware

NetApp updates their Mid and High-end A-Series

NetApp’s flagship unified storage platform is the All Flash FAS or AFF, the A-Series is different from the C-Series in that it addresses TLC flash vs the C-Series’ QLC.

Today, Tuesday May 14, 2024 NetApp has announced three new AFF A-Series models, the A70, A90 and A1K which will likely supplant the A400, A800 and A900 respectively, are here they are:

AFF A70
NetApp AFF A70
AFF A90
NetApp AFF A90
AFF A1K
NetApp AFF A1K

What’s interesting about these latest models is that they are all basically the same but with different RAM and CPU complements, the A1K does differentiate itself in that it is a return to a single-node chassis design whereas the A70 and A90 reuse the chassis from the A800, with a slight upgrade I’ll get to at the end of this article. Upgrading from an A800 to an A90 is a PCM-swap and not an entire chassis swap.

The new platform is a serious upgrade in the IO department, with nine usable PCIe 5.0 slots available per controller, up from five PCIe 3.0 on the A400 and A800 but down slightly from the A900’s ten PCIe 4.0 IO modules. PCIe 4.0 doubled the available bandwidth of PCIe 3.0 per PCIe lane and PCIe 5 doubled it again, bringing available lane bandwidth up to ~4GB/s. Up until this release, all existing controllers used a switched PCIe architecture but now, direct access to the bus is available to all slots. These improvements pave the way for faster PCIe cards such as quad port, 64G FC cards and dual port, 200GbE and 400GbE Ethernet adapters but be sure to pay attention to the slot priority assignment if you’re putting the card in yourself. Slots 2,3,9 and 10 have eight lanes vs sixteen on 1,4-8 and 11; all eleven slots on the A1K have sixteen lanes.

Powering these new controllers will be Intel’s 4th generation Xeon Scalable processors with QuickAssist Technology (QAT) allowing for integrated compression offload. Current platforms perform inline compression in 8K chunks and a scanner would perform compression on cold data in 32k chunks, QAT allows for inline compression at 32k chunks all without any performance impact. Any data moved to a QAT-enabled controller, or on controllers upgraded to QAT-enabled models will realize this efficiency improvement. However, if data is moved from a QAT-enabled controller to an older, non-QAT model, that efficiency goes away. Speaking of older models, all controller upgrades are non-disruptive and the A800 is an in-chassis upgrade. I have to wonder if NetApp has any plans to use the other accelerators available in this CPU, specifically DLB and IAA.

Have a look at the raw numbers, I’m not sure how scientific my aggregated GHz column is, but I find it helpful when comparing horsepower. I’ve included the existing A400, A800 and A900 to see where the new models land.

ModelMAX DrivesInternal DrivesCPU SpeedCPU/NodeCores/CPUAggr GHzNVRAMRAM
A400480N/A2.2GHz21044GHz16GB128GB
A70240482.0GHz21664GHz32GB128GB
A800240482.1GHz224100.8GHz32GB640GB
A90240482.0GHz232128GHz64GB1024GB
A900480N/A2.2GHz232140.8GHz64GB1024GB
A1K480N/A1.7GHz252176.8GHz64GB1024GB
By the Numbers: Node Specifications

As for performance improvement expectations, NetApp claims the following:

  • A70 provides 2x the performance of the A400
  • A90 provides 1.6x the performance of the A800
  • A1K provides 1.4x the performance of the A900

While this article covers the new A-Series, up until today the C-Series have been the exact same controllers, differing only in their ability to address TLC media vs the C-Series’ QLC media. Having said that, I’ve become a big fan of the C800 for performance and form factor. Since the A70 and A90 reuse the x800 chassis, that means the ability to stick 48 internal drives in it. With the current max drive sizes for the A800 and C800 being 15.3TB and 30.7TB respectively, they can provide up to 570TiB and 1.11PiB of usable space, all in 4RU of rack space; this is before taking into account ONTAP’s aggressive data efficiency capabilities. All three of the new models are described as having a max raw capacity of 3.7PB which means 240×15.3TB so no 30.7TB TLC media at this time.

Today’s announcement has the A-series properly differentiating itself from it’s C-series sibling; however, I can’t imagine the C-series wouldn’t get some sort of QAT-related update soon, but I guess we’re going to have to wait on that for now.

Last but not least, if you’re a fan of the new bezels we’ve been seeing, you’ll be happy to hear that NetApp has finally acquiesced to our demands that the NetApp logo light up. That’s right, the three new models all get a fancy light up bezel that can be controlled via the GUI, CLI or even API, sadly they are one-colour LEDs and not RGBW…at least not yet. If you have an existing A800 and you perform an in-chassis upgrade, you will not benefit from this one as the pre-existing chassis lack the requisite 8-pin connector to the bezel.

These new models have NetApp seriously improving their offering, first by beefing up the basic CPU/RAM components but also around the physical design and architecture, the modular nature of which is very interesting.

C-Series lineup

NetApp Announces A Whole New Line

Up until today, if you were looking for a physical ONTAP array for your environment, your choices were the hybrid flash, FAS array offering around 5-10ms of latency or the sub-ms AFF A-series. Sure there was one anomaly in there, the QLC-based FAS 500f, but that AFF in FAS clothing was just that, an anomaly. While I have no evidence to point to here, but my theory is that the 500f was NetApp’s way of dipping their toe in the water of QLC-based arrays. Upon launch, the 500f was pricey and the configurations limited and restricted, both of which were addressed at some point after launch. As an employee at a partner that sells a lot of NetApp, I looked at the 500f when it first launched and then basically never looked at it again because of those two points.

Today, NetApp is announcing the all new C-Series of QLC-based arrays, the “C” being for “Capacity Enterprise Flash”. While the controllers themselves aren’t new, the fact that they only support QLC media is what is different. While I won’t go into the details of what QLC, or Quad Layer Flash is in this post, the fact of the matter is that it is more affordable than Triple Layer Flash (TLC) and almost as performant. What this means for those purchasing NetApp arrays is that they can get near the performance of an AFF system at a fraction of the cost. Most of us in the storage world know that 10k and 15k RPM SAS drives are slowly going to be phased out in favour of high-capacity SATA drives and high-performance NAND storage, leaving a void. QLC-based arrays will fill that void, and at a higher performance level. If you start to research QLC vs TLC, you’ll find lots of concerns around durability which are not completely unfounded, but you would have also found these concerns when the industry went from Multi-cell (MLC) to TLC and that seems to have gone well enough. Technology of the storage devices themselves improve over time and software-based mitigation strategies such as write avoidance also improve. I’m not knowledgeable enough on this latter point to go into details, but ONTAP is a beast and has all sorts of tricks up its sleeve.

So without further ado, I present NetApp’s Enterprise Capacity Flash line, the AFF C800, AFF C400 and AFF C250:

AFF C800
AFF C400
AFF C250

Quick Specs:

AFF C800AFF C400AFF C250
Max drive count (15.3TB NVMe QLC)1449648
Max effective capacity (5:1 efficiencies)8.8 PB5.9 PB2.9 PB
Max Usable capacity (1:1)1.6 PiB1.06 PiB540.37 TiB
Minimum configurations12 × 15.38 × 15.38 × 15.3
100GbE ports per HA pair20164
25GbE ports per HA pair1612 onboard / 16 HBA4 onboard / 16 HBA
32Gb FC ports323216
By the numbers

Now some of you may have thought, “I thought there was already a C-series with the C190?”, and you’d be right. NetApp is repurposing the C-series branding as well as introducing a successor to the C190, the AFF A150. While the new A150 will still have some restrictions, it won’t be nearly as restrictive as the C190. The physical form-factor remains the same as the C190, but the A150 will allow for up to two expansion shelves for a total of 72 SAS SSDs including the internal ones in capacities of 960GB, 3.8TB and 7.6TB, coming to a max usable capacity of ~402TiB, or 2.2PB at an efficiency level of 1:5.

Back to the new C-Series conversation, they bring with them a new default licensing model, ONTAP One. ONTAP One is something I have personally been asking for many years at this point, and it includes all of the licenses; Core, Data Protection, Hybrid Cloud and Security & Compliance. Personally I’m looking forward to not having to worry about what features are available with a certain license offering, instead, C-Series with ONTAP One as the default licensing model will ensure you or your customers will never be left wondering if their array has a given feature.

The C-Series should be available to quote as of March 27, 2023 and should start shipping by the end of April. This statement as well as all of the information above is based on pre-release information I received and may be subject to change at press time. I will endeavour to add corrections below should any of the above change at launch.

Migrating from the CN1610 to the BES-53248 for cluster interconnect

In my continuing effort to make the adoption of the BES-53248 more streamlined, I figured I would also write a migration guide as I personally had to read the documentation more than once to understand it completely. If you haven’t already checked it out, it might be helpful to first consult my first timers’ guide as the following guide starts with the assumption that your new switches are racked and Inter-Switch Links (ISL) connected and initial configuration has been performed.

Another quick caveat, this is by no means a replacement for the official documentation and the methods below may or may not be supported by NetApp. If you want the official procedure, that is documented here.

Now that we’ve got the above out of the way, I’ll get down to brass tacks. To keep things simple, we’re going to start with a simple two-node switched cluster which should look like this:

You should also have your new BES switched setup as so:

Next step, lets make sure we don’t get a bunch of cases created by kicking off a pre-emptive auto support:

system node autosupport invoke -node * -type all -message MAINT=2h

Elevate your privilege level and confirm all cluster LIFs are set to auto-revert:

set advanced
network interface show -vserver Cluster -fields auto-revert

If everything above looks good, it’s time to login to your second BES which NetApp wants you to name cs2 and configure a temporary ISL back to your first CN1610. Personally I feel the temporary ISL is optional, but can provide a bit of added insurance to your change:

(cs2) # configure
(cs2) (Config)# port-channel name 1/2 temp-isl
(cs2) (Config)# interface 0/13-0/16
(cs2) (Interface 0/13-0/16)# no spanning-tree edgeport
(cs2) (Interface 0/13-0/16)# addport 1/2
(cs2) (Interface 0/13-0/16)# exit
(cs2) (Config)# interface lag 2
(cs2) (Interface lag 2)# mtu 9216
(cs2) (Interface lag 2)# port-channel load-balance 7
(cs2) (Config)# exit

(cs2) # show port-channel 1/2
Local Interface................................ 1/2
Channel Name................................... temp-isl
Link State..................................... Down
Admin Mode..................................... Enabled
Type........................................... Static
Port-channel Min-links......................... 1
Load Balance Option............................ 7
(Enhanced hashing mode)

Mbr     Device/        Port      Port
Ports   Timeout        Speed     Active
------- -------------- --------- -------
0/13    actor/long     10G Full  False        
        partner/long
0/14    actor/long     10G Full  False 
        partner/long
0/15    actor/long     10G Full  False
        partner/long
0/16    actor/long     10G Full  False
        partner/long

At this point, we’re going to disconnect any of the connections to the second CN1610 and run these to the second BES-53248. You may need different cables to ensure they are supported, check Hardware Universe. When you’re done this recabling step, it should look like this:

Note: It’s this step here that made me realize the temporary ISL is optional since we now have our two sets of LIFs isolated from each other.

Next, let’s put the (optional) temporary ISL into play. At your first CN1610, disconnect the cables connected to ports 13-16 and once they’re all disconnected, assuming these cables are supported by both switches, plug them into ports 13-16 on your second BES, so it looks like this:

Now on the second BES-53248, verify the ISL is up:

show port-channel

Assuming the port-channel is up and running, let’s check the health of our cluster LIFs by issuing the following commands at the cluster command line:

network interface show -vserver Cluster -is-home false
network port show -ipspace Cluster

The first command shouldn’t produce any output, give the LIFs time to revert however. The second command, you want to make sure all ports are up and healthy. Once all the LIFs have reverted home, you can now move all the links from the first cluster node as well as removing the temporary ISLs so you end up with this:

Run the same two commands as before:

network interface show -vserver Cluster -is-home false
network port show -ipspace Cluster

Provide everything looks good, you’re free to remove the CN1610s from the rack as they are no longer in use. The final step is to clean up the configuration on your second BES-53248 by tearing down the temporary ISL configuration, done like this:

(cs2) # configure
(cs2) (Config)# deleteport 1/2 all
(cs2) (Config)# exit
(cs2) # write memory

This guide is by no means a replace for the official documentations but rather a companion to it. You should always consult the official documentation, I purposely cut out some of the steps I felt gave the docs a bit of a TL;DR feel but it doesn’t mean I wouldn’t personally run those steps if I were doing the work. This document is only my attempt to clarify the official docs, hopefully it does so for you.

NetApp releases a new AFF and a new FAS(?)

While we ramp up for NetApp INSIGHT next week, (the first virtual edition, for obvious reasons), NetApp has announced a couple of new platforms. First off, the AFF A220, NetApp’s entry-level, expandable AFF is getting a refresh in the AFF A250. While the 250 is a recycled product number, the AFF A250 is a substantial evolution of the original FAS250 from 2004.

The front bezel looks pretty much the same as the A220:

AFF A250 – Front Bezel

Once you remove the bezel, you get a sneak peak of what lies within from those sexy blue drive carriers which indicate NVMe SSDs inside:

AFF A250 – Bezel Removed

While the NVMe SSDs alone are a pretty exciting announcement for this entry-level AFF, once you see the rear, that’s when the possibilities start to come to mind:

AFF A250 – Rear View

Before I address the fact that there’s two slots for expansion cards, let’s go over the internals. Much like its predecessor, each controller contains a 12-core processor. While the A220 contained an Intel Broadwell-DE running at 1.5GHz, the A250 contains an Intel Skylake-D running at 2.2GHz providing roughly a 45% performance increase over the A220, not to mention 32, [*UPDATE: Whoops, this should read 16, the A220 having 8.] third generation PCIe lanes. System memory gets doubled from 64GB to 128GB as does NVRAM, going from 8GB to 16GB. Onboard connectivity consists of two 10GBASE-T (e0a/e0b) ports for 10 gigabit client connectivity with two 25GbE SFP28 ports for ClusterNet/HA connectivity. Since NetApp continues to keep HA off the backplane in newer models, they keep that door open for HA-pairs living in separate chassis, as I waxed about previously here. Both e0M and the BMC continue to share a 1000Mbit, RJ-45 port, and the usual console and USB ports are also included.

Hang on, how do I attach an expansion shelf to this? Well at launch, there will be four different mezzanine cards available to slot into one of the two expansion slots per controller. There will be two host connectivity cards available, one being a 4-port, 10/25Gb, RoCEv2, SFP28 card and the other being a 4-port, 32Gb Fibre Channel card leveraging SFP+. The second type of card available is for storage expansion: one is a 2-port, 100Gb Ethernet, RoCEv2, QSFP28 card for attaching up to one additional NS224 shelf, and the other being a 4-port, 12Gb SAS, mini-SAS HD card for attaching up to one additional DS224c shelf populated with SSDs. That’s right folks, this new platform will only support up to 48 storage devices, though in the AFF world, I don’t see this being a problem. Minimum configuration is 8 NVMe SSDs, max is 48 NVMe SSDs or 24 NVMe + 24 SAS SSDs, but you won’t be able to buy it with SAS SSDs. That compatibility is being included only for migrating off of or reusing an existing DS224x populated with SSDs. If that’s a DS2246, you’ll need to upgrade the IOM modules to 12GB prior to attachment.

Next up in the hardware announcement is the new FAS(?)…but why the question mark you ask? That’s because this “FAS” is all-flash. That’s right, the newest FAS to hit the streets is the FAS 500f. Now before I get into those details, I’d love to get into the speeds and feeds as I did above. The problem is that I would simply be repeating myself. This is the same box as the AFF A250, much like how the AFF A220 is the same box as the FAS27x0. The differences between the AFF 250 and the FAS500f are in the configurations and abilities or restrictions imposed upon it.

While most of the information above can be ⌘-C’d, ⌘-V’d here, this box does not support the connection of any SAS-based media. That fourth mez card I mentioned, the 4-port SAS one? Can’t have it. As for storage device options, much like Henry Ford’s famous quote:

Any customer can have a car painted any color that he wants so long as it is black.

-Henry Ford

Any customer can have any size NVMe drive they want in the FAS500f, so long as it’s a 15.3TB QLC. That’s right, not only are there no choices to be made here other than drive quantity, but those drives are QLC. On the topic of quantity, the available configurations start at a minimum 24 drives and can be grown to either 36 or 48, but that’s it. So why QLC? By now, you should be aware that the 10k/15k SAS drives we are so used to today for our tier 2 workloads are going away. In fact, the current largest spindle size of 1.8TB is slated to be the last drive size in this category. NetApp’s adoption of QLC media is a direct result of the sunsetting of this line of media. While I don’t expect to get into all of the differences between Single, Multi, Triple, Quad or Penta-level (SLC, MLC, TLC, QLC, or PLC) cell NAND memory in this post, the rule of thumb is the more levels, the lower the speed, reliability, and cost are. QLC is slated to be the replacement for 10k/15k SAS yet it is expected to perform better and only be slightly more expensive. In fact, the FAS500f is expected to be able to do 333,000 IOPS at 3.6ms of latency for 100% 8KB random read workloads or 170,000 IOPS at 2ms for OLTP workloads with a 40/60 r/w split.

Those are this Fall’s new platforms. If you have any questions put it in a comment or tweet at me, @ChrisMaki, I’d love to hear your thoughts on these new platforms. See you next week at INSIGHT 2020, virtual edition!

***UPDATE: After some discussion over on Reddit, it looks like MetroCluster IP will be available on this platform at launch.

Gartner’s new Magic Quadrant for Primary Storage

Hot off the presses is Gartner’s new Magic Quadrant (GMQ) for Primary Storage and it’s great to see NetApp at the top-right, right where I’d expect them to be. This is the first time Gartner has combined rankings for primary arrays and not separated out all-flash from spinning media and hybrid arrays, acknowledging that all-flash is no longer a novelty.

As you can see on the GMQ below, the x-axis represents completeness of vision while the y-axis measures ability to execute, NetApp being tied with Pure on X and leading on Y.

As mentioned, this new MQ marks the retiring of the previous divided GMQs of Solid-State Arrays and General-Purpose Disk Arrays. To read more about NetApp’s take on this new GMQ, head over to their blog post on the subject or request a copy of the report here.

There’s a new NVMe AFF in town!

Yesterday, NetApp announced a new addition to the midrange tier of their All-Flash FAS line, the AFF A320. With this announcement, end-to-end NVMe is now available in the midrange, from the host all the way to the NVMe SSD. This new platform is a svelte 2RU that supports up to two of the new NS224 NVMe SSD shelves, which are also 2RU. NetApp has set performance expectations to be in the ~100µs range.

Up to two PCIe cards per controller can be added, options are:

  • 4-port 32GB FC SFP+ fibre
  • 2-port 100GbE RoCEv2* QSFP28 fibre (40GbE supported)
  • 2-port 25GbE RoCEv2* SPF28 fibre
  • 4-port 10GbE SFP+ Cu and fibre
    *RoCE host-side NVMeoF support not yet available

A couple of important points to also note:

  • 200-240VAC required
  • DS, SAS-attached SSD shelves are NOT supported

An end-to-end NVMe solution obviously needs storage of some sort, so also announced today was the NS224 NVMe SSD Storage Shelf:

  • NVMe-based storage expansion shelf
  • 2RU, 24 storage SSDs
  • 400GB/s capable, 200Gb/sec per shelf module
  • Uplinked to controller via RoCEv2
  • Drive sizes available: 1.9TB, 3.8TB and 7.6TB. 15.3TB with restrictions.

Either controller in the A320 has eight 100GbE ports on-board, but not all of them are available for client-side connectivity. They are allocated as follows:

  • e0a → ClusterNet/HA
  • e0b → Second NS224 connectivity by default, or can be configured for client access, 100GbE or 40GbE
  • e0c → First NS224 connectivity
  • e0d → ClusterNet/HA
  • e0e → Second NS224 connectivity by default, or can be configured for client access, 100GbE or 40GbE
  • e0f → First NS224 connectivity
  • e0g → Client network, 100GbE or 40Gbe
  • e0h → Client network, 100GbE or 40Gbe

If you don’t get enough client connectivity with the on-board ports, then as listed previously, there are myriad PCIe options available to populate the two available slots. In addition to all that on-board connectivity, there’s also MicroUSB and RJ45 for serial console access as well as the RJ-45 Wrench port to host e0M and out-of-band management via BMC. As with most port-pairs, the 100GbE ports are hosted by a single ASIC which is capable of a total effective bandwidth of ~100Gb.

Food for thought…
One interesting design change in this HA pair, is that there is no backplane HA interconnect as has been the case historically; instead, the HA interconnect function is placed on the same connections as ClusterNet, e0a and e0d. This enables some interesting future design possibilities, like HA pairs in differing chassis. Also, of interest is the shelf connectivity being NVMe/RoCEv2; while currently connected directly to the controllers, what’s stopping NetApp from putting these on a switched fabric? Once they do that, drop the HA pair concept above, and instead have N+1 controllers on a ClusterNet fabric. Scaling, failovers and upgrades just got a lot more interesting.

Raw AutoSupport, tried and true – sysconfig

While NetApp keeps improving the front end that is ActiveIQ, for both pre-sales and support purposes, I constantly find myself going into the Classic AutoSupport and accessing the raw autosupport data; most often it’s sysconfig -a. Recently I was trying to explain the contents to a co-worker and I realized that I should just document it as a blog post. So here is sysconfig explained.

The command sysconfig -a is the old 7-mode command to give you all the hardware information from the point of view of ONTAP. All the onboard ports are assigned to “slot 0” whereas slot 1-X are the physical PCIe slots where myriad cards can be inserted. Here’s one example, I’ll insert comments as I feel it is appropriate. Continue reading

ONTAP 9.4 – Improvements and Additions

While the actual payload hasn’t hit the street yet, here’s what I can tell you about the latest release in the ONTAP 9 family which should be available here any day now. **EDIT: RC1 is here. 9.4 went GA today.

FabricPool

Lots of improvements to ONTAP’s object-tiering code in this release, it appears they’re really pushing development here:

  • Support for Azure Blob, both hot and cool tiers, no archive tier support
    • This adds to the already supported AWS-S3 and StorageGRID Webscale object stores
  • Support for cold-data tiering policies, whereas in 9.2,9.3 it was backup and snapshot-only tiering policies
    • Default definition of cold data is 31 days but can be adjust to anywhere from 2-63 days.
    • Not all cold blocks need to be made hot again, such as snapshot-only blocks. Random reads will be considered application access, declared hot and written back into performance tier whereas sequential reads are assumed to be indexers, virus scanners or other and should be kept cold and therefore will not be written back into performance tier.
  • Now supported in ONTAP Select, in addition to the existing ONTAP and ONTAP Cloud. Wherever you run ONTAP, you can now run FabricPools, SSD aggregate caveat still exists.
  • Inactive Data Reporting by OnCommand System Manager to determine how much data would be tiered if FabricPools were implemented.
    • This one will be key to clients thinking about adopting FabricPools
  • Object Store Profiler is a new tool in ONTAP that will test the performance of the object store you’re thinking of attaching so you don’t have to dive in without knowing what your expected performance should be.
  • Object Defragmentation now optimizes your capacity tier by reclaiming space that is no longer referenced by the performance tier
  • Compaction comes to FabricPools ensuring that your write stripes are full as well as applying Compression, Deduplication

Continue reading

ONTAP 9.3 is out soon, here’s the details you need.

It’s that time of year again, time for an ONTAP release…or at least an announcement. When 9.3 drops, not only will it be an LTS (Long Term Support) version, but NetApp continues to refine and enhance ONTAP.

Simplifying operations:

  • Application-Aware, data management for MongoDB
  • Adaptive QoS
  • Guide cluster setup and expansion
  • simplified data protection setup, much simpler.

Efficiencies:

Not so long ago, in ONTAP 9.2, NetApp introduced inline, aggregate-level dedupe. What many people may not have realized, due to the nature of way ONTAP coalesces writes in NVRAM prior to flushing them to the durable layer is that this inline aggregate dedupe’s domain was restricted to the data in the NVRAM. With 9.3, a post-process aggregate scanner has been implemented to provide true, aggregate-level dedupe.

Continue reading

What you need to know about NetApp’s 40GbE options

­With the introduction of the new NetApp platforms back in September 2016, came 40GbE as well as 32Gb Fibre Channel connectivity.

I had my first taste of 40GbE on the NetApp side back in January when I got to install the first All Flash FAS A700 in Canada. The client requested a mix of 40GbE and 16Gb FC with some of the 40GbE being broken out into 4 × 10GbE interfaces and some being used natively.

NetApp is deploying two flavours of 40GbE cards: the X1144A for the AFF A300, AFF A700s and FAS8200, and the X91440A for the AFF A700 and FAS9000 storage systems. At first glance, you might be tempted to assume that those are the same PCIe card since the part numbers are very similar (the latter just being in some sort of carrier to satisfy the I/O module requirement for the blade-style chassis that is home to the A700 and FAS9000), Upon further inspection the two are not exactly equal.

The ports on most PCIe cards and onboard interfaces are deployed in pairs, with one shared application-specific integrated circuit (ASIC) on the board behind the physical ports. On the X1144A, both external ports share one ASIC with an available combined bandwidth of 40Gb/s, whereas the X91440A has two ASICs. Each has two ports, but one is internal and not connected to anything, giving you 40Gb/s per external port.

The ASIC (or controller) in question is the Intel XL710. What’s important about this is that both external ports on an X91440A can be broken out to 4 × 10GbE interfaces for a total of eight, or one can remain at 40GbE while the other is broken out. On the X1144A however, you can either connect both ports to your switch using 40GbE connections or you can break-out port A to 4 × 10GbE and port B gets disabled. According to Intel, if you connect both ports via 40GbE, “The total throughput supported by the 710 series is 40 Gb/s, even when connected via two 40 Gb/s connections.”

Now before we get all up in arms about this, lets really get into the weeds here. Both the FAS8200/FAS9000 and the AFF A300/700 are using PCIe 3.0. Each PCIe 3.0 lane can carry 8 Gigatransfers per second (GT/s). For the purposes of this post, that is close enough to 8Gb/s. The FAS8200/AFF A300 has an Intel D-1587 CPU with a maximum eight lanes per slot, so roughly 64Gb/s of throughput, whereas the FAS9000/AFF A700 has an Intel E5-2697 with a maximum 16 lanes per I/O slot which gives it about 128Gb/s of throughput. So even if NetApp included a network interface card for the A300/FAS8200 with two XL710’s on it, the PCIe slot it’s connected to couldn’t provide 80Gb/s of throughput, whereas the the I/O modules in the A700/FAS9000 can.

Say you want to change between 40GbE and 10GbE. Unlike modifying UTA2 profiles (as explained here), with the XL710, you need to get into maintenance mode first and use the nicadmin command. Here’s an example:

sysconfig output before:

slot 1: 40 Gigabit Ethernet Controller XL710 QSFP+
                 e1a MAC Address:    00:a0:98:c5:b2:fb (auto-40g_cr4-fd-up)
                 e1e MAC Address:    00:a0:98:c5:b2:ff (auto-unknown-down)

At this point I already had the breakout cable installed. That’s why the second link shows as down.

Conversion example:

*> nicadmin
 nicadmin convert -m { 40G | 10G } <port-name>
 
 
 *> nicadmin convert -m 10g e1e
 Converting e1e 40G port to four 10G ports
 Halt, install/change the cable, and then power-cycle the node for
 the conversion to take effect.  Depending on the hardware model,
 the SP (Service Processor) or BMC (Baseboard Management Controller)
 can be used to power-cycle the node.

sysconfig output after:

slot 1: 40 Gigabit Ethernet Controller XL710 QSFP+
                 e1a MAC Address:    00:a0:98:c5:b2:fb (auto-40g_cr4-fd-up)
                 e1e MAC Address:    00:a0:98:c5:b2:ff (auto-10g_twinax-fd-up)
                 e1f MAC Address:    00:a0:98:c5:b3:00 (auto-10g_twinax-fd-up)
                 e1g MAC Address:    00:a0:98:c5:b3:01 (auto-10g_twinax-fd-up)
                 e1h MAC Address:    00:a0:98:c5:b3:02 (auto-10g_twinax-fd-up)

Unfortunately I don’t have access to either a FAS8200 nor an AFF A300 with 40GbE otherwise I’d provide the sysconfig output before and after there as well.

Now, there’s a bit of a debate going on around the viability of 40GbE over 100GbE. While 40GbE is simply a combined 4 × 10GbE; 100GbE is only a combined 4 × 25GbE. With regards to production costs, apparently to make a 40GbE QSFP+, you literally combine 4 lasers (hence the Q in QSFP) into the module; well, the same goes for 100GbE. You only need one laser to produce the wavelength for 25GbE, and while that still means you need four for 100GbE, four times the production cost still yields 250% of the throughput of 40GbE which makes me wonder where it will end up in a year.

So there you go, more than you ever wanted to know about NetApp’s recent addition of 40GbE into the ONTAP line of products as well as my personal philosophical waxing around the 40 versus 100 GbE debate.