
What you need to know about NetApp’s 40GbE options

With the introduction of the new NetApp platforms back in September 2016 came 40GbE as well as 32Gb Fibre Channel connectivity.

I had my first taste of 40GbE on the NetApp side back in January when I got to install the first All Flash FAS A700 in Canada. The client requested a mix of 40GbE and 16Gb FC with some of the 40GbE being broken out into 4 × 10GbE interfaces and some being used natively.

NetApp is deploying two flavours of 40GbE cards: the X1144A for the AFF A300, AFF A700s and FAS8200, and the X91440A for the AFF A700 and FAS9000 storage systems. At first glance, you might be tempted to assume that those are the same PCIe card since the part numbers are very similar, with the latter just sitting in some sort of carrier to satisfy the I/O module requirement for the blade-style chassis that is home to the A700 and FAS9000. Upon further inspection, the two are not exactly equal.

The ports on most PCIe cards and onboard interfaces are deployed in pairs, with one shared application-specific integrated circuit (ASIC) on the board behind the physical ports. On the X1144A, both external ports share one ASIC with an available combined bandwidth of 40Gb/s, whereas the X91440A has two ASICs. Each of those ASICs has two ports, but one is internal and not connected to anything, giving you 40Gb/s per external port.

The ASIC (or controller) in question is the Intel XL710. What’s important about this is that both external ports on an X91440A can be broken out to 4 × 10GbE interfaces for a total of eight, or one can remain at 40GbE while the other is broken out. On the X1144A, however, you can either connect both ports to your switch using 40GbE connections, or you can break out port A to 4 × 10GbE, in which case port B gets disabled. According to Intel, if you connect both ports via 40GbE, “The total throughput supported by the 710 series is 40 Gb/s, even when connected via two 40 Gb/s connections.”
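To keep the two cards straight, here is a minimal Python sketch of the breakout rules as I've described them. The function name `port_layout` and the "40G"/"10G" mode strings are my own invention for illustration; this is not a NetApp tool or API, just the rules from this post written down as code.

```python
# Sketch of the breakout rules described in this post; not a NetApp API.
ASIC_GBPS = 40  # throughput of one Intel XL710, per the Intel quote


def port_layout(card, port_a, port_b):
    """Return (num_40g_links, num_10g_links, total_asic_gbps).

    port_a / port_b are "40G" (native) or "10G" (a 4 x 10GbE breakout).
    """
    for mode in (port_a, port_b):
        if mode not in ("40G", "10G"):
            raise ValueError(f"unknown port mode: {mode}")

    if card == "X91440A":
        # Two ASICs, one external port each, so the ports are independent.
        modes = [port_a, port_b]
        return modes.count("40G"), 4 * modes.count("10G"), 2 * ASIC_GBPS

    if card == "X1144A":
        # One shared ASIC: breaking out port A disables port B.
        if port_a == "10G":
            return 0, 4, ASIC_GBPS
        if port_b == "10G":
            raise ValueError("this post only describes breaking out port A on the X1144A")
        # Both ports at 40GbE still share a single 40Gb/s ASIC.
        return 2, 0, ASIC_GBPS

    raise ValueError(f"unknown card: {card}")


print(port_layout("X91440A", "10G", "10G"))
print(port_layout("X1144A", "40G", "40G"))
```

The thing to notice is the last value in each tuple: two 40GbE links on an X1144A still only add up to 40Gb/s of ASIC throughput.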

Now before we get all up in arms about this, let’s really get into the weeds here. Both the FAS8200/FAS9000 and the AFF A300/A700 are using PCIe 3.0. Each PCIe 3.0 lane can carry 8 Gigatransfers per second (GT/s). For the purposes of this post, that is close enough to 8Gb/s. The FAS8200/AFF A300 has an Intel D-1587 CPU with a maximum of eight lanes per slot, so roughly 64Gb/s of throughput, whereas the FAS9000/AFF A700 has an Intel E5-2697 with a maximum of 16 lanes per I/O slot, which gives it about 128Gb/s of throughput. So even if NetApp included a network interface card for the A300/FAS8200 with two XL710s on it, the PCIe slot it’s connected to couldn’t provide 80Gb/s of throughput, whereas the I/O modules in the A700/FAS9000 can.
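If you want to check the slot math yourself, here is a rough sketch using the same rounding as above (8Gb/s per lane). Strictly speaking, PCIe 3.0 uses 128b/130b encoding, so the usable figure is closer to 7.88Gb/s per lane, but that doesn't change the conclusion. The function names are just ones I made up for this example.

```python
GBPS_PER_LANE = 8   # this post's rounding of 8 GT/s per PCIe 3.0 lane
XL710_GBPS = 40     # throughput of one XL710 ASIC


def slot_gbps(lanes):
    """Approximate throughput of a PCIe 3.0 slot with the given lane count."""
    return lanes * GBPS_PER_LANE


def slot_can_feed(lanes, asic_count):
    """True if the slot can carry the combined throughput of asic_count XL710s."""
    return slot_gbps(lanes) >= asic_count * XL710_GBPS


for platform, lanes in (("FAS8200 / AFF A300", 8), ("FAS9000 / AFF A700", 16)):
    verdict = "can feed" if slot_can_feed(lanes, 2) else "cannot feed"
    print(f"{platform}: ~{slot_gbps(lanes)}Gb/s per slot, {verdict} two XL710s")
```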

Say you want to change between 40GbE and 10GbE. Unlike modifying UTA2 profiles (as explained here), with the XL710, you need to get into maintenance mode first and use the nicadmin command. Here’s an example:

sysconfig output before:

slot 1: 40 Gigabit Ethernet Controller XL710 QSFP+
                 e1a MAC Address:    00:a0:98:c5:b2:fb (auto-40g_cr4-fd-up)
                 e1e MAC Address:    00:a0:98:c5:b2:ff (auto-unknown-down)

At this point I already had the breakout cable installed. That’s why the second link shows as down.

Conversion example:

*> nicadmin
 nicadmin convert -m { 40G | 10G } <port-name>
 
 
 *> nicadmin convert -m 10g e1e
 Converting e1e 40G port to four 10G ports
 Halt, install/change the cable, and then power-cycle the node for
 the conversion to take effect.  Depending on the hardware model,
 the SP (Service Processor) or BMC (Baseboard Management Controller)
 can be used to power-cycle the node.

sysconfig output after:

slot 1: 40 Gigabit Ethernet Controller XL710 QSFP+
                 e1a MAC Address:    00:a0:98:c5:b2:fb (auto-40g_cr4-fd-up)
                 e1e MAC Address:    00:a0:98:c5:b2:ff (auto-10g_twinax-fd-up)
                 e1f MAC Address:    00:a0:98:c5:b3:00 (auto-10g_twinax-fd-up)
                 e1g MAC Address:    00:a0:98:c5:b3:01 (auto-10g_twinax-fd-up)
                 e1h MAC Address:    00:a0:98:c5:b3:02 (auto-10g_twinax-fd-up)
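If you're converting a lot of ports and want to sanity-check the result, something like the following Python sketch could pull the port names and link states out of saved sysconfig output. The regular expression is only based on the sample lines in this post, so treat the format as an assumption; other ONTAP versions or cards may print something different.

```python
import re

# Matches lines such as:
#   e1e MAC Address:    00:a0:98:c5:b2:ff (auto-10g_twinax-fd-up)
PORT_LINE = re.compile(
    r"^\s*(?P<port>e\w+)\s+MAC Address:\s+"
    r"(?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2})\s+"
    r"\((?P<status>[^)]+)\)"
)


def parse_ports(sysconfig_text):
    """Return one dict per port line found in the given sysconfig text."""
    ports = []
    for line in sysconfig_text.splitlines():
        match = PORT_LINE.match(line)
        if match:
            status = match.group("status")
            ports.append({
                "port": match.group("port"),
                "mac": match.group("mac"),
                "status": status,
                "link_up": status.endswith("-up"),
            })
    return ports


SAMPLE_AFTER = """\
slot 1: 40 Gigabit Ethernet Controller XL710 QSFP+
                 e1a MAC Address:    00:a0:98:c5:b2:fb (auto-40g_cr4-fd-up)
                 e1e MAC Address:    00:a0:98:c5:b2:ff (auto-10g_twinax-fd-up)
                 e1f MAC Address:    00:a0:98:c5:b3:00 (auto-10g_twinax-fd-up)
                 e1g MAC Address:    00:a0:98:c5:b3:01 (auto-10g_twinax-fd-up)
                 e1h MAC Address:    00:a0:98:c5:b3:02 (auto-10g_twinax-fd-up)
"""

for entry in parse_ports(SAMPLE_AFTER):
    print(entry["port"], entry["status"], "up" if entry["link_up"] else "down")
```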

Unfortunately, I don’t have access to either a FAS8200 or an AFF A300 with 40GbE, otherwise I’d provide the sysconfig output before and after there as well.

Now, there’s a bit of a debate going on around the viability of 40GbE over 100GbE. While 40GbE is simply a combined 4 × 10GbE, 100GbE is a combined 4 × 25GbE. With regards to production costs, apparently to make a 40GbE QSFP+ you literally combine four lasers (hence the Q in QSFP) into the module, and the same goes for 100GbE. You only need one laser to produce the wavelength for 25GbE, so while you still need four of them for 100GbE, that same four-laser build yields 250% of the throughput of 40GbE, which makes me wonder where it will end up in a year.
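The arithmetic behind that 250% figure is simple enough to write down. This is only the lane math from the paragraph above; it says nothing about actual module pricing, which I'm not in a position to quote.

```python
LANES = 4  # both module types combine four lanes (four lasers)


def per_lane_gbps(total_gbps, lanes=LANES):
    """Throughput carried by each lane of a four-lane module."""
    return total_gbps / lanes


throughput_ratio = 100 / 40  # 100GbE relative to 40GbE for the same lane count

print(per_lane_gbps(40), per_lane_gbps(100), f"{throughput_ratio:.0%}")
```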

So there you go, more than you ever wanted to know about NetApp’s recent addition of 40GbE into the ONTAP line of products as well as my personal philosophical waxing around the 40 versus 100 GbE debate.

NetApp Volume Encryption, The Nitty Gritty

It all begins in the configuration builder tool

This article focuses on the implementation and management of encryption with NetApp storage. Data at Rest Encryption (NetApp Volume Encryption or NVE for short) is one of the ways that you can achieve encryption with NetApp, and it’s one of the most exciting new features of ONTAP 9.1. Here’s how you go about implementing it.

If you’re a partner or NetApp SE, when building configurations, as long as the cluster software version is set to 9.x, there is a checkbox that lets you decide which version of ONTAP gets written to the device at the factory. As of 9.1, ONTAP software images will either be capable of encryption via a software encryption module, or not. There are laws around both the import and export of software that is capable of encryption, but that is beyond the scope of this article. I do know you can use the encryption-capable image in Canada (where I am located), so I’m covered. If you’re unsure about the laws in your country, consult your legal adviser on this matter.

Once this cluster-level toggle has been set and you add hardware into the configuration, there are two more checkboxes in the software section:

  1. NetApp Volume Encryption (off by default)
  2. Trusted Platform Module (TPM, on by default) ***Clarification Update*** – TPM NOT REQUIRED FOR NVE

The first one triggers the generation of the license key for NVE and the second one activates a piece of hardware dedicated to dealing with cryptographic keys. One thing I’m still not sure of is whether, should you choose to remove the checkmark, the TPM is simply disabled or doesn’t physically exist in your NetApp controller. I have an email into NetApp to confirm this. [Update: The module is integral to the controller and disabled in firmware if being shipped to certain countries. Shout out to @Keith_Aasen for tracking that down for me.]

Okay, now for the more customer-relevant information…

To get started with NVE, you’re going to need a few things:

  1. An encryption-capable platform
  2. An encryption-capable image of ONTAP
  3. A key manager
  4. A license key for NVE

Encryption-capable platform

The following platforms are currently capable of encryption: FAS6290, FAS80xx, FAS8200, and AFF A300. This is limited by the CPU in the platform as it must have a sufficient clock-speed and core-count with support for the AES instruction set. I’m sure this list will be ever-expanding, but be sure to check first if you’re hoping to use NVE. [UPDATE: After some digging, I can confirm that all the new models support NVE, the entry-level FAS2650 included.]

Encryption-capable image of ONTAP

Provided you’re not in a restricted country as per the above, your image will be the standard nomenclature of X_q_image.tgz where X is the version number. The non-encryption-capable version will be X_q_nodar_image.tgz which I’ll simply refer to as nodar(e) (No Data At Rest Encryption) for the rest of this article. The output of version -v will tell you if you’re nodar or standard.

NetApp Release 9.1RC1: Sun Sep 25 20:10:49 UTC 2016 <1O>

NetApp Release 9.1RC1: Sun Sep 25 20:10:49 UTC 2016 <1Ono-DARE>
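If you're checking a bunch of nodes or image files, a quick script can do the looking for you. This sketch only relies on the two markers mentioned above: the `_nodar_` part of the image file name and the `no-DARE` tag in the version string. The function names are mine, and the file names in the example are the literal placeholders from this post rather than real image names.

```python
def is_nodar_image(filename):
    """True if the image file name follows the nodar nomenclature."""
    return filename.endswith("_q_nodar_image.tgz")


def is_nodar_release(version_line):
    """True if a 'version -v' line carries the no-DARE tag."""
    return "no-DARE" in version_line


standard = "NetApp Release 9.1RC1: Sun Sep 25 20:10:49 UTC 2016 <1O>"
nodar = "NetApp Release 9.1RC1: Sun Sep 25 20:10:49 UTC 2016 <1Ono-DARE>"

print(is_nodar_release(standard), is_nodar_release(nodar))
print(is_nodar_image("X_q_image.tgz"), is_nodar_image("X_q_nodar_image.tgz"))
```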

Key manager

The on-board key manager introduced in ONTAP 9.0 enables you to manage keys for use with your NSE drives, helping you avoid costly and possibly complex external solutions. Currently, NVE only supports using the on-board manager, so if you’re going to use NVE layered on top of NSE, you need to use the on-board one.

Setting this up is exactly one command:

security key-manager setup

You’ll be prompted for a passphrase, and that’s it, you’re done.

License key for NVE

If you didn’t get this license key at time of purchase, talk to your account representative or SE over at NetApp (though, hopefully, if you’ve bought one of the new systems announced at Insight 2016, they decided to include it since, at least for now, it is a no-cost license).

What next?

Now that you’ve got all the prerequisites covered, encrypting your data is very simple. As the name implies, encryption is done at the volume level, so naturally it’s a volume command that encrypts the data (a volume move command, in fact):

volume move start -volume vol_name -destination-aggregate aggr_name -encrypt-destination true

The destination aggregate can even be the same aggregate that the volume is already hosted on. Don’t want that volume encrypted anymore for some reason? Change that last flag to false.

If you’re creating a new volume that you want encrypted, that’s just as simple:

volume create -volume vol_name -aggregate aggr_name -size 1g -encrypt true
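If you have more than a handful of volumes, you'll probably want to generate these commands rather than type them. Here is a small Python sketch that only builds the command strings using the flags from the two examples above; it doesn't connect to a cluster or run anything, and the volume and aggregate names are made up.

```python
def encrypt_existing(volume, aggregate, encrypt=True):
    """Build the volume move command that encrypts (or decrypts) a volume."""
    flag = "true" if encrypt else "false"
    return (
        f"volume move start -volume {volume} "
        f"-destination-aggregate {aggregate} -encrypt-destination {flag}"
    )


def create_encrypted(volume, aggregate, size):
    """Build the volume create command for a new encrypted volume."""
    return f"volume create -volume {volume} -aggregate {aggregate} -size {size} -encrypt true"


# Hypothetical volume and aggregate names, for illustration only.
for vol in ("vol_finance", "vol_hr"):
    print(encrypt_existing(vol, "aggr1"))
print(create_encrypted("vol_new", "aggr1", "1g"))
```

Review the generated lines before pasting them into the CLI, since each volume move is a real data move on the back end.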

Wrapping up

NetApp Volume Encryption is pretty easy, but since it’s so new, OnCommand System Manager doesn’t support it just yet. You’ll have to stick to the CLI for now, although I’m sure the GUI will catch up eventually, if that’s your preferred point of administration. It should also be noted that while NSE solutions are FIPS 140-2 compliant, NVE has yet to go through the qualifications. Also, if FIPS is a requirement, the on-board key manager isn’t compliant yet either. Since with the on-board key manager the keys are literally stored on the same hardware using them, NVE only protects you from compromised data on individual drives removed from your environment through theft or RMA. If someone gained wholesale access to the HA pairs, the data would still be retrievable. Also, this is for data-at-rest only. You must follow other precautions for data-in-flight encryption.

Into the weeds

I did all my tests for this post using the simulator, and I learned a lot, but your mileage may vary. In the end, only you are responsible for what you do to your data. I had heard that if you have the wrong software image then you’d have to do a complete wipe of your HA pair in order to convert it. I have since proven this wrong (at least in the simulator) and I definitely can’t guarantee the following will be supported.

For my tests I had two boot images loaded: one standard and one nodar. What I learned is that you can boot into either mode, provided you don’t have any encrypted data. Even if you have the key manager set up and NVE is licensed, you can still boot back and forth. The first time you boot your system using the nodar image with encrypted data on the system, however, you’ll hose the whole thing. I did test first encrypting data, then decrypting it, then converting to nodar, and the simulator booted fine. When I booted into nodar with an encrypted volume, even going back to standard didn’t work. Booting into maintenance mode shows the aggregates with a status of partial and the boot process hints that they are in some sort of transition phase (7MTT?). Either way, I was unable to recover my simulator once I got it to this state, so I definitely advise against it in production. Heck, I’d advise you just to use the proper image to start with.

I hope you learned something. If you have any questions or comments, either post them below or reach out on Twitter. I’m @ChrisMaki from the #NetAppATeam and Solution Architect @ScalarDecisions.

ADP(v1) and ADPv2 in a nutshell, it’s delicious!

Ever since clustered Data ONTAP went mainstream over 7-Mode, the dedicated root aggregate tax has been a bone of contention for many, especially for those entry-level systems with internal drives. Can you imagine buying a brand new FAS2220 or FAS2520 and being told that not only are you going to lose two drives as spares, but also another six to your root aggregates? This effectively left you with four drives for your data aggregate, two of which would be devoted to parity. I don’t think so. Now, this is a bit of an extreme example that was seldom deployed. Hopefully you had a deployment engineer who cared about the end result and would use RAID-4 for the root aggregates and maybe not even assign a spare to one controller, giving you seven whole disks for your active-passive deployment. Still, this was kind of a shaft. In a 24-disk system deployed active-active, you’d likely get something like this:

Traditional cDOT
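Going back to the 12-drive example for a moment, the drive budget is easy to work out in a few lines of Python. The root aggregate sizes here are inferred from the numbers above: three drives per controller for the default layout (six in total) and two per controller with RAID-4 (which is what gets you to seven data disks with a single spare). The function name is my own.

```python
def data_drives(total, root_drives_per_controller, spares, controllers=2):
    """Drives left for data aggregates after root aggregates and spares."""
    return total - controllers * root_drives_per_controller - spares


# Default layout: six drives for root aggregates, two spares.
default_layout = data_drives(12, root_drives_per_controller=3, spares=2)

# RAID-4 root aggregates and only one spare, deployed active-passive.
tuned_layout = data_drives(12, root_drives_per_controller=2, spares=1)

print(default_layout, "drives for data by default, two of which go to parity")
print(tuned_layout, "drives for data with RAID-4 roots and a single spare")
```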

Enter ADP.

In the first version of ADP introduced in version 8.3, clustered Data ONTAP gained the ability to partition drives on systems with internal drives as well as the first two shelves of drives on All Flash FAS systems. What this meant was the dedicated root aggregate tax got a little less painful. In this first version of ADP, clustered Data ONTAP carved each disk into two partitions: a small one for the root aggregates and a larger one for the data aggregate(s). This was referred to as root-data or R-D partitioning. The smaller partition’s size depended on how many drives existed. You could technically buy a system with fewer than 12 drives, but the ADP R-D minimum was eight drives. By default, both partitions on a disk were owned by the same controller, splitting overall disk ownership in half.

8.3 ADP, R-D

 

You could change this with some advanced command-line trickery to still build active-passive systems and gain two more drive partitions’ worth of data. Since you were likely only building one large aggregate on your system, you could also accomplish this in System Setup if you told it to create one large pool. This satisfied the masses for a while, but then those crafty engineers over at NetApp came up with something better.

Enter ADPv2.

Starting with ONTAP 9, not only did ONTAP get a name change (7-Mode hasn’t been an option since version 8.2.3), but it also gained ADPv2, which carves the aforementioned data partition in half. This is known as R-D2 (Root-Data-Data) sharing, and it applies to SSDs only; spinning disks aren’t eligible for this secondary partitioning. In this new version, you get one drive back that you would have allocated to be a spare, and you also get two of the parity drives back, lessening the pain of the RAID tax. With a minimum requirement of eight drives and a maximum of 48, here are the three main scenarios for this type of partitioning.

12 Drives:

ADPv2, R-D2 ½ shelf

24 Drives:

ADPv2, R-D2 1 shelf

48 Drives:

ADPv2, R-D2 2 shelves

As you can see, this is a far more efficient way of allocating your storage that yields up to ~17% more usable space on your precious SSDs.

So that’s ADP and ADPv2 in a nutshell: a change for the better. Interestingly enough, the ability to partition disks has led to a radical change in the FlashPool world called “Storage Pools,” but that’s a topic for another day.

NetApp refreshes entire line of FAS and AFF platforms

Today NetApp announced a complete revamping of both the FAS and AFF lines and with it a divergence in model numbers. My favourite improvement is that NetApp has changed the way FlashCache gets delivered; now all FAS platforms can take advantage of FlashCache using an M.2 NVMe device onboard the controller, even the entry-level models; in fact, it’s standard on all models. In the realm of connectivity, both the top-end FAS as well as all AFFs can now offer not only 40GbE but 32Gb FC as well, first to market for both of these.

Without further ado, here are the new models in the FAS line:

  • FAS2620 and FAS2650
    • Appears to be the same 2RU enclosure as the FAS2240-2, FAS2552, and DS2246, likely with an upgraded mid-plane.
    • FAS2620 holds 12 large form factor (3.5″ NL-SAS/SSD) drives internally
    • FAS2650 holds 24 small form factor (2.5″ SAS/SSD) drives internally
    • Both models come with 1TB of FlashCache
  • FAS8200
    • Appears to be the same 3RU enclosure as the FAS8020
    • 1TB of FlashCache is now standard, upgradeable to 4TB
  • FAS9000
    • This all-new chassis separates the I/O from the controller so there are no more onboard ports and all I/O is done using PCIe cards, 10 slots per node.
    • 2TB of FlashCache are now part of the standard configuration, upgradeable to 16TB.

And the new AFF line now consists of:

  • A300 (Same chassis as FAS8200)
  • A700 (Same chassis as FAS9000)

Strictly the numbers*:

Model | RU | RAM | NVRAM (NVMEM) | Max HDD (SSD) | Max Flash Cache | Max Flash Pool | Onboard UTA2 | Onboard 10GbE | Onboard 10GbE Base-T | Onboard 12Gb SAS | PCIe Expansion Slots | Cores
FAS
FAS2620 | 2 | 64GB | (8GB) | 144 | 1TB | 24TB | 8 | 4 | 4 | N/A | N/A | 12
FAS2650 | 2 | 64GB | (8GB) | 144 | 1TB | 24TB | 8 | 4 | 4 | N/A | N/A | 12
FAS8200 | 3 | 256GB | 16GB | 480 | 4TB | 48TB | 8 | 4 | 4 | 4 | 4 | 32
FAS9000 | 8 | 1024GB | 64GB | 1440 (480) | 16TB | 144TB | N/A | N/A | N/A | N/A | 20 | 72
AFF
A300 | 3 | 256GB** | 16GB | 384 | N/A | N/A | 8 | 4 | 4 | 4 | 4 | 32
A700 | 8 | 1024GB | 64GB | 480 | N/A | N/A | N/A | N/A | N/A | N/A | 20 | 72
  • *Numbers are per HA pair
  • **16GB Carved out for NVLOGS

Performance Improvements

The FAS2600 comes with three times as many cores, twice as much memory and more than three times the NVMEM of the FAS2500. It also brings 12Gb SAS and 1TB of NVMe FlashCache, and is expected to perform 200% faster than its predecessor running 8.3.x, making the entry-level line of controllers smoking fast. The FAS8200 has twice as many cores and four times as much memory as the FAS8040 and also comes with 12Gb SAS and 1TB of FlashCache, making it roughly 50% faster.

The new top-end model, the FAS9000, goes modular, decoupling I/O from the controllers. This performance monster, which has 2TB of FlashCache standard and 20 PCIe slots for I/O, is expected to run 50% faster than the FAS8080 on 8.3.x. A cluster of 24 FAS9000 nodes (12 HA pairs) scales up to as much as 172PB.

FAS9000 AFF A700 Chassis

Here’s how the new models map to the old:

New FAS platforms

As for the new AFF models, the A300 should get about 50% more throughput than the AFF8040 running 8.3.1, while the A700 aims to replace the dual-chassis AFF8080, saving four precious rack units but still providing 100% more IOPS. In fact, it should be able to handle about double the workload at half the latency.

Oracle testing

And here’s how the new AFFs line up with the existing ones:

AFF model alignment

The new lineup, both FAS and AFF, definitely addresses some concerns. FlashCache being not only available throughout the FAS line but standard as well is a move in the right direction, as is the addition of 12Gb SAS. The introduction of both 40GbE and 32Gb FC into the mid-range and upper models of both lines should provide the fire hose required to deliver all that new controller and storage back-end performance. The two new AFF model numbers lead me to believe that they may be leaving room in the middle to add models to the line.

While ADP has been around for a while and is a great workaround for dedicated root aggregates, I would love to see NetApp move away from root aggregates completely and do something with M.2. I’ll keep my fingers crossed for this one, but won’t hold my breath either.