
ONTAP 9.8 has been announced

Timed perfectly with NetApp INSIGHT 2020 is the annual ONTAP payload announcement. Once again, there’s a lot in this payload, so I will simply deliver a list of bulleted sections, addressing as many of the changes as I’m able. I’ll provide additional detail on the ones I feel are the most interesting. For a full rundown, please consult the release notes or start a conversation with me on Twitter.

FlexGroup Volume Enhancements

  • Async Delete
    • Delete large datasets rapidly from the CLI.
      • This is great for those high file count deployments.
  • Backup enhancements
    • 1,023 snapshots supported
    • NDMP enhancements
  • FlexVol to FlexGroup in-place conversion enhancements
  • VMware datastore support
  • Proactive resizing of constituent volumes

FlexCache Volumes, a true global namespace

  • SMB support added with distributed locking
  • 10x origin to cache fan-out ratio, now 1:100
  • Caching of SnapMirror secondary volumes
  • Cache pre-population

Data Visibility

  • File system analytics, viewable in System Manager
    • Enabled on a per-volume basis
    • Can also be queried via API access
  • QoS for Qtrees
    • IOPS and throughput policies available per qtree object
    • Managed via REST API or CLI
    • Qtree-level statistics
    • NFS only in this release, no adaptive QoS

All-SAN Array (ASA) enhancements

  • Persistent FC Ports
    • Symmetric active/active host-to-LUN access
    • Each node on the ASA will maintain a “shadow FC LIF”, reducing SAN failover times even further.
  • Larger Capacities
    • Max LUN = 128TB
    • Max FlexVol = 300TB
      • These limit increases are on the ASA only
  • MCC-IP support
  • Priced ~20% less than unified platforms

Figures: Before Persistent FC Ports; With Persistent FC Ports

ONTAP S3

  • Preview-only in 9.7, GA in 9.8
  • System manager integration
  • Bucket access policies
  • Multiple buckets per volume
  • TLS 1.2 support
  • Multi-part upload
    ONTAP S3 is not a replacement for a dedicated, global object store

Storage Efficiency Enhancements

  • FabricPool
    • Tiering from HDD aggregates
    • Object tagging (For information life cycle policies)
    • Increased cooling period (max 183 days)
    • Cloud retrieval
  • Storage efficiencies
    • Differentiation of hot and cold data for application of different compression methods: 8k compression groups for hot data, 32k for cold
    • Deduplication prior to compression

Simplification

  • Upgrade directly to two versions newer without passing via intermediary version
  • Headswaps: replacement nodes running the latest version of ONTAP can be used with nodes running versions of ONTAP up to two versions behind
  • REST API enhancements
    • ZAPI to REST mapping documentation
    • ONTAP version information in API documentation
  • System Manager Improvements
    • Single-click firmware upgrades
    • File system analytics
      • Granular details about your NAS file systems
    • Hardware and Network visualization
    • Data Protection Enhancements
      • Reverse resync
  • Simpler Compliance
    • Volume move support, no second copy required
    • WORM as the default
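
The "two versions newer" upgrade rule from the Simplification list can be sketched in a few lines of Python. This is only an illustration under assumptions: the RELEASES list and the upgrade_hops helper are hypothetical names of mine, and the real supported paths are whatever the release notes say.

```python
# Illustrative, ordered list of releases (an assumption for this sketch).
RELEASES = ["9.5", "9.6", "9.7", "9.8"]


def upgrade_hops(current, target, max_jump=2, releases=RELEASES):
    """Return the sequence of releases to pass through, jumping at most
    max_jump releases forward per upgrade step."""
    i, j = releases.index(current), releases.index(target)
    if j < i:
        raise ValueError("target is older than the current release")
    path = [current]
    while i < j:
        i = min(i + max_jump, j)  # jump as far as allowed without overshooting
        path.append(releases[i])
    return path


print(upgrade_hops("9.6", "9.8"))  # direct: ['9.6', '9.8']
print(upgrade_hops("9.5", "9.8"))  # one stop on the way: ['9.5', '9.7', '9.8']
```

With max_jump=1 the same helper models the older behaviour of stepping through every intermediate release.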

Security and Data Protection Enhancements

  • Secure purge
    • crypto shred individual files
  • IPSec
    • encrypted network traffic, regardless of protocols
      • Simplifies secure NFS, no need for Kerberos
      • iSCSI traffic on the wire can now be encrypted
  • Node root volume encryption
  • MetroCluster
    • Unmirrored aggregate support
  • SnapMirror
    • SnapMirror Business Continuity (SM-BC) provides automated failover of synchronous SnapMirror relationships for application-level, granular protection
      • These are non-disruptive
      • SM-BC is preview-only in 9.8 and SAN-only.
    • SnapMirror to Object Store
      • Google Cloud, Azure, or AWS
      • Metadata included so Object Store data is a complete archive
      • Efficiencies maintained

Figure: SnapMirror to Object Store

Virtualization Enhancements

  • FlexGroup volumes as VMware datastores
  • SnapCenter backup support
  • 64TB SAN datastore on the ASA
  • SRA support for SnapMirror Synchronous
  • Support for Tanzu storage

That sums up the majority of the improvements, and I’m looking forward to this release coming out. See you at NetApp INSIGHT 2020!

What’s going on with Intel’s X710 Ethernet controller?

I’ve previously written about this Ethernet controller, back when 40GbE was relatively new to NetApp’s FAS and AFF controllers. Since that article, I’ve started to come across various oddities with this Ethernet controller.

Last fall, I had a customer who was experiencing problems with LACP during an ONTAP upgrade (9.1 → 9.3 → 9.5P6) on their AFF A700s using the X1144A, a dual-port 40GbE card that uses the Intel X710 Ethernet controller. We had the first 40GbE port broken out into 4x10GbE links, two each to either half of a pair of Cisco Nexus N9K-C9396PX switches in the same vPC domain. During a controller reboot, we noticed that most or all of the ports in the interface group using multimode_lacp wouldn’t come up, and on the Cisco side the port(s) would become disabled due to too many link up/down events. Our first instinct was to look at potential cable problems, but we quickly dismissed that idea. After some digging, it looked as though NetApp was referencing Cisco Bug ID CSCuv87644 as potentially related. This led me down a long path of investigating the changes made to the networking stack in ONTAP over the past couple of years, and I’ve still got a post I’m working on around that. The workaround was to increase the debounce timer value on the Cisco 9k to 525ms; the default value is 100ms.

The port debounce time is the amount of time that an interface waits to notify the supervisor of a link going down. During this time, the interface waits to see if the link comes back up. The wait period is a time when traffic is stopped.

Source: Cisco
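
To make the debounce behaviour concrete, here is a minimal Python sketch of the idea in the Cisco quote: a link drop is only reported if the link stays down for at least the debounce time. The function names and the flap timings are made up for illustration; they are not measurements from the customer's switches and not how NX-OS implements the timer.

```python
def supervisor_notified(down_at_ms, up_at_ms, debounce_ms):
    """Return True if a link-down event would be reported.

    The interface waits debounce_ms after the link drops. If the link
    is back up before the timer expires, the flap is absorbed.
    """
    if up_at_ms is None:  # the link never came back
        return True
    return (up_at_ms - down_at_ms) >= debounce_ms


def count_reported_flaps(flaps, debounce_ms):
    """Count how many (down, up) flaps outlast the debounce timer."""
    return sum(supervisor_notified(d, u, debounce_ms) for d, u in flaps)


# Hypothetical flaps during a controller reboot, as (down_ms, up_ms) pairs.
# The outages last 150, 300, 450 and 80 ms respectively.
flaps = [(0, 150), (1000, 1300), (2000, 2450), (3000, 3080)]

print(count_reported_flaps(flaps, 100))  # default timer: 3 flaps reported
print(count_reported_flaps(flaps, 525))  # raised timer: 0 flaps reported
```

The trade-off is the one Cisco calls out: a longer timer hides short flaps from the supervisor, but traffic is stopped while the interface waits.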

Recently, a different customer of mine was trying to buy a Nimble HF20 and they wanted to include the Q8C17B, a four-port 10GbE NIC, also based on the Intel X710 Ethernet controller. The vendor came back to me and said they needed to know if the customer was going to be using VLAN tagging on the Q8C17B, because if they needed VLAN tagging, they’d have to choose a two-port NIC instead. This confused me, but after some emails back and forth, HPE Nimble Storage Alert # EXT-0061 was referenced as the reason for this. At some point Nimble will release a patch that updates the firmware on this NIC, hopefully bringing back VLAN functionality. After a bit of looking around, I found that the same VLAN issue has been identified by VMware in KB2149781.

Lastly, I also came across a NIST vulnerability from 2017 regarding the same Ethernet controller. It seems that has since been addressed in a firmware update.

While the above doesn’t necessarily imply a huge problem with the X710, I simply found these issues interesting and thought I’d include them all in one post.

ONTAP Fall 2019 Update – 9.7

Right on schedule, to coincide with NetApp INSIGHT 2019, is the announcement of the next release of NetApp’s ONTAP, 9.7. Going over the list of improvements, much of what is expected in 9.7 seems incremental. The themes for this release are High Performance, Simplicity and Data Protection. This release will also bring support for a few new platforms: the FAS8300, the FAS8700 and the AFF A400. There’s also a new twist on the A220 and A700, which become the first models in the new All SAN Array (ASA) versions of the all-flash FAS.

FlexCache, the most recent feature to be brought back from the depths of 7-mode, gets a bit more attention. First up, both FC and IP MetroCluster support, allowing you to extend a volume namespace across MCC sites and per-site load-balancing for NFS clients. Also, FlexGroups can now be the origin volume for FlexCache, allowing for origin volumes greater than 100TB and higher file counts.

In the realm of security, data-at-rest encryption is on by default for all newly created volumes provided there is a key manager configured. ONTAP will encrypt the data using hardware encryption if the drives are available, otherwise it will leverage software-based encryption. Setting up the onboard key manager is now extra simple with a setup wizard available in System Manager.

The MetroCluster network can now co-exist on your data access switches, provided they comply with specifications. MCCs with either an A220 or FAS2750 do not qualify.

There’s an interesting new bit of engineering coming in the new AFF A400 platform where compression will be offloaded to a PCI network card.

FlexGroup improvements include NDMP support, allowing backup by any third-party application that supports NDMP. ONTAP 9.7 brings NFS v4.0 and v4.1 to FlexGroups, including support for pNFS. The long-awaited in-place conversion from FlexVol to single-member FlexGroup is here, allowing you to scale capacity and performance without having to perform a client-based copy. While VMware datastores will work on FlexGroups, this isn’t supported quite yet. If you’re a NetApp partner and you have a customer who would like to use FlexGroups as a VMware datastore, contact your SE.

Another oft-requested feature, this one for FabricPool, is the ability to tier to more than one object store. In 9.7, FabricPool Mirrors is announced, allowing you to tier to two separate object stores. FabricPool Mirrors can be used to add resiliency or to change providers, perhaps to repatriate your data to an on-premises StorageGRID deployment. Keeping on the topic of FabricPool, customers wanting to tier to an object store that isn’t officially qualified no longer need an FPVR, though they must perform their own testing to ensure the object store meets their needs. The officially qualified object stores are: Alibaba Cloud Object Storage Service, Amazon S3, Amazon Commercial Cloud Services, Google Cloud Storage, IBM Cloud Object Storage, Microsoft Azure Blob Storage and StorageGRID.

Figure: FabricPool Mirrors

Wrapping up the 9.7 updates, ONTAP Select gets NVMe device support, 12-node clusters and NSX-T support on ESXi.

Rubrik and NetApp, did that just happen?

I wasn’t sure I’d ever see the day where I’d be writing about not only the partnership of NetApp and Rubrik, but actual technological integration; this always seemed somewhat unlikely. While there have been some rumours flying around in the background for some time, the first real sign of cooperation between the two companies was when we saw the publication of a Solution Brief around combining NetApp StorageGRID with Rubrik Cloud Data Management (CDM) to automate data lifecycle management through Rubrik’s simple control plane while using StorageGRID as a cloud-scale object-based archive target. And then…nothing, not even the sound of crickets.

As summer started to draw to a close and the kids were back in school, those in the inner circle started to hear things, interesting things. If you were to talk to your local Rubrik reps or sales engineers, the stories they had to tell were around NAS Backup with NAS Direct Archive as well as using older NetApp gear as a NAS target, nothing game changing. This backing up of the NAS filesystems still involved completely trawling the directory structure, which was time consuming and performance impacting; something was still missing.

On September 24th this year, exactly one month ago, a new joint announcement hit the Internet: Rubrik and NetApp Bring Policy-Based Data Management to Cloud-Scale Architectures. While interesting, it was still not exactly what some of us were waiting for. Well, wait no longer: as of now, Rubrik has officially announced plans to integrate with NetApp’s SnapDiff API. What’s that, you may ask? It is the ability to poll ONTAP via API call to leverage the internal metadata catalogue to quickly identify the file and directory differences between two snapshots. This is a game changer for indexing NAS backups. Since Rubrik will no longer need to scan the file shares manually, backup windows will shrink dramatically. Also, while other SnapDiff licensees can send data to another NetApp target, Rubrik is the first backup vendor to license SnapDiff and be able to send the data to standard public cloud storage.
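
To show why diffing two snapshots beats crawling the whole tree, here is a small Python sketch of the concept. To be clear, this is not the SnapDiff API: the catalogue layout (a dict of path to change marker) and the snapshot_diff helper are assumptions I made up for illustration.

```python
def snapshot_diff(older, newer):
    """Compare two snapshot catalogues mapping path -> change marker
    (for example a modification counter) and list what changed."""
    created = sorted(p for p in newer if p not in older)
    deleted = sorted(p for p in older if p not in newer)
    modified = sorted(
        p for p in newer if p in older and newer[p] != older[p]
    )
    return {"created": created, "modified": modified, "deleted": deleted}


# Hypothetical catalogues for two consecutive snapshots.
snap_monday = {"/home/a.txt": 1, "/home/b.txt": 1, "/proj/c.bin": 4}
snap_tuesday = {"/home/a.txt": 1, "/home/b.txt": 2, "/proj/d.bin": 1}

changes = snapshot_diff(snap_monday, snap_tuesday)
print(changes)
```

Only the paths in the result need to be read for an incremental backup. The unchanged /home/a.txt is never touched, which is where the shorter backup window comes from.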

Since the ink is just drying on Rubrik’s licensing of the SnapDiff API, it’s not quite ready in their code yet, but integration is being targeted for release 5.2 of CDM. Also, Rubrik will have a booth at INSIGHT (207) and be presenting on Tuesday, session number 9019-2, stop by to see what all the fuss is about. Also, be sure to look for me and my fellow A-Team members, there’s a good chance you’ll find us hanging around near the NetAppU booth where you’ll find a pretty cool surprise! You can also find me Wednesday, October 30th, at 11:30 am presenting 3009-2 Ask the A-Team – Building A Modern Data Platform, register for that today.

NetApp HCI Update

As NetApp continues to make its mark on and help define the Next Generation Data Centre, the need for more node types of their HCI offering has become apparent and they are responding in kind.

First up, staying current by using the latest generation of Intel Skylake processors in the new nodes is a given, as is offering myriad combinations of both CPU and memory while maintaining interoperability with the current generation of HCI nodes.

Next is a raft of new compute nodes, some of which are optimized around core count, which you can use to satisfy various licensing models.

 

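Model #      Processor                               Memory
H410C-14020  2 x Xeon Silver 4110 (8 core @ 2.1GHz)  384 GB
H410C-15020  2 x Xeon Silver 4110 (8 core @ 2.1GHz)  512 GB
H410C-17020  2 x Xeon Silver 4110 (8 core @ 2.1GHz)  768 GB
H410C-25020  2 x Xeon Gold 5120 (14 core @ 2.2GHz)   512 GB
H410C-27020  2 x Xeon Gold 5120 (14 core @ 2.2GHz)   768 GB
H410C-28020  2 x Xeon Gold 5120 (14 core @ 2.2GHz)   1 TB
H410C-35020  2 x Xeon Gold 5122 (4 core @ 3.6GHz)    512 GB
H410C-37020  2 x Xeon Gold 5122 (4 core @ 3.6GHz)    768 GB
H410C-57020  2 x Xeon Gold 6138 (20 core @ 2.0GHz)   768 GB
H410C-58020  2 x Xeon Gold 6138 (20 core @ 2.0GHz)   1 TB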

 

Next up, the much-requested GPU accelerated compute nodes have been announced, optimized for Windows 10 VDI deployments. This one moves away from the 2 RU chassis with 4 compute nodes and is one 2 RU server in itself consisting of:

  • 2 x NVIDIA Tesla M10 GPUs
  • An Intel Skylake Xeon 6130 (16 cores @ 2.1GHz)
  • 512GB RAM

On to the networking side of things, your concerns have been heard. NetApp will soon begin offering their H-Series switch, the Mellanox SN2010, to help complete your HCI build-outs. This switch is a paltry 1RU, half-width unit with 18 SFP+/SFP28 ports and optional cable and transceiver bundles. Support for this switch will be NetApp-direct, so no worries around cross-vendor finger pointing.

Keeping in the network mindset, NetApp is making things simpler by reducing the required network port count and associated infrastructure by 40%. HCI compute nodes now only require two SFP28 connections, down from four. A vSphere distributed switch is a requirement.

Tied closely to NetApp’s HCI offering is their SolidFire storage, whose latest release, version 11, provides some great new features. Version 11 brings to the table the ability to SnapMirror to ONTAP Cloud, an IPv6 management network, a 16TB maximum volume size and protection domains. This last feature helps protect your HCI deployments against chassis failure by automatically detecting HCI chassis and node configuration. SolidFire’s double-helix data layout ensures that secondary blocks span domains.
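
The protection domain idea, that the second copy of a block must not share a chassis with the first, can be sketched in Python as follows. This is my own toy placement rule, not SolidFire's actual double-helix algorithm, and the node and chassis names are hypothetical.

```python
def place_replicas(block_id, nodes):
    """Pick a primary and a secondary node for a block so that the two
    copies sit in different chassis (protection domains).

    nodes is a list of (node_name, chassis_name) tuples.
    """
    primary = nodes[block_id % len(nodes)]
    # Only nodes outside the primary's chassis may hold the second copy.
    candidates = [n for n in nodes if n[1] != primary[1]]
    if not candidates:
        raise ValueError("need at least two chassis to span domains")
    secondary = candidates[block_id % len(candidates)]
    return primary, secondary


# Hypothetical layout: two chassis with two storage nodes each.
nodes = [
    ("node1", "chassis-A"),
    ("node2", "chassis-A"),
    ("node3", "chassis-B"),
    ("node4", "chassis-B"),
]

for block_id in range(4):
    primary, secondary = place_replicas(block_id, nodes)
    print(block_id, primary, secondary)
```

Losing either chassis in this layout still leaves one copy of every block on the surviving chassis.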

All the above should allow you to build a truly Next Generation Data Centre for your employer or your customers.

8.3.1 and 8.3.2…dot releases never felt so good.

NetApp released ONTAP 8.3 over a year ago now, and since then two minor releases have come as well, and with them far more payload than you’d usually expect for dot releases. Typically the major releases get all the hype, but after you see all that has been included with the two minor releases of 8.3, you’ll see what all the fuss is about.

First of all, if you can’t remember what was included with 8.3, go over here and read about it. Highlights included but weren’t limited to:

  • MetroCluster
  • Non-disruptive LUN migration
  • Serious performance improvements in the flash space
  • Version independent SnapMirror

When 8.3.1 came out in early September, it brought some pretty spectacular features:

  • More flash performance improvements
  • Storage Virtual Machine Disaster Recovery (SVM DR)
    • This is the ability to replicate entire SVMs, and not just volumes, to another cluster. It has two modes, Identity Preserve True or False, the first of which can replicate all the network-related info for those whose DR site supports it, i.e. L2 connectivity.
  • In-line compression and zero elimination
  • Two node MetroCluster, i.e.: one per site
    • Uses ATTO bridge to connect the disk
    • This is more of a “Stretch MetroCluster” and is suitable for campus level DR where the loss of a building is being protected against.
  • Some performance metrics now available in System Manager

8.3.2:

  • Copy Free Transition
    • This has got to be one of the coolest features so far: it lets you stand up a new cDOT system with minimum disk, then move your 7-mode disk over to it without having to do a data migration.
  • In-line deduplication
  • More performance improvements for SAN on AFF
  • In-place, adaptive compression
  • Fibre Channel over IP for MetroCluster
    • Up to 200km, between switches that support it, such as the Cisco 9250i
  • Quality of Service policies previously limited to 8 nodes can now be applied to up to 24
  • System Manager Improvements:
    • Cluster performance charts with IOPS and latency available within System Manager
    • Manual IP assignment
      • Previously you had to create the subnet; that is no longer the case
    • SyncMirror (introduced in 8.3 with MetroCluster) support in System Manager
      • This is not the same as synchronous SnapMirror, which is still not available in cDOT
    • You can now manage your MirrorVaults in the GUI
    • Various other System Manager improvements, far too many to list.

As you can see from the points I’ve covered, the dot releases of cDOT 8.3 have been packing quite the payload. I’m sure that not having to support 7-mode in the same release has helped speed up the development cycle for many features, not to mention that some of those engineers have probably been reassigned to cDOT work. I’ve left some of the more esoteric details out, but if you want to see them all, head over here to read the release notes for the individual versions.