NAND Flash and the Collapse of Storage Tiering

Shahar Frank September 12, 2017 blog, Flash memory, NAND, Storage Tiers, Tiering

Flash memory (and, most importantly, the introduction of MLC and eMLC NAND) revolutionized a large part of the storage industry. In this blog, I’ll describe how flash-based storage media not only improved the performance and usability of many storage solutions, but also had a significant impact on the way storage solutions are classified.

Storage Tiers Before the Flash Era

The classification of storage tiers was never formal or well defined. Still, there was general agreement on a model that included a Tier 0 with the highest IOPS/GB (but also the highest $/GB), followed by subsequent tiers, each with progressively lower IOPS/GB and $/GB. It was very common to reference the following tiers:

Tier 0: Acceleration - Typically a simple block array (i.e. without significant copy/advanced services) used mainly for very performance-critical applications. In most cases these were OLTP applications (e.g. Oracle DB). Good examples of this type of solution were produced by Texas Memory Systems (TMS), who started by making RAM-based devices, then moved to hybrid and flash-only systems.

Tier 1: Mission-critical - Typically a block array that was able to service several concurrent mission-critical workloads. Tier 1 was about performance, scalability, feature set and very high reliability. A common practice was to recognize only three solutions as “real” Tier 1 - EMC VMAX (formerly “Symmetrix”), HDS’s high-end solution (USP/VSP), and IBM’s DS8000 (“Shark”).

Tier 2: Mid-range/general purpose - Typically a unified storage solution (i.e. block and file server) that provided a balance of features, performance, scalability and cost. There are many solutions in that category, but I think that the representative examples are EMC VNX (“Celerra”) and NetApp filers.

Tier 3: Capacity pool - Typically a filer with a significant amount of storage for data that does not require high performance. The most representative example is probably EMC’s Isilon.

Tier 4: Backup/nearline storage - Typically an appliance that could store a significant amount of cold data, but with limited (random read) access. EMC Data Domain is an example of this type of solution.

Tier 5: Archive - Typically a tape or virtual tape library (VTL) solution, used mainly for data retention use cases (e.g. regulatory compliance).

As I said before, there is no well-defined way to classify solutions, so there could be a legitimate debate over whether solution X is in fact a Tier L or Tier K solution. In addition, over the years, some solutions didn’t fit the tiers defined above, so new tiers were created. For example, IBM XIV claimed Tier 1 performance and scalability, but due to its lack of certain features (mainframe support, for example), the market classified it as a “Tier 1.5” solution. In 2012 terms, I would also classify EMC XtremIO as “Tier 1.5”.

In parallel to the above tiers, a simpler classification exists: primary storage (application-facing storage) and secondary storage (not application-facing; includes nearline, backup, archival, etc.).

The following diagram depicts the suggested tiering model:

[Image: Legacy Storage Tiers]

Looking Beneath the Tiers

There are many technical differences between the tiers and between the various implementations within each tier. However, in most cases, I think it’s fair to say that, after removing all of the smoke and mirrors, the main difference between the tiers is the combination of underlying storage media. Most, if not all, storage tiers used some combination of RAM (and later flash), premium disks (15K RPM FC disks), and normal disks (10K RPM disks).

Here Comes the Flash!

Flash-based storage devices have been on the market for many years. Still, only towards the end of the last decade did these devices approach a price range that could justify significant use. The main impact on enterprise storage began with SLC-based devices in several formats (e.g. PCIe add-in cards from FusionIO, sTec SSDs, Violin products, etc.). All of them started as a kind of accelerator (internal or external) and as a layer between memory and disk. Using the legacy tiering model, we can say that flash started at Tier 0.

A short time later, solutions began to appear that pushed flash devices to the lower tiers. Most of them took the form of “All-Flash Arrays” (aka AFAs). The best-known examples are of course XtremIO (now Dell-EMC), Pure Storage and SolidFire (now NetApp). These AFAs were designed to service Tier 1 applications (but frankly, should be considered Tier 1.5). The main advantage these solutions had over the earlier “accelerators” was that they offered a full-fledged disk array with management and at least some critical subset of copy services (i.e. snapshots, clones, replication for DR, etc.). In fact, one of their main advantages over the previous generation of storage solutions was simplicity: all of them were radically simpler to set up and manage, and offered more predictable performance. After the success of the trailblazing AFAs, most legacy storage vendors reacted in one or more of the following ways:

  1. Acquired an AFA company (EMC, NetApp, IBM to some extent)
  2. Released all-flash models (mainly based on the existing HDD-based models)
  3. Incorporated a flash tier into the existing models (hybrid solutions)

In fact, as flash became less expensive and the maturity of the devices increased, flash was able to address more and more use cases in more tiers.

But wait, there's more!

As detailed above, the impact of flash on enterprise storage goes far beyond performance and extends to improved usability, flexibility, simplicity, and even feature richness. Before the flash era, most legacy storage solutions suffered from several problematic attributes - mainly the following:

  1. Due to the limited IOPS hard drives can provide and the bimodal nature of their bandwidth (random vs. sequential access), most solutions had to use large read/write caches and complex algorithms to manage both the cache and the underlying physical layout. Much effort was spent transforming as many random I/Os as possible into (semi-)sequential I/Os (e.g. via elevator algorithms, log-based data structures, B-trees, etc.)
  2. Due to this internal complexity, configuring and managing legacy storage was also complex and required a significant understanding of both the storage techniques and the application I/O profiles.
  3. The read/write caching architecture leveraged by most solutions caused unpredictable performance if the application’s data set size or access patterns changed over time (which they almost always do).

On the other hand, flash:

  1. Is an efficient random-access medium (i.e. much less sensitive to data access patterns), and
  2. Has latency that is significantly lower and much more predictable than spinning rust (or “spinning disk”, if you prefer).

These attributes lend themselves to more efficient and simpler implementations of RAID subsystems, snapshots, deduplication, and many other mechanisms. For example, XtremIO and Pure Storage “RAID 6” variants are so efficient that there is no reason to let the user configure and tune the RAID subsystem.

Another important impact of flash stems from its basic high performance and low latency profile.  As a result, the need for caching is eliminated or at least highly reduced. Moreover, even if caches exist, the impact of overrunning the cache is much less dramatic than the impact with HDD-based solutions. In other words, the system performance becomes much more stable and predictable.

Still, the most important impact of flash on the storage industry is indirect: flash completely overhauled the internal balance of storage solutions between the I/O subsystem, compute, and networking. Understanding the subsystem bottleneck is crucial and, before flash, the answer was simple - the I/O subsystem was almost always the bottleneck. With flash, the answer is more complex, but in many cases the real bottleneck is the CPU (and/or CPU memory access). The main impact of this shift is the realization that flash storage is far from being the bottleneck in modern storage subsystems. In fact, flash performance puts storage so far ahead of the curve that there is no reason to spend money on expensive flash media (SLC, write-intensive eMLC, etc.). In other words, you can build a Tier 0 performance solution using relatively entry-level flash media. In addition, you don’t need as much (read/write) RAM cache as you needed before to achieve good performance. I hope you already get what I am trying to say - but I will spell it out. The technical differences that justified the existence of several storage tiers (especially the lower ones - 0, 1, 2, and 3) are gone.
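To make the bottleneck shift concrete, here is a back-of-the-envelope sketch. All the device and controller numbers are assumed, round figures for illustration only (real values vary widely by product and workload):

```python
# Rough comparison of where a storage controller's bottleneck sits.
# Every number below is an assumed, illustrative figure.

HDD_IOPS_PER_DEVICE = 200          # ~15K RPM drive under random I/O
FLASH_IOPS_PER_DEVICE = 100_000    # entry-level enterprise SSD

CPU_IOPS_BUDGET = 1_000_000        # I/Os a controller CPU can process per second

def bottleneck(num_devices: int, iops_per_device: int) -> str:
    """Return which side saturates first: the media or the CPU."""
    media_iops = num_devices * iops_per_device
    return "CPU" if media_iops > CPU_IOPS_BUDGET else "media"

# A 24-drive shelf: HDDs leave the CPU mostly idle; flash saturates it.
print(bottleneck(24, HDD_IOPS_PER_DEVICE))    # -> media (24 * 200 = 4,800 IOPS)
print(bottleneck(24, FLASH_IOPS_PER_DEVICE))  # -> CPU (24 * 100,000 = 2.4M IOPS)
```

With these assumptions, a shelf of HDDs delivers well under 1% of what the CPU can handle, while the same shelf of even entry-level SSDs overwhelms it, which is exactly why the design focus moved from the I/O subsystem to compute.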

Meanwhile, in the World of Spinning Rust...

Flash is also having a significant impact on the HDD industry. The simple fact is that HDDs cannot compete with SSDs in use cases focused on IOPS performance. In fact, flash is on a whole other level. In terms of $/IOPS, HDDs are ridiculously expensive. This caused HDDs to get kicked out of all the places where IOPS matter much more than capacity. The direct impacts are that:

  • 15K RPM HDDs are dead (10K RPM HDDs are next), replaced by SSDs.
  • Desktop/laptop primary (OS) disk is typically replaced by SSD.
  • Cloud instances’ block storage (local ephemeral and persistent networked) shifted mainly, and by default, to flash-based hardware.
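The $/IOPS gap is easy to see with round numbers. The prices and IOPS figures below are assumptions for illustration, not vendor quotes:

```python
# Illustrative $/IOPS comparison; prices and IOPS are assumed round numbers.
hdd = {"price_usd": 300, "iops": 200}       # enterprise 10K/15K RPM HDD
ssd = {"price_usd": 500, "iops": 50_000}    # entry-level enterprise SSD

hdd_cost_per_iops = hdd["price_usd"] / hdd["iops"]   # $1.50 per IOPS
ssd_cost_per_iops = ssd["price_usd"] / ssd["iops"]   # $0.01 per IOPS

ratio = hdd_cost_per_iops / ssd_cost_per_iops
print(f"HDD: ${hdd_cost_per_iops:.2f}/IOPS, SSD: ${ssd_cost_per_iops:.2f}/IOPS")
print(f"HDD is ~{ratio:.0f}x more expensive per IOPS")  # ~150x with these numbers
```

Even with generous assumptions for the HDD, the ratio lands around two orders of magnitude, which is why HDDs were pushed out of every IOPS-driven market.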

In other words, the HDD industry lost huge (and premium) markets and had to react by focusing on capacity-oriented use cases where HDDs still have an advantage: lower $/GB than SSDs, and better random access times than tape. Moreover, to keep that advantage, HDD manufacturers had to improve media density at the expense of performance, both in terms of IOPS per device and in terms of IOPS/GB. This made HDDs even less suitable as primary (application-facing) storage devices, but perhaps more efficient as secondary (backup, archive, nearline, cold store, etc.) storage.

Cloud Blob Stores are Revolutionizing Secondary Storage

In addition to the flash revolution, the cloud revolution also impacted the enterprise storage world, but mainly on the secondary storage side.

The public cloud providers reinvented many aspects of modern IT to run much more efficiently (at least they claim so) and very differently. Specifically, the low level storage layers were redefined by AWS as follows:

  • Primary storage is block storage leveraging either ephemeral local (attached to the instance) or persistent networked (EBS) storage. Two years ago, AWS also added flash-based file services (EFS).
  • Secondary storage is modeled as an “object store” - S3 (Simple Storage Service). Personally, I think the Microsoft Azure term “blob store” is much more accurate, as it is really designed to be an efficient way of implementing storage that is not application-facing.

This simple (or even simplistic) storage layer model made its way to the enterprise world mainly in the form of secondary storage appliances that imitated the cloud-based interfaces (“S3 compatible”). In retrospect, the reason S3-like interfaces succeeded in the enterprise is that they appeared to be much cleaner and simpler than legacy backup and archiving (tape) interfaces. The new solutions seem to be much more cost-effective and scalable, and also capable of functioning as the “first step into the cloud”. Moreover, S3-like appliances cover a nice range of secondary use cases, from simple capacity pools to long-term archival. As you probably expect, these solutions are very HDD-oriented, though they can use other media for caching and cold storage. Technically speaking, these solutions know how to efficiently use massive amounts of HDDs for massive amounts of cold data.
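To show why the blob-store interface is so much simpler than a POSIX file system or a tape library, here is a toy, in-memory sketch of the model: a flat key namespace with whole-object put/get/delete/list and nothing else. This is an illustration of the interface shape, not a client for S3 or any real service:

```python
# Toy in-memory sketch of the blob/object-store model exposed by
# S3-like systems: flat namespace, whole-object operations only.
class BlobStore:
    def __init__(self):
        self._objects = {}                 # key -> bytes, flat namespace

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data          # whole-object write; no in-place update

    def get(self, key: str) -> bytes:
        return self._objects[key]          # whole-object read

    def delete(self, key: str) -> None:
        self._objects.pop(key, None)

    def list(self, prefix: str = "") -> list:
        # Prefix listing stands in for directories; there is no real hierarchy.
        return sorted(k for k in self._objects if k.startswith(prefix))

store = BlobStore()
store.put("backups/2017-09-12/db.dump", b"dump bytes")
store.put("backups/2017-09-11/db.dump", b"dump bytes")
print(store.list("backups/"))
# -> ['backups/2017-09-11/db.dump', 'backups/2017-09-12/db.dump']
```

Note what is missing: no rename, no partial writes, no byte-range locking, no hierarchy to keep consistent. That deliberate poverty of the interface is what makes it easy to scale over massive amounts of cheap HDD capacity.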

The Two-layer Approach vs. Hybrid (Media) Systems

The simple two-layer (“cloud”) approach should be compared to the legacy enterprise storage approach. Many legacy storage solutions internally support several media types. For example, it was pretty common to have a flash tier, a fast HDD tier (e.g. 15K RPM FC HDDs), and a capacity pool tier (e.g. 7,200 RPM SATA HDDs).

My personal view is that the simple primary/secondary separation will replace these hybrid systems with two layers of solutions that communicate with each other and are managed as a unified system.

My reasons are as follows:

  • Primary/secondary solutions are simpler than hybrid solutions. Hybrid systems are complex by nature, since each medium has its own caveats, issues and optimizations affecting implementation and support. Flash-only and HDD-based systems are simpler, as each is tuned and optimized for a single medium.
  • Primary solutions can optimize application performance, stability and resource utilization by delivering consistent, predictable storage performance and service. Hybrid systems justify themselves by offering features such as automatic and sub-volume tiering (such as EMC FAST). Regardless of how good these mechanisms are, they ultimately introduce unpredictable performance and I/O patterns. Practically speaking, these mechanisms function similarly to a cache and, as such, suffer from the same issues. In other words, your application may behave perfectly one day and then horribly the next...even if you didn’t change a thing (e.g. because some other application changed its I/O profile). On the other hand, all-flash solutions can offer better, more predictable and more stable performance, even when serving many concurrent workloads.
  • Secondary solutions can optimize for simplicity, scale and cost-effectiveness. Modern “S3 compatible” and similar systems offer these attributes and more. For example, you can use S3 (and S3-like) cloud services and/or on-premises solutions that can extend to the cloud. If desired, you can opt to use the cloud as the backend. Moreover, these solutions can serve as cold storage for media objects and the like, accessed directly by applications. In fact, this makes them a little bit “primary”, in that they are application-facing for at least some data. Most hybrid solutions do NOT offer similar capabilities, and each has its own set of abilities, strengths and weaknesses.
  • Having two solutions - one primary and one secondary - enables the customer to compose the best possible solution for each layer, and even to replace one layer without replacing the other.
  • The two-layer approach is much more cloud-compatible and cloud-ready. It helps the customer dynamically lift and shift on-premises workloads to the cloud and back again.

From 6 Tiers to 2

To summarize, the justification for having several primary storage tiers has been rendered invalid by 1) the introduction of flash media as a primary storage device and 2) the introduction of object/blob/S3-like solutions for secondary storage. Modern two-layer storage tiering is simpler, more efficient and cloud-compatible/ready...and therefore I believe it will dominate future storage solutions.