For IT departments, managing data storage can seem like a never-ending task. Businesses are capturing and managing more data than ever, keeping it longer, and relying on it to run their operations.
There are two approaches companies can take when designing storage to meet the need for increased capacity – they can scale up or scale out.
So, it’s helpful to know the difference. In this article, we look at horizontal versus vertical scaling in storage, the pros and cons of each, and the scenarios in which each applies, whether in SAN, NAS, hyper-converged, object or cloud storage.
Horizontal vs vertical scaling
IT systems can scale vertically, horizontally, and sometimes both. In general terms, vertical scaling, or scaling up, involves installing more powerful systems or upgrading to more powerful components.
It could mean putting a new processor in an x86 system, deploying a more powerful server, or even moving workloads entirely to a new hardware platform. In the cloud, this can be done through processor and memory upgrades.
Meanwhile, horizontal scaling increases resources by increasing the number of servers or other processing units. Rather than relying on a single more powerful system, scale-out upgrades work by dividing the workload across a larger number of commodity units.
Google’s search system is an example of a massively parallel, large-scale system. In fact, Google holds some of the key patents for MapReduce, which divides tasks across massive, parallel processing clusters.
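The MapReduce idea mentioned above can be sketched in a few lines: each node maps its own shard of the data independently, and the partial results are then merged. This is a minimal, hypothetical word-count illustration, not Google's implementation; the shard layout and function names are invented for the example.

```python
from collections import Counter
from functools import reduce

def map_shard(docs):
    """Map step: count words within one node's shard of documents."""
    return Counter(word for doc in docs for word in doc.split())

def reduce_counts(partials):
    """Reduce step: merge the per-node partial counts into one result."""
    return reduce(lambda a, b: a + b, partials, Counter())

# Three "nodes", each holding a shard of the corpus
shards = [["big data big"], ["data storage"], ["big storage storage"]]

partials = [map_shard(s) for s in shards]  # runs in parallel in a real cluster
total = reduce_counts(partials)
print(total["big"], total["storage"])  # 3 3
```

The key property is that the map step needs no coordination between nodes, which is what lets the workload spread across an arbitrarily large cluster.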
Horizontal vs vertical scaling in storage
Scaling for storage follows a similar approach. IT departments can scale capacity vertically, with larger or more disks in the storage subsystem, or horizontally, by distributing workloads across more devices.
Scaling up, with larger disks in servers and hyper-converged infrastructure (HCI) or increased capacity in NAS and SAN systems, is technically relatively straightforward. However, even with the larger-capacity NVMe, SSD and conventional drives available today, larger systems can still experience bottlenecks.
Either the system will not perform well as it approaches its capacity limits, or other bottlenecks will emerge. Typically, bottlenecks in scale-up storage arise from the throughput limits of storage controllers, because most storage subsystems can only accommodate two controllers. Some systems, of course, allow you to upgrade the controllers themselves. On networked storage, the network interface can also become a bottleneck.
The alternative is to scale out the storage by adding more nodes to work in parallel. Here, the storage nodes work together in clusters, but present their capacity as a single pool to the application.
Adding nodes removes controller and network interface bottlenecks, as each node has its own resources. HCI and computational storage take the idea a step further. HCI combines storage, networking and compute in a single unit, while computational storage allows the storage subsystem itself to perform certain processing tasks, such as encryption or compression, close to the storage.
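The pooling behaviour described above can be sketched as follows. This is a hypothetical model, not any vendor's API: each node brings its own capacity, controller and network interface, yet the application sees a single pool, and adding a node grows the pool without disturbing the others.

```python
class StorageNode:
    """One scale-out node with its own disks, controller and NIC."""
    def __init__(self, name, capacity_tb):
        self.name = name
        self.capacity_tb = capacity_tb

class StoragePool:
    """Presents the cluster's combined capacity as one logical pool."""
    def __init__(self):
        self.nodes = []

    def add_node(self, node):
        # Scale out: a new node joins without touching existing hardware
        self.nodes.append(node)

    @property
    def total_capacity_tb(self):
        # Applications see the aggregate, not the individual nodes
        return sum(n.capacity_tb for n in self.nodes)

pool = StoragePool()
pool.add_node(StorageNode("node-1", 100))
pool.add_node(StorageNode("node-2", 100))
print(pool.total_capacity_tb)  # 200
pool.add_node(StorageNode("node-3", 100))  # horizontal scaling step
print(pool.total_capacity_tb)  # 300
```

Because each added node also contributes a controller and network interface, throughput can grow roughly in line with capacity, which is the point of the scale-out model.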
“Hyper-converged infrastructure has put this model of horizontal scaling in the spotlight,” says Naveen Chhabra, analyst at Forrester. “This concept of horizontal scaling was introduced by hyperscalers and is used for the storage services they bring to the market.”
Scaling on-premises storage
Increasing storage in an on-premises environment can be relatively straightforward. At the most basic level, IT teams can simply add more drives, or larger-capacity drives. This applies to internal storage, direct-attached storage and storage in HCI systems.
For networked storage, adding or swapping drives is also the easiest option. Tool-less upgrades are widely supported by hardware vendors, and storage management software can automatically rebuild RAID groups in NAS and SAN systems.
Changing or upgrading controllers or network interfaces will take more work and will likely require you to power down the array.
Either way, downtime will be an issue. Hardware upgrades involve taking systems offline, and RAID groups will need to be rebuilt. In addition, systems can only be upgraded if they are provisioned for additional capacity in advance, with spare drive bays or swappable controllers, for example. This may mean purchasing a larger array than is initially needed.
The alternative – upgrading to a newer, larger system – can minimize downtime, but businesses need to plan for the time it takes to transfer data and the risk of data loss.
Scale-out systems may therefore seem easier. Modern NAS and SAN systems, and HCI, are designed to scale out (as well as, to some extent, up). Adding further nodes or arrays expands the storage pool and should be possible with little or no downtime. There is no need to disturb the existing hardware, and the software will add the new capacity to the storage pool.
Sometimes scaling out is the only way to handle rapidly growing demand for storage, especially for unstructured data, but it has its limits. Scale-out systems are less suited to applications such as transactional databases, for example.
Evolving cloud storage
Cloud storage is based on scale-out architectures. Its building blocks – parallel commodity hardware and object storage – were designed from the ground up to accommodate ever-growing datasets.
Public cloud systems are therefore largely scale-out systems. This works well for elastic workloads, where organisations want to start small and grow, and where applications can run on horizontally scaled systems, such as scale-out databases.
Scale-out cloud systems are typically built from x86 servers with direct-attached storage that act as HCI nodes or clusters, each running object storage software and using erasure coding to create the equivalent of RAID protection. All of this allows cloud users to add capacity quickly, even automatically.
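The erasure coding mentioned above can be illustrated with its simplest case: a single XOR parity fragment, which lets one lost data fragment be rebuilt from the survivors. Real object stores use Reed-Solomon codes with multiple parity fragments spread across nodes; this sketch shows only the principle, and the function names are invented for the example.

```python
def encode(fragments):
    """Compute one XOR parity fragment over equal-sized data fragments."""
    parity = bytes(len(fragments[0]))
    for frag in fragments:
        parity = bytes(a ^ b for a, b in zip(parity, frag))
    return parity

def recover(surviving, parity):
    """Rebuild a single lost fragment from the survivors plus parity."""
    missing = parity
    for frag in surviving:
        missing = bytes(a ^ b for a, b in zip(missing, frag))
    return missing

data = [b"node", b"fail", b"safe"]        # fragments spread across nodes
parity = encode(data)
rebuilt = recover([data[0], data[2]], parity)  # node holding data[1] is lost
print(rebuilt)  # b'fail'
```

The trade-off against RAID mirroring is capacity efficiency: protection costs one parity fragment per stripe rather than a full copy of the data, which matters at cloud scale.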
But that doesn’t mean the only way to scale in a public cloud environment is out, by adding capacity. IT architects can also specify different levels of performance from the major cloud providers.
Amazon Web Services, Google Cloud Platform, and Microsoft Azure each offer a range of storage performance, depending on their SSD (and spinning disk) systems.
AWS, for example, offers IOPS options ranging from 16,000 to 64,000 per volume via EBS. Azure Managed Disks reach up to 160,000 IOPS and Azure Files up to 100,000 IOPS.
GCP’s Persistent Disk runs up to 100,000 read IOPS and its local SSD up to 2,400,000 read IOPS. On all platforms, writes are generally slower.
Up or out?
Of course, costs increase with higher performance levels, so CIOs will need to balance capacity and performance across their cloud fleet.
Increasingly, hybrid architectures offer the best of both worlds. Businesses can scale up their hardware on-premises, but use the public cloud to scale out with additional capacity that is easy to deploy.
Nor do compute and storage have to scale in step. It is quite possible, and increasingly common, to scale up compute for performance while scaling out storage, on-premises or through the cloud, to exploit the capacity and resilience of object technology.