What Developers Need to Know about Storage
In the past, developers have had neither the power nor the responsibility to provision their own storage resources, or even to be part of the conversations about storage environments. There was little need for developers to understand the trade-offs associated with different storage options; they would be given a certain type of storage and expected to make it work with the application in question.
Power and Responsibility
In a cloud-native environment, this dynamic has totally shifted. Now, developers have a stunning array of choices when it comes to storage, but in many cases they don’t have the knowledge to understand how storage decisions can either help or hinder the application, or how to choose the best type of storage for a given use case.
There is no one type of storage that is always ‘best.’ Instead, storage systems have five main attributes: availability, scalability, performance, consistency and durability. The way that storage systems are optimized means that every storage option involves trade-offs. Systems that are optimized for performance may not be optimized for scaling capacity, for example. This means that the best storage option for one type of workload will not be the best option for another type of workload—and that storage is something that needs to be actively considered during the development process.
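One way to make these trade-offs concrete is to score candidate storage options against what a given workload cares about most. The sketch below is purely illustrative: the option names, the 1-to-5 ratings and the workload weights are all invented for the example and are not measurements of any real system.

```python
# Illustrative only: ratings and weights are made-up numbers, not benchmarks.
ATTRIBUTES = ("availability", "scalability", "performance", "consistency", "durability")

# Hypothetical storage options, rated 1 (weak) to 5 (strong) per attribute.
OPTIONS = {
    "local_block": {"availability": 2, "scalability": 2, "performance": 5,
                    "consistency": 5, "durability": 3},
    "object_store": {"availability": 5, "scalability": 5, "performance": 2,
                     "consistency": 3, "durability": 5},
}

def best_option(weights):
    """Return the name of the option with the highest weighted score."""
    def score(ratings):
        return sum(weights.get(attr, 0) * ratings[attr] for attr in ATTRIBUTES)
    return max(OPTIONS, key=lambda name: score(OPTIONS[name]))

# A latency-sensitive database weights performance and consistency heavily.
database = {"performance": 3, "consistency": 3, "durability": 1}
# A media archive cares mostly about capacity growth and durability.
archive = {"scalability": 3, "durability": 3, "availability": 1}

print(best_option(database))
print(best_option(archive))
```

With these assumed numbers the two workloads end up with different options, which is the point: the ranking depends on the weights, not on any one option being better overall.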
Never having had the ability to make storage decisions in the past, or any transparency into how those decisions were made, has led many developers to make incorrect assumptions about storage. In a cloud-native storage environment where developers can (and must) make the storage decisions, those assumptions can lead to choosing the wrong type of storage for the job.
One main source of confusion is that modern storage systems have many layers, often with one storage system layered on top of another. A common example is a filesystem layered on an object store: the volume may have some of the attributes of a filesystem, like being easy to share, but it will have the performance characteristics of an object store.
Once you understand that storage systems can have multiple layers, you understand that it’s not enough to think about the characteristics of the top layer, the layer you interface with directly. Understanding the complete storage stack is important to making effective storage decisions.
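To see why the lower layers matter, here is a minimal sketch that models a storage stack as a list of layers, each adding some latency to a request. The layer names and latency figures are hypothetical, chosen only to mirror the filesystem-on-object-store example.

```python
from dataclasses import dataclass

@dataclass
class Layer:
    name: str
    added_latency_ms: float  # hypothetical latency this layer adds per request

def stack_latency_ms(stack):
    """Latency seen by the application: the sum over every layer in the stack."""
    return sum(layer.added_latency_ms for layer in stack)

# Hypothetical stack: a filesystem interface backed by an object store.
stack = [
    Layer("filesystem interface", 0.5),
    Layer("object store", 20.0),
]

total = stack_latency_ms(stack)
slowest = max(stack, key=lambda layer: layer.added_latency_ms)
print(f"total: {total} ms, dominated by: {slowest.name}")
```

The application only talks to the top layer, but under these assumptions almost all of the latency it observes comes from the layer underneath.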
The Storage Stack
There are four main types of storage topologies: centralized, distributed, sharded and hyperconverged. The storage architecture you choose for an application has implications for the availability, consistency, scalability, performance and durability of the storage system.
- Centralized topologies usually depend on vendor-specific hardware, and generally have a small number of tightly-packed nodes. Latency is very low, and this is a common architecture for block storage systems. It’s usually more consistent than distributed storage architectures. However, centralized architectures are hard to scale horizontally.
- Distributed systems rely more heavily on software than hardware. This architecture is easier to scale horizontally and is more flexible: there are many variations on distributed systems, a term that just means the storage is spread over many nodes. This makes it possible to optimize the architecture for the workload in question. However, the added flexibility comes at the cost of added complexity.
- Sharded topologies involve partitioning a workload over multiple instances. They offer robust scaling for both storage and compute, but at the cost of highly complex architectures that can be difficult to debug if they are not set up properly at the beginning.
- Hyperconverged architectures place storage in the same node as the application, which maximizes flexibility but can increase the risks from any individual failure, since failures can affect the underlying storage system.
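As a rough illustration of the sharded approach described above, the sketch below partitions keys across a fixed set of instances using a stable hash. The instance names and keys are hypothetical, and real systems often use more elaborate placement schemes than a simple modulo.

```python
import hashlib

# Hypothetical storage instances that together hold the workload.
SHARDS = ["storage-0", "storage-1", "storage-2"]

def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a hash that is stable across runs."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

def instance_for(key: str) -> str:
    """Return the instance responsible for the given key."""
    return SHARDS[shard_for(key, len(SHARDS))]

for key in ("user:1001", "user:1002", "order:77"):
    print(key, "->", instance_for(key))
```

Every reader and writer has to agree on this mapping, and with modulo placement, changing the number of shards moves most keys to a different instance. That is one reason the initial setup of a sharded system deserves care.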
There’s no right answer in selecting a storage architecture—it all depends on the application’s needs. It’s just essential to understand the options and some of the trade-offs involved with different systems.
What Makes Storage Cloud Native?
Here at StorageOS, we think there are eight principles of cloud-native storage. The bottom line is that cloud-native storage should behave in the same way as other parts of the cloud-native environment. The eight principles of cloud-native storage are:
- Application centric
- Application and platform agnostic
- Declarative and composable
- API-driven and self-managed
- Natively secure
- Consistently available
Any cloud-native storage system you consider should follow those eight principles. Storage should be held to the same standards as the compute part of an application, and developers should expect the same level of control over storage resources as they already have over the rest of the application. Without a storage system that behaves in a cloud-native way, it’s impossible to have a truly cloud-native application.
Reading the CNCF Storage Landscape White Paper will give you more in-depth insight into the terms used to discuss storage in a cloud-native environment, the trade-offs involved with different types of storage, and how to handle data protection, along with a detailed look at the differences between file, block and object storage.
Next time you’re selecting a storage option for a cloud-native workload, make sure to think about what your application needs most. Is it performance? Scalability? Then use that to find the storage option that best matches your use case. Developers now have the power to control storage, but like all power, this is really a responsibility, and one that comes with a learning curve.
Author: Alex Chircop
Alex is the Founder and CEO of StorageOS, building software-defined storage solutions for cloud-native environments. Alex is also a co-chair of the CNCF Storage SIG. Before embarking on the start-up adventure, he spent over 25 years engineering infrastructure platforms for companies like Nomura and Goldman Sachs.