What does it mean for storage to be cloud native?

Cloud native storage and container-attached storage can be confusing, especially for developers who, until recently, have not had to think about storage at all. But storage is essential for nearly all real-world applications. Business-critical applications are rarely truly stateless; in most cases, they must store state in some way. To help developers understand how storage fits into the cloud native environment, I recently gave a talk at DevOps Exchange London. Here are some highlights.

What is cloud native storage? 

The characteristics that make a platform, solution or application cloud native ultimately have little to do with the environment the application is deployed to. Rather, cloud native technology should be application-centric, declarative, API-driven and agile. Let’s unpack what that means for storage.

Application-centric. In a traditional storage system, storage is presented through operating systems or virtual machines. This architecture does not work with cloud native applications, which require storage resources to move around in the same way that the applications themselves do. An application might store data on multiple virtual machines, and each virtual machine might hold data from multiple applications. Cloud native storage needs to organize storage resources by application rather than by virtual machine.

Declarative. One of the fundamental advantages of container orchestrators like Kubernetes is the ability to declare what the application needs, from networking to configuration. To fit into the cloud native ecosystem, users need the same level of control over their storage resources: the ability to say that this application needs storage with X, Y and Z attributes.

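API-driven. The only way to create a declarative storage system is to have API-driven storage: the orchestrator must be able to provision, attach and remove storage programmatically, without a human operator in the loop.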

Agile. Cloud environments are dynamic, with lots of moving parts. Storage resources need to be able to scale, to move around clusters and to stay connected to the application as it moves as well. Cloud native storage should also provide data mobility between public and private clouds in the same way that an orchestrator like Kubernetes facilitates application mobility. 
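To make the declarative and API-driven points a little more concrete, here is a minimal sketch of an application declaring its storage needs as data. It builds a manifest in the shape of a Kubernetes PersistentVolumeClaim; the helper name `build_volume_claim`, the application name `orders-db` and the storage class `fast-replicated` are assumptions made up for illustration, not anything from the talk.

```python
import json


def build_volume_claim(app_name, size, storage_class, access_mode="ReadWriteOnce"):
    """Describe what storage the application needs, not how to provide it.

    Hypothetical helper: it only assembles a dictionary shaped like a
    Kubernetes PersistentVolumeClaim. It does not talk to a cluster.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {
            # The claim is named and labelled after the application,
            # not after a virtual machine or host.
            "name": f"{app_name}-data",
            "labels": {"app": app_name},
        },
        "spec": {
            "accessModes": [access_mode],
            # Assumed storage class name; a real cluster defines its own.
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


claim = build_volume_claim("orders-db", "10Gi", "fast-replicated")
print(json.dumps(claim, indent=2))
```

The manifest says nothing about which node or disk will hold the data. It is up to the orchestrator and the storage layer to satisfy the request through their APIs, which is what allows the storage to follow the application as it moves.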

Optimizing correctly

When you’re building a cloud native application and connecting it with storage, you’re designing a storage system based on a number of different attributes. The challenge, however, is that in practice many storage systems are built in layers, so you can no longer think only about the top layer through which you access the data. You need a thorough understanding of which functionality matters most to the application. Perhaps the best example of this is Ceph, which is fundamentally an object storage system but can be accessed as a file system. Because Ceph is built as an object store, it will have the high latency of an object store even when the application interacts with it as if it were a pure file system.

What is the Container Storage Interface? 

One of the key developments in cloud native storage was the release of the Container Storage Interface (CSI) for general availability in Kubernetes in early 2019. CSI provides a standard API that lets storage orchestrators such as StorageOS connect to Kubernetes, or to any other container orchestrator. Before CSI, integrating a project like StorageOS with Kubernetes required putting integration code directly into the Kubernetes codebase. This meant we had to align release cycles and generally depended on Kubernetes releases to fix bugs or ship new functionality. CSI has made it dramatically easier to build API-driven, cloud native storage solutions that integrate with Kubernetes without being totally dependent on the Kubernetes release cycle.
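The decoupling that CSI provides can be illustrated with a small sketch. The real CSI specification is a set of gRPC services; the Python interface below is a deliberately simplified, hypothetical stand-in whose method names are only loosely modelled on CSI operations such as CreateVolume, NodePublishVolume and DeleteVolume. `StorageDriver`, `InMemoryDriver` and `provision_for_pod` are all invented for illustration.

```python
from abc import ABC, abstractmethod


class StorageDriver(ABC):
    """Hypothetical, simplified contract between orchestrator and storage."""

    @abstractmethod
    def create_volume(self, name: str, size_gib: int) -> str:
        """Provision a volume and return its ID."""

    @abstractmethod
    def publish_volume(self, volume_id: str, node: str, target_path: str) -> None:
        """Make the volume available at a path on a node."""

    @abstractmethod
    def delete_volume(self, volume_id: str) -> None:
        """Remove the volume."""


class InMemoryDriver(StorageDriver):
    """Toy driver that records volumes in a dict instead of real storage."""

    def __init__(self):
        self.volumes = {}  # volume_id -> {"size_gib": int, "mounts": {node: path}}

    def create_volume(self, name, size_gib):
        volume_id = f"vol-{name}"
        # setdefault keeps the call idempotent: repeating it returns the same volume.
        self.volumes.setdefault(volume_id, {"size_gib": size_gib, "mounts": {}})
        return volume_id

    def publish_volume(self, volume_id, node, target_path):
        self.volumes[volume_id]["mounts"][node] = target_path

    def delete_volume(self, volume_id):
        self.volumes.pop(volume_id, None)


def provision_for_pod(driver: StorageDriver, pod: str, node: str, size_gib: int) -> str:
    """Orchestrator-side logic: it only knows the interface, not the driver."""
    volume_id = driver.create_volume(pod, size_gib)
    driver.publish_volume(volume_id, node, f"/mnt/{pod}")
    return volume_id


driver = InMemoryDriver()
vid = provision_for_pod(driver, "orders-db-0", "node-a", 10)
print(vid, driver.volumes[vid])
```

Because `provision_for_pod` depends only on the interface, a storage vendor can ship or fix a driver without touching the orchestrator's code. That is the property the paragraph above describes: release cycles no longer need to be aligned.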

Beyond stateless

When you’re looking for ways to build stateful cloud native applications, it’s important to understand the different types of storage and the trade-offs they represent. It’s just as critical, though, to connect applications to storage in a way that enhances the application’s performance and lets it keep moving in the agile, cloud native way it was designed to. Developers also need to think about how the storage will behave on day two, as well as how it will meet business needs around availability, security and disaster recovery.

This is where software-defined storage solutions like StorageOS come in. They provide a layer of abstraction between the application and the underlying storage, generally integrating with the container orchestrator through CSI, to ensure that applications and data all behave in the same cloud native way.

Author: Alex Chircop

Experienced CTO with a focus on infrastructure engineering, architecture and strategy definition. Expert in designing innovative solutions based on a broad and deep understanding of a wide range of technology areas. Previously Global Head of Storage Platform Engineering at Goldman Sachs and Head of Infrastructure Platform Engineering at Nomura International.
