Containers – and container orchestration, most commonly via Kubernetes – are changing the way enterprises develop and run applications.
Containerised architectures allow organisations to develop, deploy and decommission applications quickly. Containerised applications are also more easily portable between the cloud and on-premise systems, and for some enterprises this portability is the key advantage.
But as enterprises use containerised applications more widely, they are also using them to handle more critical data – and this data needs to be backed up.
One of the arguments in favour of containers has been that no backups are needed, because the architecture is stateless and applications are often designed to have a very short working life (most operate for less than a day). Any stateful components are spun up from the central key-value store, known as etcd.
This works perfectly well for rapid application development and web-based operations. But as enterprises move containers into the core of operations and potentially use them to replace conventional applications, they need a higher level of protection. This means protecting the etcd database and any data stored in persistent volumes.
“Generally speaking, organisations aren’t backing up Kubernetes with native tools, if they are backing it up at all,” says Brent Ellis, a senior analyst at Forrester. “Many product teams back up the etcd configuration database for their clusters, then they back up the primary storage that the container images are stored in and any persistent volumes referenced in the YAML files.
“This is fine if you have a low degree of complexity and Kubernetes applications that have zero or minimal state. You need application awareness in order to back up an application’s state – and capture where in a particular step of an application that the transformation of data was left off in the case of a disaster.”
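To illustrate the manual approach Ellis describes, the sketch below lists the persistent volume claims that a workload manifest references, which is the minimum a team would need to know before backing up volumes alongside etcd. It is a minimal sketch rather than any vendor's method: it assumes manifests have been exported as JSON (for example with kubectl get -o json), the function name and the sample Deployment are hypothetical, and it ignores the claims that a StatefulSet creates from volumeClaimTemplates.

```python
import json


def referenced_pvc_names(manifest: dict) -> set:
    """Return the PersistentVolumeClaim names referenced by a manifest.

    Handles a bare Pod, a CronJob, workload kinds that nest the pod spec
    under spec.template.spec (Deployment, StatefulSet, DaemonSet, Job),
    and a "List" wrapper containing any of these.
    """
    kind = manifest.get("kind", "")
    if kind == "List":
        names = set()
        for item in manifest.get("items") or []:
            names |= referenced_pvc_names(item)
        return names

    spec = manifest.get("spec") or {}
    if kind == "Pod":
        pod_spec = spec
    elif kind == "CronJob":
        job_spec = (spec.get("jobTemplate") or {}).get("spec") or {}
        pod_spec = (job_spec.get("template") or {}).get("spec") or {}
    else:
        pod_spec = (spec.get("template") or {}).get("spec") or {}

    names = set()
    for volume in pod_spec.get("volumes") or []:
        # Only volumes backed by a claim matter here; emptyDir, configMap
        # and similar volume types carry no persistent data.
        claim = volume.get("persistentVolumeClaim")
        if claim and claim.get("claimName"):
            names.add(claim["claimName"])
    return names


# Hypothetical Deployment with one persistent volume and one scratch volume.
SAMPLE_DEPLOYMENT = """
{
  "apiVersion": "apps/v1",
  "kind": "Deployment",
  "metadata": {"name": "orders", "namespace": "shop"},
  "spec": {
    "template": {
      "spec": {
        "containers": [{"name": "orders", "image": "example/orders:1.0"}],
        "volumes": [
          {"name": "data", "persistentVolumeClaim": {"claimName": "orders-data"}},
          {"name": "scratch", "emptyDir": {}}
        ]
      }
    }
  }
}
"""

print(sorted(referenced_pvc_names(json.loads(SAMPLE_DEPLOYMENT))))
# prints ['orders-data']
```

A list like this says which volumes hold data, but not what state the application was in when they were copied, which is the gap Ellis points to.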
This is leading to two main approaches to Kubernetes backup – dedicated products, and broader-based backup and recovery tools that support container environments. What follows is an overview of the market that is by no means exhaustive.
Kasten positions its K10 software as a purpose-built Kubernetes data management solution. The application runs in its own namespace on a Kubernetes cluster, and supports all the main cloud platforms as well as on-premise architecture. The tool scans for components that need backup, including persistent storage volumes and databases. Users can set their own data protection, backup and disaster recovery (DR) policies.
In 2020, backup vendor Veeam bought Kasten.io.
Portworx was one of the first suppliers to develop persistent storage for containers, so is well placed to provide backup to Kubernetes environments. It does this through its PX-Backup software, which it claims is “container granular and app aware”. The tool supports block, file and object as well as cloud storage. It has storage discovery and provisioning tools, and backup, DR, security and migration features.
Pure Storage bought Portworx in 2020.
Velero is an open source backup, restore, recovery and migration tool for Kubernetes. It can back up entire clusters, or parts of a cluster selected by namespace and label selector. The tool can now also restore Kubernetes application programming interface (API) groups by priority level. Velero was previously known as Heptio Ark.
Although Velero is open source, it is supported by VMware, and the vendor has a number of Velero resources in its Tanzu developer centre.
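The scoping described above can be sketched as a small helper that assembles a Velero command line for a backup limited by namespace and label. The --include-namespaces and --selector options belong to the Velero CLI's backup create command; the helper name, backup name, namespace and label values are hypothetical, and the command is only built here, not run.

```python
def velero_backup_args(name, namespaces=(), selector=None):
    """Build the argument list for a scoped `velero backup create` call.

    name       -- name to give the backup
    namespaces -- namespaces to include; empty means no namespace filter
    selector   -- label selector string such as "app=orders", or None
    """
    args = ["velero", "backup", "create", name]
    if namespaces:
        # Velero takes the namespaces as one comma-separated value.
        args += ["--include-namespaces", ",".join(namespaces)]
    if selector:
        args += ["--selector", selector]
    return args


# Hypothetical example: back up only the "orders" app in the "shop" namespace.
print(" ".join(velero_backup_args("orders-nightly", ["shop"], "app=orders")))
# prints velero backup create orders-nightly --include-namespaces shop --selector app=orders
```

The list form can be passed directly to subprocess.run on a machine where the velero binary is installed and pointed at the right cluster.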
Red Hat OpenShift Container Storage
Red Hat – now part of IBM – introduced significant Kubernetes support to its Data Services line in 2020, replacing previous IBM offerings.
Red Hat OpenShift Container Storage adds the vendor’s data protection tools to container environments without, Red Hat says, any additional technology or infrastructure. Features include snapshots via the container storage interface (CSI), clones of existing data volumes, and support within OpenShift APIs for restoring data and applications in container pods and for restoring connections between namespaces and persistent data.
The toolset also links to IBM’s Spectrum Protect Plus services and to TrilioVault and Kasten K10.
NetApp Astra Data Store
NetApp’s Astra Data Store is a file service for containers and virtual machines (VMs) based on a standard NFS client. Astra is positioned around simplifying storage across containers and VMs and making it more efficient, so it allows firms to use the same storage pool and backup tools across both architectures.
NetApp updated its Astra Control software earlier this year to support additional Kubernetes platforms, including Rancher and community Kubernetes. It uses NetApp’s back-end technologies for data protection, DR and migration.
Rancher provides its own backup and restore operator from v2.5 of its environment upwards. The operator has to be installed in the local Kubernetes cluster, and backs up the Rancher app. However, the Rancher UI allows etcd and cluster backups, including snapshots. These can be saved locally or to an S3-compatible cloud target.
Trilio positions its TrilioVault tool as cloud-native data protection for Kubernetes. Trilio claims to be application-centric, and has a wide range of Kubernetes platform and cloud support. The tool uses core Kubernetes APIs and the CSI framework, while the management console supports application discovery, backup and restore, and DR policy management. The tool also supports snapshots.
TrilioVault is certified for a range of deployments, including on HPE, VMware Tanzu and Rancher.
Cohesity also positions its Helios backup tool as a cloud-native service for containers. The vendor works with the three hyperscale platforms, and backs up applications’ persistent states, persistent volumes and operational metadata. Multicloud support means that backups and restores can span a range of cloud providers for additional resilience.
Cohesity’s cloning tools also offer zero-cost clones so that DevOps teams can use backup data for application development.
Veritas’s NetBackup tools support a range of backup, recovery and business continuity options for Kubernetes. As well as standard backups, Veritas supports ransomware protection, via immutable backups on AWS S3, and Kubernetes data management with integrated disaster recovery. Veritas also says its tools allow users to move between Kubernetes distributions for a “backup once, recover from and to anywhere” approach.
Catalogic’s CloudCasa is relatively unusual in the market in that it operates as backup-as-a-service. It provides cluster-level recovery and free snapshots, retained for 30 days, along with a range of paid-for options including Kubernetes Persistent Volume (PV) backups. CloudCasa supports Amazon EBS snapshots and CSI snapshots.
Kubernetes-native vs general backup: Beware doubling-up
Choosing the best backup and recovery options for Kubernetes is not always straightforward, however, and firms may find they need more than one tool to protect their installations.
“Many of the standalone Kubernetes native backup tools are being acquired by DevOps teams directly,” says Forrester’s Ellis. “It is not uncommon for a purchase of TrilioVault or Kasten to be initiated by a product team. More comprehensive backup tools are still being purchased by the CIO and their team, and understanding the need for Kubernetes native backup in that part of the organisation is a little behind.”
CIOs need to balance the richer functionality and more granular controls of native Kubernetes tools with the better enterprise-wide view of applications and data provided by general-purpose but container-aware backup tools.
“In the comprehensive backup tools, I think Kubernetes native backup is viewed as table stakes,” says Ellis. “Almost all enterprise-level vendors claim to be able to back up Kubernetes, but not all of them do it natively.”