Nearly all enterprise IT projects involve decisions about where to store data.
The public cloud holds perhaps more than half of all business data, and the volume held on cloud-based systems, especially by the hyperscalers – Amazon Web Services (AWS), Google Cloud Platform and Microsoft Azure – has grown steadily over the past few years.
But traffic is not all one way. Increasingly, firms are looking at hybrid cloud storage, where some data is stored on-premise and some in the cloud. Research by Aptum, a managed services and consulting firm, found that 77% of firms use the public cloud and 86% expect to use hybrid or multicloud services.
However, raw statistics mask what is often a complex decision-making process that involves assessment of workloads, performance, regulation and security, and costs. Here, we look at four key questions that help decide whether data should be stored on-site, in the cloud, or in a combination of both.
Is the data mostly processed on-prem or in the cloud?
For the best performance, system architects need to minimise latency between applications and storage. Accessing cloud storage over the public internet inevitably increases latency. Internet connections are also more prone to variable performance and general reliability issues.
This suggests that for best performance, data should be stored on-premise. For the most critical applications, this is still usually the case.
But the decision is not always clear cut.
“We know that if you start to run compute on a storage bucket across the wire, you are going to have a performance impact,” cautions Paul Mackay, regional vice-president for EMEA and APAC at cloud data firm Cloudera.
Different architectures may also be needed, depending on whether the data is being consumed by an application or a human analyst.
“The first case is when applications consume the data; the second is users that access file shares that sit on local storage appliances,” says Adrian Bradley, head of cloud transformation at KPMG UK. “In the first, it is best practice to keep applications and connected data in the same environment. For file shares, the decision of where to store data is mostly driven by analysis of operational costs and current data lifecycle management requirements.”
How frequently is data accessed?
For some use cases – such as archiving and “cold” storage – cloud makes a lot of sense.
With data accessed infrequently, performance is less critical. Cloud-based archiving and backup applications are designed to work in the background, so any performance issues should not affect users.
Problems occur when organisations lack a good understanding of their data assets and how they are used. Storing data on the wrong tier increases costs. There may also be performance issues.
“Management of large, frequently accessed datasets is economically and performance-wise more advantageous with on-premise storage solutions,” advises Rahul Gupta, a data expert at PA Consulting.
Again, it comes down to how well applications are optimised for the cloud. Firms that move existing workloads to the cloud can hit performance and cost issues, whereas those that use cloud-native applications fare better.
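As a rough illustration of matching data to a tier by how often it is accessed, the short Python sketch below sorts datasets into three placements. The thresholds, dataset names and tier labels are hypothetical, chosen only to show the idea; a real policy would also weigh dataset size, latency needs and cost.

```python
from dataclasses import dataclass

# Hypothetical thresholds (reads per month), for illustration only.
HOT_MIN_READS = 100
COLD_MAX_READS = 1


@dataclass
class Dataset:
    name: str
    reads_per_month: float


def suggest_tier(ds: Dataset) -> str:
    """Return an illustrative placement based only on access frequency."""
    if ds.reads_per_month >= HOT_MIN_READS:
        # Frequently accessed data: keep it close to the applications.
        return "on-premise or hot tier"
    if ds.reads_per_month <= COLD_MAX_READS:
        # Rarely accessed data: performance matters less than price.
        return "cloud archive"
    return "standard cloud tier"


# Made-up datasets to exercise each branch.
datasets = [
    Dataset("trading-db", 50_000),
    Dataset("project-files", 20),
    Dataset("2015-backups", 0.1),
]

for ds in datasets:
    print(f"{ds.name}: {suggest_tier(ds)}")
```

A rule this simple only works when the organisation already knows how often each dataset is read, which is exactly the understanding many firms lack.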
What are the cost considerations?
Cloud computing is often seen as a cost-saving measure, and that extends to storage.
In reality, moving to the cloud has more to do with a shift from capital investment to operational expenditure. Over time, cloud can easily cost more than on-premise technology or alternatives such as colocation.
That additional cost can be worthwhile. Sometimes, firms are willing to pay for flexibility – to tap into cloud providers’ innovation or because they need to preserve capital. Or they might have moved to cloud-based applications.
Even so, optimised on-premise storage can still be the cheaper option. As PA’s Gupta points out, much depends on how new the customer’s on-site infrastructure is, and how much life it has left.
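The capital-versus-operational trade-off can be sketched with simple arithmetic. The Python example below compares cumulative spending on a one-off on-premise purchase plus running costs against a flat monthly cloud fee, and reports the first month in which cloud becomes the more expensive option. All figures and function names are assumptions made for illustration, and the model ignores hardware refresh cycles, staffing and price changes.

```python
from typing import Optional


def cumulative_on_prem(months: int, capex: float, monthly_running: float) -> float:
    """Up-front purchase plus running costs over the given number of months."""
    return capex + monthly_running * months


def cumulative_cloud(months: int, monthly_fee: float) -> float:
    """Pay-as-you-go spending over the given number of months."""
    return monthly_fee * months


def break_even_month(
    capex: float,
    monthly_running: float,
    monthly_fee: float,
    horizon: int = 120,
) -> Optional[int]:
    """First month in which cumulative cloud cost exceeds on-premise cost.

    Returns None if that never happens within the horizon.
    """
    for month in range(1, horizon + 1):
        on_prem = cumulative_on_prem(month, capex, monthly_running)
        if cumulative_cloud(month, monthly_fee) > on_prem:
            return month
    return None


# Hypothetical figures: 120,000 up front and 1,000 a month on-premise,
# against 4,000 a month in the cloud.
print(break_even_month(capex=120_000, monthly_running=1_000, monthly_fee=4_000))
```

With these made-up numbers, cloud overtakes on-premise in month 41. If the existing on-site kit is already paid for and has years of life left, the effective capital cost is lower and the crossover arrives sooner.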
Cloud storage also has hidden costs. Data egress is frequently cited as a reason for higher-than-expected bills, but firms can also find they pay more than they planned because they keep data for extended periods in expensive tiers rather than in dedicated cloud archives.
Again, careful application design and a clear picture of data use will minimise this.
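To show how tier choice and egress feed into a bill, the sketch below estimates a monthly charge from the volume stored, the tier it sits in and the volume transferred out. The per-gigabyte prices are invented for the example and are not any provider's actual rates; real pricing usually adds request, retrieval and minimum-retention charges that are left out here.

```python
# Hypothetical prices per GB, for illustration only.
TIER_PRICE_PER_GB_MONTH = {"hot": 0.02, "archive": 0.002}
EGRESS_PRICE_PER_GB = 0.09


def monthly_bill(stored_gb: float, tier: str, egress_gb: float) -> float:
    """Estimate one month's bill: storage in the chosen tier plus egress."""
    storage_cost = stored_gb * TIER_PRICE_PER_GB_MONTH[tier]
    egress_cost = egress_gb * EGRESS_PRICE_PER_GB
    return storage_cost + egress_cost


# 100 TB of rarely read data, compared across the two tiers,
# then the hot tier again with 5 TB transferred out in the month.
for tier, egress_gb in [("hot", 0), ("archive", 0), ("hot", 5_000)]:
    cost = monthly_bill(100_000, tier, egress_gb)
    print(f"{tier:8s} egress={egress_gb:>5} GB  ->  {cost:,.2f}")
```

Under these assumed prices, leaving 100 TB in the hot tier costs 10 times as much to store as moving it to the archive tier, and a single month of heavy egress adds noticeably to the total.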
What are the security and compliance requirements?
The good news is that most security and compliance requirements can now be met in the cloud.
“In the early days of the cloud, you didn’t put mission-critical systems, anything to do with finance, or anything with a regulatory impact in the public cloud,” says Patrick Smith, EMEA field CTO at storage supplier Pure Storage. “But regulators now are happier with public cloud.”
Even the highest levels of security are possible with the right cloud infrastructure. The key for CIOs is to match security and compliance requirements to the sensitivity of data.
In some industries, such as financial services or health, there are additional compliance requirements that include where data is stored. This might restrict which cloud services can be used.
It remains vital to have a clear picture of the organisation’s data before deciding where to store it.
“From a security or compliance perspective, you have to be able to understand where your data is, who’s touched it, and what has happened to that dataset,” says Cloudera’s Mackay. “We see organisations struggle a lot here. They have a security posture on-premise, but need a whole new toolset to do that on the cloud.”