In this podcast, we look at snapshots and their role in data protection strategy with Shawn Meyers, field chief technology officer (CTO) at Tintri.
We talk about how we define snapshots, the level of infrastructure at which they are taken, why snapshots are not backups, the effect of the granularity of snapshot on recovery performance, recovery point objective (RPO) and recovery time objective (RTO).
Meyers also talks about the limitations of snapshots, such as their use with databases, and their effect on data growth.
Antony Adshead: What are snapshots and their benefits and limitations?
Shawn Meyers: The thing with snapshots is you’ve got to worry at what level you take them.
A snapshot is basically a point-in-time copy of data. In other words, I can do this at different levels, ie at the OS [operating system] level, the VM [virtual machine] level, or the storage level.
It’s a point-in-time collection that allows me to roll back in time or restore to that point in time.
I make sure I’m very clear about this: a snapshot is not a backup. You will find people who use snapshots as backups. A snapshot is something you can add to your toolbox and recover from, but it is definitely not a backup, because a backup allows you to roll forward and back in time to find a specific time frame; a snapshot is a single, defined point in time.
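The distinction Meyers draws can be seen in a minimal sketch (a simplified toy model, not any vendor's API; the class and method names here are purely illustrative): recovery from a snapshot can only land on a frozen point, never an arbitrary moment in between.

```python
from copy import deepcopy

class Volume:
    """Toy volume: snapshots are discrete, frozen points in time."""
    def __init__(self):
        self.data = {}       # block -> current value
        self.snapshots = {}  # name -> frozen copy of the volume

    def write(self, block, value):
        self.data[block] = value

    def snapshot(self, name):
        # A snapshot captures exactly this moment -- nothing in between.
        self.snapshots[name] = deepcopy(self.data)

    def rollback(self, name):
        # Recovery can only land on a snapshot point, not an arbitrary time.
        self.data = deepcopy(self.snapshots[name])

vol = Volume()
vol.write("a", 1)
vol.snapshot("noon")
vol.write("a", 2)      # change made after the snapshot
vol.rollback("noon")   # back to exactly noon -- the later write is lost
print(vol.data["a"])   # → 1
```

A backup system, by contrast, would pair such a point with transaction logs so you could roll forward to any moment after it, which is exactly what this toy snapshot cannot do.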
There are different ways of doing snapshots, and it depends on what storage technologies you have. Sometimes you’re going to have a LUN [logical unit number] or volume snapshot, where everything on that one volume is snapped at the same time. Other times, it’ll be a per-VM or per-object snapshot covering a smaller subset of data.
The more granular the snapshot, the better the recovery is in terms of what’s impacted, versus the broader snapshot, which is a wider protection process.
We tend in storage to use snapshots a lot for replication, such as replicating data to a different site: I take a snapshot and replicate it, and I still have that snapshot to roll back and recover from.
There are limitations, of course. Most of your databases are going to be in a locked state, so I can’t roll forward or roll back using transaction logs.
Also, data growth. That’s one of the biggest things out there. As I make changes after taking snapshots, I get data growth. If I have a block that’s been written to and I take a snapshot, write to it again and take another snapshot, all those versions of the data are stored.

That’s one of the biggest limitations we pay attention to: the data growth from the snapshots. The more granular the snapshot, the less storage impact there is; the wider the snapshot, the more impact.
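A rough way to see that data-growth effect is a toy copy-on-write model (an assumed, simplified sketch; the names are illustrative, not any storage product's implementation): every time a block is overwritten after a snapshot, the old version has to be retained, so repeated snapshot-then-write cycles accumulate stored data.

```python
class CowVolume:
    """Toy copy-on-write volume: snapshots force old block versions to be kept."""
    def __init__(self):
        self.live = {}        # block -> current value
        self.retained = []    # old versions preserved for snapshots
        self.snapped = set()  # blocks frozen by the latest snapshot

    def write(self, block, value):
        if block in self.snapped and block in self.live:
            # First overwrite since the snapshot: preserve the old version.
            self.retained.append((block, self.live[block]))
            self.snapped.discard(block)
        self.live[block] = value

    def snapshot(self):
        # Freeze every current block; overwrites from now on must copy first.
        self.snapped = set(self.live)

vol = CowVolume()
vol.write("b1", "v1")
vol.snapshot()
vol.write("b1", "v2")     # v1 must now be retained
vol.snapshot()
vol.write("b1", "v3")     # v2 retained too
print(len(vol.retained))  # → 2: both old versions are still stored
```

The same mechanism explains the granularity point: a volume-wide snapshot freezes every block on the volume, so far more overwrites trigger retained copies than with a per-VM or per-object snapshot.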
Adshead: What is best practice for snapshots and how do they fit with a comprehensive data protection strategy?
Meyers: Snapshots should be part of every data protection strategy. And of course it comes down to going back to the drawing board.
We go back to our RPO – our recovery point objective – and our RTO – recovery time objective. We sit down and figure those out and how our technologies work with the snapshots to meet those needs.
[We also have to take into account] data stored in other locations. We also have to worry about impacts on production, because a lot of the time we use snapshots for backups today. So, we take a snapshot, I attach that snapshot to the backup system and do the backups there. That way, I’m not having any performance impact on my production system for that backup.
Same thing with replication. I take the snapshot and do the replication on the back side, so I’m not replicating the active data and my production system isn’t impacted.
One of the best things about snapshots is that I offload things to the storage layer, away from the OS level, so production doesn’t get impacted while I’m doing data protection.
Of course, [it’s about] knowing when I need to recover to – ie, talking about recovery point: how much data can I lose versus how much data do I need, and how fast do I need it back online. I always describe it as being like a knob – you turn it up to where you want, and the fun part is that when you turn the knob to get a smaller RPO and RTO, my costs tend to go up.
So, if I can live with fewer processes or longer RPOs and RTOs, my costs go down. If I go to sub-minute or sub-second, my costs go up stratospherically.
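The knob Meyers describes comes down to a simple relationship (an assumed, simplified calculation; the function name is illustrative): the worst-case data loss from snapshot-based recovery equals the gap between snapshots, so shrinking the RPO means taking and keeping proportionally more snapshots.

```python
def worst_case_rpo_minutes(snapshot_interval_minutes: float) -> float:
    # A failure just before the next snapshot loses everything written
    # since the last one, so the worst-case loss equals the interval.
    return snapshot_interval_minutes

# Tightening the "knob" from hourly to per-minute snapshots cuts the
# worst-case loss 60x -- but means 60x more snapshots to store and manage.
for interval in (60, 15, 1):
    print(f"{interval}-min interval -> up to "
          f"{worst_case_rpo_minutes(interval)} min of data lost")
```

That linear relationship between frequency and retained copies is one reason sub-minute objectives get expensive so quickly.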
It comes down to designing your strategy, how much money you have, to figure out what you can do. Or, you figure out what you want and how much it will cost. You have to go back and forth and weigh the pros and cons. Because just like everything else, it’s a business decision for your organisation to determine how much money you’re willing to invest in your data strategy.
I worked as a consultant for a long time, and I’ve worked with companies that lost datacentres; their take on data protection strategy is different from those who have never had a massive data loss.
There’s always going to be something to worry about, and if you’ve never had a data loss issue it means you’re just lucky; it’s going to happen at some point in the future, so make sure you spend time on your data protection strategy.
For me, snapshots are one of the most critical parts of it – knowing how I’m going to do it, where I’m going to store the data, how frequently, and what my recovery process is.
One of the greatest benefits of snapshots is my recovery time. If I’m using backups, a lot of the time I have to spin it back up and it takes some time to rehydrate. With snapshots, I can usually recover that object, that VM, whatever it is, in minutes if not seconds and get the system back up – again, lowering your RTO.