Introduction
Typically, snapshots are used to return a virtual machine to its previous state if something goes wrong during updates or configuration changes. In that sense, they can save your system from unexpected failures. But please, do not treat a snapshot as a backup or vice versa!
Let’s be honest, snapshots are not backups. Each snapshot is associated with a certain set of indices (or a single index) that refer to other blocks on the disk. If the corresponding storage goes down, you’ll lose all your data because you’ll be unable to restore everything from a snapshot. So be smart: do not rely on snapshots for data protection, and use a proper set of tools for backups. In other words, use a hammer for nails and a screwdriver for screws.
Snapshots in VMware
Here are several recommendations regarding snapshots in a vSphere environment:
- Do not use snapshots as backups.
- A chain cannot be longer than 32 snapshots. Still, for better performance, do not use more than three.
- Do not keep a single snapshot for more than 72 hours.
The longer you keep a snapshot, the bigger its file grows. One day, you may even run out of storage thanks to one of those guys. You can find more information and recommendations in this article: https://kb.vmware.com/s/article/1025279.
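The recommendations above are easy to check automatically. Here is a minimal Python sketch that is not tied to any VMware API: the `Snapshot` class, the `check_snapshot_chain` function, and the sample data are hypothetical, and in practice you would fill the list from whatever inventory tool you use.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

MAX_CHAIN_LENGTH = 32                    # hard limit on snapshots in one chain
RECOMMENDED_CHAIN_LENGTH = 3             # recommended ceiling for good performance
MAX_SNAPSHOT_AGE = timedelta(hours=72)   # recommended maximum age of one snapshot


@dataclass
class Snapshot:
    """Hypothetical record describing one snapshot of a VM."""
    name: str
    created: datetime


def check_snapshot_chain(chain, now):
    """Return a list of warnings for one VM's snapshot chain."""
    warnings = []
    if len(chain) > MAX_CHAIN_LENGTH:
        warnings.append(
            f"chain has {len(chain)} snapshots, the limit is {MAX_CHAIN_LENGTH}"
        )
    elif len(chain) > RECOMMENDED_CHAIN_LENGTH:
        warnings.append(
            f"chain has {len(chain)} snapshots, "
            f"more than the recommended {RECOMMENDED_CHAIN_LENGTH}"
        )
    for snap in chain:
        if now - snap.created > MAX_SNAPSHOT_AGE:
            warnings.append(f"snapshot '{snap.name}' is older than 72 hours")
    return warnings


if __name__ == "__main__":
    now = datetime(2024, 1, 10, 12, 0)   # sample timestamp, made up for the demo
    chain = [
        Snapshot("before-update", now - timedelta(hours=100)),
        Snapshot("after-update", now - timedelta(hours=2)),
    ]
    for warning in check_snapshot_chain(chain, now):
        print(warning)
```

The thresholds are kept as module-level constants so that you can tighten them for your own environment.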
Creating snapshots
In order to create a snapshot in VMware ESXi, go to vSphere Web Client, select a VM, and follow Actions => Snapshots => Take Snapshot…
Before taking a snapshot, choose the appropriate option: Snapshot the virtual machine’s memory, or Quiesce guest file system. The latter is available only if VMware Tools is installed.
Taking a snapshot of a virtual machine’s memory allows returning to the running VM state. This option effectively “stuns” the OS, and the stun duration depends on the I/O intensity, memory utilization, and other factors. Such snapshots take the longest to complete and cause the most “downtime” for the OS. However, a memory snapshot saves all volatile data and can be used to return a VM to the exact running state it was in when the snapshot was taken.
Snapshots taken without capturing the VM’s memory (AKA “clean” snapshots) revert the VM to a powered-off state. Any data that was not committed to disk will be lost.
Quiescing the guest file system brings the VM’s on-disk data to a state suitable for backup. Any application that is supported by VMware or built-in VSS writers will complete its in-flight transactions before the snapshot is taken, while I/O remains frozen until the snapshot is completed. Again, to use this option, you need to have VMware Tools installed.
A snapshot is a complex thing. It consists of *.vmdk, delta.vmdk, *.vmsd, and *.vmsn files. The *.vmdk is the original disk file. Once a snapshot is created, the initial disk becomes read-only, and all subsequent changes are recorded to the temporary delta.vmdk. In this way, the delta captures the difference between the virtual disk at the moment of snapshot creation and its current state.
Now, let’s describe the *.vmsd and *.vmsn files. The former contains snapshot metadata, while the latter keeps the particular VM’s configuration and its active state. The figure below displays the files we’re talking about.
You can learn more from this article: https://kb.vmware.com/s/article/1015180
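To make the base disk and delta relationship more tangible, here is a toy Python model of it. This is only a sketch of the idea described above, not VMware’s on-disk format: the `SnapshotDisk` class and its block dictionary are made up for illustration.

```python
class SnapshotDisk:
    """Toy model of a read-only base disk plus one delta (illustration only)."""

    def __init__(self, base_blocks):
        # Block number -> data. Treated as read-only once the snapshot exists.
        self.base = dict(base_blocks)
        # Blocks written after the snapshot was taken.
        self.delta = {}

    def write(self, block, data):
        # New writes never touch the base disk, they land in the delta.
        self.delta[block] = data

    def read(self, block):
        # The delta wins if it holds the block, otherwise fall back to the base.
        return self.delta.get(block, self.base.get(block))

    def revert(self):
        # Going back to the snapshot simply discards the recorded changes.
        self.delta.clear()


if __name__ == "__main__":
    disk = SnapshotDisk({0: "boot", 1: "config v1"})
    disk.write(1, "config v2")
    print(disk.read(1))   # current state, served from the delta
    print(disk.base[1])   # state at snapshot time, still intact
    disk.revert()
    print(disk.read(1))   # back to the state at snapshot time
```

Note that the delta holds only the changed blocks. That is why it starts small and grows with every write, and why it is useless once the base disk is gone.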
Snapshots deletion
In order to delete a snapshot, you need some free space on your storage. Accordingly, you need more space if you’re dealing with several snapshots.
Note that this applies only to VMs on thin-provisioned disks, which occupy more and more storage as they are filled with data.
When a VM is created on a thick-provisioned disk, the disk size specified in the VM parameters is reserved on the storage straight away. Therefore, no additional storage is needed: when snapshots are deleted, their data is copied to the initial disk, which already has the space allocated.
Let’s have a closer look
To make sure the things discussed above are clear, let’s study an example. We have a test virtual machine with 20GB of storage and three snapshots: 2GB, 3GB, and 10GB.
Press Delete a VM Snapshot. The snapshot copies all its data to its parent and is removed afterwards.
The image below shows how this happens.
After pressing Delete All Snapshots, the snapshots are merged into the initial vmdk.
The snapshots being merged keep occupying storage space and are not deleted until the merge is over.
The scheme below illustrates the discussed process.
For more info, check this article: https://kb.vmware.com/s/article/1023657
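The space requirement for the example VM can be estimated with simple arithmetic. The Python sketch below is a deliberately rough upper bound built on two assumptions of mine: every block in a delta is new to the base disk, and a thick-provisioned base disk needs no extra space, as stated earlier. The function name is hypothetical.

```python
def delete_all_extra_space_gb(delta_sizes_gb, thin_provisioned):
    """Upper bound on extra storage needed while Delete All Snapshots runs.

    Assumption: on a thin-provisioned disk the base may grow by the full size
    of every delta, and the deltas stay on disk until the merge is over.
    """
    if not thin_provisioned:
        # A thick-provisioned base disk already has its space allocated.
        return 0
    return sum(delta_sizes_gb)


if __name__ == "__main__":
    snapshots_gb = [2, 3, 10]   # the three snapshots from the example
    print(delete_all_extra_space_gb(snapshots_gb, thin_provisioned=True))   # 15
    print(delete_all_extra_space_gb(snapshots_gb, thin_provisioned=False))  # 0
```

In real life the figure is usually lower, because blocks that were overwritten in several snapshots end up in the base disk only once.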
Snapshot consolidation
Snapshot consolidation is the process of combining a delta.vmdk file with the original disk, the *.vmdk file itself.
Some warnings:
- Do not interrupt snapshot consolidation. Doing so may cause irretrievable corruption of the virtual disks being consolidated.
- Get ready: snapshot deletion may turn out to be a long process. Its duration depends on the overall size of the files. Better prepare a cup of coffee.
- The virtual machine’s performance may degrade during consolidation.
The Delete a VM Snapshot and Delete All Snapshots options remove snapshots from Snapshot Manager. Afterwards, the delta.vmdk files are consolidated, that is, merged with the virtual machine’s base disk. If disk consolidation fails, some virtual disk files may remain on the storage, occupying space.
If that happens, you can force the consolidation process by pressing Consolidate.
If you need some additional info on consolidation, have a look through this article: https://kb.vmware.com/s/article/2003638
Checkpoints in Hyper-V
In Hyper-V, snapshots are called “checkpoints”. They allow recording a specific VM’s state, data, and configuration.
Here are some precautions from Microsoft:
- Checkpoints decrease the VM’s disk performance. That’s caused by the delta.avhdx files that appear when checkpoints are created. These files record all the VM changes. The more checkpoints, the more files, and the higher the load on the disk subsystem.
- It is a bad idea to use checkpoints for VMs that run volatile services like Active Directory. Also, avoid using checkpoints where performance and storage availability are crucial.
Checkpoint types
Production checkpoints
Taking a production checkpoint involves the Volume Shadow Copy Service (VSS). The hypervisor calls the backup integration service (Volume Shadow Copy) to interact with applications inside the VM.
While capturing a production checkpoint, the hypervisor informs the guest operating system about its actions via the integration service. Inside the VM, all VSS-aware applications finish their current operations correctly and save their data. Afterwards, the checkpoint is captured and the VM keeps on going.
Standard checkpoints
Standard checkpoints capture the CPU state, RAM, and current disk operations of a running or saved-state VM and save this data on disk. This means that they record the VM’s state, running apps, and open files.
The guest operating system does not participate in creating the checkpoint. Thus, after a restore, the VM carries on from the exact moment when the checkpoint was captured. In some cases, this may result in application crashes inside the VM. This especially happens to applications that use transactions (e.g., SQL Server or Exchange).
You can always change the checkpoint type in the virtual machine’s settings.
For more info, see this dedicated article: https://docs.microsoft.com/en-us/windows-server/virtualization/hyper-v/manage/choose-between-standard-or-production-checkpoints-in-hyper-v
Let’s create a checkpoint
In order to create a checkpoint, go to Hyper-V Manager, right-click on the VM, and press Checkpoint.
Creating a checkpoint produces two files in the VM folder: VMCX and VMRS. The former contains the configuration, while the latter keeps the virtual machine’s state.
All subsequent changes are stored in the temporary delta.avhdx file, which is created from the original *.vhdx file.
NOTE: Do not delete delta.avhdx from the checkpoint directory. Use Hyper-V Manager for checkpoint deletion.
Deleting a checkpoint
Deleting checkpoints requires additional disk space. You need as much free space as is occupied by the checkpoint that is about to be deleted.
Going into details
Let’s study an example. The test VM has 20GB storage and three checkpoints (2GB, 3GB, 10GB).
After pressing Delete checkpoint, the checkpoint copies data to the parent one and is removed afterwards.
The scheme below illustrates this process:
If you press Delete Checkpoint Subtree on the initial checkpoint, the checkpoints’ data is copied to the base disk, and the checkpoints are deleted afterwards. The initial checkpoint is copied first, followed by its subordinate checkpoints.
Here’s an example:
If you press Delete Checkpoint Subtree on one of the child checkpoints, its data will be copied to the parent checkpoint.
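The merge directions described in this section can be summed up in a few lines of Python. This is a toy model of a linear checkpoint chain that follows the description above, not Hyper-V’s actual implementation: the function names and the file names in the sample chain are hypothetical.

```python
def delete_checkpoint(chain, name):
    """Model Delete checkpoint on a linear chain.

    chain[0] is the base disk, and every later entry is a checkpoint whose
    parent is the entry right before it.
    Returns (merge_target, remaining_chain).
    """
    i = chain.index(name)
    if i == 0:
        raise ValueError("the base disk is not a checkpoint")
    # The checkpoint's data goes to its parent, then the checkpoint is removed.
    return chain[i - 1], chain[:i] + chain[i + 1:]


def delete_checkpoint_subtree(chain, name):
    """Model Delete Checkpoint Subtree on a linear chain.

    Returns (merge_target, merged_in_order, remaining_chain).
    """
    i = chain.index(name)
    if i == 0:
        raise ValueError("the base disk is not a checkpoint")
    # The selected checkpoint is merged first, its descendants follow.
    return chain[i - 1], chain[i:], chain[:i]


if __name__ == "__main__":
    chain = ["base.vhdx", "cp1.avhdx", "cp2.avhdx", "cp3.avhdx"]
    print(delete_checkpoint(chain, "cp2.avhdx"))
    print(delete_checkpoint_subtree(chain, "cp1.avhdx"))
    print(delete_checkpoint_subtree(chain, "cp2.avhdx"))
```

Running Delete Checkpoint Subtree on the initial checkpoint leaves only the base disk, while running it on a child checkpoint leaves the chain intact up to its parent.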
If you are looking for more details, take a look at Microsoft’s thoughts:
https://technet.microsoft.com/en-us/library/dn818483(v=ws.11).aspx
Conclusion
In this post, I’ve tried to show you the difference between snapshots and backups and described some basic things you might find useful when working with snapshots in Hyper-V and vSphere environments. Just to emphasize, snapshots are not backups: they only allow you to roll the VM back to its previous state! If something happens to the VM’s virtual disk or the corresponding storage, snapshots won’t save you since the major part of the VM’s data is lost completely. If you want your data to be available no matter what happens to your storage, you should really take your time and plan a decent backup strategy.