If you want to save storage space without deleting your VM data, VMware data deduplication can help. In this article, I will explain how VMware data deduplication works, what its benefits are, and how to enable it on a vSAN cluster.
When backing up VMware ESXi VMs, if you have limited storage space or simply want to cut VM storage costs, VMware data deduplication is worth knowing about.
Data deduplication is a simple and practical space efficiency feature. In VMware environments, it is often used in combination with data compression. Deduplication removes redundant data blocks, while compression shrinks the redundant data that remains inside each block. Both reduce the amount of physical storage required to hold the same data.
In this article, I will explain how VMware data deduplication works and give the detailed steps for enabling deduplication and compression on vSAN.
VMware data deduplication is not a function of ESXi itself. It is provided by vSAN, a software-defined storage component that is fully integrated with vSphere. This means that you must have a valid vSAN license to enable deduplication and compression on your cluster.
vSAN is a two-tier distributed storage system made up of a cache tier and a capacity tier. To improve storage performance, active VM data is first written to the write buffer, and write acknowledgments are sent to the guest immediately. Once the data is no longer active, the cold data is destaged to the capacity tier at a time and frequency determined by vSAN.
VMware vSAN deduplication occurs when cold data is destaged to the capacity tier, after the write acknowledgments have been sent to the VM. It is only available on all-flash disk groups with on-disk format version 3.0 or later. Once deduplication and compression are enabled on a vSAN cluster, the algorithm uses a fixed 4K block size to detect and remove redundant copies of data blocks within each disk group. Redundant blocks that sit in different disk groups are not deduplicated against each other.
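To make the per-disk-group scope concrete, here is a minimal Python sketch of fixed-block deduplication. It is only an illustration, not vSAN's implementation: the function names are hypothetical, and SHA-256 is an assumed stand-in for whatever fingerprint vSAN uses internally. Each disk group is modeled as a byte string that is split into 4K blocks and deduplicated on its own.

```python
import hashlib

BLOCK_SIZE = 4096  # fixed 4K block size, as described above


def dedupe_disk_group(data: bytes) -> dict[str, bytes]:
    """Return the unique 4K blocks of one disk group, keyed by content hash.

    SHA-256 is an assumption made for this sketch only.
    """
    store: dict[str, bytes] = {}
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        # Keep only the first copy of each distinct block.
        store.setdefault(digest, block)
    return store


def stored_blocks(disk_groups: list[bytes]) -> int:
    """Count the blocks kept when every disk group is deduplicated separately."""
    return sum(len(dedupe_disk_group(group)) for group in disk_groups)


block_a = b"A" * BLOCK_SIZE
block_b = b"B" * BLOCK_SIZE

# Three copies of block A inside one disk group collapse to one stored block.
same_group = stored_blocks([block_a * 3 + block_b])
# Block A placed in two different disk groups is stored once per group.
split_groups = stored_blocks([block_a + block_b, block_a])

print(same_group, split_groups)  # prints "2 3"
```

In the first case, four written blocks shrink to two. In the second case, three written blocks stay at three, because the duplicate of block A lives in another disk group.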
Before you start, you should know that enabling VMware data deduplication requires a rolling reformat of all disks in the vSAN cluster. It is therefore best to enable it when the cluster is first set up, because migrating the data off each disk, reformatting it, and moving the data back takes time.
However, if you want to enable vSAN deduplication on a cluster that already holds live data, be sure to back up your VMs in advance to guard against data loss. With the native VMware backup solution, you can only back up one VM on the cluster at a time. To back up all VMs on a cluster quickly before enabling VMware data deduplication, a dedicated backup tool may serve you better.
Note: This change requires a rolling reformat of all disks in the vSAN cluster, so please back up the VMs in advance.
1. Launch the vSphere Web Client and navigate to the vSAN cluster.
2. Go to the Configure page and click General in the left-hand menu.
3. Click Edit next to vSAN is Turned On.
4. Check the Deduplication and Compression option under Services.
5. Click OK to save the changes. You can follow the progress in Recent Tasks.
Tip: After enabling VMware data deduplication, you can navigate to vSAN Cluster > Monitor > Capacity > Deduplication and Compression Overview to check the vSAN Deduplication Ratio and Savings status.
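As a rough guide to reading those numbers, the short Python sketch below derives a ratio and a savings figure from two capacity values. It assumes the ratio is the space used before deduplication and compression divided by the space used after, and that savings is the difference between the two. The function name and the sample figures are hypothetical, not values taken from vSAN.

```python
def dedup_stats(used_before_gb: float, used_after_gb: float) -> tuple[float, float]:
    """Return (ratio, savings_gb) from capacity used before and after.

    Assumes ratio = before / after and savings = before - after.
    """
    if used_after_gb <= 0:
        raise ValueError("used_after_gb must be positive")
    ratio = used_before_gb / used_after_gb
    savings_gb = used_before_gb - used_after_gb
    return ratio, savings_gb


# Hypothetical example: 1000 GB of data occupying 400 GB on disk.
ratio, savings_gb = dedup_stats(1000.0, 400.0)
print(f"{ratio:.2f}x, {savings_gb:.0f} GB saved")  # prints "2.50x, 600 GB saved"
```

A ratio close to 1.00x means the data has few duplicate or compressible blocks, so the feature is saving little space on that cluster.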
The best-known benefit of VMware data deduplication is that it saves storage space efficiently without deleting important VM data.
VMware data deduplication is a vSAN feature used in combination with data compression. In this article, I explained how VMware data deduplication works and how to enable it on a vSAN cluster.
However, enabling deduplication and compression on vSAN requires a rolling reformat of all disks in the cluster. It is therefore better to follow the golden 3-2-1 backup rule and back up the VMs on the cluster in advance to protect your VM data from accidents.