Proxmox VE is a fantastic alternative to VMware, and with StarWind Virtual SAN (VSAN), you can build a highly available (HA) cluster with just two nodes.
Intro
The recent changes in Broadcom-VMware licensing (https://www.starwindsoftware.com/blog/changes-to-vmware-licensing-policy-at-the-end-of-2023) have left many small and medium businesses searching for alternatives. As discussed in one of our recent webinars (https://www.starwindsoftware.com/resource-library/proxmox-ve-vs-vmware-vsphere/), Proxmox Virtual Environment (VE) stands out as a primary option for such customers. Today, I’ll demonstrate how to build a highly available Proxmox VE cluster using StarWind VSAN with just 2 storage nodes.
Why not Ceph?
Proxmox includes Ceph as a software-defined storage solution. Ceph is open-source and offers a rich set of features, so why not use it? Well, you can! However, small and medium businesses often don’t require more than two cluster nodes, and at this scale many of Ceph’s standout features are limited: workload parallelization, resiliency, self-healing, and high availability.
Additionally, small businesses may not have a dedicated team of experienced admins to configure, optimize, and maintain Ceph effectively. While Ceph is ideal for large-scale deployments, it often proves unsuitable for resource-constrained and budget-conscious environments.
StarWind is the way to go!
For these reasons, we’ll use StarWind Virtual SAN to configure a minimalist Proxmox hyperconverged (HCI) cluster.
Quick disclaimer: The minimal requirement for a highly available (HA) Proxmox VE cluster is actually 3 nodes, because with only two votes the loss of either node also means the loss of quorum. Proxmox solves this with the Corosync Quorum Device (QDevice): the corosync-qdevice daemon runs on each Proxmox node and obtains an extra vote from an external arbiter, so quorum is maintained in case of a node failure. The good news is that only 2 of these nodes need to be storage nodes. The third, witness node can be a diskless server running only Proxmox VE. We recommend making this third node your backup server to avoid wasting rack space and resources.
Node Specs
Here are the node specifications:
- DELL PowerEdge R640
- CPU: 2 x Intel Xeon Silver 4110, 8 cores, 2.1 GHz
- RAM: 128 GB (8 x 16 GB)
- RAID controller: PERC H740P
- Storage: 2 x 960 GB Intel S4610 SATA SSD, RAID1 on the PERC H740P
- Network card 1: Intel i350, 4 x 1 GbE, rNDC
- Network card 2: Intel X550, 2 x 10 GbE
The QDevice is deployed on an old Intel NUC NUC6i5SYH; detailed specifications are available here – https://www.starwindsoftware.com/blog/choosing-ideal-mini-server-for-a-home-lab
Solution Diagram
Here’s how the final solution looks:
The 10 GbE ports of the X550 cards are directly connected between the nodes with CAT6a cables and are used for storage traffic only. The 1 GbE ports of the Proxmox nodes and the QDevice node are connected to a MikroTik switch.
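Once Proxmox VE is installed (next section), this topology translates into a couple of static stanzas in /etc/network/interfaces on each node. Below is a minimal sketch for node 1; the interface names, the 172.16.x.x subnets, and jumbo frames are assumptions from my lab, not requirements.

```
# Append the storage-network stanzas (adjust interface names and addresses):
cat >> /etc/network/interfaces <<'EOF'

auto ens1f0
iface ens1f0 inet static
        address 172.16.10.1/24
        mtu 9000
#first direct 10 GbE link - iSCSI/data traffic

auto ens1f1
iface ens1f1 inet static
        address 172.16.20.1/24
        mtu 9000
#second direct 10 GbE link - StarWind synchronization traffic
EOF

# Apply without a reboot (ifupdown2 is the default in current Proxmox VE):
ifreload -a
```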
Install Proxmox VE
Installing Proxmox VE is pretty simple, and there are plenty of resources on how to do it, so I won’t walk through it here. Instead, here’s a very straightforward guide: https://www.linuxtechi.com/install-proxmox-ve-on-bare-metal/ or a YouTube video: https://www.youtube.com/watch?v=UzNHno-viRk
Create Proxmox Cluster
The cluster should be created right after deployment, before any other configuration, because Proxmox VE cannot join a node to a cluster if any virtual machines are already running on it. The creation procedure is easy and consists of three steps (a CLI equivalent is shown after the list):
1. Create a cluster on one of the nodes.
2. Obtain the necessary information to join the cluster.
3. Join the cluster with the second node.
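If you prefer the shell over the web UI, the same three steps boil down to two pvecm commands plus a check. The cluster name and IP address below are placeholders.

```
# On the first node - create the cluster:
pvecm create proxmox-ha

# On the second node - join it using the first node's IP
# (you'll be prompted for the first node's root password):
pvecm add 192.168.1.11

# Verify that both nodes are members and the cluster is quorate:
pvecm status
```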
Proxmox Cluster Configuration and StarWind VSAN deployment
Now that the cluster is created, we continue with Proxmox VE configuration, StarWind VSAN deployment, and providing the cluster with highly available shared storage. The following needs to be done:
1. Configure QDevice for Cluster Quorum
2. Configure Storage Networks on Hosts
3. Deploy StarWind VSAN CVM
4. Configure HA Networking in StarWind VSAN CVM
5. Create Storage Pool in StarWind VSAN CVM
6. Create HA LUN
7. Connect HA LUNs to Proxmox Nodes
8. Create LVM on HA iSCSI
9. Add LVM to Proxmox Cluster
All these steps are fully covered in our guide – https://www.starwindsoftware.com/resource-library/starwind-virtual-san-vsan-configuration-guide-for-proxmox-virtual-environment-ve-kvm-vsan-deployed-as-a-controller-virtual-machine-cvm-using-web-ui/ – so I’ll skip them here to keep this article readable.
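For reference, step 1 (the QDevice) is a short exercise on the command line. This is only a sketch; the witness IP is a placeholder, and the guide covers the full procedure.

```
# On the witness node (the Intel NUC):
apt install corosync-qnetd

# On both Proxmox storage nodes:
apt install corosync-qdevice

# On one of the cluster nodes - register the QDevice:
pvecm qdevice setup 192.168.1.50

# The extra vote should now appear in the quorum information:
pvecm status
```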
As a result, I’ve got shared storage named “ha-storage”, which is replicated in real time and connected to both hosts.
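To give you an idea of what steps 7–9 look like under the hood, here’s a simplified sketch of how the HA LUN ends up as the “ha-storage” pool. The portal IPs, device path, and volume group name are placeholders from my lab, and the guide additionally configures multipath, which I omit here.

```
# On both nodes - discover and log in to the StarWind HA LUN over both links:
iscsiadm -m discovery -t sendtargets -p 172.16.10.10:3260
iscsiadm -m discovery -t sendtargets -p 172.16.20.10:3260
iscsiadm -m node --login

# On one node only - create an LVM volume group on the iSCSI disk:
pvcreate /dev/sdX
vgcreate vg-ha /dev/sdX

# Register it cluster-wide as shared LVM storage named "ha-storage":
pvesm add lvm ha-storage --vgname vg-ha --shared 1 --content images,rootdir
```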
Deploy VMs and Enable Proxmox HA
Now our cluster is ready for production usage. I’m going to deploy a few Linux machines. Please keep in mind that the shared storage should be used as the placement location for the virtual disks, in my case “ha-storage”. I’ve created 4 VMs named “prod-vm” and split them between the nodes.
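For illustration, here’s roughly how one of these VMs can be created from the shell; the VM ID, sizes, bridge, and ISO name are placeholders. The important part is that the virtual disk is allocated on “ha-storage”.

```
qm create 101 --name prod-vm-01 --memory 4096 --cores 2 \
  --net0 virtio,bridge=vmbr0 \
  --scsihw virtio-scsi-pci \
  --scsi0 ha-storage:32 \
  --ide2 local:iso/ubuntu-22.04-live-server-amd64.iso,media=cdrom \
  --boot 'order=scsi0;ide2'
qm start 101
```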
If we simulate a node failure now, no failover will occur, even though the VMs reside on shared, highly available storage. One final configuration step is needed in the Proxmox cluster to enable high availability for the VMs: [Datacenter] > [HA]. We need to add only the production VMs, not the StarWind VSAN VMs. My final configuration looks like this:
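The same can be done from the shell with ha-manager; the VM IDs below are placeholders, and note that the StarWind CVMs are deliberately left out.

```
# Add each production VM as an HA resource and keep it in the started state:
for vmid in 101 102 103 104; do
    ha-manager add "vm:${vmid}" --state started
done

# Review the resulting HA configuration:
ha-manager status
```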
Testing Failover
The last thing to do is test what happens when a node fails. DELL iDRAC (IPMI) is used to ungracefully power off one of the Proxmox nodes to simulate a hardware failure. The VMs started booting approximately 60 seconds after the node failure, and all 4 VMs failed over successfully. I also checked the cluster health with the console command “pvecm status”. It shows that only one node is running, but the cluster quorum persists thanks to the QDevice.
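For reference, these are the two commands I keep an eye on from the surviving node during the test:

```
pvecm status        # quorum should persist thanks to the QDevice vote
ha-manager status   # shows the HA resources being recovered on the live node
```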
StarWind VSAN resynchronizes the storage automatically once the failed node recovers.
Conclusion
Awesome! Everything worked smoothly. Though it wasn’t as easy as with vSphere, the final result is a lightweight, super-efficient HA HCI cluster built on just two storage nodes plus a witness, and it’s even leaner than a VMware vSphere setup on the same hardware.