Introduction
For the first time in its history, the VMworld conference took place online, in September-October 2020. Many intriguing announcements were made, the most impressive being the product line extension and the new features for containerized apps and cloud infrastructure automation.
Today we are interested in cloud automation, namely the VMware vRealize AI Cloud solution, designed for automatic performance optimization as part of the Self-Driving Data Center concept (don’t confuse this with the Software-Defined Data Center concept).
Let’s Get to the Details!
Vendors of various IT platforms have been promoting this idea for quite a while. The primary point is that the solutions inside a data center of the future (software AND hardware) are supposed to monitor environment metrics on their own and change the configuration whenever necessary to optimize overall infrastructure performance.
VMware also attaches importance to another IT trend: the development of hybrid infrastructures for large enterprises. Unifying management tools and technical tools is essential in a hybrid environment, and that’s something VMware has been working on for a long time now (for example, with Cloud Director 10.2).
Long story short, in a hybrid environment, every single on-premises solution should have its cloud counterpart, and at the same time there should be purely cloud-based tools that optimize infrastructure as promised by the vendor. One of those tools is vRealize AI Cloud:
You can get your hands on vRealize AI Cloud bundled with vRealize Operations Cloud as part of the vRealize Cloud Universal subscription. It applies machine learning algorithms to adapt to continually changing workloads and improve performance (for example, by optimizing storage KPIs such as throughput and latency).
As of now, this technology is only compatible with vSAN storage, but I can see no reason why there shouldn’t be similar solutions for other cloud platforms in the future.
As you can see for yourself, vRealize AI Cloud continuously generates and applies different performance optimization settings. Not only that, but it also gives an admin the tools to monitor both the changes made and the improvements achieved.
The vRealize AI Cloud console addresses four types of tasks:
- Performance optimization
- Capacity optimization
- Troubleshooting
- Configuration Management
If we move to the virtual data center level, we can see some clusters with optimization already activated (green-blue) and others with optimization disabled. In both cases, you can see the quantitative metrics and how they can be improved:
We can go even deeper, to the cluster level (by selecting the respective point within the perimeter), and see the individual ESXi hosts that could use some performance optimization:
In particular, we can see the optimization stream (upper line) and its essential parameters (to the right), such as throughput and latency:
Once you get into the optimization details, it isn’t hard to find out what has been changed and when. In this case, it was the cache size: AI Cloud predicted that decreasing the cache size would reduce write latency by 25%:
Of course, such predictions may turn out to be wrong. If that’s the case, and the KPIs have not improved, AI Cloud will simply roll everything back to the way it was.
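The apply, measure, and roll back behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `try_setting`, the `cache_size_gb` key, and the toy latency model are hypothetical names made up for this example, not vRealize AI Cloud's actual algorithm or API.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class TrialResult:
    kept: bool                  # True if the candidate value stayed in place
    baseline_latency_ms: float  # latency measured before the change
    trial_latency_ms: float     # latency measured with the candidate applied


def try_setting(config: Dict[str, int], key: str, candidate: int,
                measure_latency_ms: Callable[[Dict[str, int]], float]) -> TrialResult:
    """Apply one candidate value; keep it only if latency improves, else roll back."""
    baseline = measure_latency_ms(config)
    previous = config[key]
    config[key] = candidate            # apply the suggested change
    trial = measure_latency_ms(config)
    if trial < baseline:
        return TrialResult(True, baseline, trial)
    config[key] = previous             # roll back: the KPI did not improve
    return TrialResult(False, baseline, trial)


def toy_latency_ms(config: Dict[str, int]) -> float:
    """Made-up stand-in for a real measurement: latency is lowest at 64 GB."""
    return 2.0 + abs(config["cache_size_gb"] - 64) * 0.05


config = {"cache_size_gb": 128}
good = try_setting(config, "cache_size_gb", 96, toy_latency_ms)   # improves latency
bad = try_setting(config, "cache_size_gb", 160, toy_latency_ms)   # makes it worse
print(good.kept, bad.kept, config)
```

In this toy run the first change is kept and the second one is reverted, so the configuration ends up at the last value that actually helped. A real system would of course measure over a time window and account for noise before deciding.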
By monitoring AI Cloud’s actions, we can see a detailed timeline of the changes made and a performance graph for each of the selected hosts:
If AI Cloud isn’t active in the cluster, you can still calculate the potential optimization. These numbers are impressive, I can guarantee that, so it does make sense to at least turn it on and see what it can do:
When you’re there, you get to choose how aggressive your optimization approach should be:
Conservative mode leaves a large share of resources available while improving performance according to the optimization recommendations, while Aggressive mode isn’t so, let’s just say, self-limited. As usual, start with Conservative mode and ramp it up gradually. After AI Cloud is activated, the system will spend some time learning workload patterns. Only after that can it start optimizing performance.
On average, VMware test runs have shown that AI Cloud can improve vSAN storage performance by 60%. In a test of vRealize AI Cloud on a 4-node vSAN cluster, vSAN write throughput increased by 18% and read latency dropped by 40% to 84%.
Moreover, AI Cloud is integrated with Storage Policy Based Management (SPBM). These policies and their settings do affect performance. For example, disabling deduplication can substantially increase performance because it lowers CPU load:
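To make the percentages above concrete, here is a small sketch of how a relative KPI change is computed from a before/after pair. The measurement values are hypothetical, chosen only to reproduce an 18% throughput gain and a 40% latency reduction; they are not VMware's test data.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent (positive = increase)."""
    if before == 0:
        raise ValueError("baseline must be non-zero")
    return (after - before) / before * 100.0


# Hypothetical (before, after) measurements, for illustration only.
write_throughput_mbps = (500.0, 590.0)
read_latency_ms = (5.0, 3.0)

throughput_gain = percent_change(*write_throughput_mbps)
latency_change = percent_change(*read_latency_ms)

print(f"write throughput: {throughput_gain:+.0f}%")
print(f"read latency: {latency_change:+.0f}%")
```

Note the sign convention: for throughput a positive change is good, while for latency the improvement shows up as a negative change.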
Conclusions
All in all, vRealize AI Cloud is a definite step forward in the development of the Self-Driving Data Center concept. Let’s all hope that it will become available for VMware Cloud infrastructures soon.
Also, at VMworld Online 2020, VMware showed exactly what the vRealize AI Cloud solution would look like:
P. S. You can find a lot more interesting details here. Just don’t forget to search by session code:
- ETML1760 – VMware vRealize AI and the ML Drivers of the Self-Driving Data Center
- HCMB1761 – Transform your HCI Datacenter Operations with vRealize Operations Cloud and vRealize AI Cloud
- HCMB2357 – Executive Session #1: The Cloud Management of Tomorrow, as Seen From the CTO’s Office
- HCMB1311 – Executive Session #2: Two steps Ahead of the Future, the VMware Cloud Management Roadmap