VMware View Planner is a software solution (appliance) that has been around for quite some time, but doesn’t get a great deal of publicity or traction. Sure, if you follow VMware blogs around performance testing of VSAN or reference architectures, View Planner gets a mention. In essence, it lets you run simulated workloads to perform all sorts of benchmark tests against your environment. Once you have the testing infrastructure up and running, you clone however many virtual machines you want to test with and drive the hosts or storage system with that load. Bear in mind that while the workload is fairly similar to a real user, every environment is different, with different use cases, applications and working patterns; no simulated tool could ever represent a real-world, like-for-like workload. That said, View Planner does an excellent job, is used for a number of VMware’s reference architecture tests, and is based on real-world testing\results from their labs.
The point of this post is to document the various prerequisites involved in deploying this in the field, alongside capturing some of the gotchas that might crop up during setup or testing.
Note: For those not familiar with View Planner, the following is pulled from VMware docs to provide some context around the different terminology used throughout the post, although I’ve created my own notes from VMworld 2013 and 2014 sessions here and here. These notes cover the importance and benefits of stress testing.
View Planner Overview
View Planner is a VDI workload generator that automates and measures a typical office user’s activity. The automated applications include Microsoft Office, PDF browsing and video playback, and the operations performed on them include opening files, browsing the web, modifying files, saving, closing, etc. Each View Planner run executes these operations in an iterative fashion, and each iteration is a randomly sequenced workload of these applications and operations. The results of a run consist of the operational latencies collected for these applications/operations across all iterations.
View Planner consists of the following components:
- A number of desktop virtual machines running on one or more ESX hosts.
- A number of client virtual machines running on one or more ESX hosts. These are used for remote-mode and passive-mode runs; they are not used for local-mode runs.
- A single controller appliance running on an ESX host.
- Remote mode
In this mode there is a remote client virtual machine for each desktop virtual machine. The client controls the applications running in the desktop virtual machine and views the desktop. This mode requires the most hardware, but is also the most representative of real-world VDI deployments.
- Passive mode
In this mode the number of client virtual machines can be less than the number of desktop virtual machines. The desktop controls the applications running in the desktop virtual machine; the client is a passive viewer. This intermediate mode can use less hardware than the remote mode, but can be more representative of real-world VDI deployments than the local mode.
- Local mode
This mode uses no client virtual machines. The runs are initiated and run entirely on the desktop virtual machines. Because this mode doesn’t generate the network traffic of a real-world VDI deployment, it is less representative of such deployments than the other two View Planner modes. However, it uses less hardware than either of the other modes to run the same number of desktop virtual machines.
The standardized View Planner workload mix consists of nine applications running in the desktop virtual machines and performing a combined total of 44 user operations.
For response time characterisation, View Planner operations are divided into three main groups:
(1) Group A for interactive operations
(2) Group B for I/O operations
(3) Group C for background operations.
The score is determined separately for Group A user operations and Group B user operations, by calculating the 95th percentile latency of all the operations in a group. The default thresholds are 1.0 seconds for Group A, and 6.0 seconds for Group B.
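The scoring rule above can be sketched in a few lines of Python. This is purely illustrative: the function names and latency values are my own, and I’ve used a simple nearest-rank convention for the 95th percentile (View Planner’s exact interpolation method isn’t documented here).

```python
# Sketch of the Group A/B scoring rule: take the 95th percentile of all
# operation latencies in a group and compare it to the group's threshold.

def percentile_95(latencies):
    """Nearest-rank 95th percentile of a list of latencies (seconds)."""
    ordered = sorted(latencies)
    rank = max(0, int(round(0.95 * len(ordered))) - 1)
    return ordered[rank]

def group_passes(latencies, threshold_secs):
    """True if the group's 95th percentile is at or below its threshold."""
    return percentile_95(latencies) <= threshold_secs

# Illustrative latency samples, in seconds
group_a = [0.4, 0.6, 0.5, 0.8, 0.7, 0.9, 0.3, 0.5]   # interactive operations
group_b = [2.1, 3.5, 4.0, 5.2, 2.8, 3.9, 4.8, 5.5]   # I/O operations

print(group_passes(group_a, 1.0))  # Group A default threshold: 1.0 s
print(group_passes(group_b, 6.0))  # Group B default threshold: 6.0 s
```

With these sample latencies both groups sit under their thresholds, so both checks print True.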
During a View Planner run each desktop virtual machine performs a user-specified number of separate iterations (though benchmark runs must have five iterations). These iterations are divided into three phases:
- Ramp-Up (first iteration)
- Steady-State (the total number of iterations minus two)
- Ramp-Down (last iteration)
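The phase split above is mechanical enough to express as a tiny helper, which makes the "total minus two" arithmetic explicit (the function name is my own, for illustration only):

```python
# First iteration is Ramp-Up, last is Ramp-Down, everything in between
# is Steady-State (i.e. the total number of iterations minus two).

def split_phases(total_iterations):
    if total_iterations < 3:
        raise ValueError("need at least 3 iterations to have a steady state")
    return {
        "ramp_up": [1],
        "steady_state": list(range(2, total_iterations)),
        "ramp_down": [total_iterations],
    }

# A compliant benchmark run uses five iterations, so the steady-state
# phase covers iterations 2-4:
print(split_phases(5))
```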
During each iteration, View Planner reports the latencies for each operation performed within each virtual machine.
When View Planner is used as a benchmark, the configuration of the workload virtual machines, the workload mix, and the versions of the View Planner controller appliance, operating systems, tools, and all other software used must conform to the specifications in the View Planner documentation.
To be used to generate a VDImark benchmark score, the 95th percentile of the Group A QoS results, and the 95th percentile of the Group B QoS results during the three iterations in the steady-state phase of a View Planner run, must each be at or below the default thresholds.
The View Planner VDImark benchmark score is the number of concurrent users (that is, the number of View Planner desktop virtual machines) participating in a compliant run.
The following outlines the intended goals and objectives of this particular stress test:-
- Validate performance of hardware in use, including Cisco UCS blade servers and Nimble storage
- Validate configuration in use, to ensure required stability and performance is met.
- Find the ‘sweet spot’ of consolidation in terms of number of virtual machines per host\cluster, before performance (user experience) begins to degrade
- Define at what point, user experience from the virtual desktop begins to degrade
- Scale testing up to and beyond 200 anticipated concurrent users, to identify future scalability and performance bottlenecks.
A number of items and configuration steps are required to run successful tests. Here is the list I created and modified, with additional items and checks beyond those found in the official docs. I’ve extracted this table from Excel, as it presents quite nicely.
Note: For Office 2010, we were able to use an evaluation key which we activated, instead of a Volume license key like MAK or KMS.
I’ll document this for interest, in case anyone is wondering about the infrastructure the testing was carried out against. I won’t present the results or findings, as they’re beyond the scope of this post. However, the tests were successful and the objectives were met.
- System: x2 Cisco UCS Blade Servers
- CPU: x2 CPU @ 2.799GHz (20 cores total)
- BIOS settings configured optimally for VDI environment, as per online Cisco documentation
- RAM: 256GB RAM
- x2 10GB NIC
- Platform: vCenter and ESXi 5.5
- ESXi Power Management – High Performance
- Storage: Nimble CS220G array
- x2 10GB NIC
- Firmware – One version down from the latest
- vSphere Integration Storage – Nimble PSP Directed path policy
- VAAI Hardware Acceleration – Supported
To use View Planner as a benchmark, either Passive or Remote mode is required. These modes depend on Active Directory in order to create the necessary user accounts that log in from the client machine(s) to the virtual desktop. Due to security restrictions and segmentation of the network at the customer site, it was not possible to use Active Directory. Typically a non-production (test) Active Directory could be used instead; however, this was also unavailable, so the decision was made to run all tests in ‘local’ mode, which does not depend on Active Directory.
As a result of choosing local mode, several compliance checks failed during the runs (for example, no PCoIP protocol and no remote mode detected), so a benchmark result and VDImark could not be obtained; using local mode rules this out from the start. Instead, the application response times (latencies) were measured, with Group A and Group B operations monitored to ensure the thresholds were not exceeded as the workload and testing scaled against the infrastructure. This was important to confirm that, as the workload increased, application latencies and the user experience remained within the thresholds.
In addition to the above, other metrics were monitored using vCenter charts, ESXTOP and Nimble’s storage monitoring, to validate performance and identify if bottlenecks were present.
Mode – All tests were run in ‘local’ mode as detailed above in Constraints. With ‘local’ mode, the virtual desktops are run against the infrastructure, and no client connectivity is utilised or tested, therefore PCoIP and network traffic cannot be simulated or measured with local mode.
Thinktime is the time the process waits between operations. Typically, a 5 second thinktime reflects a very active (power) user, whilst a thinktime of around 20 seconds reflects a lighter user or task worker.
A thinktime of 5 seconds was selected for these tests, to simulate a very active user and put more pressure and workload onto the infrastructure. Given the workload and thinktime, you could think of this testing as representing a typical medium\heavy worker.
Iterations – During a View Planner run each desktop virtual machine performs a user-specified number of separate iterations, each of which runs through the full 44 user operations. You can choose however many iterations you want; typically, tests using 15 iterations took around 6-7 hours. To extend the period of testing and provide a continuous workload on the infrastructure, set this number as high as you like. If you’ve plenty of time to test (2-4 days for example), set it to 100, or even 1000 🙂
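As a back-of-envelope check when planning longer runs, you can extrapolate from the observation that 15 iterations took roughly 6-7 hours, i.e. around 24-28 minutes per iteration in this environment (your per-iteration time will differ with thinktime, workload mix and hardware):

```python
# Rough run-length estimate: 15 iterations took ~6-7 hours here,
# so assume ~26 minutes per iteration as a midpoint.

def estimate_hours(iterations, minutes_per_iteration=26):
    return iterations * minutes_per_iteration / 60

print(round(estimate_hours(15), 1))   # ~6.5 hours, matching what we observed
print(round(estimate_hours(100), 1))  # ~43 hours, i.e. roughly a 2-day soak test
```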
If you happen to deploy View Planner and test in a secure environment with different security zones, you will need to open up various firewall ports. I couldn’t find any documentation on these, but used tools like netstat, along with monitoring logs from the firewalls, to work out which ports are needed. Below is a summary. This is not an exhaustive or official list, but it enabled my testing to complete successfully.
- View Planner > vCenter = HTTPS 443
- Account also required with privileges to power on\off VMs
- View Planner > Desktop VMs = Ping (ICMP)
- Desktop VMs > View Planner = Ping (ICMP)
- Desktop VMs > View Planner = 9200 (found this being blocked through firewall logs)
- Desktop VMs > View Planner = HTTP 80 and 8080
- Port 80 is required for access to the Apache server as part of the test runs
- Desktop VMs > DHCP\DNS services = Enabled
- Mgmt Station > View Planner = SSH 22
- Mgmt Station > View Planner = HTTP 80
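A quick way to sanity-check the TCP ports above before a run is a small reachability script, executed from the relevant source machine. The hostnames below are placeholders for your own environment, and the ICMP checks are left to plain `ping`, since raw sockets need root:

```python
# Quick-and-dirty reachability check for the TCP ports listed above.
# Replace the placeholder hostnames with your own vCenter/appliance addresses.
import socket

def tcp_port_open(host, port, timeout=3):
    """Attempt a TCP connection; False on refusal, timeout or DNS failure."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

checks = [
    ("vcenter.example.local", 443),       # View Planner -> vCenter (HTTPS)
    ("viewplanner.example.local", 80),    # Desktop VMs -> appliance (Apache)
    ("viewplanner.example.local", 8080),  # Desktop VMs -> appliance
    ("viewplanner.example.local", 9200),  # Desktop VMs -> appliance
    ("viewplanner.example.local", 22),    # Mgmt station -> appliance (SSH)
]

for host, port in checks:
    state = "open" if tcp_port_open(host, port) else "blocked/closed"
    print(f"{host}:{port} -> {state}")
```

Running this from each security zone in turn (desktop VM network, management station, etc.) quickly shows which firewall rules are still missing.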
View Planner Activity
A good place to monitor the activity and current status of View Planner (the harness) is the CLI of the appliance.
- SSH into View Planner IP address
- cd /root/ViewPlanner
- tail -f viewplanner.log
Benchmark vs Flexibility Mode
View Planner has two modes, which is a change from the previous 2.x versions. Today, only Benchmark mode is available to the public, and it doesn’t provide flexibility, such as the ability to define custom workloads or use Local mode with more than one virtual machine. The Partner version of View Planner provides this additional capability, which is very useful and, in my case, was needed (see the Constraints above). You may come across either of the following two error messages when attempting the following:-
- New (create) Workload profile
- Can’t edit workload profile in benchmark mode
- New Run profile, choosing more than 1 Virtual Machine, for example 5
- Error in Local mode, only single VM run is supported in benchmark mode.
To resolve either of the issues, the following is required:-
- Download View Planner from VMware Partner Central. There is not much in the description to distinguish this from the public version, apart from the build number I guess.
- Even if you have downloaded the correct appliance from Partner Central, deployed it using the OVF wizard, networked the appliance via that wizard, and can access View Planner through the management GUI, you still need to run the following command (which includes setting the network again):
- Log into the View Planner appliance via SSH
- Change path to:-
- cd /root/ViewPlanner
- Set the path for python
- source setup.sh
- python ./harness_setup.pyc -i <ip> -m <mask> -g <gateway> -d <domain> -n <dns>
This command will complete the setup for View Planner and install additional files, which also activates the flexibility mode.
Optional Configuration and Settings
For example, you can increase the number of concurrent Power-Ons and Workload Starts if running a large number of VMs:
- Amend the adminops.cfg file
- Located in /root/ViewPlanner
Extended Reporting (Compliance file)
You have successfully run a workload and begin to view the PDF output report. However, you see the following error highlighted below:-
Please check Compliance file?
Where is this file? You have to run an additional command on the appliance, which will produce a more verbose set of reports and log files, including a compliance report. Be sure to include the -b (additional validation checks to generate results) and -d (check for compliance file) switches as shown below.
- Log into the View Planner appliance using SSH
- Change paths to
- cd /root/ViewPlanner
- List out reports and runs available to show
- python ./report.pyc -a
- To produce output for a particular test, such as Benchmark1, run the following command
- python ./report.pyc -t Benchmark1 -m local -b -d -s “2014-09-26 11:11:16”
- Use a tool like WinSCP to extract the reporting bundle and view the files including compliance.txt file
Tip – You will likely need a decent log viewer, as trying to read these log files in Notepad is near impossible.
Just for Fun
Here’s a picture of the demand put on the system during the initial power-on and ramp-up stage! The hosts settled down quickly in terms of CPU usage afterwards. Still, I can’t say I’ve ever seen this graphic and % CPU before 🙂