Vivek Venkatachalam, Shirley Tan, Marcelo Nery dos Santos, Microsoft
A key indicator of the quality of a software product is its responsiveness and performance, since these let users get tasks done quickly and efficiently. Test teams realize this and therefore typically set aside time and resources for performance testing. Unfortunately, the effort inevitably runs into the dreaded “high unexplainable variability in performance test results”. Because this is hard to fix, a typical workaround is to raise the acceptable loss threshold so that only large performance losses trigger corrective action. However, this causes a “death-by-a-thousand-cuts” effect where numerous small performance losses (that are all real) are allowed into the product. By the time the product is ready to ship, these small performance losses have accumulated and the product exhibits poor performance with no easy fix in sight.
How does one get close to the ideal of a performance test system that can reliably detect even small performance losses on a build-over-build basis while minimizing false positives? This paper describes our attempt to tackle this problem while running performance tests for Microsoft Lync.
The bulk of the variation in performance results when testing a distributed system is typically due to network traffic variations and varying load on the servers. Our approach to eliminating this variability was to design a system where the performance test runs on a virtualized distributed system, which we call a Perf Cell. This is essentially a single physical machine that houses all the individual components on separate virtual machines connected via a virtual network and otherwise isolated from all external networks. In the paper, we present our design and implementation as well as results indicating that variability is indeed reduced. We believe this paper would be useful reading for engineers responsible for designing, implementing or running a performance engineering system.