Analyzing performance test results is where the REAL fun begins. It’s also the most challenging phase, and where I see an opportunity to mentor and broaden these much-needed analytical skills within the performance engineering industry. Remember my prior post: Analyzing is a Human Process. Success during the analysis phase isn’t about how well you know a tool; it’s interpreting the data with proven analytical methodologies and processes that reveals scalability limitations and isolates bottlenecks. Take my course!
Often, we as performance engineers face vast amounts of data collected across multiple test executions. To make our lives easier, collect only the relevant monitored KPI metrics to reduce the sheer volume of data. The more relevant the KPIs, the clearer the performance story. That said, it’s better to initially hunt down and include too many KPIs than too few, so look under every rock. See my prior post: The Hunt for KPIs. Then, once these KPIs are graphed, if any KPI’s values are consistently flatlined, meaning there is no deviation during a test as the workload increases or decreases, you can start eliminating those irrelevant KPIs from your harness, thereby reducing the amount of data.
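The flatline pruning described above can be sketched in a few lines. This is a hypothetical illustration, not tied to any specific monitoring tool; the KPI names, sample values, and the 1% tolerance threshold are all made up for the example.

```python
# Hypothetical sketch: flag KPI series that never deviate as load changes,
# so they can be pruned from the harness. Names/values are illustrative.
from statistics import mean, stdev

def is_flatlined(samples, rel_tolerance=0.01):
    """A KPI is 'flatlined' if its values barely deviate from their mean."""
    if len(samples) < 2:
        return True  # too few points to show any trend
    avg = mean(samples)
    if avg == 0:
        return all(v == 0 for v in samples)
    return stdev(samples) / abs(avg) < rel_tolerance

kpis = {
    "web_requests_per_sec": [120, 180, 240, 310, 390],  # trends with load
    "db_license_count":     [50, 50, 50, 50, 50],       # never moves
}
irrelevant = [name for name, vals in kpis.items() if is_flatlined(vals)]
# irrelevant -> ["db_license_count"]
```

A relative (rather than absolute) tolerance keeps the check meaningful whether the KPI is measured in percent, requests per second, or thread counts.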
Ok, it’s time to prove your KPIs’ worth.
Spin up a realistic load test using both the user and engineering KPI scripts. This test is not intended to reach target or peak loads; it just needs to represent realistic user activity, throughput, and behaviors. You can even aim for a fraction of the target load, since this is not a goal-oriented performance test. Set up a slow-ramping test. Important! Ramp up gradually, allowing for at least 3 KPI metric captures per load level. For example, if the KPI collection interval is set at every 15 seconds, create a schedule that adds one user every minute so you have at least 3 collected metric values per step. It’s very important to have at least 3 values!
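The ramp arithmetic above can be made explicit. This is a minimal sketch under the stated assumptions (a fixed collection interval, one user added per step); the function names are my own, not from any load tool.

```python
# Hypothetical sketch of the ramp-up arithmetic: given the monitoring tool's
# KPI collection interval, hold each load level long enough that at least
# three metric values are captured before the next user is added.
def step_duration(collection_interval_s, min_samples=3):
    """Seconds to hold each load level so min_samples captures occur."""
    return collection_interval_s * min_samples

def ramp_schedule(target_users, collection_interval_s, min_samples=3):
    """(user_count, hold_seconds) pairs for a slow, one-user-at-a-time ramp."""
    hold = step_duration(collection_interval_s, min_samples)
    return [(users, hold) for users in range(1, target_users + 1)]

# A 15 s collection interval means holding each step at least 45 s; the
# one-user-per-minute schedule in the text comfortably clears that bar.
schedule = ramp_schedule(target_users=5, collection_interval_s=15)
# schedule -> [(1, 45), (2, 45), (3, 45), (4, 45), (5, 45)]
```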
Graph out ALL of your Gold KPIs. Let’s use methodical patience here; the information you collect from this test is worth its weight in gold. Review every graphed KPI and determine whether its values have a direct or inverse relationship to the TPS/workload reported by the load tool. Also, if your tool has this capability, build some “breaks” into the test schedule where the user load returns to 0. Exercise the application to validate that your KPIs trend with the workload. If a KPI doesn’t budge or doesn’t make sense, it gets tossed out. Plain and simple! Clear the clutter (more on this later) to make your analysis job much easier.
Remember, the metered value of the KPI does not matter right now. Don’t try to analyze how much of a resource is being used or contemplate possible resource saturations. At this point, it’s like a broken scale where the initial weight is off but the measurement still works: you still know whether you have gained or lost. For example, as the load increases, you should see an increase in the web server’s requests per second, a dip in the web server machine’s CPU idle, an increase in the app server’s active sessions, a decrease in free worker threads, a decrease in the app server’s OS CPU idle, a decrease in free DB thread pool connections, an increase in the DB’s queries per second, a decrease in the DB machine’s CPU idle, and so on. You get the picture!
In summary, if a KPI does not trend with the workload, toss it from your harness and don’t waste valuable time collecting or analyzing it.