What I’ve learned (so far) from being the Web and Mobile Performance/Load Testing Lead for a Professional services company …


 

By Rebecca G. Clinard

 

I am proud to be a performance lead within UTest’s Performance Engineering community. I’m working with the most seasoned and brightest performance engineering team in this niche industry, and their skills are seriously impressive. When they encounter a unique technical challenge, they get creative to solve it. Most of these engineers are very independent, but I encourage them to reach out and involve me and the other community members – not only because we all learn from their challenges, but because I want everyone to understand they are never alone with a problem. If you have ever been the “sole” performance engineer for a company (as I have been in several positions), you know how overwhelming and isolating it can be during challenging situations. It can feel like a pressure cooker as you work through a very complex issue. Just knowing we can pool our knowledge and brainstorm approaches helps to alleviate that pressure and allows one to solve problems more efficiently. No single individual can know Everything!

 

I wanted to share some of the valuable lessons I’ve learned in my Performance Lead position. Performance engineering is a very complicated and technical industry. My responsibilities often involve aligning project requirements with the right skill set, but I’ve learned far more from holding this position. Here are some of those insights.

 

Sometimes the client doesn’t necessarily know their own application’s technology, and that’s ok…

 


During the initial scoping of a project, we are often working with a management team. We need to vet the complexity of the project in order to create an accurate Statement of Work and prepare a quotation for services. We are very detailed in our questions, but the answers are not always accurate. I’ve found that it’s not that the client is intentionally misleading our team; it’s that they genuinely don’t understand, or aren’t aware of, the technology their web or mobile application was built upon. As you can imagine, it’s difficult to scope a project correctly based on answers that later turn out to be incorrect. For example, we will ask whether their application communicates using any protocol other than HTTP(S). The answer is a resounding “No.” We base our SOW on this answer and go about creating load scripts, only to find out that parts of their application use the WebSocket protocol (or some other network communication technology). We need to bring this technology to their attention and explain that a particular plugin or a different load tool is required to handle the protocol in order to emulate a realistic workload. SOWs and quotes then need to be revised. These situations happen quite frequently, and the best way to present the news to the client is to show that we have been thorough in our investigation and transparent in our information gathering. Ultimately, they understand we are bringing them more value as a service.
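
To make that concrete, here is a minimal sketch of what a WebSocket virtual user might look like, assuming Python and the third-party websockets package; the endpoint URL and message payloads are hypothetical placeholders, not the client’s actual application.

```python
# Minimal sketch of a WebSocket virtual user (assumes "pip install websockets").
# The URL and message payloads below are hypothetical placeholders.
import asyncio
import websockets

async def websocket_virtual_user(url: str) -> None:
    # Unlike a plain HTTP request/response script, the connection is upgraded
    # once and stays open, and the server can push messages at any time.
    async with websockets.connect(url) as ws:
        await ws.send('{"action": "subscribe", "channel": "orders"}')
        for _ in range(10):  # read a handful of pushed messages, then hang up
            message = await ws.recv()
            print("server pushed:", message)

if __name__ == "__main__":
    asyncio.run(websocket_virtual_user("wss://example.com/live"))
```

An ordinary HTTP-only script has no way to emulate this long-lived, bidirectional traffic, which is why discovering WebSocket usage mid-project forces the plugin or tool choice to be revisited.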

 

The need to set expectations for multiple load test executions…

 


Many clients assume their load test will be “one and done.” That’s not realistic, practical, or methodical. Load testing is a process whereby you create a load of concurrent virtual users emulating real users actively using an application – often ramping up to the expected peak load. Performance engineering brings a methodical approach to creating the load test, executing it, and analyzing the results. A very important piece of that process is validating the results. By running a single test, you can absolutely generate a report and send it off to the client. However, validating that the results are reproducible will save time and money. Let me explain. Anomalies in performance happen for a variety of reasons. Deployment infrastructures are sensitive to a variety of conditions – shared resources, network connectivity, usage patterns, batched processes, data aggregation, business logic, maintenance, and so on. Any of these circumstances or conditions can cause an anomaly in the results. For example, during a first test the web server’s CPU might spike to 100% under a low load, yet remain steady at 40% during the peak load. Chasing anomalies is a huge waste of effort and time. A more methodical approach is to plan on multiple executions of the Same load test to reproduce the scalability limitation, isolate the bottleneck, and then proceed with remediation. We need to educate the client on why the project should include multiple tests to validate the results.
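
Purely as an illustration, here is a minimal sketch of such a repeatable scenario using Locust (one possible load tool, not necessarily the one a given engagement would use); the endpoints, think times, user counts, and run lengths are hypothetical placeholders.

```python
# Minimal Locust sketch of a repeatable load scenario.
# Endpoints, think times, and user counts are hypothetical placeholders.
from locust import HttpUser, task, between

class CheckoutUser(HttpUser):
    # Each virtual user pauses 1-3 seconds between requests,
    # roughly emulating real user think time.
    wait_time = between(1, 3)

    @task(3)
    def browse_catalog(self):
        self.client.get("/catalog")

    @task(1)
    def view_cart(self):
        self.client.get("/cart")

# Run the identical scenario more than once and keep each result set, e.g.:
#   locust -f loadtest.py --headless -u 3000 -r 50 --run-time 30m --csv run1
#   locust -f loadtest.py --headless -u 3000 -r 50 --run-time 30m --csv run2
# If run1 and run2 disagree sharply, you are likely looking at an anomaly
# rather than a reproducible scalability limit.
```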

 

There is no single tool for all performance projects…

 


Every application is built differently – I’ve never encountered two applications that were exactly the same, whether in code, servers, configuration, or any number of other factors. Every developer is a mad scientist with access to a ton of technology options for building the same feature or solving a challenge. And it’s not just the technologies that differ; the behavior can differ as well. Ajax requests and background polling are examples of behaviors. In order to emulate a realistic load, the tool has to meet both the technology requirements and the behavior requirements. When scoping a project, keep an open mind about the tool solution. Don’t work backward to see whether the application fits the tool, or you might pass on a lucrative engagement that could be very successful with the right tool.
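
As a small illustration (again using Locust, and again with a hypothetical endpoint and interval), here is what emulating a polling behavior might look like: the protocol is ordinary HTTP, but the traffic pattern is a fixed-interval background poll, and the tool has to be able to express that pattern.

```python
# Minimal sketch of a polling behavior over plain HTTP.
# The endpoint and the 5-second interval are hypothetical placeholders.
from locust import HttpUser, task, constant

class PollingUser(HttpUser):
    # A fixed 5-second gap between requests models a background poll,
    # which produces a very different load profile than a user
    # clicking through pages with variable think time.
    wait_time = constant(5)

    @task
    def poll_for_updates(self):
        self.client.get("/notifications")
```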

 

Smoke tests and Mulligans…

 


Murphy’s Law for Performance Testing – You are going to need a Mulligan. Every performance engineer has been here: You believe you have all the bases covered. All the transactions are working as expected. You are ready for the big target load test. You press start and … the unexpected happens. The unexpected can range from the load solution (load generators becoming overwhelmed), to the load test harness (certain transactions cannot be repeated because of state), to an invalid data pool (some user IDs are unable to log in), to a miscommunication (the IT team just took the DB down for maintenance), and so on. It’s impossible to list every form the unexpected can take. The best you can do is react, fix the situation, learn from it, and press start again.
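
A quick smoke test before pressing start catches a surprising number of these surprises. Here is a minimal sketch of a pre-flight check that walks the data pool and verifies each user ID can actually log in; the base URL, endpoint, form fields, and CSV layout are all hypothetical assumptions.

```python
# Minimal pre-flight smoke test sketch: verify the data pool before the big run.
# The URL, form fields, and CSV layout are hypothetical placeholders.
import csv
import requests

BASE_URL = "https://staging.example.com"

def smoke_test(credentials_csv: str) -> list[str]:
    """Return the user IDs that fail to log in."""
    failures = []
    with open(credentials_csv, newline="") as f:
        for row in csv.DictReader(f):
            resp = requests.post(
                f"{BASE_URL}/login",
                data={"username": row["username"], "password": row["password"]},
                timeout=10,
            )
            if resp.status_code != 200:
                failures.append(row["username"])
    return failures

if __name__ == "__main__":
    bad_ids = smoke_test("user_pool.csv")
    if bad_ids:
        print(f"{len(bad_ids)} user IDs cannot log in:", bad_ids)
    else:
        print("Data pool looks healthy - safe to start the load test.")
```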

 

The Perception of a “Successful” performance test…

 


Let’s say the client expected their new mobile application to scale to 15,000 concurrent users, but the results came back showing that it actually supports 3,000. Hooray! This isn’t to be perceived as disappointing news. It means the investment the client made in this investigation paid off. It’s far better to find out about this scalability limitation now, with time to react and fix it, than to hit it in production, where you can lose customers, brand reputation, and revenue. Our PEs are data-driven, and they deliver conclusions based on results and facts. Often the client will engage us further and give us access to the monitoring of the deployment infrastructure so we can help isolate the bottleneck. Once the test results have been validated by correlating them with a resource bottleneck, tuning ensues. Which brings me to the next Insight…
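
As a rough illustration of that validation step, here is a minimal sketch that correlates per-interval response times with a server resource metric; the numbers are made-up placeholders, and in practice both series would come from the load tool’s results and the client’s monitoring exports.

```python
# Minimal sketch of correlating response-time degradation with a resource metric.
# The sample values below are made-up placeholders for illustration only.
import statistics

# Per-interval averages collected over the same test window.
response_times_ms = [210, 220, 250, 480, 900, 1400, 1350]
app_server_cpu_pct = [22, 25, 31, 62, 88, 97, 96]

# Pearson correlation (statistics.correlation requires Python 3.10+).
# A value near 1.0 supports the hypothesis that this resource is the
# bottleneck behind the slowdown.
r = statistics.correlation(response_times_ms, app_server_cpu_pct)
print(f"response time vs. CPU correlation: {r:.2f}")
```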

 

Timelines are like the weather, expect changes…

 


We all wish that project timelines could be pinned to an exact duration, but no one can really predict the future. It would be extremely convenient if we could all say with absolute certainty: this performance engagement will require 5 tests conducted over a 3-week period, and upon conclusion of the project, the application will scale to 23,000 concurrent users. So many variables are in play, but here’s an example. During the first week, it becomes obvious that the application cluster needs to double its number of nodes. This work requires the IT team to provision the infrastructure, deploy the code, configure the load balancer, and so on. Turnaround time can vary, and the client might not be ready for another test for a few days or even over a week. Load test execution is obviously stalled in the meantime. Everyone (engineers and clients) needs to be flexible and manage their time to accommodate. Rarely do I see unrealistic demands, but when it happens, it needs to be handled by setting the right expectations and explaining the performance engineering methodology.

 

 

I hope you have gained some valuable insight! I applaud Applause/UTest’s Performance Engineering Community!
