You read that title correctly – it’s not a typo. What if UAT stood for Userless Acceptance Testing instead?
The people who know me well (and even the ones who only know me a little) know I’m a big video game enthusiast. The franchise I’m – by far – the most passionate about is Microsoft’s Halo saga. In holiday 2021, they’ll be launching their new flagship title ‘Halo Infinite’ (which was already long overdue if you ask me, but as I don’t want to turn this blog into a whole whitepaper, I’ll keep that rant for another time).
A little while ago, Microsoft ran their first iteration of beta testing, giving people the opportunity to play an older build in a public Tech Test. What was special about this beta is that Microsoft did not allow players to match up against one another. While a traditional Halo game consists of a 4v4 multiplayer match, the Halo Infinite beta pitted people against a team of 4 AI bots. Your opponents were not friends or random strangers across the internet, but a team of 4 AI bots moving across the map, shooting you on sight and working together as a team, mimicking what an actual opposing team would do. Despite some hilarious moments, our AI opponents felt a lot more ‘real’ than I had expected, showing that AI and user simulation have come a very long way. While I enjoyed a weekend of bot mayhem, my mind wandered: if AI bots can replace end users in video games, what would they be capable of in the world of software testing and quality engineering?
In my previous blog, I mentioned gamified acceptance tests as one of the few viable options for gathering large testing crowds and running representative acceptance tests for large and complex environments (Microsoft’s Tech Test being a prime example).
What if the opposite were true and you needed exactly zero end users to perform in-depth acceptance tests at large scale? What if you could create your own artificial ‘end users’ to do the acceptance tests instead? With the fields of AI and Machine Learning making huge leaps, this ‘what if’ scenario might not be as far away as we think. The threshold for creating AI bots keeps getting lower, opening up a vast range of possibilities.
In fact, some companies are already testing the waters. Recently, Facebook developed a full platform with multiple bots that simulate end user actions and patterns, using a mix of hard-coded and machine learning bots to play out a variety of scenarios*.
While the Facebook example is one of a few standalone cases for the time being, it is likely that other companies will follow suit. Imagine what organisations could do with a small set or even an army of artificial end users or bots.
With thorough data analysis, it is possible to distill basic patterns from user navigation journeys and usage profiles without breaching GDPR guidelines. Using anonymized data, you would be able to build representative bots that successfully mimic an actual end user. This would allow you to build a whole variety of profiles, e.g. an ‘average teenage user’ bot, an ‘average parent with 2 children’ bot, an ‘average system power user’ bot, etc. By analyzing user patterns and building bots that behave like them, you would get far more ‘user variety’ than would be feasible to organize through manual UAT with your actual end users.
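To make this a little more concrete: one of the simplest ways to turn anonymized navigation logs into a ‘user bot’ would be to mine page-to-page transition probabilities and replay them as a Markov chain. A minimal Python sketch, where the sessions and page names are purely hypothetical:

```python
import random
from collections import Counter, defaultdict

# Anonymized navigation logs: one list of visited pages per session.
# These sessions and page names are hypothetical examples.
sessions = [
    ["home", "search", "product", "cart", "checkout"],
    ["home", "product", "cart"],
    ["home", "search", "search", "product"],
]

# Count page-to-page transitions to build a simple Markov model.
transitions = defaultdict(Counter)
for session in sessions:
    for current_page, next_page in zip(session, session[1:]):
        transitions[current_page][next_page] += 1

def simulate_user(start="home", max_steps=10):
    """Replay a plausible user journey by sampling the learned transitions."""
    page, journey = start, [start]
    for _ in range(max_steps):
        options = transitions.get(page)
        if not options:  # dead end: the journey stops here
            break
        pages, weights = zip(*options.items())
        page = random.choices(pages, weights=weights)[0]
        journey.append(page)
    return journey

print(simulate_user())  # e.g. ['home', 'search', 'product', 'cart', 'checkout']
```

Train one such model per segment of your anonymized data and you have your ‘average teenage user’ and ‘average parent’ bots, each wandering through the application the way its segment statistically does.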
Having different profile bots also allows for the simulation of end user interaction. Without needing real people, you would be able to mimic a whole set of different interactions: end user – end user, admin – client, etc. Interaction is often where most application or service errors occur, as it is the most unpredictable area and has so many potential combinations that it would be very hard to test manually. By releasing bots on the system, you could test all those interactions in a fraction of the time.
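To get a feel for why this explodes so quickly, consider the interaction matrix that even a handful of profile bots generates. A hypothetical sketch (the profile names and the interaction hook are placeholders):

```python
from itertools import combinations

profiles = ["teenage_user", "parent", "power_user", "admin"]

def interact(bot_a, bot_b):
    # Placeholder: in a real setup, both bots would drive the actual
    # application and the combined journey would be validated.
    print(f"Testing interaction: {bot_a} <-> {bot_b}")

# Every unordered pair of profiles becomes an interaction scenario.
for bot_a, bot_b in combinations(profiles, 2):
    interact(bot_a, bot_b)

# 4 profiles yield 6 pairs; 20 profiles already yield 190 pairs,
# exactly the kind of matrix that is painful to cover manually.
```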
Having multiple realistic profile bots also offers the opportunity to optimize performance testing. Not only could you spin up various clones of your user bots to crank up the system load, but the actual load profile on your application or service would also be more realistic and therefore more representative.
Traditional performance tests generally assume a generic load of X users or Y open connections, with some degree of variation. By introducing a diverse array of artificial end users, you could tune your scenario to realistic production conditions. For example, instead of simulating 10,000 generic users, you could simulate 2,000 bots of type A, 3,000 bots of type B, and 5,000 bots of type C, which could create an entirely different load profile. Tweaking and analyzing these scenarios could result in a more optimized application or service build.
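To illustrate, this kind of mixed load can already be expressed today with weighted user classes in an open-source load-testing tool like Locust. The sketch below encodes the 2,000/3,000/5,000 mix as 2:3:5 weights; the endpoints and wait times are hypothetical:

```python
from locust import HttpUser, task, between

class TypeABot(HttpUser):
    weight = 2                 # 2,000 out of every 10,000 simulated users
    wait_time = between(1, 5)  # seconds between actions

    @task
    def browse(self):
        self.client.get("/products")       # hypothetical endpoint

class TypeBBot(HttpUser):
    weight = 3                 # 3,000 out of every 10,000
    wait_time = between(2, 8)

    @task
    def search(self):
        self.client.get("/search?q=halo")  # hypothetical endpoint

class TypeCBot(HttpUser):
    weight = 5                 # 5,000 out of every 10,000
    wait_time = between(5, 15)

    @task
    def check_account(self):
        self.client.get("/account")        # hypothetical endpoint
```

Started with 10,000 total users, Locust spawns them in the 2:3:5 ratio, so the load follows the population mix instead of one generic script. The behavior here is of course still hand-scripted; the idea would be to plug the learned user models in behind the tasks.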
Stepping beyond the ‘traditional’ testing types of functional or performance testing, artificial end users open up another set of possibilities.
One option is to continuously feed your bot production data so it learns, adapts, and stays as close to real user behavior as possible. By doing so, you ensure you are subjecting your application or service to the most realistic and representative situation possible.
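In its simplest form, ‘continuously feeding’ could be a periodic retraining job that rebuilds the bot’s behavior model from fresh anonymized logs. A hypothetical sketch, reusing the transition-counting idea from earlier (the log source is a placeholder):

```python
import time
from collections import Counter, defaultdict

def fetch_recent_sessions():
    """Placeholder for pulling fresh, anonymized production logs;
    here it just returns a canned example journey."""
    return [["home", "search", "product", "checkout"]]

def rebuild_model(sessions):
    """Recount page-to-page transitions from the latest sessions."""
    transitions = defaultdict(Counter)
    for session in sessions:
        for current_page, next_page in zip(session, session[1:]):
            transitions[current_page][next_page] += 1
    return transitions

# A simple retraining loop: rebuild the bot's model every day so its
# behavior never drifts far from what real end users actually do.
model = rebuild_model(fetch_recent_sessions())
for _ in range(7):          # e.g. one retrain per day for a week
    time.sleep(24 * 3600)
    model = rebuild_model(fetch_recent_sessions())
```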
Conversely, you could choose not to update your bots but instead keep a ‘baseline’ bot to compare against real user behavior. Let’s take the example where you introduce a simple functional change: adding 2 extra fields to a sign-up page. Your ‘baseline bot’ does not find any functional issues and can complete all user journeys without problems. However, when you introduce this change in production, you see that real end user behavior changes drastically and sign-ups drop by 20%. This could be an indication that you’ve introduced a usability issue and end users now abort the standard user journey they completed before.
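The comparison itself can stay very simple: the baseline bot sets the expectation, and production metrics are checked against it after each release. A hypothetical sketch with made-up numbers matching the example above:

```python
def check_for_usability_drift(baseline_rate, production_rate, threshold=0.10):
    """Flag releases where real users diverge from the baseline bot."""
    drop = (baseline_rate - production_rate) / baseline_rate
    if drop > threshold:
        print(f"Possible usability issue: completions down {drop:.0%} "
              "versus the baseline bot's journey.")
    else:
        print("Real user behavior still tracks the baseline bot.")

# The baseline bot completes the sign-up journey 98% of the time;
# after the change, real sign-up completion dropped to roughly 78%.
check_for_usability_drift(baseline_rate=0.98, production_rate=0.78)
```

Since the bot’s functional journey still passes, the drift points at usability rather than a functional defect, which is exactly the signal a purely functional regression suite would miss.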
Taking things one step further: if you’ve managed to create hyper-realistic bots, you could use them not only to analyse but even to predict end user behavior. This could be useful in situations where A/B testing is relevant: release your bot on two different application or service variants and see which one the bot responds better to.
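As a closing sketch, the A/B variant of this idea boils down to releasing the same bot population on both variants and comparing a success metric. Everything below is hypothetical, including the variant URLs and the faked completion rates:

```python
import random

def run_bots_on_variant(variant_url, n_bots=1000):
    """Placeholder: drive n_bots simulated journeys against a variant
    and return the fraction that completed successfully. Here we just
    fake a completion rate instead of driving a real application."""
    return sum(random.random() < 0.9 for _ in range(n_bots)) / n_bots

score_a = run_bots_on_variant("https://a.example.com")  # variant A
score_b = run_bots_on_variant("https://b.example.com")  # variant B
print(f"Variant A: {score_a:.1%} vs variant B: {score_b:.1%}")
```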
All dreaming aside – while this blog highlights a few interesting thought experiments, there are obvious limitations, both to the ideas themselves and to their current feasibility.
- Accurately training bots takes a considerable amount of time and resources
- Creating the bots is itself error-prone, which could lead to wrong results or conclusions
- You would only be able to create bots for products or services that already have plenty of historical data, making them less suited for validating new products
Despite these obvious limitations, the application of AI in Testing & Quality Assurance is no longer a distant dream – it’s right at our doorstep, and we’ll only see the potential uses increase in the years to come.
*https://www.technologyreview.com/2020/04/15/999871/facebook-ai-bot-simulation/