How Does A/B Testing Work?

Feb 02, 2021
An animated graphic illustrating A/B Testing

You’ve probably visited a website or used an app and noticed something just slightly off. Maybe the colours look a little different, or the layout has shifted just a tiny bit, only for everything to go back to normal the next time you visit. Or you might have noticed that the version of the app you’re seeing isn’t exactly the same as your friend’s. One possible explanation is that the app is running an A/B test behind the scenes.

A/B tests are a common way for companies to validate changes to their products before releasing them to the entire user base. The basic idea behind an A/B test is to present a change to a small segment of the overall audience and see how it impacts their behaviour. Any change you make could have a positive or a negative impact on your product, or in some cases, no impact at all.

Let’s say you operate a small e-commerce store selling T-shirts. Lately, you’ve noticed that your sales numbers aren’t quite up to the mark. You’d like to investigate the issue and hopefully make some changes to your website that would invite users to spend more of their money. Before you sink your dollars into expensive marketing campaigns, you want to make sure that the store itself is optimized. Obviously, you don’t want to redesign the entire store, but you do want to make some quick changes to boost performance.

A graphic showing an e-commerce store, the call to action is at the bottom of the page

Maybe you’ve noticed that the position of the “Add to Cart” button, all the way at the bottom of the product page, is not ideal.

You suspect that if the button were placed in a more prominent location, it would catch more eyes, perhaps invite more users to complete the checkout process, and make you more money.

Sounds reasonable. But how would you validate this? Well, one way is to just make the change and see what happens. If you switch things up and start making more money, boom. Problem solved.

This would be a fine approach in a lot of circumstances. If your changes are unlikely to cause massive shifts in user behaviour, simply shipping them is acceptable. But if your users are sensitive to changes in your product, or if those changes could negatively impact your bottom line, you might want to be a little more cautious.

To continue our slightly silly example, what if the button is actually perfectly placed? What if moving it to a different location comes across as too aggressive for your customers, driving them away altogether? You want to verify your hypothesis, but you also want to minimize the risk. Hedge your bets, so to speak.

One way you could mitigate this risk is, surprise surprise, by using A/B tests. To truly validate your idea, you want to run a scientific experiment of sorts. In this case, we’d create a new version of the website with the button in a different, more prominent location. The key, however, is that we serve this version of the website to only 50% of all users. The other 50% continue to see the original website with the button unchanged. This gives us two user groups to compare.
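In practice, the split is usually done deterministically so that a given user always lands in the same group every time they visit. Here’s a minimal sketch of how that could work, assuming users are identified by an ID string; the hashing scheme, the experiment name, and the 50/50 split are all just illustrative choices:

```python
import hashlib

def assign_variant(user_id: str, experiment: str = "cta-position") -> str:
    """Deterministically bucket a user into 'control' or 'treatment'.

    Hashing the user ID together with the experiment name means the same
    user always sees the same variation, without storing any extra state.
    """
    digest = hashlib.md5(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100  # a number from 0 to 99
    return "treatment" if bucket < 50 else "control"

print(assign_variant("user-123"))  # the same user ID always returns the same group
```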

A graphic illustrating an e-commerce store, the call to action is at the top of the page

In an ideal scenario, we’d want to eliminate all other differences between the two variations, such as the product the user is purchasing, the time of day, their location, and so on. We want to be sure that any differences in the users’ shopping behaviour are purely due to the design change. As you may have noticed, this is a lot like conducting a scientific experiment. But for our fictional website, let’s simply assume we randomly decide which user gets to see which variation.

Now, with our control group and challenger group in place, we simply monitor the numbers associated with each variation.

In this case, for example, you could check the click-through rate (how often users click the “Add to Cart” button) to see if there are any significant differences between the two groups.
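With some hypothetical numbers plugged in, comparing the two groups is just a matter of dividing clicks by visitors for each variation:

```python
# Hypothetical results after the experiment has run for a while.
control = {"visitors": 5_000, "clicks": 400}    # original page, button at the bottom
treatment = {"visitors": 5_000, "clicks": 475}  # new page, button at the top

ctr_control = control["clicks"] / control["visitors"]        # 0.080 -> 8.0%
ctr_treatment = treatment["clicks"] / treatment["visitors"]  # 0.095 -> 9.5%

print(f"Control CTR:   {ctr_control:.1%}")
print(f"Treatment CTR: {ctr_treatment:.1%}")
```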

Based on the experiment, you might conclude that there is, in fact, no difference between the two versions. This tells you that your initial hypothesis was incorrect and you need to look for other things to change. Or you might discover that you were right all along, and decide to roll out these changes to your entire user base (and enjoy the increased profits).

If all of this sounds too scientific, it’s because it usually is a scientific process. Understanding user behaviour is complicated, and A/B tests can often get quite granular. They require a significant amount of statistical knowledge to extract meaningful insights. But they’re also a very powerful tool when used properly.
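As a rough illustration of the statistics involved, a two-proportion z-test (shown here via statsmodels, reusing the same hypothetical numbers as above) can tell us whether the gap between the two click-through rates is larger than what random noise alone would plausibly produce:

```python
from statsmodels.stats.proportion import proportions_ztest

# Clicks and visitors for each variation (treatment first, then control).
clicks = [475, 400]
visitors = [5_000, 5_000]

# Two-proportion z-test: how surprising is this difference if the button's
# position actually made no difference at all?
z_stat, p_value = proportions_ztest(clicks, visitors)
print(f"z = {z_stat:.2f}, p = {p_value:.4f}")

# A common (if somewhat arbitrary) convention: treat p < 0.05 as significant.
if p_value < 0.05:
    print("The difference is unlikely to be due to chance alone.")
else:
    print("We can't rule out that the difference is just noise.")
```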

As users, we’re subject to A/B testing all the time without realizing it. In fact, any given product could be running multiple tests simultaneously as long as those tests don’t overlap. Here’s a Netflix blog explaining just how religiously they use A/B testing to make their product better over time.

So the next time you see things that seem off when browsing the internet, you know what’s going on.