What Is A/B Testing? A Detailed Introduction

Do you want to make sure that your website is performing as well as it possibly can? Perhaps you want to know if your advertising money could be working harder. Maybe you want to see if you could be getting more clicks from your email campaigns.

Enter A/B testing…

What is A/B testing?

A/B testing is a simple way to compare two versions of the same webpage, email, ad, or other content to see which one performs better. It’s also known as split testing, because you split your audience between the two versions and compare the results.


A/B testing allows you to get very specific and test one thing at a time, so you can see exactly which things influence your audience’s behavior and which don’t.

How can A/B testing help my online business?

Increase your conversion rates

A/B testing can help you understand what works best for your customer base or audience in terms of color, layout and design, wording and more. By testing different parts of your site, you can optimize your site for your particular audience and make it easier for them to find what they’re looking for, which increases your chances of them turning into sales.

Increase your bottom line

A/B testing can also save you a lot of money by helping you identify ways to get more bang for your buck.

For example, by testing your ads against each other, you can find the ones that perform best and justify spending more money on them, because you know they bring in more customers! What if one version of your ad works four times better than another? Wouldn’t that be something worth knowing?

Reduce support time

Maybe the layout of your site is confusing to your audience. Simply moving a menu or button to a different part of the page might make it easier for visitors to navigate and find the information they need, therefore reducing your support time.

Increase traffic

Testing different titles can help you increase the number of people who click through to your site.

For example, does a negatively worded title get you more clicks than a positively worded one?

Decrease cart abandonment

Testing different checkout designs can help lower the number of customers who abandon their cart without making a purchase.

For example, does a one-page checkout work better than a multi-step checkout? Are there unnecessary steps in your checkout that make it hard for your customers to navigate?

Basically, A/B testing helps you improve your online business bit by bit and based on real data.

How does A/B Testing work?

A/B testing works by testing the responses of your audience to two different versions of your site, email, or advertisement and seeing which one performs better.

I’ve broken it down into steps to make it easier to understand.


Know your stuff:

If you’re going to test something, you first have to decide what you want to test. This means looking at your analytics and knowing your baseline performance. This is what you’ll be testing against.

For example, if you’ve sent out emails in the past, know your current click rates. Then, if you A/B test two new email styles against each other, you’ll be able to see which of version A and version B performed better, and whether either of them beat your average baseline (keeping in mind that your baseline might be influenced by factors outside your test).

Once you’ve picked a specific part of your online business that you want to test, you need to pick ONE thing to test. If you change more than one thing at the same time, you can’t be sure which change was responsible for the results.

In our example, we’re going to test a call to action button color.

Theoretically, you could test more than two button colors. For example, you could test red vs blue vs green and split your audience between the three versions. But if you want to test multiple variables at once (like color AND location), then you need to use a different method, such as multivariate testing.

For now, we’re going to stick with A/B testing red vs blue button color.

Hypothesis and null hypothesis:

Like all good experiments, yours needs a theory: once you’ve picked something to test, you’ve got to have something to measure it against.


Your hypothesis is what you think will happen if there is a relationship between the two variables. The null hypothesis is what will happen if there is no relationship between the variables.

Generally, a hypothesis and its null hypothesis are structured as follows:

Hypothesis: Variable A and Variable B are dependent; a change in Variable A will result in a change in Variable B.

Null hypothesis: Variable A and Variable B are independent; a change in Variable A will NOT make any difference to Variable B.

You could leave your hypothesis non-directional and just say there will be a change. But generally, when you make changes to your web pages or emails, it’s under the assumption that they’ll perform better, not worse! So we’ll assume your theory is that the change will improve the performance of your store.

So, in our example, we think a red call to action button will get more sales than our current blue one. Our hypothesis would be something along the lines of:

Hypothesis: There is a relationship between button color and conversion rate; changing the button color to red will increase our conversion rate.

Null hypothesis: There is no relationship between button color and conversion rate; changing the button color will make no difference to conversion rates.

Design:

You will need to create two identical copies of the content you’re trying to test, be it a web page, email, or advertisement, and change only one variable. One version will be your control version (version A), and the alternative is your experimental version (version B).

It is very important that you test both versions at the same time (unless you’re testing for the best time of the day to show your content). You can’t test version A today and version B tomorrow because you won’t know whether something about today or tomorrow influenced your results.

In this case, we’re going to have the sales page with the blue button as our control. The identical sales page with the red button is the experimental group. And we’re going to test it over a period of one month to see which one performs better.

Organize your sample groups:

You’ll need to split your audience into two groups. This is a lot easier to do with email A/B testing as you can just create two different email campaigns. If you’re doing a web page A/B test, you’ll need to split the traffic evenly between your control and your experimental versions.
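If you’re running the split yourself rather than through a testing tool, a common approach is to assign each visitor to a group deterministically, for example by hashing a visitor ID. Here’s a minimal sketch in Python (the visitor ID format and experiment name are just placeholders for illustration):

```python
import hashlib

def assign_variant(visitor_id: str, experiment: str = "button-color") -> str:
    """Deterministically assign a visitor to version A or B.

    Hashing the visitor ID together with the experiment name gives a
    stable 50/50 split: the same visitor always sees the same version,
    and different experiments get independent splits.
    """
    digest = hashlib.md5(f"{experiment}:{visitor_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100  # a number from 0 to 99
    return "A" if bucket < 50 else "B"

print(assign_variant("visitor-12345"))  # same visitor, same answer every time
```

Keeping the assignment deterministic matters: if a returning visitor saw the blue button yesterday and the red one today, their behavior would muddy both groups.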

You also have to take into account sample size (the number of people you test) here. A larger sample generally means more accurate results. Again, if you’re doing an email campaign, it’s easy because you have complete control over sample size. But if you’re doing a web page test, then you have no control over how many people visit. The longer you run the test, the larger your sample will be, which means your results will be more reliable.
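How large does your sample need to be? A standard power calculation for comparing two proportions gives a rough answer. The sketch below uses the textbook formula with the usual defaults (5% significance level, 80% power); the baseline rate and the lift you hope to detect are assumptions you plug in:

```python
from statistics import NormalDist

def sample_size_per_group(p1: float, p2: float,
                          alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate visitors needed per group to detect p1 vs p2.

    Standard two-proportion power formula: p1 is the baseline
    conversion rate, p2 the rate you hope the new version achieves.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_power = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * (2 * p_bar * (1 - p_bar)) ** 0.5
                 + z_power * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
    return int(numerator / (p1 - p2) ** 2) + 1

# e.g. to detect a jump from a 31% to a 39% conversion rate
print(sample_size_per_group(0.31, 0.39))  # about 557 visitors per group
```

The smaller the difference you want to detect, the more visitors you need, which is why tests of subtle changes have to run longer.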

Test:

This is the easy part: it just takes a bit of patience while you let the test run and wait for the numbers to roll in. It’s recommended that you only run one A/B test at a time; if you run too many experiments at once, you won’t know whether one test has influenced the results of another.

Analyze:

You’ll probably be able to see simply by glancing at the data which version was more successful. However, you’ll want to get the real stats.


Performance and effect size

Performance is the thing we’re all interested in. Which version did better, and by how much? A simple online calculator can tell you how much better one version performed than the other.

Let’s say in our sample, we had 2020 visitors to our site. 1000 saw the control version (blue button) and 1020 saw the experimental version (red button). Of the ones who saw the blue button, 310 bought our product. Of those who saw the red button, 400 bought our product.

That means your blue button had a conversion rate of 31% and the red button had a conversion rate of 39%. The red button performed 26.5% better than the blue button.
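If you want to check those numbers yourself, the arithmetic only takes a few lines of Python:

```python
# The example numbers from above
control_visitors, control_sales = 1000, 310  # blue button
variant_visitors, variant_sales = 1020, 400  # red button

control_rate = control_sales / control_visitors  # 0.310 -> 31%
variant_rate = variant_sales / variant_visitors  # ~0.392 -> 39%

# Relative improvement (lift) of the red button over the blue one
lift = (variant_rate - control_rate) / control_rate
print(f"Blue: {control_rate:.1%}, Red: {variant_rate:.1%}, Lift: {lift:.1%}")
# Blue: 31.0%, Red: 39.2%, Lift: 26.5%
```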

Statistical significance

The next and equally important thing you’ll want to know is whether your results were significant. The percentage change doesn’t matter if it was due to random chance, so it pays to test the significance of your findings. That way, you can rule out the possibility that your results were a fluke.

You can use an online significance calculator, but it’s a good idea to know the math behind the stats to understand them better.
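If you’d like to see that math in action, the usual test for comparing two conversion rates is a two-proportion z-test. Here’s a minimal sketch using only Python’s standard library, applied to our button example:

```python
from statistics import NormalDist

def two_proportion_z_test(conv_a: int, n_a: int,
                          conv_b: int, n_b: int) -> float:
    """Two-sided p-value for a difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)  # pooled rate under the null
    se = (p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b)) ** 0.5
    z = (p_b - p_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Blue button: 310 sales from 1000 visitors; red: 400 from 1020
p_value = two_proportion_z_test(310, 1000, 400, 1020)
print(f"p-value: {p_value:.4f}")  # about 0.0001, well below 0.05
```

With a p-value that small, we can be fairly confident the red button’s better performance wasn’t just luck.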

In A/B testing, you’re going to get one of three results:

  1. Your alternative version won. It had a statistically significant better performance over your control version.
  2. Your control version won. It performed better than the alternative and the results were statistically significant.
  3. Your results were not statistically significant. There was no meaningful difference between the performance of the two versions.

Repeat:

Whatever your result was, you should not stop testing. If one version won over the other and your results were statistically significant, use that to improve your site and then keep testing for ways to further increase performance.

If your results were not statistically significant, check whether you did something wrong in the testing process. That way, you can fix it and perform the test again. If you can’t find any errors on your part, test new ways to improve that aspect of your site for your audience.

The motto here is: test, test, re-test, and then test again.

How do I interpret the results?

Check for significance

Generally in experiments, results are considered statistically significant when you can say there’s less than a 5% chance that the results you observed would occur in a world where there is no relationship between the variables you’re testing. It’s up to you at what point you consider your results significant enough to be worth believing and acting on.

In research, significance is measured using the p-value. If your p-value is less than 0.05, your results are generally considered significant. In other words, you can say there is less than a 5% probability that the results you observed (with the red button performing better) would have happened if there were no relationship between button color and conversion rates.

If your p-value is greater than 0.05 (e.g., 0.07 or 0.6), your results do not show a statistically significant relationship between the two factors tested. That probably means your customers aren’t really influenced by whatever you’re testing, such as button color, or that your audience likes both colors equally. The only way to know for sure is to repeat the test with one of the colors swapped for a different one. If you again see no real difference, your audience probably isn’t influenced by button color.

Check for the strength of your effect

In this case, our hypothesis was supported. There does seem to be a relationship between button color and conversion rate, and the red button seems to result in a higher conversion rate than the blue one.

Once you’ve established that your result was significant, you need to decide whether the difference between the two versions was actually big enough to be worth changing your store over. Was the difference in behavior you observed worth making a permanent change to your site? That’s up to you to decide.
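One way to judge the strength of the effect is to put a confidence interval around the difference, rather than relying on a single percentage. A quick sketch, again with our example numbers:

```python
from statistics import NormalDist

def diff_confidence_interval(conv_a: int, n_a: int,
                             conv_b: int, n_b: int, level: float = 0.95):
    """Confidence interval for the difference in conversion rates (B minus A)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    se = (p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b) ** 0.5
    z = NormalDist().inv_cdf(0.5 + level / 2)
    diff = p_b - p_a
    return diff - z * se, diff + z * se

low, high = diff_confidence_interval(310, 1000, 400, 1020)
print(f"Red minus blue: {low:+.1%} to {high:+.1%}")  # roughly +4% to +12%
```

If even the low end of that range would be worth the change, the decision is easy; if the interval dips close to zero, you may want more data before committing.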

And remember…

A statistic is only as strong as the data it’s gathered from – meaning that your test needs to be set up correctly from the start and you need to take into account other things that might be influencing your audience’s behavior. And still, our interpretations can be wrong. Just because red performed better than blue, doesn’t mean that red is the best color for our store. We should do another A/B test comparing red to a different color and see how that performs.

As you’re not trying to save the world here, the main thing to remember is that the stats are there as a guide for you in optimizing your store. Keep a mindset of continuous improvement and keep testing as you go!

Stats are not the be all and end all forever. Things could change next year, next month, or even next week. If in doubt, test again!

What can I use A/B testing for?

Basically anything you can change, you can test. But you should try to focus on the things that you believe will have the biggest impact on your business.

I’ll include a list of possibilities to get you started:

  • Website design, layout, or wording
  • Email subject lines
  • Email sender names
  • Logos and banners in emails or on websites
  • Navigation link location, color, size
  • Call to action button design, size, location, color, or wording
  • Promotion bundle/deal location, styling, and pricing
  • Free trial signup options and trial length
  • Advertising wording or design
  • Font size and style

Some examples for testing:

Button/menu placement

You can test where a call to action button or menu should be located on your sales page by creating another version of the page with just the button/menu moved to a different spot.

Button colors

Like in our earlier example, you can change the color of your call to action button and see if that helps with conversion rates.

Optimal form length

Maybe you’re not getting the information you want from your customers because you’re asking for too much. You could A/B test a shorter form that asks fewer questions to see if more customers fill it in.

Email subject lines or web page titles

You can test whether certain subject lines or page titles get more opens or clicks than others. That way, you know what sorts of words or topics get your audience intrigued.

What are the challenges and limitations of A/B testing?

Limited insight

Since you’re only testing one thing versus another, your insight is really limited to those two variables. This means you might have to do a series of A/B tests to really get down to what the best design is for each section of your web page/email.

For example if you’re testing a red vs blue button, the test won’t tell you if another color, such as green, would work even better. It also won’t tell you if the button could be in a better location on your page. You would have to run more A/B tests to figure this out, or use a different testing method.

Time consuming

Ideally, you should only run one A/B test at a time; otherwise, you don’t know for sure whether one test is being influenced by another. This means it can take a lot of testing and re-testing to truly optimize your site.

For example, you can’t use A/B testing to test button color AND button location at the same time. You would have to first test color and then test location afterwards.


Misleading results due to insufficient testing time

Not running an A/B test for long enough can give you skewed results that don’t reflect reality. Make sure you get a big enough sample size and run your test for at least a few weeks so your results are as accurate as possible.

For example, in the week leading up to Christmas, people might be drawn more towards red than other colors, which may not reflect their behavior during the rest of the year.

Summing up

A/B testing can be a really useful tool to help you home in on and optimize specific parts of your site or marketing efforts. It can help your advertising money work harder and help you identify ways to increase your conversion rates.

Always check whether your results were significant and take a look at the effect size of your results so you know whether it’s worth making changes and digging into this aspect of your site further.

Ideally, you would be constantly running A/B tests and continually learning from them, whether the results were as you expected or not. There are always more ways to improve and optimize your online store.

Some people refer to tests that showed no significant results as failed tests. But no test is a failure if you can learn from it – and you can ALWAYS learn something. If your A/B test results were not significant, there are insights to draw from that too. Why didn’t the audience react the way you thought they would? What other aspects of your site are having a higher impact on your audience? What other ways could you increase audience engagement with the variable you tested?

When using A/B testing, it’s important to keep on questioning. Don’t simply take the result of one test and use it to create a rule. Ask yourself why. Why did the audience prefer that version? What other implications might this have? How can you use this insight to improve your store even more?

Once you have determined something is good for your store, make sure you revisit and re-test it from time to time! After all, what was good a year ago might not still be the best for your site today.

Once you’ve mastered A/B testing, the next step is to start tracking multiple metrics when running your tests. For example, having a video on your sales page might impact conversion rates, time spent on page, and the number of support tickets.

After you’ve mastered tracking and analyzing multiple metrics in an A/B test, you can move on to multivariate testing where you test multiple variables at the same time (such as button color AND location).

Most importantly, keep on testing, learning, and optimizing! Remember, test, test, re-test, and then test again!
