Poor A/B testing methodologies are costing online retailers up to $13bn a year in lost revenue, according to new research from Customer Experience Management (CXM) platform Qubit.
Whilst A/B testing, which lets retailers test the impact of changes to their websites, is often claimed to deliver sales uplifts, poorly executed tests frequently fail to deliver meaningful results and can even have a negative impact on sales. Properly executed tests, however, have been shown to deliver a potential 12% uplift in sales, which the Qubit research equates to $13bn across the global revenue of US companies.
A/B testing involves comparing a change to a website against the original, using real consumers on the live site. The sales impact of the change is measured by comparing its performance with that of the original. However, whilst the concept is simple, there are a number of statistical pitfalls that can catch out the unwary, producing results that appear to show a significant positive effect where none exists, or that understate a genuine positive impact. Without a solid application of statistics to A/B testing, retailers risk damaging their business by relying on inaccurate test results.
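To make the comparison described above concrete, here is a minimal Python sketch of one common way to judge whether a variant's conversion rate genuinely differs from the original's: a two-proportion z-test. The function name and the visitor and order counts are hypothetical, chosen for illustration only, and this is not necessarily the method used in the Qubit research.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a, n_a: conversions and visitors for the original (A).
    conv_b, n_b: conversions and visitors for the variant (B).
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical data: 10,000 visitors per version, 500 vs 560 orders.
z, p = two_proportion_z_test(500, 10_000, 560, 10_000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

In this made-up example the variant shows a 12% relative uplift in its raw conversion rate, yet the p-value stays above the conventional 0.05 threshold, so the apparent win could plausibly be noise.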
There are three key reasons for failures in A/B testing:
- Insufficient statistical power – Statistical power is the probability that a test will detect a genuine effect when one exists. To achieve adequate power, the sample size for a test must be calculated in advance, a step that is often missed out in mainstream testing approaches.
- Multiple testing – Running multiple A/B tests simultaneously, as advocated by many systems, significantly increases the likelihood of receiving a false positive result. Similarly, stopping tests as soon as they show a positive result significantly reduces testing accuracy.
- Regression to the mean – Over time, many apparently positive changes will see a declining return. This is due to the well-known but rarely applied statistical phenomenon of regression to the mean, which essentially states that false positives are inevitable in small testing samples but that anomalous results will regress towards the average, accurate result over time.
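The first two points above can be illustrated with a short Python sketch. It uses the standard normal-approximation formula for the sample size of a two-proportion test, and it estimates the chance of at least one false positive across several tests on the assumption that the tests are independent. The function names, the 5% baseline conversion rate and the 10% target uplift are hypothetical choices for illustration, not figures from the Qubit research.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(base_rate, relative_uplift, alpha=0.05, power=0.8):
    """Approximate visitors needed per variant to detect the given uplift.

    Normal approximation for a two-sided, two-proportion test.
    """
    p1 = base_rate
    p2 = base_rate * (1 + relative_uplift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = NormalDist().inv_cdf(power)           # desired statistical power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def family_wise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false positive, assuming independent tests."""
    return 1 - (1 - alpha) ** num_tests


# Hypothetical scenario: 5% baseline conversion, 10% relative uplift sought.
print(sample_size_per_variant(0.05, 0.10))

# Ten simultaneous tests, each run at the usual 5% significance level.
print(round(family_wise_error_rate(10), 2))
```

Under these assumptions, detecting a 10% relative uplift on a 5% baseline needs roughly 31,000 visitors per variant, and ten independent tests at the 5% level carry about a 40% chance of producing at least one false positive.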
Graham Cooke, CEO of Qubit, said: “A/B testing has been sold as a way to scientifically test the impact of changes to your website. However, as with all science, unless your experimental methodology is robust, the results of your testing will be meaningless. Many A/B testing approaches take a statistically naïve approach to methodology, leading to test results that are inaccurate at best and actively damaging at worst.
“Optimizing website performance is a top priority for advanced online retailers and testing is a vital part of making sure that optimization strategies are effective. However, if your testing is flawed then you might as well make changes at random in the hope that they’ll succeed.”
Qubit Whitepaper: Most winning A/B test results are illusory
Qubit infographic: A/B Testing: 5 Checkpoints on the road to success