When Testing Goes Wrong (And How to Keep Control)
By Richard Parkin
Keeping up a constant flow of tests is fundamentally important for any direct response entrepreneur. It’s the only way to maximize your return on an offer: intuition and best practices only take you so far, and they won’t always give you the right answer.
That said, testing needs to be handled right, or it can drag your offer into unprofitability without giving you any useful information. With the right procedures in place, however, you can keep test-related risk to a minimum while ensuring that the data you gather is statistically significant and immediately useful.
This blog is all about where testing goes wrong, and the systems you can use to cut out those potential issues.
When To Run Tests
While you should be running tests on a constant (or near-constant) basis, there’s no point in testing the same thing over and over. Focusing exclusively on one part of your funnel can stop you from getting the accurate data you need, and it leads to fatigue for your marketing team.
Essentially, if you’re testing headline after headline after headline, you don’t have a baseline stat to compare everything to – all you have are the results of your last split test. That’s going to throw off your understanding of how your offer performs, and take away your ability to accurately assess the impact of any successful tests that you implement.
Whatever you’re testing, keep up a rotation. Never test the same thing twice in a row. Typically, we progress our tests through the funnel stage by stage, which gives the data an extra level of clarity: we’ll launch with landing page tests, move on to any interstitial pages, then the checkout, the upsells, and so on, repeating once the loop is complete.
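As a very rough sketch of that kind of rotation (the stage names below are placeholders, not a prescribed funnel), a simple scheduler could be as small as:

```python
from itertools import cycle

# Hypothetical funnel stages, cycled in a fixed order so that no single
# element is tested twice in a row and every stage gets regular coverage.
FUNNEL_STAGES = ["landing_page", "interstitial", "checkout", "upsell"]

def next_tests(n):
    """Return the next n test slots, cycling through the funnel."""
    stages = cycle(FUNNEL_STAGES)
    return [next(stages) for _ in range(n)]

print(next_tests(6))
# ['landing_page', 'interstitial', 'checkout', 'upsell', 'landing_page', 'interstitial']
```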
When Are Results Significant?
This is one of the most challenging parts of running a test. One variant may look like an obvious winner, converting higher day after day, and then collapse completely before the finish line with nothing else having changed. If you called the test early, you’d never have seen that downward trend, and you’d have no idea why your ROI started to drop.
At the same time, you can’t just leave tests running indefinitely in case the pattern reverses itself. You’ve got to know where to draw the line, and how to draw it quickly enough to reap the full benefit of your testing.
It’s all about statistical significance. You need enough data that your conclusion is justified. This can be a difficult line to draw, and it really depends on what you’re testing. Ten extra daily orders in a split test may be significant if you’re testing something that typically makes 50 sales a day, but may be irrelevant if you’re making thousands of daily sales.
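To make that concrete, here’s a minimal sketch of a standard two-proportion z-test. The traffic figures are assumptions for illustration only (roughly 2,000 visitors per variant per day), not our exact methodology, but they show how the same ten-order lift reads very differently depending on volume and duration:

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test on the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return z, 2 * (1 - NormalDist().cdf(abs(z)))  # z-score, two-sided p-value

# Assumed traffic: ~2,000 visitors per variant per day (illustrative only).
print(two_proportion_z(50, 2_000, 60, 2_000))      # one day:  z ~ 0.97, p ~ 0.33 (inconclusive)
print(two_proportion_z(350, 14_000, 420, 14_000))  # one week: z ~ 2.56, p ~ 0.01 (significant)
# The same 10-order daily gap at high volume barely registers as a change at all:
print(two_proportion_z(5_000, 200_000, 5_010, 200_000))  # z ~ 0.10, p ~ 0.92
```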
More broadly, you need to consider your audience, your results, and the time period you’re running the test across:
- Have enough people seen the test?
- Are your results distinctly different, or is there only a small difference between the test and control?
- Are the results from an average period of time, or has something affected user behavior?
Typically, we use a couple of shorthand methods to check for significance and determine whether it’s time to dive further into the results (a rough sketch of these checks follows the list):
- Minimum of 100 conversions.
- Minimum of 10% difference (to justify implementing a change).
- Minimum of a week’s worth of results.
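If you want to bake those rules of thumb into your reporting, a minimal pre-screen might look like the sketch below. The names and example figures are ours, purely for illustration, and the check only flags when a result deserves a closer statistical look, not whether to ship the change:

```python
def passes_shorthand_checks(conversions_a, conversions_b, rate_a, rate_b, days_running):
    """Rough pre-screen mirroring the three rules of thumb above."""
    enough_volume = min(conversions_a, conversions_b) >= 100   # minimum of 100 conversions
    big_enough_lift = abs(rate_b - rate_a) / rate_a >= 0.10    # at least a 10% relative difference
    enough_time = days_running >= 7                            # at least a week of results
    return enough_volume and big_enough_lift and enough_time

# Example: 130 vs. 155 conversions at 2.6% vs. 3.1%, after 8 days of running.
print(passes_shorthand_checks(130, 155, 0.026, 0.031, 8))  # True: worth a closer look
```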
Unsuccessful Tests
Not every test is going to improve how you operate. Whenever you launch a test, you should be prepared for it to hurt your sales. It’s the classic saying: you have to spend money to make money.
If you approach testing with a considered perspective, however, you can do a lot to mitigate the failures. Qualify the tests that you run: why do you believe they’ll be worth running? What do you expect to find, and what do you hope to find?
By planning out a testing schedule in a data-driven system based on the possible shortcomings of your funnel, you can minimize failed tests, and turn the ones that do fail from total losses into valuable data points for future planning and adjustments.
In PPC advertising, however, failure can be harder to tolerate. An unsuccessful creative doesn’t just mean lost revenue; it also means wasted ad spend, a difference that makes failure a lot more severe.
Below, we’ll look at the options you can use to minimize the effects of failed PPC campaigns, ads, and tests.
How to Deal with Failed PPC Tests
The number one mistake in PPC testing is arbitrariness. In a lot of cases, you’ll see advertisers launch campaigns at arbitrary times, wrap them up after an arbitrary amount of time, spend an arbitrary amount of money, judge the results arbitrarily, and then wonder why their results are so inconsistent.
Whenever we roll out PPC testing, we build out a plan. That plan covers timing, spend, and thresholds for making adjustments. We determine our maximum losses before spending even a cent on the campaigns. That means it’s impossible for us to lose an unacceptable amount of money – we’ve signed off on the campaign spending, know exactly what’s going to happen based on the results, and know who’s going to make any changes.
Case in point: a client asked us to roll out some Facebook ad testing based on their team’s creatives. We set up a system where we’d spend $100/day on each ad, rolling out new campaigns at midnight on Thursdays.
Unless there was some incredibly obvious issue, we wouldn’t touch those ads until Monday. That’s four days of results at a maximum spend of $400 per ad, a level of loss they were happy to sign off on.
We set up a series of thresholds for taking action on Mondays. If the CPA for an ad was above a specific level, we’d pause it. If it was below, we’d increase the budget. If it was in between, we’d revisit the ad three days later and make a decision based on a full week of performance data.
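Purely to illustrate the shape of that decision rule (the CPA cut-offs below are hypothetical placeholders, not the client’s actual figures), the Monday review boils down to something like this:

```python
# Illustrative version of the Monday review described above.
DAILY_BUDGET = 100       # dollars per ad per day (from the campaign plan)
PAUSE_CPA = 60           # pause above this CPA (hypothetical figure)
SCALE_CPA = 40           # increase budget below this CPA (hypothetical figure)

def monday_decision(spend, conversions):
    """Return the action for one ad after its first four days of spend."""
    cpa = spend / conversions if conversions else float("inf")
    if cpa > PAUSE_CPA:
        return "pause"
    if cpa < SCALE_CPA:
        return "increase_budget"
    return "revisit_in_3_days"  # decide again on a full week of data

# Four days at $100/day caps the downside at $400 per ad.
print(4 * DAILY_BUDGET)          # 400
print(monday_decision(400, 5))   # CPA $80:  'pause'
print(monday_decision(400, 12))  # CPA ~$33: 'increase_budget'
print(monday_decision(400, 8))   # CPA $50:  'revisit_in_3_days'
```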
The result? They’ve got an effective bank of ad content, with results improving over time, driven by effective split testing across both the ads and the funnel.