“Being right 90% of the time means 50% of your ads aren’t performant.”
If that math sounds absurd to you, feel free to read the rest of this article and write me a scathing review in the comments.
Over the years, I’ve noticed an unfortunate reality — advertisers aren’t very good with statistics.
Maybe that’s news to you.
I’m sure I’ve made my fair share of statistical errors as well — so I’ll lump myself in with the rest of the lot. But I’m seeing statistical errors that should be pretty elementary being shouted by the “loudest” voices on social media, and I believe we should start calling them out and encouraging each other to raise advertising’s collective statistical acumen.
A single example on Twitter
Just last week, we released an article on our blog about aggregated data dumping and how that’s impacting Facebook Ads’ ability to pick the winning advertisement.
You would think suggesting that maybe, just maybe, Facebook might not always be picking the winning advertisement wouldn’t be controversial, but I was surprised by how strong the pushback was on Twitter.
That led to someone else posting a poorly framed question about this later that day…
… which Karan Jasser was quick to point out.
Side note: I want to be very clear here, all of the people that I’m mentioning above are well-respected advertisers — not hacks. I’m hiding some of their names simply because I personally respect them and I don’t want to “call them out” — I want to call attention to the concepts in the tweets themselves so we can have some good advertising dialogue.
I’m 100% sure I’ve got just as many examples of me saying something incorrectly as well. That’s ok — that goes with the territory of pushing the boundaries of advertising — testing, exploring, etc.
What about the other 50%? Or was that 10%?
Here’s where it gets a little more fun.
Let’s assume that Facebook can pick the most performant ad 90% of the time. That sounds like a really good deal, right?
Note: Now, before you get too carried away with semantics, I do realize that the Twitter user said that Facebook would beat him 90% of the time, not that Facebook was right 90% of the time. Since I don’t know if Facebook would beat him 90%, someone else 80%, and someone else 50% of the time, I chose to look at this with Facebook having a 90% accuracy (which I think is a fair, but generous, number — you’ll be hard-pressed to find people that truly believe Facebook is right more often than that).
Why don’t we bring in our trusted friend (and apparent allergen), statistics, to see if it’s as good a deal as it sounds?
First, let’s all agree on a few things for this to work…
- We’re going to categorize ads into two buckets, Performant vs. Non-performant. In real life, this is a continuum, not an either/or, but let’s at least assume that the top 10% most-performant ads are in the Performant group, while the bottom 90% are in the Non-performant group — because there’s often a noticeable difference between the BEST ads, and the ads that are “just ok”.
- We’re going to assume that we’ve run 1,000 ads in this account.
- We’re also going to assume that Facebook will correctly identify if an ad is Performant or Non-performant with a 90% accuracy (by picking it as the winner to put the spend behind).
The question: If Facebook has chosen Ad A as a winner, what is the probability that Ad A is actually Performant?
Let’s start by setting this up in a grid to visualize it. In our grid, we have one column P for the Performant group (10% of our ads) and 9 columns N for the Non-performant group (90% of our ads). Since we have 100 boxes here, each box represents 10 ads.
Let’s look at our Performant group first. We’ve highlighted 9 of the 10 boxes with a darker green and called them “chosen” because Facebook has a 90% accuracy in correctly identifying the Performant ads.
We can look at the Non-performant group next. Facebook also correctly identified the ads that were Non-performant with an accuracy of 90%, which we’ve highlighted in a darker red. That leaves only 10% of the Non-performant ads that Facebook incorrectly chose as “performant” — sounds good, right?
Great, so we now have a grid that shows Facebook had a 90% accuracy in correctly identifying the Performant ads, and a 90% accuracy in correctly identifying the Non-performant ads.
Let’s add up the numbers in each group.
- Performant Ads correctly chosen as Performant – 90 (90% of 100)
- Performant Ads incorrectly chosen as Non-performant – 10 (10% of 100)
- Non-performant Ads correctly chosen as Non-performant – 810 (90% of 900)
- Non-performant Ads incorrectly chosen as Performant – 90 (10% of 900)
Remember, the question that we asked above was, if Facebook has chosen Ad A as a winner, what is the probability that Ad A is actually Performant?
We have 180 ads marked as Performant — 90 Performant Ads and 90 Non-performant ads, so Ad A would be contained inside that set of ads.
That means the probability that Ad A is actually a Performant ad is only 90 ÷ 180 = 50%. OUCH!
That’s not nearly as great sounding as Facebook having a 90% accuracy.
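If you’d rather check the grid arithmetic than count boxes, here’s a minimal Python sketch using the same assumed numbers (1,000 ads, a 10% Performant base rate, and a 90% accuracy):

```python
# Recompute the grid above: 1,000 ads, 10% truly Performant,
# and an assumed 90% accuracy in Facebook's picks.
total_ads = 1000
performant = int(total_ads * 0.10)        # 100 truly Performant ads
non_performant = total_ads - performant   # 900 truly Non-performant ads

true_positives = performant * 0.90        # 90 Performant ads chosen as winners
false_positives = non_performant * 0.10   # 90 Non-performant ads chosen anyway

chosen_as_winner = true_positives + false_positives   # 180 ads in total
p_performant_given_chosen = true_positives / chosen_as_winner

print(f"P(Performant | chosen) = {p_performant_given_chosen:.0%}")  # 50%
```

Swap in your own base rate and accuracy to see how sensitive the result is — at a 5% base rate, for example, the same 90% accuracy drops the answer to roughly 32%.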
I don’t believe you — your math is wrong somehow
The math on this is very well established. In fact, this is based on Bayes’ Theorem and it’s really quite magical. I won’t go into the math too much in this article, but if you’re curious for some “extra credit” reading, here’s the formula…
Image credit: https://en.wikipedia.org/wiki/Bayes'_theorem
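Written out for our example — with P meaning a truly Performant ad, N a Non-performant one, and C meaning Facebook chose the ad as a winner — the theorem gives the same 50% (using the same assumed 90% accuracy and 10% base rate):

```latex
P(\mathrm{P}\mid \mathrm{C})
  = \frac{P(\mathrm{C}\mid \mathrm{P})\,P(\mathrm{P})}
         {P(\mathrm{C}\mid \mathrm{P})\,P(\mathrm{P}) + P(\mathrm{C}\mid \mathrm{N})\,P(\mathrm{N})}
  = \frac{0.9 \times 0.1}{0.9 \times 0.1 + 0.1 \times 0.9}
  = \frac{0.09}{0.18}
  = 0.5
```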
This problem isn’t unique to just advertising. When doctors have to tell someone bad news about a test (like a positive cancer screening on a mammogram), they have the same type of problem. In fact, there was a study done by Gigerenzer that suggests that only 21% of doctors answered this type of question correctly.
Doctors are smart. So are advertisers. Both are quite capable of misunderstanding the statistics in a situation like this.
But what do doctors do when they get a positive test result, that’s 90% accurate, yet they’re worried about the potential for a false positive?
They look at other data.
And that’s the same thing we should do as advertisers. If we have an ad that’s getting the majority of the ad spend (because Facebook has chosen this ad as Performant), it’s never a bad idea to double-check this from time to time.
Note: Admittedly, the math above is based on assumptions. There are no official case studies that prove that Facebook has a 90% accuracy rate in choosing performant ads. However, it feels like a safe number to use for the purposes of this illustration, because it seems unlikely that Facebook has a higher than 90% accuracy rate. On the flip side, if Facebook has a lower than 90% accuracy, then it only further supports this article.
I would agree that Facebook does a great job of putting the budget behind the right ad, and there are many reasons for this:
- Facebook is using a lot of data to determine intent to purchase
  - Did that person just visit another website that is selling the same type of product?
  - Does this person EVER purchase things online?
  - Is this person engaging with content about that type of product?
  - And literally… SO much more!
- Facebook is looking at the leading metrics on the ad itself
  - Are people engaging with the ad (liking it, commenting on it, sharing it)?
  - Are people clicking on the ad?
- Facebook is looking at lagging metrics as well — but that’s where it gets blurry.
  - Adds to Cart
  - Purchases, etc.
Facebook can use all of that data to make AMAZING predictions for who to show the ad to, even if you set the audience to wide open. Don’t underestimate the data that Facebook still has access to and how good it is at using it. They actually do a fantastic job of outlining how the Facebook advertising algorithm works here.
But it’s equally wrong for us to ignore where the data gets fuzzy for Facebook. And I think some people don’t understand how strong an indicator attributed purchases are for informing the algorithm on which ad should get the budget.
The data is blurry, but so are my eyes after reading such a long article
I hear you… we’re almost done.
What happens after someone clicks on that ad?
- Facebook goes into “data loss mode” where it begins to model out performance since Apple blocks so much of the data
- Google Analytics logs the event and captures the UTM parameters
Admittedly, Google Analytics isn’t perfect at capturing the UTMs on every visit. For as long as Google Analytics has existed, we’ve known that it significantly underreports on many of those things. Plus, it isn’t perfect at tracking this beyond 7 days (thanks to ITP 2.3).
But still, what it does track is quite accurate — underreported — but based on actual clicks and conversions.
For Facebook (and other apps affected by Apple’s changes like TikTok, Snapchat, Pinterest, YouTube, etc.), the data is no longer based on actuals. If someone purchases after clicking on that ad, Apple will “hold” that from being passed through for up to 72 hours, and it will “batch” those purchases together.
We call that, “aggregated data dumping”.
There could be 5 or more purchases that get sent from Apple to Facebook at one time, and Facebook has to decide where to attribute that batch. I’d imagine Facebook has gotten better and doesn’t need to credit ALL 5 to the same ad even if only one of those purchases came from that ad, but this is important… Facebook DOES NOT know exactly which ad or ads were responsible for those purchases, so it has to “guess” (through some admittedly advanced modeling).
That’s a significant difference, though, compared with using the ACTUAL data the way that Google Analytics is doing in this case.
It’s in these moments that my team and I like to look at the Google Analytics Paths report to compare numbers. If Facebook says that Ad A is driving the purchases, and Google Analytics agrees — great. But if Facebook says that Ad A is driving the purchases, and Google Analytics disagrees — and we know from the math above that 50% of the ads Facebook picked as “winners” with a 90% accuracy are potentially NOT the performant ads — then it might be worth considering some intervening action.
So, should I switch, or not?
Since 50% of the chosen “winners” are Performant and 50% are Non-performant, blindly turning one off at random has a 50% chance of being the right decision.
But when you add in the data from Google Analytics Paths you should be able to increase your chances of making the right decision.
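As a rough sketch of why a second data source helps, you can run one more Bayes update on top of the grid example. The 80% accuracy I’m assigning to Google Analytics below is purely a hypothetical number for illustration:

```python
# Hypothetical sketch: treat Google Analytics as a second, independent signal
# and update the 50% posterior from the grid example with Bayes' rule.
def bayes_update(prior: float, true_positive_rate: float, false_positive_rate: float) -> float:
    """Probability the ad is truly Performant after one more positive signal."""
    numerator = prior * true_positive_rate
    return numerator / (numerator + (1 - prior) * false_positive_rate)

prior = 0.5        # P(Performant) once Facebook has picked the ad as a winner
ga_accuracy = 0.8  # assumed accuracy for Google Analytics (illustration only)

# If Google Analytics also calls the ad performant, confidence rises sharply
posterior = bayes_update(prior, ga_accuracy, 1 - ga_accuracy)
print(f"After GA agrees: {posterior:.0%}")
```

If the two sources disagree, the same update runs in reverse — plugging in a negative GA signal (true-positive rate 0.2, false-positive rate 0.8) drops the 50% down to 20%.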
Advertising analyst, Mitch Larson, states that, “the key would be in determining the statistical significance or confidence level (ex. 78% confidence of accepting/rejecting the null hypothesis, which is that GA is right and FB is wrong). You can get that number based on analyzing the difference between the two ad metrics in GA.”
But let’s save that for another article 🙂
Alternatively, you could set up a geo-holdout test, you could simply turn off the ad that Google Analytics says isn’t performant, you could run those ads to different landing pages, or you could run any number of other tests to confirm one way or the other. At Elumynt, we run a lot of these types of tests for our clients.
Oooooooor, you could do nothing…
… let it continue on its merry way, and put your full faith in an algorithm that’s been forced into using verifiably incorrect data on one of the most important data points (the actual conversion).
The algorithm is brilliant, and we recommend using it to your benefit (and not working against it), but we also believe it’s prudent to double-check results with other data sources, and if necessary, intervene a bit to make sure you’re supplying the algorithm with the best data so it can continue to make more of the right decisions.
As long as you aren’t allergic to statistics.