A/B Testing is a vital tool in marketing. It allows for two different types of analysis:
- To compare the behavior of exposed vs. unexposed users for a marketing campaign.
- To make marketing decisions by exposing a group of users to experimental changes or adjustments in a campaign and comparing the results.
Test & Control (T&C) is Eyeview’s implementation of A/B testing. Test is the group of users exposed to a campaign, while Control is a smaller group which is compared with the Test group.
We came across a few challenges when first implementing this feature. Each approach we tried had its own limitations, and the mechanics of our implementation evolved over time until we reached the desired behavior.
Approach #1: Mutually Exclusive Tactics
Tactics are entities that define budget, audience and targeting configuration.
In any A/B Testing scenario, mutual exclusivity is important to keep groups clean, without interference, to be able to compare them statistically. Individuals belonging to group A should never be exposed to B and vice versa. However, we also need both groups to comprise the same type of audience.
We created two tactics, one for Test and one for Control. These two tactics were mutually exclusive: a user served a video from the first tactic would never see videos from the second (each user was tied to a single tactic). Both tactics had identical targeting but different goals for video views – typically a 90%-10% split ratio.
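The idea can be sketched as follows. This is a minimal illustration under assumed names (not Eyeview's actual schema): two tactics share identical targeting, and a record locks each user to whichever tactic serves them first.

```python
# Two T&C tactics with identical targeting but different view goals.
# All names and structures here are illustrative assumptions.
tactics = {
    "test":    {"targeting": {"geo": "US"}, "view_goal": 90_000},
    "control": {"targeting": {"geo": "US"}, "view_goal": 10_000},
}

served_tactic = {}  # user_id -> tactic name: the mutual-exclusivity record

def eligible_tactics(user_id, candidates):
    """A user already served by one T&C tactic stays eligible only for it."""
    if user_id in served_tactic:
        return [t for t in candidates if t == served_tactic[user_id]]
    return list(candidates)

def record_serve(user_id, tactic):
    served_tactic.setdefault(user_id, tactic)  # the first serve wins
```

Note that the split is expressed as per-tactic view goals, which is why the resulting ratio was a ratio of video views rather than of users.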
This simple idea had limitations and problems:
- Human error could cause targeting to drift between the T&C tactics: any change applied to one tactic had to be manually mirrored in the other.
- Manual adjustments were required from campaign managers to accomplish the split ratio.
- The split was a ratio of video views rather than a ratio of users (since, as mentioned, tactics are separated by budget).
- The performance of T&C tactics was measured independently, resulting in different prices – therefore different audiences.
- Sometimes, multiple targeting strategies were employed (say we want to spend $1000 on fashion shoppers and $2000 on car enthusiasts). In that case we needed multiple tactics for Test and multiple tactics for Control. Aside from being an even harder operational challenge, this was also an unnecessary delivery constraint, because all the tactics were mutually exclusive (so if a user was both a fashion shopper and a car enthusiast, only one of the categories could be used).
Approach #2: Ad Serving Routes
For our second attempt, we allowed a single tactic to own the T&C split.
An Ad Serving Route (ASR) is Eyeview’s way of directing a user to campaign videos within the tactic. By defining a Test ASR (serving test videos) and a Control ASR (serving control videos), we could split traffic between ASRs according to weight while keeping users “sticky” to the ASR they were first assigned.
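One way to get both properties at once is a deterministic, weighted hash of the user id: the weights control the split, and the stability of the hash gives stickiness without any extra storage. This is an illustrative sketch, not Eyeview's actual selection code.

```python
import hashlib

def pick_route(user_id: str, routes: list[tuple[str, int]]) -> str:
    """Pick an ASR by weight, deterministically per user.

    Hashing the user id makes the choice stable ("sticky") across bid
    calls, while the weights control the Test/Control split.
    """
    total = sum(weight for _, weight in routes)
    bucket = int(hashlib.sha1(user_id.encode()).hexdigest(), 16) % total
    for name, weight in routes:
        if bucket < weight:
            return name
        bucket -= weight
    return routes[-1][0]  # unreachable when weights sum to total
```

For example, `pick_route("user-42", [("test", 90), ("control", 10)])` returns the same route on every call for that user, and across many users the routes are hit roughly in a 90/10 proportion.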
It seemed like the right solution but introduced new, unresolved problems:
- If we needed multiple ASRs, all the Test ASRs and all the Control ASRs were mutually exclusive with each other – similar to approach #1, there was still an unnecessary delivery constraint.
- This scheme didn’t allow mutual exclusivity among different tactics within the same campaign.
Approach #3: Ad Serving Routes + Groups
In addition to ASRs we introduced the concept of an Ad Serving Route Group (ASRG).
An ASRG contains ASRs from different tactics within a single campaign. The Test group and the Control group are modeled as different ASRGs. One campaign may have many Test ASRGs and many Control ASRGs, all of them mutually exclusive with each other.
In this scheme:
- Within the scope of a tactic, stickiness continues happening at the ASR level.
- Within the scope of a campaign, stickiness happens at the ASRG level.
This mechanism solved our previous limitations:
- Different Control ASRs were no longer mutually exclusive with each other, as long as they belonged to the same ASRG.
- Different tactics were mutually exclusive as long as their ASRs belonged to different ASRGs.
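The scheme above can be illustrated with a small model (all names are assumptions, not the production schema): each ASRG groups ASRs from different tactics of one campaign, ASRGs are mutually exclusive, and ASRs inside the user's sticky ASRG are not.

```python
# Illustrative model of approach #3: one campaign, two ASRGs,
# each containing one ASR per tactic (t1, t2).
campaign = {
    "asrgs": {
        "test_group":    {"type": "test",    "asrs": ["t1_test", "t2_test"]},
        "control_group": {"type": "control", "asrs": ["t1_ctrl", "t2_ctrl"]},
    }
}

user_asrg = {}  # user_id -> ASRG id: campaign-level stickiness

def eligible_asrs(user_id: str) -> list[str]:
    """Before assignment every ASR is eligible; afterwards only the ASRs
    of the user's sticky ASRG remain so, even across tactics."""
    if user_id not in user_asrg:
        return [a for g in campaign["asrgs"].values() for a in g["asrs"]]
    return list(campaign["asrgs"][user_asrg[user_id]]["asrs"])

def record_group(user_id: str, asrg_id: str) -> None:
    user_asrg.setdefault(user_id, asrg_id)  # first serve fixes the group
```

Once a user lands in `control_group`, both `t1_ctrl` and `t2_ctrl` remain eligible (no constraint between tactics inside the group), while every Test ASR is excluded.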
Implementation and Technologies
We use MongoDB for our Campaign DB, which includes campaign, tactic, ad serving routes, and ad serving route group definitions.
We use Aerospike to store the user frequency data. Here we keep the history of our users, tracking which tactics and which ad serving routes we have served to each of them.
When our bidder processes a bid call and evaluates a tactic as a candidate to serve, it checks the user history to determine whether the user has already been served by that tactic or by another tactic in the same campaign. If so, it performs user personalization, enforcing stickiness to ad serving routes and mutual exclusivity between ad serving route groups.
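The bid-time check can be sketched like this, under an assumed record layout (in production the history lives in Aerospike, keyed by user; here a plain dictionary stands in for it):

```python
# user_id -> {"asrg": asrg_id, "asr_by_tactic": {tactic_id: asr_id}}
# In-memory stand-in for the Aerospike user-history record.
history = {}

def allowed_to_serve(user_id, tactic_id, asr_id, asrg_id) -> bool:
    record = history.get(user_id)
    if record is None:
        return True                        # new user: no constraints yet
    if record["asrg"] != asrg_id:
        return False                       # mutual exclusivity between ASRGs
    sticky_asr = record["asr_by_tactic"].get(tactic_id)
    return sticky_asr is None or sticky_asr == asr_id  # ASR stickiness
```

A candidate from another ASRG is rejected outright; a candidate from the user's ASRG passes unless the tactic already served a different ASR to that user.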
Ad Serving Route Type Available in User Scoring Context
When we process a bid call we leverage a user scoring function (USF) to decide how interested we are in the user. We do this for every tactic that is a candidate to serve.
We make USF calls by passing a variety of values related to the tactic, the user, and the bid call. We decided to add the Ad Serving Route Group type (an indication of whether we are going to serve on a Test or a Control group) to that input.
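Conceptually, the change amounts to one extra field in the scoring input. The field names below are illustrative assumptions; the point is the added ASRG type.

```python
def build_usf_input(user_features: dict, tactic_features: dict,
                    bid_features: dict, asrg_type: str) -> dict:
    """Assemble the input for a user scoring function call.

    asrg_type ("test" or "control") is the newly added field, telling
    scoring which group the user would be served on.
    """
    assert asrg_type in ("test", "control")
    return {
        "user": user_features,
        "tactic": tactic_features,
        "bid": bid_features,
        "asrg_type": asrg_type,  # the added input
    }
```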
To accomplish this we reshaped our targeting logic. USF evaluation and video variant selection are both time- and resource-consuming tasks that don’t run on our bidder instances; they run in parallel on dedicated servers. In the past, our targeting logic comprised both ad serving route selection and video variant selection. We moved ad serving route selection to an earlier stage of the bid-call processing pipeline so that all the parallel calls could be made with the required inputs.
At Eyeview we understand that the success of a video campaign is determined by its capacity to deliver results. We also assert that campaign performance must be measured thoroughly.
Our Test & Control mechanism allows us not only to measure the outcome we deliver to our clients but also to continuously explore how results can be improved, providing support to make smart marketing decisions.