Forem Creators and Builders

Jeremy Friesen

Originally published at takeonrules.com

Forem Feed Experiment One

January Results

Background

Earlier this month, Amy wrote about running an experiment on our feed. It’s time to revisit that experiment and make a decision.

The Goals

In our previous feed experiments, we established six goals to track:

  1. User creates a comment.
  2. User creates comments on at least 4 different days within a week.
  3. User views pages on at least 4 different days within a week.
  4. User views pages on at least 4 different hours within a day.
  5. User views pages on at least 9 different days within 2 weeks.
  6. User views pages on at least 12 different hours within five days.

For this current experiment, which we’re wrapping up, we re-used those goals.

Here’s a link to the code that captures “conversions” for each of the goals.
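
The linked code is the authoritative implementation. As a rough illustration only, goal 3 (“views pages on at least 4 different days within a week”) could be checked along these lines. The method name, its parameters, and the sample timestamps are hypothetical and are not taken from Forem.

```ruby
require "date"

# Hypothetical helper (not Forem's actual code): true when the given
# timestamps fall on at least `min_days` distinct calendar days inside
# the `window_days`-day window ending on `as_of`.
def viewed_on_enough_days?(view_times, as_of:, min_days: 4, window_days: 7)
  window_start = as_of - (window_days - 1)
  distinct_days = view_times
    .map(&:to_date)
    .select { |day| day >= window_start && day <= as_of }
    .uniq
  distinct_days.size >= min_days
end

# Made-up page view timestamps for one user.
views = [
  Time.new(2022, 1, 24, 9, 0),
  Time.new(2022, 1, 24, 17, 30), # same day as above, counted once
  Time.new(2022, 1, 25, 8, 15),
  Time.new(2022, 1, 27, 12, 0),
  Time.new(2022, 1, 28, 21, 45),
]

puts viewed_on_enough_days?(views, as_of: Date.new(2022, 1, 28))
```

The other five goals follow the same shape: count distinct days or hours inside a window and compare against a threshold.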

The Methodology

We use the field_test gem to facilitate the Bayesian A/B hypothesis testing. As part of the experiment, I added an AbExperiment model to Forem. This provides numerous mechanisms to test and toggle experiments, which proved fortuitous when I broke production.

We then introduced the code to select which feed algorithm to use. Aside from the minor outages I introduced (and we corrected), we sat back and let the experiment run.

Results

Below is a summary of the experiment’s results:

| Scenario | Incumbent Conversion | Challenger Conversion | Likely Winner | Probability of Winner |
|---|---|---|---|---|
| Creates a comment. | 5.58% | 5.87% | Challenger | 90% |
| Creates comments on at least 4 different days within a week. | 0.23% | 0.19% | Incumbent | 78% |
| Views pages on at least 4 different days within a week. | 23.98% | 23.52% | Incumbent | 86% |
| Views pages on at least 4 different hours within a day. | 14.17% | 13.62% | Incumbent | 94% |
| Views pages on at least 9 different days within 2 weeks. | 9.60% | 9.41% | Incumbent | 73% |
| Views pages on at least 12 different hours within five days. | 2.24% | 2.13% | Incumbent | 73% |
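
One way to read these numbers is as the relative change of the challenger against the incumbent. The sketch below is plain arithmetic on the published percentages. It says nothing about statistical confidence, which is what the probability column reported by field_test addresses. The helper name and the shortened goal labels are mine.

```ruby
# Conversion rates (in percent) copied from the results above.
results = {
  "Creates a comment" => { incumbent: 5.58, challenger: 5.87 },
  "Comments on 4+ days within a week" => { incumbent: 0.23, challenger: 0.19 },
  "Views on 4+ days within a week" => { incumbent: 23.98, challenger: 23.52 },
  "Views in 4+ hours within a day" => { incumbent: 14.17, challenger: 13.62 },
  "Views on 9+ days within 2 weeks" => { incumbent: 9.60, challenger: 9.41 },
  "Views in 12+ hours within five days" => { incumbent: 2.24, challenger: 2.13 },
}

# Relative change of the challenger against the incumbent, in percent,
# rounded to one decimal place.
def relative_change(incumbent:, challenger:)
  ((challenger - incumbent) / incumbent * 100).round(1)
end

results.each do |goal, rates|
  puts format("%-38s %+.1f%%", goal, relative_change(**rates))
end
```

Seen this way, the challenger’s gain on first comments is a few percent, while its largest relative drop is on repeat commenting, where the absolute rates are very small.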

Conjecture

First and foremost, it appears that both feed strategies encourage close to the same engagement. That is reassuring: the experiment likely did not adversely affect the DEV.to experience.

Second, I’m prepared to call this first experiment in favor of the incumbent.

Third, it appears that the challenger encourages initial conversations, but those conversations dwindle over time.

Why do I think this is the behavior? My hypothesis points to two primary changes in the challenger:

  • The daily_decay_factor, the numeric multiplier we assign to the publication date, overly favored more recently published articles.
  • Sorting the relevant feed entries by publication date, instead of the relevance score.

Let’s look at the change in publication date decay rate.

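| Days Since Published | Challenger #1 Weight | Challenger #2 Weight |
|---|---|---|
| 0 | 1 | 1 |
| 1 | 0.95 | 0.99 |
| 2 | 0.9 | 0.985 |
| 3 | 0.85 | 0.98 |
| 4 | 0.8 | 0.975 |
| 5 | 0.75 | 0.97 |
| 6 | 0.7 | 0.965 |
| 7 | 0.65 | 0.960 |
| 8 | 0.6 | 0.955 |
| 9 | 0.55 | 0.95 |
| 10 | 0.5 | 0.945 |
| 11 | 0.4 | 0.94 |
| 12 | 0.3 | 0.935 |
| 13 | 0.2 | 0.93 |
| 14 | 0.1 | 0.925 |
| 15 or more | 0.001 | 0.9 |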

For the original challenger, I chose a more aggressive decay rate. For the second challenger, I’m significantly easing off of the decay.
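
To make the difference concrete, here is a minimal sketch that encodes both weight columns as lookup tables. It is not how Forem computes the factor (the feed configuration uses case statements); the variable names and the `decay_weight` helper are hypothetical, and the sketch assumes a non-negative whole number of days.

```ruby
# Weights copied from the table above; the array index is the number of
# whole days since publication. The last entry covers 15 days or more.
challenger_one = [1, 0.95, 0.9, 0.85, 0.8, 0.75, 0.7, 0.65, 0.6, 0.55, 0.5,
                  0.4, 0.3, 0.2, 0.1, 0.001]
challenger_two = [1, 0.99, 0.985, 0.98, 0.975, 0.97, 0.965, 0.96, 0.955, 0.95, 0.945,
                  0.94, 0.935, 0.93, 0.925, 0.9]

# Hypothetical helper: look up the weight for an article's age, clamping
# anything older than the table to the final "15 or more" bucket.
def decay_weight(weights, days_since_published)
  weights[[days_since_published, weights.size - 1].min]
end

[0, 7, 14, 30].each do |days|
  one = decay_weight(challenger_one, days)
  two = decay_weight(challenger_two, days)
  puts "day #{days}: challenger #1 => #{one}, challenger #2 => #{two}"
end
```

A week-old article keeps 0.65 of its weight under the first challenger but 0.96 under the second, and anything older than two weeks drops to 0.001 versus 0.9.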

I’m also removing the order by publication date, so the upcoming feed experiment will now sort things in relevance order.
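
The effect of the sort order can be sketched with a few made-up entries. The field names, scores, and titles are illustrative assumptions and do not reflect Forem’s schema or real data.

```ruby
# Made-up feed entries that have already been selected as relevant.
entries = [
  { title: "Older, highly relevant", relevance_score: 9.1, days_since_published: 6 },
  { title: "Fresh, loosely relevant", relevance_score: 3.4, days_since_published: 0 },
  { title: "Recent, fairly relevant", relevance_score: 6.8, days_since_published: 2 },
]

# First challenger: newest entries first, regardless of score.
by_publication_date = entries.sort_by { |entry| entry[:days_since_published] }

# Upcoming experiment: highest relevance score first.
by_relevance = entries.sort_by { |entry| -entry[:relevance_score] }

puts by_publication_date.map { |entry| entry[:title] }.inspect
puts by_relevance.map { |entry| entry[:title] }.inspect
```

With a date sort, the freshest entry leads even though it has the lowest score. With a relevance sort, the six-day-old entry leads instead.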

Next Steps

I’ve begun the proposal for our next feed experiment. It introduces a few minor tweaks and is intended as a starting point for a conversation about how to configure the challenger’s case statements.
