Forem Feed Experiment One

#product #discuss #feedalgorithm

January Results

Background

Earlier this month, Amy wrote about running an experiment on our feed. It’s time to revisit that experiment and make a decision.

The Goals

In our previous feed experiments, we established six goals to track:

User creates a comment.
User creates comments on at least 4 different days within a week.
User views pages on at least 4 different days within a week.
User views pages on at least 4 different hours within a day.
User views pages on at least 9 different days within 2 weeks.
User views pages on at least 12 different hours within five days.

For this current experiment, which we’re wrapping up, we re-used those goals.

Here’s a link to the code that captures “conversions” for each of the goals.
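To make the shape of these goals concrete, here is a minimal Ruby sketch of how a "distinct days within a window" goal could be checked. It is not the Forem code linked above: the helper name `active_on_distinct_days?`, its parameters, and the sample timestamps are all hypothetical, and it assumes "within a week" means any window of 7 consecutive calendar days in UTC.

```ruby
require "date"
require "time"

# Hypothetical helper (not Forem's implementation): returns true when the
# timestamps fall on at least `min_days` distinct calendar days inside any
# window of `window_days` consecutive days.
def active_on_distinct_days?(timestamps, min_days:, window_days:)
  days = timestamps.map(&:to_date).uniq.sort
  days.each_with_index.any? do |start_day, i|
    window_end = start_day + (window_days - 1)
    days.drop(i).take_while { |day| day <= window_end }.size >= min_days
  end
end

# Made-up page view timestamps for one user.
views = %w[
  2022-01-03T09:00:00Z
  2022-01-03T17:30:00Z
  2022-01-04T08:15:00Z
  2022-01-06T12:00:00Z
  2022-01-08T21:45:00Z
].map { |stamp| Time.iso8601(stamp) }

# "Views pages on at least 4 different days within a week."
puts active_on_distinct_days?(views, min_days: 4, window_days: 7)
```

The same helper could cover the "9 different days within 2 weeks" goal by passing `min_days: 9, window_days: 14`; the hour-based goals would need the same idea applied to distinct hours instead of distinct dates.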

The Methodology

We use the field_test gem to facilitate the Bayesian A/B hypothesis testing. As part of the experiment, I added an AbExperiment model to Forem. This provides numerous mechanisms to test and toggle experiments, which proved fortuitous when I broke production.

We then introduced the code that selects which feed algorithm to use. Aside from the minor outages I introduced (and we corrected), we sat back and let the experiment run.

Results

Below is a summary of the experiment’s results:

Scenario	Incumbent Conversion	Challenger Conversion	Likely Winner	Probability of Winner
Creates a comment.	5.58%	5.87%	Challenger	90%
Creates comments on at least 4 different days within a week.	0.23%	0.19%	Incumbent	78%
Views pages on at least 4 different days within a week.	23.98%	23.52%	Incumbent	86%
Views pages on at least 4 different hours within a day.	14.17%	13.62%	Incumbent	94%
Views pages on at least 9 different days within 2 weeks.	9.60%	9.41%	Incumbent	73%
Views pages on at least 12 different hours within five days.	2.24%	2.13%	Incumbent	73%
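For readers wondering where a "Probability of Winner" figure can come from, here is a rough sketch of one common Bayesian approach: put a Beta posterior on each variant's conversion rate and estimate, by simulation, how often the challenger's rate exceeds the incumbent's. This is an illustration only and not how field_test computes its numbers. The method names, the uniform Beta(1, 1) prior, and the participant counts below are assumptions; the post does not report sample sizes.

```ruby
# Draw from Gamma(shape, 1) for a positive integer shape by summing
# exponential draws. Simple, but only practical for small counts.
def sample_gamma(shape, rng)
  Array.new(shape) { -Math.log(1.0 - rng.rand) }.sum
end

# Draw from Beta(alpha, beta) as a ratio of two Gamma draws.
def sample_beta(alpha, beta, rng)
  x = sample_gamma(alpha, rng)
  y = sample_gamma(beta, rng)
  x / (x + y)
end

# Monte Carlo estimate of P(challenger rate > incumbent rate),
# assuming a uniform Beta(1, 1) prior on each conversion rate.
def probability_challenger_wins(incumbent, challenger, draws: 2_000, seed: 42)
  rng = Random.new(seed)
  wins = draws.times.count do
    a = sample_beta(1 + incumbent[:conversions],
                    1 + incumbent[:participants] - incumbent[:conversions], rng)
    b = sample_beta(1 + challenger[:conversions],
                    1 + challenger[:participants] - challenger[:conversions], rng)
    b > a
  end
  wins.to_f / draws
end

# Hypothetical counts, chosen only to exercise the code.
incumbent  = { participants: 200, conversions: 20 }
challenger = { participants: 200, conversions: 30 }
puts probability_challenger_wins(incumbent, challenger)
```

The probability for the incumbent is one minus this value, which is why the table can report a single "Likely Winner" and its probability per goal.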

Conjecture

First and foremost, both feed strategies appear to encourage close to the same engagement. That is reassuring, because it suggests the experiment likely did not adversely affect the DEV.to experience.

Second, I’m prepared to call this first experiment in favor of the incumbent.

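Third, it appears that the challenger encouraged initial conversations, but those conversations dwindled over time. The challenger led on creating a comment (5.87% vs. 5.58%) but trailed on commenting on at least 4 different days within a week (0.19% vs. 0.23%).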

Why do I think this is the behavior? My hypothesis is that two primary changes in the challenger are responsible:

The daily_decay_factor, the numeric multiplier we assign to the publication date, overly favored more recently published articles.
Sorting the relevant feed entries by publication date, instead of the relevance score.

Let’s look at the change in publication date decay rate.

Days Since Published	Challenger #1 Weight	Challenger #2 Weight
0	1	1
1	0.95	0.99
2	0.9	0.985
3	0.85	0.98
4	0.8	0.975
5	0.75	0.97
6	0.7	0.965
7	0.65	0.96
8	0.6	0.955
9	0.55	0.95
10	0.5	0.945
11	0.4	0.94
12	0.3	0.935
13	0.2	0.93
14	0.1	0.925
15 or more	0.001	0.9

For the original challenger, I chose a more aggressive decay rate. For the second challenger, I’m significantly easing off of the decay.
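As a sketch of how the two decay schedules compare in code, the table above can be expressed as a lookup keyed by whole days since publication. The constant and method names here are hypothetical, and the real feed strategy may express this differently (for example, as case statements); only the weights themselves come from the table.

```ruby
# Weights copied from the table above; the array index is the number of
# whole days since publication (0 through 14).
CHALLENGER_ONE_WEIGHTS = [
  1, 0.95, 0.9, 0.85, 0.8, 0.75, 0.7, 0.65, 0.6, 0.55, 0.5, 0.4, 0.3, 0.2, 0.1
].freeze
CHALLENGER_TWO_WEIGHTS = [
  1, 0.99, 0.985, 0.98, 0.975, 0.97, 0.965, 0.96, 0.955, 0.95, 0.945, 0.94,
  0.935, 0.93, 0.925
].freeze

# Weight applied to articles published 15 or more days ago.
CHALLENGER_ONE_FLOOR = 0.001
CHALLENGER_TWO_FLOOR = 0.9

# Hypothetical helper: look up the publication-date weight for an article.
def daily_decay_weight(days_since_published, weights:, floor:)
  raise ArgumentError, "days must be non-negative" if days_since_published.negative?

  weights.fetch(days_since_published, floor)
end

[0, 7, 14, 30].each do |days|
  one = daily_decay_weight(days, weights: CHALLENGER_ONE_WEIGHTS, floor: CHALLENGER_ONE_FLOOR)
  two = daily_decay_weight(days, weights: CHALLENGER_TWO_WEIGHTS, floor: CHALLENGER_TWO_FLOOR)
  puts "day #{days}: challenger #1 = #{one}, challenger #2 = #{two}"
end
```

Seen side by side, a two-week-old article keeps 0.1 of its weight under the first challenger and 0.925 under the second, which is the easing described above.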

I’m also removing the order by publication date, so the upcoming feed experiment will sort entries by relevance score.
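To illustrate why the sort order matters, here is a small sketch with made-up feed entries. The `Entry` struct, its fields, and the scores are hypothetical stand-ins for whatever the feed query actually returns; the point is only that the same set of entries can surface very differently under the two orderings.

```ruby
# Hypothetical feed entry; the field names are illustrative only.
Entry = Struct.new(:title, :published_at, :relevance_score, keyword_init: true)

entries = [
  Entry.new(title: "Older, highly relevant", published_at: Time.utc(2022, 1, 10), relevance_score: 0.92),
  Entry.new(title: "Newest, weakly relevant", published_at: Time.utc(2022, 1, 24), relevance_score: 0.31),
  Entry.new(title: "Recent, moderately relevant", published_at: Time.utc(2022, 1, 20), relevance_score: 0.58)
]

# First challenger: newest first, regardless of score.
by_publication_date = entries.sort_by(&:published_at).reverse

# Upcoming challenger: highest relevance score first.
by_relevance = entries.sort_by { |entry| -entry.relevance_score }

puts by_publication_date.map(&:title).inspect
puts by_relevance.map(&:title).inspect
```

With date ordering, the weakly relevant but newest entry leads the feed; with relevance ordering, the older but highly relevant entry does.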

Next Steps

I’ve begun the proposal for our next feed experiment. This introduces a few minor tweaks and is intended to be a starting point for a conversation around how to configure the challenger’s case statements.

Forem Creators and Builders 🌱