Forem Creators and Builders 🌱

Ben Halpern for Forem Core Team


Feed test experiment 4 results: "Final ordering" helps close the loop on feed quality.

Background

The Weighted Query algorithm's initial query selects a set of posts, but it expresses no opinion on how posts are ordered within that set. The ordering is effectively "default ordering".

But there is space for us to order this final data, and this week's work focused on that.

Because we are testing something pretty new, we decided to try a variety of ordering strategies rather than home in too much. This week we wound up testing eight different versions of "final order".

We tested the "original" variant with 51% of results, and the other 7 with 7% each (7 × 7% = 49%). Spending 49% of the yield here was based on the confidence that this was very much low-hanging fruit, so we would not be hurting existing members by throwing them at untested versions. We had good confidence that most of these would be an improvement.
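As an illustration of the 51% / 7 × 7% split, here is a minimal sketch of deterministic variant assignment. It is not Forem's actual implementation: the `assign_variant` function, the `"final_order"` experiment key, and the hash-based bucketing are assumptions made for the example. Only the variant names and percentages come from this post.

```python
import hashlib

# Share of feed results per variant, in percent. "original" is the incumbent;
# the other seven are the ordering strategies named in this post.
VARIANT_WEIGHTS = {
    "original": 51,
    "score": 7,
    "comment_score": 7,
    "last_comment_at": 7,
    "random": 7,
    "random_weighted_to_score": 7,
    "random_weighted_to_comment_score": 7,
    "random_weighted_to_last_comment_at": 7,
}


def assign_variant(user_id: int, experiment: str = "final_order") -> str:
    """Map a user to a variant, stably and in proportion to the weights.

    Hypothetical helper: hashing the user id keeps each user in the same
    variant for the whole experiment.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in 0..99
    upper = 0
    for variant, weight in VARIANT_WEIGHTS.items():
        upper += weight
        if bucket < upper:
            return variant
    raise AssertionError("variant weights must sum to 100")
```

Hashing rather than drawing a fresh random number on every request is a design choice in this sketch: it means a member sees one consistent ordering strategy while the experiment runs.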

Results

Since we had a lot of variables at play, we are going to display the data by ranking for each scenario. We first note where the incumbent finished, and then which variants finished third, second, and first in each scenario.

Here are the eight ordering possibilities: the incumbent (the original "default ordering") plus these seven new variants:

  • score
  • comment_score
  • last_comment_at
  • random
  • random_weighted_to_score
  • random_weighted_to_comment_score
  • random_weighted_to_last_comment_at

| Scenario | Incumbent ranking | 3rd Place | 2nd Place | Likely Winner |
| --- | --- | --- | --- | --- |
| Creates a comment. | 7th | incumbent | random_weighted_to_score | last_comment_at |
| Creates comments on at least 4 different days within a week. | 3rd | random_weighted_to_score | random_weighted_to_comment_score | random_weighted_to_last_comment_at |
| Views pages on at least 4 different days within a week. | 6th | score | random | random_weighted_to_score |
| Views pages on at least 4 different hours within a day. | 7th | random | last_comment_at | random_weighted_to_score |
| Views pages on at least 9 different days within 2 weeks. | 6th | score | random_weighted_to_comment_score | random_weighted_to_score |
| Views pages on at least 12 different hours within five days. | 3rd | incumbent | score | random_weighted_to_score |
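To make the table easier to read at a glance, the podium columns can be tallied with a few lines of Python. This is only a convenience sketch over the rows above, copied as written. It is not part of the experiment tooling, and the variable names are made up for the example.

```python
from collections import Counter

# (3rd place, 2nd place, likely winner) for each scenario, in table order.
podiums = [
    ("incumbent", "random_weighted_to_score", "last_comment_at"),
    ("random_weighted_to_score", "random_weighted_to_comment_score",
     "random_weighted_to_last_comment_at"),
    ("score", "random", "random_weighted_to_score"),
    ("random", "last_comment_at", "random_weighted_to_score"),
    ("score", "random_weighted_to_comment_score", "random_weighted_to_score"),
    ("incumbent", "score", "random_weighted_to_score"),
]

# How often each variant was the likely winner.
wins = Counter(winner for _, _, winner in podiums)
# How often each variant appeared anywhere in the top three.
top_three = Counter(name for podium in podiums for name in podium)

for variant, count in top_three.most_common():
    print(f"{variant}: top three in {count} of {len(podiums)}, "
          f"won {wins[variant]}")
```

By this count, random_weighted_to_score is the likely winner in four of the six scenarios and appears in the top three in all six.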

Conjecture

The clear winner here is random_weighted_to_score. This ordering algorithm shuffles the results (so that the same post does not stick to the top all day) but weights the shuffle approximately by each post's "score".

Score is a calculated field which mostly correlates to total positive reactions minus any negative moderator actions.
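One way to picture a score-weighted shuffle is the sketch below. It is an assumption about the general technique, not the code Forem runs: the function name `random_weighted_order`, the `min_weight` floor for low or negative scores, and the use of exponential random keys are all choices made for this example.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def random_weighted_order(
    posts: Sequence[T],
    scores: Sequence[float],
    min_weight: float = 1.0,
    rng: Optional[random.Random] = None,
) -> List[T]:
    """Return posts in a random order biased toward higher scores.

    Each post draws an exponential "arrival time" whose rate is its weight;
    sorting by arrival time gives a weighted random permutation, where a
    post's chance of coming first is its weight divided by the total weight.
    """
    rng = rng or random.Random()
    keyed = []
    for post, score in zip(posts, scores):
        # Assumption: scores can be zero or negative (moderator actions), so
        # clamp to a small positive weight to keep every post eligible.
        weight = max(float(score), min_weight)
        keyed.append((rng.expovariate(weight), post))
    keyed.sort(key=lambda pair: pair[0])  # earliest arrival first
    return [post for _, post in keyed]


# Usage: a high-score post usually, but not always, lands on top.
feed = random_weighted_order(["post_a", "post_b", "post_c"], [90, 9, -3])
print(feed)
```

The output changes from call to call, which is the point: the top-scoring post leads most of the time without being pinned to the first slot.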

This gives us evidence that the best signal-to-noise ratio here comes from letting votes of popularity indicate which posts should float to the top most often. Importantly, our weighted query algorithm does not directly use score, so this value is used only for ordering within the set the initial query returned.

This was a predictable outcome: top posts should, on average, show up near the top of the feed if we want people to come back to a useful destination. But if we blindly keep the same top post there for too long, the feed will not be as useful.

It is worth noting that random_weighted_to_last_comment_at had a bit of downtime because it did not handle some edge cases appropriately, and it still finished well in most tests. random_weighted_to_score remains the clear winner even if we assume that ...last_comment_at was penalized, but this should inform some future tests which might build on this initial "low-hanging fruit" outcome.

Next steps

I am very excited to declare a winner here, and have the results impact all users. I think this was a big step in the process. The next proposed test is an adjustment in how we assign tag preference weights.

I believe these tests are picking up steam, and we will be able to eye some refactors which make them more flexible for all Forems. DEV is the only Forem big enough to confidently glean information from these A/B tests, but these results will begin clearly impacting all Forems for the better.

Top comments (1)

Lee

Great news Ben! Thanks for the constant updates on this