A new setting has been added to manage which articles can be indexed by search engines based on their score. New communities should see more posts indexed by search engines faster.
We identified a few recent pain points among creators where articles in their communities were being ignored by Google and other search engines.
After further investigation, we traced this to some choices the Forem team had made to limit indexing of spam or low-quality posts.
In new communities, where a large body of existing activity (ratings, comments, etc.) may not exist, some of these limits might not make sense. One in particular, prioritizing posts with code snippets in them as higher quality, was a good fit for DEV and other code-centered communities but is less meaningful for other Forems.
What type of PR is this? (check all applicable)
- [x] Refactor
- [ ] Feature
- [ ] Bug Fix
- [ ] Optimization
- [ ] Documentation Update
Remove the check for `<code>` tags (a few test cases that exercise this check can still be removed).
Replace the two "magic" numbers, 3 and 5, used in sitemap generation and on the article show page with a single setting: if an article is going to be included in the sitemap, it should be indexed.
Related Tickets & Documents
Andy's forem.team post
Ben's suggestion from the comment was the guidance here.
QA Instructions, Screenshots, Recordings
Behavior covered by the spec/requests/stories_show_spec.rb cases.
Confirmed new setting shows and can be updated in the admin/customization page:
UI accessibility concerns?
- [ ] Yes
- [x] No, and this is why: refactor
- [ ] I need help with writing tests
[Forem core team only] How will this change be communicated?
Will this PR introduce a change that impacts Forem members or creators, the development process, or any of our internal teams? If so, please note how you will share this change with the people who need to know about it.
- [x] I will share this change internally with the appropriate teams
- [x] I will add a Forem.dev changelog post
- [x] Updated the admin guide
[optional] Are there any post deployment tasks we need to perform?
DEV, and possibly other communities, should set the new setting to a non-zero value.
We will add a new setting "index minimum score" to the User Experience and Branding section of the customization/config admin area.
This setting defines a minimum score that decides which posts appear in the sitemap. It also determines whether a post's page includes the “noindex” and “nofollow” robots meta tags.
This setting works similarly to the tag minimum score and home feed minimum score settings that are used for filtering, but rather than controlling what appears on feed pages, it controls what search engine crawlers are asked to index.
Previously, two different values were in use, which let some pages find their way into the sitemap while also carrying the noindex/nofollow meta tags. The sitemap suggests the crawler should visit the page, while the noindex meta tag instructs the crawler to ignore it. We decided to unify these into a single threshold so the two signals stay consistent.
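To make the conflict concrete, here is a minimal sketch of how two separate thresholds can disagree. The method names are hypothetical, and the exact comparisons (and the assumption that 3 applied to the sitemap and 5 to the noindex check) are inferred from the description here rather than copied from the Forem code.

```ruby
# Hypothetical helpers; the default thresholds mirror the former "magic" numbers.
def old_in_sitemap?(score, sitemap_min: 3)
  score >= sitemap_min
end

def old_noindex?(score, index_min: 5)
  score < index_min
end

# A page sends conflicting signals when it is listed in the sitemap
# but also carries the noindex/nofollow meta tags.
def conflicting_signals?(score)
  old_in_sitemap?(score) && old_noindex?(score)
end

conflicting = (0..6).select { |score| conflicting_signals?(score) }
puts conflicting.inspect
```

Under these assumptions, any score that falls between the two thresholds is both advertised to crawlers and marked as not indexable, which is the gap a single threshold closes.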
Negatively scored (down-voted by a moderator) articles will not be indexed, regardless of the value of this setting.
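The rules above can be sketched as a single predicate shared by the sitemap and the article page. This is an illustration only: `indexable?`, `in_sitemap?`, and `robots_meta` are hypothetical names, and the real setting lookup in Forem is not shown.

```ruby
# Hypothetical predicate: one threshold decides both sitemap inclusion
# and whether the page is marked noindex/nofollow.
def indexable?(score, index_minimum_score)
  # Negatively scored articles are never indexed, whatever the setting is.
  return false if score.negative?

  score >= index_minimum_score
end

def in_sitemap?(score, index_minimum_score)
  indexable?(score, index_minimum_score)
end

# Returns the robots meta content for a page, or nil when none is needed.
def robots_meta(score, index_minimum_score)
  indexable?(score, index_minimum_score) ? nil : "noindex, nofollow"
end

[[-1, 0], [0, 0], [4, 5], [5, 5]].each do |score, minimum|
  puts "score=#{score} minimum=#{minimum} indexable=#{indexable?(score, minimum)}"
end
```

Because both helpers delegate to the same predicate, a page cannot be listed in the sitemap and marked noindex at the same time.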
If you take no action after this is released, some articles that had been excluded from the sitemap (because they had a score lower than 3) or from search engine results (because of the noindex meta tag) may start showing up there.
Raising this setting above 0 in the settings page will require published posts to clear a higher bar before being indexed by search engine crawlers. This can cut down on spam submissions but won't serve as a complete alternative to moderation.
For posts with active discussions or many heart/unicorn votes, this will not be a problem. For new communities, where there's not a lot of active feedback or discussions, this change should speed up getting content read by Google and available in the search results.