TL;DR: How long do you run a given version of Forem before updating?
Hey all, I'm the principal SRE here at Forem, and I'm looking for some feedback on how best to support folks in the Forem ecosystem when it comes to updating your Forem installation.
I've been leading some internal discussions about how we apply changes to both the structure and data in the database. Since some of these changes don't make sense after a while (for example, we can't migrate data to a DB column that no longer exists), we sometimes delete them. So we want to make sure that we give everyone enough time to update to the latest Forem version[1] and run these migrations.
The hard part is that we don't have any hard data on how long a given version of the code is running in the wild on any deployments we don't control ourselves. We've been working on a Forem Discovery service, which is basically a registry of the known Forem ecosystem. If you register your Forem instance with this service (we won't force it on you — you may not want to make your Forem instance discoverable if it's private), it can let us know which version you're on so we can see how long we need to keep a given database migration. Unfortunately, that service isn't ready yet.
Mostly we're trying to figure out whether we need to support these migrations for a matter of weeks or months. Hopefully, we don't need to go to years 😄 but we just want to understand what order of magnitude we need to optimize for.
Looking forward to hearing from y'all!
[1] I'm using the word "version" to refer to the state of the codebase at any given point in time. This is not about the versioning project, but we do have a lot more clarity around that and I'll be writing another post about that soon.
Top comments (11)
I think making sure that updates go smoothly is one of the main (if not the main) concerns for owners of self-hosted instances. In that spirit, wouldn't it make sense to have a system where some "checkpoint" updates (e.g., all those that touch the DB schema) must be performed when trying to update to the latest version?
For example to go from 0.6 to 0.8 one has to first update to 0.7 because it contains a series of data mutations that are necessary to ensure that the update to 0.8 goes smoothly.
It's been a long time since I used a framework like Rails (and in my case it was Django anyway), but I believe that those systems did in fact have support for this type of thing. Of course this only covers the DB, so it might be necessary to engineer something similar for other stateful components of the system.
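To make the checkpoint idea above concrete, here is a minimal Ruby sketch of how an upgrade path could be computed. Everything in it is hypothetical: the `RELEASES` list, the `checkpoint` flag, the extra 0.9 release, and the `upgrade_path` helper are assumptions for illustration, not anything Forem ships.
```ruby
# Hypothetical release list. A release flagged as a checkpoint must be
# installed (and its migrations run) before moving past it.
RELEASES = [
  { version: "0.6", checkpoint: false },
  { version: "0.7", checkpoint: true },
  { version: "0.8", checkpoint: false },
  { version: "0.9", checkpoint: false },
].freeze

# Returns the versions to install, in order, to get from `from` to `to`:
# every checkpoint strictly between the two, followed by the target itself.
def upgrade_path(releases, from:, to:)
  versions = releases.map { |r| r[:version] }
  start = versions.index(from)
  finish = versions.index(to)
  raise ArgumentError, "unknown version" if start.nil? || finish.nil?
  raise ArgumentError, "cannot downgrade" if finish < start
  return [] if start == finish

  checkpoints = releases[(start + 1)...finish]
                .select { |r| r[:checkpoint] }
                .map { |r| r[:version] }
  checkpoints + [to]
end

p upgrade_path(RELEASES, from: "0.6", to: "0.8") # ["0.7", "0.8"]
p upgrade_path(RELEASES, from: "0.6", to: "0.9") # ["0.7", "0.9"]
```
In this sketch, non-checkpoint releases such as 0.8 can be skipped, while 0.7 always has to be visited on the way.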
Agreed! 🙌🏼
Yes and no. You can run `rails db:migrate` in Forem against a fresh Postgres database and it will work. This part is fine in Forem.
Data migrations, on the other hand, are done as a separate task. This practice isn't specific to Forem and is a pretty common practice in the Rails ecosystem. There are multiple popular gems for it, even, such as `data-migrate` and `after_party`. We use a home-grown solution at Forem that works on basically the same principle.
This is where things get more complicated for us, though. All of those solutions run all pending data migrations after all pending schema migrations, which optimizes around the idea that the app is being deployed at the same cadence. For nearly every production Rails app this is fine, but there are a lot of Forem deployments we don't control. For example, consider the following scenario:
For deployments we control, it's extremely unlikely that S2 will run before D1. It's virtually impossible on DEV (which is deployed on every commit to `main`) and the window is pretty small on the few dozen other Forem communities that we host (we deploy them 2-5x/week).
However, for deployments we don't control, we can't guarantee that, so this creates the potential for a race condition. If it's feasible for S2 to run before D1, we need to ensure that D1 can complete successfully in that case. It's become common practice at Forem to delete data migrations after a certain amount of time so that they don't cause this problem, but if your Forem instance is updated after that point, you may miss a data migration. We don't want that, either. 🙂 So we want to understand how long we need to support these data migrations or if it makes more sense to change our process.
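The ordering problem can be simulated in a few lines of plain Ruby. This is only a sketch under assumptions: the labels S1, D1, and S2 come from the comment above, but what each migration does here (S1 adds a column, D1 reads an old column, S2 drops that old column) is a guess made for illustration, and `deploy` and `simulate` are hypothetical helpers, not Forem code.
```ruby
require "set"

# Hypothetical migrations. The actions are assumed for illustration only.
Step = Struct.new(:name, :kind, :timestamp, :action)

STEPS = [
  Step.new("S1", :schema, 1, ->(cols) { cols << "new_name" }),
  Step.new("D1", :data, 2, lambda { |cols|
    raise "D1 failed: old_name no longer exists" unless cols.include?("old_name")
  }),
  Step.new("S2", :schema, 3, ->(cols) { cols.delete("old_name") }),
].freeze

# One deploy: all pending schema migrations first, then all pending data
# migrations, which is the ordering described above.
def deploy(columns, pending)
  schema, data = pending.partition { |s| s.kind == :schema }
  (schema.sort_by(&:timestamp) + data.sort_by(&:timestamp)).each do |step|
    step.action.call(columns)
  end
end

# Runs a series of deploys against a toy "table" that is just a set of
# column names. Returns :ok, or the error message of the first failure.
def simulate(batches)
  columns = Set.new(%w[id old_name])
  batches.each { |batch| deploy(columns, batch) }
  :ok
rescue RuntimeError => e
  e.message
end

s1, d1, s2 = STEPS
p simulate([[s1, d1], [s2]]) # frequent deploys: D1 runs before S2 exists
p simulate([[s1, d1, s2]])   # one late deploy: S2 runs before D1
```
With frequent deploys the first call returns `:ok`; when all three migrations are pending at once, S2 runs before D1 and the second call returns the D1 failure message.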
I see. It seems to me that there's no way to keep people happy if the default is that every time you push a schema-breaking change, you accept cutting some instances off from the normal upgrade path. I don't mean to make this a Ruby vs. Python thing, but I believe Django had a way of interleaving data changes with schema changes to make migrations repeatable even when running all of them at once.
Aside from that, maybe there is another angle worth exploring: what if those changes triggered a major version change of Forem and required people to manually change the container image tag to continue the upgrade process? That way you could use those tags as checkpoints.
As a small aside, I'm interested in any case in messing with the tag / endpoint of the Rails app, because I want to make some minor customizations to the interface, so I will have to maintain my own fork and manually merge any changes from upstream. I'm mentioning this because it's another (imho not too uncommon) use case that requires messing around with the Rails image endpoint and tag.
This approach (specifically, treating data migrations as schema migrations to run them chronologically) is on the table 🙂
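As a rough sketch of what running everything chronologically could look like, the following plain Ruby sorts schema and data migrations into a single timeline by timestamp. The `Migration` struct, the `PENDING` list, and the actions inside it are hypothetical and only meant to illustrate the ordering; this is not how Forem or Rails implements it.
```ruby
require "set"

# Hypothetical migrations, deliberately listed out of order.
Migration = Struct.new(:name, :kind, :timestamp, :action)

PENDING = [
  Migration.new("S2", :schema, 3, ->(cols) { cols.delete("old_name") }),
  Migration.new("D1", :data, 2, lambda { |cols|
    raise "D1 needs old_name" unless cols.include?("old_name")
  }),
  Migration.new("S1", :schema, 1, ->(cols) { cols << "new_name" }),
].freeze

# Treats data migrations like schema migrations: one timeline, ordered by
# timestamp, regardless of kind. Returns the names in the order they ran.
def run_chronologically(pending)
  columns = Set.new(%w[id old_name])
  pending.sort_by(&:timestamp).map do |migration|
    migration.action.call(columns)
    migration.name
  end
end

puts run_chronologically(PENDING).join(" -> ") # S1 -> D1 -> S2
```
Because D1 always runs before S2 here, it no longer matters how many migrations have piled up between updates.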
I like this idea and I really like how some libraries and frameworks (for example, Ember.js and RSpec) implement it — upgrade to the latest 2.x release, fix deprecation warnings, upgrade to 3.0. Unfortunately, our versioning scheme will be calendar-based rather than SemVer, so this wouldn't fit. Still, it's a pretty slick move if you can swing it.
Galaxy brain solution: only do breaking updates once a year on January 1st and call it a major version increase ;-)
Redis doesn't store anything that can't be replaced easily for longer than a few milliseconds and we removed Elasticsearch to make self-hosting simpler and more affordable. The Postgres DB is all that remains when it comes to long-term storage of critical data. 😄
I update my Forem on a daily basis.
I also update frequently, just to ensure that I don't miss any of those important updates, data scripts, etc.
Thanks for weighing in here, Lee. Your use case was one of the ones I was most concerned with since, as far as I'm aware, you run more Forem instances than anyone. I even considered tagging you directly here.
My concern was that, beyond a certain threshold, it might be difficult to keep up with updates, but if it's not, it sounds like we might be able to keep this reasonably tight.
Just so I've got some concrete numbers to work with, when you say "frequently" here, how often would you say that is in days/weeks on the high end?
No worries, I rarely go longer than 3 days but then I may go on holiday for a couple of weeks. I only have the one that is Heroku hosted, the remaining Forems are SaaS 😎