Forem Creators and Builders 🌱

Phuc Tran

Self-host architecture questions

I am going to create my own community. I found Forem and I'm impressed. I have a few questions:

  1. Can we use another database (not PostgreSQL)?

  2. Should I move the database to a separate host/service?

  3. How do I set up load balancing to avoid downtime?

Top comments (6)

Ella (she/her/elle) (@ellativity) • Edited

Welcome, @thphuc ! These are great questions, and I'm sure the answers will be interesting for a lot of people, so I'm happy you posted them here on forem.dev!

I'm going to wager that the best people to weigh in on these questions are @djuber, @jamie, and @andygeorge, though I always stand to be corrected!

Daniel Uber (@djuber)

The self-host solution really was designed to collapse a multi-tier webapp architecture into a single machine (where redundancy and reliability are secondary to cost considerations), targeting a much lower price point than you would pay to host each of the necessary services stand-alone.

Instead, the self-host architecture assumes a moderately sized VPS on a cloud host and runs those services as containers (by default, files are stored on local disk, with no redundancy). That single point of failure was a deliberate design decision, made to keep hosting costs down for small communities.

  1. I'm 90% sure there's a hard dependency on PostgreSQL as the database backend, because of the use of full text search. Unless you're using a fully compatible database, much of the search functionality will be broken. Originally Forem used Elasticsearch, and leveraging Postgres's full text search enabled us to remove that dependency (which makes the minimum memory requirements for hosting a lot lighter). This decision binds us more tightly to PostgreSQL than a "vanilla" Rails app would be, but we had to have a SQL database anyway, we host DEV.to on Heroku (where PostgreSQL is the default, and easiest, database to use), and we wanted to decrease the memory footprint of the app for small installations, where Elasticsearch was an oversized solution relative to the rest of the stack.

  2. You can host the PostgreSQL database on a separate service. You might want or need to disable the forem-postgresql service as you set that up, and provide the database URL in an environment variable or the .env file (the default assumption is that PostgreSQL is hosted locally, and your database host will have provided a different connection string for you). The same applies to the Redis cache (and background job queue), which would make sense to host externally if you need access from multiple backends (both for a shared cache and to have a single job queue).

  3. If you're looking at load balancing (and having redundant web backends), then in addition to hosting the PostgreSQL and Redis data stores separately, it would be important to enable externally hosted file uploads. The default configuration saves image uploads to disk, but cloud storage on S3, or an S3-compatible storage service, can be enabled by ensuring FILE_STORAGE_LOCATION is not set to "file" and providing the following environment variables: AWS_ID, AWS_SECRET, AWS_BUCKET_NAME, AWS_UPLOAD_REGION.
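To make the three points above a bit more concrete, here is a rough preflight check, sketched in Python, that you could run against your environment before trying a multi-host setup. It is only an illustration and not part of the self-host script. The FILE_STORAGE_LOCATION and AWS_* names come from the list above, but DATABASE_URL and REDIS_URL are assumed names for the connection strings, and the "local host" test is a guess at what a same-machine default looks like, so check your own .env before relying on it.

```python
import os
from urllib.parse import urlsplit

# Variable names quoted in the list above.
S3_VARS = ("AWS_ID", "AWS_SECRET", "AWS_BUCKET_NAME", "AWS_UPLOAD_REGION")

# ASSUMPTION: these two names are illustrative; confirm which variables
# your Forem version actually reads for its connection strings.
DB_VAR = "DATABASE_URL"
REDIS_VAR = "REDIS_URL"

# ASSUMPTION: hosts treated as "still on this machine".
LOCAL_HOSTS = {"", "localhost", "127.0.0.1", "::1"}


def multi_host_problems(env):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []

    # Point 1 and 2: the database must be PostgreSQL and hosted elsewhere.
    db = urlsplit(env.get(DB_VAR, ""))
    if db.scheme not in ("postgres", "postgresql"):
        problems.append(f"{DB_VAR} should be a postgres:// URL")
    elif (db.hostname or "") in LOCAL_HOSTS:
        problems.append(f"{DB_VAR} still points at this machine")

    # Point 2: the Redis cache and job queue must be shared by all backends.
    redis = urlsplit(env.get(REDIS_VAR, ""))
    if redis.scheme not in ("redis", "rediss"):
        problems.append(f"{REDIS_VAR} should be a redis:// URL")
    elif (redis.hostname or "") in LOCAL_HOSTS:
        problems.append(f"{REDIS_VAR} still points at this machine")

    # Point 3: uploads must not be written to one backend's local disk.
    if env.get("FILE_STORAGE_LOCATION") == "file":
        problems.append("FILE_STORAGE_LOCATION is 'file' (uploads stay on local disk)")
    for name in S3_VARS:
        if not env.get(name):
            problems.append(f"{name} is not set")

    return problems


if __name__ == "__main__":
    for problem in multi_host_problems(os.environ):
        print(problem)
```

An empty result only means the variables look plausible for a shared setup; it says nothing about whether the remote services are reachable or correctly configured.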

The default configuration has openresty/traefik running in containers on the same machine. Some of this would likely need to be removed or reconfigured if you're putting a load balancer in front of the service and no longer handling SSL termination locally.

With this level of customization, it's likely that the choices made for the default self-hosting case won't align with your needs, and you'll end up using only the forem-rails and forem-sidekiq containers (and these may be on distinct hosts as well, sharing state via Redis and the database). We're unable to support modifications of this kind to the self-host script, which becomes more an example of what was needed to start the system on a single machine than an installer that is easily repurposed for a multiple-machine solution. Every component installed by the self-host script serves a purpose, but none of them strictly needs to be on the same machine as the others (hosting nginx and the Ruby web server together probably does make some sense), and any multiple-service or multiple-host solution will end up recreating much of that architecture.

I don't have a great feeling for how to set up the load balancer if you go forward with that, but it's definitely outside any supported self-hosting configuration. It's possible a community member has experimented with this in the past, and if you do find a solution, sharing it here would be welcome.

Phuc Tran (@thphuc) • Edited

Thanks a lot @djuber for your great answers.

I guess I will need more time to get familiar with Forem first.
By the way, I cloned the repository to my local PC (MacBook) and ran it. However, it showed a warning: "Setup not completed yet, missing community description, suggested tags, and suggested users. Please visit the configuration page."

When I clicked on the link "visit the configuration page", it showed the error below:

Ella (she/her/elle) (@ellativity)

Hey @thphuc when you say you cloned the repo to your local PC and ran it, what were the steps you followed?

The "Setup not completed yet..." warning usually appears on a brand-new Forem that hasn't yet been configured for members to join. However, that warning isn't seen until the First User/Admin User is logged in.

Without knowing what steps you took to see the warning without being logged in as the site admin, it's difficult to know how to help you. Can you please share your actions step-by-step?

Daniel Uber (@djuber)

@thphuc - when you run the Forem software locally (which looks like it started up correctly), the software is in "development" mode, and some error handling that would normally have occurred (in this case, redirecting back to sign in when trying to access pages that require login) is disabled. The idea is that a developer would like to see the actual error and where it happened, while in "production" mode the errors would expose too much of the internals to be useful to most users. For example, the same link on DEV.to gives me a 404 error page.

You'll want to log in: the database seed created a few users, all with the password "password", and the administrator account has the email "admin@forem.local". Once you log in, following that link will work without raising the authorization error.
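If the development/production difference is unfamiliar, here is a toy sketch of the idea in Python. Forem itself is a Rails app and none of this is its actual code: the NotAuthorized exception, the admin_config_page handler, and the handle function are all hypothetical names made up for the illustration.

```python
class NotAuthorized(Exception):
    """Raised when a page that requires login is requested without one."""


def admin_config_page(user):
    # Hypothetical handler standing in for a page that requires a logged-in admin.
    if user is None:
        raise NotAuthorized("login required")
    return (200, "configuration page")


def handle(request_user, mode):
    """Return (status, body), treating errors differently in each mode."""
    try:
        return admin_config_page(request_user)
    except NotAuthorized:
        if mode == "development":
            # Let the error surface so the developer sees what failed and where.
            raise
        # In production, hide the internals behind a generic response.
        return (404, "page not found")
```

Calling handle(None, "development") raises the error, which is roughly what you saw locally, while handle(None, "production") quietly returns the 404. Once a user is passed in, both modes return the page.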

Phuc Tran (@thphuc)

Thanks @ellativity and @djuber. I think I should follow the instructions from @djuber first. I will let you know if I run into any other issues.