Forem Creators and Builders 🌱

Discussion on: Scalability of FOREM selfhost

 
Jamie Gaskins

We use a total of 16 Puma processes across 2 PL dynos with a total of over 100 Puma threads. Sidekiq workers run 2 Sidekiq processes across 2 PM dynos with a total of 30 Sidekiq threads.

it would require a machine with well over 300GB of RAM.

This statement has confused me.
If we have a different server for each service and allocate 2 GB of RAM to each VM, it adds up to 12 GB.
If we have all the services in one VM, I was thinking that the idle RAM would be better utilized and either 6 or 8 GB might be enough.

Most Forem instances should indeed run everything on the same box for precisely that reason. There is significant overhead per machine and performance-per-dollar decreases once you scale beyond a single machine, so until you need more capacity than a single machine can feasibly provide, you should absolutely stick to that single machine.

However, DEV is well beyond that threshold. We serve more traffic than Hacker News, so it's more cost-effective at our scale to spread across multiple machines and delegate things like Postgres and Redis to cloud services that specialize in running them. If there's an outage, downtime is measured in seconds to minutes, whereas if we put everything on a single box to save infrastructure costs, downtime would easily be 8-24 hours. It would take weeks or months to recover from the SEO penalty of that downtime and, most importantly, the hot takes on Twitter would be devastating. 😄

9comindia

@jamie thank you very much for the excellent clarification.
I am planning to use the Cloudflare free plan as a CDN to reduce the load on the server.
Any insights on why I should or should not use Cloudflare?

Ben Halpern

We use Fastly on DEV, and there is more built-in logic for caching that way. All in all, it's your best bet for max scalability given the custom support in the Forem code.

Jamie Gaskins

^^ This. The only currently supported caching reverse proxies are Fastly and Nginx. The Forem Self-Host configuration uses OpenResty, which is built on Nginx, so even though you won't have the points of presence that a proper CDN would provide (which often has a low cache hit rate anyway without a large quantity of traffic), you'll still get solid caching on top of the layers of application-level caching we do.

Are you currently seeing traffic volume that would require looking into a CDN?

9comindia

Are you currently seeing traffic volume that would require looking into a CDN?

Not yet :) Just preparing for the future.
A few days ago I ran a "website killer" load-testing app against my website URL, which was hosted on a 2-core, 4 GB RAM instance on GCP. The response times were approximately as follows:
5 concurrent users - 1 sec
100 concurrent users - 2 sec
200 concurrent users - 4 sec
800 concurrent users - 50 sec

100 concurrent users is already a lot of traffic, and the test showed that the website can handle it well.
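
As a rough sanity check on those numbers, here is a minimal Python sketch that estimates throughput with Little's Law (throughput ≈ concurrent users / response time). It assumes each simulated user sends its next request as soon as the previous one returns, which the test tool may or may not do. The function name `estimated_throughput` is only for illustration.

```python
# Rough throughput estimate from the load-test figures above, using Little's Law:
#   throughput ≈ concurrent users / average response time
# Assumption: each simulated user sends its next request as soon as the
# previous one returns (no think time between requests).

# (concurrent users, approximate response time in seconds), taken from the test
results = [(5, 1), (100, 2), (200, 4), (800, 50)]


def estimated_throughput(users: int, response_seconds: float) -> float:
    """Return the approximate number of requests completed per second."""
    return users / response_seconds


for users, seconds in results:
    rps = estimated_throughput(users, seconds)
    print(f"{users:>4} users, {seconds:>2} s -> ~{rps:.0f} req/s")
```

Under that assumption the estimates come out to about 5, 50, 50, and 16 requests per second. Throughput stops growing between 100 and 200 users and drops sharply at 800, which suggests the box is already saturated at around 100 concurrent users.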

But if someone else uses similar tools to attack the website continuously, there is no protection.
Even the cloud security mechanisms did not intercept the attack, and the website was not accessible from other devices while it was running.

That's why I'm hoping that Cloudflare or another CDN may be able to effortlessly serve the cached static content to such attackers and keep the website up for real users.