@helgek @darnell @volkris @nickapos @jan that is purely technical.

For now we are stuck with Mastodon, which is a Rails app and comes with all sorts of performance characteristics. The larger alternatives (elixir+pg, php+pg) are rather similar.

Performance and scaling are stuck there now. Regardless of the number of people or followers, the limits today are imposed by the tech stack.

So that's where work is needed.

@berkes @helgek @darnell @volkris @nickapos Rails is not the main reason for the scaling issues. It's mainly background jobs, where Ruby is just a wrapper for transcoding libs etc. You can scale Puma and Sidekiq horizontally by adding more compute/workers. Same for storage. Meaning it's not the language or the framework; it's the protocol and overall architecture that lead to more resource consumption.

@jan @berkes @helgek @darnell @volkris @nickapos I also don’t think it can be so much an issue with raw compute power. That only affects how the server performs within its own workloads, not between computers at all.

Network I/O takes orders of magnitude longer than compute. But while doing it, the thread spends most of its time waiting, doing very little compute.
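A minimal Python sketch of that point, assuming each delivery can be modelled as a fixed-latency wait (`asyncio.sleep` stands in for a real HTTP POST). The names `deliver`, `deliver_all` and `timed_delivery` and the `.example` inbox URLs are made up for illustration; none of this is taken from Mastodon or any other Fediverse server.

```python
import asyncio
import time


async def deliver(inbox: str, latency: float) -> str:
    # Stand-in for an HTTP POST to a remote inbox: the sleep simulates
    # time spent waiting on the network, during which no CPU work happens.
    await asyncio.sleep(latency)
    return inbox


async def deliver_all(inboxes: list[str], latency: float) -> list[str]:
    # Start every delivery at once and wait for all of them to finish.
    return await asyncio.gather(*(deliver(i, latency) for i in inboxes))


def timed_delivery(count: int = 50, latency: float = 0.02) -> tuple[int, float]:
    # Hypothetical inbox URLs; returns (deliveries completed, seconds elapsed).
    inboxes = [f"https://instance-{n}.example/inbox" for n in range(count)]
    start = time.perf_counter()
    delivered = asyncio.run(deliver_all(inboxes, latency))
    return len(delivered), time.perf_counter() - start
```

With 50 simulated deliveries at 20 ms each, doing them one after another would take about a second. Overlapping the waits should finish in roughly the time of a single wait, since the thread is idle for almost all of it.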

Maybe a layer of distributed caches? Delivery is the key.

@gimulnautti @jan @berkes @helgek @darnell @nickapos

Remember, this is not theoretical. This is actual experience, heard from real people running instances who find themselves having to shell out for hosting costs they weren't expecting.

The protocol requires poorly scaling processing and bandwidth.

And that's not even getting into the expensive design decisions that Mastodon in particular put on top of everything else. For example, the intentional decision not to redistribute image previews but instead to require each instance to go out and pull its own image preview, duplicating that effort throughout the whole platform.
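As a back-of-envelope sketch of that duplication, assume a single post with one preview reaches some number of instances. The function name `preview_fetches` and the figure of 2000 instances are hypothetical, chosen only to show how the count grows; they are not measurements from any real server.

```python
def preview_fetches(instances_reached: int, redistribute: bool) -> int:
    """Count fetches of one preview for a post that reaches N instances.

    Assumption: without redistribution every receiving instance pulls the
    preview itself; with redistribution it is fetched once and shipped
    along with the post.
    """
    if instances_reached < 0:
        raise ValueError("instances_reached must be non-negative")
    if redistribute:
        return 1 if instances_reached > 0 else 0
    return instances_reached


# Hypothetical reach of 2000 instances for a single post.
duplicated = preview_fetches(2000, redistribute=False)
shared = preview_fetches(2000, redistribute=True)
```

Under these assumptions the work grows linearly with the number of instances reached in one case and stays constant in the other.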

It seriously sounds like [almost] nobody involved in this from protocol design up through platform implementation gives a second thought to what's going to happen at scale.

And I may have said it in this thread already, but when I was in school for computer science we were hammered with big-O analysis of algorithmic scaling. Someone recently told me that's not emphasized in school these days, and it sure looks like that's the case.

@volkris @gimulnautti @jan @berkes @helgek @nickapos We definitely need to figure out how to operate #Mastodon / #Fediverse at scale.

@Sujiyan revealed that the cost of running Pawoo (pawoo.net), MSTDN Japan 🇯🇵 (mstdn.jp) & Mastodon Cloud (mastodon.cloud) surpassed $1 million over 2 years (legal, staff, servers, etcetera).

👉🏾 darnell.day/zooming-with-pawoo

@volkris @gimulnautti @jan @helgek @darnell @nickapos I am one of those who (help) run a large instance.

And no. Bandwidth isn't an issue, nor is the amount of requests.

I've built and provisioned backends that handle way more requests and ingress/egress than this large Mastodon instance, on way smaller and cheaper infra.

In Mastodon the one really costly piece is the async workers. Which are this costly... 1/2

@volkris @gimulnautti @jan @helgek @darnell @nickapos because Rails is too slow to handle most stuff inline.

Ruby is slow, but most of all, the architecture (ActiveRecord, MVC) isn't one optimized for scale. Typically with Rails, "scaling" means just throwing more resources at it. Costly. With the typical SaaS use case for Rails, that's not a big problem. But it's a real issue on the Fediverse.

The protocol isn't the problem. Yet.

@berkes @volkris @gimulnautti @jan @helgek @darnell @nickapos I am shocked how big an instance I have had to provision. Disk space management and aggressive caching don’t help. Far too much seems to be preemptively cached.

Over 20 years ago, e-mail delivery for 1000s of clients, with loads of network I/O over shitty links, was more efficient. On hardware less powerful than your phone.

@snookerarmchair's comment is exactly the kind of thing I'm referring to when I say I see people surprised by the amount of resources it takes to run an instance.

This thread highlights that people are comparing against different levels of expectation.

That someone starts from a place where they expect it to take those resources doesn't change that someone else is surprised that it does.

Personally, I suspect we set the bar too low and waste resources because we tolerate systems that don't operate efficiently.

And thus we get ActivityPub, which seems to live down to those expectations.

@berkes @gimulnautti @jan @helgek @darnell @nickapos

@volkris have you run a GoToSocial or even a Pleroma instance?

Again, and I'm repeating myself, neither the amount of requests nor the bandwidth imposed by ActivityPub is any kind of problem yet. It really is only Rails, underlying Mastodon.

I urge you to provision a GoToSocial instance and experience how much a tiny VPS can handle. How many toots it can move on infra costing under €5/month.

@berkes

And again, you're basing your statements on a particular standard that others don't necessarily hold.

In your case ActivityPub isn't causing any problems *given the resources you're already devoting to it* while other people who aren't devoting those resources are having problems.

Other people who haven't expected the platform to be so bandwidth intensive are finding themselves surprised by the amount of bandwidth it takes. Your expectations are different and you're not surprised. That just speaks to expectations, not to the platform.

@volkris who is surprised by the BW used? Honestly, I've never encountered that limit on any of the servers I helped run.

All cloud providers, or even rack space providers, give some "free" bandwidth with their servers. And I'd be very surprised to find anyone reaching that free limit before running into the CPU, memory and storage limits. And when one ups those, the amount of "free" data increases.

When or in what case is it a bottleneck?

@berkes

I've heard probably half a dozen admins complaining about it in the last month or so, sharing graphs of their bandwidth, and wondering if something is going wrong.

Storage too.

@berkes

> [Mastodon] comes with all sorts of performance characteristics. The larger alternatives (elixir+pg, php+pg) are rather similar.

Curious. #Elixir shouldn't have issues with async workers, should it? Built on #Erlang, it has a native actor model and is designed from the ground up for resilient, distributed and async computing. It has its roots in the telecom industry (Ericsson).

Seems ideal for #ActivityPub comms 🤔

en.wikipedia.org/wiki/Erlang_(

@volkris @gimulnautti @jan @helgek @darnell @nickapos

@smallcircles Elixir is rather far away from Erlang. It's really mostly a standalone language that runs on the Erlang VM. Concepts like actors are not used in the web stack.

Furthermore, Mastodon follows ActiveRecord MVC, which tightly couples the app to an RDB. Performance characteristics follow from that. Phoenix/Elixir is almost a carbon copy in that respect.

So yes, Erlang and actors would make a nice fit. But Elixir and Phoenix do not.

@berkes

I think this is mixing things up. #Phoenix LiveView isn't where you would implement your #ActivityPub protocol stack. That would be done using the abstractions #Elixir offers over #Erlang/OTP and BEAM: GenServers and such, hence fully benefiting from the actor model and concurrency features.

Phoenix is where you provide your front-end UIs. That's a different part of your architecture (with many choices for how to do the separation in terms of architecture patterns).
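For readers who haven't met the actor model, here is a rough Python analogy of the idea. It is not Elixir or OTP code and has none of the supervision or distribution a GenServer gives you. The `InboxActor` class, its `None` stop sentinel and the `demo` function are hypothetical names made up for this sketch. Each actor owns its state and handles one message at a time from its mailbox.

```python
import asyncio


class InboxActor:
    """Toy actor: private state plus a mailbox, processed one message at a time."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.mailbox: asyncio.Queue = asyncio.Queue()
        self.processed: list[str] = []  # state only this actor touches

    async def run(self) -> None:
        while True:
            activity = await self.mailbox.get()
            if activity is None:  # sentinel (an assumption of this sketch): stop
                break
            # Handle the incoming activity; here we only record its type.
            self.processed.append(activity["type"])


async def demo() -> list[str]:
    actor = InboxActor("alice")
    task = asyncio.create_task(actor.run())
    # Senders never touch the actor's state; they only post messages.
    for activity_type in ("Follow", "Create", "Like"):
        await actor.mailbox.put({"type": activity_type})
    await actor.mailbox.put(None)
    await task
    return actor.processed
```

Running `asyncio.run(demo())` feeds three activities through the mailbox in order. The point of the pattern is that incoming activities become messages to small independent processes rather than rows pushed through one shared request/response cycle.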

@smallcircles it's not just where you "would" do it, it's where it currently is done. Pleroma is a rather standard Elixir/Phoenix setup. It's tightly coupled to the database: the HTTP layer (which mixes AP and user interface) talks directly to a complex, rigid RDB. The same database layer is used in Phoenix to hydrate HTML client side.

Despite the stack, its characteristics are very similar to Rails (or Laravel, or Django).

An architecture that forgoes the message/event nature of AP entirely.

@berkes

Ah, that makes sense. I am not too familiar with the particular Pleroma app approach, other than knowing it is rigid in its AP impl.

For other readers: the Phoenix web framework itself comes with a set of opinionated Mix tasks to bootstrap a standard CRUD-like web app. But they are optional and just for convenience if someone wants to generate boilerplate and spin something up fast.

@smallcircles I don't have the answer, but I do see there's an architectural misalignment in the current "popular" ActivityPub software.

Using traditional MVC to fulfill event-based requirements. Being hard to deploy and operate when there's a strong self-hosting need. Weighed down with features to manage large communities, when we need to avoid centralization.

I can only hope projects like GoToSocial gain traction. Their alignment is much better.

@berkes @smallcircles "Mastodon follows ActiveRecord MVC, which tightly couples the app to an RDB" is the problem I pull from this thread.

As a maintainer of a Mastodon fork, I'd be interested in merging a solution should there be one. Later you mention GoToSocial, so I take it you see this as too remote a possibility to consider.
