(Obviously none of what I say here will be useful to Mastodon admins fighting fires this week.)
Maybe needing multiple machines with gigabytes of RAM to support only 75,000 users suggests that a more efficient reimplementation of #ActivityPub would be helpful. Maybe not using Ruby and using enough backpressure to handle overload conditions gracefully would help.
I don't know: how effectively can ActivityPub implementations apply backpressure? Does the protocol itself make it difficult? Can failed deliveries get reattempted after a short time if they were deferred due to overload?
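To make the question concrete, here is a minimal sketch of what receiver-side backpressure could look like. It is not taken from Mastodon or from the ActivityPub spec. It assumes deliveries arrive as HTTP POSTs and that a sender would treat a 503 with a Retry-After header as "try again later", which is ordinary HTTP behaviour rather than anything ActivityPub-specific. `BoundedInbox`, `accept`, and the limits are made-up names and numbers.

```python
import queue


class BoundedInbox:
    """Hypothetical inbox that refuses work instead of queueing without limit."""

    def __init__(self, max_pending=1000, retry_after_seconds=30):
        # A bounded queue is the whole trick: once it is full, we say no.
        self.pending = queue.Queue(maxsize=max_pending)
        self.retry_after_seconds = retry_after_seconds

    def accept(self, activity):
        """Return an (HTTP status, headers) pair for one incoming delivery."""
        try:
            self.pending.put_nowait(activity)
        except queue.Full:
            # Overloaded: push the problem back to the sender (assumed to retry).
            return 503, {"Retry-After": str(self.retry_after_seconds)}
        # Queued for later processing by some worker (not shown here).
        return 202, {}


inbox = BoundedInbox(max_pending=2)
responses = [inbox.accept({"id": n}) for n in range(3)]
print(responses)
```

With room for only two pending items, the third delivery is the one that gets deferred. Whether real senders honour that is exactly the open question.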
How much total bandwidth are we talking about for this number of users? Back of the envelope? This is super stupid because I haven't even read the protocol spec, so please let me know if I'm making totally wrong assumptions here.
First, the inter-server traffic. I'm thinking media attachments aren't included in the activity stream itself, but thumbnails are; those are typically 100K; everything that isn't a media attachment is of insignificant size; users typically subscribe to 100 other users on other instances, who each post 100 posts a day; one post in five has a media attachment; and on average each remote user has two subscribers on your instance, so the 10000 incoming posts a day per user gets reduced to 5000, of which 1000 carry a 100K thumbnail, which is 100 megabytes per user per day. Is that about right?
100 megabytes per day is about 9300 bits per second (call it 9600 baud), so a gigabit pipe should be adequate for the inter-server communications for around 100k people.
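The arithmetic can be checked in a few lines of Python. Every number here is one of the guesses from this thread, not a measurement; the function and parameter names are made up for illustration, and "100K" is taken to mean 100,000 bytes.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def inbound_bytes_per_user_per_day(
    follows=100,              # remote accounts each local user follows (guess)
    posts_per_follow=100,     # posts per day from each of those accounts (guess)
    local_followers=2,        # local users sharing each remote account (guess)
    media_one_in=5,           # one post in this many carries a thumbnail (guess)
    thumbnail_bytes=100_000,  # "100K" per thumbnail (guess)
):
    # Posts the instance must fetch on behalf of one user, after sharing.
    posts = follows * posts_per_follow / local_followers
    # Only the thumbnails count; everything else is treated as negligible.
    return posts / media_one_in * thumbnail_bytes


def bits_per_second(bytes_per_day):
    return bytes_per_day * 8 / SECONDS_PER_DAY


per_user_bytes = inbound_bytes_per_user_per_day()
per_user_bps = bits_per_second(per_user_bytes)
users_per_gigabit = 1e9 / per_user_bps

print(f"{per_user_bytes / 1e6:.0f} MB per user per day")
print(f"{per_user_bps:.0f} bit/s per user")
print(f"{users_per_gigabit:.0f} users per gigabit link")
```

Changing any one default (say, thumbnails of 20K instead of 100K) scales the result linearly, so it is easy to see which guess matters most.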
But then you actually have to serve those posts to them, which means at least twice as much bandwidth, and maybe more if they reload the page and you can't force their browser to cache those stinky thumbnails forever.
In terms of messages per second, 5000 incoming posts per user per day is about 6000 posts per second for 100k users. That's about an order of magnitude below what RabbitMQ can do (on one machine!) and two orders of magnitude below ZeroMQ. So bandwidth, rather than CPU, is probably the real limiting factor (though not, of course, with Ruby).
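The message rate follows from the same guessed inputs. This is a quick sketch; `posts_per_second` is a made-up helper, and it assumes traffic is spread evenly over the day, which real traffic is not.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def posts_per_second(users, posts_per_user_per_day):
    """Average incoming post rate, assuming a perfectly flat daily load."""
    return users * posts_per_user_per_day / SECONDS_PER_DAY


rate = posts_per_second(users=100_000, posts_per_user_per_day=5000)
print(f"{rate:.0f} posts per second")
```

Peak load will sit well above this average, so any headroom comparison against a message broker should leave room for bursts.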