**glyn** @underlap@qoto.org · Jan 02, 2022, 15:15

@freemo or anyone: I'm trying to find out how connected the graph of fediverse nodes is. For instance, if it's fully connected (every node connected to every other), that might imply a scalability limit to the number of nodes.

I guess another way of framing the question is whether two nodes A and B are necessarily directly connected because a user in node A follows a user in node B.

#fediverse #graph #connectedness #topology #scalability

**Spartaco Pécora {🇻🇪 ⋂ 🇮🇹}** @zpartacoos@qoto.org · 2022-01-02T15:29:28Z

@underlap @freemo I don't know which specific scaling limits you have in mind. But suppose we have two users, say X on server A and Y on server B. If one follows the other, we have a single link, and in this hypothetical fediverse of cardinality 2 that's not a problem, since every post is published over just one channel. If instead we have even just 10,000 fully connected nodes on distinct servers, that'd be 10,000 × 9,999 / 2 = 49,995,000 connections/edges to publish over. Is this the problem you're concerned about?
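For anyone who wants to check the arithmetic, here is a minimal sketch of the edge count for a fully connected graph. The function name `full_mesh_edges` is made up for illustration, and it assumes one undirected link per pair of servers.

```python
def full_mesh_edges(n: int) -> int:
    """Undirected edges in a fully connected graph of n nodes: n(n-1)/2."""
    return n * (n - 1) // 2


# Two servers share exactly one link.
print(full_mesh_edges(2))       # 1

# 10,000 fully connected servers.
print(full_mesh_edges(10_000))  # 49995000
```

Note that this is the total across the whole network; any single server in such a mesh would talk to n - 1 peers (9,999 here).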

**glyn** @underlap@qoto.org · Jan 02, 2022, 16:41

@zpartacoos @freemo I haven't got a specific limit in mind, except perhaps for the number of users to be in the billions (for mass migration from Facebook, Twitter, etc.), but my thinking so far is as follows.

Suppose the graph of nodes needs to be fully connected (note: this is still an open question for me). Let's also suppose each node has one IP address (it may be possible to use a cluster of IP addresses to represent a single node, but let's keep things simple for now).

Then a given node can have at most 64k (probably fewer in practice) TCP/IP connections open at any one time (because each requires an ephemeral port number, and port numbers are 16 bits). So if there are, say, 100k nodes or more (e.g. to host 1-2 billion users), then connections will need to be re-established periodically in order to keep "subscribers" up to date with "publishers" (pardon my terminology - I must read the ActivityPub spec!).

This will result in a trade-off between CPU consumption and the timeliness of delivery of messages. E.g. if we want to display any new post within one minute of it being published, we'd potentially need to cycle round all possible connections once per minute.
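A rough back-of-envelope sketch of that trade-off, under the same simplifying assumptions as above (fully connected graph, one IP address per node, a hard cap on simultaneously open connections). The function name `polling_cost` and its parameters are invented for illustration, and the model ignores connection reuse, batching, and anything ActivityPub-specific.

```python
import math


def polling_cost(num_nodes: int, max_open: int, window_seconds: float):
    """Estimate what it takes for one node to contact every peer per window.

    Returns (rounds, setups_per_second):
      rounds            - batches of connections needed if at most
                          max_open can be open at the same time
      setups_per_second - average connection setups per second needed
                          to reach every peer once per window
    """
    peers = num_nodes - 1
    rounds = math.ceil(peers / max_open)
    setups_per_second = peers / window_seconds
    return rounds, setups_per_second


# 100k nodes, a cap of 65,535 open connections, one-minute freshness target.
rounds, rate = polling_cost(100_000, 65_535, 60)
print(rounds)       # 2
print(round(rate))  # 1667
```

So under these assumptions a node would have to open on the order of 1,700 connections per second, in at least two batches, just to visit every peer once a minute. Relaxing the freshness target lowers the rate proportionally.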

(If, on the other hand, the graph of nodes isn't fully connected, then some intermediate nodes would need to do more work to forward messages.)
