@freemo or anyone: I'm trying to find out how connected the graph of fediverse nodes is. For instance, if it's fully connected (every node connected to every other), that might imply a scalability limit to the number of nodes.
I guess another way of framing the question is whether two nodes A and B are necessarily directly connected because a user in node A follows a user in node B.
@zpartacoos @freemo I haven't got a specific limit in mind, except perhaps for the number of users to be in the billions (for mass migration from Facebook, Twitter, etc.), but my thinking so far is as follows.
Suppose the graph of nodes needs to be fully connected (note: this is still an open question for me). Let's also suppose each node has one IP address (it may be possible to use a cluster of IP addresses to represent a single node, but let's keep things simple for now).
Then a given node can have at most 64k (probably less in practice) TCP/IP connections open at any one time (because each requires a 32 bit ephemeral port number). So if there are, say, 100k nodes or more (e.g. to host 1-2 billion users), then connections will need to be re-established periodically in order to keep "subscribers" up to date with "publishers" (pardon my terminology - I must read the ActivityPub spec!).
This will result in a trade-off between CPU consumption and the timeliness of delivery of messages. E.g. if we want to display any new post within one minute of it being published, we'd potentially need to cycle round all possible connections once per minute.
(If, on the other hand, the graph of nodes isn't fully connected, then some intermediate nodes would need to do more work to forward messages.)