How Decentralized Is Bluesky Really? https://dustycloud.org/blog/how-decentralized-is-bluesky/
A technical deep-dive, since people have been asking me for my thoughts. I'll expand a bit on some of the key points here in a thread. 🧵
First of all, before I say anything else, my goal here is NOT to be mean to Bluesky's devs. I know there's a lot of fediverse-Bluesky rivalry, but I have enormous respect for Jay Graber and her team and I know they believe in their vision!
This started because I got some very kind encouragement by @bnewbold to write something. I'm trying to be technical in my analysis, not unkind. I hope that can be recognized, really and truly.
On the fediverse we also see a lot of accusations of Bluesky being owned by Jack Dorsey, and this isn't true. My understanding is that Jay performed an impressive amount of negotiation to allow Bluesky to receive funding independently.
These days Jack Dorsey is instead focusing on Nostr, which I can only describe as "a sequel to Secure Scuttlebutt with extremely bad vibes where bitcoin people talk about bitcoin"
I participated a bit in the process of when Bluesky was Jack Dorsey and Parag Agrawal's personal project. I also believe Jack and Parag were sincere about Bluesky as a decentralized social network protocol that Twitter would adopt, which is the directive that Bluesky was given as an organization.
When Jay Graber was awarded the position to lead Bluesky, I was not surprised. To me, Jay was the obvious choice to deliver what Bluesky was being directed, and I do think Jay is an excellent leader
There is also something which Bluesky gets right which the fediverse does not. I mentioned that Bluesky uses decentralization *techniques*, and the most important of those is content-addressing. This allows content to exist even when a server goes down.
This is a great decision and I have advocated that the fediverse do so as well. In fact several years ago I wrote a demo in @spritely's early days showing off how one could build a content-addressed ActivityPub in a spec-compatible way.
In fact ATProto's own tutorial even says "Think of our app like a Google": https://atproto.com/guides/applications
And indeed this is a good way to think about things. But it doesn't seem so bad, because we have Personal Data Stores like blogs, so probably things are fine, right?
But let's stay on this blog/search engine analogy for a while before we unpack what it means on a *technical* level, which is interesting. Let's analyze for the moment from a power dynamics level.
Building a web search engine is actually pretty easy these days, you can do so with off-the-shelf tools. And yet there are only a couple of search engines *really*, Google and Bing (DDG mostly uses Bing). And yet the information is right there. *Anyone* could run their own engine. Why don't they?
People are trying; most notably alice has done some great work recently: https://alice.bsky.sh/post/3laega7icmi2q
So now someone *can* run their own Relay (not the AppView yet, but maybe soon), and we're getting a sense of the cost and scale. This is good news; we didn't know before.
In fact we also have an idea of the rate of growth. Approximately 4 months prior, @bnewbold.net posted an article detailing how to run a Bluesky relay: https://whtwnd.com/bnewbold.net/entries/Notes%20on%20Running%20a%20Full-Network%20atproto%20Relay%20(July%202024)
This is great. We need more people trying to do so to get a sense of how decentralized things can be.
I tried estimating how much this would cost; as a lazy approximation I dumped a 5 terabyte machine into seeing what Linode would cost to self-host, and it was approximately 55k a year: https://bsky.app/profile/dustyweb.bsky.social/post/3lah5n3kld42q
That's a lazy estimate, but that's also what many people make in the US every year
However @bnewbold.net pointed out, correctly!, that there were cheaper options available. If we used even Linode's block storage, it would be cheaper (but still expensive) for the storage component, and this is true https://bsky.app/profile/dustyweb.bsky.social/post/3lah5n3kld42q
In fact @bnewbold and alice had gotten the server down to just close to $200/month in their estimate, much much cheaper than I had, by choosing a dedicated server plan. Much cheaper!
But there's a problem though; that's cheap because you've got a server that has a dedicated disk...
Even if we look at the dedicated hosting provider that @bnewbold provided in June and scale the cost to the pre-election storage requirements, we are adding on a massive amount of cost every month, over $400/month more.
But worse, we have reached the limits of what is possible to do with a dedicated server. We *have to* move to abstracted storage from this point forward because we're starting to hit the limits of what's offered for cheap dedicated storage on one machine. And this number will only grow, and as said previously, is growing at an enormous rate.
I have spent a lot of time focusing on the cost of storage, but storage is only one cost required. These estimates have been done so far against servers that *nobody is actually using*. The cost of servers that people are using will be much higher, because more needs to happen than just store things.
And that is not even to mention the challenges with administrating, dealing with takedown requests, illegal content, etc, which are probably much more serious.