On multiple occasions I've listened to instance admins speak about high S3 costs. The sheer amount of data absolutely balloons the more activity your server sees, I get it.
What I don't get is whether there's some unknown fedi ethical reason everybody insists on setting up an S3 cache (followed immediately by complaining about it).
Y'all want to know what the rest of the web does? It hosts its own uploaded media, and links out to the rest...
Am I wrong for thinking that this established expectation (especially for smaller bootstrapped instances) is perfectly cromulent from an ops perspective? Honestly asking because I come from a time before DevOps and Microservices were a thing, and we all hosted our crud on servers we had physical access to (though VPSes are great!)
Yes, I totally get the benefits of having a CDN, especially with global access, but nobody's setting up a globally distributed CDN for their dinky Mastodon instance.
Yes, sure, it reduces the load on the origin server if access to the media is spread across other federated servers' CDNs, but one neat trick for reducing your transit costs is to... not indiscriminately host every piece of media your instance stumbles onto.
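To make "host your own, link out to the rest" concrete, here's a rough sketch of the decision I have in mind. It is not how any particular fediverse server does it; `LOCAL_DOMAIN`, `LOCAL_MEDIA_BASE`, `Attachment`, and both function names are made up for illustration.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# Hypothetical values, for illustration only.
LOCAL_DOMAIN = "example.social"
LOCAL_MEDIA_BASE = "https://example.social/media"


@dataclass
class Attachment:
    origin_url: str  # URL where the media was originally published
    filename: str    # file name in our own storage (only meaningful for local uploads)


def is_local(att: Attachment) -> bool:
    """True if the media was uploaded to this instance."""
    return urlparse(att.origin_url).hostname == LOCAL_DOMAIN


def display_url(att: Attachment) -> str:
    """Serve local uploads from our own storage; link out for everything else."""
    if is_local(att):
        return f"{LOCAL_MEDIA_BASE}/{att.filename}"
    # Remote media: no copy, no cache, just point at the origin server.
    return att.origin_url


mine = Attachment("https://example.social/media/cat.png", "cat.png")
theirs = Attachment("https://other.example/files/dog.png", "dog.png")
print(display_url(mine))
print(display_url(theirs))
```

The tradeoff is the obvious one: if the remote server is down or slow, so is that image, but your storage bill only grows with what your own users upload.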
If anything, the rationale seems rather contrived.
I just don't want to make a misstep here and come off as a selfish fediverse implementor.
@olives I don't know about you but I'd rather not be financially ruined because a scraper decided to blow up my instance overnight.
I don't know when exactly it became normalized to have an always-up site, and to absorb the associated costs of it.