R. A. Dehi boosted

Journalist @adamdavidson on moving to Mastodon: "I think we got lazy as a field and we let Mark Zuckerberg, Jack Dorsey, and, god help us, Elon Musk and their staff decide all these major journalistic questions."
themarkup.org/newsletter/hello

R. A. Dehi boosted

I find it frustrating how user research showing that normies prefer algo timelines gets floated without acknowledging that, so far, those timelines have always led to outrage-driven hellholes like Twitter or Facebook.

I don’t want to take it away from anyone just b/c I don’t like it, but if you don’t have a plan for this, I’m ok with the 99% scrolling thru TikTok & arguing with their racist uncles on Facebook about made up news—while this stays a tiny nerd hole.

My adrenal system is tired of outrage baiting.

@samplereality Well, it might be a more conservative estimate to say that Ruby wastes 99.7% of your CPU.

R. A. Dehi boosted

I've now installed a "RT @" filter so Twitter retweets don't show up in my home feed (shakes fist at clouds)

@scottjenson I've signed up for GitLab using an email from a domain that certainly never "signed up" itself. I assume that still works.

R. A. Dehi boosted

It's great how well crafted the Mastodon RSS feeds are. Just wrote a blog post about that.

scripting.com/2022/11/21.html#

R. A. Dehi boosted

Social media rambling 

I actually have an old B&N Nook e-reader that I was able to get autographed by the inimitable @pluralistic - he wrote on the back of it “if you can’t open it, you don’t own it”

Feeling like there’s a corollary there for the modern day: “if it can be bought and destroyed by a billionaire, you don’t own it”

No billionaire can just “buy Mastodon”. They might buy an instance (I got one you can have for $8!), but the protocol is free forever.


@samplereality Twitter was written in Ruby at first too, but Ruby wastes 99.9% of your CPU and 90% of your RAM, so they rewrote it in Scala years ago to make the costs much less enormous.

R. A. Dehi boosted

@scottjenson @jeffjarvis You could simply do a very easy logistic regression of your own decision of "favorite" or "bookmark" or "reply" against the vectors of boost and favorite of all your follows to get a statistical prediction of how likely you are to favorite (or reply to, etc.) a post. Let's call that prediction "quality", acknowledging that is a firmly subjective (intersubjective?) kind of quality.

Then you need some way to combine that "quality" metric with recency. I'd favor a quality threshold that varies over time to maintain a roughly constant rate of posts selected from the thousands of candidates per day: 5 per day, say, or 50 per day. Maybe only rate a post 48 hours after posting so all the data is in.
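
As a rough illustration of what that could look like (purely a sketch: the feature layout, data, and daily budget are all hypothetical, not anything a real Mastodon client exposes):

```python
# Sketch only: predicts how likely *you* are to favorite/reply to a post,
# given which of your follows boosted or favorited it. Feature layout and
# data are hypothetical; nothing here touches a real Mastodon instance.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_posts, n_follows = 5000, 100

# X[i, 2*j]   = 1 if follow j boosted post i
# X[i, 2*j+1] = 1 if follow j favorited post i
X = rng.integers(0, 2, size=(n_posts, 2 * n_follows))
# y[i] = 1 if you favorited/bookmarked/replied to post i (your own history)
y = rng.integers(0, 2, size=n_posts)

model = LogisticRegression(max_iter=1000).fit(X, y)

# "Quality" = predicted probability that you'd engage with each candidate.
quality = model.predict_proba(X)[:, 1]

# Time-varying threshold: pick whatever cutoff keeps ~50 posts/day out of
# thousands of candidates, i.e. the 99th percentile of today's scores.
daily_budget = 50
threshold = np.quantile(quality, 1 - daily_budget / n_posts)
selected = np.flatnonzero(quality >= threshold)
print(f"threshold={threshold:.3f}, selected {len(selected)} posts")
```

The 48-hour delay would then just mean scoring a candidate only once its boost/favorite data has settled.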

R. A. Dehi boosted

The consistent increased risk of diabetes after Covid across all age groups, highest in the first 3 months after infection, from a systematic review of 9 studies, ~40 million people
bmcmedicine.biomedcentral.com/

@rmerriam @shriramk Yeah, Kanren is kind of like a super-Prolog (or a super-Snobol, maybe). I think Forth is maybe actually a little less weird in that sense, with most of its strangeness being syntactic.

@rmerriam @shriramk I think it's true that if you know Algol you can figure out Fortran, but Snobol and Lisp are pretty different (at least the sublanguage of Lisp without SETQ), and I think Coq and Kanren are even further out.

It helps in understanding a program in a language you don't know well if the language is designed for easy readability by people who aren't familiar with it, like Python and COBOL. It also helps if the author is aiming for that. Lots of Haskell code isn't.

@enkiv2 Hmm, does archive.org successfully archive YouTube now?

R. A. Dehi boosted

(Obviously none of what I say here will be useful to Mastodon admins fighting fires this week.)

Maybe needing multiple machines with gigabytes of RAM to support only 75,000 users suggests that a more efficient reimplementation of Mastodon would be helpful. Maybe not using Ruby and using enough backpressure to handle overload conditions gracefully would help.

I don't know, how effectively can ActivityPub implementations apply backpressure? Does the protocol itself make it difficult? Can failed deliveries get reattempted after a short time if deferred due to overload?
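
I don't know what real ActivityPub implementations actually do here, but the generic pattern I have in mind looks something like the sketch below: the overloaded receiver answers 503 with a Retry-After hint instead of accepting work it can't handle, and the sender treats that as "deferred", not "failed", re-queueing the delivery with exponential backoff. Every class, number, and status-code choice is illustrative, not Mastodon's actual delivery code.

```python
# Toy backpressure sketch, not real ActivityPub or Mastodon delivery code.
# Receiver: a bounded inbox that answers 503 + Retry-After when it is full.
# Sender: treats that as "deferred, try again later", not as a hard failure.
import heapq

class Inbox:
    """Receiving server with a small, bounded work queue."""
    def __init__(self, capacity=3):
        self.capacity = capacity
        self.queue = []
        self.processed = 0

    def deliver(self, activity):
        if len(self.queue) >= self.capacity:
            return 503, 2.0              # overloaded: retry in ~2 time units
        self.queue.append(activity)
        return 202, None                 # accepted for async processing

    def work(self, n=1):
        """Background worker draining the queue."""
        done, self.queue = self.queue[:n], self.queue[n:]
        self.processed += len(done)

def send_all(inbox, activities, max_attempts=5):
    """Deliver activities, re-queueing deferred ones with exponential backoff."""
    pending = [(0.0, i, a, 1) for i, a in enumerate(activities)]
    heapq.heapify(pending)
    clock = 0.0
    while pending:
        due, i, activity, attempt = heapq.heappop(pending)
        clock = max(clock, due)          # "wait" until the retry is due
        status, retry_after = inbox.deliver(activity)
        if status == 202:
            continue
        if attempt >= max_attempts:
            print(f"giving up on {activity}")
            continue
        delay = (retry_after or 1.0) * 2 ** (attempt - 1)
        heapq.heappush(pending, (clock + delay, i, activity, attempt + 1))
        inbox.work()                     # receiver drains ~one item per rejection

inbox = Inbox()
send_all(inbox, [f"post-{n}" for n in range(10)])
inbox.work(len(inbox.queue))             # let the worker finish up
print(f"processed {inbox.processed} of 10 activities")
```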

How much total bandwidth are we talking about for this number of users? Back of the envelope? This is super stupid because I haven't even read the protocol spec, so please let me know if I'm making totally wrong assumptions here.

First, the inter-server traffic. I'm thinking media attachments aren't included in the activity stream itself, but thumbnails are; those are typically 100K; everything that isn't a media attachment is of insignificant size; users typically subscribe to 100 other users on other instances, who each post 100 posts a day; one post in five has a media attachment; and on average each remote user has two subscribers on your instance, so the 10000 incoming posts a day per user gets reduced to 5000, which is 100 megabytes per user per day. Is that about right?
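
Spelling that per-user estimate out, using only the assumed numbers above (nothing measured):

```python
# Back-of-the-envelope check of the per-user inter-server traffic estimate.
# All inputs are the assumptions from the post above, not measurements.
follows_per_user = 100        # remote accounts each local user follows
posts_per_day = 100           # posts per remote account per day
local_subscribers = 2         # local followers each remote account has
attachment_fraction = 1 / 5   # posts carrying a media attachment
thumbnail_bytes = 100_000     # ~100 KB per thumbnail

incoming = follows_per_user * posts_per_day          # 10,000 posts/user/day
deduped = incoming / local_subscribers               # 5,000 fetched once each
bytes_per_day = deduped * attachment_fraction * thumbnail_bytes
print(f"{bytes_per_day / 1e6:.0f} MB per user per day")   # ~100 MB
```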

100 megabytes per day is about 9600 baud, so a gigabit pipe should be adequate for the inter-server communications for around 100k people.

But then you actually have to serve those posts to them, which means at least twice as much bandwidth, and maybe more if they reload the page and you can't force their browser to cache those stinky thumbnails forever.

In terms of messages per second, 5000 incoming posts per user per day is about 6000 posts per second for 100k users. That's about an order of magnitude below what RabbitMQ can do (on one machine!) and two orders of magnitude below ZeroMQ. So the bandwidth thing rather than CPU is really probably the crucial limiting factor (though not, of course, with Ruby).
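
And the aggregate numbers, continuing from the same assumed inputs (still just a sketch):

```python
# Scaling the per-user estimate up to a 100k-user instance.
users = 100_000
seconds_per_day = 86_400

bytes_per_user_day = 100e6                 # from the estimate above
bits_per_second_per_user = bytes_per_user_day * 8 / seconds_per_day
total_gbps = bits_per_second_per_user * users / 1e9
print(f"{bits_per_second_per_user:.0f} bit/s per user")        # ~9,300, i.e. ~9600 baud
print(f"{total_gbps:.2f} Gbit/s inter-server for 100k users")  # ~0.93

posts_per_user_day = 5_000
messages_per_second = posts_per_user_day * users / seconds_per_day
print(f"{messages_per_second:.0f} incoming posts/s")           # ~5,800, i.e. ~6,000
```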

@shriramk I've found learning new languages opened my mind a lot; I wonder what a CS curriculum would look like that embraced Haskell, Kanren, Elixir, Levien's Io, Coq, Pure, Forth, Verilog, APL, ToonTalk, and the pi-calculus? Something like solving difficult problems in each of them?

It could maybe get bogged down in puzzle-solving without students achieving transferability to other domains.

R. A. Dehi boosted

9/ Anyway, code is increasingly generated in so many ways that we're moving from "writing lines of code is hard" to "the hard problem is determining whether a chunk of code is fit for purpose" (this motivated Joe Politz's dissertation a decade ago!).

10/ The next generation of computing problems will not be about writing 80s-style 5-line for-loops. It'll be about properties, specification, reasoning, verification, prompt eng, synthesis, etc. How will we get there?


@micahflee In 2002 I wrote my papers in LaTeX or HTML; they didn't crash.
