Simon boosted

On Monday, @ahl and I will be joined by members of the Oxide team to talk about a doozy: an 18-year-old ZFS data corruption bug that we recently nailed. We'll be at a special Europe-friendly time: 9a Pacific/noon Eastern/5p GMT -- join us for the wild tale!

discord.gg/QrcKGTTPrF?event=14

Simon boosted

community note: using cost on the y axis makes it appear like cheaper models are more capable on pass@3

Simon boosted

Someone sent vee the most amazing study on a pediatric hospital contacting professional racing pit teams and asking them to advise on drafting a handoff procedure for ICU patients of the highest concern between wards.

And Ferrari and Williams went "You all are babies, let us show you how it's done" and cut the error rate in handoffs by like 20% and generally found that you need less training, not more, to do it correctly despite having a faster, more detailed protocol.

This is my shit, so much.

Edit: The link to the article is in a reply to prevent masto-hugging the host but people seem to not be seeing it: https://onlinelibrary.wiley.com/doi/pdf/10.1111/j.1460-9592.2006.02239.x

Simon boosted

Evolutionary Algorithms for optimizing LLM weights

Gradient descent and backpropagation have a lot of problems, alignment becomes a nightmare. Evolutionary algos fix this, but they don’t scale.

A recent paper, EGGROLL, makes it computationally feasible to do now: www.alphaxiv.org/abs/2511.16652

Simon boosted

There's amazing new music coming out from European women artists at the moment — this week on the podcast we've got recommendations for Robyn, Oklou, Zaho de Sagazan and Lily Allen. Who should we be adding to our list?

#music #pop #europe #culture #podcast

Simon boosted

Over at the Erdos problem website, AI assistance is now becoming routine. Here is what happened recently regarding Erdos problem #367 erdosproblems.com/367 :

1. On Nov 20, Wouter van Doorn produced a (human-generated) disproof of the second part of this problem, contingent on a congruence identity that he thought was true, and was "sure someone here is able to verify... does indeed hold".

2. A few hours later, I posed this problem to Gemini Deepthink, which (after about ten minutes) produced a complete proof of the identity (and confirmed the entire argument): gemini.google.com/share/81a65a . The argument used some p-adic algebraic number theory which was overkill for this problem. I then spent about half an hour converting the proof by hand into a more elementary proof, which I presented on the site. I then remarked that the resulting proof should be within range of "vibe formalizing" in Lean.

3. Two days later, Boris Alexeev used the Aristotle tool from Harmonic to complete the Lean formalization, making sure to formalize the final statement by hand to guard against AI exploits. This process took two to three hours, and the output can be found at borisalexeev.com/t/Erdos367.le

EDIT: after making this post, I decided to round things out by making AI literature searches on this problem, which (after about fifteen minutes) turned up some related literature on consecutive powerful numbers, but nothing directly relating to #367. chatgpt.com/share/6921427d-9dc gemini.google.com/share/0d2964

Simon boosted

My notes on Gemini 3, including analyzing a 3.5 hour council meeting audio recording and performance on a new, improved version of my pelican on a bicycle benchmark simonwillison.net/2025/Nov/18/

Simon boosted

@fj is it that the party behind it always complains "no foreign judges", but that "foreign king and mob boss" seems to be ok?

Is it that they agreed to move jobs to the USA, while complaining that foreigners take our jobs?

Simon boosted

i'm delighted to be hosting some academic course material at grebedoc.dev!

(yes, you can push 750 MB of slides and stuff as a single site to it. yes, i will gladly host it! no, it will not cost me any remotely meaningful amount of money, push at your leisure)

Simon boosted

A little while ago, we had a terrific discussion with Jerry Neumann on Oxide and Friends, vowing to have him return with his co-author Elizabeth Zalman to discuss their book, "Founder vs. Investor."

Today, Jerry and Liz join @ahl , @sdtuck and me, along with Oxide investor Seth Winterroth, to get into some of the untold stories of founders and investors.

Join us today, at a special East Coast and Europe friendly time: noon Pacific, 3p Eastern:

discord.gg/QrcKGTTPrF?event=14

Simon boosted

Ashley doesn't really post here anymore but ICYMI she's now the Editor-in-Chief of the Journal of Undergraduate Neuroscience Education (JUNE)!

Which is just mind-blowing because I am pretty sure it was only yesterday we were both anxiously waiting to know whether we'd win the absolutely bonkers two body problem lottery and she'd get a faculty offer from UCSD, moving from solo research to focus on building the pipeline of neuroscience for all students.

bsky.app/profile/analog-ashley

Simon boosted

“When you liberate programming from the requirement to be professional and scalable, it becomes a different activity altogether, just as cooking at home is really nothing like cooking in a commercial kitchen”

Great post, very much describes me and programming.

robinsloan.com/notes/home-cook

#programming #code

Simon boosted

the ironic part about immigrants is they’re not lazy, the lazy ones didn’t have enough agency to move to a different country

immigration is as close to a filter for high performing individuals as you’re going to get

Simon boosted

i made some changes to eulerroom.com !!! it should work better on phones now, and should be a bit easier to read on everything

we've filled 76 out of 96 slots for the upcoming live stream marathon for Palestine. grab your slot before it's too late :)

pleaze share this around with all live coders! particularly those in different time zones to me <3 xx

Simon boosted

Oh no - archive.today is under attack. I was always wondering how they finance their service, and who is behind it. Now it seems that the FBI is targeting it...

- people ask to put it on a blacklist: adguard-dns.io/en/blog/archive
- wikipedia writes the FBI subpoenaed their registrar: en.wikipedia.org/wiki/Archive.
- the talk page of wikipedia marks it as a Russian company
- traceroute points to an Estonian server
- some countries already block DNS requests to it

Such a nice, illegal service.

Simon boosted

Some notes on GPT-5.1, which is now available in the OpenAI API

The new reasoning options are interesting, but the pelican feels like a bit of a regression from GPT-5 simonwillison.net/2025/Nov/13/

Simon boosted

I find AI does accelerate solving complex problems, so you can get back to your to-do list. Unfortunately, I love being immersed in long complex problems, and hate managing my top-level to-do list. So I am once again begging tech companies to make us an AI Project Manager.

Simon boosted

I hope that I can ruin your day by getting you to read this map (courtesy wikipedia) of US town names that are portmanteaus of the two (or sometimes more) states they are near the borders of
