On Monday, @ahl and I will be joined by members of the Oxide team to talk about a doozy: an 18-year-old ZFS data corruption bug that we recently nailed. We'll be at a special Europe-friendly time: 9a Pacific/noon Eastern/5p GMT -- join us for the wild tale!
Wrote about using #googlejules to migrate an #RStats test suite to #testthat
community note: using cost on the y axis makes it appear like cheaper models are more capable on pass@3
Someone sent vee the most amazing study on a pediatric hospital contacting professional racing pit teams and asking them to advise on drafting a handoff procedure for ICU patients of the highest concern between wards.
And Ferrari and Williams went "You all are babies, let us show you how it's done" and cut the error rate in handoffs by like 20% and generally found that you need less training, not more, to do it correctly despite having a faster, more detailed protocol.
This is my shit, so much.
Edit: The link to the article is in a reply to prevent masto-hugging the host but people seem to not be seeing it: https://onlinelibrary.wiley.com/doi/pdf/10.1111/j.1460-9592.2006.02239.x
Evolutionary Algorithms for optimizing LLM weights
Gradient descent and backpropagation have a lot of problems, alignment becomes a nightmare. Evolutionary algos fix this, but they don’t scale.
A recent paper, EGGROLL, makes it computationally feasible to do now
www.alphaxiv.org/abs/2511.16652
Over at the Erdos problem website, AI assistance is now becoming routine. Here is what happened recently regarding Erdos problem #367 https://www.erdosproblems.com/367 :
1. On Nov 20, Wouter van Doorn produced a (human-generated) disproof of the second part of this problem, contingent on a congruence identity that he thought was true, and was "sure someone here is able to verify... does indeed hold".
2. A few hours later, I posed this problem to Gemini Deepthink, which (after about ten minutes) produced a complete proof of the identity (and confirmed the entire argument): https://gemini.google.com/share/81a65aecfd70 . The argument used some p-adic algebraic number theory which was overkill for this problem. I then spent about half an hour converting the proof by hand into a more elementary proof, which I presented on the site. I then remarked that the resulting proof should be within range of "vibe formalizing" in Lean.
3. Two days later, Boris Alexeev used the Aristotle tool from Harmonic to complete the Lean formalization, making sure to formalize the final statement by hand to guard against AI exploits. This process took two to three hours, and the output can be found at https://borisalexeev.com/t/Erdos367.lean
EDIT: after making this post, I decided to round things out by making AI literature searches on this problem, which (after about fifteen minutes) turned up some related literature on consecutive powerful numbers, but nothing directly relating to #367. https://chatgpt.com/share/6921427d-9dc0-800e-b798-be8fc94a9240 https://gemini.google.com/share/0d296454bea0
My notes on Gemini 3, including analyzing a 3.5 hour council meeting audio recording and performance on a new, improved version of my pelican on a bicycle benchmark https://simonwillison.net/2025/Nov/18/gemini-3/
i'm delighted to be hosting some academic course material at https://grebedoc.dev!
(yes, you can push 750 MB of slides and stuff as a single site to it. yes, i will gladly host it! no, it will not cost me any remotely meaningful amount of money, push at your leisure)
A little while ago, we had a terrific discussion with Jerry Neumann on Oxide and Friends, vowing to have him return with his co-author Elizabeth Zalman to discuss their book, "Founder vs. Investor."
Today, Jerry and Liz join @ahl , @sdtuck and me, along with Oxide investor Seth Winterroth, to get into some of the untold stories of founders and investors.
Join us today, at a special East Coast and Europe friendly time: noon Pacific, 3p Eastern:
Ashley doesn't really post here anymore but ICYMI she's now the Editor-in-Chief of the Journal of Undergraduate Neuroscience Education (JUNE)!
Which is just mind-blowing because I am pretty sure it was only yesterday we were both anxiously waiting to know whether we'd win the absolutely bonkers two body problem lottery and she'd get a faculty offer from UCSD, moving from solo research to focus on building the pipeline of neuroscience for all students.
https://bsky.app/profile/analog-ashley.bsky.social/post/3m5ttc73fwc2u
“When you liberate programming from the requirement to be professional and scalable, it becomes a different activity altogether, just as cooking at home is really nothing like cooking in a commercial kitchen”
Great post, very much describes me and programming.
i made some changes to https://eulerroom.com !!! it should work better on phones now, and should be a bit easier to read on everything
we've filled 76 out of 96 slots for the upcoming live stream marathon for Palestine. grab your slot before it's too late :)
pleaze share this around with all live coders! particularly those in different time zones to me <3 xx
Oh no - archive.today is under attack. I was always wondering how they finance their service, and who is behind it. Now it seems that the FBI is targeting it...
- people ask to put it on a blacklist: https://adguard-dns.io/en/blog/archive-today-adguard-dns-block-demand.html
- wikipedia writes the FBI subpoenaed their registrar: https://en.wikipedia.org/wiki/Archive.today
- the talk page of wikipedia marks it as a Russian company
- traceroute points to an Estonian server
- some countries already block DNS requests to it
Such a nice, illegal service.
Some notes on GPT-5.1, which is now available in the OpenAI API
The new reasoning options are interesting, but the pelican feels like a bit of a regression from GPT-5 https://simonwillison.net/2025/Nov/13/gpt-51/
I hope that I can ruin your day by getting you to read this map (courtesy wikipedia) of US town names that are portmanteaus of the two (or sometimes more) states they are near the borders of
code / data wrangler in Switzerland.
Recovering reply guy. Posts random photos once in a while.