Evolutionary Algorithms for optimizing LLM weights
Gradient descent and backpropagation have a lot of problems; alignment becomes a nightmare.
Evolutionary algos fix this, but they don’t scale.
A recent paper, EGGROLL, makes it computationally feasible now: www.alphaxiv.org/abs/2511.16652
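For readers who have not seen the idea, here is a minimal sketch of a plain evolution strategy, assuming a toy three-parameter problem instead of LLM weights. It is not EGGROLL and does not reproduce anything from the paper; the names `evolve`, `toy_fitness`, and `target`, and all hyperparameter values, are made up for illustration. The point is only that the update needs fitness evaluations of perturbed parameters and no backpropagation.

```python
import random


def evolve(fitness, dim, generations=300, population=50, sigma=0.1, lr=0.05, seed=0):
    """Plain evolution strategy: move parameters along fitness-weighted noise."""
    rng = random.Random(seed)
    theta = [0.0] * dim  # current parameters, starting at the origin
    for _ in range(generations):
        # Sample one Gaussian perturbation per population member.
        noises = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(population)]
        # Score each perturbed parameter vector; no gradients are computed.
        scores = [
            fitness([t + sigma * n for t, n in zip(theta, noise)])
            for noise in noises
        ]
        # Normalize the scores so the step size does not depend on fitness scale.
        mean = sum(scores) / population
        std = (sum((s - mean) ** 2 for s in scores) / population) ** 0.5 or 1.0
        weights = [(s - mean) / std for s in scores]
        # Step each coordinate toward the perturbations that scored above average.
        for i in range(dim):
            step = sum(w * noise[i] for w, noise in zip(weights, noises)) / population
            theta[i] += lr * step
    return theta


# Hypothetical toy objective: get as close as possible to a fixed target vector.
target = [1.0, -2.0, 0.5]


def toy_fitness(weights):
    return -sum((w - t) ** 2 for w, t in zip(weights, target))


print(evolve(toy_fitness, dim=len(target)))
```

Each generation costs `population` fitness evaluations, which hints at why this naive version gets expensive once the parameter vector is as large as an LLM.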
Over at the Erdos problem website, AI assistance is now becoming routine. Here is what happened recently regarding Erdos problem #367 https://www.erdosproblems.com/367 :
1. On Nov 20, Wouter van Doorn produced a (human-generated) disproof of the second part of this problem, contingent on a congruence identity that he thought was true, and was "sure someone here is able to verify... does indeed hold".
2. A few hours later, I posed this problem to Gemini Deepthink, which (after about ten minutes) produced a complete proof of the identity (and confirmed the entire argument): https://gemini.google.com/share/81a65aecfd70 . The argument used some p-adic algebraic number theory which was overkill for this problem. I then spent about half an hour converting the proof by hand into a more elementary proof, which I presented on the site. I then remarked that the resulting proof should be within range of "vibe formalizing" in Lean.
3. Two days later, Boris Alexeev used the Aristotle tool from Harmonic to complete the Lean formalization, making sure to formalize the final statement by hand to guard against AI exploits. This process took two to three hours, and the output can be found at https://borisalexeev.com/t/Erdos367.lean
EDIT: after making this post, I decided to round things out by making AI literature searches on this problem, which (after about fifteen minutes) turned up some related literature on consecutive powerful numbers, but nothing directly relating to #367. https://chatgpt.com/share/6921427d-9dc0-800e-b798-be8fc94a9240 https://gemini.google.com/share/0d296454bea0
My notes on Gemini 3, including analyzing a 3.5 hour council meeting audio recording and performance on a new, improved version of my pelican on a bicycle benchmark https://simonwillison.net/2025/Nov/18/gemini-3/
i'm delighted to be hosting some academic course material at https://grebedoc.dev!
(yes, you can push 750 MB of slides and stuff as a single site to it. yes, i will gladly host it! no, it will not cost me any remotely meaningful amount of money, push at your leisure)
A little while ago, we had a terrific discussion with Jerry Neumann on Oxide and Friends, vowing to have him return with his co-author Elizabeth Zalman to discuss their book, "Founder vs. Investor."
Today, Jerry and Liz join @ahl , @sdtuck and me, along with Oxide investor Seth Winterroth, to get into some of the untold stories of founders and investors.
Join us today, at a special East Coast and Europe friendly time: noon Pacific, 3p Eastern:
Ashley doesn't really post here anymore but ICYMI she's now the Editor-in-Chief of the Journal of Undergraduate Neuroscience Education (JUNE)!
Which is just mind-blowing because I am pretty sure it was only yesterday we were both anxiously waiting to know whether we'd win the absolutely bonkers two body problem lottery and she'd get a faculty offer from UCSD, moving from solo research to focus on building the pipeline of neuroscience for all students.
https://bsky.app/profile/analog-ashley.bsky.social/post/3m5ttc73fwc2u
“When you liberate programming from the requirement to be professional and scalable, it becomes a different activity altogether, just as cooking at home is really nothing like cooking in a commercial kitchen”
Great post, very much describes me and programming.
i made some changes to https://eulerroom.com !!! it should work better on phones now, and should be a bit easier to read on everything
we've filled 76 out of 96 slots for the upcoming live stream marathon for Palestine. grab your slot before it's too late :)
please share this around with all live coders! particularly those in different time zones to me <3 xx
Oh no - archive.today is under attack. I have always wondered how they finance their service and who is behind it. Now it seems that the FBI is targeting it...
- people ask to put it on a blacklist: https://adguard-dns.io/en/blog/archive-today-adguard-dns-block-demand.html
- Wikipedia writes that the FBI subpoenaed their registrar: https://en.wikipedia.org/wiki/Archive.today
- the Wikipedia talk page marks it as a Russian company
- traceroute points to an Estonian server
- some countries already block DNS requests to it
Such a nice, illegal service.
Some notes on GPT-5.1, which is now available in the OpenAI API
The new reasoning options are interesting, but the pelican feels like a bit of a regression from GPT-5 https://simonwillison.net/2025/Nov/13/gpt-51/
I hope that I can ruin your day by getting you to read this map (courtesy of Wikipedia) of US town names that are portmanteaus of the two (or sometimes more) states whose borders they are near.
Can I nerd snipe y’all with a question?
I’ve seen a few oblique mentions of the toolkit of services that hyperscaler teams use to build high-powered distributed systems.
I’m aware of various “building blocks” that are used to build distributed systems in hyperscalers, e.g. a distributed lock service, a disaggregated write-ahead log, and perhaps a few others, but I don’t think anyone has written down a glossary of these. Am I missing one? Or is this just cult knowledge?
(Object storage and serverless compute feel like they would fit here, but they’re too high level for what I’m thinking of. I’m more interested in things that would probably never make it as a product offering because they’re a bit too bare bones but are still invaluable for building systems. Log-structured merge trees come to mind as another potential example here.)
Does anyone have anything they could share regarding said glossary of building blocks? I’d love to know more about how various teams view these building blocks and how they compose them effectively to build distributed systems at scale, as well as what they think those components are 🙂
Hello Fediverse,
I left my iPad on a plane to Singapore. Since I was only in transit, a handover or forwarding to Bangkok did not work out, so the iPad is still at Lost & Found at Changi Airport, Singapore. Is there perhaps someone here who 1. is from the Cologne area and is currently in Singapore or passing through Changi Airport, and could bring the iPad back to Germany from there? Please boost, thank you!
#singapur #changi_airport #ipad
The talks from posit::conf are all online now, which means you can watch my keynote on the Psychology of Technologists here. I am very happy this was recorded as it was one of the talks of the year I'm the most proud of.
If you've been following my work for a while you will see many familiar projects but I believe this includes a lot of the WHY behind what I do. ❤️
code / data wrangler in Switzerland.
Recovering reply guy. Posts random photos once in a while.