@timkellogg.me is this Nano Banana Pro? Either it's a bit confused with Greek vs Latin alphabets, or it's making math / typographical puns
Evolutionary Algorithms for optimizing LLM weights. Gradient descent and backpropagation have a lot of problems; alignment becomes a nightmare. Evolutionary algos fix this, but they don’t scale. A recent paper, EGGROLL, makes it computationally feasible to do now: www.alphaxiv.org/abs/2511.16652
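For anyone who hasn't seen the idea, here is a rough sketch of a plain evolution strategy on a toy problem. It is not EGGROLL's method and says nothing about how that paper scales things up; it only shows the basic loop of perturbing weights, scoring them, and updating from loss values alone, with no backpropagation. The names `evolve` and `toy_loss` and all hyperparameters are made up for illustration.

```python
import random

def evolve(loss, weights, generations=200, population=30, sigma=0.1, lr=0.05, seed=0):
    """Plain evolution strategy (illustrative): update weights using loss values only."""
    rng = random.Random(seed)
    w = list(weights)
    for _ in range(generations):
        # Sample Gaussian perturbation directions, one per population member.
        noises = [[rng.gauss(0.0, 1.0) for _ in w] for _ in range(population)]
        # Score each perturbed copy of the weights.
        losses = [loss([wi + sigma * ni for wi, ni in zip(w, n)]) for n in noises]
        # Subtract the mean loss as a baseline to reduce variance.
        mean = sum(losses) / population
        for i in range(len(w)):
            # Estimate the gradient from how the loss correlates with the noise.
            grad = sum((l - mean) * n[i] for l, n in zip(losses, noises)) / (population * sigma)
            w[i] -= lr * grad
    return w

def toy_loss(w):
    # Hypothetical stand-in for a model loss; minimum at (3, -1).
    return (w[0] - 3.0) ** 2 + (w[1] + 1.0) ** 2

best = evolve(toy_loss, [0.0, 0.0])
print(best, toy_loss(best))
```

Each generation here costs `population` loss evaluations, which hints at why the naive version gets expensive when the weights are a full LLM.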
Over at the Erdos problem website, AI assistance is now becoming routine. Here is what happened recently regarding Erdos problem #367 https://www.erdosproblems.com/367 :
1. On Nov 20, Wouter van Doorn produced a (human-generated) disproof of the second part of this problem, contingent on a congruence identity that he thought was true, and was "sure someone here is able to verify... does indeed hold".
2. A few hours later, I posed this problem to Gemini Deepthink, which (after about ten minutes) produced a complete proof of the identity (and confirmed the entire argument): https://gemini.google.com/share/81a65aecfd70 . The argument used some p-adic algebraic number theory which was overkill for this problem. I then spent about half an hour converting the proof by hand into a more elementary proof, which I presented on the site. I then remarked that the resulting proof should be within range of "vibe formalizing" in Lean.
3. Two days later, Boris Alexeev used the Aristotle tool from Harmonic to complete the Lean formalization, making sure to formalize the final statement by hand to guard against AI exploits. This process took two to three hours, and the output can be found at https://borisalexeev.com/t/Erdos367.lean
EDIT: after making this post, I decided to round things out by making AI literature searches on this problem, which (after about fifteen minutes) turned up some related literature on consecutive powerful numbers, but nothing directly relating to #367. https://chatgpt.com/share/6921427d-9dc0-800e-b798-be8fc94a9240 https://gemini.google.com/share/0d296454bea0
@simon way to bury the lede: pelican-AGI v2 released!
My notes on Gemini 3, including analyzing a 3.5 hour council meeting audio recording and performance on a new, improved version of my pelican on a bicycle benchmark https://simonwillison.net/2025/Nov/18/gemini-3/
i'm delighted to be hosting some academic course material at https://grebedoc.dev!
(yes, you can push 750 MB of slides and stuff as a single site to it. yes, i will gladly host it! no, it will not cost me any remotely meaningful amount of money, push at your leisure)
A little while ago, we had a terrific discussion with Jerry Neumann on Oxide and Friends, vowing to have him return with his co-author Elizabeth Zalman to discuss their book, "Founder vs. Investor."
Today, Jerry and Liz join @ahl , @sdtuck and me, along with Oxide investor Seth Winterroth, to get into some of the untold stories of founders and investors.
Join us today, at a special East Coast and Europe friendly time: noon Pacific, 3p Eastern:
Ashley doesn't really post here anymore but ICYMI she's now the Editor-in-Chief of the Journal of Undergraduate Neuroscience Education (JUNE)!
Which is just mind-blowing because I am pretty sure it was only yesterday we were both anxiously waiting to know whether we'd win the absolutely bonkers two body problem lottery and she'd get a faculty offer from UCSD, moving from solo research to focus on building the pipeline of neuroscience for all students.
https://bsky.app/profile/analog-ashley.bsky.social/post/3m5ttc73fwc2u
@ligasser I'm not sure you can rely on these types of responses
“When you liberate programming from the requirement to be professional and scalable, it becomes a different activity altogether, just as cooking at home is really nothing like cooking in a commercial kitchen”
Great post, very much describes me and programming.
@timkellogg.me another word for agency is opportunism; there is such a thing as too much of it (especially if unevenly distributed)
i made some changes to https://eulerroom.com !!! it should work better on phones now, and should be a bit easier to read on everything
we've filled 76 out of 96 slots for the upcoming live stream marathon for Palestine. grab your slot before it's too late :)
pleaze share this around with all live coders! particularly those in different time zones to me <3 xx
Oh no - archive.today is under attack. I was always wondering how they finance their service, and who is behind it. Now it seems that the FBI is targeting it...
- people ask to put it on a blacklist: https://adguard-dns.io/en/blog/archive-today-adguard-dns-block-demand.html
- Wikipedia writes the FBI subpoenaed their registrar: https://en.wikipedia.org/wiki/Archive.today
- the talk page of Wikipedia marks it as a Russian company
- traceroute points to an Estonian server
- some countries already block DNS requests to it
Such a nice, illegal service.
Some notes on GPT-5.1, which is now available in the OpenAI API
The new reasoning options are interesting, but the pelican feels like a bit of a regression from GPT-5 https://simonwillison.net/2025/Nov/13/gpt-51/
@timkellogg.me it is definitely a good dive into handwriting recognition (which is a niche use case by volume). I don't think the model had to make a calculation in that case though, it would only have needed enough knowledge about the context (how much sugar is typically bought, what units are used...).
Gemini 3 may still be better than others at synthesizing context and its learned knowledge, but that seems more of an incremental improvement than the post makes it out to be.
code / data wrangler in Switzerland.
Recovering reply guy. Posts random photos once in a while.