Simon boosted

I'm an IT professional and I:

Simon boosted

It is tempting to view the capability of current AI technology as a singular quantity: either a given task X is within the ability of current tools, or it is not. However, there is in fact a very wide spread in capability (several orders of magnitude) depending on what resources and assistance one gives the tool, and how one reports its results.

One can illustrate this with a human metaphor. I will use the recently concluded International Mathematical Olympiad (IMO) as an example. Here, the format is that each country fields a team of six human contestants (high school students), led by a team leader (often a professional mathematician). Over the course of two days, each contestant is given four and a half hours on each day to solve three difficult mathematical problems, given only pen and paper. No communication between contestants (or with the team leader) during this period is permitted, although the contestants can ask the invigilators for clarification on the wording of the problems. The team leader advocates for the students in front of the IMO jury during the grading process, but is not involved in the IMO examination directly.

It is widely regarded as a highly selective mark of mathematical achievement for a high school student to score well enough at the IMO to receive a medal, particularly a gold medal or a perfect score; this year the threshold for gold was 35/42, which corresponds to answering five of the six questions perfectly (each question is worth 7 points). Even answering one question perfectly merits an "honorable mention". (1/3)

Simon boosted

halone.within.lgbt need to test something, thanks!

Boost for reach!

Simon boosted

I scraped the schedule for Open Sauce 2025 this morning and built an alternative schedule interface with the option to add everything to your calendar (via ICS)... working entirely on my iPhone, using OpenAI Codex and Claude Artifacts

I guess you could call this "vibe scraping"? OpenAI Codex turns out to be great at writing custom scrapers if you give it internet access and tell it to download and install Playwright

Prompts + transcripts: simonwillison.net/2025/Jul/17/

Simon boosted

“the worst possible thing, in your own mind” 😳

Simon boosted

I've been thinking about this comment from Ted a lot since he posted it. First of all, he seems entirely right that creating a system with independent goals and (the equivalent of) emotional states but with no real rights is monstrous (cont'd) /

RE: https://bsky.app/profile/did:plc:565ebob5f6hw33hjdkxty6qj/post/3ltq3xtqtjc2s

Ted Underwood  
I think what people underestimate is that, at some point, it’s going to be unethical to give these things what they’re missing — if what they’re m...

@rtyler @bert_hubert I do think a lot of people fail to grasp the potential implications of performance at scale (Dan Luu has written on this, e.g. danluu.com/algorithms-intervie). Then again, most people (me included, I expect*) do not work on code that runs at scale.

*at least that was not meant to run at scale 🤞

@mcc ghost Kirby is afraid of whatever that is. Is that business Vegeta?

Simon boosted

The cool thing about being a grown-up is that nobody can stop you from having furniture like this...

#robot #bedside #table #BotLife

Simon boosted

Would you watch a stream where I try to use vibe coding tools?

Simon boosted

Nat, do not work for Meta. Build your own. Be your own person. Under no circumstances should you ever join their cult of hot or not. Nat, call me. I'll work with you.

@akamran @noplasticshower I want a t-shirt that says "a cross between a lentil and a velociraptor"

Simon boosted

@noplasticshower "Lone star ticks are aggressive and can speedily follow a human target if they detect them. “They will hunt you, they are like a cross between a lentil and a velociraptor,” said Sharon Pitcairn Forsyth, a conservationist who lives in the Washington DC area."
😬

@nic221 newspapers are not universal enough (Spotify/Netflix*) or specific enough (substack newsletters with a dedicated audience of 'true fans') to work as a subscription business. I personally would like to be able to pay a small amount per article, instead of all-you-can-eat or nothing, but I'm not sure if this is a widespread desire. For now newspapers are mostly losing the game of chicken with search engines (they make the whole content of the article available to crawlers, so I can read it for free on archive.today et al). In my home region they don't do it, which I imagine leads to very low organic growth, but maybe they don't care.

* seems to be showing its limitations with the proliferation of TV subscriptions...

Qoto Mastodon

QOTO: Question Others to Teach Ourselves
An inclusive, Academic Freedom, instance
All cultures welcome.
Hate speech and harassment strictly forbidden.