Gave some elevator logic code (Python) to a room of smart coders at a trading firm yesterday and asked "does it have any bugs in it?" Nobody found the bug, but a few noted some hard-to-describe oddities that were perhaps bug-adjacent.

Just gave the same code to Gemini 2.5 Pro and asked it to find bugs. It found a few superficial Python things and a few bits of odd behavior that none of the human coders found. But it also didn't find the bug.

So, the elevator remains undefeated.

@dabeaz now I want to see this code. I’ve never been humbled by an elevator before.

@neutrinoceros For the full humbling experience, read over the first part of this article and follow my instructions: "Ok computer person, code me up an elevator and prove that the code actually works. It's just an elevator."

Image source: technologyreview.com/2023/02/1

@dabeaz @neutrinoceros Ick, Rust - solving non-problems by making basic coding more error prone since 2007

@vy @neutrinoceros Rust aside, one will very quickly come around to the idea that an elevator might crash because of dozens of things besides memory safety.

@dabeaz @neutrinoceros or it could easily be a memory safety issue in a driver or hardware controller where Rust would have to run in unsafe mode or use a C library


@vy @dabeaz @neutrinoceros At least the elevator cabin itself didn't crash from the top floor while the doors were open, cutting someone's stuff in progress.
(Gonna need some model checking to prevent that...)
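
For anyone wondering what "model checking" might look like here, a minimal sketch in Python. This is not the elevator code from the thread: the three-floor building, the state tuple `(floor, doors_open, moving)`, and the names `next_states`, `find_violation`, and `guard_doors` are all made up for illustration. It just enumerates every reachable state of a toy model and looks for one where the cabin moves with the doors open.

```python
from collections import deque

FLOORS = 3  # hypothetical toy building, not from the thread


def next_states(state, guard_doors=True):
    """Yield successors of (floor, doors_open, moving) in a toy elevator model."""
    floor, doors_open, moving = state
    if moving:
        # The cabin arrives at an adjacent floor and stops.
        for f in (floor - 1, floor + 1):
            if 0 <= f < FLOORS:
                yield (f, doors_open, False)
    else:
        # Doors may toggle only while the cabin is stopped.
        yield (floor, not doors_open, False)
        # Departure: the guarded model requires closed doors first.
        if not doors_open or not guard_doors:
            yield (floor, doors_open, True)


def find_violation(guard_doors=True):
    """Breadth-first search over reachable states.

    Returns the shortest path to a state that is moving with doors open,
    or None if no such state is reachable.
    """
    start = (0, False, False)
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        _, doors_open, moving = state
        if doors_open and moving:
            # Rebuild the counterexample trace back to the start state.
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in next_states(state, guard_doors):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None
```

With `guard_doors=True` the search should come back empty; with `guard_doors=False` it returns a short trace (open the doors, then depart) showing how the unsafe state is reached. A real controller has far more state (requests, timers, sensors), which is where dedicated tools like TLA+ or SPIN earn their keep.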
