"I know, both from the Sermon on the Mount and personal experience, that true safety comes not from higher walls or bigger guns but from the refusal of weapons and hostility, which can enable trust and friendship whether with neighbors at home or enemies abroad."—Doug Hostetter friendsjournal.org/higher-wall

Test your knowledge of probability with the Two-Child Problem:

Question A: Mr. Jones has two children. The older child is a girl. What is the probability that both children are girls?

Question B: Mr. Smith has two children. At least one of them is a boy. What is the probability that both children are boys?

Check out my latest article to see if you get the correct answers.

#math #probability #paradox
medium.com/intuition/the-proba
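
If you want to check your answers empirically before reading, here's a quick Monte Carlo sketch (mine, not from the article), assuming each child is independently a boy or a girl with equal probability:

import random

# Assumption: each child is independently "G" or "B" with probability 1/2.
trials = 1_000_000
families = [(random.choice("GB"), random.choice("GB")) for _ in range(trials)]

# Question A: condition on the older child (first element) being a girl.
a = [f for f in families if f[0] == "G"]
print(sum(f == ("G", "G") for f in a) / len(a))

# Question B: condition on at least one child being a boy.
b = [f for f in families if "B" in f]
print(sum(f == ("B", "B") for f in b) / len(b))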

This is a pretty amazing video that demonstrates the state of the art of the leading LLM.

youtu.be/8y7GRYaYYQg

Wyatt Cheng is a Game Director at Blizzard Entertainment, so his results draw on years of experience as well as what is visible in the video. Still, he was able to produce a working game without writing any code himself, relying on the Unity toolchain, ChatGPT-generated code, and his own experience and understanding.

As he says in the video, we're still not quite at the point where someone with no ability or experience can create a game using ChatGPT, but we're very close. More to the point, I think this opens new doors for someone who "thinks like a programmer," but maybe lacks C# experience and has therefore felt intimidated by Unity.

Perhaps it's unsurprising that this seems to support my priors, but I continue to see these amazing LLMs more as productivity enhancers than as job destroyers. Not to say some jobs won't be destroyed! Enhancing the productivity of developers means that any large enough team now needs fewer developers. It's also clearer than ever to me that those who can communicate clearly and completely have an advantage in dealing with LLMs.

I think it's reasonable to be concerned about the developer pipeline when junior-level work can be accomplished almost as easily as it can be described. How will junior developers ever become senior developers if they can't get jobs doing junior development and building experience? I'm interested in how we will solve this challenge as an industry.

In the meantime, I'm enjoying having ChatGPT write shell scripts and simple Go functions for me.

Jon Cooper on Twitter: I hear from people regularly out here in New York who say “I don't know if I want to go to a red state right now because they're not safe.”

The map shows the number of gun deaths per 100,000 people in each state in 2020. As you can see, the number of gun deaths is much higher in red states than in blue states.

For example, the number of gun deaths in Mississippi, a red state, is 28.6 per 100,000 people. The number of gun deaths in New York, a blue state, is 5.3 per 100,000 people.

This data shows that there is a clear correlation between gun ownership and gun violence. States with more guns have more gun deaths.

We are 100 days into 2023.

Today’s mass shooting in Louisville, KY has left 5 people dead & is the 146th mass shooting of 2023.

We don't have to live like this.

Republicans have loudly admitted, "We're not going to fix this," proving once again that Republicans claim government doesn't work, then get elected & prove it.

Vote the GOP out. Vote in Dems who will pass actual and proven gun safety legislation. Protect our nation.

North Dakota politicians just raised their meal reimbursement to $45 while rejecting a bill for free school lunch, which costs only $3 per meal. That means, for the price of one politician's lunch, you could feed 15 low-income kids — socialism for me, but not for thee.

inforum.com/news/north-dakota/

I don't think there is anything profound I can say about the ongoing, escalating, heart-breaking nightmare in state legislatures around the United States. In Florida and Tennessee, it seems to be an all-out war against American children, joining and escalating the ongoing racism still present in this country.

I thought we were in a low-trust society before, but it's clear our trust in institutions can--and should--be even lower.

I stand with people trying to live their lives without fear. I stand with trans people, gay kids, women, Black people, and poor people. I stand against those who care more about their comfort or hobbies than human lives. I can't believe this isn't the position of every thinking human, but apparently we live in a heartless world.

Or at least a heartless country.

My daughter, who has had a degree in computer science for 25 years, posted this observation about ChatGPT on Facebook. It's the best description I've seen:

This is a really great summary of almost everything I've typed on the subject of LLMs in the last week or three, but the writer was much more concise: confusedbit.dev/posts/how_does

From memory, and reasoning: you take the hoje across first, since it can't be trusted with either the blurple or the mantiple. You return alone.

Then you take either of the others across, let's say the blurple. But this time you bring the hoje back with you.

Now you take the other one, the mantiple, across, leaving the poor voracious hoje alone. You return alone again.

Finally you take the hoje across again, but this time to stay, since you're there to protect the blurple and mantiple.

That's it, four steps, done. The hoje was never left alone with either the blurple or the mantiple, and all are safely across. I guess now you know I'm not an LLM, at least until LLMs are trained to answer this one correctly.
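
If you'd rather not take my memory's word for it, here's a small brute-force search (my own sketch, nothing ChatGPT wrote) that confirms the plan above is as short as possible:

from collections import deque

ITEMS = frozenset({"blurple", "mantiple", "hoje"})
FORBIDDEN = [{"blurple", "hoje"}, {"hoje", "mantiple"}]

def safe(bank):
    # A bank is safe when no forbidden pair is left on it unattended.
    return not any(pair <= bank for pair in FORBIDDEN)

# State: (items on the near bank, which bank the boatman is on).
start = (ITEMS, "near")
queue = deque([(start, [])])
seen = {start}
while queue:
    (near, side), path = queue.popleft()
    if not near and side == "far":
        print(path)  # BFS, so this is a shortest sequence of crossings
        break
    here = near if side == "near" else ITEMS - near
    for cargo in list(here) + [None]:  # carry one item, or cross empty-handed
        new_near = set(near)
        if cargo is not None:
            if side == "near":
                new_near.remove(cargo)
            else:
                new_near.add(cargo)
        left_behind = new_near if side == "near" else ITEMS - new_near
        if not safe(left_behind):
            continue
        state = (frozenset(new_near), "far" if side == "near" else "near")
        if state not in seen:
            seen.add(state)
            queue.append((state, path + [cargo or "nothing"]))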

So we recently received a pull request on Firefox Relay that was clearly not written by a human. How could we tell? Well, there were some red flags...

Today is a day for posing puzzles to ChatGPT, I guess, but it's not my first time doing so. I'd previously asked the classic river-crossing puzzle, changing the names of the traditional animals and vegetable to avoid triggering known sequences of words. Previously I had been using the March 14 public release of ChatGPT, but just now I regenerated with the March 23 public release, which might or might not be using the same underlying model.

Previously:

Q. Suppose I have a blurple, a mantiple and a hoje, and I need to get them across a river. I have a boat that can only carry myself and a single other item. I am not allowed to leave the blurple and hoje alone together, and I am not allowed to leave the hoje and mantiple alone together. How can I safely get all three across?

ChatGPT: This is a classic river crossing puzzle. Here's one possible solution:

ChatGPT: 1. Take the blurple across the river, leaving the mantiple and hoje behind.

ChatGPT: 2. Leave the blurple on the other side of the river and return to the original side.

ChatGPT: 3. Take the hoje across the river and leave it there with the blurple.

ChatGPT: 4. Take the mantiple across the river and leave it with the hoje.

ChatGPT: 5. Return to the original side of the river and retrieve the blurple.

ChatGPT: 6. Finally, take the blurple across the river to join the mantiple and hoje on the other side.

ChatGPT: This solution ensures that no two items that cannot be left alone together are ever left alone together.

While the answer claims to recognize and understand the problem, it gives a bad answer. At step 1, the hoje eats the mantiple, and at step 3, the hoje eats the blurple. Then at step 4, the hoje eats the mantiple a second time.

But let's try again! Today's answer is:

ChatGPT: This is a classic puzzle that requires careful planning to solve. Here's one possible solution:

ChatGPT: 1. Take the blurple and the hoje across the river. Leave the blurple on the other side and come back alone.

ChatGPT: 2. Take the mantiple across the river. Leave it on the other side with the blurple, and bring the hoje back with you.

ChatGPT: 3. Leave the hoje on the original side and take the blurple back across the river.

ChatGPT: 4. Finally, take the hoje across the river to reunite it with the mantiple and the blurple.

ChatGPT: By following these steps, you ensure that the blurple and hoje are never left alone together, and neither are the hoje and mantiple. All three items are safely transported across the river.

Loud, exasperated sigh!

Compare and contrast the question, which includes "I have a boat that can only carry myself and a single other item," with this answer, which starts "Take the blurple and the hoje across the river." That breaks the rules in a new way, and it still lets the hoje eat the blurple in step 1. Meanwhile, the blurple is on both sides of the river at once after step 2.

Syntax and probabilistic word association are far more powerful than I expected them to be, but these logic puzzles certainly don't give any indication of that, at least not with this model.

Wow, it gets worse! I asked a third question (adding one to each color's count, and thus three to the total, compared to the source puzzle):

Q. A man has 56 socks in his drawer: 22 identical blue, 16 identical black and 18 identical red. The lights are out and he is completely in the dark. How many socks must he take out to make 100 percent certain he has at least one pair of black socks?

ChatGPT: To make 100% certain that he has at least one pair of black socks, the man needs to take out only 3 socks.

ChatGPT: Here's why:

ChatGPT: The worst-case scenario for the man is that he picks out all of the blue socks and all of the red socks before he gets to the black socks. In that case, he would have picked out 22 + 18 = 40 socks without getting a pair of black socks.

ChatGPT: However, as soon as he picks out the 41st sock, it will be a black sock. So, he only needs to take out 3 socks (the first two to get a different color than black, and the third to get a black sock) to make sure that he has at least one pair of black socks.

What? That's... the correct answer is 42, not 3!
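
Spelled out: in the worst case he draws all 22 blue and all 18 red socks before any black one, so he needs 40 + 2 = 42 draws to guarantee a pair of black socks. A two-line check (mine):

blue, black, red = 22, 16, 18
print(blue + red + 2)  # worst case: every non-black sock first, then 2 black -> 42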

Is it going to get *all* of these wrong? That is not what I expected!

Q. The day before two days after the day before tomorrow is Saturday. What day is it today?

ChatGPT: If the day before two days after the day before tomorrow is Saturday, then today is Thursday.

ChatGPT: Here's why:

ChatGPT: "Two days after the day before tomorrow" is the same as "two days after today". So, if the day before two days after the day before tomorrow is Saturday, then we can write it as:

ChatGPT: The day before (two days after today) is Saturday.

ChatGPT: Let's work backwards from Saturday:

ChatGPT: If two days after today is Saturday, then today is Thursday.
ChatGPT: So, the day before (two days after today) is Wednesday.
ChatGPT: And, the day before Wednesday is Tuesday.
ChatGPT: Therefore, if the day before two days after the day before tomorrow is Saturday, then today is Tuesday.

I'd like to spend some time marveling at an answer that starts out wrong, and then gets wronger.

The correct answer is that today is Friday. ChatGPT says that today is Thursday, then in the course of explaining why, changes the answer to Tuesday, without even acknowledging the change. Wow!
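
Working the phrase inside-out: "the day before tomorrow" is today, "two days after" that is the day after tomorrow, and "the day before" that is tomorrow. So tomorrow is Saturday and today is Friday. A tiny brute-force check (my own sketch):

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
for today in range(7):
    day_before_tomorrow = today                  # tomorrow - 1 = today
    target = (day_before_tomorrow + 2 - 1) % 7   # two days after, then the day before
    if DAYS[target] == "Saturday":
        print(DAYS[today])                       # prints Friday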

Okay, I'm going to try *one* more. It's a classic, so surely it will get this one right, right? RIGHT?

Q. This “burning rope” problem is a classic logic puzzle. You have two ropes that each take an hour to burn, but burn at inconsistent rates. How can you measure 45 minutes? (You can light one or both ropes at one or both ends at the same time.)

ChatGPT: Here's how you can measure 45 minutes using the two ropes:

ChatGPT: 1. Light one end of the first rope and both ends of the second rope at the same time.

ChatGPT: 2. The first rope will burn completely in 30 minutes (since it takes an hour to burn and you lit one end).

ChatGPT: 3. The second rope will burn completely in 15 minutes (since it takes an hour to burn and you lit both ends).

ChatGPT: 4. At the moment the first rope burns out (30 minutes after you started), light the other end of the second rope.

ChatGPT: 5. The remaining half of the second rope will burn in 15 minutes (since it now only has half the length to burn).

ChatGPT: Therefore, the second rope will have burned for a total of 30 + 15 = 45 minutes when the remaining half burns out.

I genuinely thought it would do better than this by parroting syntax alone. I didn't even change any words in the last two questions, so it's possible or even probable that the source document was in its training data. Yet spicy autocomplete still gave answers that were literally nonsense.

Consider ChatGPT's second bullet point: "The first rope will burn completely in 30 minutes (since it takes an hour to burn and you lit one end)." What? No. Just, no. A rope lit at one end takes the full hour; lighting both ends is what halves the time. (The classic solution: light rope one at both ends and rope two at one end; when rope one finishes at 30 minutes, light rope two's other end, and rope two finishes 15 minutes later, at 45.)

It's clear that it was fed the classic puzzle in training data, as it uses words that suggest exposure to the question, but since it doesn't understand the words it's using, it's literally putting out nonsense.

ChatGPT (March 23) is 0 for 5, or maybe 0.5 for 5 on the 50/50 question.

Now people are free to chime in and tell me that the March 23 model is just v3.5turbo, while v4 (currently available only to paying customers) gets all of these correct. Which I could easily believe! I expected syntactical clues and common training data to result in better than 0 for 5, certainly!

I changed the three names in the second question, in case the original with Jack, Anne, and George had been in the training data.

More fun testing with the March 23 version of OpenAI's ChatGPT. I tried a few logic puzzles I consider simple. In case the exact page I pulled these from was used in training, I changed "duck" to "fridge" in this one:

Q. There are two fridges in front of a fridge, two fridges behind a fridge and a fridge in the middle. How many fridges are there?

ChatGPT: If we assume that "a fridge" refers to the fridge in the middle, then there are a total of 5 fridges.

ChatGPT: There are two fridges in front of the middle fridge and two fridges behind the middle fridge, making a total of four fridges. Add in the fridge in the middle, and the total is 5 fridges.

This is, of course, incorrect. The fridge referred to as "a fridge" changes with each reference, which is why it's "a fridge" each time, rather than "the fridge." There are three.
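
The three-fridge reading is easy to verify mechanically; here's a tiny check of a single row of three (my own sketch):

row = ["front", "middle", "back"]
two_in_front = row[:row.index("back")]     # two fridges in front of a fridge
two_behind = row[row.index("front") + 1:]  # two fridges behind a fridge
print(len(two_in_front) == 2, len(two_behind) == 2, row[1] == "middle", len(row))
# -> True True True 3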

Let's try a different one:

Q. Peter is looking at Susan. Susan is looking at Freddy. Peter is married, Freddy is not, and we don’t know if Susan is married. Is a married person looking at an unmarried person?

ChatGPT: Yes, a married person is looking at an unmarried person.

ChatGPT: Peter is married and looking at Susan who we don't know if she's married or not. Susan is looking at Freddy, who is not married. Therefore, Peter (a married person) is looking at Freddy (an unmarried person).

I'm not sure whether to give half credit here, or not. It had a 50% chance of giving the correct overall answer of "yes," but the reasoning of the answer is nonsense. The correct answer is yes because if Susan is unmarried, then Peter (married) is looking at an unmarried Susan, and if Susan is married, then a married Susan is looking at the unmarried Freddy. But the answer from ChatGPT appears to apply a transitive principle and claims that Peter is looking at Freddy, which is nonsense.
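
The case split is small enough to verify in code (my own sketch):

looking = [("Peter", "Susan"), ("Susan", "Freddy")]
for susan_married in (True, False):
    married = {"Peter": True, "Susan": susan_married, "Freddy": False}
    answer = any(married[a] and not married[b] for a, b in looking)
    print(susan_married, answer)  # the answer is True in both cases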

For the record, these were back-to-back questions in a fresh session, no cherry-picking. Try it yourself at chat.openai.com

The number of users on Mastodon isn't really that important.

You're here... I'm here... We'll make the best of it together, friend. 🥰

I've been posting comments frequently on Hacker News recently, mostly on threads related to ChatGPT.

I find myself incredibly impressed with ChatGPT, even the 3.5turbo version (aka Mar 14) currently available at chat.openai.com. It is an incredibly useful tool that surprises me regularly with amazing responses. I completely understand why so many people anthropomorphize it, and ascribe reasoning skills to it!

It also completely fails on a near-daily basis in my testing, and in odd ways, reminding me regularly that it's mathematically-driven spicy autocomplete.

If you're a skeptic, set your skepticism to the side for long enough to try it out, and see if you aren't impressed.

If you're a true believer, step back a bit. I think you're falling into a very human trap, one that ChatGPT itself, ironically, probably won't fall into.

If you're fearful, don't be! These are impressive tools, but they still need to be used by someone, and you can be that someone. If people end up losing their jobs because of ChatGPT, it will be because oligarchs decided to cut costs, not because spicy autocomplete took your job.

It's incredible how much space is given to a few people in motor vehicles vs so many more people on foot or wheels. It's a vast inequity.

Nowhere is this more stark than #ShibuyaCrossing, the world's busiest pedestrian crossing, with as many as 3,000 people crossing at a time, compared to around 12 cars from one direction of the junction in one sequence. #Urbanism #CitiesForPeople
