**CF Bolz-Tereick** @cfbolz@mastodon.social · Dec 04, 2022, 12:46

On Twitter I had a thread going this year in which I tried to reflect on bugs that I found throughout the year, how to avoid this kind of bug, what can be learned, etc. I'll port this idea over here and see how it goes (for now I'm still both here and on Twitter).

**CF Bolz-Tereick** @cfbolz@mastodon.social · Dec 04, 2022, 12:47

Recently I fixed a bug in PyPy's time.strftime. It was using a Unicode helper function that takes as arguments a byte buffer holding a UTF-8 encoded string, as well as the number of code points in it. strftime was using this API wrong and passing the number of bytes instead.

https://foss.heptapod.net/pypy/pypy/-/issues/3862

**CF Bolz-Tereick** @cfbolz@mastodon.social · Dec 04, 2022, 12:48

After finding the bug we tried to make this API more robust by having a check in the function that counts the code points in the byte buffer and complains if that is different from the second argument. This shouldn't be on by default for performance reasons, but it's on during testing.

The reason why the bug got away for so long is that if you test only with ASCII chars it works, because number of bytes == number of codepoints in that case. Lesson: write tests with wider ranges of characters.
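To make the failure mode concrete, here is a minimal sketch in plain Python. It is not PyPy's actual helper: `make_text`, `count_code_points`, and the `CHECK_LENGTHS` switch are hypothetical names invented for illustration. It shows how passing the byte length goes unnoticed for ASCII input and is caught once the buffer contains a multi-byte character.

```python
# Hypothetical switch: imagined as enabled in tests, disabled in production.
CHECK_LENGTHS = True


def count_code_points(utf8_bytes: bytes) -> int:
    """Count code points by skipping UTF-8 continuation bytes (0b10xxxxxx)."""
    return sum(1 for byte in utf8_bytes if (byte & 0xC0) != 0x80)


def make_text(utf8_bytes: bytes, num_code_points: int) -> str:
    """Hypothetical helper: takes a UTF-8 buffer plus its code point count."""
    if CHECK_LENGTHS and count_code_points(utf8_bytes) != num_code_points:
        raise ValueError(
            f"claimed {num_code_points} code points, "
            f"buffer has {count_code_points(utf8_bytes)}"
        )
    return utf8_bytes.decode("utf-8")


ascii_buf = "Sunday".encode("utf-8")
non_ascii_buf = "d\u00e9cembre".encode("utf-8")  # 8 code points, 9 bytes

# Wrong call (byte length instead of code point count), but ASCII hides it.
ascii_result = make_text(ascii_buf, len(ascii_buf))

# The same wrong call is caught as soon as a multi-byte character appears.
try:
    make_text(non_ascii_buf, len(non_ascii_buf))
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```

A test suite that only ever used inputs like `ascii_buf` would pass with the buggy call, which is why a wider range of characters matters.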

**CF Bolz-Tereick** @cfbolz@mastodon.social · Dec 04, 2022, 13:01

Another bug, this time in itertools.tee: tee has an optimization that uses a __copy__ method on the iterator if it has one, instead of carefully using its generic implementation. However, PyPy got it wrong and copied the *iterable* instead of the iterator.

https://foss.heptapod.net/pypy/pypy/-/issues/3852

This works in simple tests, but in more complicated situations it gives nonsense.
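As a rough illustration of the behavior tee is expected to preserve, the sketch below uses a hypothetical `Countdown` iterator with a `__copy__` method (the class is invented here and is not taken from the linked issue). Whether tee takes the `__copy__` shortcut or its generic path, both branches should continue from the iterator's current position and advance independently.

```python
import itertools


class Countdown:
    """Hypothetical iterator whose __copy__ snapshots the current position."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

    def __copy__(self):
        # An independent iterator that resumes from the same position.
        return Countdown(self.current)


it = Countdown(5)
first = next(it)  # consume one item before calling tee

branch_a, branch_b = itertools.tee(it)
result_a = list(branch_a)  # exhausting one branch must not affect the other
result_b = list(branch_b)
```

Consuming an item before calling tee and then exhausting one branch fully before touching the other is the kind of slightly more involved test that can expose a copy taken from the wrong object.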

**Paul Ganssle** @pganssle@qoto.org · 2022-12-04T13:18:50Z

@cfbolz Jeez this sounds like it could create some very annoying-to-debug situations. 😅

**CF Bolz-Tereick** @cfbolz@mastodon.social · Dec 04, 2022, 13:23

@pganssle I think the fact that it went unnoticed for a long time means that people don't use tee, and if they do, their objects don't have __copy__ 😅
