Don't you love it when the clang test suite fails because of a pthread_create "bad address" error or something, and the stack trace points to glibc :akkoderp:

So it's like, is clang broken (scary) or is it glibc (scarier)?

Though it only happens with clang-built-clang (and not with gcc-built-clang), so my guess is that it's clang that is broken, but...

Okay it seems to not happen when I set LIBCLANG_NOTHREADS so probably something happens during thread creation

Also, if I single-step through it with gdb then the problem's gone...

But it still happens if I just press continue through all the breakpoints

So, race condition?

So setting a breakpoint *just* before the instruction that traps to the kernel to do clone(2) seems to slow things down enough that the testcase succeeds, so now I strongly believe that this is indeed a race
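
One way to poke at this outside the test suite might be a tiny stress loop that does nothing but create and join threads, built once with each compiler. This is only a sketch: `stress_pthread_create`, `noop`, and the `MAX_THREADS` cap are made-up names for illustration, nothing here comes from clang or its tests, and there is no guarantee it hits the same race.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Trivial thread body: returns immediately, so thread creation dominates. */
static void *noop(void *arg) {
    return arg;
}

/* Hypothetical helper: create and join `nthreads` threads, `rounds` times.
 * Returns how many pthread_create calls failed. */
static int stress_pthread_create(int rounds, int nthreads) {
    enum { MAX_THREADS = 64 };   /* arbitrary cap for this sketch */
    pthread_t tids[MAX_THREADS];
    int failures = 0;

    assert(rounds >= 0 && nthreads >= 0);
    if (nthreads > MAX_THREADS)
        nthreads = MAX_THREADS;

    for (int r = 0; r < rounds; r++) {
        int started = 0;
        for (int i = 0; i < nthreads; i++) {
            /* pthread_create returns the error number directly (not via errno). */
            int err = pthread_create(&tids[started], NULL, noop, NULL);
            if (err != 0) {
                fprintf(stderr, "round %d: pthread_create: %s\n",
                        r, strerror(err));
                failures++;
            } else {
                started++;
            }
        }
        /* Join only the threads that were actually created. */
        for (int i = 0; i < started; i++)
            pthread_join(tids[i], NULL);
    }
    return failures;
}
```

Calling something like `stress_pthread_create(100000, 8)` from `main` and linking with `-pthread` would show whether plain thread creation ever fails on this machine. If it never does, the problem probably depends on what the clang process is doing around the call rather than on `pthread_create` alone.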

Now the question is, how do I track this down :notlikemiya:

@koakuma is it simple to compile clang with clang with TSan or ASan? (TSan might not find a race condition that's not a data race, but if the symptom is a dereference of unallocated memory, then ASan's stacks at deallocation time might contain enough of a clue)

@robryk Unfortunately I am not on a supported platform right now (sparc64), so no TSan for me...

@koakuma

Huh, I wonder if this is platform-specific. I'm starting to suspect that TLS might be broken in some way. Were it not for your sched-locked observation, I'd also suspect a disagreement between clang and libc on how to implement the different memory orders on that CPU (e.g. on x86_64 there are two ways to implement the C++ memory model which are incompatible with one another, and basically everyone picks one of them).
