Closed
Bug 1141568
Opened 10 years ago
Closed 5 years ago
TSan: data race js/src/jsgc.cpp:1794 prepareForIncrementalGC
Categories
(Core :: JavaScript: GC, defect)
Tracking
()
RESOLVED
WORKSFORME
People
(Reporter: froydnj, Unassigned)
References
(Blocks 2 open bugs)
Details
(Whiteboard: [tsan])
Attachments
(2 files, 1 obsolete file)
The attached logfile shows a thread/data race detected by TSan (ThreadSanitizer).
* Specific information about this bug
I see this race a fair amount (handful of times per small mochitest run), but I've never been able to get satisfactory stacks for the reading side of the race. (I assume the read comes from JIT code?) Posting this here in case somebody who's more acquainted with the GC understands what's going on...Terrence, any ideas?
* General information about TSan, data races, etc.
Typically, races reported by TSan are not false positives, but it is possible that the race is benign. Even in this case though, we should try to come up with a fix unless this would cause unacceptable performance issues. Also note that seemingly benign races can possibly be harmful (also depending on the compiler and the architecture) [1][2].
If the bug cannot be fixed, then this bug should be used to either make a compile-time annotation for blacklisting or add an entry to the runtime blacklist.
[1] http://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
[2] _How to miscompile programs with "benign" data races_: https://www.usenix.org/legacy/events/hotpar11/tech/final_files/Boehm.pdf
Flags: needinfo?(terrence)
Comment 1•10 years ago
|
||
Sorry it's taken me so long to look at this new round of races! Since we only have one stack here, it's really hard to tell what's going on. Nathan, do you have any other reports that end in prepareForIncrementalGC that have the other stack present?
What can we infer from the single stack? The fact that the updated field has size 2 means this is probably not a pointer, so it is probably the bit-field write of allocatedDuringIncremental. This bit field includes allocKind and a ton of other useful things that we read all over the place. The background sweep thread is indeed running here; however, the main thread is only looking at arenas from the emptyLists, which the background sweep thread should not be looking at -- it has its own list. I guess we don't halt parse threads for GC, so maybe some path there can call thing->allocKind()?
I think that's about as much as we can tell without the second stack.
Flags: needinfo?(terrence) → needinfo?(nfroyd)
| Reporter | ||
Comment 2•10 years ago
|
||
(In reply to Terrence Cole [:terrence] from comment #1)
> Sorry it's taken me so long to look at this new round of races! Since we
> only have one stack here, it's really hard to tell what's going on. Nathan,
> do you have any other reports that end in prepareForIncrementalGC that have
> the other stack present?
No problem, thanks for looking at these!
I haven't been able to get a two-sided stack report for this race yet. I will be looking for one with great eagerness. :)
The report says the read is coming from the "Analysis Helper" thread; I assume that's the "parse thread" you're talking about...or is that some JIT thread running?
Flags: needinfo?(nfroyd) → needinfo?(terrence)
Comment 3•10 years ago
|
||
"Analysis Helper" is one of the JS thread pool threads. We run all of our background work via this threadpool, so it could be absolutely anything.
Flags: needinfo?(terrence)
| Reporter | ||
Comment 4•10 years ago
|
||
Finally got two-sided stacks for this. Terrence, any insight based on this?
Attachment #8575321 -
Attachment is obsolete: true
Flags: needinfo?(terrence)
Comment 5•10 years ago
|
||
Uhg. This one is going to be hard to fix, but at least I don't think there is going to be any observable side-effects from any arbitrary ordering.
So what's going on here: On the background thread, the compiler backend is looking at its template objects in order to generate code to handle objects like those kinds of objects. Some of that information (e.g. the allocKind) is stored in arena headers for sharing. Simultaneously, we are doing a GC that is traversing the arena headers. They are reading and writing on different fields; however, these fields are stored in a bit field.
The compiler thread is doing reads of allocKind and the gc threads is doing reads and writes of allocatedDuringIncremental. So, allocKind is fixed for the duration of the GC and the compilation, because we cannot sweep the arena, because the objects are held by the compiler. Therefore, *unless* the gc thread temporarily writes invalid allocKinds into that word, I think this is "safe". This code is crazily inlined, so I wasn't able to trivially pinpoint the relevant writes -- I'll do a build with MOZ_NEVER_INLINE later and see if I can at least get some confidence that I'm not lying at least for clang and gcc.
Flags: needinfo?(terrence)
| Reporter | ||
Comment 6•10 years ago
|
||
I learned today that I wasn't using llvm-symbolizer to print stacks, which does a much better job of dealing with inlined code. I can see if I can reproduce the race with llvm-symbolizer active.
There's no such thing as a "safe" race. :p
| Reporter | ||
Comment 7•10 years ago
|
||
And thank you for the quick response!
Comment 8•10 years ago
|
||
(In reply to Nathan Froyd [:froydnj] [:nfroyd] from comment #6)
> I learned today that I wasn't using llvm-symbolizer to print stacks, which
> does a much better job of dealing with inlined code. I can see if I can
> reproduce the race with llvm-symbolizer active.
>
> There's no such thing as a "safe" race. :p
Hey, I used appropriate scare quotes around "safe". ;-)
Specifically, by "safe", I mean that I do not see any way that a RAR, RAW, WAR, or WAW reordering could result in dynamically unhandled behavior, and spent a good 15 minutes thinking about the possibilities. Of course, the caveat here is that this is only an analysis at the C++ level -- whether that holds for any particular combination of CPU, compiler, and optimization level is anyone's guess. I, for one, would guess: not safe.
Comment 9•10 years ago
|
||
(In reply to Terrence Cole [:terrence] from comment #8)
> Specifically, by "safe", I mean that I do not see any way that a RAR, RAW,
> WAR, or WAW reordering could result in dynamically unhandled behavior, and
> spent a good 15 minutes thinking about the possibilities. Of course, the
> caveat here is that this is only an analysis at the C++ level -- whether
> that holds for any particular combination of CPU, compiler, and optimization
> level is anyone's guess. I, for one, would guess: not safe.
Yes. Proving that it's harmless at the source level is not enough
because the standard (C++11) specifically allows implementations to reorder
delivery of stores between cores. That mean both compilers and hardware
are free to do such reordering. Also, since C++11 declares racey code to
constitute undefined behaviour, if a compiler can ever prove the code
racey, it can transform it however it likes.
Comment 10•8 years ago
|
||
I'm skimming through the current races reported while running SM(tsan), and I think the attached report matches -- or at least, is very similar to -- the one discussed here.
Jan, I don't know how space-sensitive MIR nodes are. Would it be possible to store a template object's allocKind alongside the object pointer? Or at least, I'm assuming we could record that information on the main thread when creating the node; perhaps that is incorrect.
Alternatively, we could bloat up Arenas by splitting out the allocKind field into its own word. It'd be a shame to waste the space, although on 64 bits I'm not sure it would change anything. (We have 4056 bytes of data in a 4096 byte Arena, which is enough to fit 253 16-byte Cells with 8 bytes of slop. And a js::ShapedObject is 16 bytes. Though even a NativeObject is 32 bytes.)
Flags: needinfo?(jdemooij)
Comment 11•8 years ago
|
||
(In reply to Steve Fink [:sfink] [:s:] from comment #10)
> Jan, I don't know how space-sensitive MIR nodes are. Would it be possible to
> store a template object's allocKind alongside the object pointer? Or at
> least, I'm assuming we could record that information on the main thread when
> creating the node; perhaps that is incorrect.
That should be fine.
(We keep running into these issues with template objects where the background thread reads some immutable field that happens to share a word with something else :/ Eventually I'd like to encapsulate template objects better, or store all info we need in a separate data structure, to avoid this.)
Flags: needinfo?(jdemooij)
Comment 12•5 years ago
|
||
(In reply to Terrence Cole [:terrence] from comment #5)
The gc threads is doing reads and writes of allocatedDuringIncremental
This field has been removed, so closing this bug.
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WORKSFORME
You need to log in
before you can comment on or make changes to this bug.
Description
•