Closed
      
        Bug 979293
      
      
        Opened 11 years ago
          Closed 10 years ago
      
        
    
  
TSan: Off-thread parsing races the main thread accessing the atom table (data race js/HashTable.h:715 isFree)   
    Categories
(Core :: JavaScript Engine, defect)
Tracking
()
        RESOLVED
        FIXED
        
    
  
        
            mozilla39
        
    
  
| Tracking | Status | |
|---|---|---|
| firefox39 | --- | fixed | 
People
(Reporter: decoder, Assigned: n.nethercote)
References
(Blocks 1 open bug)
Details
(Whiteboard: [tsan])
Attachments
(3 files, 2 obsolete files)
| 31.06 KB,
          text/plain         | Details | |
| 4.93 KB,
          patch         | luke
:
              
              review+ | Details | Diff | Splinter Review | 
| 10.73 KB,
          patch         | bhackett1024
:
              
              review+ | Details | Diff | Splinter Review | 
The attached logfile shows a thread/data race (mozilla-central revision 626d99c084cb) detected by TSan (ThreadSanitizer).
Typically, races reported by TSan are not false positives, but it is possible that the race is benign. Even in this case though, we should try to come up with a fix unless this would cause inacceptable performance issues. Also note that seemingly benign races can possibly be harmful (also depending on the compiler and the architecture) [1].
If the bug cannot be fixed, then this bug should be used to either make a compile-time annotation for blacklisting or add an entry to the runtime blacklist.
[1] http://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
| Reporter | ||
| Updated•11 years ago
           | 
Component: General → JavaScript Engine
Summary: TSan: data race objdir-ff-tsan64opt/js/src/../../dist/include/js/HashTable.h:715 isFree → TSan: data race js/HashTable.h:715 isFree
| Comment 1•11 years ago
           | ||
Method names in the stack suggest off-thread parsing makes read-only accesses to the main thread's atom table, so in some loose theory that might be sufficient to be "safe" here.  If our hash table is thread-safe in the reading sense.  Which it might be, if it used thread-safe operations to update/access hash table state.  But it doesn't, so this is kind of asking for trouble, from appearances here.
Summary: TSan: data race js/HashTable.h:715 isFree → TSan: Off-thread parsing races the main thread accessing the atom table (data race js/HashTable.h:715 isFree)
|   | ||
| Comment 2•10 years ago
           | ||
This race is so common that it prevents mochitests from starting up under TSan due to hundreds of thousands of lines of output.  I appreciate that the function was named differently and used as such, but wouldn't it be nice if it did what was advertised on the tin?
|   | Assignee | |
| Comment 3•10 years ago
           | ||
js::HashTable lookup sounds like a read-only operation but it can modify the collision bit in |keyHash|, which is what TSAN is complaining about.
It feels like we want to have a way to freeze a js::HashTable and then it wouldn't do stuff with collision bits and there'd be extra checks to make sure you don't add or anything. We could even introduce a new type (js::FrozenHashTable) and provide a way to transfer the entryStorage from a js::HashTable to the js::FrozenHashTable.
|   | ||
| Comment 4•10 years ago
           | ||
(In reply to Nicholas Nethercote [:njn] from comment #3)
> js::HashTable lookup sounds like a read-only operation but it can modify the
> collision bit in |keyHash|, which is what TSAN is complaining about.
> 
> It feels like we want to have a way to freeze a js::HashTable and then it
> wouldn't do stuff with collision bits and there'd be extra checks to make
> sure you don't add or anything. We could even introduce a new type
> (js::FrozenHashTable) and provide a way to transfer the entryStorage from a
> js::HashTable to the js::FrozenHashTable.
I think TSan would still complain about this; at least in local testing, the stacks I get look something like:
  Read of size 4 at 0x7db000021350 by thread T41:
    #0 isFree /opt/build/froydnj/build-tsan/js/src/../../dist/include/js/HashTable.h:722 (libxul.so+0x0000049db986)
    #1 lookup /opt/build/froydnj/build-tsan/js/src/../../dist/include/js/HashTable.h:1264 (libxul.so+0x0000049db949)
    #2 readonlyThreadsafeLookup /opt/build/froydnj/build-tsan/js/src/../../dist/include/js/HashTable.h:1554 (libxul.so+0x0000049d77c2)
    #3 DefineProperty /home/froydnj/src/gecko-dev.git/js/src/jsapi.cpp:2597 (libxul.so+0x000004e69b8e)
    ...more frames
  Previous write of size 4 at 0x7db000021350 by main thread:
    #0 setCollision /opt/build/froydnj/build-tsan/js/src/../../dist/include/js/HashTable.h:729 (libxul.so+0x0000049db9be)
    #1 lookup /opt/build/froydnj/build-tsan/js/src/../../dist/include/js/HashTable.h:1257 (libxul.so+0x0000049db919)
    #2 readonlyThreadsafeLookup /opt/build/froydnj/build-tsan/js/src/../../dist/include/js/HashTable.h:1554 (libxul.so+0x0000049daab3)
    #3 getTokenInternal /home/froydnj/src/gecko-dev.git/js/src/frontend/TokenStream.cpp:1245 (libxul.so+0x000004ad732b)
    ...more frames
And TSan would still see that write (assuming we Move()'d into the FrozenHashTable) or the write from a copy (if we didn't Move(), which would be odd, IMHO) as coming before the read.  But having FrozenHashTable would at least make a place where it'd be a lot more convenient to MOZ_TSAN_BLACKLIST things, since it would be easier to explain what we're doing and we wouldn't be silently blacklisting away potential future races with hashtables.
|   | Assignee | |
| Comment 5•10 years ago
           | ||
Hmm, yes. Helgrind has annotations that I think can help with this sort of thing, esp. 
ANNOTATE_HAPPENS_BEFORE and ANNOTATE_HAPPENS_AFTER (http://www.valgrind.org/docs/manual/hg-manual.html#hg-manual.client-requests, but /usr/include/valgrind/helgrind.h has better documentation).
It looks like TSAN might support the same annotations, see https://code.google.com/p/data-race-test/wiki/DynamicAnnotations.
Also, the benefit of having js::FrozenHashTable is that we get to use the type system to enforce immutability.
| Reporter | ||
| Comment 6•10 years ago
           | ||
(In reply to Nicholas Nethercote [:njn] from comment #5)
> 
> It looks like TSAN might support the same annotations, see
> https://code.google.com/p/data-race-test/wiki/DynamicAnnotations.´
The page you are referring to is about TSan v1 (the valgrind based one), which is deprecated. We are using TSan v2 which probably doesn't have such annotations by design (as far as I can tell).
|   | ||
| Comment 7•10 years ago
           | ||
It looks like TSan v2 does have something like ANNOTATE_HAPPENS_BEFORE, since they have tests for it:
http://llvm.org/viewvc/llvm-project/compiler-rt/trunk/test/tsan/annotate_happens_before.cc?view=markup
(Thanks to Christian for pointing this out.)
So maybe we could use something like that.
|   | Assignee | |
| Updated•10 years ago
           | 
QA Contact: n.nethercote
|   | Assignee | |
| Comment 8•10 years ago
           | ||
One silly thing is that even when HashTable doesn't need to set a collision bit -- i.e.
when doing a lookup that doesn't precede an add -- it does a `|=` with a zero
RHS, i.e. a no-op write. This patch avoids that, and while it doesn't
solve the broader problem I think it's a necessary part of any solution.
froydnj, can you try it and see if it makes TSan complain less? Thank you.
        Attachment #8568987 -
        Flags: feedback?(nfroyd)
|   | Assignee | |
| Updated•10 years ago
           | 
Assignee: nobody → n.nethercote
Status: NEW → ASSIGNED
|   | ||
| Comment 9•10 years ago
           | ||
Comment on attachment 8568987 [details] [diff] [review]
Don't write collision bits unnecessarily
With this patch applied, I get zero warnings about hashtable collision races.
        Attachment #8568987 -
        Flags: feedback?(nfroyd) → feedback+
| Updated•10 years ago
           | 
QA Contact: n.nethercote
|   | Assignee | |
| Comment 10•10 years ago
           | ||
Comment on attachment 8568987 [details] [diff] [review]
Don't write collision bits unnecessarily
Review of attachment 8568987 [details] [diff] [review]:
-----------------------------------------------------------------
> With this patch applied, I get zero warnings about hashtable collision races.
Great! I was surprised to hear this because we still have the situation where the table is set up (i.e. written to a bunch of times) on one thread and then read from a bunch of times from multiple other threads. My best guess is that those other threads are created after the setup occurs, which would be enough to satisfy TSan.
I will still take a look at using some kind of separate "Frozen" type to make this code clearer and the guarantees stronger, but in the meantime I might as well get review on this part.
|   | Assignee | |
| Comment 11•10 years ago
           | ||
Luke, to summarize things...
- JSRuntime::permanentAtoms gets created and filled in at JSRuntime startup, by
  initializeAtoms().
  
- After that, it's read-only, so we let multiple threads read it without
  locking via threadsafeReadonlyLookup().
- *Except* that when you do a no-add-lookup in HashTable it actually does a
  `|= 0` on the keyHash, so it's not truly read-only. TSan complains frequently
  about this.
It's a genuine data race. Whether it would have negative effects in practice is
hard to say; the code is frequently overwriting keyHashes with the same value.
But given the unpredictability of the effects of data races in C++ (see
https://blog.mozilla.org/nnethercote/2015/02/24/fix-your-damned-data-races/)
and the simplicity of the fix, it's worth fixing.
BTW, I checked pldhash.cpp and it doesn't suffer from the same problem, because
SearchTable() only modifies the keyHash field if the |Reason| template
parameter is |ForAdd|. (Nb: pldhash.cpp doesn't have an equivalent to
threadsafeReadonlyLookup(), but if it did, it would be ok.)
        Attachment #8569355 -
        Flags: review?(luke)
|   | Assignee | |
| Updated•10 years ago
           | 
        Attachment #8568987 -
        Attachment is obsolete: true
|   | Assignee | |
| Comment 12•10 years ago
           | ||
Jan, given that you added threadsafeReadonlyLookup() in bug 844983, you should read through this bug and understand it, as penance :)
Flags: needinfo?(jdemooij)
|   | ||
| Comment 13•10 years ago
           | ||
Comment on attachment 8569355 [details] [diff] [review]
Don't write collision bits in HashTable unnecessarily
Review of attachment 8569355 [details] [diff] [review]:
-----------------------------------------------------------------
Makes sense.  With inlining (which I think we usually have with this function) this should be free.
        Attachment #8569355 -
        Flags: review?(luke) → review+
|   | Assignee | |
| Comment 14•10 years ago
           | ||
This clarifies the two phases -- (a) initialization and (b) read-only use
-- that |permanentAtoms| goes through. It also gives some type-based protection
against potential misuse.
        Attachment #8569465 -
        Flags: review?(bhackett1024)
|   | Assignee | |
| Comment 15•10 years ago
           | ||
I tweaked a comment on mccr8's suggestion.
        Attachment #8569475 -
        Flags: review?(bhackett1024)
|   | Assignee | |
| Updated•10 years ago
           | 
        Attachment #8569465 -
        Attachment is obsolete: true
        Attachment #8569465 -
        Flags: review?(bhackett1024)
| Updated•10 years ago
           | 
        Attachment #8569475 -
        Flags: review?(bhackett1024) → review+
|   | Assignee | |
| Comment 16•10 years ago
           | ||
Thank you both for the fast reviews.
https://hg.mozilla.org/integration/mozilla-inbound/rev/73fa2443adbd
https://hg.mozilla.org/integration/mozilla-inbound/rev/1e45aafbc1a5
https://hg.mozilla.org/mozilla-central/rev/73fa2443adbd
https://hg.mozilla.org/mozilla-central/rev/1e45aafbc1a5
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
          status-firefox39:
          --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla39
| Comment 18•10 years ago
           | ||
(In reply to Nicholas Nethercote [:njn] (on vacation until April 30th) from comment #12)
> Jan, given that you added threadsafeReadonlyLookup() in bug 844983, you
> should read through this bug and understand it, as penance :)
Done. Thanks for fixing!
Flags: needinfo?(jdemooij)
          You need to log in
          before you can comment on or make changes to this bug.
        
Description
•