Closed Bug 1244247 Opened 9 years ago Closed 6 years ago

TSan: data race dom/base/ScriptSettings.cpp:103 sScriptSettingsTLSInitialized

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla71
Tracking Status
firefox47 --- wontfix
firefox71 --- fixed

People

(Reporter: terrence, Assigned: tromey)

References

(Blocks 1 open bug)

Details

(Whiteboard: dom-triaged)

Attachments

(3 files)

I don't see this race on file anywhere, so filing it here.
Flags: needinfo?(ttromey)
Scripts, please!
Component: JavaScript: GC → DOM
Might be worth moving the script settings stack onto CycleCollectedJSRuntime to avoid having two similar TLS variables.
Flags: needinfo?(bobbyholley)
Blocks: tsan
Sounds fine to me if that's helpful, assuming the lifetimes match up. It does kinda link two already-large systems though, so I'd only be for it if it solved some concrete problem.
Flags: needinfo?(bobbyholley)
I don't really think the systems are disjoint though ... CCJSR and ScriptSettings are both fundamental infrastructure needed to execute JS safely.
Conceptually I agree. But practically they don't really touch each other at the moment. I basically feel the same about it as I do about moving outer windows into docshell - conceptually sane, but more stuff in the kitchen sink. Except that the docshell thing would reduce a lot of confusion, whereas this seems less likely to. Shrug.
Whiteboard: dom-triaged
According to ScriptSettings.h, the init function should only be called once: https://dxr.mozilla.org/mozilla-central/source/dom/base/ScriptSettings.h#65 So if there is a race here it means this contract is being violated. Maybe the comment is just wrong though. My earlier TLS change didn't really preserve what the previous code was doing. I'm going to test this if I can get tsan working locally: diff --git a/dom/base/ScriptSettings.cpp b/dom/base/ScriptSettings.cpp index df81f71..37feac4 100644 --- a/dom/base/ScriptSettings.cpp +++ b/dom/base/ScriptSettings.cpp @@ -95,13 +95,15 @@ UnuseEntryScriptProfiling() void InitScriptSettings() { - bool success = sScriptSettingsTLS.init(); - if (!success) { - MOZ_CRASH(); + if (!sScriptSettingsTLSInitialized) { + bool success = sScriptSettingsTLS.init(); + if (!success) { + MOZ_CRASH(); + } + sScriptSettingsTLSInitialized = true; } sScriptSettingsTLS.set(nullptr); - sScriptSettingsTLSInitialized = true; } void
Flags: needinfo?(ttromey)
Assignee: nobody → ttromey
Status: NEW → ASSIGNED
Comment on attachment 8717478 [details] [diff] [review] avoid thread race in InitScriptSettings This restores the behavior from before the TLS change (bug 757969). There's no race reported by tsan with this patch, in a local build. It also fixes up a seemingly erroneous comment in ScriptSettings.h.
Attachment #8717478 - Flags: review?(bzbarsky)
Comment on attachment 8717478 [details] [diff] [review] avoid thread race in InitScriptSettings Over to the reviewer from bug 757969. That said, I don't understand how adding more unsynchronized (with each other) checks would fix an actual race), but then again I also don't understand what the actual race here is... Putting that sort of details in the commmit message would be a good idea, generally speaking; the current commit message on this patch is not exactly helpful.
Attachment #8717478 - Flags: review?(bzbarsky) → review?(nfroyd)
Comment on attachment 8717478 [details] [diff] [review] avoid thread race in InitScriptSettings Cancelling the review based on bz's comments. I can't even reproduce the original race locally, even when I add a hack to disable the use of __thread. Terrence, could you say what steps you did to get this trace?
Flags: needinfo?(terrence)
Attachment #8717478 - Flags: review?(nfroyd)
(In reply to Tom Tromey :tromey from comment #10) > Comment on attachment 8717478 [details] [diff] [review] > avoid thread race in InitScriptSettings > > Cancelling the review based on bz's comments. > I can't even reproduce the original race locally, > even when I add a hack to disable the use of __thread. > Terrence, could you say what steps you did to get this trace? I built with the following configuration: --disable-debug '--enable-optimize=-O2 -gline-tables-only' --enable-elf-hack --enable-release --enable-update-packaging --enable-update-channel= --with-google-api-keyfile=/builds/gapi.data --disable-warnings-as-errors --without-intl-api --enable-thread-sanitizer --disable-jemalloc --disable-crashreporter --disable-elf-hack --enable-debug-symbols --disable-install-strip Compiler was: clang version 3.8.0 (https://github.com/llvm-mirror/clang.git 11f818d8c60f5dc1bef681640b71c756bc72b168) (https://github.com/llvm-mirror/llvm.git 72901a8afaae6c9f8ea63ba1c9c9d4699c7eec49) Target: x86_64-unknown-linux-gnu Thread model: posix InstalledDir: /mnt/home/terrence/moz/clang/build/bin That said, I found the following afternoon that my harddrive was actively in the process of failing as I built the above compiler and browser. I know that at least one other build I made that day resulted in a binary with intermittent failures. So. I'm going to chalk this up as the feverish delusions of a dying harddrive. Sorry for wasting everyone's time!
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Flags: needinfo?(terrence)
Resolution: --- → INVALID
Component: DOM → DOM: Core & HTML

I just hit exactly the same race when running ./mach test firefox-ui-functional on a local TSan build. The race is on the global mozilla::dom::sScriptSettingsTLSInitialized. I will add a runtime suppression for now.

Status: RESOLVED → REOPENED
Resolution: INVALID → ---

Yeah, so I don't understand how sScriptSettingsTLSInitialized can possibly work. It's a single global, but presumably we need to initialize the TLS thing on very thread that uses ScriptSettings, right?

The comment at https://searchfox.org/mozilla-central/rev/5e830ac8f56fe191cb58a264e01cdbf6b6e847bd/dom/script/ScriptSettings.h#34-35 about only doing it once is just wrong; init and destroy happen on a per-thread basis.

That said, we only have one consumer of ScriptSettingsInitialized(), and I don't understand why it's there. If we're on the main thread and running nontrivial code like MessageChannel::DispatchMessage, I'd really hope things are initialized, yes... Bobby, do you recall why you added that check?

Flags: needinfo?(bholley)

(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #13)

Yeah, so I don't understand how sScriptSettingsTLSInitialized can possibly work. It's a single global, but presumably we need to initialize the TLS thing on very thread that uses ScriptSettings, right?

It was correct when it landed: https://hg.mozilla.org/mozilla-central/rev/96dbd5483c05#l1.22

Looks like it became incorrect in bug 757969. Maybe the thinking there was that the only consumer is effectively main-thread-only, and so it's kinda-sorta-ok, despite the value being read and written on arbitrary threads?

The comment at https://searchfox.org/mozilla-central/rev/5e830ac8f56fe191cb58a264e01cdbf6b6e847bd/dom/script/ScriptSettings.h#34-35 about only doing it once is just wrong; init and destroy happen on a per-thread basis.

Yes, that comment does seem wrong. Would be good to fix it.

That said, we only have one consumer of ScriptSettingsInitialized(), and I don't understand why it's there. If we're on the main thread and running nontrivial code like MessageChannel::DispatchMessage, I'd really hope things are initialized, yes... Bobby, do you recall why you added that check?

I don't recall, but looks like Bill and I added it in bug 1103324 comment 5 (the patch in comment 4 didn't have it). So it was presumably in response to some kind of try failure - probably something very early in startup.

I think the simplest fix for this bug is to eliminate the boolean, and replace the check with CycleCollectedJSContext::Get() != nullptr, which should have identical lifetime characteristics.

I'll attach a patch.

Flags: needinfo?(bholley)
Pushed by bholley@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/daf92f9b4bd8 Remove racey sScriptSettingsTLSInitialized. r=bzbarsky
Status: REOPENED → RESOLVED
Closed: 9 years ago6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla71
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: