Open Bug 1009881 Opened 11 years ago Updated 1 year ago

ThreadSanitizer reports a lock-order inversion in NSS in CERT_NewTempCertificate

Categories

(NSS :: Libraries, defect, P2)

3.15.4
x86_64
Linux

Tracking

(Not tracked)

People

(Reporter: wtc, Unassigned)

References

(Blocks 1 open bug)

Details

This bug was originally reported in Chromium issue 372807: http://code.google.com/p/chromium/issues/detail?id=372807 It was reported against NSS 3.15.4, but the line numbers in the call stacks still match the files in the current NSS trunk (NSS 3.16.1), so we can just debug this using the NSS trunk. Here is an excerpt of the ThreadSanitizer report from Chromium issue 372807: WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock) (pid=15764) Cycle in lock order graph: M9666 (0x7d34001ca268) => M8875 (0x7d2c00006f60) => M9666 Mutex M8875 acquired here while holding mutex M9666 in main thread: #0 pthread_mutex_lock /usr/local/google/work/chromium/src/third_party/llvm/projects/compiler-rt/lib/sanitizer_common/sanitizer_common_interceptors.inc:2627 (net_unittests+0x000000385380) #1 PR_Lock /usr/local/google/ssd/chrome-git/src/out/Release/obj/third_party/instrumented_libraries/tsan-libnspr4.gen/libnspr4/nspr-4.9.5/pr/src/pthreads/ptsynch.c:174 (libnspr4.so+0x000000032bb8) #2 nssCertificateStore_FindTrustForCertificate /usr/local/google/ssd/chrome-git/src/out/Release/obj/third_party/instrumented_libraries/tsan-nss.gen/nss/nss-3.15.4/nss/lib/pki/pkistore.c:636 (libnss3.so+0x0000000cf5c9) #3 nssCryptoContext_FindTrustForCertificate /usr/local/google/ssd/chrome-git/src/out/Release/obj/third_party/instrumented_libraries/tsan-nss.gen/nss/nss-3.15.4/nss/lib/pki/cryptocontext.c:527 (libnss3.so+0x0000000ca2cf) #4 fill_CERTCertificateFields /usr/local/google/ssd/chrome-git/src/out/Release/obj/third_party/instrumented_libraries/tsan-nss.gen/nss/nss-3.15.4/nss/lib/pki/pki3hack.c:781 (libnss3.so+0x0000000d3f92) #5 stan_GetCERTCertificate /usr/local/google/ssd/chrome-git/src/out/Release/obj/third_party/instrumented_libraries/tsan-nss.gen/nss/nss-3.15.4/nss/lib/pki/pki3hack.c:890 (libnss3.so+0x0000000d3f92) #6 STAN_GetCERTCertificate /usr/local/google/ssd/chrome-git/src/out/Release/obj/third_party/instrumented_libraries/tsan-nss.gen/nss/nss-3.15.4/nss/lib/pki/pki3hack.c:922 (libnss3.so+0x0000000d4257) #7 CERT_NewTempCertificate /usr/local/google/ssd/chrome-git/src/out/Release/obj/third_party/instrumented_libraries/tsan-nss.gen/nss/nss-3.15.4/nss/lib/certdb/stanpcertdb.c:409 (libnss3.so+0x0000000c2f01) ... Mutex M9666 previously acquired by the same thread here: #0 pthread_mutex_lock /usr/local/google/work/chromium/src/third_party/llvm/projects/compiler-rt/lib/sanitizer_common/sanitizer_common_interceptors.inc:2627 (net_unittests+0x000000385380) #1 PR_Lock /usr/local/google/ssd/chrome-git/src/out/Release/obj/third_party/instrumented_libraries/tsan-libnspr4.gen/libnspr4/nspr-4.9.5/pr/src/pthreads/ptsynch.c:174 (libnspr4.so+0x000000032bb8) #2 PR_EnterMonitor /usr/local/google/ssd/chrome-git/src/out/Release/obj/third_party/instrumented_libraries/tsan-libnspr4.gen/libnspr4/nspr-4.9.5/pr/src/pthreads/ptsynch.c:529 (libnspr4.so+0x000000033c60) #3 nssPKIObject_Lock /usr/local/google/ssd/chrome-git/src/out/Release/obj/third_party/instrumented_libraries/tsan-nss.gen/nss/nss-3.15.4/nss/lib/pki/pkibase.c:22 (libnss3.so+0x0000000cf979) #4 stan_GetCERTCertificate /usr/local/google/ssd/chrome-git/src/out/Release/obj/third_party/instrumented_libraries/tsan-nss.gen/nss/nss-3.15.4/nss/lib/pki/pki3hack.c:858 (libnss3.so+0x0000000d3ab8) #5 STAN_GetCERTCertificate /usr/local/google/ssd/chrome-git/src/out/Release/obj/third_party/instrumented_libraries/tsan-nss.gen/nss/nss-3.15.4/nss/lib/pki/pki3hack.c:922 (libnss3.so+0x0000000d4257) #6 CERT_NewTempCertificate /usr/local/google/ssd/chrome-git/src/out/Release/obj/third_party/instrumented_libraries/tsan-nss.gen/nss/nss-3.15.4/nss/lib/certdb/stanpcertdb.c:409 (libnss3.so+0x0000000c2f01) ... Mutex M9666 acquired here while holding mutex M8875 in main thread: #0 pthread_mutex_lock /usr/local/google/work/chromium/src/third_party/llvm/projects/compiler-rt/lib/sanitizer_common/sanitizer_common_interceptors.inc:2627 (net_unittests+0x000000385380) #1 PR_Lock /usr/local/google/ssd/chrome-git/src/out/Release/obj/third_party/instrumented_libraries/tsan-libnspr4.gen/libnspr4/nspr-4.9.5/pr/src/pthreads/ptsynch.c:174 (libnspr4.so+0x000000032bb8) #2 PR_EnterMonitor /usr/local/google/ssd/chrome-git/src/out/Release/obj/third_party/instrumented_libraries/tsan-libnspr4.gen/libnspr4/nspr-4.9.5/pr/src/pthreads/ptsynch.c:529 (libnspr4.so+0x000000033c60) #3 nssPKIObject_Lock /usr/local/google/ssd/chrome-git/src/out/Release/obj/third_party/instrumented_libraries/tsan-nss.gen/nss/nss-3.15.4/nss/lib/pki/pkibase.c:22 (libnss3.so+0x0000000cf979) #4 nssCertificate_GetDecoding /usr/local/google/ssd/chrome-git/src/out/Release/obj/third_party/instrumented_libraries/tsan-nss.gen/nss/nss-3.15.4/nss/lib/pki/certificate.c:288 (libnss3.so+0x0000000c8f1c) #5 nssCertificate_SubjectListSort /usr/local/google/ssd/chrome-git/src/out/Release/obj/third_party/instrumented_libraries/tsan-nss.gen/nss/nss-3.15.4/nss/lib/pki/certificate.c:743 (libnss3.so+0x0000000c8f1c) #6 nsslist_add_element /usr/local/google/ssd/chrome-git/src/out/Release/obj/third_party/instrumented_libraries/tsan-nss.gen/nss/nss-3.15.4/nss/lib/base/list.c:192 (libnss3.so+0x0000000df405) #7 nssList_AddUnique /usr/local/google/ssd/chrome-git/src/out/Release/obj/third_party/instrumented_libraries/tsan-nss.gen/nss/nss-3.15.4/nss/lib/base/list.c:239 (libnss3.so+0x0000000df5e7) #8 add_subject_entry /usr/local/google/ssd/chrome-git/src/out/Release/obj/third_party/instrumented_libraries/tsan-nss.gen/nss/nss-3.15.4/nss/lib/pki/pkistore.c:174 (libnss3.so+0x0000000cea93) #9 nssCertificateStore_AddLocked /usr/local/google/ssd/chrome-git/src/out/Release/obj/third_party/instrumented_libraries/tsan-nss.gen/nss/nss-3.15.4/nss/lib/pki/pkistore.c:209 (libnss3.so+0x0000000cea93) #10 nssCertificateStore_FindOrAdd /usr/local/google/ssd/chrome-git/src/out/Release/obj/third_party/instrumented_libraries/tsan-nss.gen/nss/nss-3.15.4/nss/lib/pki/pkistore.c:231 (libnss3.so+0x0000000cea93) #11 NSSCryptoContext_FindOrImportCertificate /usr/local/google/ssd/chrome-git/src/out/Release/obj/third_party/instrumented_libraries/tsan-nss.gen/nss/nss-3.15.4/nss/lib/pki/cryptocontext.c:110 (libnss3.so+0x0000000c9bce) #12 CERT_NewTempCertificate /usr/local/google/ssd/chrome-git/src/out/Release/obj/third_party/instrumented_libraries/tsan-nss.gen/nss/nss-3.15.4/nss/lib/certdb/stanpcertdb.c:441 (libnss3.so+0x0000000c30cd) ... Mutex M8875 previously acquired by the same thread here: #0 pthread_mutex_lock /usr/local/google/work/chromium/src/third_party/llvm/projects/compiler-rt/lib/sanitizer_common/sanitizer_common_interceptors.inc:2627 (net_unittests+0x000000385380) #1 PR_Lock /usr/local/google/ssd/chrome-git/src/out/Release/obj/third_party/instrumented_libraries/tsan-libnspr4.gen/libnspr4/nspr-4.9.5/pr/src/pthreads/ptsynch.c:174 (libnspr4.so+0x000000032bb8) #2 nssCertificateStore_FindOrAdd /usr/local/google/ssd/chrome-git/src/out/Release/obj/third_party/instrumented_libraries/tsan-nss.gen/nss/nss-3.15.4/nss/lib/pki/pkistore.c:227 (libnss3.so+0x0000000ce99a) #3 NSSCryptoContext_FindOrImportCertificate /usr/local/google/ssd/chrome-git/src/out/Release/obj/third_party/instrumented_libraries/tsan-nss.gen/nss/nss-3.15.4/nss/lib/pki/cryptocontext.c:110 (libnss3.so+0x0000000c9bce) #4 CERT_NewTempCertificate /usr/local/google/ssd/chrome-git/src/out/Release/obj/third_party/instrumented_libraries/tsan-nss.gen/nss/nss-3.15.4/nss/lib/certdb/stanpcertdb.c:441 (libnss3.so+0x0000000c30cd) ... SUMMARY: ThreadSanitizer: lock-order-inversion (potential deadlock) /usr/local/google/ssd/chrome-git/src/out/Release/obj/third_party/instrumented_libraries/tsan-libnspr4.gen/libnspr4/nspr-4.9.5/pr/src/pthreads/ptsynch.c:174 PR_Lock
One idea I have is to attack the nssCertificate_GetDecoding function in the third call stack: ========== NSS_IMPLEMENT nssDecodedCert * nssCertificate_GetDecoding ( NSSCertificate *c ) { nssDecodedCert* deco = NULL; if (c->type == NSSCertificateType_PKIX) { (void)STAN_GetCERTCertificate(c); } nssPKIObject_Lock(&c->object); if (!c->decoding) { deco = nssDecodedCert_Create(NULL, &c->encoding, c->type); PORT_Assert(!c->decoding); c->decoding = deco; } else { deco = c->decoding; } nssPKIObject_Unlock(&c->object); return deco; } ========== nssCertificate_GetDecoding needs to lock c->object because we are creating c->decoding lazily. I think if we create c->decoding in nssCertificate_Create, then nssCertificate_GetDecoding can return c->decoding without locking c->object. There are two complications: 1. The STAN_GetCERTCertificate call also locks c->object. So I need to find out whether we can delete the STAN_GetCERTCertificate call if c->decoding is created in nssCertificate_Create. 2. NSS may still lock c->object elsewhere. So I'll need to be able to reproduce the ThreadSanitizer report in order to verify the bug fix.

The bug assignee didn't login in Bugzilla in the last months and this bug has priority 'P2'.
:beurdouche, could you have a look please?
For more information, please visit auto_nag documentation.

Assignee: wtc → nobody
Status: ASSIGNED → NEW
Flags: needinfo?(bbeurdouche)
Severity: normal → S3

We have modified the bot to only consider P1 as high priority, so I'm cancelling the needinfo here.

Flags: needinfo?(bbeurdouche)
Blocks: tsan
You need to log in before you can comment on or make changes to this bug.