Closed
      
        Bug 1257154
      
      
        Opened 9 years ago
          Closed 9 years ago
      
        
    
  
Room context keeps getting updated in e10s mode when you are in a conversation
Categories
(Hello (Loop) :: Client, defect, P1)
        Hello (Loop)
          
        
        
      
        
    
        Client
          
        
        
      
        
    Tracking
(e10s?, firefox47 fixed, firefox48 fixed)
People
(Reporter: standard8, Assigned: standard8)
References
Details
(Whiteboard: [btpp-fix-now])
Attachments
(1 file, 2 obsolete files)
| 10.41 KB,
          patch         | standard8
:
              
              review+ ritu
:
              
              approval-mozilla-aurora+ | Details | Diff | Splinter Review | 
When we're in e10s mode, for some reason the room context keeps getting updated - this is probably some sort of cycle/constant listener. It doesn't happen in a non-e10s window.
STR:
1) Set Firefox to start in E10s mode, set "loop.remote.autostart" to true and restart the browser.
2) Open up the "Browser Content Toolbox"
3) Enter a conversation on the link generator side.
=> The console on the content toolbox settles down and stays quiet.
4) Have a link clicker join the conversation
=> After the initial chatter, the following messages keep being repeated on the content toolbox's console:
[Dispatcher] Dispatching action Object { newRoomDescription: "Planet Mozilla", newRoomThumbnail: "data:image/x-icon;base64,iVBORw0KGg…", newRoomURL: "http://planet.mozilla.org/", roomToken: "pWr2FJ64hu0", name: "updateRoomContext" }dispatcher.js:95:9
[Dispatcher] Dispatching action Object { name: "updateRoomContextDone" }
|   | ||
| Updated•9 years ago
           | 
Rank: 15
Priority: -- → P1
Whiteboard: [btpp-fix-now]
| Updated•9 years ago
           | 
          tracking-e10s:
          --- → +
| Assignee | ||
| Comment 1•9 years ago
           | ||
I think I've found out what's happening here:
- In e10s mode, there's at two message managers. There's the global one for a window, and there's a "group" one for social.
- According to http://mxr.mozilla.org/mozilla-central/source/dom/base/nsIMessageManager.idl?force=1 a message sent to the "social" message manager propagates up to the tree.
- In our code, we're listening for a DOMTitleChanged from gBrowser.
- When a DOMTitleChanged is fired, we get the details of the current browser window.
- Once the details are obtained, the conversation window sets the document title.
- Setting the document title happens on the socialchat browser, which has the "social" group message manager.
- The social api sets the conversation window title correctly, but then:
- DOMTitleChanged propagates up the tree, back to our gBrowser listener....
and the cycle starts again.
| Assignee | ||
| Comment 2•9 years ago
           | ||
Dave Townsend pointed out to me on irc that the add-on compatibility shims might be getting in the way:
https://developer.mozilla.org/en-US/Firefox/Multiprocess_Firefox/Limitations_of_chrome_scripts#Compatibility_shims
and that we should turn on multiprocess compatible so that we're behaving properly:
https://developer.mozilla.org/en-US/Add-ons/Install_Manifests#multiprocessCompatible
This means making the DOMTitleChanged listener happen via a frame script loaded into the content process:
https://developer.mozilla.org/en-US/Firefox/Multiprocess_Firefox/Limitations_of_chrome_scripts#DOM_Events
https://developer.mozilla.org/en-US/Firefox/Multiprocess_Firefox/Message_Manager
I've got something hacked together now that works, I'll tidy it up tomorrow.
Assignee: nobody → standard8
|   | ||
| Comment 3•9 years ago
           | ||
Ooooh, exciting! Looking forward to the patch.
|   | ||
| Comment 4•9 years ago
           | ||
| Assignee | ||
| Comment 5•9 years ago
           | ||
Comment on attachment 8732170 [details] [review]
[loop] Standard8:bug-1257154-title > mozilla:master
I think I need to write a test for this, but the patch is now functional. Mike, can you give me some feedback whilst I'm still writing the test please?
        Attachment #8732170 -
        Flags: feedback?(mdeboer)
|   | ||
| Comment 6•9 years ago
           | ||
Comment on attachment 8732170 [details] [review]
[loop] Standard8:bug-1257154-title > mozilla:master
This could easily become an r+ when all comments are addressed. Good stuff! (I didn't know we had a DOMTitleChanged event listener there...)
        Attachment #8732170 -
        Flags: feedback?(mdeboer) → feedback+
| Assignee | ||
| Comment 7•9 years ago
           | ||
| Assignee | ||
| Comment 8•9 years ago
           | ||
Comment on attachment 8732170 [details] [review]
[loop] Standard8:bug-1257154-title > mozilla:master
Updated for the feedback comments and added a unit test.
        Attachment #8732170 -
        Flags: review?(mdeboer)
|   | ||
| Updated•9 years ago
           | 
        Attachment #8732170 -
        Flags: review?(mdeboer) → review+
| Assignee | ||
| Comment 9•9 years ago
           | ||
Iteration: --- → 48.2 - Apr 4
| Assignee | ||
| Comment 10•9 years ago
           | ||
This has landed in m-c with 1.2.3 (bug 1259245).
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
| Assignee | ||
| Comment 11•9 years ago
           | ||
Per bug 1259245, this was backed out of central due to crashes & leaks:
> Backed out in https://hg.mozilla.org/mozilla-central/rev/d5f3da0cfe7c for
> very frequent 10.10 opt e10s mochitest-5 crashes like
> https://treeherder.mozilla.org/logviewer.html#?job_id=8211122&repo=fx-team
> This also caused leaks in other Linux bc tests and in Windows 7 debug
> M-e10s(dt1):
> https://treeherder.mozilla.org/logviewer.html#?job_id=8210013&repo=fx-team
> This also caused leaks in other Linux bc tests and in Windows 7 debug
> M-e10s(dt1):
> https://treeherder.mozilla.org/logviewer.html#?job_id=8210013&repo=fx-team
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
| Assignee | ||
| Comment 12•9 years ago
           | ||
(In reply to Mark Banner (:standard8) from comment #11)
> > This also caused leaks in other Linux bc tests and in Windows 7 debug
> > M-e10s(dt1):
> > https://treeherder.mozilla.org/logviewer.html#?job_id=8210013&repo=fx-team
> 
> > This also caused leaks in other Linux bc tests and in Windows 7 debug
> > M-e10s(dt1):
> > https://treeherder.mozilla.org/logviewer.html#?job_id=8210013&repo=fx-team
My best guess for these: The frame scripts don't get cleaned up (and can't be, as they don't have that functionality), so this is causing extra leaks in the processes. However, the logs say that the tab process has a threshold of 10000 bytes, but its only leaking 2564 ish bytes.
So I'm confused. I certainly can't see anything we're doing wrong with this patch - we might need to bump the leak threshold, but I can't see where we'd do that.
Justin, any ideas who to ask here?
Flags: needinfo?(dolske)
| Assignee | ||
| Comment 13•9 years ago
           | ||
Updated patch that applies direct to fx-team.
| Assignee | ||
| Comment 14•9 years ago
           | ||
Try push with fresh fx-team baseline:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=4f2ebdabfcea
Try push with only enabling Loop as a multiprocess compatible add-on:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=1a53c51c753b
| Assignee | ||
| Comment 15•9 years ago
           | ||
Interestingly, just adding this to install.rdf is enough to cause the leaks:
<em:multiprocessCompatible>true</em:multiprocessCompatible>
I've now got another build in progress which drops everything in the add-on's startup & shutdown functions, to see if its likely to be something we're doing, or if there's an issue in the platform:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=3b328c195636
Flags: needinfo?(dolske)
|   | ||
| Comment 16•9 years ago
           | ||
(In reply to Mark Banner (:standard8) from comment #15)
> Interestingly, just adding this to install.rdf is enough to cause the leaks:
> 
> <em:multiprocessCompatible>true</em:multiprocessCompatible>
Sure seems like it! Perhaps Dave has a hint as to why this singular manifest setting might cause a system add-on to leak the world?
Status: REOPENED → ASSIGNED
Flags: needinfo?(dtownsend)
| Comment 17•9 years ago
           | ||
(In reply to Mike de Boer [:mikedeboer] from comment #16)
> (In reply to Mark Banner (:standard8) from comment #15)
> > Interestingly, just adding this to install.rdf is enough to cause the leaks:
> > 
> > <em:multiprocessCompatible>true</em:multiprocessCompatible>
> 
> Sure seems like it! Perhaps Dave has a hint as to why this singular manifest
> setting might cause a system add-on to leak the world?
That manifest change switches off the special shims that attempt to make non-e10s safe code work in an e10s world. It's possible that those shims were protecting you from leaks somehow.
Flags: needinfo?(dtownsend)
| Assignee | ||
| Comment 18•9 years ago
           | ||
(In reply to Dave Townsend [:mossop] from comment #17)
> (In reply to Mike de Boer [:mikedeboer] from comment #16)
> > (In reply to Mark Banner (:standard8) from comment #15)
> > > Interestingly, just adding this to install.rdf is enough to cause the leaks:
> > > 
> > > <em:multiprocessCompatible>true</em:multiprocessCompatible>
> > 
> > Sure seems like it! Perhaps Dave has a hint as to why this singular manifest
> > setting might cause a system add-on to leak the world?
> 
> That manifest change switches off the special shims that attempt to make
> non-e10s safe code work in an e10s world. It's possible that those shims
> were protecting you from leaks somehow.
I did a push to try server on Friday that turned off basically everything in Loop's bootstrap.js.
That basically gets us to loading the add-on, and a few overrides/other items mentioned in chrome.manifest.
However try server is still seeing leaks :-(
| Assignee | ||
| Comment 19•9 years ago
           | ||
Unassigning myself, as I feel I don't have the time to work on fixing the other issues this week. I'm trying to find another owner.
Assignee: standard8 → nobody
Status: ASSIGNED → NEW
| Assignee | ||
| Comment 20•9 years ago
           | ||
Bill, Dave said you might be able to help with understanding what the memory leaks are here:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=3b328c195636&selectedJob=18927614
(bc4/bc5 e10s oranges).
Flags: needinfo?(wmccloskey)
| Assignee | ||
| Comment 22•9 years ago
           | ||
Ian Bicking gave me some ideas earlier today. I'm currently trying a try build without the em:multiprocessCompatible flag set. So far it seems to be working:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=f6f39926d9c3
However, that still means there's an issue with dropping those shims. Although I tried killing most of bootstrap before, I didn't kill everything. So there may still be something interacting badly. Its concerning that it is non-obvious if that's the case.
If the try build continues successfully, I'll land the reduced patch in the morning, however, I'll be spinning a separate bug or two for the issues with the multiprocessCompatible flags. So any help in the meantime would be appreciated.
| Assignee | ||
| Comment 23•9 years ago
           | ||
This is the same as the previous patch, but without the multiprocessCompatible flag to remove the shims. Try server said this was good, so I'm going to land it and file a new bug for adding the flag.
        Attachment #8738655 -
        Flags: review+
| Assignee | ||
| Updated•9 years ago
           | 
        Attachment #8737170 -
        Attachment is obsolete: true
| Assignee | ||
| Updated•9 years ago
           | 
        Attachment #8732170 -
        Attachment is obsolete: true
| Assignee | ||
| Updated•9 years ago
           | 
Assignee: nobody → standard8
| Assignee | ||
| Updated•9 years ago
           | 
Iteration: 48.2 - Apr 4 → 48.3 - Apr 18
| Assignee | ||
| Comment 25•9 years ago
           | ||
I've moved the em:multiprocessCompatible issues out to bug 1262560.
Flags: needinfo?(wmccloskey)
|   | ||
| Comment 26•9 years ago
           | ||
| bugherder | ||
Status: NEW → RESOLVED
Closed: 9 years ago → 9 years ago
          status-firefox48:
          --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
| Assignee | ||
| Comment 27•9 years ago
           | ||
Also landed in Loop repo:
https://github.com/mozilla/loop/commit/36aa310233988caf716469f7397c0e7a95cd7d1d
| Assignee | ||
| Comment 28•9 years ago
           | ||
Comment on attachment 8738655 [details] [diff] [review]
Switch to getting DOM title updates via a frame script to work better with e10s.
Approval Request Comment
[Feature/regressing bug #]: Firefox Hello / e10s
[User impact if declined]: Performance is impacted whilst in a conversation in e10s mode (could cause flickering as well). 
[Describe test coverage new/current, TreeHerder]: Landed in m-c, has tests for the new work.
[Risks and why]: Low - moving to existing well-known frame script mechanism.
[String/UUID change made/needed]: None
        Attachment #8738655 -
        Flags: approval-mozilla-aurora?
          status-firefox47:
          --- → affected
Comment on attachment 8738655 [details] [diff] [review]
Switch to getting DOM title updates via a frame script to work better with e10s.
Improves Hello + e10s experience, Aurora47+
        Attachment #8738655 -
        Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
|   | ||
| Comment 30•9 years ago
           | ||
| bugherder uplift | ||
| Assignee | ||
| Comment 31•9 years ago
           | ||
(In reply to Mark Banner (:standard8) from comment #27)
> Also landed in Loop repo:
> 
> https://github.com/mozilla/loop/commit/
> 36aa310233988caf716469f7397c0e7a95cd7d1d
It appears I forgot to add the new file, which we detected on re-export:
https://github.com/mozilla/loop/commit/3365b697a32e5160d9cbf4d3b1694268d64716b6
          You need to log in
          before you can comment on or make changes to this bug.
        
Description
•