Closed
Bug 1198191
Opened 10 years ago
Closed 10 years ago
[e10s] Confirm existing tab-switch telemetry measurement are meaningful
Categories
(Core :: General, defect, P1)
Tracking
()
RESOLVED
FIXED
Tracking | Status | |
---|---|---|
e10s | + | --- |
People
(Reporter: jimm, Assigned: gkrizsanits)
References
(Blocks 1 open bug)
Details
+++ This bug was initially created as a clone of Bug #1198188 +++
Part of the e10s release criteria work.
![]() |
Reporter | |
Updated•10 years ago
|
tracking-e10s:
--- → +
Updated•10 years ago
|
Blocks: e10s-measurement
![]() |
Reporter | |
Updated•10 years ago
|
![]() |
Reporter | |
Comment 1•10 years ago
|
||
Vladan, you had a line item here about this -
https://docs.google.com/document/d/1TyE0BehzYhii3qfmcrfjXlRJL64CcJk0B4Voup4Q0Pg/edit#
Do you have any more specifics?
We have a few tab switch performance probes and the talos tps test. We're currently blocking on regressions in the talos test. From t.m.o -
FX_TAB_CLICK_MS
FX_TAB_SWITCH_SPINNER_VISIBLE_MS
FX_TAB_SWITCH_TOTAL_E10S_MS
FX_TAB_SWITCH_TOTAL_MS
FX_TAB_SWITCH_UPDATE_MS
The two we might be interested in are FX_TAB_SWITCH_TOTAL_*. I did some analysis of these back in June, from what I remember FX_TAB_SWITCH_TOTAL_MS might not be a reliable probe anymore. Maybe we should look into that at least.
Flags: needinfo?(vladan.bugzilla)
Comment 2•10 years ago
|
||
> Do you have any more specifics?
It's basically what you described, I'd like to know which of the measurements you listed are meaningful & which can be used for an e10s vs non-e10s comparison
Flags: needinfo?(vladan.bugzilla)
Comment 3•10 years ago
|
||
And by meaningful, I mean "corresponds to the tab switch time seen by a user"
Updated•10 years ago
|
Assignee: gwright → gkrizsanits
![]() |
Assignee | |
Comment 4•10 years ago
|
||
(In reply to Jim Mathies [:jimm] from comment #1)
> The two we might be interested in are FX_TAB_SWITCH_TOTAL_*. I did some
> analysis of these back in June, from what I remember FX_TAB_SWITCH_TOTAL_MS
> might not be a reliable probe anymore. Maybe we should look into that at
> least.
Why is it not a reliable probe exactly?
Anyway, what worries me a lot here that in non-e10s when content hangs for whatever reason, tab switching is just not possible, while with e10s mode it is possible but it will show the spinner, so we report it as a long tab switch.
In general it's not trivial to justify if the spinner is on that it's a tab switching issue or not. If I get the code right we also count in the page loading time, do we do that for non-e10s too? If so maybe we should look at these numbers separately.
Finally in e10s it's not yet clear to me what do we measure if we use the tabswitcher to load multiple tabs simultaneously...
I think we should start with the simplest case and measure only the simplest scenario, and filter out every other cases.
Updated•10 years ago
|
Blocks: e10s-responsiveness
![]() |
Assignee | |
Comment 5•10 years ago
|
||
One thing that is weird that FX_TAB_SWITCH_TOTAL_MS does not seem to work at all any longer. On telemetry from aurora 46 and above we don't collect data from this. We call window.QueryInterface(Ci.nsIInterfaceRequestor).getInterface(Ci.nsIDOMWindowUtils)
.beginTabSwitch();
in tabbrowser.xml but PostPresent on that LayerManager is not called after. So we never report anything in none10s mode.
![]() |
Assignee | |
Comment 6•10 years ago
|
||
(In reply to Vladan Djeric (:vladan) -- last day at Moz Friday Jan 29 from comment #3)
> And by meaningful, I mean "corresponds to the tab switch time seen by a user"
One thing that is not clear to me if the browser hangs in non-e10s or when the content-process hangs in e10s there will be a fundamental difference between the 2. In non-e10s mode you cannot start a tab switch in this state, but in e10s mode you can, and it will be a long tab switch. How should we handle this difference? This might seem like a longer average tab switch time in e10s mode, yet I would argue that it's still a better user experience. What do you think? How can we do a fair comparison here?
Flags: needinfo?(vladan.bugzilla)
![]() |
Assignee | |
Comment 7•10 years ago
|
||
Jeff, I think it's not fair to compare the tab switching time of the e10s and the non-e10s version because the issue I've described in Comment 6. Especially since we don't even have the data for the e10s version right now (see Comment 5). I would focus on getting a reliable measurement up and running for the e10s case (or make sure that the current one really measures what it should) so we can make sure we don't regress from here. How does that sound to you?
Flags: needinfo?(vladan.bugzilla) → needinfo?(jgriffiths)
![]() |
||
Comment 8•10 years ago
|
||
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #7)
> Jeff, I think it's not fair to compare the tab switching time of the e10s
> and the non-e10s version because the issue I've described in Comment 6.
> Especially since we don't even have the data for the e10s version right now
> (see Comment 5). I would focus on getting a reliable measurement up and
> running for the e10s case (or make sure that the current one really measures
> what it should) so we can make sure we don't regress from here. How does
> that sound to you?
Focusing on something more reliable seems reasonable to me. I'm assuming the following:
1. this new measurement allows us to compare e10s and non-e10s right now
2. this new measurement will allow us to detect regressions going forward for e10s, and ideally for non-e10s for as long as we'd care to continue supporting that
Flags: needinfo?(jgriffiths)
![]() |
Assignee | |
Comment 9•10 years ago
|
||
(In reply to Jeff Griffiths (:canuckistani) (:⚡︎) from comment #8)
> 1. this new measurement allows us to compare e10s and non-e10s right now
Again, that will never be a fair comparison. If JS is busy or for some other reasons the process hangs, in non-e10s mode you cannot start a tab switch, in e10s mode you can, and typically you will, and we will report longer tab switching time because of that since we have to wait for the content process to send us the layers, probably show the spinner too. The e10s UX will be better in this case since at least the browser does not hangs/janks yet the tab switch numbers will show worse e10s numbers. I would not rely on a comparison like that. If there is a noticeable issue with the e10s tab switch we have to work on that but comparing times against non-e10s is not the way to go imo.
Flags: needinfo?(jgriffiths)
![]() |
||
Comment 10•10 years ago
|
||
(In reply to Gabor Krizsanits [:krizsa :gabor] from comment #9)
> (In reply to Jeff Griffiths (:canuckistani) (:⚡︎) from comment #8)
> > 1. this new measurement allows us to compare e10s and non-e10s right now
>
> Again, that will never be a fair comparison. If JS is busy or for some other
> reasons the process hangs, in non-e10s mode you cannot start a tab switch,
> in e10s mode you can, and typically you will, and we will report longer tab
> switching time because of that since we have to wait for the content process
> to send us the layers, probably show the spinner too. The e10s UX will be
> better in this case since at least the browser does not hangs/janks yet the
> tab switch numbers will show worse e10s numbers. I would not rely on a
> comparison like that. If there is a noticeable issue with the e10s tab
> switch we have to work on that but comparing times against non-e10s is not
> the way to go imo.
Ok, thanks for the re-iteration - I defer to your judgement.
Flags: needinfo?(jgriffiths)
I think the Win32 API can tell us when an event was actually dispatched to us. The MSG structure [1] has a time field, which is millisecond resolution.
It might be a bit tricky to actually use this, but I suspect we could make it work. I think it's definitely worth it if it allows us to measure tab switch time meaningfully.
[1] https://msdn.microsoft.com/en-us/library/windows/desktop/ms644958%28v=vs.85%29.aspx
![]() |
Assignee | |
Comment 12•10 years ago
|
||
Jim pointed out that we have _some_ data from the non-e10s builds. Probably not a sample we want to rely on too much because as Mike figured it out these samples must come from users with off main thread composition turned off (so it's not a representative sample). But here are the data we have right now:
Beta 45:
========
- non-e10s
sample: 130.64k
median: 82.5
95th Percentile: 297.03
- e10s
sample: 140.36M
median: 51.85
95th Percentile: 455.65
Aurora 46:
==========
- non-e10s
sample: 55.43k
median: 49.25
95th Percentile: 173.69
-e10s
sample: 112M
median: 54.92
95th Percentile: 383.91
Nightly 47:
===========
- non-e10s
sample: 86.69k
median: 124.17
95th Percentile: 334.43
- e10s
sample: 50.81M
median: 53.08
95th Percentile: 377.71
In general median is lower with e10s but looking at the e10s graph there are more significantly longer switch times. Some of these longer times must come from hangs I talked about earlier but it would be worth compare these e10s samples against the jank samples to see if my theory checks out or if there is some other factor too.
Anyway, once bug 1244684 lands we will have more reliable samples for the non-e10s case.
![]() |
||
Comment 13•10 years ago
|
||
Thanks for the update. I don't like it's useful at this stage to bother considering this data given the obvious bias and your opinion that once bug 1244684 lands we'll have better data.
As an aside: do we know if there is any reason for the seeming slide in performance for non-e10s in Nightly 47? The median is over 2x slower than Aurora 46. Can we even meaningfully compare those two numbers?
![]() |
Assignee | |
Comment 14•10 years ago
|
||
(In reply to Jeff Griffiths (:canuckistani) (:⚡︎) from comment #13)
> Thanks for the update. I don't like it's useful at this stage to bother
> considering this data given the obvious bias and your opinion that once bug
> 1244684 lands we'll have better data.
Yeah I agree, I've checked the new data and it looks pretty good:
non-e10s
========
Sample Count
1.17M
Selected Dates
2016/02/24 to 2016/02/27
5th Percentile
27.57
25th Percentile
51.1
Median
78.69
75th Percentile
128.46
95th Percentile
309.82
e10s
=====
Sample Count
62.51M
Selected Dates
2016/01/25 to 2016/02/27
5th Percentile
14.44
25th Percentile
30.98
Median
53.11
75th Percentile
98
95th Percentile
378.15
I'd say we can close this bug. And maybe we could open a followup bug to investigate the long tail in the e10s case, if it's all related to content process hangs or if there is something else too. But that should have a lot lower prio, since even with the tail e10s seem to improve tab switch times.
An interesting fact is that we're investigating a quite heavy tps regression (Bug 1186585) so these result are better than what I hoped for. The tps test measures a very specific scenario though, so that's not necessarily a contradiction.
>
> As an aside: do we know if there is any reason for the seeming slide in
> performance for non-e10s in Nightly 47? The median is over 2x slower than
> Aurora 46. Can we even meaningfully compare those two numbers?
That was quite surprising for me too. I'm kind of leaning toward ignoring it though, since it can be easily explained by the change in the population that gets off main thread composition. I think we should just uplift bug 1244684 instead and examine those numbers. If we can see the same shift, then we have a problem. The thing is, because this probe was broken, and we didn't get enough data from it, it was filtered out entirely, so I would assume we took any patch that caused regression for quite a long time. Anyway I vouch for the uplift in the other bug and we'll see.
![]() |
Assignee | |
Comment 15•10 years ago
|
||
Filed bug 1252030 and bug 1252031 as followups.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•