Closed
Bug 1278556
Opened 9 years ago
Closed 9 years ago
Support recording scalars in content processes
Categories
(Toolkit :: Telemetry, defect, P1)
Toolkit
Telemetry
Tracking
()
RESOLVED
FIXED
mozilla53
Tracking | Status | |
---|---|---|
firefox53 | --- | fixed |
People
(Reporter: gfritzsche, Assigned: Dexter)
References
(Depends on 3 open bugs, Blocks 1 open bug)
Details
(Whiteboard: [measurement:client])
Attachments
(6 files, 26 obsolete files)
7.21 KB,
patch
|
Dexter
:
review+
|
Details | Diff | Splinter Review |
9.00 KB,
patch
|
Dexter
:
review+
|
Details | Diff | Splinter Review |
7.87 KB,
patch
|
Dexter
:
review+
|
Details | Diff | Splinter Review |
22.53 KB,
patch
|
Dexter
:
review+
|
Details | Diff | Splinter Review |
82.75 KB,
patch
|
Dexter
:
review+
|
Details | Diff | Splinter Review |
49.48 KB,
patch
|
Dexter
:
review+
|
Details | Diff | Splinter Review |
We will need to support recording scalars from content processes too.
There is an open question on what the best path is to support that:
* allow specifying one process type to record in in Scalars.yaml
(i.e. record in *either* parent *or* content, not both)
* allow recording scalars from both process types, storing them separately
(optionally allowing to restrict this to parent-only recording etc. in Scalars.yaml)
* ...?
Comment 1•9 years ago
|
||
It would be nice to force people to identify from which process they want data collected. On the client and server we could then have some validation and omission of invalid-process data, and multi-process analyses wouldn't need to be clever to avoid the 400 data points present on the "wrong" process.
What do we do if coders place the accumulation in cross-process code? Do we throw out measurements that occur on the wrong process types? Do we assert that it doesn't happen and force people to guard their call sites? Do we store it, and only invalidate it on snapshot/submit?
With histograms, there's minimal extra effort to track hgrams that truly don't appear on a given process (unless they're flag or count, and so have a default value). So long as we don't store or transfer empty scalars (is there an "empty" value for scalars, or just the default 0/false/""/etc.?) so we could just treat scalars the same way.
I'm for requiring process types in the scalar definitions (and, eventually hgram definitions).
![]() |
Reporter | |
Updated•9 years ago
|
![]() |
Reporter | |
Comment 3•9 years ago
|
||
(In reply to Chris H-C :chutten from comment #1)
> It would be nice to force people to identify from which process they want
> data collected.
Agreed. I'm thinking:
* add an optional property "record_in_processes", which takes a string array
* have it default to ["main"]
* also support "content", "gpu", ... entries
* possibly support "all" or "all childs" to not make measurements that run everywhere too complicated
We are landing on common Telemetry API consistency rules per bug 1315906.
From that, recording any measurement in the wrong process type should:
* drop the accumulation/recording
* log a visible non-fatal, non-throwing warning or error
* make sure that warning or error triggers automation failures
Assignee | ||
Updated•9 years ago
|
Priority: P3 → P2
Assignee | ||
Updated•9 years ago
|
Priority: P2 → P1
Assignee | ||
Updated•9 years ago
|
Assignee: nobody → alessio.placitelli
Points: --- → 3
Assignee | ||
Comment 4•9 years ago
|
||
I took some time to wrap my head around how we're accumulating child processes telemetry (histograms!), as I never touched that part before.
Here's what I got so far, for the current histograms workflow:
* Adding to an histograms happens as usual, with Histogram.Add -> internal_Accumulate -> internal_RemoteAccumulate (filters to child processes only)
* Stuff is accumulated in |gAccumulations| in the child process
* |gAccumulations| gets batch sent to the parent process every 2 seconds or if the accumulation gets too big stuff
* Histograms get stored in the parent with the process name as suffix
* TelemetrySession grabs the histograms and fill them into the ping
=== Planned changes ===
Per process recording:
* Extend Scalars.yaml to support "record_in_process", as documented in comment 3.
* Restrict accumulation to the requested process type.
IPC Accumulation Timer:
* Move the IPCTimer handling outside of TelemetryHistogram [1].
* Make the new IPCTimer drive *all* the measurement subsystems (Scalars, Histograms, etc.).
* Whenever any subsystem hits the watermark limit (per measurement type) or the timer expires, batch send everything to parent (for each subsystem).
The rest of the changes will basically follow what was already done by :chutten for the Histograms.
=== Testing and validation ===
As done in bug 1218576, we'd validate and tune the accumulation using the following methods:
* Talos, watching out for any regression.
* The Energia tool (Power Gadget), looking for unexpected spikes in the power consumption.
* xpcshell test coverage (added by the patch)
@Chris, @Georg do the planned change look reasonable to you? Should I expand on the points I mentioned?
@Chris, does the validation plan look acceptable given your experience with the child histograms or can you suggest anything else?
[1] - http://searchfox.org/mozilla-central/rev/4f5f9f3222f35fa4635ea96a0c963c130854ef19/toolkit/components/telemetry/TelemetryHistogram.cpp#2655
[2] - https://github.com/mozilla/energia
Status: NEW → ASSIGNED
Flags: needinfo?(gfritzsche)
Flags: needinfo?(chutten)
Comment 5•9 years ago
|
||
Seems legit to me. As for additional validation, be sure to check anything server-side that is currently using scalars. Should be fine, since unlike histograms which already were being aggregated in the childPayloads section, child-accumulated scalars are new. But you never know.
Flags: needinfo?(chutten)
![]() |
Reporter | |
Comment 6•9 years ago
|
||
This sounds good to me.
I don't think we need to spend much time on Talos or Energia except a sanity regression check. The principle should still be the same as in bug 1218576 and the behavior should be effectively the same after this change.
Histograms & Scalars both have their own locking, so lets make sure there are no deadlock scenarios when they are pushing to the same IPC batching helper.
Flags: needinfo?(gfritzsche)
Assignee | ||
Comment 7•9 years ago
|
||
Assignee | ||
Comment 8•9 years ago
|
||
Assignee | ||
Comment 9•9 years ago
|
||
Attachment #8822355 -
Attachment is obsolete: true
Assignee | ||
Comment 10•9 years ago
|
||
Assignee | ||
Comment 11•9 years ago
|
||
Assignee | ||
Comment 12•9 years ago
|
||
Attachment #8823111 -
Attachment is obsolete: true
Assignee | ||
Comment 13•9 years ago
|
||
Attachment #8823112 -
Attachment is obsolete: true
Assignee | ||
Comment 14•9 years ago
|
||
Assignee | ||
Updated•9 years ago
|
Attachment #8823110 -
Flags: review?(gfritzsche)
Assignee | ||
Comment 15•9 years ago
|
||
Comment on attachment 8823242 [details] [diff] [review]
part 3 - Implement the core scalar recording in content processes
Georg, Chris, would you mind having a look at the overall architecture of this?
This patch implements the core functionalities for recording scalars in child processes. It builds up on part 1 (which adds support for restricting recording to certain processes) and 2 (which refactors the IPC timer out of TelemetryHistogram.cpp).
The workflow is as follows:
1) When changing a scalar in a child process, the changes are stored in gScalarStorageMap/gKeyedScalarStorageMap (the same way as it's done in the parent process).
2) Once the IPC timer fires, |TelemetryScalar::IPCTimerFired| gets called and mirrors the scalars in a |ScalarChildData| TArray (keyed scalars work similarly, but *each key* is stored in a different |KeyedScalarChildData|).
3) The |ScalarChildData|/|KeyedScalarChildData| arrays get serialized in IPC messages using the stuff implemented in TelemetryComms.h
4) The serialized messages are sent to the parent process and |TelemetryScalar::RefreshChildData|/|TelemetryScalar::RefreshChildKeyedData| gets called to update the copy of the data in the parent process. The child process scalar data is stored in |gChildScalarsStorageMap| in the parent process.
No merge takes place, as child processes always sent the most recent data. We need the most recent data to still live in the child processes to support operations like |setMaximum|.
Another strategy for this could be to store the atomic actions, rather than the values, in the child processes (e.g. Increment by 1, setMaximum(200)) and send the ordered sequence of actions to the parent process. But this gets complicated and I can't spot a good reason to do that right now.
5) When clearing the scalars in the parent process, a message is sent to the child processes to clear their storage too (NOT CURRENTLY IMPLEMENTED).
Does this sound reasonable?
Attachment #8823242 -
Flags: feedback?(gfritzsche)
Attachment #8823242 -
Flags: feedback?(chutten)
Assignee | ||
Comment 16•9 years ago
|
||
Comment on attachment 8822356 [details] [diff] [review]
part 2 - Move the IPC Timer logic to TelemetryIPCTimer.cpp
This patch refactors the IPC timer out of TelemetryHistogram.cpp/h
Attachment #8822356 -
Flags: review?(chutten)
Attachment #8822356 -
Flags: feedback?(gfritzsche)
Assignee | ||
Comment 17•9 years ago
|
||
Comment on attachment 8823243 [details] [diff] [review]
part 4 - Modify Telemetry session to supporto child scalars (and add tests)
This patch adds child processes scalars to the main ping and test coverage.
Attachment #8823243 -
Flags: review?(gfritzsche)
Assignee | ||
Comment 18•9 years ago
|
||
Comment on attachment 8823244 [details] [diff] [review]
part 5 - Change about:telemetry to support content scalars
This patch changes about:telemetry to support child processes scalars.
Attachment #8823244 -
Flags: review?(chutten)
Assignee | ||
Comment 19•9 years ago
|
||
Attachment #8823249 -
Flags: review?(gfritzsche)
Assignee | ||
Comment 20•9 years ago
|
||
(In reply to Georg Fritzsche [:gfritzsche] from comment #3)
> We are landing on common Telemetry API consistency rules per bug 1315906.
> From that, recording any measurement in the wrong process type should:
> * drop the accumulation/recording
> * log a visible non-fatal, non-throwing warning or error
> * make sure that warning or error triggers automation failures
This part must be implemented in part 3 (and it's not there right now).
![]() |
Reporter | |
Comment 21•9 years ago
|
||
(In reply to Alessio Placitelli [:Dexter] from comment #15)
> Another strategy for this could be to store the atomic actions, rather than
> the values, in the child processes (e.g. Increment by 1, setMaximum(200))
> and send the ordered sequence of actions to the parent process. But this
> gets complicated and I can't spot a good reason to do that right now.
I don't think storing & sending only the resulting values works generally for multiple processes (i.e. >1 content processes).
I'm not sure why this is complicated, can you expand?
The child process code for histograms does this already.
Assignee | ||
Comment 22•9 years ago
|
||
(In reply to Georg Fritzsche [:gfritzsche] from comment #21)
> (In reply to Alessio Placitelli [:Dexter] from comment #15)
> > Another strategy for this could be to store the atomic actions, rather than
> > the values, in the child processes (e.g. Increment by 1, setMaximum(200))
> > and send the ordered sequence of actions to the parent process. But this
> > gets complicated and I can't spot a good reason to do that right now.
>
> I don't think storing & sending only the resulting values works generally
> for multiple processes (i.e. >1 content processes).
> I'm not sure why this is complicated, can you expand?
> The child process code for histograms does this already.
Yeah, I mistakenly assumed we had one process per type, but [1] clearly states we can have multiple content processes
(and bug 1303113 enables that).
As discussed over IRC, the issue is that Histograms only have to deal with "adding a sample": samples can be added to an histogram at any time and in any order, the resulting histogram won't be affected by this.
On the other hand, scalars support operations such as |setMaximum|: depending on the order they are called, the result might be different.
We agreed to ignore the race and let the scalar user deal with that. I'll add a note in the doc.
[1] - https://wiki.mozilla.org/Security/Sandbox/Process_model#Multiple_Content_Processes
Assignee | ||
Comment 23•9 years ago
|
||
Comment on attachment 8823242 [details] [diff] [review]
part 3 - Implement the core scalar recording in content processes
Cancelling that due to the previous comment.
Attachment #8823242 -
Flags: feedback?(gfritzsche)
Attachment #8823242 -
Flags: feedback?(chutten)
Assignee | ||
Comment 24•9 years ago
|
||
This changes Part 3 as discussed in comment 22: actions are batch sent to the parent process when needed and changes are assembled there, in order.
Attachment #8823242 -
Attachment is obsolete: true
Attachment #8823699 -
Flags: feedback?(gfritzsche)
Attachment #8823699 -
Flags: feedback?(chutten)
Assignee | ||
Comment 25•9 years ago
|
||
Attachment #8823249 -
Attachment is obsolete: true
Attachment #8823249 -
Flags: review?(gfritzsche)
Attachment #8823719 -
Flags: review?(gfritzsche)
![]() |
Reporter | |
Comment 26•9 years ago
|
||
Comment on attachment 8823110 [details] [diff] [review]
part 1 - Add support to "record_in_processes" in Scalars.yaml
Review of attachment 8823110 [details] [diff] [review]:
-----------------------------------------------------------------
::: toolkit/components/telemetry/ScalarInfo.h
@@ +15,5 @@
>
> namespace {
> +
> +enum class RecordedProcessType : uint32_t {
> + all_child = 1, // All the children processes (i.e. content | gpu | ...)
The existing coding style is using CamelCase.
@@ +16,5 @@
> namespace {
> +
> +enum class RecordedProcessType : uint32_t {
> + all_child = 1, // All the children processes (i.e. content | gpu | ...)
> + main = (1 << 1), // Also known as "parent process"
Why not have `main` first, then followed by childs?
@@ +19,5 @@
> + all_child = 1, // All the children processes (i.e. content | gpu | ...)
> + main = (1 << 1), // Also known as "parent process"
> + content = (1 << 2),
> + gpu = (1 << 3),
> + all = (1 << 4) - 1 // All the processes, this is a shortcut to (all_child | main)
Why not just use a clear expression here that has all bits to 1?
::: toolkit/components/telemetry/parse_scalars.py
@@ +16,5 @@
>
> +# This is a list of flags that determine which process the scalar is allowed
> +# to record from.
> +KNOWN_PROCESS_FLAGS = [
> + 'all', 'all_child', 'main', 'content', 'cpu'
Nit: 'all_childs'
@@ +93,5 @@
> OPTIONAL_FIELDS = {
> 'cpp_guard': basestring,
> 'release_channel_collection': basestring,
> + 'keyed': bool,
> + 'record_in_processes': list
Nit: Add trailing comma for less change noise in the future.
@@ +100,5 @@
> # The types for the data within the fields that hold lists.
> LIST_FIELDS_CONTENT = {
> 'bug_numbers': int,
> + 'notification_emails': basestring,
> + 'record_in_processes': basestring
Ditto.
@@ +162,5 @@
> raise ValueError(self._name + ' - invalid cpp_guard: ' + cpp_guard)
>
> + # Validate record_in_processes.
> + record_in_processes = definition.get('record_in_processes', [])
> + for proc in record_in_processes:
Do we protect against this field being set to an empty lists?
@@ +223,5 @@
> return self._definition['notification_emails']
>
> @property
> + def record_in_processes(self):
> + """Get the list of processes to record data in"""
... non-empty list ...
Attachment #8823110 -
Flags: review?(gfritzsche) → feedback+
Assignee | ||
Updated•9 years ago
|
Attachment #8823699 -
Flags: review?(gfritzsche)
Attachment #8823699 -
Flags: review?(chutten)
Attachment #8823699 -
Flags: feedback?(gfritzsche)
Attachment #8823699 -
Flags: feedback?(chutten)
![]() |
Reporter | |
Comment 27•9 years ago
|
||
Comment on attachment 8822356 [details] [diff] [review]
part 2 - Move the IPC Timer logic to TelemetryIPCTimer.cpp
Review of attachment 8822356 [details] [diff] [review]:
-----------------------------------------------------------------
Cheers for breaking the refactoring into a separate patch, that makes reviews much saner.
::: toolkit/components/telemetry/TelemetryIPCTimer.cpp
@@ +35,5 @@
> +static StaticMutex gTelemetryIPCTimerMutex;
> +
> +namespace {
> +
> +void internal_armIPCTimerMainThread()
Based on Nathans feedback we started to move away from the `internal_*` pattern.
Lets drop the prefix and pass locking evidence as the first argument instead (see e.g. TelemetryEvent.cpp).
@@ +90,5 @@
> +{
> + MOZ_ASSERT(NS_IsMainThread());
> +
> + // Notify the collection subsystems.
> + TelemetryHistogram::IPCTimerFired(aTimer, aClosure);
This is not great: Two separate modules are mutually calling into each other, which makes it much harder to reason about.
A cleaner unidirectional flow would be to have TelemetryHistogram & TelemetryScalar exclusively push data into TelemetryIPCTimer.
TelemetryIPCTimer then encapsulates accumulation storage, logic for when to trigger IPC calls, doing IPC calls.
This would probably call for a renaming of this module too.
::: toolkit/components/telemetry/TelemetryIPCTimer.h
@@ +15,5 @@
> +void ArmIPCTimer();
> +void IPCTimerFired(nsITimer* aTimer, void* aClosure);
> +void DeInitializeGlobalState();
> +
> +// TODO Maybe move to TelemetryCommon::?
Left-over or actual TODO?
Attachment #8822356 -
Flags: feedback?(gfritzsche) → feedback-
![]() |
Reporter | |
Comment 28•9 years ago
|
||
Comment on attachment 8823699 [details] [diff] [review]
part 3 - Implement the core scalar recording in content processes
Review of attachment 8823699 [details] [diff] [review]:
-----------------------------------------------------------------
Part of this patch would be affected by how we decide on the feedback in comment 27, so i only took a quick higher-level look for now.
Overall this looks complete, but we should change the scalar child storage to not use the string hacks that we use for histograms.
::: toolkit/components/telemetry/TelemetryComms.h
@@ +37,5 @@
> + eAdd = 2,
> + eSetMaximum = 3
> +};
> +
> +struct ScalarChildData
How about calling this `ScalarAction` and making the above `ScalarActionType`?
`ChildData` seems to imply this is transferring the final state?
@@ +42,5 @@
> +{
> + ScalarID mId;
> + uint32_t mScalarType;
> + AtomicScalarAction mActionType;
> + nsCOMPtr<nsIVariant> mData;
Please check with someone more knowledgeable about our IPC mechanism (maybe Nathan?) if this is the best way to transfer variant data over IPC.
@@ +123,5 @@
> + aMsg->WriteUInt32(static_cast<uint32_t>(aParam.mId));
> + WriteParam(aMsg, aParam.mScalarType);
> + WriteParam(aMsg, static_cast<uint32_t>(aParam.mActionType));
> +
> + switch(aParam.mScalarType) {
This is missing a default branch for unknown types.
We should MOZ_ASSERT() and not write incomplete messages.
Ditto for the keyed version.
@@ +172,5 @@
> + // De-serialize the data based on the scalar type.
> + nsCOMPtr<nsIWritableVariant> outVar(new nsVariant());
> +
> + nsresult rv = NS_OK;
> + switch (aResult->mScalarType)
What about unknown types?
We should at least MOZ_ASSERT()?
Ditto for the keyed version.
::: toolkit/components/telemetry/TelemetryScalar.cpp
@@ +980,5 @@
> +
> + CharPtrEntryType* entry = nullptr;
> + GeckoProcessType process = GetProcessFromName(aName);
> + const char* suffix = SuffixForProcessType(process);
> + if (!suffix) {
With histograms we have an existing storage implementation based on string keys (histogram.h/.cc), so we crudely added keyed & child histogram support using string prefix & suffix.
With scalars we can do this properly without string hackery, using e.g. a mapping structure like:
(ProcessId -> (ScalarId -> Scalar))
::: toolkit/components/telemetry/TelemetryScalar.h
@@ +60,5 @@
>
> size_t GetMapShallowSizesOfExcludingThis(mozilla::MallocSizeOf aMallocSizeOf);
> size_t GetScalarSizesOfIncludingThis(mozilla::MallocSizeOf aMallocSizeOf);
>
> +void RefreshChildData(GeckoProcessType aProcessType,
`Refresh` doesn't seem to imply replaying of actions. `ProcessChildActions()` maybe or `UpdateChildData()`?
Attachment #8823699 -
Flags: review?(gfritzsche) → feedback-
![]() |
Reporter | |
Comment 29•9 years ago
|
||
Comment on attachment 8823243 [details] [diff] [review]
part 4 - Modify Telemetry session to supporto child scalars (and add tests)
Review of attachment 8823243 [details] [diff] [review]:
-----------------------------------------------------------------
::: toolkit/components/telemetry/Scalars.yaml
@@ +296,5 @@
> + kind: uint
> + notification_emails:
> + - telemetry-client-dev@mozilla.com
> + record_in_processes:
> + - 'content'
Also smoke-test 'all' and 'all_childs'?
I assume you are testing one scalar with no `record_in_processes` set to see that the default works correctly?
::: toolkit/components/telemetry/TelemetrySession.jsm
@@ +1005,3 @@
> }
> + // Extract the scalar name and process suffix from the data.
> + const separatorIndex = name.lastIndexOf('#');
I think we should change the serialization to make this cleaner.
With histograms we did some crude changes to support multiple processes, with scalars you have no real historical baggage to worry about.
The `snapshot` functions could:
- return an object of the form `{'main': ..., 'content': ...}`
- or take a "process" argument
::: toolkit/components/telemetry/tests/unit/test_ChildScalars.js
@@ +17,5 @@
> +
> +const UINT_SCALAR = "telemetry.test.unsigned_int_kind";
> +const KEYED_UINT_SCALAR = "telemetry.test.keyed_unsigned_int";
> +const KEYED_BOOL_SCALAR = "telemetry.test.keyed_boolean_kind";
> +const CONTENT_UINT_SCALAR = "telemetry.test.content_only_uint";
`CONTENT_ONLY_UINT_SCALAR`?
@@ +43,5 @@
> + "Scalars must not be recorded in other processes unless allowed.");
> + Assert.ok(UINT_SCALAR in scalars,
> + `${UINT_SCALAR} must be recorded in the parent process.`);
> + Assert.equal(Object.keys(scalars).length, 1,
> + "We should only be recording one plain scalar in the parent process.");
This is bound to fail once more scalars are recorded across the code-base.
You need to change the logic here.
Ditto for the similar keyed & content checks here.
Attachment #8823243 -
Flags: review?(gfritzsche)
![]() |
Reporter | |
Comment 30•9 years ago
|
||
Comment on attachment 8823719 [details] [diff] [review]
part 6 - Update the docs
Review of attachment 8823719 [details] [diff] [review]:
-----------------------------------------------------------------
I'll drop this for now, please reflag when the other parts are settled.
Attachment #8823719 -
Flags: review?(gfritzsche)
Comment 31•9 years ago
|
||
Comment on attachment 8823244 [details] [diff] [review]
part 5 - Change about:telemetry to support content scalars
Review of attachment 8823244 [details] [diff] [review]:
-----------------------------------------------------------------
This reminds me how much I want to rewrite about:telemetry. So much duplicated code.
Comment nits.
::: toolkit/content/aboutTelemetry.js
@@ +2162,5 @@
>
> let hgramsSelect = document.getElementById("histograms-processes");
> let hgramsOption = hgramsSelect.selectedOptions.item(0);
> let hgramsProcess = hgramsOption.getAttribute("value");
> + // "parent" histograms/keyedHistograms are under "parent". Fix that up.
s/are/aren't/ ?
@@ +2198,5 @@
>
> let keyedHgramsSelect = document.getElementById("keyed-histograms-processes");
> let keyedHgramsOption = keyedHgramsSelect.selectedOptions.item(0);
> let keyedHgramsProcess = keyedHgramsOption.getAttribute("value");
> + // "parent" histograms/keyedHistograms are under "parent". Fix that up.
s/are/aren't/ ?
Basically, if they _were_ under "parent" then we wouldn't need this logic. Since they're under the root payload.keyedHistograms, we need this fiddle.
Attachment #8823244 -
Flags: review?(chutten) → review+
Updated•9 years ago
|
Attachment #8822356 -
Flags: review?(chutten)
Comment 32•9 years ago
|
||
Comment on attachment 8823699 [details] [diff] [review]
part 3 - Implement the core scalar recording in content processes
Clearing review flags as it seems as though there are some architectural issues that need addressing first.
Attachment #8823699 -
Flags: review?(chutten)
Assignee | ||
Comment 33•9 years ago
|
||
(In reply to Georg Fritzsche [:gfritzsche] from comment #26)
> @@ +162,5 @@
> > raise ValueError(self._name + ' - invalid cpp_guard: ' + cpp_guard)
> >
> > + # Validate record_in_processes.
> > + record_in_processes = definition.get('record_in_processes', [])
> > + for proc in record_in_processes:
>
> Do we protect against this field being set to an empty lists?
Yes, in |validate_types|
Attachment #8823110 -
Attachment is obsolete: true
Attachment #8824067 -
Flags: review?(gfritzsche)
![]() |
Reporter | |
Comment 34•9 years ago
|
||
Comment on attachment 8824067 [details] [diff] [review]
part 1 - Add support to "record_in_processes" in Scalars.yaml
Review of attachment 8824067 [details] [diff] [review]:
-----------------------------------------------------------------
::: toolkit/components/telemetry/ScalarInfo.h
@@ +15,5 @@
>
> namespace {
> +
> +enum class RecordedProcessType : uint32_t {
> + eMain = 1, // Also known as "parent process"
Currently we use CamelCase without an "e" prefix in telemetry/.
Consistency wins for readability, so let's stick with that existing style.
@@ +19,5 @@
> + eMain = 1, // Also known as "parent process"
> + eAllChilds = (1 << 1), // All the children processes (i.e. content, gpu, ...)
> + eContent = (1 << 2),
> + eGpu = (1 << 3),
> + eAll = 0xFFFFFFFF // All the processes, this is a shortcut to (all_child | main)
Nit: The "shortcut ..." explanation seems redundant?
::: toolkit/components/telemetry/parse_scalars.py
@@ +16,5 @@
>
> +# This is a list of flags that determine which process the scalar is allowed
> +# to record from.
> +KNOWN_PROCESS_FLAGS = {
> + 'all': 'RecordedProcessType::eAll',
Side-note: This looks like something we could factor out and share in bug 1328230 (after this bug landed).
@@ +168,5 @@
> + # Validate record_in_processes.
> + record_in_processes = definition.get('record_in_processes', [])
> + for proc in record_in_processes:
> + if proc not in KNOWN_PROCESS_FLAGS.keys():
> + raise ValueError(self._name + ' - unknown record_in_processes: ' + proc)
Nit: "unknown value in ..."
@@ +226,5 @@
> """Get the list of notification emails"""
> return self._definition['notification_emails']
>
> @property
> + def record_in_processes(self):
Please change this into `record_in_processes_enum()` or so.
Also add a `record_in_processes()` that returns the list of the original values.
This enables better extraction of the information for documentation etc.
Attachment #8824067 -
Flags: review?(gfritzsche) → review+
Assignee | ||
Comment 35•9 years ago
|
||
Attachment #8824067 -
Attachment is obsolete: true
Attachment #8824358 -
Flags: review+
Assignee | ||
Comment 36•9 years ago
|
||
Thanks Georg, Chris. This patch should address your concerns.
I'll wait for feedback on this before reworking part 3 but, once we're settled with this, it shouldn't take much time (tm).
Attachment #8822356 -
Attachment is obsolete: true
Attachment #8824412 -
Flags: review?(gfritzsche)
Attachment #8824412 -
Flags: review?(chutten)
Comment 37•9 years ago
|
||
Comment on attachment 8824412 [details] [diff] [review]
part 2 - Move the IPC Timer logic to TelemetryIPCTimer.cpp
Review of attachment 8824412 [details] [diff] [review]:
-----------------------------------------------------------------
Seems like a good and straightforward refactor (which is how I probably should have done it in the first place :S)
::: toolkit/components/telemetry/TelemetryIPCAccumulator.cpp
@@ +14,5 @@
> +#include "mozilla/StaticPtr.h"
> +#include "mozilla/Unused.h"
> +
> +#include "TelemetryHistogram.h"
> +#include "TelemetryIPCAccumulator.h"
Mozilla coding style says to put the File.h include at the top in its own block of #includes in File.cpp.
@@ +42,5 @@
> +// For batching and sending child process accumulations to the parent
> +StaticAutoPtr<nsTArray<Accumulation>> gHistogramAccumulations;
> +StaticAutoPtr<nsTArray<KeyedAccumulation>> gKeyedHistogramAccumulations;
> +
> +// This is a StaticMutex rather than a plain Mutex (1) so that
The (1) should come after "so that", I think?
::: toolkit/components/telemetry/TelemetryIPCAccumulator.h
@@ +5,5 @@
> +
> +#ifndef TelemetryIPCAccumulator_h__
> +#define TelemetryIPCAccumulator_h__
> +
> +#include "nsIRunnable.h"
Mozilla style prefers forward declaration of classes to includes. (there isn't some dumb C++ reason we need the whole header, is there?)
Attachment #8824412 -
Flags: review?(chutten) → review+
![]() |
Reporter | |
Comment 38•9 years ago
|
||
Comment on attachment 8824412 [details] [diff] [review]
part 2 - Move the IPC Timer logic to TelemetryIPCTimer.cpp
Review of attachment 8824412 [details] [diff] [review]:
-----------------------------------------------------------------
Chris already covered the review, so i only took a quick look at the new parts.
::: toolkit/components/telemetry/Telemetry.cpp
@@ +2163,5 @@
> NS_IF_RELEASE(sTelemetry);
>
> // Lastly, de-initialise the TelemetryHistogram and TelemetryScalar global states,
> // so as to release any heap storage that would otherwise be kept alive by it.
> + TelemetryIPCAccumulator::DeInitializeGlobalState();
Shouldn't we do this last, after the other modules stopped?
Otherwise we might lose some final accumulations on child process shutdown.
::: toolkit/components/telemetry/TelemetryIPCAccumulator.cpp
@@ +139,5 @@
> +TelemetryIPCAccumulator::IPCTimerFired(nsITimer* aTimer, void* aClosure)
> +{
> + MOZ_ASSERT(NS_IsMainThread());
> +
> + // Notify the collection subsystems.
Notify? Is the comment outdated?
Attachment #8824412 -
Flags: review?(gfritzsche) → feedback+
Assignee | ||
Comment 39•9 years ago
|
||
Thanks for the feedback and the review. I got rid of the nsIrunnable include, but due to that had to include two more basic ones for the alreadyAddRef and the nsCString.
Attachment #8824412 -
Attachment is obsolete: true
Attachment #8824921 -
Flags: review+
Assignee | ||
Comment 40•9 years ago
|
||
Forward declared nsCString as well (thanks Georg).
Attachment #8824921 -
Attachment is obsolete: true
Attachment #8824937 -
Flags: review+
Assignee | ||
Comment 41•9 years ago
|
||
Attachment #8823699 -
Attachment is obsolete: true
Assignee | ||
Comment 42•9 years ago
|
||
Comment on attachment 8825090 [details] [diff] [review]
part 3 - Implement the core scalar recording in content processes
@Georg, I think I've addressed all of your previous feedback. Let me know how if that's what you meant.
@Chris, I think this is ready for another look!
@Nathan, are you the right person to ask for IPC related questions? If so, would you double check the changes in TelemetryComms.h? Is this the best way to transfer variant data over IPC?
Attachment #8825090 -
Flags: review?(gfritzsche)
Attachment #8825090 -
Flags: review?(chutten)
Attachment #8825090 -
Flags: feedback?(nfroyd)
Comment 43•9 years ago
|
||
Comment on attachment 8825090 [details] [diff] [review]
part 3 - Implement the core scalar recording in content processes
Review of attachment 8825090 [details] [diff] [review]:
-----------------------------------------------------------------
The organization seems sane. I'm a little worried that the duplication will get out of hand if we add a third kind of information we need to work through the same mechanism.
::: toolkit/components/telemetry/Telemetry.cpp
@@ +2134,5 @@
>
> // First, initialize the TelemetryHistogram and TelemetryScalar global states.
> TelemetryHistogram::InitializeGlobalState(useTelemetry, useTelemetry);
>
> // Only record scalars from the parent process.
Out-of-date comment, I suspect?
::: toolkit/components/telemetry/Telemetry.h
@@ +145,5 @@
> */
> void AccumulateChildKeyed(GeckoProcessType aProcessType, const nsTArray<KeyedAccumulation>& aAccumulations);
>
> /**
> + * Refresh scalars for the given process type with the data coming from child process.
The word "Refresh" and the word "Update" have different connotations. I prefer Update in this case (and on :156, below)
::: toolkit/components/telemetry/TelemetryCommon.h
@@ +82,5 @@
> + * Read the process type from the given measurement name. The process type, if
> + * one exists, is embedded in a suffix.
> + *
> + * @param aString The histogram or scalar name.
> + * @return One of the GeckoProcessType values.
Copy-pasta error
::: toolkit/components/telemetry/TelemetryComms.h
@@ +143,5 @@
> + MOZ_ASSERT(false, "Conversion failed.");
> + return;
> + }
> + WriteParam(aMsg, val.get());*/
> + // TODO: It doesn't seem to like UTF16/nsAString very much
This is quite a big caveat. Do you have a plan to address this?
@@ +299,5 @@
> + } break;
> + case nsITelemetry::SCALAR_STRING:
> + {
> + // Keyed string scalars are not supported.
> + MOZ_ASSERT(false, "Not supported.");
I'd prefer more specific messages if this and its friends are going to stay like this. "Keyed String Scalar unable to be read from child process. Not supported."
::: toolkit/components/telemetry/TelemetryScalar.cpp
@@ +2285,5 @@
> + ScalarBase* scalar = static_cast<ScalarBase*>(childIter.Data());
> + n += scalar->SizeOfIncludingThis(aMallocSizeOf);
> + }
> + }
> + // Also account for scalar data coming from child processes.
Also account for _keyed_ scalar data coming from child processes.
::: toolkit/components/telemetry/TelemetryScalar.h
@@ +66,5 @@
> +
> +void UpdateChildKeyedData(GeckoProcessType aProcessType,
> + const nsTArray<mozilla::Telemetry::KeyedScalarAction>& aScalarActions);
> +
> +void IPCTimerFired(nsITimer* aTimer, void* aClosure);
I'm confused. What is this doing here? An artefact of a pre-IPCAccumulator rev?
Attachment #8825090 -
Flags: review?(chutten)
![]() |
Reporter | |
Comment 44•9 years ago
|
||
(In reply to Chris H-C :chutten from comment #43)
> The organization seems sane. I'm a little worried that the duplication will
> get out of hand if we add a third kind of information we need to work
> through the same mechanism.
We can use Telemetry Events (bug 1313326) as an example to think through this.
Which parts of the code are we talking about though?
The wiring in the processes Child/Parent implementation, routing through to Telemetry?
![]() |
Reporter | |
Comment 45•9 years ago
|
||
Comment on attachment 8825090 [details] [diff] [review]
part 3 - Implement the core scalar recording in content processes
Review of attachment 8825090 [details] [diff] [review]:
-----------------------------------------------------------------
::: toolkit/components/telemetry/TelemetryCommon.cpp
@@ +14,5 @@
>
> #include <cstring>
>
> +#define CONTENT_PROCESS_SUFFIX "#content"
> +#define GPU_PROCESS_SUFFIX "#gpu"
We seem to duplicate some code here on process types:
https://dxr.mozilla.org/mozilla-central/source/xpcom/build/nsXULAppAPI.h#392,406
Good for a follow-up?
::: toolkit/components/telemetry/TelemetryCommon.h
@@ +75,5 @@
> + *
> + * @param aString The histogram or scalar name.
> + * @return One of the GeckoProcessType values.
> + */
> +GeckoProcessType GetProcessFromName(const nsACString& aString);
I think this and the one below are only used for Histograms?
In that case we don't need this to be shared.
::: toolkit/components/telemetry/TelemetryScalar.cpp
@@ +33,5 @@
> using mozilla::Telemetry::Common::CanRecordDataset;
> using mozilla::Telemetry::Common::IsInDataset;
> using mozilla::Telemetry::Common::LogToBrowserConsole;
> +using mozilla::Telemetry::Common::GetProcessFromName;
> +using mozilla::Telemetry::Common::SuffixForProcessType;
I don't think those two are used here anymore?
@@ +771,5 @@
>
> +// This is a (Process Id -> (Scalar ID -> Scalar Object)) map that owns the
> +// scalar instances and takes care of deallocating them when they are removed
> +// from the map.
> +ChildScalarsMapType gChildScalarsStorageMap;
Is there a reason to keep separate storage for child & parent data?
Can you just put the parent data into the same storage?
@@ +919,5 @@
> +
> + switch (thisProcessType)
> + {
> + case GeckoProcessType_Default:
> + return !!(info.record_in_processes & RecordedProcessType::Main);
Can't we just have the `RecordedProcessType` entries of the form `1 << GeckoProcessType_*` instead?
That would simplify this and avoid keeping a different process enumeration logic.
@@ +2012,5 @@
> nsresult
> TelemetryScalar::CreateSnapshots(unsigned int aDataset, bool aClearScalars, JSContext* aCx,
> uint8_t optional_argc, JS::MutableHandle<JS::Value> aResult)
> {
> + MOZ_ASSERT(XRE_IsParentProcess());
On non-debug, return `{}` in child processes?
@@ +2072,5 @@
> + return rv;
> + }
> + // Add the process suffix to the scalar name.
> + nsCString scalarName(info.name());
> + if (const char* suffix = SuffixForProcessType(processStorageType)) {
Why do we need to use a suffix here and then separate the probe data again in TelemetrySession.jsm etc.?
Can we change the snapshotting instead to either do:
scalarSnapshots() -> {"main": ..., "content": ...}
... or:
scalarSnapshots("gpu") -> {...}
@@ +2122,5 @@
> nsresult
> TelemetryScalar::CreateKeyedSnapshots(unsigned int aDataset, bool aClearScalars, JSContext* aCx,
> uint8_t optional_argc, JS::MutableHandle<JS::Value> aResult)
> {
> + MOZ_ASSERT(XRE_IsParentProcess());
Ditto, just return `{}` in childs?
Attachment #8825090 -
Flags: review?(gfritzsche)
Assignee | ||
Comment 46•9 years ago
|
||
Removed the unused IPCTimerFired from TelemetryHistogram.h
Attachment #8824937 -
Attachment is obsolete: true
Attachment #8825342 -
Flags: review+
Assignee | ||
Comment 47•9 years ago
|
||
Updated the RecordInProcesses flags.
Attachment #8824358 -
Attachment is obsolete: true
Attachment #8825428 -
Flags: review+
Assignee | ||
Comment 48•9 years ago
|
||
I've updated the patch with all the suggested changes.
@Chris, I forgot to mention that interesting UTF part in TelemetryComms.h to Nathan. Please don't consider that part, I'll flag you again on that bit once I've got feedback from Nathan.
@Nathan, sorry for flagging again, would you check the changes in TelemetryComms.h? Is this the best way to transfer variant data over IPC? Also, WriterParam doesn't seem to like nsString(s) too much (see the TODO in TelemetryComms.h). Do you know any way to make that happen?
Attachment #8825090 -
Attachment is obsolete: true
Attachment #8825090 -
Flags: feedback?(nfroyd)
Attachment #8825437 -
Flags: review?(gfritzsche)
Attachment #8825437 -
Flags: review?(chutten)
Attachment #8825437 -
Flags: feedback?(nfroyd)
Assignee | ||
Updated•9 years ago
|
Attachment #8825437 -
Attachment description: bug1278556_part3.patch → part 3 - Implement the core scalar recording in content processes
Assignee | ||
Updated•9 years ago
|
Attachment #8825437 -
Flags: review?(gfritzsche) → feedback?(gfritzsche)
Assignee | ||
Comment 49•9 years ago
|
||
Fixed the nits and rebased.
Attachment #8823244 -
Attachment is obsolete: true
Attachment #8825448 -
Flags: review+
![]() |
||
Comment 50•9 years ago
|
||
Comment on attachment 8825437 [details] [diff] [review]
part 3 - Implement the core scalar recording in content processes
Review of attachment 8825437 [details] [diff] [review]:
-----------------------------------------------------------------
This looks reasonable.
::: toolkit/components/telemetry/TelemetryComms.h
@@ +143,5 @@
> + MOZ_ASSERT(false, "Conversion failed.");
> + return;
> + }
> + WriteParam(aMsg, val.get());*/
> + // TODO: It doesn't seem to like UTF16/nsAString very much
This and the case below don't work because there's no template specialization for ParamTraits<nsAutoString> in IPCMessageUtils.h. So you have two options:
1) use nsString in these functions;
2) add the specialization in the same way that there's a specialization for nsAutoCString.
@@ +209,5 @@
> + // Fill the nsIVariant.
> + rv = outVar->SetAsBool(data);
> + } break;
> + default:
> + MOZ_ASSERT(false, "Unknown scalar type.");
I think you want to return false here, so that in release builds, you won't drop through and consider the Read() successful?
@@ +312,5 @@
> + // Fill the nsIVariant.
> + rv = outVar->SetAsBool(data);
> + } break;
> + default:
> + MOZ_ASSERT(false, "Unknown keyed scalar type.");
Same thing applies here.
Attachment #8825437 -
Flags: feedback?(nfroyd) → feedback+
Assignee | ||
Comment 51•9 years ago
|
||
Attachment #8823719 -
Attachment is obsolete: true
Assignee | ||
Comment 52•9 years ago
|
||
This also changes the other tests that were failing due to the different format of |snapshotScalars| introduced in part 3.
Attachment #8823243 -
Attachment is obsolete: true
Attachment #8825466 -
Flags: review?(gfritzsche)
Assignee | ||
Updated•9 years ago
|
Attachment #8825449 -
Flags: review?(gfritzsche)
Assignee | ||
Comment 53•9 years ago
|
||
This fixes the things pointed out by Nathan and uses nsString to enable string scalars in child processes.
Since nsAutoString will probably be more performant, I'm filing bug 1330247 to add the required template specialization and update this code once it lands.
But I'd love not to block on this right now.
Attachment #8825437 -
Attachment is obsolete: true
Attachment #8825437 -
Flags: review?(chutten)
Attachment #8825437 -
Flags: feedback?(gfritzsche)
Attachment #8825702 -
Flags: review?(chutten)
Attachment #8825702 -
Flags: feedback?(gfritzsche)
Assignee | ||
Comment 54•9 years ago
|
||
(In reply to Georg Fritzsche [:gfritzsche] from comment #29)
> Comment on attachment 8823243 [details] [diff] [review]
> part 4 - Modify Telemetry session to supporto child scalars (and add tests)
>
> Review of attachment 8823243 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> ::: toolkit/components/telemetry/Scalars.yaml
> @@ +296,5 @@
> > + kind: uint
> > + notification_emails:
> > + - telemetry-client-dev@mozilla.com
> > + record_in_processes:
> > + - 'content'
>
> Also smoke-test 'all' and 'all_childs'?
> I assume you are testing one scalar with no `record_in_processes` set to see
> that the default works correctly?
Yes, "telemetry.test.unsigned_int_kind".
Comment 55•9 years ago
|
||
Comment on attachment 8825702 [details] [diff] [review]
part 3 - Implement the core scalar recording in content processes
Review of attachment 8825702 [details] [diff] [review]:
-----------------------------------------------------------------
My comments were addressed, and it looks like a decent sort of solution to me.
Attachment #8825702 -
Flags: review?(chutten) → review+
![]() |
Reporter | |
Comment 56•9 years ago
|
||
Comment on attachment 8825702 [details] [diff] [review]
part 3 - Implement the core scalar recording in content processes
Review of attachment 8825702 [details] [diff] [review]:
-----------------------------------------------------------------
Thanks. I left some comments, but i don't think this needs further input from me.
::: toolkit/components/telemetry/TelemetryComms.h
@@ +17,1 @@
> enum ID : uint32_t;
Ok, we really should rename this to HistogramID after this bug?
@@ +32,5 @@
> +// Scalar accumulation types.
> +enum class ScalarID : uint32_t;
> +
> +enum class ScalarActionType : uint32_t {
> + eSet = 1,
Is there a specific reason not to start at 0?
@@ +133,5 @@
> + MOZ_ASSERT(false, "Count Scalar unable to convert variant to bool from child process.");
> + return;
> + }
> + WriteParam(aMsg, val);
> + } break;
This is a very hidden statement.
Just use:
> ...
> break;
> }
Dito the ones below.
@@ +291,5 @@
> + if (!ReadParam(aMsg, aIter, &data)) {
> + return false;
> + }
> + // Fill the nsIVariant.
> + rv = outVar->SetAsUint32(data);
Why not directly return false here & below?
Does going through `rv` add anything?
@@ +296,5 @@
> + } break;
> + case nsITelemetry::SCALAR_STRING:
> + {
> + // Keyed string scalars are not supported.
> + MOZ_ASSERT(false, "Keyed String Scalar unable to be read from child process. Not supported.");
Return false?
::: toolkit/components/telemetry/TelemetryHistogram.cpp
@@ -403,5 @@
> return NS_OK;
> }
>
> -// Read the process type from the given histogram name. The process type, if
> -// one exists, is embedded in a suffix.
Nit: Should this be in this patch?
::: toolkit/components/telemetry/TelemetryScalar.cpp
@@ +15,5 @@
> #include "nsContentUtils.h"
> #include "nsThreadUtils.h"
> +#include "mozilla/dom/ContentChild.h"
> +#include "mozilla/gfx/GPUParent.h"
> +#include "mozilla/gfx/GPUProcessManager.h"
Aren't these 3 unused?
@@ +20,2 @@
> #include "mozilla/StaticMutex.h"
> +#include "mozilla/StaticPtr.h"
Unused?
@@ +27,5 @@
> #include "TelemetryScalarData.h"
>
> using mozilla::StaticMutex;
> using mozilla::StaticMutexAutoLock;
> +using mozilla::StaticAutoPtr;
Unused?
@@ +764,5 @@
> +
> +// The (Process Id -> (Scalar ID -> Scalar Object)) map. This is a nsClassHashtable,
> +// it owns the scalar instances and takes care of deallocating them when they are
> +// removed from the map.
> +ProcessesScalarsMapType gScalarStorageMap;
Not blocking:
Could we just use an array/vector, e.g. `mozilla::Array`?
- This would allow direct access using `storage[processId]`.
- There may be a (minimal?) size trade-off.
- static asserts can ensure that the process type values are contiguous.
@@ +999,5 @@
> +
> + // If we're in the parent process and requesting the storage for a child process,
> + // make sure to look in the right location.
> + if (XRE_IsParentProcess() && aProcessStorage != GeckoProcessType_Default) {
> + storageId = static_cast<uint32_t>(aProcessStorage);
Could you just directly use:
> const uint32_t storageId = static_cast<uint32_t>(aProcessStorage);
Dito for keyed scalars.
@@ +1009,5 @@
> + scalarStorage = new ScalarStorageMapType();
> + gScalarStorageMap.Put(storageId, scalarStorage);
> + }
> +
> + // Check if the scalar is already allocated in the parent or child storage.
Nit: "... in the storage."
@@ +1998,5 @@
> nsresult
> TelemetryScalar::CreateSnapshots(unsigned int aDataset, bool aClearScalars, JSContext* aCx,
> uint8_t optional_argc, JS::MutableHandle<JS::Value> aResult)
> {
> + MOZ_ASSERT(XRE_IsParentProcess());
No message?
@@ +2012,5 @@
> aResult.setObject(*root_obj);
>
> +#ifndef DEBUG
> + // On non-debug, return `{}` in child processes.
> + if (!XRE_IsParentProcess()) {
Why the non-debug check? Just always return this.
@@ +2103,5 @@
> nsresult
> TelemetryScalar::CreateKeyedSnapshots(unsigned int aDataset, bool aClearScalars, JSContext* aCx,
> uint8_t optional_argc, JS::MutableHandle<JS::Value> aResult)
> {
> + MOZ_ASSERT(XRE_IsParentProcess());
No message?
@@ +2116,5 @@
> }
> aResult.setObject(*root_obj);
>
> +#ifndef DEBUG
> + // On non-debug, return `{}` in child processes.
Why non-debug only?
@@ +2219,5 @@
> */
> void
> TelemetryScalar::ClearScalars()
> {
> + MOZ_ASSERT(XRE_IsParentProcess());
No message?
@@ +2265,5 @@
> +void
> +TelemetryScalar::UpdateChildData(GeckoProcessType aProcessType,
> + const nsTArray<mozilla::Telemetry::ScalarAction>& aScalarActions)
> +{
> + MOZ_ASSERT(XRE_IsParentProcess());
No message?
@@ +2313,5 @@
> +void
> +TelemetryScalar::UpdateChildKeyedData(GeckoProcessType aProcessType,
> + const nsTArray<mozilla::Telemetry::KeyedScalarAction>& aScalarActions)
> +{
> + MOZ_ASSERT(XRE_IsParentProcess());
No message?
::: toolkit/components/telemetry/TelemetryScalar.h
@@ +12,5 @@
> // scalar accumulation and storage logic. It should only be used by
> // Telemetry.cpp. These functions should not be used anywhere else.
> // For the public interface to Telemetry functionality, see Telemetry.h.
>
> +class nsITimer;
Unused?
Attachment #8825702 -
Flags: feedback?(gfritzsche) → feedback+
![]() |
Reporter | |
Comment 57•9 years ago
|
||
Comment on attachment 8825449 [details] [diff] [review]
part 6 - Update the docs
Review of attachment 8825449 [details] [diff] [review]:
-----------------------------------------------------------------
::: toolkit/components/telemetry/docs/collection/scalars.rst
@@ -4,5 @@
>
> Historically we started to overload our histogram mechanism to also collect scalar data,
> such as flag values, counts, labels and others.
> The scalar measurement types are the suggested way to collect that kind of scalar data.
> -We currently only support recording of scalars from the parent process.
Can we add a "version history" section at the bottom?
Firefox NN: Initial scalar support.
Firefox MM: Added keyed scalars.
Firefox OO: Added child process scalars.
@@ +105,5 @@
>
> - ``cpp_guard``: A string that gets inserted as an ``#ifdef`` directive around the automatically generated C++ declaration. This is typically used for platform-specific scalars, e.g. ``ANDROID``.
> - ``release_channel_collection``: This can be either ``opt-in`` (default) or ``opt-out``. With the former the scalar is submitted by default on pre-release channels; on the release channel only if the user opted into additional data collection. With the latter the scalar is submitted by default on release and pre-release channels, unless the user opted out.
> - ``keyed``: A boolean that determines whether this is a keyed scalar. It defaults to ``False``.
> +- ``record_in_processes``: A list of processes the scalar is allowed to record in. Currently supported values are ``main`` (the default value), ``content``, ``gpu``, ``all_child`` (record in all the child processes) and ``all`` (record in all the processes).
Nicer to read as a sub list?
@@ +120,5 @@
> Keyed scalars should only be used if the set of keys are not known beforehand. If the keys are from a known set of strings, other options are preferred if suitable, like categorical histograms or splitting measurements up into separate scalars.
>
> +Multiple processes caveats
> +--------------------------
> +When recording data in different processes of the same type (e.g. multiple content processes), the user has to take care of preventing races between the operations on the scalars.
"the user is responsible for"
@@ +121,5 @@
>
> +Multiple processes caveats
> +--------------------------
> +When recording data in different processes of the same type (e.g. multiple content processes), the user has to take care of preventing races between the operations on the scalars.
> +Races can happen because scalar changes are send from each child process to the parent process, and then merged into the final storage location. Since there's no synchronization between the processes, operations like ``setMaximum`` can potentially produce different results.
"... produce different results if sent from more than one child process."?
::: toolkit/components/telemetry/docs/data/main-ping.rst
@@ +123,5 @@
> This section contains histograms and keyed histograms accumulated on content processes. Histograms recorded on a content child process have different character than parent histograms. For instance, ``GC_MS`` will be much different in ``processes.content`` as it has to contend with web content, whereas the instance in ``payload.histograms`` has only to contend with browser JS. Also, some histograms may be absent if never recorded on a content child process (``EVENTLOOP_UI_ACTIVITY`` is parent-process-only).
>
> This format was adopted in Firefox 51 via bug 1218576.
>
> scalars
and keyedScalars?
@@ +125,5 @@
> This format was adopted in Firefox 51 via bug 1218576.
>
> scalars
> ~~~~~~~
> +This section contains the :doc:`../collection/scalars` that are valid for the current platform. Scalars are not created nor submitted if no data was added to them, and are only reported with subsession pings. Their type and format is described by the ``Scalars.yaml`` file. Its most recent version is available `here <https://dxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/Scalars.yaml>`_. The ``info.revision`` field indicates the revision of the file that describes the reported scalars.
Skip the double negations?
"Scalars are only submitted if ..."
"The record scalars are described in the ..."
Also just make the ``Scalars.yaml`` the link?
Attachment #8825449 -
Flags: review?(gfritzsche) → review+
Assignee | ||
Comment 58•9 years ago
|
||
Attachment #8825702 -
Attachment is obsolete: true
Attachment #8826522 -
Flags: review+
Assignee | ||
Comment 59•9 years ago
|
||
Attachment #8825449 -
Attachment is obsolete: true
Attachment #8826523 -
Flags: review+
Assignee | ||
Comment 60•9 years ago
|
||
(In reply to Georg Fritzsche [:gfritzsche] [away Jan 14 - 24] from comment #56)
> Comment on attachment 8825702 [details] [diff] [review]
> part 3 - Implement the core scalar recording in content processes
>
> Review of attachment 8825702 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> Thanks. I left some comments, but i don't think this needs further input
> from me.
>
> ::: toolkit/components/telemetry/TelemetryComms.h
> @@ +17,1 @@
> > enum ID : uint32_t;
>
> Ok, we really should rename this to HistogramID after this bug?
Eheh, yeah, good point. Filed bug 1330907.
> @@ +764,5 @@
> > +
> > +// The (Process Id -> (Scalar ID -> Scalar Object)) map. This is a nsClassHashtable,
> > +// it owns the scalar instances and takes care of deallocating them when they are
> > +// removed from the map.
> > +ProcessesScalarsMapType gScalarStorageMap;
>
> Not blocking:
> Could we just use an array/vector, e.g. `mozilla::Array`?
> - This would allow direct access using `storage[processId]`.
> - There may be a (minimal?) size trade-off.
> - static asserts can ensure that the process type values are contiguous.
I decided to leave things as they are for now: I'm not sure it's worth the time commitment right now. If you strongly feel like this should be change, I can take that as a follow-up.
![]() |
Reporter | |
Comment 61•9 years ago
|
||
Comment on attachment 8825466 [details] [diff] [review]
part 4 - Modify Telemetry session to supporto child scalars (and add tests)
I'm not getting this done before PTO anymore.
Chris, could you look over this?
Attachment #8825466 -
Flags: review?(gfritzsche) → review?(chutten)
Comment 62•9 years ago
|
||
Comment on attachment 8825466 [details] [diff] [review]
part 4 - Modify Telemetry session to supporto child scalars (and add tests)
Review of attachment 8825466 [details] [diff] [review]:
-----------------------------------------------------------------
Commit message says "supporto" instead of "support"
::: toolkit/components/telemetry/tests/unit/test_ChildScalars.js
@@ +22,5 @@
> +const ALL_PROCESSES_UINT_SCALAR = "telemetry.test.all_processes_uint";
> +const ALL_CHILD_PROCESSES_STRING_SCALAR = "telemetry.test.all_child_processes_string";
> +
> +function run_child_test() {
> + // Set some scalar values from the "content" process.
"Attempt to set", perhaps?
@@ +23,5 @@
> +const ALL_CHILD_PROCESSES_STRING_SCALAR = "telemetry.test.all_child_processes_string";
> +
> +function run_child_test() {
> + // Set some scalar values from the "content" process.
> + // The next scalars must NOT be recorded in the content process.
"must NOT"? Isn't it more like "are not allowed to be"?
@@ +132,5 @@
> +add_task(function*() {
> + if (!runningInParent) {
> + TelemetryController.testSetupContent();
> + run_child_test();
> + dump("... done with child test\n");
leftover debug message? (probably copied from one I accidentally left in another test...)
Attachment #8825466 -
Flags: review?(chutten) → review+
Assignee | ||
Comment 63•9 years ago
|
||
Thanks Chris!
Attachment #8825466 -
Attachment is obsolete: true
Attachment #8826835 -
Flags: review+
Assignee | ||
Comment 64•9 years ago
|
||
Assignee | ||
Comment 65•9 years ago
|
||
Update the comment in the IDL file.
Attachment #8826522 -
Attachment is obsolete: true
Attachment #8827107 -
Flags: review+
Assignee | ||
Comment 66•9 years ago
|
||
Updated the rest of the failing tests to account for the changed structure of the results from "snapshotScalars".
Attachment #8826835 -
Attachment is obsolete: true
Attachment #8827108 -
Flags: review+
Assignee | ||
Comment 67•9 years ago
|
||
Assignee | ||
Comment 68•9 years ago
|
||
The previous update seem to have fixed most of the test failures. There's one failure left on Android :(
Assignee | ||
Comment 69•9 years ago
|
||
Rebased
Attachment #8825342 -
Attachment is obsolete: true
Attachment #8827152 -
Flags: review+
Assignee | ||
Comment 70•9 years ago
|
||
Rebased
Attachment #8827107 -
Attachment is obsolete: true
Attachment #8827153 -
Flags: review+
Assignee | ||
Comment 71•9 years ago
|
||
Attachment #8827108 -
Attachment is obsolete: true
Attachment #8827164 -
Flags: review+
Assignee | ||
Comment 72•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/b4faacd5f34f71ad36b4fa3f4b1fb8da7d8d2f14
Bug 1278556 - Enable the support for "record_in_process" in Scalars.yaml. r=gfritzsche
https://hg.mozilla.org/integration/mozilla-inbound/rev/850f95f34e6dcd00611c95bcd398edee45515ec1
Bug 1278556 - Move the IPC timer logic to TelemetryIPCAccumulator.cpp. r=chutten
https://hg.mozilla.org/integration/mozilla-inbound/rev/2ae922a0d331c978a184592fe3e281a24bc641f6
Bug 1278556 - Enable child process scalar recording. r=chutten, f=gfritzsche,froydnj
https://hg.mozilla.org/integration/mozilla-inbound/rev/94b3eae2ea2b1ca56097ef4306f911a5f67e0cfd
Bug 1278556 - Add scalar data from child processes to the main ping. r=chutten
https://hg.mozilla.org/integration/mozilla-inbound/rev/f60de5dd2579154891c5c858a286eddd69371a59
Bug 1278556 - Enable child processes scalars in about:telemetry. r=chutten
https://hg.mozilla.org/integration/mozilla-inbound/rev/da40eaad661d01bcc0622e103f71a8915820acb2
Bug 1278556 - Update the documentation. r=gfritzsche
Assignee | ||
Comment 73•9 years ago
|
||
Ah, forgot to mention that there doesn't seem to be any Talos regression: https://treeherder.mozilla.org/#/jobs?repo=try&revision=0e5436cd79321602b1a70cfc99eb6cf8b8476187
Comment 74•9 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/b4faacd5f34f
https://hg.mozilla.org/mozilla-central/rev/850f95f34e6d
https://hg.mozilla.org/mozilla-central/rev/2ae922a0d331
https://hg.mozilla.org/mozilla-central/rev/94b3eae2ea2b
https://hg.mozilla.org/mozilla-central/rev/f60de5dd2579
https://hg.mozilla.org/mozilla-central/rev/da40eaad661d
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
status-firefox53:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
Comment 75•9 years ago
|
||
on linux32/64 debug builds we see a small increase in the number of constructors:
== Change summary for alert #4814 (as of January 16 2017 15:10 UTC) ==
Regressions:
1% compiler_metrics num_constructors linux32 debug 182 -> 183
1% compiler_metrics num_constructors linux64 debug 182 -> 183
For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=4814
this is more for informational purposes.
Assignee | ||
Comment 76•9 years ago
|
||
Joel, thanks for reporting that!
I'm not familiar with this metric, is this small increase something we need to look at or is there any risk of backout? (I know you said "informational purposes", but I'd love to be safe :D )
Flags: needinfo?(jmaher)
Comment 77•9 years ago
|
||
this is not a metric we track for regressions, more of just information and if we do want to figure out why our number of constructors have increased over a time window we have a list of bugs. No worries, this isn't a perf regression.
Flags: needinfo?(jmaher)
Updated•9 years ago
|
status-firefox50:
affected → ---
You need to log in
before you can comment on or make changes to this bug.
Description
•