Closed Bug 1270183 Opened 9 years ago Closed 8 years ago

Add "Client Timestamp" and "Clock Skew" to main_summary

Categories

(Data Platform and Tools :: General, defect, P1)

Points:
1

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mreid, Assigned: mreid)

Attachments

(1 file)

Since bug 1144778 has landed, we receive a new field for Telemetry submissions for the current time on the client. It appears in a header field called "Date", which is in turn stored in a Heka message field. The date is in RFC 1123 format, for example:

Wed, 04 May 2016 16:42:00 GMT

We should do some preprocessing[1] on this field and store two more Heka fields:

Client Timestamp: The "nanos since epoch" version of the long-form date, to make date math simpler (and to make this date directly comparable to the server-assigned Timestamp for the message).

Clock Skew: The difference between the server-assigned Timestamp and the Client Timestamp above. For convenience, we may want to consider storing it in some coarser form than nanos, such as seconds.

Note that I expect the Clock Skew to be nonzero in pretty much every case, and that the interesting part will be very large (or negative) values.

[1] https://github.com/mozilla-services/data-pipeline/blob/master/heka/sandbox/decoders/extract_telemetry_dimensions.lua
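To make the two proposed fields concrete, here is a minimal Python sketch of the conversion. It is an illustration only, not the decoder code from the linked Lua file. The function names are hypothetical, and the sign convention (server minus client, so a client clock running ahead gives a negative skew) is an assumption, since the description only says "the difference between" the two timestamps.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

NANOS_PER_SECOND = 1_000_000_000
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def client_timestamp_nanos(date_header: str) -> int:
    """Convert an RFC 1123 date string to nanoseconds since the Unix epoch."""
    dt = parsedate_to_datetime(date_header)
    if dt.tzinfo is None:
        # Assumption: a date without a usable zone is treated as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    delta = dt - EPOCH
    # Integer arithmetic avoids float rounding at nanosecond magnitudes.
    whole_seconds = delta.days * 86400 + delta.seconds
    return whole_seconds * NANOS_PER_SECOND + delta.microseconds * 1000


def clock_skew_seconds(server_timestamp_nanos: int, date_header: str) -> float:
    """Server-assigned Timestamp minus Client Timestamp, in seconds.

    Assumed sign convention: negative means the client clock is ahead.
    """
    skew_nanos = server_timestamp_nanos - client_timestamp_nanos(date_header)
    return skew_nanos / NANOS_PER_SECOND


if __name__ == "__main__":
    header = "Wed, 04 May 2016 16:42:00 GMT"
    client_ns = client_timestamp_nanos(header)
    # Pretend the server stamped the message five seconds later.
    server_ns = client_ns + 5 * NANOS_PER_SECOND
    print(client_ns, clock_skew_seconds(server_ns, header))
```

Since the Date header only has one-second resolution, storing the skew in seconds loses nothing relative to the client side of the calculation.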
Points: --- → 1
Priority: -- → P3
Priority: P3 → P2
Component: Metrics: Pipeline → Pipeline Ingestion
Product: Cloud Services → Data Platform and Tools
Blocks: 1357749
Assignee: nobody → mtrinkala
Since this field is derived from existing fields in the message, I recommend that we don't bloat the incoming message and just calculate it when necessary. Reassigning this to Frank to add the derived field to the main summary data set.
Assignee: mtrinkala → fbertsch
Summary: Add new calculated fields for Telemetry data: "Client Timestamp" and "Clock Skew" → Add "Client Timestamp" and "Clock Skew" to main_summary
While you're in the neighbourhood, could you calculate some latency for me? Per [1] there are a few delays we can calculate at ingestion: reporting delay and submission delay.

Reporting delay is just subsessionLength, so no calculation needed.

Submission delay is essentially how long from when the ping is finalized until we receive it and start operating on it: Timestamp - creationTimestamp. Unfortunately Timestamp is a server clock and creationTimestamp is a client clock, so it needs to be adjusted for clock skew.

To add to the bikeshed, this could be called: submission_delay, client_delay, client_latency, or Fred.

[1]: http://reports.telemetry.mozilla.org/post/projects/ping_delays.kp
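One way the skew adjustment could look, as a hedged Python sketch rather than the code that actually landed. It assumes all three inputs are already in nanoseconds since the epoch, that the skew is defined as server Timestamp minus Client Timestamp, and that the client's Date header was set at send time so network transit is negligible next to the skew. The function and parameter names are hypothetical.

```python
NANOS_PER_SECOND = 1_000_000_000


def submission_delay_seconds(
    server_timestamp_nanos: int,
    creation_timestamp_nanos: int,
    client_timestamp_nanos: int,
) -> float:
    """Skew-adjusted time from ping creation to receipt, in seconds.

    server_timestamp_nanos: server-assigned Timestamp (server clock).
    creation_timestamp_nanos: creationTimestamp of the ping (client clock).
    client_timestamp_nanos: Client Timestamp from the Date header (client clock).
    """
    # Assumed sign convention: skew = server clock minus client clock.
    skew_nanos = server_timestamp_nanos - client_timestamp_nanos
    # The raw difference mixes two clocks; subtracting the skew moves
    # creationTimestamp onto the server clock before comparing.
    raw_delay_nanos = server_timestamp_nanos - creation_timestamp_nanos
    return (raw_delay_nanos - skew_nanos) / NANOS_PER_SECOND


if __name__ == "__main__":
    server = 1_000_000 * NANOS_PER_SECOND
    # Client clock runs one hour ahead of the server.
    client_at_send = server + 3600 * NANOS_PER_SECOND
    # Ping was finalized ten minutes before it was sent.
    created = client_at_send - 600 * NANOS_PER_SECOND
    print(submission_delay_seconds(server, created, client_at_send))
```

Under these assumptions the server Timestamp cancels out and the result equals Client Timestamp minus creationTimestamp, i.e. both readings come from the same client clock. A negative result would then point to the client clock changing between ping creation and submission.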
Component: Pipeline Ingestion → Datasets: Main Summary
Assignee: fbertsch → mreid
Priority: P2 → P1
Blocks: 1438927
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Component: Datasets: Main Summary → General