Closed Bug 1321316 Opened 9 years ago Closed 9 years ago

Backfill: Reprocess 1 month of main_summary to add engagement scalars

Categories

(Cloud Services Graveyard :: Metrics: Pipeline, defect, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mreid, Assigned: mreid)

Target time period: 20161029 to 20161129
Assignee: nobody → mreid
Blocks: 1255755
Points: --- → 2
Priority: -- → P1
Can we make this 20161001 to 20161129? (The last 15 days are always wonky.)
I'm just about done with 20161029 to 20161129 (partway through swapping in the updated files for 20161128). Please take a look at that period first; if it looks like we'll need more data, I will backfill further. Note that if you are loading this data in Spark, you may need to set the 'mergeSchema' flag, per: https://spark.apache.org/docs/latest/api/python/pyspark.sql.html#pyspark.sql.DataFrameReader.parquet I will update again when this period is completely finished, which should be within the hour.
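For anyone loading the backfilled period, here is a rough sketch of a read with schema merging turned on. It assumes the dataset is laid out as one directory per day under a common base path. The `base_path` and `partition_column` arguments and both helper names are hypothetical placeholders, not the actual main_summary location or layout. `mergeSchema` and `basePath` are standard Spark Parquet reader options.

```python
from datetime import datetime, timedelta


def submission_dates(start, end):
    """Return every day from start to end (inclusive) as YYYYMMDD strings."""
    first = datetime.strptime(start, "%Y%m%d").date()
    last = datetime.strptime(end, "%Y%m%d").date()
    span = (last - first).days
    return [(first + timedelta(days=i)).strftime("%Y%m%d") for i in range(span + 1)]


def load_backfilled(spark, base_path, partition_column, start, end):
    """Read the given days with schema merging enabled.

    Assumption: data lives at <base_path>/<partition_column>=<YYYYMMDD>/.
    Both values are placeholders to be filled in by the caller.
    """
    root = base_path.rstrip("/")
    paths = [
        "{}/{}={}".format(root, partition_column, day)
        for day in submission_dates(start, end)
    ]
    return (
        spark.read
        # Reconcile files written before and after the new columns were added.
        .option("mergeSchema", "true")
        # Keep the partition column in the result when reading sub-directories.
        .option("basePath", root)
        .parquet(*paths)
    )
```

Calling `load_backfilled(spark, base_path, partition_column, "20161029", "20161129")` with an existing `SparkSession` would cover the period targeted in this bug. Without `mergeSchema`, Spark may infer the schema from only some of the files, so the newly added engagement columns may be missing from the result.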
Ok, import of the original period is done. Saptarshi, please take a look and let me know if you'll need more backfill.
Flags: needinfo?(sguha)
Calling this done for now. Please re-open if we need more backfill.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Product: Cloud Services → Cloud Services Graveyard
Flags: needinfo?(sguha)