Closed Bug 1307087 Opened 9 years ago Closed 9 years ago

ATMO V2: scheduled jobs are failing

Categories

(Cloud Services Graveyard :: Metrics: Pipeline, defect, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mdoglio, Assigned: mdoglio)

References

Details

Attachments

(5 files)

This is what I found in the logs:

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/rq/worker.py", line 588, in perform_job
    rv = job.perform()
  File "/usr/local/lib/python2.7/dist-packages/rq/job.py", line 498, in perform
    self._result = self.func(*self.args, **self.kwargs)
  File "/app/atmo/jobs/jobs.py", line 8, in launch_jobs
    SparkJob.step_all()
  File "/app/atmo/jobs/models.py", line 136, in step_all
    if spark_join.should_run(now):
  File "/app/atmo/jobs/models.py", line 89, in should_run
    active = self.start_date <= at_time <= self.end_date
TypeError: can't compare offset-naive and offset-aware datetimes
Another error similar to the one in comment 0:

Traceback (most recent call last):
  File "/app/.heroku/python/lib/python2.7/site-packages/rq/worker.py", line 588, in perform_job
    rv = job.perform()
  File "/app/.heroku/python/lib/python2.7/site-packages/rq/job.py", line 498, in perform
    self._result = self.func(*self.args, **self.kwargs)
  File "/app/atmo/jobs/jobs.py", line 8, in launch_jobs
    SparkJob.step_all()
  File "/app/atmo/jobs/models.py", line 139, in step_all
    if spark_join.should_run(now):
  File "/app/atmo/jobs/models.py", line 90, in should_run
    active = self.start_date <= at_time
TypeError: can't compare offset-naive and offset-aware datetimes
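The failure mode named in both tracebacks is easy to reproduce in isolation. A minimal sketch using only the standard library — the variable names are illustrative, not ATMO's actual model code:

```python
from datetime import datetime, timezone

# A naive datetime (no tzinfo) cannot be ordered against an aware one;
# the comparison raises the exact TypeError seen in the tracebacks.
naive_start = datetime(2016, 10, 1)          # naive: tzinfo is None
aware_now = datetime.now(timezone.utc)       # aware: tzinfo is UTC

try:
    naive_start <= aware_now
except TypeError as e:
    print(e)  # can't compare offset-naive and offset-aware datetimes

# One common remedy: normalize stored values to an explicit timezone
# before comparing (assuming the stored value really is UTC).
aware_start = naive_start.replace(tzinfo=timezone.utc)
print(aware_start <= aware_now)  # True
```

In a Django app like ATMO, the equivalent normalization is usually done with django.utils.timezone helpers rather than by hand, so that model fields and `now` values are consistently aware.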
I can't reproduce it anymore. I think it was fixed as a side effect of fixing bug 1309227.
No longer blocks: 1248688
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
I am still unable to successfully run a scheduled job. It looks like the jobs are not started on EMR at all.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Severity: normal → major
:rvitillo I dug a bit and found that spark job clusters aren't launched with VisibleToAllUsers=True but False, which is why I wasn't able to see the cluster in the AWS console. Should I add that to the spark jobs?
Flags: needinfo?(rvitillo)
(In reply to Jannis Leidel [:jezdez] from comment #5)
> :rvitillo I dug a bit and found that spark job clusters aren't launched
> with VisibleToAllUsers=True but False, which is why I wasn't able to see
> the cluster in the AWS console. Should I add that to the spark jobs?

Go ahead.
Flags: needinfo?(rvitillo)
This also explains why I wasn't able to reproduce this issue locally where the jobs are spawned with my own credentials. Good spot!
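The VisibleToAllUsers change discussed above can be sketched as follows. This is a hypothetical illustration, not ATMO's actual launch code; every parameter value other than VisibleToAllUsers is a placeholder:

```python
# Hypothetical EMR launch parameters. Without VisibleToAllUsers=True,
# only the IAM identity that launched the cluster can see it in the
# AWS console, which is what hid these scheduled-job clusters.
job_flow_params = {
    "Name": "telemetry-scheduled-job",   # illustrative name
    "ReleaseLabel": "emr-5.0.0",         # illustrative release
    "VisibleToAllUsers": True,           # make the cluster visible to
                                         # all IAM users on the account
    "Instances": {
        "MasterInstanceType": "c3.4xlarge",  # illustrative type
        "InstanceCount": 1,
    },
}

# With boto3 this would be passed to the EMR client, e.g.:
# client = boto3.client("emr")
# response = client.run_job_flow(**job_flow_params)
```

This also explains the reporter's observation below: jobs spawned locally with a developer's own credentials are launched and viewed by the same IAM user, so the flag never mattered there.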
Scheduled jobs appear to be still failing (tested on stage). I scheduled this simple notebook [1], which should not fail, but it still did. As there are also no logs (filed bug 1312749 for that one), I don't know what happened.

[1] https://raw.githubusercontent.com/mozilla/telemetry-airflow/master/examples/spark/example_date.ipynb
I can reproduce this locally.
Closing this, as it was fixed in staging.
Status: REOPENED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Jannis, did this land on prod?
Flags: needinfo?(jezdez)
:marco We just landed it (with an unplanned delay) in the atmo-prod.herokuapp.com environment.
Flags: needinfo?(jezdez)
This is still not working for me on stage. I don't see the jobs being scheduled from the AWS console and the UI doesn't give me any indication that it tried to run my jobs.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Now that bug 1316623 is resolved, we can close this as well.
Status: REOPENED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Product: Cloud Services → Cloud Services Graveyard