Closed Bug 1286825 Opened 9 years ago Closed 9 years ago

Airflow scheduler stopped working silently

Categories

(Cloud Services Graveyard :: Metrics: Pipeline, defect)

defect
Not set
blocker

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: rvitillo, Unassigned)

Attachments

(2 files)

Apparently we are not the only ones experiencing this issue [1]. A workaround is to restart the scheduler "frequently".

[1] https://medium.com/handy-tech/airflow-tips-tricks-and-pitfalls-9ba53fba14eb#.q8na2qbpu "The scheduler should be restarted frequently"
It looks like the way this is typically handled is to set a limit on the number of runs the scheduler will process before stopping, then have some supervisor keep restarting it. I'm currently testing whether Docker / ECS will handle the restarting for us if we just add a "-n 5" argument to the scheduler task. If so, that should be all we need to do to work around the scheduler getting jammed up.
Attachment #8771049 - Flags: review?(rvitillo)
Attachment #8771049 - Flags: review?(rvitillo) → review+
Attachment #8771049 - Flags: review+ → review-
This worked in local testing, but failed on the ECS deployment. Next up: apply the hack mentioned in Comment 1.
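For anyone reading along, here is a rough sketch of what a restart wrapper of this kind could look like. It is not the attached patch: the `respawn` function, its arguments, and the `RESPAWN_DELAY` variable are made up for illustration. The only details taken from this bug are the "-n 5" argument and the log path that shows up in a later comment.

```shell
# Hypothetical helper (not the attached patch): run a command over and over,
# appending a timestamp to a log file every time the command exits.
# Usage: respawn LOG_FILE MAX_RESTARTS COMMAND [ARGS...]
# MAX_RESTARTS=0 means "loop forever", which is the supervisor case.
respawn() {
    local log_file=$1
    local max_restarts=$2
    shift 2
    local count=0
    local rc=0
    while true; do
        "$@"                        # e.g. airflow scheduler -n 5
        rc=$?
        echo "command exited with status $rc, respawning" >&2
        date >> "$log_file"         # one line per restart
        count=$((count + 1))
        if [ "$max_restarts" -gt 0 ] && [ "$count" -ge "$max_restarts" ]; then
            break
        fi
        sleep "${RESPAWN_DELAY:-1}" # assumed pause between runs; tune as needed
    done
}

# Intended use (not executed here):
#   respawn /tmp/airflow_scheduler_errors.txt 0 airflow scheduler -n 5
```

The scheduler exits on its own after the configured number of runs, so the loop only has to start it again and record when that happened. The bounded MAX_RESTARTS mode is there purely so the wrapper can be tried out with a harmless command before pointing it at the scheduler.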
Attachment #8771988 - Flags: review?(rvitillo)
Attachment #8771049 - Flags: review- → review+
Attachment #8771988 - Flags: review?(rvitillo) → review+
Roberto has deployed this change. We should keep an eye on this for a few days and make sure the scheduler no longer stops.
I've confirmed that the scheduler is being restarted as expected in ECS:

$ cat /tmp/airflow_scheduler_errors.txt
...snip...
Tue Jul 19 16:11:09 UTC 2016
Tue Jul 19 16:11:35 UTC 2016
Tue Jul 19 16:12:01 UTC 2016
Tue Jul 19 16:12:27 UTC 2016
Tue Jul 19 16:12:53 UTC 2016
Tue Jul 19 16:13:19 UTC 2016
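If we want to keep an eye on the restart cadence without eyeballing timestamps, something like the following could help. This is only a sketch: `restart_gaps` is a made-up name, and it assumes GNU date (for `-d`) and one `date`-style timestamp per line, as in the file above.

```shell
# Hypothetical helper: print the gap in seconds between consecutive
# timestamps read from stdin. Assumes GNU date; lines that are empty or
# cannot be parsed as dates are skipped.
restart_gaps() {
    local prev=""
    local now=""
    local line=""
    while IFS= read -r line; do
        [ -n "$line" ] || continue
        now=$(date -u -d "$line" +%s 2>/dev/null) || continue
        if [ -n "$prev" ]; then
            echo $((now - prev))
        fi
        prev=$now
    done
}

# Intended use (not executed here):
#   restart_gaps < /tmp/airflow_scheduler_errors.txt
```

The six timestamps quoted above are each 26 seconds apart, so a gap that grows well beyond that would be a hint that the scheduler has hung again rather than exited and been restarted.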
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Product: Cloud Services → Cloud Services Graveyard