Closed
Bug 1279147
Opened 9 years ago
Closed 9 years ago
Move parquet2hive jobs off of the Presto EMR master
Categories
(Cloud Services Graveyard :: Metrics: Pipeline, defect, P2)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: rvitillo, Assigned: robotblake)
References
Details
(Whiteboard: [SvcOps])
I had to restart Presto on the master node because the JVM was using nearly all of the available memory, which was causing other things (like parquet2hive) to fail. Blake, could you please limit [1] the total amount of memory that the Presto processes are allowed to consume?
[1] https://github.com/vitillo/emr-bootstrap-presto/blob/master/ansible/files/telemetry.sh#L57
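For reference, a rough sketch of what capping Presto's memory in the bootstrap script could look like. This is not the actual content of telemetry.sh; the config path and sizes below are illustrative assumptions.

# Hypothetical excerpt of a Presto bootstrap step that caps memory usage.
# Heap size and query memory values are placeholders, not the numbers
# used in telemetry.sh.
PRESTO_ETC=/etc/presto/conf   # assumed config location; EMR layouts vary

# Cap the JVM heap so Presto cannot consume nearly all of the instance's RAM.
cat > "$PRESTO_ETC/jvm.config" <<EOF
-server
-Xmx20G
-XX:+UseG1GC
-XX:+ExitOnOutOfMemoryError
EOF

# Cap how much memory a single query may use per node and in total.
cat >> "$PRESTO_ETC/config.properties" <<EOF
query.max-memory=30GB
query.max-memory-per-node=8GB
EOF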
Reporter
Updated•9 years ago
Flags: needinfo?(bimsland)
Reporter
Updated•9 years ago
Whiteboard: [SvcOps]
Assignee
Comment 1•9 years ago
This brings up a bit of a meta question: we're already running Hive and Presto on these instances, and it seems like a waste to limit their memory all the time just to handle the occasional need to run something like parquet2hive. I'm happy to make the change in this case, but would it make sense to run those jobs somewhere else?
Flags: needinfo?(bimsland) → needinfo?(rvitillo)
Reporter
Comment 2•9 years ago
Blake, we can't run parquet2hive somewhere else until the centralized metastore lands. This would be a temporary fix, and I don't expect it to reduce the amount of memory dedicated to Presto by more than a few GB. What's the ETA for the centralized metastore? Once it lands, how long would it take you to set up a remote metastore update process? Do you have an alternative suggestion for preventing this situation from happening again until then?
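For context, once a centralized metastore exists, pointing the Hive client (and therefore parquet2hive) at it is mostly a matter of setting hive.metastore.uris. A minimal sketch, with placeholder hostname and paths:

# Hypothetical: configure a non-Presto node to talk to a remote (centralized)
# Hive metastore so parquet2hive can update it from anywhere.
# The hostname, port, and config path below are placeholders.
cat > /etc/hive/conf/hive-site.xml <<EOF
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://metastore.example.internal:9083</value>
  </property>
</configuration>
EOF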
Flags: needinfo?(rvitillo) → needinfo?(bimsland)
Updated•9 years ago
Assignee: nobody → bimsland
Priority: -- → P4
Assignee
Comment 3•9 years ago
I've got a tentative way to handle these jobs outside of the Presto machines that I'm currently investigating. Assuming everything looks good, this can land very soon after we verify that the new Presto EMR cluster is working as intended.
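One shape this could take once the jobs move off the master, sketched under the assumption that parquet2hive prints runnable hive commands to stdout (check the tool's usage); the schedule, dataset path, and log path are made up:

# Hypothetical cron entry on a separate utility node: refresh the Hive tables
# nightly instead of running parquet2hive on the Presto EMR master.
# The S3 path, log path, and schedule are placeholders.
0 2 * * * parquet2hive s3://telemetry-parquet/some_dataset | bash >> /var/log/parquet2hive.log 2>&1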
Flags: needinfo?(bimsland)
Assignee
Updated•9 years ago
Points: --- → 1
Priority: P4 → P1
Summary: Presto is eating all the available memory → Move parquet2hive jobs off of the Presto EMR master
Assignee
Updated•9 years ago
Priority: P1 → P2
Comment 4•9 years ago
Isn't this obsolete now that the centralized metastore has landed?
Flags: needinfo?(bimsland)
Assignee
Comment 5•9 years ago
This will be taken care of by https://github.com/mozilla/emr-bootstrap-presto/pull/8
Flags: needinfo?(bimsland)
Assignee
Updated•9 years ago
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Updated•7 years ago
Product: Cloud Services → Cloud Services Graveyard