Closed
Bug 1279147
Opened 9 years ago
Closed 9 years ago
Move parquet2hive jobs off of the Presto EMR master
Categories
(Cloud Services Graveyard :: Metrics: Pipeline, defect, P2)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: rvitillo, Assigned: robotblake)
References
Details
(Whiteboard: [SvcOps])
I had to restart Presto on the master node because the JVM was using nearly all of the available memory, which was causing other things (like parquet2hive) to fail. Blake, could you please limit [1] the total amount of memory that the Presto processes are allowed to consume?
[1] https://github.com/vitillo/emr-bootstrap-presto/blob/master/ansible/files/telemetry.sh#L57
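For reference, a rough sketch of what capping Presto's memory in the bootstrap script could look like. This is not the actual content of telemetry.sh; the config path and sizes below are illustrative assumptions.

# Hypothetical excerpt of a Presto bootstrap step that caps memory usage.
# Heap size and query memory values are placeholders, not the numbers
# used in telemetry.sh.
PRESTO_ETC=/etc/presto/conf   # assumed config location; EMR layouts vary

# Cap the JVM heap so Presto cannot consume nearly all of the instance's RAM.
cat > "$PRESTO_ETC/jvm.config" <<EOF
-server
-Xmx20G
-XX:+UseG1GC
-XX:+ExitOnOutOfMemoryError
EOF

# Cap how much memory a single query may use per node and in total.
cat >> "$PRESTO_ETC/config.properties" <<EOF
query.max-memory=30GB
query.max-memory-per-node=8GB
EOF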
Reporter
Updated•9 years ago
Flags: needinfo?(bimsland)
Reporter
Updated•9 years ago
Whiteboard: [SvcOps]
Assignee
Comment 1•9 years ago
This brings up a bit of a meta question: we're already running Hive and Presto on these instances, and it seems like a waste to limit their memory all the time just to handle the occasional need to run something like parquet2hive. I'm happy to make the change in this case, but would it make sense to run those jobs somewhere else?
Flags: needinfo?(bimsland) → needinfo?(rvitillo)
Reporter
Comment 2•9 years ago
Blake, we can't run parquet2hive somewhere else until the centralized metastore lands. This would be a temporary fix, and I don't expect it to reduce the amount of memory dedicated to Presto by more than a few GB. What's the ETA for the centralized metastore? Once it lands, how long would it take you to set up a remote metastore update process? Do you have an alternative suggestion for preventing this situation from happening again until then?
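For context, once a centralized metastore exists, pointing the Hive client (and therefore parquet2hive) at it is mostly a matter of setting hive.metastore.uris. A minimal sketch, with placeholder hostname and paths:

# Hypothetical: configure a non-Presto node to talk to a remote (centralized)
# Hive metastore so parquet2hive can update it from anywhere.
# The hostname, port, and config path below are placeholders.
cat > /etc/hive/conf/hive-site.xml <<EOF
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://metastore.example.internal:9083</value>
  </property>
</configuration>
EOF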
Flags: needinfo?(rvitillo) → needinfo?(bimsland)
Updated•9 years ago
Assignee: nobody → bimsland
Priority: -- → P4
Assignee
Comment 3•9 years ago
I've got a tentative way to handle these jobs outside of the Presto machines that I'm currently investigating. Assuming everything looks good, this can land very soon after we verify that the new Presto EMR cluster is working as intended.
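One shape this could take once the jobs move off the master, sketched under the assumption that parquet2hive prints runnable hive commands to stdout (check the tool's usage); the schedule, dataset path, and log path are made up:

# Hypothetical cron entry on a separate utility node: refresh the Hive tables
# nightly instead of running parquet2hive on the Presto EMR master.
# The S3 path, log path, and schedule are placeholders.
0 2 * * * parquet2hive s3://telemetry-parquet/some_dataset | bash >> /var/log/parquet2hive.log 2>&1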
Flags: needinfo?(bimsland)
Assignee
Updated•9 years ago
Points: --- → 1
Priority: P4 → P1
Summary: Presto is eating all the available memory → Move parquet2hive jobs off of the Presto EMR master
Assignee
Updated•9 years ago
Priority: P1 → P2
Comment 4•9 years ago
Isn't this obsolete now that the centralized metastore has landed?
Flags: needinfo?(bimsland)
Assignee
Comment 5•9 years ago
This will be taken care of by https://github.com/mozilla/emr-bootstrap-presto/pull/8
Flags: needinfo?(bimsland)
Assignee
Updated•9 years ago
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Updated•7 years ago
Product: Cloud Services → Cloud Services Graveyard