Closed Bug 1254547 · Opened 10 years ago · Closed 10 years ago
Parquet datasets are no longer accessible from Spark clusters
Categories
(Cloud Services Graveyard :: Metrics: Pipeline, defect, P1)
Tracking
(Not tracked)
Status: RESOLVED FIXED
People
(Reporter: rvitillo, Unassigned)
References
Details
Description
[hadoop@ip-172-31-20-15 ~]$ aws s3 ls s3://telemetry-parquet/longitudinal/
A client error (AccessDenied) occurred when calling the ListObjects operation: Access Denied
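
(For reference, the same denial can be reproduced programmatically from a cluster node. This is a minimal sketch, not part of the original report: it assumes boto3 is available on the node and that credentials come from the node's instance role; the bucket and prefix are taken from the listing above.)

# Reproduces the ListObjects denial from the description, equivalent to
# `aws s3 ls s3://telemetry-parquet/longitudinal/`. Assumes boto3 is
# installed and the node's instance role is the one missing permissions.
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")
try:
    s3.list_objects(Bucket="telemetry-parquet", Prefix="longitudinal/", MaxKeys=10)
    print("ListObjects succeeded; bucket permissions look restored")
except ClientError as e:
    # With the parquet IAM permissions missing, this raises AccessDenied.
    print("ListObjects failed:", e.response["Error"]["Code"])
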
Updated • 10 years ago
Severity: normal → blocker
Flags: needinfo?(whd)
Priority: -- → P1
Comment 1 • 10 years ago
This didn't happen as part of the Spark 1.6 deploy. It happened because, during that deploy, I did a manual diff and noticed that somebody had added the parquet IAM permissions to the spark role by hand. I made a note of this in https://bugzilla.mozilla.org/show_bug.cgi?id=1253392#c1 and was going to file a PR for it today (I still am). As a consequence, I did the CFN portion of the deploy manually to add the permissions for https://bugzilla.mozilla.org/show_bug.cgi?id=1253392 without losing the parquet permissions.
Later, :rvitillo and :mreid attempted to deploy Spark bootstrap updates with ansible for https://github.com/mozilla/emr-bootstrap-spark/pull/15, which, aside from failing due to other IAM permissions issues, wiped out the permissions that were not captured in version control.
I had a copy of the old permissions from when I did the diff, so I've filed a PR to fix this: https://github.com/mozilla/emr-bootstrap-spark/pull/20
I'll close this once I've deployed it.
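
(For context, the lost permissions amount to read access on the telemetry-parquet bucket for the cluster's IAM role. The authoritative fix is the CloudFormation change in the PR above; the sketch below only illustrates the shape of such a policy, and the role name, policy name, and exact actions are assumptions, not the contents of the PR.)

# Illustrative sketch only: restores read access to s3://telemetry-parquet
# for a Spark cluster role via an inline IAM policy. The real fix is the
# CFN change in emr-bootstrap-spark PR #20; names below are hypothetical.
import json
import boto3

iam = boto3.client("iam")

parquet_read_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow",
         "Action": ["s3:ListBucket"],
         "Resource": "arn:aws:s3:::telemetry-parquet"},
        {"Effect": "Allow",
         "Action": ["s3:GetObject"],
         "Resource": "arn:aws:s3:::telemetry-parquet/*"},
    ],
}

iam.put_role_policy(
    RoleName="spark-emr-role",              # hypothetical role name
    PolicyName="telemetry-parquet-read",    # hypothetical policy name
    PolicyDocument=json.dumps(parquet_read_policy),
)
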
Flags: needinfo?(whd)
Updated • 7 years ago
Product: Cloud Services → Cloud Services Graveyard