Unable to load credentials from any of the providers in the chain file with native s3 support #23545
Here is the stack trace:
And that's the error code received with DEBUG mode enabled:
It's important to note that the failed query succeeds when it is run again shortly after, so the problem is not S3 access itself but rather retries.
The root cause might be the same as in #15267. I don't think we should retry on these kinds of errors, as we can't tell whether they're transient (caused by some network or IO issue) or permanent (invalid credentials). How did you set up S3 authentication? In native FS, it's not possible to use a specific credential provider in the AWS SDK, instead of relying on the default chain of providers, except for using the
I'm running Trino on EC2, and S3 authentication is done through an IAM role whose policy allows read and write access to the S3 bucket.
@electrum can you advise what we can do in this situation?
I found that the issue is not related to reading CSV files while they are being updated (overwritten); it can happen sporadically for any S3 file.
UPDATE: Issue happens only when
setting in
After disabling the Hive security mapping with native S3, the issue doesn't happen. With the Legacy S3 support (
Which credential provider should be used for your setup? Does the problem go away if you set the following?
After setting Also I had to remove the security mapping file to avoid the errors of |
It seems that when running a cluster with native S3 support, some queries become much slower. I managed to reproduce the issue with 2 clusters: A (legacy support) and B (native support).
@guyco33 Can we keep the performance issues in a different issue or on Slack? It looks like we are discussing multiple issues here. This issue can stay focused on how the security mapping file is intermittently causing failures.
@electrum With S3 security mapping enabled, we create a new S3 client for every query. This increases the chance of hitting issues with setting up IAM role authentication. WDYT about adding some LRU client cache per location? I'm not sure how this works in HDFS, I'm looking it up now.
@nineinchnick That sounds like a good idea.
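To make the LRU client cache idea concrete, here is a minimal sketch. It is not Trino's implementation: the class name `LruClientCache`, the `factory` callback, and the choice of the location string as the cache key are all assumptions made for illustration. The point is only that a client is built once per key and reused until it becomes the least recently used entry.

```python
import threading
from collections import OrderedDict
from typing import Callable, Generic, Hashable, TypeVar

K = TypeVar("K", bound=Hashable)
V = TypeVar("V")


class LruClientCache(Generic[K, V]):
    """Hypothetical bounded cache: reuses one client per key, evicts the least recently used."""

    def __init__(self, max_size: int, factory: Callable[[K], V]) -> None:
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self._max_size = max_size
        self._factory = factory          # assumed callback that builds a client for a key
        self._entries: "OrderedDict[K, V]" = OrderedDict()
        self._lock = threading.Lock()    # concurrent queries may ask for clients at once

    def get(self, key: K) -> V:
        with self._lock:
            if key in self._entries:
                # Cache hit: mark the entry as most recently used and reuse the client.
                self._entries.move_to_end(key)
                return self._entries[key]
            # Cache miss: build a new client (the expensive step that sets up credentials).
            client = self._factory(key)
            self._entries[key] = client
            if len(self._entries) > self._max_size:
                # Drop the least recently used entry, which sits at the front.
                self._entries.popitem(last=False)
            return client

    def __len__(self) -> int:
        with self._lock:
            return len(self._entries)


if __name__ == "__main__":
    created = []

    def fake_client_factory(location: str) -> object:
        # Stand-in for building a real S3 client; only records that it was called.
        created.append(location)
        return object()

    cache = LruClientCache(max_size=2, factory=fake_client_factory)
    first = cache.get("s3://bucket-a/")
    second = cache.get("s3://bucket-a/")
    print(first is second, len(created))
```

In a real setup the key would probably need to include whatever the security mapping resolves (for example the IAM role), not just the location, and evicted clients would need to be closed. Both points are left out of the sketch.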
I have a Hive external table that points to an S3 location with some gzipped CSV files.
Every minute, one file in the S3 location is updated (replaced) by an external flow.
In 458, lots of queries on this Hive external table start to fail with HIVE_CANNOT_OPEN_SPLIT.
Setting back fs.hadoop.enabled=true (as of #23343) fixed the issue.
I tried to work around the issue by setting the s3.retry-mode parameter to STANDARD / ADAPTIVE, but with no luck.
I wonder if there is something else that can be done here, as it seems that Legacy S3 support will be removed in the future?
Hive security mapping is enabled with the following hive_security_mapping.json file:
Hive properties for S3 legacy support (no issue):
Hive properties for S3 native support:
Slack discussion here