-
Notifications
You must be signed in to change notification settings - Fork 453
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[GLUTEN-5659][VL] Add more configs for AWS s3 #5660
Conversation
Run Gluten Clickhouse CI |
Run Gluten Clickhouse CI |
Run Gluten Clickhouse CI |
val AWS_S3_RETRY_MODE = | ||
buildConf("spark.gluten.velox.fs.s3a.retry.mode") | ||
.internal() | ||
.doc("Retry mode for AWS s3 connection error.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can add "standard" and "adaptive"
as well for the doc?
@@ -687,6 +699,10 @@ object GlutenConfig { | |||
(SPARK_S3_USE_INSTANCE_CREDENTIALS, "false"), | |||
(SPARK_S3_IAM, ""), | |||
(SPARK_S3_IAM_SESSION_NAME, ""), | |||
(SPARK_S3_RETRY_MAX_ATTEMPTS, "3"), | |||
(SPARK_S3_CONNECTION_MAXIMUM, "96"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
96
/3
can also be the default value for the config and use something like AWS_S3_CONNECT_TIMEOUT.defaultValueString
as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. These keys can also use defaultValueString
but existing code also use the value directly for easy reading unless it's a new config in gluten itself. So let's keep it.
cpp/velox/utils/ConfigExtractor.cc
Outdated
const std::string kVeloxS3RetryModeDefault = "legacy"; | ||
// Connection timeout for AWS s3 | ||
const std::string kVeloxS3ConnectTimeout = "spark.gluten.velox.fs.s3a.connect.timeout"; | ||
const std::string kVeloxS3ConnectTimeoutDefault = "1s"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we align the default config value with the hadoop config. https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/core-default.xml , such as the default value for fs.s3a.connection.timeout
is 200000ms
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
200s
seems quite a large number. @FelixYBW what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use hadoop's default one
Is the Velox PR merged? can this PR be merged? |
Yes, related changes have been merged in velox. facebookincubator/velox@74fe2ba |
Run Gluten Clickhouse CI |
@leesf Thanks for comments. Updated and please help review again. |
@@ -688,6 +700,10 @@ object GlutenConfig { | |||
(SPARK_S3_USE_INSTANCE_CREDENTIALS, "false"), | |||
(SPARK_S3_IAM, ""), | |||
(SPARK_S3_IAM_SESSION_NAME, ""), | |||
(SPARK_S3_RETRY_MAX_ATTEMPTS, "3"), | |||
(SPARK_S3_CONNECTION_MAXIMUM, "96"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fs.s3a.attempts.maximum
and fs.s3a.connection.maximum
can also be aligned with hadoop config, default value is 20 and 15.
.internal() | ||
.doc("Timeout for AWS s3 connection.") | ||
.stringConf | ||
.createWithDefault("1s") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it can be changed to 200s as well.
Run Gluten Clickhouse CI |
cpp/velox/utils/ConfigExtractor.cc
Outdated
@@ -64,6 +71,10 @@ std::shared_ptr<facebook::velox::core::MemConfig> getHiveConfig(std::shared_ptr< | |||
bool useInstanceCredentials = conf->get<bool>("spark.hadoop.fs.s3a.use.instance.credentials", false); | |||
std::string iamRole = conf->get<std::string>("spark.hadoop.fs.s3a.iam.role", ""); | |||
std::string iamRoleSessionName = conf->get<std::string>("spark.hadoop.fs.s3a.iam.role.session.name", ""); | |||
std::string retryMaxAttempts = conf->get<std::string>("spark.hadoop.fs.s3a.retry.limit", "3"); | |||
std::string retryMode = conf->get<std::string>(kVeloxS3RetryMode, kVeloxS3RetryModeDefault); | |||
std::string maxConnections = conf->get<std::string>("spark.hadoop.fs.s3a.connection.maximum", "96"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
3 and 96 should be changed as well.
Run Gluten Clickhouse CI |
Run Gluten Clickhouse CI |
===== Performance report for TPCH SF2000 with Velox backend, for reference only ====
|
Add more configs for AWS s3 spark.gluten.velox.fs.s3a.retry.mode spark.gluten.velox.fs.s3a.connect.timeout spark.hadoop.fs.s3a.retry.limit spark.hadoop.fs.s3a.connection.maximum
What changes were proposed in this pull request?
Add more configs for AWS s3
spark.gluten.velox.fs.s3a.retry.mode
spark.gluten.velox.fs.s3a.connect.timeout
spark.hadoop.fs.s3a.retry.limit
spark.hadoop.fs.s3a.connection.maximum
How was this patch tested?
CI