doc: Add S3 Bucket config, fix Generic config and Spark-specific config (facebookincubator#11418)

Summary:
Fix doc for Generic config and Spark-specific config
https://facebookincubator.github.io/velox/configs.html

Pull Request resolved: facebookincubator#11418

Reviewed By: xiaoxmeng

Differential Revision: D65831333

Pulled By: kevinwilfong

fbshipit-source-id: 10273424f60d86495ef3d5e9965615477e178feb
majetideepak authored and facebook-github-bot committed Nov 13, 2024
1 parent 7437093 commit f6276bb
20 changes: 14 additions & 6 deletions velox/docs/configs.rst
@@ -76,7 +76,7 @@ Generic Configuration
- integer
- 32MB
- Used for backpressure to block local exchange producers when the local exchange buffer reaches or exceeds this size.
* - max_local_exchange_partition_count
* - max_local_exchange_partition_count
- integer
- 2^32
- Limits the number of partitions created by a local exchange. Partitioning data too granularly can lead to poor performance.
@@ -583,7 +583,7 @@ Each query can override the config by setting corresponding query session proper
- Default AWS secret key to use.
* - hive.s3.endpoint
- string
-
- us-east-1
- The S3 storage endpoint server. This can be used to connect to an S3-compatible storage system instead of AWS.
* - hive.s3.path-style-access
- bool
@@ -636,6 +636,14 @@ Each query can override the config by setting corresponding query session proper
Standard mode is built on top of legacy mode and additionally enables throttled retries for throttling errors as well as transient errors (a configuration sketch follows this table).
Adaptive retry mode dynamically limits the rate of AWS requests to maximize success rate.

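As a sketch, a deployment that opts into adaptive retries might set the following (the property values are illustrative, and "hive.s3.max-attempts" is assumed here to be the companion config that caps the attempt count)::

    hive.s3.retry-mode=adaptive
    hive.s3.max-attempts=5
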
Bucket Level Configuration
""""""""""""""""""""""""""
All "hive.s3.*" config (except "hive.s3.log-level") can be set on a per-bucket basis. The bucket-specific option is set by
replacing the "hive.s3." prefix on a config with "hive.s3.bucket.BUCKETNAME.", where BUCKETNAME is the name of the
bucket. e.g. the endpoint for a bucket named "velox" can be specified by the config "hive.s3.bucket.velox.endpoint".
When connecting to a bucket, all options explicitly set will override the base "hive.s3." values.
These semantics are similar to the `Apache Hadoop-Aws module <https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html>`_.
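
For example, traffic for a bucket named "velox" can be routed to its own endpoint while all other buckets keep the base settings (the endpoint URLs below are hypothetical)::

    hive.s3.endpoint=https://objectstore.example.com
    hive.s3.path-style-access=false
    hive.s3.bucket.velox.endpoint=https://velox-store.example.com
    hive.s3.bucket.velox.path-style-access=true

Here the "velox" bucket uses its own endpoint with path-style access, and every other bucket falls back to the base "hive.s3." values.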

``Google Cloud Storage Configuration``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. list-table::
@@ -729,10 +737,10 @@ Spark-specific Configuration
* - spark.legacy_date_formatter
- bool
- false
- If true, `Simple <https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html>` date formatter is used for time formatting and parsing. Joda date formatter is used by default.
- Joda date formatter performs strict checking of its input and uses different pattern string.
- For example, the 2015-07-22 10:00:00 timestamp cannot be parse if pattern is yyyy-MM-dd because the parser does not consume whole input.
- Another example is that the 'W' pattern, which means week in month, is not supported. For more differences, see :issue:`10354`.
- If true, `Simple Date Format <https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html>`_ is used for time formatting and parsing. Joda date formatter is used by default.
Joda date formatter performs strict checking of its input and uses a different pattern string.
For example, the 2015-07-22 10:00:00 timestamp cannot be parsed if the pattern is yyyy-MM-dd because the parser does not consume the whole input.
Another example is that the 'W' pattern, which means week in month, is not supported. For more differences, see :issue:`10354` and the sketch below this table.

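The strictness difference is easiest to see with the underlying Java formatters. A minimal sketch, assuming Joda-Time is on the classpath (the class and variable names are illustrative):

.. code-block:: java

    import java.text.ParseException;
    import java.text.SimpleDateFormat;
    import org.joda.time.format.DateTimeFormat;

    public class FormatterDifference {
      public static void main(String[] args) throws ParseException {
        String input = "2015-07-22 10:00:00";

        // SimpleDateFormat (legacy mode): matches the leading "2015-07-22"
        // and silently ignores the trailing " 10:00:00".
        System.out.println(new SimpleDateFormat("yyyy-MM-dd").parse(input));

        // Joda (default mode): strict; throws IllegalArgumentException
        // because the pattern must consume the whole input.
        DateTimeFormat.forPattern("yyyy-MM-dd").parseDateTime(input);
      }
    }
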
Tracing
--------
