doc: Update outdated spark.comet.columnar.shuffle.enabled configuration doc (#738)
wForget authored Aug 1, 2024
1 parent 2c9be0a commit c555885
Showing 3 changed files with 5 additions and 5 deletions.
6 changes: 3 additions & 3 deletions common/src/main/scala/org/apache/comet/CometConf.scala
@@ -222,8 +222,8 @@ object CometConf extends ShimCometConf {
 conf("spark.comet.columnar.shuffle.memorySize")
 .doc(
 "The optional maximum size of the memory used for Comet columnar shuffle, in MiB. " +
-"Note that this config is only used when `spark.comet.columnar.shuffle.enabled` is " +
-"true. Once allocated memory size reaches this config, the current batch will be " +
+"Note that this config is only used when `spark.comet.exec.shuffle.mode` is " +
+"`jvm`. Once allocated memory size reaches this config, the current batch will be " +
 "flushed to disk immediately. If this is not configured, Comet will use " +
 "`spark.comet.shuffle.memory.factor` * `spark.comet.memoryOverhead` as " +
 "shuffle memory size. If final calculated value is larger than Comet memory " +
@@ -259,7 +259,7 @@ object CometConf extends ShimCometConf {
 "prefer dictionary encoding when shuffling the column. If the ratio is higher than " +
 "this config, dictionary encoding will be used on shuffling string column. This config " +
 "is effective if it is higher than 1.0. By default, this config is 10.0. Note that this " +
-"config is only used when 'spark.comet.columnar.shuffle.enabled' is true.")
+"config is only used when `spark.comet.exec.shuffle.mode` is `jvm`.")
 .doubleConf
 .createWithDefault(10.0)

2 changes: 1 addition & 1 deletion docs/source/user-guide/configs.md
@@ -48,4 +48,4 @@ Comet provides the following configuration settings.
 | spark.comet.scan.enabled | Whether to enable Comet scan. When this is turned on, Spark will use Comet to read Parquet data source. Note that to enable native vectorized execution, both this config and 'spark.comet.exec.enabled' need to be enabled. By default, this config is true. | true |
 | spark.comet.scan.preFetch.enabled | Whether to enable pre-fetching feature of CometScan. By default is disabled. | false |
 | spark.comet.scan.preFetch.threadNum | The number of threads running pre-fetching for CometScan. Effective if spark.comet.scan.preFetch.enabled is enabled. By default it is 2. Note that more pre-fetching threads means more memory requirement to store pre-fetched row groups. | 2 |
-| spark.comet.shuffle.preferDictionary.ratio | The ratio of total values to distinct values in a string column to decide whether to prefer dictionary encoding when shuffling the column. If the ratio is higher than this config, dictionary encoding will be used on shuffling string column. This config is effective if it is higher than 1.0. By default, this config is 10.0. Note that this config is only used when 'spark.comet.columnar.shuffle.enabled' is true. | 10.0 |
+| spark.comet.shuffle.preferDictionary.ratio | The ratio of total values to distinct values in a string column to decide whether to prefer dictionary encoding when shuffling the column. If the ratio is higher than this config, dictionary encoding will be used on shuffling string column. This config is effective if it is higher than 1.0. By default, this config is 10.0. Note that this config is only used when `spark.comet.exec.shuffle.mode` is `jvm`. | 10.0 |
2 changes: 1 addition & 1 deletion docs/source/user-guide/installation.md
@@ -150,5 +150,5 @@ Some cluster managers may require additional configuration, see <https://spark.a
 To enable columnar shuffle which supports all partitioning and basic complex types, one more config is required:
 
 ```
---conf spark.comet.columnar.shuffle.enabled=true
+--conf spark.comet.exec.shuffle.mode=jvm
 ```
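
For launch scripts that still pass the outdated flag, the replacement made in this commit can be applied mechanically. The following is a minimal sketch, not part of the commit: the function name `migrate_comet_shuffle_flag` is hypothetical, and it assumes the only occurrence to rewrite is the literal `spark.comet.columnar.shuffle.enabled=true`. It leaves `=false` and every other setting untouched, since this commit does not say what the old `false` value maps to.

```shell
#!/bin/sh
# Hypothetical helper (not part of Comet): reads text on stdin and rewrites
# the outdated flag to the one documented in this commit.
# Only the literal "=true" form is handled; anything else passes through.
migrate_comet_shuffle_flag() {
  sed 's/spark\.comet\.columnar\.shuffle\.enabled=true/spark.comet.exec.shuffle.mode=jvm/g'
}

# Example: rewrite a single option string.
printf '%s\n' '--conf spark.comet.columnar.shuffle.enabled=true' | migrate_comet_shuffle_flag
# prints: --conf spark.comet.exec.shuffle.mode=jvm
```

Piping a whole script through the function (for example `migrate_comet_shuffle_flag < submit.sh`, where `submit.sh` is a placeholder name) prints the rewritten text to stdout so it can be reviewed before replacing the original.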
