Skip to content

Commit

Permalink
update doc
Browse files Browse the repository at this point in the history
  • Loading branch information
zhli1142015 committed Dec 5, 2023
1 parent f04d6d1 commit a0fba76
Showing 1 changed file with 3 additions and 5 deletions.
8 changes: 3 additions & 5 deletions docs/velox-backend-limitations.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,11 +11,6 @@ Gluten avoids to modify Spark's existing code and use Spark APIs if possible. Ho

So you need to ensure preferentially load the Gluten jar to overwrite the jar of vanilla spark. Refer to [How to prioritize loading Gluten jars in Spark](https://github.com/oap-project/gluten/blob/main/docs/velox-backend-troubleshooting.md#incompatible-class-error-when-using-native-writer).


### Runtime BloomFilter

Velox BloomFilter's serialization format is different from Spark's. BloomFilter binary generated by Velox can't be deserialized by vanilla spark. So if `might_contain` falls back, we fall back `bloom_filter_agg` to vanilla spark also.

### Fallbacks
Except the unsupported operators, functions, file formats, data sources listed in , there are some known cases also fall back to Vanilla Spark.

Expand All @@ -29,6 +24,9 @@ Gluten only supports spark default case-insensitive mode. If case-sensitive mode
In velox, lookaround (lookahead/lookbehind) pattern is not supported in RE2-based implementations for Spark functions,
such as `rlike`, `regexp_extract`, etc.

#### Runtime BloomFilter
Velox BloomFilter's serialization format is different from Spark's. BloomFilter binary generated by Velox can't be deserialized by vanilla spark. So if `might_contain` falls back, we fall back `bloom_filter_agg` to vanilla spark also.

#### FileSource format
Currently, Gluten only fully supports parquet file format and partially support ORC. If other format is used, scan operator falls back to vanilla spark.

Expand Down

0 comments on commit a0fba76

Please sign in to comment.