SNOW-1563484 Remove Advanced Query Pushdown Feature #572
Can you share some insight into why advanced pushdown was removed? That was open and pending review since Nov. 2023, despite multiple requests for review. How will this change improve performance?
@sfc-gh-bli @sfc-gh-yuwang This is quite unexpected! What is the justification for removing this feature?
It is unclear if this will work from any Spark cluster using the It would be great if there were clarification on the timeline and how "soon" this will be available.
@sfc-gh-bli @sfc-gh-yuwang
We decided to remove the Advanced Query Pushdown feature because:
The improvements from removing the Advanced Query Pushdown feature:
The conversion tool should work with any Spark cluster where the Spark connector works now. It is pretty similar to the Advanced Query Pushdown feature. For example, loading data from Snowflake to Spark:

    val df = spark.read.format("snowflake").options(...).load()
    df.select(...).filter(...).union(...).join(...).collect() // the connector will try to push down these operators, but pushdown is not guaranteed

With the conversion tool:

    val snowparkDataFrame = snowpark.table(...).select(...).filter(...).union(...).join(...) // all of these operators will be processed in Snowflake
    val sparkDataFrame = toSpark(snowparkDataFrame, sparkSession)
    // all operations on sparkDataFrame will be processed in the Spark cluster

Unlike Advanced Query Pushdown, the new conversion tool also supports Spark-to-Snowpark conversion, for example:

    val sparkDataFrame = ...
    val snowparkDataFrame = toSnowpark(sparkDataFrame, snowparkSession) // all operators on snowparkDataFrame will be processed in Snowflake
We are working on it now. It will be available in September, the connector
While developing Spark 3.5 support, we saw many internal changes to Spark's logical plan and internal row system, which significantly reduced the coverage of Advanced Query Pushdown. We also saw some wrong results caused by the change to the internal row system. There are two alternatives to Advanced Query Pushdown.
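
The coverage problem described in this comment can be illustrated with a toy sketch. This is not the connector's code: `Plan`, `Table`, `Filter`, `Project`, `Unknown`, and `PushdownSketch.toSql` are hypothetical names invented for the illustration, and the real Spark logical plan is far richer. The sketch only assumes that a pushdown translator maps each plan node it recognises to SQL and gives up on the whole query when it meets a node it has no rule for.

```scala
// Toy model only: these types are hypothetical and are not Spark's or the connector's classes.
sealed trait Plan
final case class Table(name: String) extends Plan
final case class Filter(condition: String, child: Plan) extends Plan
final case class Project(columns: Seq[String], child: Plan) extends Plan
// Stands in for a plan node that a newer engine version introduced or reshaped.
final case class Unknown(description: String, child: Plan) extends Plan

object PushdownSketch {
  // Returns Some(sql) only if every node in the plan has a translation rule.
  def toSql(plan: Plan): Option[String] = plan match {
    case Table(name) =>
      Some("SELECT * FROM " + name)
    case Filter(condition, child) =>
      toSql(child).map(sub => "SELECT * FROM (" + sub + ") WHERE " + condition)
    case Project(columns, child) =>
      val columnList = columns.mkString(", ")
      toSql(child).map(sub => "SELECT " + columnList + " FROM (" + sub + ")")
    case Unknown(_, _) =>
      None // no rule for this node, so the whole query cannot be pushed down
  }

  def main(args: Array[String]): Unit = {
    val supported = Project(Seq("id"), Filter("amount > 10", Table("orders")))
    val unsupported = Project(Seq("id"), Unknown("reshaped node", Table("orders")))
    println(toSql(supported))   // a single SQL string wrapped in Some
    println(toSql(unsupported)) // None: the engine would have to run this plan itself
  }
}
```

In this model, every node type that changes shape between engine versions needs a new translation rule. Until one is written, any query containing that node silently loses pushdown, which is one way coverage can drop after an upgrade.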
@sfc-gh-bli We are still waiting on the new release of the connector with Snowpark integration so we can evaluate whether we can use it. Can you please point us to an issue or PR where the progress is being tracked? The initial estimate for it was September.