[SPARK-51050] [SQL] Add group by alias tests to the group-by.sql #49750

mihailoale-db · 2025-01-31T14:40:40Z

What changes were proposed in this pull request?

I propose that we extend group-by.sql with some cases where we group by aliases.
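For illustration, here is a minimal sketch of what grouping by an alias looks like under both settings of spark.sql.groupByAliases. The view definition and its rows are assumptions made up for this example; only the testData name, its columns a and b, and the two SELECT statements come from the test file excerpts in this thread.

```sql
-- Hypothetical setup: a small temporary view with the columns the tests query.
CREATE OR REPLACE TEMPORARY VIEW testData AS
SELECT * FROM VALUES (1, 1), (1, 2), (2, 1) AS testData(a, b);

-- Alias resolution enabled: GROUP BY k refers to the select-list alias for a.
SET spark.sql.groupByAliases=true;
SELECT a AS k, COUNT(b) FROM testData GROUP BY k;

-- Alias resolution disabled: k is not a column of testData,
-- so this statement is expected to fail during analysis.
SET spark.sql.groupByAliases=false;
SELECT a AS k, COUNT(b) FROM testData GROUP BY k;
```

The second SELECT is the kind of negative case the test file groups under "Check analysis exceptions".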

Why are the changes needed?

Extend the testing coverage.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Added tests.

Was this patch authored or co-authored using generative AI tooling?

No.

sql/core/src/test/resources/sql-tests/inputs/group-by.sql

vladimirg-db · 2025-01-31T16:55:40Z

sql/core/src/test/resources/sql-tests/inputs/group-by.sql

@@ -64,6 +78,10 @@ set spark.sql.groupByAliases=false;

 -- Check analysis exceptions
 SELECT a AS k, COUNT(b) FROM testData GROUP BY k;
+SELECT 1 GROUP BY `1`;


This is a duplicate.

mihailoale-db · 2025-02-01

The idea was to add some tests that should fail (with set spark.sql.groupByAliases=false;). I can remove them if needed.

vladimirg-db · 2025-01-31T17:03:11Z

sql/core/src/test/resources/sql-tests/inputs/group-by.sql

+SELECT 1 AS a FROM testData GROUP BY `a`;
+
+-- GROUP BY implicit alias
+SELECT 1 GROUP BY `1`;


Suggested change

-SELECT 1 GROUP BY `1`;
+SELECT 1 GROUP BY `1`;
+-- GROUP BY alias with the subquery name
+SELECT (SELECT a FROM testData LIMIT 1) + (SELECT b FROM testData LIMIT 1) FROM VALUES (1, 2) GROUP BY `(SELECTaFROMtestDataLIMIT1)+(SELECTbFROMtestDataLIMIT1)`
+-- GROUP BY with expression subqueries
+SELECT a, count(*) FROM testData GROUP BY (SELECT b FROM testData)
+SELECT a, count(*) FROM testData GROUP BY a, (SELECT b FROM testData)
+SELECT a, count(*) FROM testData GROUP BY a, (SELECT b FROM testData LIMIT 1)
+SELECT a, count(*) FROM testData GROUP BY a, b IN (SELECT a FROM testData)
+SELECT a, count(*) FROM testData GROUP BY a, a IN (SELECT b FROM testData)
+SELECT a, count(*) FROM testData GROUP BY a, EXISTS(SELECT b FROM testData)

vladimirg-db · 2025-01-31T19:40:57Z

sql/core/src/test/resources/sql-tests/analyzer-results/group-by.sql.out

+-- !query analysis
+org.apache.spark.sql.catalyst.ExtendedAnalysisException
+{
+  "errorClass" : "UNRESOLVED_COLUMN.WITHOUT_SUGGESTION",


Please address this.

mihailoale-db · 2025-02-01

The idea was to add some tests that should fail (with set spark.sql.groupByAliases=false;). I can remove them if needed.

beliefer · 2025-02-02T08:04:18Z

Why did you add these test cases here?

initial commit

2c6a742

github-actions bot added the SQL label Jan 31, 2025

vladimirg-db reviewed Jan 31, 2025



