Translated latest questions_gen #186

Merged 5 commits on Jun 26, 2024
128 changes: 69 additions & 59 deletions data/questions_gen_bigquery.csv

Large diffs are not rendered by default.

436 changes: 223 additions & 213 deletions data/questions_gen_mysql.csv

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion data/questions_gen_postgres.csv
@@ -28,7 +28,7 @@ What are the publications written by authors from the 'Sociology' domain and pre
What's the average predicted time to graduation since admission in no. of days?,SELECT avg(predicted_graduation_semester - admit_term) AS average_predicted_time_to_graduation FROM student;,advising,date_functions,
How many students were predicted to graduate in the last 10 years?,"SELECT count(*) AS num_students_graduated FROM student WHERE predicted_graduation_semester >= DATE_TRUNC('year', CURRENT_DATE) - interval '10 year';",advising,date_functions,
How long has it been in days since the last admitted student?,SELECT CURRENT_DATE - max(admit_term) AS duration_since_last_admitted_student FROM student;,advising,date_functions,
-Return the course id's that are offered in either semesters 1 or 2 and ends before 1pm and had an instructor on thursday,"SELECT DISTINCT co.course_id FROM public.course_offering co JOIN public.offering_instructor oi ON co.offering_id = oi.offering_id WHERE (co.semester = 1 OR co.semester = 2) AND co.end_time < '13:00:00' AND co.thursday IS NOT NULL;",advising,date_functions,
+Return the course id's that are offered in either semesters 1 or 2 and ends before 1pm and had an instructor on thursday,"SELECT DISTINCT co.course_id FROM course_offering co JOIN offering_instructor oi ON co.offering_id = oi.offering_id WHERE (co.semester = 1 OR co.semester = 2) AND co.end_time < '13:00:00' AND co.thursday IS NOT NULL;",advising,date_functions,
What is the total number of students who found the instructor to be hilarious per course id?,"SELECT course_tags_count.course_id, SUM(course_tags_count.hilarious) AS total_hilarious FROM course_tags_count GROUP BY course_tags_count.course_id;",advising,group_by,
What is the average clarity score for each instructor who taught a course?,"SELECT {i.name, i.instructor_id}, AVG(c.clarity_score) FROM course c JOIN course_offering co ON c.course_id = co.course_id JOIN offering_instructor oi ON co.offering_id = oi.offering_id JOIN instructor i ON oi.instructor_id = i.instructor_id GROUP BY {};",advising,group_by,
How many course offerings have a final exam and how many do not?,"SELECT course_offering.has_final_exam, COUNT(offering_id) AS num_courses FROM course_offering GROUP BY course_offering.has_final_exam;SELECT COUNT(CASE WHEN co.has_final_exam THEN 1 END) AS num_with_final_exam, COUNT(CASE WHEN NOT co.has_final_exam THEN 1 END) AS num_without_final_exam FROM course_offering co;",advising,group_by,
298 changes: 154 additions & 144 deletions data/questions_gen_sqlite.csv

Large diffs are not rendered by default.

149 changes: 80 additions & 69 deletions data/questions_gen_tsql.csv

Large diffs are not rendered by default.

3 changes: 3 additions & 0 deletions translate_sql_dialect.py
@@ -45,6 +45,9 @@
 df["valid"] = ""
 df["err_msg"] = ""
 
+# fill na with empty string
+df.fillna("", inplace=True)
+
 # create db_type col where if "Snowflake" in file name, db_type = "snowflake", else db_type = "postgres"
 if "snowflake" in dataset_file:
     df["db_type"] = "snowflake"
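The diff does not say why the fill was added. One plausible reading is that the questions CSVs end each row with an optional trailing column that is often empty, which pandas reads as NaN. The sketch below shows that effect on a two-row sample; the column names and the sample rows are assumptions made for illustration, not the real file contents.

```python
import io

import pandas as pd

# Hypothetical two-row sample; the last column is empty on the first row.
csv_text = (
    "question,query,db_name,query_category,instructions\n"
    "Q1,SELECT 1;,advising,date_functions,\n"
    "Q2,SELECT 2;,advising,group_by,Use ISO dates\n"
)

df = pd.read_csv(io.StringIO(csv_text))

# The empty cell is parsed as a missing value, not as an empty string.
had_missing = bool(df["instructions"].isna().any())

# Same call as in the diff: replace every missing value with "".
df.fillna("", inplace=True)

instructions = df["instructions"].tolist()
```

After the fill, string operations on the column (concatenation, `str.strip`, prompt formatting) see `""` instead of a float NaN.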
8 changes: 7 additions & 1 deletion utils/dialects.py
@@ -736,7 +736,11 @@ def create_sqlite_db(db_name, table_metadata_string_test, row_idx):
     try:
         conn = sqlite3.connect(f"{test_db_name}.db")
         cursor = conn.cursor()
-        for table in table_metadata_string_test.split("\n"):
+        for table in table_metadata_string_test.split(");"):
+            if table.strip() == "":
+                continue
+            if not table.endswith(");"):
+                table += ");"
             cursor.execute(table)
         # print(f"Tables for `{test_db_name}` created successfully")
     except Exception as err:
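The change above switches from splitting the schema string on newlines to splitting on `);`, presumably so that a `CREATE TABLE` statement spanning several lines reaches `cursor.execute` as one statement. A minimal sketch of that idea follows, using an in-memory database; the helper name `split_create_statements` and the sample DDL are hypothetical and not part of the repository.

```python
import sqlite3


def split_create_statements(ddl: str) -> list[str]:
    """Split a DDL string on ');' and put the terminator back on each piece."""
    statements = []
    for chunk in ddl.split(");"):
        if chunk.strip() == "":
            continue  # skip the empty tail left after the final ');'
        statements.append(chunk.strip() + ");")
    return statements


# Hypothetical multi-line schema, one statement per table.
ddl = """CREATE TABLE student (
    student_id INTEGER,
    admit_term DATE
);
CREATE TABLE course (
    course_id INTEGER,
    clarity_score REAL
);"""

conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
for statement in split_create_statements(ddl):
    cursor.execute(statement)

table_names = sorted(
    row[0]
    for row in cursor.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
conn.close()
```

Splitting the same string on `"\n"` would hand `cursor.execute` fragments such as `CREATE TABLE student (`, which are not complete statements. Note that splitting on `);` is itself a heuristic: it would cut a statement in the wrong place if `);` appeared inside a string literal or default expression.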
@@ -989,6 +993,8 @@ def test_valid_md_tsql(sql_test_list, db_name, table_metadata_string_test, row_i
     validity_added = False
     while tries < 3 and not validity_added:
         try:
+            import pyodbc
+
             conn = pyodbc.connect(
                 f"DRIVER={creds['tsql']['driver']};SERVER={creds['tsql']['server']};DATABASE={test_db};UID={creds['tsql']['user']};PWD={creds['tsql']['password']}"
             )
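Moving `import pyodbc` inside the `try` block means a missing driver surfaces as a handled exception at call time rather than as a failure when `utils/dialects.py` is imported. The PR does not state this motivation, so treat it as an assumption. The sketch below shows the general deferred-import pattern; `load_optional_driver` is a hypothetical helper, and the standard-library `json` module stands in for `pyodbc` so the example runs without the driver installed.

```python
import importlib


def load_optional_driver(module_name: str):
    """Import a driver module on demand and return (module, error_message)."""
    try:
        return importlib.import_module(module_name), ""
    except ImportError as err:
        # ModuleNotFoundError is a subclass of ImportError, so it lands here too.
        return None, f"{module_name} is not available: {err}"


# "json" is only a stand-in for an installed driver such as pyodbc.
driver, err_msg = load_optional_driver("json")

# A module name that is assumed not to exist on any machine.
missing, missing_msg = load_optional_driver("hypothetical_missing_driver_xyz")
```

With this pattern, code paths for other dialects can still import the module when the T-SQL driver is absent.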