Update HW2
gjanee committed Apr 17, 2024
1 parent 22ac875 commit 9ea9343
Showing 1 changed file with 21 additions and 18 deletions.
39 changes: 21 additions & 18 deletions modules/week03/hw-03-2.qmd
@@ -2,44 +2,47 @@
title: "Week 3 - SQL problem 2"
---

If you want to know which site has the largest area, it's tempting to say
# Part 1

```
If we want to know which site has the largest area, it's tempting to say

```
SELECT Site_name, MAX(Area) FROM Site;
```

but as explained in class, databases will correctly compute the maximum but will select an arbitrary row to fill in the Site_name column. No good! This misleading behavior is more apparent if we do an average instead of a maximum:
Wouldn't that be great? But DuckDB gives an error, and rightly so! This query is conceptually flawed. Please describe what is wrong with this query. Don't just quote DuckDB's error message; explain why DuckDB objects to performing this query.

```
SELECT Site_name, AVG(Area) FROM Site;
To help you answer this question, you may want to consider:

┌───────────┬───────────┐
│ Site_name │ AVG(Area) │
├───────────┼───────────┤
│ Barrow │ 440.6125 │
└───────────┴───────────┘
```
- To the database, the above query is no different from

for there is no site whose area exactly equals the average, and so there is nothing you could reasonably put in Site_name, and it certainly wouldn't be Barrow. (SQLite is special in that if you do a MIN or MAX, it will return the row (or one of the rows, if there are multiple rows) that matches the minimum or maximum. But other databases do not do that.) So, we need a plan B.
- `SELECT Site_name, AVG(Area) FROM Site`
- `SELECT Site_name, COUNT(*) FROM Site`
- `SELECT Site_name, SUM(Area) FROM Site`

# Part 1
In all these examples, the database sees that it is being asked to apply an aggregate function to a table column.

Find the site name and area of the site having the largest area. Do so by ordering the rows in a particularly convenient order, and using LIMIT to select just the first row. Your result should look like:
- When performing an aggregation, SQL wants to collapse the requested columns down to a single row. (For a table-level aggregation such as requested above, it wants to collapse the entire table down to a single row. For a `GROUP BY`, it wants to collapse each group down to a single row.)

```
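The collapsing behavior described in the second hint can be seen on a toy table. The following is only a sketch under stated assumptions: it uses Python's built-in `sqlite3` module as a stand-in for DuckDB, and the rows, the `Location` column, and the site names are hypothetical, not taken from the course database. It runs only well-formed aggregate queries and counts the rows that come back.

```python
import sqlite3

# Hypothetical toy data; NOT the course database. SQLite stands in for DuckDB here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Site (Site_name TEXT, Location TEXT, Area REAL)")
conn.executemany(
    "INSERT INTO Site VALUES (?, ?, ?)",
    [
        ("Site A", "North", 100.0),
        ("Site B", "North", 300.0),
        ("Site C", "South", 50.0),
    ],
)

# Table-level aggregation: all three input rows collapse to a single output row.
table_level = conn.execute(
    "SELECT COUNT(*), SUM(Area), AVG(Area) FROM Site"
).fetchall()

# GROUP BY: each group collapses to a single row, so two groups give two rows.
per_group = conn.execute(
    "SELECT Location, COUNT(*), SUM(Area) FROM Site "
    "GROUP BY Location ORDER BY Location"
).fetchall()

print(len(table_level), table_level)
print(len(per_group), per_group)
```

With this toy data, the table-level query returns one row and the grouped query returns one row per distinct `Location`. It may help to ask how many different `Site_name` values sit behind each of those output rows.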
# Part 2

Time for plan B. Find the site name and area of the site having the largest area. Do so by ordering the rows in a particularly convenient order, and using LIMIT to select just the first row. Your result should look like:

```
┌──────────────┬────────┐
│ Site_name │ Area │
│ varchar │ float │
├──────────────┼────────┤
│ Coats Island │ 1239.1 │
└──────────────┴────────┘
```

Please submit your SQL.

# Part 2
# Part 3

Do the same, but use a nested query. First, create a query that finds the maximum area. Then, create a query that selects the site name and area of the site whose area equals the maximum. Your overall query will look something like:

```
SELECT Site_name, Area FROM Site WHERE Area = (SELECT ...);
```
