docs(sql): update SQL syntax for WINDOW and JOIN #3555
Changes from 5 commits
59c36bc
0786574
26111cc
bdcf0ec
7502dc0
a1f6e6a
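@@ -12,10 +12,10 @@ OpenMLDB only supports deploying [SELECT query statements](../dql/SELECT_STATEMENT.md).

 The following table lists the `SELECT` clauses supported in online request mode.

-| SELECT clause | Description |
-|:-------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------|
-| Simple expression computation on a single table | A simple single-table query computes over one table using column operations, arithmetic expressions, single-row (scalar) functions, or combinations of these. It must follow the [usage rules for single-table queries in online request mode](#在线请求模式下单表查询的使用规范) |
-| [`JOIN` clause](../dql/JOIN_CLAUSE.md) | OpenMLDB currently supports only **LAST JOIN**. It must follow the [usage rules for LAST JOIN in online request mode](#在线请求模式下-last-join-的使用规范) |
+| SELECT clause | Description |
+| :--------------------------------------- | :----------------------------------------------------------- |
+| Simple expression computation on a single table | A simple single-table query computes over one table using column operations, arithmetic expressions, single-row (scalar) functions, or combinations of these. It must follow the [usage rules for single-table queries in online request mode](#在线请求模式下单表查询的使用规范) |
+| [`JOIN` clause](../dql/JOIN_CLAUSE.md) | OpenMLDB currently supports only **LAST JOIN**. It must follow the [usage rules for LAST JOIN in online request mode](#在线请求模式下-last-join-的使用规范) |
 | [`WINDOW` clause](../dql/WINDOW_CLAUSE.md) | A window clause defines one or more windows. Windows can be named or anonymous. Users can call aggregate functions on a window for analytical computation. It must follow the [usage rules for Window in online request mode](#在线请求模式下window的使用规范) |

 ## Usage rules for `SELECT` clauses in online request mode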
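@@ -57,15 +57,19 @@ SELECT substr(COL7, 3, 6) FROM t1;

 ### Usage rules for `LAST JOIN` in online request mode

-- Only the `LAST JOIN` type is supported.
-- At least one JOIN condition must be an EQUAL condition of the form `left_source.column=right_source.column`, **and the `right_source.column` column must hit an index (key column) of the right table**.
-- For a LAST JOIN with ordering, `ORDER BY` supports only a single column reference expression of type int16, int32, int64 or timestamp, **and the column must hit the time column of the right table's index**.
-- Right table TableRef
+1. Only the `LAST JOIN` type is supported.
+2. At least one JOIN condition must be an EQUAL condition of the form `left_source.column=right_source.column`, **and the `right_source.column` column must hit an index (key column) of the right table**.
+3. For a LAST JOIN with ordering, `ORDER BY` supports only a single column reference expression of type int16, int32, int64 or timestamp, **and the column must hit the time column of the right table's index**. A table that satisfies conditions 2 and 3 is referred to below as a table that can be optimized by the JOIN condition of the LAST JOIN.
+4. Right table TableRef
    - Can be a physical table or a subquery
-   - For subqueries, only the following are supported
+   - For subqueries, the following are currently supported
      - Simple column selection (`select * from tb` or `select id, val from tb`)
-     - Window aggregation subquery, e.g. `select id, count(val) over w as cnt from t1 window w as (...)`. In this case, the subquery and the left table of the last join must have the same main table, where the main table is the leftmost physical table node in the plan tree.
-     - **Since OpenMLDB 0.8.0** Simple column selection with WHERE filtering (e.g. `select * from tb where id > 10`)
+     - Window aggregation subquery, e.g. `select id, count(val) over w as cnt from t1 window w as (...)`.
+       - Before OpenMLDB 0.8.4, the window aggregation subquery of a LAST JOIN must have the same main table as the left input source of the LAST JOIN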
aceforeverd marked this conversation as resolved.
+       - [ALPHA] OpenMLDB >= 0.8.4 allows the window aggregation subquery under a LAST JOIN to omit the main table. See the examples below for details
Review comment: What is the example without the main table?
Review comment: This example does not show that, does it? Could you add an example that uses a table other than the main table? It should also be distinguished from "having no main table"; the phrase "without the main table" is somewhat ambiguous.
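+     - **OpenMLDB >= 0.8.0** Simple column selection with WHERE filtering (e.g. `select * from tb where id > 10`)
+     - **[ALPHA] OpenMLDB >= 0.8.4** The right table is a subquery `subquery` that contains a LAST JOIN. The leftmost table of `subquery` must be optimizable by the JOIN condition, and the remaining tables of `subquery` must be optimizable by the JOIN conditions of their own LAST JOIN
+     - **[ALPHA] OpenMLDB >= 0.8.4** LEFT JOIN. The right table of the LEFT JOIN must be optimizable by the LEFT JOIN condition, and the left table of the LEFT JOIN must be optimizable by the outer LAST JOIN condition

 **Example: `LAST JOIN` statements that can be deployed**
 Create two tables for the subsequent `LAST JOIN`.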
@@ -115,15 +119,82 @@ desc t1;
   t1.col0 as t1_col0,
   t1.col1 + t2.col1 + 1 as test_col1,
 FROM t1
-LAST JOIN t2 ORDER BY t2.std_time ON t1.col1=t2.col1;
+LAST JOIN t2 ORDER BY t2.std_time ON t1.col1=t2.col1;
 ```

+Cases where the right table contains a LAST JOIN or a WHERE filter
+
+```sql
+CREATE TABLE t3 (col0 STRING, col1 int, std_time TIMESTAMP, INDEX(KEY=col1, TS=std_time, TTL_TYPE=absolute, TTL=30d));
+-- SUCCEED
+
+SELECT
+  t1.col1 as t1_col1,
+  t2.col1 as t2_col1,
+  t2.col0 as t2_col0
+FROM t1 LAST JOIN (
+  SELECT * FROM t2 WHERE strlen(col0) > 0
+) t2
+ON t1.col1 = t2.col1;
+
+-- t2 is optimized by the JOIN condition 't1.col1 = tx.t2_col1', and t3 is optimized by the JOIN condition 't2.col1 = t3.col1'
+SELECT
+  t1.col1 as t1_col1,
+  tx.t2_col1,
+  tx.t3_col1
+FROM t1 LAST JOIN (
+  SELECT t2.col1 as t2_col1, t3.col1 as t3_col1
+  FROM t2 LAST JOIN t3
+  ON t2.col1 = t3.col1
+) tx
+ON t1.col1 = tx.t2_col1;
+
+-- The right table is a LEFT JOIN
+SELECT
+  t1.col1 as t1_col1,
+  tx.t2_col1,
+  tx.t3_col1
+FROM t1 LAST JOIN (
+  SELECT t2.col1 as t2_col1, t3.col1 as t3_col1
+  FROM t2 LEFT JOIN t3
+  ON t2.col1 = t3.col1
+) tx
+ON t1.col1 = tx.t2_col1;
+
+-- Before OpenMLDB 0.8.4, a LAST JOIN window subquery requires the main table of the window subquery to be the same as the current main table
+-- Here both are t1
+SELECT
+  t1.col1,
+  tx.agg
+FROM t1 LAST JOIN (
+  SELECT col1, count(col2) over w as agg
+  FROM t1 WINDOW w AS (
+    UNION t2
+    PARTITION BY col2 order by std_time ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
+    INSTANCE_NOT_IN_WINDOW EXCLUDE CURRENT_ROW
+  )
+) tx ON t1.col1 = tx.col1;
+
+-- The right table is a window aggregation
+-- OpenMLDB >= 0.8.4 allows t1 LAST JOIN WINDOW (t2). t1 is the main table and t2 is a secondary table
+-- This SQL is semantically equivalent to the previous example
+SELECT
+  t1.col1,
+  tx.agg
+FROM t1 LAST JOIN (
+  SELECT col1, count(col2) over w as agg
+  FROM t2 WINDOW w AS (PARTITION BY col2 order by std_time ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
+) tx ON t1.col1 = tx.col1;
+```
+
+
+
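 ### Usage rules for Window in online request mode

 - Window bounds support only `PRECEDING` and `CURRENT ROW`
 - Window types support only `ROWS` and `ROWS_RANGE`.
 - The window `PARTITION BY` supports only column expressions. Multiple columns are allowed, and all columns must hit an index. Both the main table and the union source tables must meet this requirement
-- The window `ORDER BY` supports only column expressions and only a single column, and the column must hit the time column of an index. Both the main table and the union source tables must meet this requirement
+- The window `ORDER BY` supports only column expressions and only a single column, and the column must hit the time column of an index. Both the main table and the union source tables must meet this requirement. Starting from OpenMLDB 0.8.4, ORDER BY may be omitted, subject to additional requirements; see [WINDOW CLAUSE](../dql/WINDOW_CLAUSE.md)
 - `EXCLUDE CURRENT_ROW`, `EXCLUDE CURRENT_TIME`, `MAXSIZE`, and `INSTANCE_NOT_IN_WINDOW` can be used to place further special restrictions on a window; see [OpenMLDB-specific WindowSpec elements](#openmldb特有的-windowspec-元素).
 - `WINDOW UNION` source requirements. Subqueries of the following forms are supported:
   - A table reference or a simple column selection, e.g. `t1` or `select id, val from t1`. The schema of the union source must be exactly the same as that of the main table, and the `PARTITION BY` and `ORDER BY` columns of the union source must also hit an index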
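The `ROWS BETWEEN 2 PRECEDING AND CURRENT ROW` frame used in the window examples can be illustrated outside OpenMLDB. The Python sketch below only emulates the standard SQL frame semantics over a whole table; it is not OpenMLDB code. The function name `rows_window_count` and the sample rows are hypothetical, it counts rows rather than non-null `col2` values, and OpenMLDB-specific options such as `UNION`, `INSTANCE_NOT_IN_WINDOW`, and `EXCLUDE CURRENT_ROW` are not modeled.

```python
from collections import defaultdict
from typing import Any

Row = dict[str, Any]


def rows_window_count(rows: list[Row], partition_key: str, order_key: str, preceding: int) -> list[Row]:
    """Emulate a row count over (PARTITION BY partition_key ORDER BY order_key
    ROWS BETWEEN `preceding` PRECEDING AND CURRENT ROW)."""
    partitions: dict[Any, list[Row]] = defaultdict(list)
    for row in rows:
        partitions[row[partition_key]].append(row)

    result: list[Row] = []
    for part in partitions.values():
        # Order the rows inside each partition by the time column.
        ordered = sorted(part, key=lambda r: r[order_key])
        for i, row in enumerate(ordered):
            # The frame holds the current row plus up to `preceding` earlier rows.
            frame = ordered[max(0, i - preceding): i + 1]
            result.append({**row, "agg": len(frame)})
    return result


# Hypothetical sample data: four rows in partition "a", one row in partition "b".
sample = [
    {"col2": "a", "std_time": 1},
    {"col2": "a", "std_time": 2},
    {"col2": "a", "std_time": 3},
    {"col2": "a", "std_time": 4},
    {"col2": "b", "std_time": 1},
]
counts = [(r["col2"], r["std_time"], r["agg"]) for r in rows_window_count(sample, "col2", "std_time", 2)]
print(counts)
```

The count grows until the frame is full (three rows) and then stays constant, and each partition is counted independently.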
Review comment: ts column type should be bigint or timestamp? Can we `order by`?
Review comment: The five types are supported in the SQL engine. Is there any difference in deployment?
Review comment: Yes. If you deploy, creating the index will fail.
Review comment: That is indeed a restriction of the index. Anyway, I've removed the two types to avoid confusion.
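For readers unfamiliar with `LAST JOIN ... ORDER BY`, the following Python sketch emulates the behavior the rules in this diff rely on. It assumes that `LAST JOIN` keeps every left row and attaches at most one right row, namely the matching row that sorts last under the `ORDER BY` column, or nothing when no row matches. That reading is an assumption of this sketch rather than something stated in the diff, and the function `last_join` and the sample rows are hypothetical.

```python
from typing import Any, Optional

Row = dict[str, Any]


def last_join(left: list[Row], right: list[Row], key: str, order_by: str) -> list[tuple[Row, Optional[Row]]]:
    """Pair each left row with the last matching right row under `order_by`."""
    joined: list[tuple[Row, Optional[Row]]] = []
    for left_row in left:
        # Equality condition of the form left.key = right.key.
        matches = [r for r in right if r[key] == left_row[key]]
        # Keep only the match that sorts last; None when nothing matches.
        last = max(matches, key=lambda r: r[order_by]) if matches else None
        joined.append((left_row, last))
    return joined


# Hypothetical sample data loosely modeled on the t1/t2 tables in the diff.
t1 = [{"col1": 1}, {"col1": 2}]
t2 = [
    {"col1": 1, "col0": "old", "std_time": 10},
    {"col1": 1, "col0": "new", "std_time": 20},
]
result = last_join(t1, t2, key="col1", order_by="std_time")
print(result)
```

The linear scan over `right` is what the index requirements avoid in practice: when the equality column hits the index key and the `ORDER BY` column hits the index time column, the last matching row can be looked up directly instead of being searched for.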