chore: update sql docs (#319)
Co-authored-by: Luis Moreno <[email protected]>
morenol and morenol authored Dec 17, 2024
1 parent 98de88a commit 93eb074
Showing 36 changed files with 235 additions and 164 deletions.
6 changes: 3 additions & 3 deletions sdf/cli/deploy.mdx
@@ -90,11 +90,11 @@ Where:
#### SQL mode
Use the SQL mode in the CLI, to be able to run SQL queries on SDF states. See more details in [sql mode for sdf run]
Please see the section of [sql mode].
### Managing dataflow in interactive shell
Please see the [deployment] section for more details.
[deployment]: /sdf/deployment
[sql mode for sdf run]: /sdf/cli/run.mdx#sql-mode
[deployment]: ../deployment.mdx
[sql mode]: ./run_sql.mdx
52 changes: 2 additions & 50 deletions sdf/cli/run.mdx
@@ -128,55 +128,7 @@ Show the detailed information:

#### SQL mode

Use the SQL mode in the CLI, to be able to run SQL queries on SDF states. For a given dataflow, we will have in context for SQL all the dataframe states, which are basically the states with an `arrow-row` value.
Please see the section of [sql mode].

For states that are scoped to a window, we will have access to the last flush state. For states that are not window aware we will have access to the global state.

In order to enter the SQL mode, type `sql` in the SDF interactive shell. In the SQL mode we could perform any sql command supported by the polars engine.

#### Examples:

##### Run command

Navigate to the directory with `dataflow.yaml` file, and run the command:

```bash
$ sdf run
```

##### Enter the SQL mode

Using the sql command:

```bash
>> sql
SDF SQL version sdf-beta5
Type .help for help.
```

#### Show tables in context
```bash
sql>> show tables
shape: (1, 1)
┌────────────────┐
│ name           │
│ ---            │
│ str            │
╞════════════════╡
│ count_per_word │
└────────────────┘
```

#### Perform a query

```bash
sql>> select * from count_per_word;
shape: (1, 2)
┌──────┬─────────────┐
│ _key ┆ occurrences │
│ ---  ┆ ---         │
│ str  ┆ u32         │
╞══════╪═════════════╡
│ abc  ┆ 10          │
└──────┴─────────────┘
```
[sql mode]: ./run_sql.mdx
84 changes: 84 additions & 0 deletions sdf/cli/run_sql.mdx
@@ -0,0 +1,84 @@
---
description: SDF SQL subcommand
title: sdf run sql
sidebar_position: 55
---

# `sdf run sql`

Use the SQL mode in the CLI, to be able to run SQL queries on SDF states. For a given dataflow, we will have in context for SQL all the [dataframe states], which are essentially the states with an `arrow-row` value.

For states that are scoped to a window, we will have access to the last flushed state. For states that are not window aware we will have access to the global state.

In order to enter the SQL mode, type `sql` in the SDF interactive shell created through `sdf run` or `sdf deploy`. In the SQL mode we could perform any sql command supported by the polars engine.

We can use the `.help` command to see the available options.

```
sql >> .help
shape: (3, 2)
┌─────────┬─────────────────────────────────┐
│ command ┆ description                     │
│ ---     ┆ ---                             │
│ str     ┆ str                             │
╞═════════╪═════════════════════════════════╡
│ .exit   ┆ Exit this program               │
│ .quit   ┆ Exit this program               │
│ .help   ┆ Display this help.              │
└─────────┴─────────────────────────────────┘
```

#### Examples:

##### Run command

Navigate to the directory with `dataflow.yaml` file, and run the command:

```bash
$ sdf run
```

##### Enter the SQL mode

Using the sql command:

```bash
>> sql
SDF SQL version sdf-beta5
Type .help for help.
```

#### Show tables in context
```bash
sql >> show tables
shape: (1, 1)
┌────────────────┐
│ name           │
│ ---            │
│ str            │
╞════════════════╡
│ count_per_word │
└────────────────┘
```

#### Perform a query

```bash
sql >> select * from count_per_word;
shape: (1, 2)
┌──────┬─────────────┐
│ _key ┆ occurrences │
│ ---  ┆ ---         │
│ str  ┆ u32         │
╞══════╪═════════════╡
│ abc  ┆ 10          │
└──────┴─────────────┘
```
#### Exit the SQL mode
```bash
sql >> .exit
```
[dataframe states]: ../concepts/sql.mdx#dataframe-states
2 changes: 1 addition & 1 deletion sdf/cli/test.mdx
@@ -1,6 +1,6 @@
---
description: SDF Test Command
sidebar_position: 50
sidebar_position: 98
---

# `sdf test`
2 changes: 1 addition & 1 deletion sdf/concepts/schema_validation.mdx
@@ -1,7 +1,7 @@
---
title: Schema Validation
description: Schema Validation
sidebar_position: 60
sidebar_position: 45
---

import CodeBlock from '@theme/CodeBlock';
20 changes: 10 additions & 10 deletions sdf/concepts/state-dataframe.mdx → sdf/concepts/sql.mdx
@@ -1,9 +1,11 @@
---
title: State - Arrow Dataframe
description: Stateful Dataflows Dataframe Integration
title: SQL Interface
description: Stateful Dataflows Dataframe SQL Integration
sidebar_position: 60
---

## Dataframe states

States are a key concept in SDF. They hold a simple value or they can be divided into multiple partitions based on a key. Each partition state can have many different types.

One of the type is `arrow-row`. For example, follow snippet define state `count_per_word` that store each row into arrow dataframe.
@@ -22,8 +24,7 @@ One of the type is `arrow-row`. For example, follow snippet define state `count
type: u32
```
In here, we are defining a state `count_per_word` to track frequency of the words. For each key, we have a row that has a column `count` that store the count of the word.
An `arrow-row` state can be seen as an SQL table. In the previous example, we are defining a state `count_per_word` to track frequency of the words. For each key, we have a row that has a column `count` that store the count of the word.
For example, let's we have following word: `apple`, `orange`, `banana`, `orange`, `grape`, `orange`, `banana`.

Then this will be mapped to arrow dataframe as follows:
@@ -52,9 +53,9 @@ update-state:

Note that state value can be access using `count_per_word` state function which is automatically injected by SDF builder.

This API is invoked by the `update-state` operator, which only returns the value of the partition state.
This API is invoked by the `update-state` operator, which only is able to update the value of the partition state.

In the example, `count_per_word` represents a row value of the dataframe. If operator sees `apple`, it will be first row in the dataframe above.
In the example, `count_per_word` represents a row value of the dataframe. If operator sees `apple`, it will be the first row in the dataframe above.
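
The mapping from a stream of words to one row per key can be sketched in plain Rust. This is only an illustration of the idea, not the SDF API: `count_rows` is a hypothetical helper, and a `BTreeMap` stands in for the arrow dataframe, with the map key playing the role of `_key` and the value playing the role of the `count` column.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a keyed `arrow-row` state with a single
/// `count: u32` column. Not part of SDF; for illustration only.
fn count_rows(words: &[&str]) -> BTreeMap<String, u32> {
    let mut rows = BTreeMap::new();
    for word in words {
        // Each key owns exactly one row; seeing the key again
        // increments that row's `count` value.
        *rows.entry(word.to_string()).or_insert(0u32) += 1;
    }
    rows
}
```

Feeding it the words `apple`, `orange`, `banana`, `orange`, `grape`, `orange`, `banana` yields four rows, one per distinct word, with `orange` holding a count of 3 and `banana` a count of 2.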

## SQL function

@@ -83,16 +84,15 @@ flush:
}
```

The output of the `sql` function implements also the following methods that will be described above: sql, rows, col, key, next
The output of the `sql` function implements also the following methods that will be described below: sql, rows, col, key, next

## SQL API

For any state that is dataframe, you can use SQL API to perform dataframe operation. SDF uses polar SQL to perform dataframe operation.
The result of the SQL operation is always dataframe. So you can perform multiple SQL operation to get the desired result.
For any state that is dataframe, you can use SQL API to perform dataframe operation. SDF uses polars SQL to perform dataframe operations.
The result of a SQL operation is always a dataframe. So you can perform multiple SQL operation to get the desired result.

The SQL is executed in the context of all the available dataframes, so you can perform any JOIN or complex SQL operations with them. Each dataframe is represented as a table, and each table name is their state name replacing hyphens(-) with underscores(_) as illustrated below.


```rust
let top3 = sql("select * from count_per_word order by count desc limit 3")?;
```
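
To make the two rules above concrete, here is a minimal plain-Rust sketch that does not use SDF or polars. Both helpers are hypothetical: `table_name` mirrors the hyphen-to-underscore naming rule, and `top_n` emulates what the `order by count desc limit 3` query computes over rows given as `(key, count)` pairs. The tie-breaking by key is an assumption of this sketch; the SQL query itself does not specify an order for equal counts.

```rust
/// Hypothetical helper: derive the SQL table name from a state name
/// by replacing hyphens with underscores.
fn table_name(state_name: &str) -> String {
    state_name.replace('-', "_")
}

/// Hypothetical helper: plain-Rust equivalent of
/// `select * from count_per_word order by count desc limit n`.
fn top_n(rows: &[(&str, u32)], n: usize) -> Vec<(String, u32)> {
    let mut sorted: Vec<(String, u32)> = rows
        .iter()
        .map(|(key, count)| (key.to_string(), *count))
        .collect();
    // Highest count first; equal counts are ordered by key (sketch assumption).
    sorted.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    // Keep only the first `n` rows, like `limit n`.
    sorted.truncate(n);
    sorted
}
```

For example, `table_name("count-per-word")` gives `count_per_word`, and `top_n` over the rows `(apple, 1)`, `(orange, 3)`, `(banana, 2)`, `(grape, 1)` with `n = 3` keeps `orange`, `banana`, and `apple`.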
2 changes: 1 addition & 1 deletion sdf/concepts/state.mdx
@@ -167,7 +167,7 @@ State can be defined in a package [sdf-package.yaml] file. The approach allows y
We'll use an example to show how to implement, test, and import a state from a package file.

[types]: types.mdx
[arrow dataframe]: state-dataframe.mdx
[arrow dataframe]: sql.mdx#dataframe-states
[sdf-package.yaml]: sdf/composition/sdf-package-yaml.mdx
[dataflow.yaml]: sdf/index.mdx
[window processing]: sdf/concepts/window-processing.mdx
2 changes: 1 addition & 1 deletion sdf/how-to/key_value-chained.mdx
@@ -161,5 +161,5 @@ This how-to focused on using key-values as output as well as the input. The foll


[installation]: /docs/fluvio/quickstart#install-fluvio
[arrow]: /sdf/concepts/state-dataframe
[arrow]: ../concepts/sql.mdx#dataframe-states
[key_value_chain]: https://github.com/infinyon/stateful-dataflows-examples/tree/main/primitives/key-value/chained
2 changes: 1 addition & 1 deletion sdf/how-to/key_value-input.mdx
@@ -150,5 +150,5 @@ This how-to focused on using key-values as inputs. The following pages contains


[installation]: /docs/fluvio/quickstart#install-fluvio
[arrow]: /sdf/concepts/state-dataframe
[arrow]: ../concepts/sql.mdx#dataframe-states
[key_value_input]: https://github.com/infinyon/stateful-dataflows-examples/tree/main/primitives/key-value/input
2 changes: 1 addition & 1 deletion sdf/how-to/key_value-output.mdx
@@ -134,5 +134,5 @@ This how-to focused on using key-values as out. The following pages contains ano


[installation]: /docs/fluvio/quickstart#install-fluvio
[arrow]: /sdf/concepts/state-dataframe
[arrow]: ../concepts/sql.mdx#dataframe-states
[key_value_output]: https://github.com/infinyon/stateful-dataflows-examples/tree/main/primitives/key-value/output
4 changes: 2 additions & 2 deletions sdf/how-to/state_example_arrow_row.mdx
@@ -71,7 +71,7 @@ partition:

States are terminal so no other action will be run.

### Iterface
### Interface
The second service serves as a way to read from the state.

```YAML
@@ -172,5 +172,5 @@ We just implement example using arrow states. The following link contains anothe


[installation]: /docs/fluvio/quickstart#install-fluvio
[arrow]: /sdf/concepts/state-dataframe
[arrow]: ../concepts/sql.mdx#dataframe-states
[github_arrow_1]: https://github.com/infinyon/stateful-dataflows-examples/tree/main/primitives/update-state
6 changes: 3 additions & 3 deletions sdf/whatsnew.mdx
@@ -30,7 +30,7 @@ For upgrading cloud workers, please contact [InfinyOn support](#infinyon-support

### CLI changes

- `sdf run` not longer accepts `--dev`. Develoment mode is now the default for `sdf run`. If you want to run in non-development mode use `--prod`.
- `sdf run` not longer accepts `--dev`. Development mode is now the default for `sdf run`. If you want to run in non-development mode use `--prod`.

### Improvements

@@ -46,5 +46,5 @@ For upgrading cloud workers, please contact [InfinyOn support](#infinyon-support
For any questions or issues, please contact InfinyOn support at [email protected] or https://discordapp.com/invite/bBG2dTz

[sql mode]: cli/run.mdx#sql-mode
[sql function]: concepts/state-dataframe.mdx#sql-function
[dataframe states]: concepts/state-dataframe.mdx
[sql function]: concepts/sql.mdx#sql-function
[dataframe states]: concepts/sql.mdx#dataframe-states
@@ -161,5 +161,5 @@ This how-to focused on using key-values as output as well as the input. The foll


[installation]: /docs/fluvio/quickstart#install-fluvio
[arrow]: /sdf/concepts/state-dataframe
[arrow]: ../concepts/state-dataframe.mdx
[key_value_chain]: https://github.com/infinyon/stateful-dataflows-examples/tree/main/primitives/key-value/chained
@@ -150,5 +150,5 @@ This how-to focused on using key-values as inputs. The following pages contains


[installation]: /docs/fluvio/quickstart#install-fluvio
[arrow]: /sdf/concepts/state-dataframe
[arrow]: ../concepts/state-dataframe.mdx
[key_value_input]: https://github.com/infinyon/stateful-dataflows-examples/tree/main/primitives/key-value/input
@@ -134,5 +134,5 @@ This how-to focused on using key-values as out. The following pages contains ano


[installation]: /docs/fluvio/quickstart#install-fluvio
[arrow]: /sdf/concepts/state-dataframe
[arrow]: ../concepts/state-dataframe.mdx
[key_value_output]: https://github.com/infinyon/stateful-dataflows-examples/tree/main/primitives/key-value/output
@@ -71,7 +71,7 @@ partition:

States are terminal so no other action will be run.

### Iterface
### Interface
The second service serves as a way to read from the state.

```YAML
@@ -172,5 +172,5 @@ We just implement example using arrow states. The following link contains anothe


[installation]: /docs/fluvio/quickstart#install-fluvio
[arrow]: /sdf/concepts/state-dataframe
[arrow]: ../concepts/state-dataframe.mdx
[github_arrow_1]: https://github.com/infinyon/stateful-dataflows-examples/tree/main/primitives/update-state
@@ -161,5 +161,5 @@ This how-to focused on using key-values as output as well as the input. The foll


[installation]: /docs/fluvio/quickstart#install-fluvio
[arrow]: /sdf/concepts/state-dataframe
[arrow]: ../concepts/state-dataframe.mdx
[key_value_chain]: https://github.com/infinyon/stateful-dataflows-examples/tree/main/primitives/key-value/chained
@@ -150,5 +150,5 @@ This how-to focused on using key-values as inputs. The following pages contains


[installation]: /docs/fluvio/quickstart#install-fluvio
[arrow]: /sdf/concepts/state-dataframe
[arrow]: ../concepts/state-dataframe.mdx
[key_value_input]: https://github.com/infinyon/stateful-dataflows-examples/tree/main/primitives/key-value/input
@@ -134,5 +134,5 @@ This how-to focused on using key-values as out. The following pages contains ano


[installation]: /docs/fluvio/quickstart#install-fluvio
[arrow]: /sdf/concepts/state-dataframe
[arrow]: ../concepts/state-dataframe.mdx
[key_value_output]: https://github.com/infinyon/stateful-dataflows-examples/tree/main/primitives/key-value/output
@@ -71,7 +71,7 @@ partition:

States are terminal so no other action will be run.

### Iterface
### Interface
The second service serves as a way to read from the state.

```YAML
@@ -172,5 +172,5 @@ We just implement example using arrow states. The following link contains anothe


[installation]: /docs/fluvio/quickstart#install-fluvio
[arrow]: /sdf/concepts/state-dataframe
[arrow]: ../concepts/state-dataframe.mdx
[github_arrow_1]: https://github.com/infinyon/stateful-dataflows-examples/tree/main/primitives/update-state
@@ -161,5 +161,5 @@ This how-to focused on using key-values as output as well as the input. The foll


[installation]: /docs/fluvio/quickstart#install-fluvio
[arrow]: /sdf/concepts/state-dataframe
[arrow]: ../concepts/state-dataframe.mdx
[key_value_chain]: https://github.com/infinyon/stateful-dataflows-examples/tree/main/primitives/key-value/chained