
Commit

Improve and expand client protocol docs
Include the new spooling protocol and its configuration for CLI and JDBC
driver.
mosabua committed Nov 26, 2024
1 parent 2d3adc8 commit c2849c6
Showing 10 changed files with 371 additions and 48 deletions.
1 change: 1 addition & 0 deletions docs/src/main/sphinx/admin.md
@@ -63,6 +63,7 @@ admin/properties

* [Properties reference overview](admin/properties)
* [](admin/properties-general)
* [](admin/properties-client-protocol)
* [](admin/properties-http-server)
* [](admin/properties-resource-management)
* [](admin/properties-query-management)
183 changes: 183 additions & 0 deletions docs/src/main/sphinx/admin/properties-client-protocol.md
@@ -0,0 +1,183 @@
# Client protocol properties

The following sections provide a reference for all properties related to the
[client protocol](/client/client-protocol).

(prop-protocol-spooling)=
## Spooling protocol

The following properties are related to the [](protocol-spooling).

### `protocol.spooling.enabled`

- **Type:** [](prop-type-boolean)
- **Default value:** `true`

Enable support for the spooled client protocol. Spooling additionally requires
a shared secret key, set with `protocol.spooling.shared-secret-key`.

### `protocol.spooling.shared-secret-key`

- **Type:** [](prop-type-string)

256-bit, base64-encoded secret key used to secure spooled segment identifiers.
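
A suitable key can be generated with OpenSSL. The following command is a
sketch, assuming OpenSSL is available on the system:

```shell
# Generate 32 random bytes (256 bits) and encode them as base64
openssl rand -base64 32
```

The resulting value can then be used as the `protocol.spooling.shared-secret-key`
setting.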

### `protocol.spooling.retrieval-mode`

- **Type:** [](prop-type-string)
- **Default value:** `STORAGE`

Determines how the client retrieves the spooled segments. The following values
are possible:

* `STORAGE` - the client accesses the storage directly with a pre-signed URI,
  resulting in 1 round trip
* `COORDINATOR_STORAGE_REDIRECT` - the client accesses the coordinator and is
  redirected to the storage with a pre-signed URI, resulting in 2 round trips
* `COORDINATOR_PROXY` - the client accesses the coordinator and retrieves
  segment data through it, resulting in 1 round trip per segment
* `WORKER_PROXY` - the client request is proxied through a worker, which
  retrieves the segment data from storage, reducing load on the coordinator

### `protocol.spooling.initial-segment-size`

- **Type:** [](prop-type-data-size)
- **Default value:** `8MB`

Initial size of the spooled segments.

### `protocol.spooling.maximum-segment-size`

- **Type:** [](prop-type-data-size)
- **Default value:** `16MB`

Maximum size per segment.

### `protocol.spooling.inlining.enabled`

- **Type:** [](prop-type-boolean)
- **Default value:** `true`

Allow the spooled protocol to inline data.


### `protocol.spooling.inlining.max-rows`

- **Type:** [](prop-type-integer)
- **Default value:** `1000`

Maximum number of rows that are allowed to be inlined per worker.


### `protocol.spooling.inlining.max-size`

- **Type:** [](prop-type-data-size)
- **Default value:** `128kB`

Maximum size of rows that are allowed to be inlined per worker.
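
Taken together, a minimal sketch of the spooling-related entries in
`config.properties` might look like the following. The key value is a
placeholder, and the size values shown simply repeat the defaults:

```properties
protocol.spooling.enabled=true
# Placeholder - replace with a generated 256-bit, base64-encoded key
protocol.spooling.shared-secret-key=<base64-encoded-key>
protocol.spooling.retrieval-mode=STORAGE
protocol.spooling.initial-segment-size=8MB
protocol.spooling.maximum-segment-size=16MB
```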

(prop-spooling-filesystem)=
## Spooling filesystem

The following properties are used to configure the object storage used with the
[](protocol-spooling).

### `fs.azure.enabled`

- **Type:** [](prop-type-boolean)
- **Default value:** `false`

Enable support for Azure object storage for spooling. Only one object storage
system can be enabled at a time, exclusive of the others.


### `fs.s3.enabled`

- **Type:** [](prop-type-boolean)
- **Default value:** `false`

Enable support for S3 object storage for spooling. Only one object storage
system can be enabled at a time, exclusive of the others.

### `fs.gcs.enabled`

- **Type:** [](prop-type-boolean)
- **Default value:** `false`

Enable support for Google Cloud Storage for spooling. Only one object storage
system can be enabled at a time, exclusive of the others.

### `fs.location`

- **Type:** [](prop-type-string)

URI of the object storage location used for spooled segments, including a
scheme, for example `s3://example-bucket/`. The format is similar to [table
locations in the Delta Lake
connector](https://trino.io/docs/current/connector/delta-lake.html#register-table).

### `fs.layout`

- **Type:** [](prop-type-string)

File system layout used for storing spooled segments. Possible values are
`SIMPLE` and `PARTITIONED`.

### `fs.segment.ttl`

- **Type:** [](prop-type-duration)

Maximum duration for the client to retrieve a spooled segment before it
expires.

### `fs.segment.direct.ttl`

- **Type:** [](prop-type-duration)

Maximum duration for the client to retrieve a spooled segment from the direct
URI before it expires.

### `fs.segment.encryption`

- **Type:** [](prop-type-boolean)
- **Default value:** `true`

Encrypt spooled segments with ephemeral keys.

### `fs.segment.explicit-ack`

- **Type:** [](prop-type-boolean)
- **Default value:** `true`

Enable deletion of spooled segments upon client acknowledgment of receipt.

### `fs.segment.pruning.enabled`

- **Type:** [](prop-type-boolean)
- **Default value:** `true`

Periodically prune expired segments.

### `fs.segment.pruning.interval`

- **Type:** [](prop-type-duration)

Interval between pruning runs for expired segments.

### `fs.segment.pruning.batch-size`

- **Type:** [](prop-type-integer)

Number of expired segments to prune in each batch.
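
As an illustration, a spooling file system configuration for S3 could combine
these properties as follows. This is a sketch only: the
`etc/spooling-manager.properties` file name, the `spooling-manager.name` entry,
and all values are assumptions for this example:

```properties
spooling-manager.name=filesystem
fs.s3.enabled=true
fs.location=s3://example-spooling-bucket/
fs.segment.ttl=12m
fs.segment.pruning.enabled=true
fs.segment.pruning.interval=5m
```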


(prop-protocol-v1)=
## v1 protocol

The following properties are related to the [](protocol-direct).

### `protocol.v1.prepared-statement-compression.length-threshold`

- **Type:** [](prop-type-integer)
- **Default value:** `2048`

Prepared statements that are submitted to Trino for processing, and are longer
than the value of this property, are compressed for transport via the HTTP
header to improve handling, and to avoid failures due to hitting HTTP header
size limits.

### `protocol.v1.prepared-statement-compression.min-gain`

- **Type:** [](prop-type-integer)
- **Default value:** `512`

Prepared statement compression is not applied if the size gain is less than the
configured value. Smaller statements do not benefit from compression, and are
left uncompressed.
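
For example, the following sketch raises both thresholds so that only longer
statements with a significant size gain are compressed. The values are
illustrative only:

```properties
protocol.v1.prepared-statement-compression.length-threshold=4096
protocol.v1.prepared-statement-compression.min-gain=1024
```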

19 changes: 0 additions & 19 deletions docs/src/main/sphinx/admin/properties-general.md
@@ -33,25 +33,6 @@ across nodes in the cluster. It can be disabled, when it is known that the
output data set is not skewed, in order to avoid the overhead of hashing and
redistributing all the data across the network.

## `protocol.v1.prepared-statement-compression.length-threshold`

- **Type:** {ref}`prop-type-integer`
- **Default value:** `2048`

Prepared statements that are submitted to Trino for processing, and are longer
than the value of this property, are compressed for transport via the HTTP
header to improve handling, and to avoid failures due to hitting HTTP header
size limits.

## `protocol.v1.prepared-statement-compression.min-gain`

- **Type:** {ref}`prop-type-integer`
- **Default value:** `512`

Prepared statement compression is not applied if the size gain is less than the
configured value. Smaller statements do not benefit from compression, and are
left uncompressed.

(file-compression)=
## File compression and decompression

1 change: 1 addition & 0 deletions docs/src/main/sphinx/admin/properties.md
@@ -15,6 +15,7 @@ properties, refer to the {doc}`connector documentation </connector/>`.
:titlesonly: true
General <properties-general>
Client protocol <properties-client-protocol>
HTTP server <properties-http-server>
Resource management <properties-resource-management>
Query management <properties-query-management>
55 changes: 39 additions & 16 deletions docs/src/main/sphinx/client.md
@@ -1,27 +1,50 @@
# Clients

A [client](trino-concept-client) is used to send SQL queries to Trino, and
therefore to any [connected data sources](trino-concept-data-source), and to
receive results.

## Client drivers

Client drivers, also called client libraries, provide a mechanism for other
applications to connect to Trino. These applications are called client
applications and include your own custom applications or scripts. The Trino
project maintains the following client drivers:

* [Trino JDBC driver](/client/jdbc)
* [trino-go-client](https://github.com/trinodb/trino-go-client)
* [trino-js-client](https://github.com/trinodb/trino-js-client)
* [trino-python-client](https://github.com/trinodb/trino-python-client)

Other communities and vendors provide [other client
drivers](https://trino.io/ecosystem/client.html).

## Client applications

Client applications provide a user interface and other user-facing features to
run queries with Trino. You can inspect the results, perform analytics with
further queries, and create visualizations. Client applications typically use a
client driver.

The Trino project maintains the [Trino command line interface](/client/cli) as a
client application.

Other communities and vendors provide [numerous other client
applications](https://trino.io/ecosystem/client.html).

## Client protocol

All client drivers and client applications communicate with the Trino
coordinator using the [client protocol](/client/client-protocol).

Configure support for the [spooling protocol](protocol-spooling) on the cluster
to improve performance for client interactions with higher data transfer
demands.

```{toctree}
:maxdepth: 1
client/client-protocol
client/cli
client/jdbc
```

13 changes: 13 additions & 0 deletions docs/src/main/sphinx/client/cli.md
@@ -604,6 +604,19 @@ Query 20200707_170726_00030_2iup9 failed: line 1:25: Column 'region' cannot be r
SELECT nationkey, name, region FROM tpch.sf1.nation LIMIT 3
```

(cli-spooling-protocol)=
## Spooling protocol

The Trino CLI supports the use of the spooling protocol to improve performance
for client interactions with higher data transfer demands.

Configure the Trino cluster as detailed in [](protocol-spooling) and specify
the desired encoding with the `--encoding` option. Available values are
`json+zstd` (recommended) for JSON with Zstandard compression, `json+lz4` for
JSON with LZ4 compression, and `json` for uncompressed JSON.

The CLI process must have network access to the spooling object storage.
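
For example, a hypothetical invocation using the recommended encoding, with a
placeholder server URL:

```text
trino --server https://trino.example.com:8443 --encoding json+zstd
```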

(cli-output-format)=
## Output formats

