[doc](stmt) update add backends, cancel decommission, decommission backends, drop backends, modify backend, show backends and show frontends #1896

Merged
merged 13 commits into from
Jan 23, 2025
Original file line number Diff line number Diff line change
Expand Up @@ -26,47 +26,68 @@ under the License.

## Description

The ADD BACKEND command is used to add one or more backend nodes to a Doris OLAP database cluster. This command allows administrators to specify the host and port of the new backend nodes, along with optional properties that configure their behavior.
The ADD BACKEND statement is used to add one or more BE nodes to the Doris cluster. Administrators specify the host and heartbeat port of each new BE node, and can optionally set properties that configure its behavior.

grammar:
## Syntax

```sql
-- Add nodes (add this method if you do not use the multi-tenancy function)
ALTER SYSTEM ADD BACKEND "host:heartbeat_port"[,"host:heartbeat_port"...] [PROPERTIES ("key"="value", ...)];
ALTER SYSTEM ADD BACKEND "<host>:<heartbeat_port>"[,"<host>:<heartbeat_port>"...] [PROPERTIES ("<key>"="<value>", ...)]
```

### Parameters
## Required Parameters

* `host` can be a hostname or an ip address of the backend node while `heartbeat_port` is the heartbeat port of the node
* `PROPERTIES ("key"="value", ...)`: (Optional) A set of key-value pairs that define additional properties for the backend nodes. These properties can be used to customize the configuration of the backends being added. Available properties include:
**<host>**

* tag.location: Specifies the resource group where the backend node belongs. For example, PROPERTIES ("tag.location" = "groupb").
> It can be the hostname or IP address of the BE node.

## Example
**<heartbeat_port>**

1. Adding Backends Without Additional Properties
> The heartbeat port of the BE node. The default is 9050.

```sql
ALTER SYSTEM ADD BACKEND "host1:9020,host2:9020";
````
## Optional Parameters

This command adds two backend nodes to the cluster:
**1. `PROPERTIES ("<key>"="<value>" [, ... ] )`**

* host1 with port 9020
* host2 with port 9020
> A set of key-value pairs that define additional properties of the BE node being added. Available properties include:
> - `tag.location`: Used to specify the Resource Group to which the BE node belongs in the integrated storage and computing mode.
> - `tag.compute_group_name`: Used to specify the compute group to which the BE node belongs in the decoupling storage and computing mode.

No additional properties are specified, so the default settings will apply to these backends.
## Access Control Requirements

2. Adding Backends With Resource Group
The user executing this SQL must have at least the following permissions:

```sql
ALTER SYSTEM ADD BACKEND "host3:9020" PROPERTIES ("tag.location" = "groupb");
```
| Privilege | Object | Notes |
|-----------|----|-------|
| NODE_PRIV | | |
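
As a rough illustration of the requirement above, an administrator could grant `NODE_PRIV` to the account that manages BE nodes. This is a sketch only: the user `'ops_user'@'%'` is hypothetical, and it assumes the granting account is itself allowed to grant node privileges.

```sql
-- Hypothetical account name; NODE_PRIV is a global-level privilege, so it is granted on *.*.*
GRANT NODE_PRIV ON *.*.* TO 'ops_user'@'%';

-- Check the privileges of the account afterwards
SHOW GRANTS FOR 'ops_user'@'%';
```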

## Usage Notes

This command adds a single backend node (host3 with port 9020) to the cluster in resource group `groupb`:
1. Before adding a new BE node, make sure the node is correctly configured and running.
2. Using [Resource Group](../../../../admin-manual/workload-management/resource-group.md) can help you better manage and organize the BE nodes in the cluster.
3. When adding multiple BE nodes, you can specify them in one command to improve efficiency.
4. After adding the BE nodes, use [`SHOW BACKENDS`](./SHOW-BACKENDS.md) to verify that they have been added successfully and are in a normal state.
5. Consider adding BE nodes in different physical locations or racks to improve the availability and fault tolerance of the cluster.
6. Regularly check and balance the load in the cluster to ensure that the newly added BE nodes are properly utilized.
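
The add-then-verify step from the notes above can be sketched as follows. The addresses are illustrative, and the exact columns that report node health in the `SHOW BACKENDS` output may differ between versions.

```sql
-- Add two BE nodes in a single statement (addresses are illustrative)
ALTER SYSTEM ADD BACKEND "192.168.0.1:9050", "192.168.0.2:9050";

-- List all BE nodes and confirm the new ones appear and report a healthy state
SHOW BACKENDS;
```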

## Keywords
## Examples

1. Add BE nodes without additional properties
```sql
ALTER SYSTEM ADD BACKEND "192.168.0.1:9050", "192.168.0.2:9050";
```
This command adds two BE nodes to the cluster:
* 192.168.0.1, port 9050
* 192.168.0.2, port 9050
No additional properties are specified, so the default settings will be applied.

ALTER, SYSTEM, ADD, BACKEND, PROPERTIES
2. In the integrated storage and computing mode, add a BE node to a specified Resource Group
```sql
ALTER SYSTEM ADD BACKEND "doris-be01:9050" PROPERTIES ("tag.location" = "groupb");
```
This command adds a single BE node (hostname doris-be01, port 9050) to the Resource Group `groupb` in the cluster.

## Best Practice
3. In the decoupling storage and computing mode, add a BE node to a specified compute group
```sql
ALTER SYSTEM ADD BACKEND "192.168.0.3:9050" PROPERTIES ("tag.compute_group_name" = "cloud_groupc");
```
This command adds a single BE node (IP 192.168.0.3, port 9050) to the compute group `cloud_groupc` in the cluster.
Expand Up @@ -24,45 +24,65 @@ specific language governing permissions and limitations
under the License.
-->





## Description

This statement is used to undo a node offline operation. (Administrator only!)
This statement is used to cancel the decommissioning operation of a BE node.

grammar:
> This statement is not supported in decoupling storage and computing mode.

- Find backend through host and port
## Syntax

```sql
CANCEL DECOMMISSION BACKEND "host:heartbeat_port"[,"host:heartbeat_port"...];
CANCEL DECOMMISSION BACKEND "<be_identifier>" [, "<be_identifier>" ... ]
```

- Find backend through backend_id
Where:

```sql
CANCEL DECOMMISSION BACKEND "id1","id2","id3...";
be_identifier
: "<be_host>:<be_heartbeat_port>"
| "<backend_id>"
```

## Example
## Required Parameters

**<be_host>**

> It can be the hostname or IP address of the BE node.

**<be_heartbeat_port>**

> The heartbeat port of the BE node. The default is 9050.

**<backend_id>**

> The ID of the BE node.

:::tip
`<be_host>`, `<be_heartbeat_port>`, and `<backend_id>` can all be obtained by querying with the [SHOW BACKENDS](./SHOW-BACKENDS.md) statement.
:::

## Access Control Requirements

1. Cancel the offline operation of both nodes:
The user who executes this SQL must have at least the following permissions:

```sql
CANCEL DECOMMISSION BACKEND "host1:port", "host2:port";
```
| Privilege | Object | Notes |
|-----------|----|-------|
| NODE_PRIV | | |

2. Cancel the offline operation of the node with backend_id 1:

```sql
CANCEL DECOMMISSION BACKEND "1","2";
```
## Usage Notes

## Keywords
1. After executing this command, you can use the [SHOW BACKENDS](./SHOW-BACKENDS.md) statement to check the decommissioning status (the value of the `SystemDecommissioned` column is `false`) and confirm that the decommissioning has stopped (the value of the `TabletNum` column no longer decreases).
2. The cluster will gradually migrate tablets from other nodes back to the current BE, so that the number of tablets on each BE eventually becomes roughly equal.
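
A minimal sketch of the cancel-and-verify workflow described in the notes above; the address is illustrative.

```sql
-- Look up the Host, HeartbeatPort and BackendId of the node being decommissioned
SHOW BACKENDS;

-- Cancel the decommission by host and heartbeat port
CANCEL DECOMMISSION BACKEND "192.168.0.1:9050";

-- Re-check: SystemDecommissioned should be false and TabletNum should stop decreasing
SHOW BACKENDS;
```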

CANCEL, DECOMMISSION, CANCEL ALTER
## Examples

## Best Practice
1. Cancel the decommissioning of two nodes according to the Host and HeartbeatPort of the BE.
```sql
CANCEL DECOMMISSION BACKEND "192.168.0.1:9050", "192.168.0.2:9050";
```

2. Cancel the decommissioning of one node according to the ID of the BE.
```sql
CANCEL DECOMMISSION BACKEND "10002";
```
Expand Up @@ -24,50 +24,70 @@ specific language governing permissions and limitations
under the License.
-->





## Description

The node offline operation is used to safely log off the node. The operation is asynchronous. If successful, the node is eventually removed from the metadata. If it fails, the logout will not be done (only for admins!)

grammar:
This statement is used to safely decommission a BE node from the cluster. This operation is asynchronous.

- Find backend through host and port
## Syntax

```sql
ALTER SYSTEM DECOMMISSION BACKEND "host:heartbeat_port"[,"host:heartbeat_port"...];
ALTER SYSTEM DECOMMISSION BACKEND "<be_identifier>" [, "<be_identifier>" ... ]
```

- Find backend through backend_id
Where:

```sql
ALTER SYSTEM DECOMMISSION BACKEND "id1","id2"...;
be_identifier
: "<be_host>:<be_heartbeat_port>"
| "<backend_id>"
```

illustrate:
## Required Parameters

1. host can be a hostname or an ip address
2. heartbeat_port is the heartbeat port of the node
3. The node offline operation is used to safely log off the node. The operation is asynchronous. If successful, the node is eventually removed from the metadata. If it fails, the logout will not be completed.
4. You can manually cancel the node offline operation. See CANCEL DECOMMISSION
**<be_host>**

## Example
> It can be the hostname or IP address of the BE node.

1. Offline two nodes
**<be_heartbeat_port>**

```sql
ALTER SYSTEM DECOMMISSION BACKEND "host1:port", "host2:port";
```
> The heartbeat port of the BE node. The default is 9050.

```sql
ALTER SYSTEM DECOMMISSION BACKEND "id1", "id2";
```
**<backend_id>**

## Keywords
> The ID of the BE node.

ALTER, SYSTEM, DECOMMISSION, BACKEND, ALTER SYSTEM
:::tip
`<be_host>`, `<be_heartbeat_port>`, and `<backend_id>` can all be obtained by querying with the [SHOW BACKENDS](./SHOW-BACKENDS.md) statement.
:::

## Best Practice
## Access Control Requirements

The user who executes this SQL must have at least the following permissions:

| Privilege | Object | Notes |
|-----------|----|-------|
| NODE_PRIV | | |

## Usage Notes

1. After executing this command, you can use the [SHOW BACKENDS](./SHOW-BACKENDS.md) statement to view the decommissioning status (the value of the `SystemDecommissioned` column is `true`) and the decommissioning progress (the value of the `TabletNum` column slowly drops to 0).
2. Under normal circumstances, the BE node is deleted after the value of the `TabletNum` column drops to 0. If you do not want Doris to delete the BE automatically, set the FE Master configuration `drop_backend_after_decommission` to `false`.
3. If the current BE stores a relatively large amount of data, the DECOMMISSION operation may last for several hours or even days.
4. If the DECOMMISSION operation gets stuck, that is, the `TabletNum` column in the output of [SHOW BACKENDS](./SHOW-BACKENDS.md) stays fixed at a certain value, the cause may be one of the following:
    - No other BE is suitable to receive the tablets from the current BE. For example, in a 3-node cluster with a 3-replica table, decommissioning one node leaves nowhere to migrate its data, because the other two BEs already hold one replica each.
    - The tablets on the current BE are still in the [Recycle Bin](../../recycle/SHOW-CATALOG-RECYCLE-BIN.md). You can [empty the recycle bin](../../recycle/DROP-CATALOG-RECYCLE-BIN.md) and then wait for the decommissioning to complete.
    - A tablet on the current BE is too large, so its migration always times out and the tablet cannot be moved away. You can increase the FE Master configuration `max_clone_task_timeout_sec` (the default is 7200 seconds).
    - There are unfinished transactions on the tablets of the current BE. You can wait for the transactions to complete or abort them manually.
    - In other cases, search the FE Master logs for the keyword `replicas to decommission` to find the abnormal tablet, use the [SHOW TABLET](../../table-and-view/data-and-status-management/SHOW-TABLET.md) statement to find the table to which the tablet belongs, create a new table, migrate the data from the old table to the new table, and finally use [DROP TABLE FORCE](../../table-and-view/table/DROP-TABLE.md) to delete the old table.
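
The checks and adjustments in the notes above might look like the following. This is a sketch, not a prescribed procedure: it assumes both FE configuration items can be changed at runtime with `ADMIN SET FRONTEND CONFIG` in your version, and the timeout value and tablet ID are placeholders.

```sql
-- Watch SystemDecommissioned and TabletNum for the node being decommissioned
SHOW BACKENDS;

-- Keep the BE in the cluster after its tablets are migrated (run against the FE Master)
ADMIN SET FRONTEND CONFIG ("drop_backend_after_decommission" = "false");

-- Allow more time for cloning large tablets; 14400 is an arbitrary example value
ADMIN SET FRONTEND CONFIG ("max_clone_task_timeout_sec" = "14400");

-- Check whether dropped objects in the recycle bin still hold tablets
SHOW CATALOG RECYCLE BIN;

-- Find the table that owns a stuck tablet; 10145 is a placeholder tablet ID
SHOW TABLET 10145;
```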

## Examples

1. Safely decommission two nodes from the cluster according to the Host and HeartbeatPort of the BE.
```sql
ALTER SYSTEM DECOMMISSION BACKEND "192.168.0.1:9050", "192.168.0.2:9050";
```

2. Safely decommission a node from the cluster according to the ID of the BE.
```sql
ALTER SYSTEM DECOMMISSION BACKEND "10002";
```
Expand Up @@ -24,47 +24,73 @@ specific language governing permissions and limitations
under the License.
-->




## Description

This statement is used to delete the BACKEND node (administrator only!)
This statement is used to remove BE nodes from the Doris cluster.

grammar:

- Find backend through host and port
## Syntax

```sql
ALTER SYSTEM DROP BACKEND "host:heartbeat_port"[,"host:heartbeat_port"...]
ALTER SYSTEM DROP BACKEND "<be_identifier>" [, "<be_identifier>" ... ]
```
- Find backend through backend_id

Where:

```sql
ALTER SYSTEM DROP BACKEND "id1","id2"...;
be_identifier
: "<be_host>:<be_heartbeat_port>"
| "<backend_id>"
```

illustrate:
## Required Parameters

1. host can be a hostname or an ip address
2. heartbeat_port is the heartbeat port of the node
3. Adding and deleting nodes is a synchronous operation. These two operations do not consider the existing data on the node, and the node is directly deleted from the metadata, please use it with caution.
**<be_host>**

## Example
> It can be the hostname or IP address of the BE node.

1. Delete two nodes
**<be_heartbeat_port>**

```sql
ALTER SYSTEM DROP BACKEND "host1:port", "host2:port";
```
> The heartbeat port of the BE node. The default is 9050.

```sql
ALTER SYSTEM DROP BACKEND "ids1", "ids2";
```
**<backend_id>**

## Keywords
> The ID of the BE node.

ALTER, SYSTEM, DROP, BACKEND, ALTER SYSTEM
:::tip
`<be_host>`, `<be_heartbeat_port>`, and `<backend_id>` can all be obtained by querying with the [SHOW BACKENDS](./SHOW-BACKENDS.md) statement.
:::

## Best Practice
## Access Control Requirements

The user who executes this SQL must have at least the following permissions:

| Privilege | Object | Notes |
|-----------|----|-------|
| NODE_PRIV | | |

## Usage Notes

1. It is not recommended to use this command to take a BE node offline. This command will directly remove the BE node from the cluster. The data on the current node will not be load-balanced to other BE nodes. Data loss may occur if there are single-replica tables in the cluster. A better approach is to use the [DECOMMISSION BACKEND](./DECOMMISSION-BACKEND.md) command to gracefully take the BE node offline.
2. Because this is a high-risk operation, Doris rejects the command when you run it directly:
```sql
ALTER SYSTEM DROP BACKEND "127.0.0.1:9050";
```
```text
ERROR 1105 (HY000): errCode = 2, detailMessage = It is highly NOT RECOMMENDED to use DROP BACKEND stmt.It is not safe to directly drop a backend. All data on this backend will be discarded permanently. If you insist, use DROPP instead of DROP
```
The above prompt message will appear. If you understand what you are doing, you can replace the `DROP` keyword with `DROPP` and continue:
```sql
ALTER SYSTEM DROPP BACKEND "127.0.0.1:9050";
```
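
For comparison, the safer path recommended in note 1 can be sketched as follows; the address is illustrative.

```sql
-- Gracefully migrate tablets off the node instead of dropping it directly
ALTER SYSTEM DECOMMISSION BACKEND "127.0.0.1:9050";

-- Monitor progress: TabletNum for this node should fall towards 0
SHOW BACKENDS;
```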

## Examples

1. Remove two nodes from the cluster based on the Host and HeartbeatPort of the BE nodes:
```sql
ALTER SYSTEM DROPP BACKEND "192.168.0.1:9050", "192.168.0.2:9050";
```

2. Remove one node from the cluster based on the ID of the BE node:
```sql
ALTER SYSTEM DROPP BACKEND "10002";
```