New Table Model Deployment and Operations Document

apache · Dec 10, 2024 · commit e59a294
1 parent 75324c7

Showing 12 changed files with 3,432 additions and 0 deletions.
diff --git a/...UserGuide/Master/Table/Deployment-and-Maintenance/Cluster-Deployment_timecho.md b/...UserGuide/Master/Table/Deployment-and-Maintenance/Cluster-Deployment_timecho.md
diff --git a/src/UserGuide/Master/Table/Deployment-and-Maintenance/Database-Resources.md b/src/UserGuide/Master/Table/Deployment-and-Maintenance/Database-Resources.md
@@ -0,0 +1,194 @@
+<!--
+
+    Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+    
+        http://www.apache.org/licenses/LICENSE-2.0
+    
+    Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+-->
+# Database Resources
+## CPU
+<table style="text-align: center;">
+      <tr>
+            <th rowspan="2">Number of time series (frequency ≤ 1 Hz)</th>
+            <th rowspan="2">CPU</th>        
+            <th colspan="3">Number of nodes</th>
+      </tr>
+      <tr>
+      <th>Standalone</th>   
+      <th>Dual-active</th> 
+      <th>Distributed</th> 
+      </tr>
+      <tr>
+            <td>Within 100000</td>
+            <td>2core-4core</td>
+            <td>1</td>
+            <td>2</td>
+            <td>3</td>
+      </tr>
+      <tr>
+            <td>Within 300000</td>
+            <td>4core-8core</td>
+            <td>1</td>
+            <td>2</td>
+            <td>3</td>
+      </tr>
+      <tr>
+            <td>Within 500000</td>
+            <td>8core-16core</td>
+            <td>1</td>
+            <td>2</td>
+            <td>3</td>
+      </tr>
+      <tr>
+            <td>Within 1000000</td>
+            <td>16core-32core</td>
+            <td>1</td>
+            <td>2</td>
+            <td>3</td>
+      </tr>
+      <tr>
+            <td>Within 2000000</td>
+            <td>32core-48core</td>
+            <td>1</td>
+            <td>2</td>
+            <td>3</td>
+      </tr>
+      <tr>
+            <td>Within 10000000</td>
+            <td>48core</td>
+            <td>1</td>
+            <td>2</td>
+            <td>Please contact Timecho Business for consultation</td>
+      </tr>
+      <tr>
+            <td>Over 10000000</td>
+            <td colspan="4">Please contact Timecho Business for consultation</td>
+      </tr>
+</table>
+
+## Memory 
+<table style="text-align: center;">
+      <tr>
+            <th rowspan="2">Number of time series (frequency ≤ 1 Hz)</th>
+            <th rowspan="2">Memory</th>        
+            <th colspan="3">Number of nodes</th>
+      </tr>
+      <tr>
+      <th>Standalone</th>   
+      <th>Dual-active</th> 
+      <th>Distributed</th> 
+      </tr>
+      <tr>
+            <td>Within 100000</td>
+            <td>4G-8G</td>
+            <td>1</td>
+            <td>2</td>
+            <td>3</td>
+      </tr>
+      <tr>
+            <td>Within 300000</td>
+            <td>12G-32G</td>
+            <td>1</td>
+            <td>2</td>
+            <td>3</td>
+      </tr>
+      <tr>
+            <td>Within 500000</td>
+            <td>24G-48G</td>
+            <td>1</td>
+            <td>2</td>
+            <td>3</td>
+      </tr>
+      <tr>
+            <td>Within 1000000</td>
+            <td>32G-96G</td>
+            <td>1</td>
+            <td>2</td>
+            <td>3</td>
+      </tr>
+      <tr>
+            <td>Within 2000000</td>
+            <td>64G-128G</td>
+            <td>1</td>
+            <td>2</td>
+            <td>3</td>
+      </tr>
+      <tr>
+            <td>Within 10000000</td>
+            <td>128G</td>
+            <td>1</td>
+            <td>2</td>
+            <td>Please contact Timecho Business for consultation</td>
+      </tr>
+      <tr>
+            <td>Over 10000000</td>
+            <td colspan="4">Please contact Timecho Business for consultation</td>
+      </tr>
+</table>
+
+## Storage (Disk)
+### Storage space
+Calculation formula: Number of measurement points × Sampling frequency (Hz) × Size of each data point (bytes; varies by data type, see the table below) × Storage duration (seconds) × Number of replicas (usually 1 for a single node and 2 for a cluster) ÷ Compression ratio (5 to 10 is a reasonable estimate; actual ratios may be higher)
+<table style="text-align: center;">
+      <tr>
+            <th colspan="4">Data point size calculation</th>
+      </tr>
+      <tr>
+            <th>Data type</th>   
+            <th>Timestamp (bytes)</th> 
+            <th>Value (bytes)</th> 
+            <th>Total size of a data point (bytes)</th> 
+      </tr>
+      <tr>
+            <td>Boolean</td>
+            <td>8</td>
+            <td>1</td>
+            <td>9</td>
+      </tr>
+      <tr>
+            <td> INT32/FLOAT</td>
+            <td>8</td>
+            <td>4</td>
+            <td>12</td>
+      </tr>
+      <tr>
+            <td>INT64/DOUBLE</td>
+            <td>8</td>
+            <td>8</td>
+            <td>16</td>
+      </tr>
+      <tr>
+            <td>TEXT</td>
+            <td>8</td>
+            <td>a (average length)</td>
+            <td>8+a</td>
+      </tr>
+</table>
+
+Example: 1000 devices, each with 100 measurement points, for a total of 100,000 time series of type INT32. The sampling frequency is 1 Hz (once per second), data is kept for 1 year, and 3 replicas are stored.
+- Complete calculation formula: 1000 devices × 100 measurement points × 12 bytes per data point × 86400 seconds per day × 365 days per year × 3 replicas ÷ 10 compression ratio ≈ 11 TB
+- Simplified calculation formula: 1000 × 100 × 12 × 86400 × 365 × 3 ÷ 10 ≈ 11 TB
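
The storage formula and the worked example above can be reproduced with a small shell function. This is a minimal sketch rather than an IoTDB tool: the function name `estimate_storage_bytes` and its argument order are hypothetical, and it assumes a shell with 64-bit integer arithmetic.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of IoTDB): estimates the required storage in bytes.
# Arguments: points, frequency_hz, bytes_per_point, retention_seconds,
#            replicas, compression_ratio (all integers).
estimate_storage_bytes() {
    local points=$1 freq_hz=$2 point_bytes=$3 seconds=$4 replicas=$5 ratio=$6
    echo $(( points * freq_hz * point_bytes * seconds * replicas / ratio ))
}

# Worked example from the text: 1000 devices x 100 points, INT32 (12 bytes),
# 1 Hz, 1 year, 3 replicas, compression ratio 10.
bytes=$(estimate_storage_bytes $((1000 * 100)) 1 12 $((86400 * 365)) 3 10)
echo "Estimated storage: ${bytes} bytes (~$(( bytes / 1000000000000 )) TB)"
```

With these inputs the estimate is 11352960000000 bytes, which matches the figure of roughly 11 TB.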
+### Storage Configuration
+If the number of time series exceeds 10000000 or the query load is high, SSDs are recommended.
+## Network (Network card)
+If the write throughput is below 10 million points per second, a 1Gbps network card is sufficient. If the write throughput reaches or exceeds 10 million points per second, configure a 10Gbps network card.
+
+| **Write throughput (data points per second)** | **NIC rate** |
+| ------------------- | ------------- |
+| <10 million | 1Gbps |
+| >=10 million | 10Gbps |
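
The table above amounts to a single threshold check. The sketch below only mirrors that table; the function name `nic_rate_for_throughput` is hypothetical and not part of IoTDB.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the table above.
# Usage: nic_rate_for_throughput <write throughput in data points per second>
nic_rate_for_throughput() {
    local points_per_second=$1
    if [ "$points_per_second" -lt 10000000 ]; then
        echo "1Gbps"
    else
        echo "10Gbps"
    fi
}

nic_rate_for_throughput 2000000     # prints 1Gbps
nic_rate_for_throughput 10000000    # prints 10Gbps
```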
+## Other instructions
+IoTDB can scale out a cluster within seconds, and adding nodes does not require data migration. You therefore do not need to worry that a cluster sized for the existing data will run out of capacity: new nodes can be added whenever more capacity is needed.
diff --git a/src/UserGuide/Master/Table/Deployment-and-Maintenance/Environment-Requirements.md b/src/UserGuide/Master/Table/Deployment-and-Maintenance/Environment-Requirements.md
@@ -0,0 +1,191 @@
+<!--
+
+    Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+    
+        http://www.apache.org/licenses/LICENSE-2.0
+    
+    Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+-->
+# System Requirements
+
+## Disk Array
+
+### Configuration Suggestions
+
+IoTDB places no strict requirements on disk array configuration. Storing IoTDB data on multiple disk arrays is recommended so that writes can proceed concurrently across them. The following suggestions can serve as a reference:
+
+1. Physical environment
+    - System disk: Use two disks in RAID1. Size it only for the operating system itself; do not reserve system disk space for IoTDB.
+    - Data disk:
+        - RAID is recommended to protect the data on the disks.
+        - Provide multiple disks (1-6) or disk groups for IoTDB. (Combining all disks into a single disk array is not recommended, because it limits IoTDB's maximum performance.)
+2. Virtual environment
+    - Mount multiple hard disks (1-6).
+
+### Configuration Example
+
+- Example 1: Four 3.5-inch hard disks
+
+The server has only a few hard disks, so configure RAID5 directly.
+The recommended configuration is as follows:
+
+| **Usage** | **RAID type** | **Number of disks** | **Redundancy** | **Usable capacity (disks)** |
+| ----------- | -------- | -------- | --------- | -------- |
+| system/data disk | RAID5 | 4 | 1 (one disk is allowed to fail) | 3 |
+
+- Example 2: Twelve 3.5-inch hard disks
+
+The server is configured with twelve 3.5-inch disks.
+Two disks are recommended as RAID1 system disks. The remaining ten data disks can be divided into two RAID5 groups of five disks, each providing the usable capacity of four disks.
+The recommended configuration is as follows:
+
+| **Usage** | **RAID type** | **Number of disks** | **Redundancy** | **Usable capacity (disks)** |
+| -------- | -------- | -------- | --------- | -------- |
+| system disk   | RAID1    | 2        | 1 | 1        |
+| data disk   | RAID5    | 5        | 1 | 4        |
+| data disk   | RAID5    | 5        | 1 | 4        |
+
+- Example 3: Twenty-four 2.5-inch hard disks
+
+The server is configured with twenty-four 2.5-inch disks.
+Two disks are recommended as RAID1 system disks. The remaining 22 disks can be divided into three RAID5 groups of seven disks, each providing the usable capacity of six disks. The one remaining disk can be left idle or used to store write-ahead logs.
+The recommended configuration is as follows:
+
+| **Usage** | **RAID type** | **Number of disks** | **Redundancy** | **Usable capacity (disks)** |
+| -------- | -------- | -------- | --------- | -------- |
+| system disk   | RAID1    | 2        | 1 | 1        |
+| data disk   | RAID5    | 7        | 1 | 6        |
+| data disk   | RAID5    | 7        | 1 | 6        |
+| data disk   | RAID5    | 7        | 1 | 6        |
+| data disk   | NoRaid   | 1        | 0 | 1        |
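
The capacity column in the three examples follows the usual RAID arithmetic: a RAID1 mirror yields the capacity of one disk, and a RAID5 group gives up one disk's worth of space to parity. The sketch below recomputes Example 3 under those rules; the function name `raid_usable_disks` and the `LEVEL:COUNT` notation are hypothetical conveniences, not IoTDB or RAID tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper: usable capacity (in whole disks) of one disk group.
# Usage: raid_usable_disks <RAID1|RAID5|NoRaid> <disk_count>
raid_usable_disks() {
    local level=$1 disks=$2
    case "$level" in
        RAID1)  echo 1 ;;                   # all disks mirror each other: capacity of one disk
        RAID5)  echo $(( disks - 1 )) ;;    # one disk's worth of space holds parity
        NoRaid) echo "$disks" ;;            # no redundancy: every disk is usable
        *)      echo "unsupported RAID level: $level" >&2; return 1 ;;
    esac
}

# Example 3 from the text: 24 disks = RAID1(2) + 3 x RAID5(7) + 1 standalone disk.
total=0
for group in RAID1:2 RAID5:7 RAID5:7 RAID5:7 NoRaid:1; do
    usable=$(raid_usable_disks "${group%%:*}" "${group##*:}")
    total=$(( total + usable ))
done
echo "Usable capacity: ${total} of 24 disks"
```

For Example 3 this reports 20 usable disks out of 24, of which one belongs to the system disk group.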
+
+## Operating System
+
+### Version Requirements
+
+IoTDB supports operating systems such as Linux, Windows, and macOS. The enterprise version additionally supports Chinese domestic CPUs such as Loongson, Phytium, and Kunpeng, as well as Chinese domestic server operating systems such as Neokylin, KylinOS, UOS, and Linx.
+
+### Disk Partition
+
+- The default standard partition mode is recommended. LVM extension and hard disk encryption are not recommended.
+- The system disk needs only the space used by the operating system; no space needs to be reserved for IoTDB.
+- Each disk group corresponds to exactly one partition. Data disks (multiple disk groups, one per RAID group) need no additional partitions; all of their space is used by IoTDB.
+
+The following table lists the recommended disk partitioning scheme.
+
+<table>
+      <tr>
+            <th>Disk classification</th>
+            <th>Disk group</th>        
+            <th>Mount point</th>
+            <th>Capacity</th>
+            <th>File system type</th>
+      </tr>
+    <tr>
+            <td rowspan="2">System disk</td>
+            <td rowspan="2">Disk group 0</td> 
+            <td>/boot</td>  
+            <td>1 GB</td> 
+            <td>Default</td> 
+      </tr>
+      <tr>
+            <td>/</td>  
+            <td>Remaining space of the disk group</td> 
+            <td>Default</td> 
+      </tr>
+      <tr>
+            <td rowspan="3">Data disk</td>
+            <td>Disk group 1</td> 
+            <td>/data1</td>  
+            <td>Full space of disk group 1</td> 
+            <td>Default</td> 
+      </tr>
+      <tr>
+            <td>Disk group 2</td> 
+            <td>/data2</td>  
+            <td>Full space of disk group 2</td> 
+            <td>Default</td> 
+      </tr>
+      <tr>
+            <td colspan="4">......</td>   
+      </tr>
+</table>
+
+### Network Configuration
+
+1. Disable the firewall
+
+```Bash
+# Check the firewall status
+systemctl status firewalld
+# Stop the firewall for the current session
+systemctl stop firewalld
+# Keep the firewall disabled after a reboot
+systemctl disable firewalld
+```
+2. Ensure that the required port is not occupied
+
+(1) Check the ports used by the cluster: in the default cluster configuration, ConfigNode uses ports 10710 and 10720, and DataNode uses ports 6667, 10730, 10740, 10750, 10760, 9090, 9190, and 3000. Ensure that none of these ports is occupied. They can be checked as follows:
+
+```Bash
+# Use either lsof or netstat to check each port
+lsof -i:6667     # or: netstat -tunlp | grep 6667
+lsof -i:10710    # or: netstat -tunlp | grep 10710
+lsof -i:10720    # or: netstat -tunlp | grep 10720
+# If a command prints any output, the port is occupied.
+```
+
+(2) Check the port used by the cluster deployment tool: when the cluster management tool opskit is used to install and deploy the cluster, the SSH remote connection service must be enabled and port 22 must be open.
+
+```Bash
+yum install openssh-server # Install the SSH service
+systemctl start sshd # Start the SSH service, which listens on port 22 by default
+```
+
+3. Ensure that the servers can reach each other over the network
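
The per-port check from step 2 can be wrapped in a loop over all default ports listed there. This is a minimal sketch, assuming `lsof` is installed; the variable `IOTDB_PORTS` and the function `check_iotdb_ports` are hypothetical names, and the port list must be adjusted if the default configuration has been changed.

```shell
#!/usr/bin/env bash
# Default ports named in the text: ConfigNode first, then DataNode.
IOTDB_PORTS="10710 10720 6667 10730 10740 10750 10760 9090 9190 3000"

# Hypothetical helper: prints one line per port and returns non-zero
# if any port already has a listening TCP socket.
check_iotdb_ports() {
    local port busy=0
    if ! command -v lsof >/dev/null 2>&1; then
        echo "lsof not found" >&2
        return 2
    fi
    for port in $IOTDB_PORTS; do
        if lsof -nP -iTCP:"$port" -sTCP:LISTEN >/dev/null 2>&1; then
            echo "port $port: occupied"
            busy=1
        else
            echo "port $port: free"
        fi
    done
    return "$busy"
}

# Usage (run as root so that sockets owned by other users are visible):
#   check_iotdb_ports || echo "Free the occupied ports before starting IoTDB"
```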
+
+### Other Configuration
+
+1. Disable the system swap memory
+
+```Bash
+echo "vm.swappiness = 0" >> /etc/sysctl.conf
+# Running swapoff -a and swapon -a together moves the data held in swap back into memory and leaves swap empty.
+# Do not skip the swappiness setting and run only swapoff -a: swap is enabled again automatically after a restart, which undoes the change.
+swapoff -a && swapon -a
+# Apply the configuration without restarting.
+sysctl -p
+# Check memory usage; used swap is expected to be 0
+free -m
+```
+2. Set the maximum number of open files to 65535 to avoid "too many open files" errors.
+
+```Bash
+# View the current limit
+ulimit -n
+# Change the limit temporarily (current session only)
+ulimit -n 65535
+# Change the limit permanently
+echo "* soft nofile 65535" >>  /etc/security/limits.conf
+echo "* hard nofile 65535" >>  /etc/security/limits.conf
+# After logging out and opening a new terminal session, check again; the expected output is 65535
+ulimit -n
+```
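
Both settings above can be verified after the change. The sketch below is an illustration for a Linux host with an English-locale `free`; the function names `nofile_ok` and `swap_used_mb` are hypothetical and not part of IoTDB.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if an open-files limit meets the target.
# Usage: nofile_ok <limit> [target]   (limit may be the word "unlimited")
nofile_ok() {
    local limit=$1 target=${2:-65535}
    [ "$limit" = "unlimited" ] || [ "$limit" -ge "$target" ]
}

# Hypothetical helper: reads `free -m` output on stdin and prints the
# "used" column of the Swap line (in MB).
swap_used_mb() {
    awk '/^Swap:/ { print $3 }'
}

# Usage on a Linux host:
#   nofile_ok "$(ulimit -n)" || echo "raise the nofile limit to 65535"
#   [ "$(free -m | swap_used_mb)" = "0" ] || echo "swap is still in use"
#   cat /proc/sys/vm/swappiness    # expected: 0
```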
+## Software Dependencies
+
+Install a Java runtime environment (Java version >= 1.8) and make sure the JDK environment variables are set. (JDK 17 is recommended for V1.3.2.2 or later. In some scenarios, earlier JDK versions show degraded performance and DataNodes cannot be stopped.)
+
+```Bash
+# Example: installing JDK 17 on CentOS 7
+tar -zxvf jdk-17_linux-x64_bin.tar.gz # Decompress the JDK archive
+vim ~/.bashrc # Open ~/.bashrc and add the following two lines
+#   export JAVA_HOME=/usr/lib/jvm/jdk-17.0.9
+#   export PATH=$JAVA_HOME/bin:$PATH
+# JAVA_HOME must point to the directory into which the JDK was extracted
+source ~/.bashrc # Apply the configuration
+java -version # Verify the JDK installation
+```
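
To check the version requirement in a script, the major version can be derived from the version string that `java -version` reports. Java 8 and earlier use the legacy `1.x` scheme (for example `1.8.0_392`), while later releases start with the major number (for example `17.0.9`). The function name `java_major_version` below is hypothetical; this is a sketch, not an IoTDB utility.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the major Java version for a version string.
# "1.8.0_392" -> 8 (legacy 1.x scheme), "17.0.9" -> 17.
java_major_version() {
    local version=$1
    case "$version" in
        1.*) version=${version#1.} ;;   # drop the legacy "1." prefix
    esac
    echo "${version%%[._-]*}"           # keep everything before the first separator
}

# Usage: `java -version` writes to stderr; the version is the quoted token
# on its first line.
#   v=$(java -version 2>&1 | awk -F '"' 'NR == 1 { print $2 }')
#   [ "$(java_major_version "$v")" -ge 8 ] || echo "Java 8 or later is required"
```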