diff --git a/.resources/user_guide/tnet/goroutine_per_connection.png b/.resources/user_guide/tnet/goroutine_per_connection.png new file mode 100644 index 0000000..e89243c Binary files /dev/null and b/.resources/user_guide/tnet/goroutine_per_connection.png differ diff --git a/.resources/user_guide/tnet/goroutine_per_connection_zh_CN.png b/.resources/user_guide/tnet/goroutine_per_connection_zh_CN.png new file mode 100644 index 0000000..34bf585 Binary files /dev/null and b/.resources/user_guide/tnet/goroutine_per_connection_zh_CN.png differ diff --git a/.resources/user_guide/tnet/reactor.png b/.resources/user_guide/tnet/reactor.png new file mode 100644 index 0000000..5f833fc Binary files /dev/null and b/.resources/user_guide/tnet/reactor.png differ diff --git a/docs/user_guide/tnet.md b/docs/user_guide/tnet.md new file mode 100644 index 0000000..8584d28 --- /dev/null +++ b/docs/user_guide/tnet.md @@ -0,0 +1,185 @@ +# Using the high-performance networking framework tnet with tRPC-Go + +English | [中文](./tnet_zh_CN.md) + +## Introduction + +Go's net library provides a simple blocking-style API (the runtime's netpoller performs the non-blocking I/O under the hood), and its network model is `Goroutine-per-Connection`. In most scenarios, this model is straightforward and user-friendly. However, when dealing with hundreds of thousands or even millions of connections, allocating a goroutine for each connection consumes a significant amount of memory, and managing that many goroutines becomes challenging. + +To handle millions of connections, it is essential to break away from the `Goroutine-per-Connection` model. The high-performance networking library [tnet](https://github.com/trpc-group/tnet) is built on a `Reactor` network model and can serve millions of connections. The tRPC-Go framework integrates tnet and thereby supports million-connection workloads.
In addition, tnet offers batch packet transmission and reception, zero-copy buffers, and fine-grained memory management optimizations, which allow it to outperform Go's native net library. + +## Principle + +We use two diagrams to illustrate the basic principles of the `Goroutine-per-Connection` model and the `Reactor` model in Go. + +### Goroutine-per-Connection + +![goroutine_per_connection](/.resources/user_guide/tnet/goroutine_per_connection.png) + +In the Goroutine-per-Connection model, when the server accepts a new connection, it creates a goroutine for that connection; that goroutine then reads data from the connection, processes it, and sends the response back on the same connection. + +Million-connection workloads usually involve long-lived connections: although the total number of connections is huge, only a small fraction of them are active at any given time. An active connection is one that has data to read or write at a given moment; a connection with nothing to read or write is called an idle connection. A goroutine serving an idle connection blocks on its Read call. Although a blocked goroutine consumes no scheduling resources, it still occupies memory, and in aggregate this leads to huge memory consumption. Under this model, allocating a goroutine for each of millions of connections is expensive. + +For example, as shown in the diagram above, the server accepts 5 connections and creates 5 goroutines. At this moment the first 3 connections are active: data can be read from them immediately, and after processing it the server sends responses back to complete one round of data exchange before starting the next read. The last 2 connections are idle, so reads on them block.
The subsequent steps are therefore never triggered. Although only 3 connections can successfully read data at this moment, 5 goroutines have been allocated, so 40% of the resources are wasted; the larger the proportion of idle connections, the more resources are wasted. + +### Reactor + +![reactor](/.resources/user_guide/tnet/reactor.png) + +The Reactor model uses I/O multiplexing (epoll/kqueue) to listen for read, write, and other events on file descriptors (FDs) and performs the corresponding operation when an event fires. + +In the diagram, the poller component is responsible for listening for events on FDs. Each poller occupies one goroutine, and the number of pollers is usually equal to the number of CPUs. A dedicated poller listens for read events on the listener port and accepts new connections; the others listen for read events on the established connections. Only when a connection becomes readable is a goroutine allocated to read data from it, process the data, and send data back. Idle connections therefore no longer occupy goroutines: with millions of connections, goroutines are allocated only to active connections, which makes efficient use of memory. + +For example, as shown in the diagram above, the server has 5 pollers: one listens for listener events and accepts new connections, while the other 4 listen for read events on connections. When 2 connections become readable at a given moment, a goroutine is allocated for each of them to read data, process it, and send data back. Because these two connections are already known to be readable, Read does not block and the subsequent steps run smoothly. When writing data back to the connection, the goroutine registers a write event with the poller and then exits.
The poller then waits for the write event and sends the data once the connection becomes writable, completing a round of data exchange. + +## Quick start + +### Enable tnet + +There are two ways to enable tnet in tRPC-Go; choose one of them. The first method is recommended. + +(1) Add tnet to the tRPC-Go framework configuration file. + +(2) Use the WithTransport() method in the code to enable tnet. + +#### Method 1: Configuration file (recommended) + +Add tnet to the transport field in the tRPC-Go configuration file. Because the plugin currently supports only TCP, do not configure the tnet plugin for UDP services. The server and client can enable tnet independently; the two settings do not interfere with each other. + +**Server**: + +``` yaml +server: + transport: tnet # Applies to all services + service: + - name: trpc.app.server.service + network: tcp + transport: tnet # Applies only to the current service +``` + +After the server starts, a log line confirms that tnet is active: + +`INFO tnet/server_transport.go service:trpc.app.server.service is using tnet transport, current number of pollers: 1` + +**Client**: + +``` yaml +client: + transport: tnet # Applies to all services + service: + - name: trpc.app.server.service + network: tcp + transport: tnet # Applies only to the current service + conn_type: multiplexed # Use the multiplexed connection mode + multiplexed: + enable_metrics: true # Enable metrics for the multiplexed pool +``` + +It is recommended to use the multiplexed connection mode together with tnet, because multiplexing lets tnet fully leverage its batch packet transmission capabilities.
+ +After the client starts, a log line confirms that tnet is active (Trace level): + +`Debug tnet/client_transport.go roundtrip to:127.0.0.1:8000 is using tnet transport, current number of pollers: 1` + +#### Method 2: Code configuration + +**Server**: + +Notice: this method enables tnet for all services of the server. + +``` go +import ( + trpc "trpc.group/trpc-go/trpc-go" + "trpc.group/trpc-go/trpc-go/server" + "trpc.group/trpc-go/trpc-go/transport/tnet" +) + +func main() { + // Create a tnet ServerTransport. + trans := tnet.NewServerTransport() + // Create a tRPC server that uses it. + s := trpc.NewServer(server.WithTransport(trans)) + // pb refers to your generated protobuf package. + pb.RegisterGreeterService(s, &greeterServiceImpl{}) + s.Serve() +} +``` + +**Client**: + +``` go +import ( + trpc "trpc.group/trpc-go/trpc-go" + "trpc.group/trpc-go/trpc-go/client" + "trpc.group/trpc-go/trpc-go/transport/tnet" +) + +func main() { + proxy := pb.NewGreeterClientProxy() + // Create a tnet ClientTransport and pass it as a call option. + trans := tnet.NewClientTransport() + rsp, err := proxy.SayHello(trpc.BackgroundContext(), &pb.HelloRequest{Msg: "Hello"}, client.WithTransport(trans)) + if err != nil { + // Handle the error. + } + _ = rsp +} +``` + +## Use Cases + +According to benchmark results, the tnet transport outperforms the gonet transport in specific scenarios, but not in all of them. Here we summarize the scenarios in which the tnet transport excels. + +**Advantageous Scenarios for tnet:** + +- When the server uses tnet and the client sends requests in multiplexed connection mode, tnet's batch packet transmission can be fully utilized, increasing QPS and reducing CPU usage. + +- When the server uses tnet and there are a large number of idle connections, memory usage drops because far fewer goroutines are needed. + +- When the client uses tnet with the multiplexed mode enabled, tnet's batch packet transmission can be fully leveraged, improving QPS. + +**Other Scenarios:** + +- When the server uses tnet but the client sends requests in connection pool mode rather than multiplexed mode, performance is similar to gonet. + +- When the client uses tnet with the multiplexed mode disabled, performance is similar to gonet.
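The idle-connection scenario above is easy to quantify with the standard library alone: under `Goroutine-per-Connection`, every idle connection parks one blocked goroutine, each holding at least a few kilobytes of stack. The following self-contained sketch (the server loop and counts are illustrative, not tRPC-Go code) counts the goroutines pinned by idle connections:

```go
package main

import (
	"fmt"
	"net"
	"runtime"
	"time"
)

// idleHandlerGoroutines starts a goroutine-per-connection server, opens n
// client connections that never send anything (idle connections), and
// returns how many extra goroutines exist once the handlers are parked.
func idleHandlerGoroutines(n int) int {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	before := runtime.NumGoroutine()
	go func() {
		for {
			c, err := ln.Accept()
			if err != nil {
				return
			}
			go func(c net.Conn) { // one goroutine per connection
				defer c.Close()
				buf := make([]byte, 1024)
				c.Read(buf) // idle connection: the handler blocks here
			}(c)
		}
	}()
	conns := make([]net.Conn, 0, n)
	for i := 0; i < n; i++ {
		c, err := net.Dial("tcp", ln.Addr().String())
		if err != nil {
			panic(err)
		}
		conns = append(conns, c)
	}
	// Wait until the server has accepted the connections and parked
	// their handler goroutines in Read.
	for i := 0; i < 500 && runtime.NumGoroutine()-before < n; i++ {
		time.Sleep(5 * time.Millisecond)
	}
	defer func() {
		for _, c := range conns {
			c.Close()
		}
		ln.Close()
	}()
	return runtime.NumGoroutine() - before
}

func main() {
	// 200 idle connections pin roughly 200 blocked handler goroutines.
	fmt.Println(idleHandlerGoroutines(200))
}
```

A Reactor-based transport avoids exactly this cost: the blocked Read goroutines disappear, and only connections with pending events get a goroutine.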
+ +## FAQ + +#### Q: Does tnet support HTTP? + +No. When tnet is enabled on an HTTP server or client, the framework automatically falls back to the Go net package. + +#### Q: Why doesn't performance improve after enabling tnet? + +tnet is not a silver bullet: it boosts service performance in specific scenarios by making full use of Writev for batch packet transmission and thereby reducing system calls. If performance is still unsatisfactory even in tnet's advantageous scenarios, consider the following steps: + +1. Enable the client-side multiplexed connection mode together with tnet, so that Writev batch transmission is used as much as possible. + +2. Enable tnet and the multiplexed connection mode along the entire call chain; if the upstream uses multiplexing, the current server can also take advantage of Writev batch transmission. + +3. With the multiplexed connection mode enabled, turn on its metrics to inspect how many virtual connections share each connection; under heavy concurrency, an excessive number of virtual connections on a single connection also hurts performance. Enable the multiplexed metrics with the following configuration: + +```yaml +client: + service: + - name: trpc.test.helloworld.Greeter1 + transport: tnet + conn_type: multiplexed + multiplexed: + enable_metrics: true # Enable metrics for multiplexed pool +``` + +A multiplexing status log line is printed every 3 seconds. In the following example, there is currently 1 connection carrying a total of 98 virtual connections.
+ +`DEBUG tnet multiplex status: network: tcp, address: 127.0.0.1:7002, connections number: 1, concurrent virtual connection number: 98` + +The status is also reported as custom metrics, in the following format: + +Active connections: `trpc.MuxConcurrentConnections.$network.$address` + +Virtual connections: `trpc.MuxConcurrentVirConns.$network.$address` + +For example, to limit each connection to at most 25 concurrent virtual connections, add the following configuration: + +```yaml +client: + service: + - name: trpc.test.helloworld.Greeter1 + transport: tnet + conn_type: multiplexed + multiplexed: + enable_metrics: true + max_vir_conns_per_conn: 25 # maximum number of concurrent virtual connections per connection +``` + +#### Q: Why does it log `switch to gonet default transport, tnet server transport doesn't support network type [udp]` after enabling tnet? + +This log means that the tnet transport doesn't support UDP; the service automatically falls back to the Go net package and starts normally.
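As a back-of-the-envelope check for the `max_vir_conns_per_conn` setting above: the multiplexed pool needs roughly ceil(concurrent requests / max_vir_conns_per_conn) real TCP connections. The helper below is a hypothetical sketch for sanity-checking a configuration — `estimateConns` is not a tRPC-Go API:

```go
package main

import "fmt"

// estimateConns is a hypothetical sizing helper (not part of tRPC-Go):
// given the expected number of concurrent in-flight requests and the
// configured max_vir_conns_per_conn limit, it returns how many real TCP
// connections the multiplexed pool would grow to.
func estimateConns(concurrentRequests, maxVirConnsPerConn int) int {
	if concurrentRequests <= 0 {
		return 0
	}
	if maxVirConnsPerConn <= 0 {
		return 1 // treat a non-positive limit as "unlimited" in this sketch
	}
	// Integer ceiling division.
	return (concurrentRequests + maxVirConnsPerConn - 1) / maxVirConnsPerConn
}

func main() {
	// With max_vir_conns_per_conn: 25, the 98 concurrent virtual
	// connections from the log example above fit into ceil(98/25) = 4
	// real connections.
	fmt.Println(estimateConns(98, 25)) // 4
}
```

If the estimate is far above the connection count you observe in the metrics, requests are queuing on too few connections and the limit is worth tuning.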
\ No newline at end of file diff --git a/docs/user_guide/tnet_zh_CN.md b/docs/user_guide/tnet_zh_CN.md new file mode 100644 index 0000000..bda5c05 --- /dev/null +++ b/docs/user_guide/tnet_zh_CN.md @@ -0,0 +1,185 @@ +# tRPC-Go 接入高性能网络库 tnet + +[English](./tnet.md) | 中文 + +## 前言 + +Golang 的 Net 库提供了简单的非阻塞调用接口,网络模型采用`一个连接一个协程`。在多数的场景下,这个模型简单易用,但是当连接数量成千上万之后,在百万连接的级别,为每个连接分配一个协程将消耗极大的内存,并且调度大量协程也变的非常困难。为了支持百万连接的功能,必须打破一个连接一个协程模型,高性能网络库 [tnet](https://github.com/trpc-group/tnet) 基于`事件驱动`的网络模型,能够提供百万连接的能力。tRPC-Go 框架集成了 tnet 网络库,从而支持百万连接功能。除此之外,tnet 还支持批量收发包功能,零拷贝缓存,精细化内存管理等优化,因此拥有比 Golang 原生 net 库更优秀的性能。 + + +## 原理 + +我们通过两张图展示 Golang 中一个连接一个协程模型和基于事件驱动模型的基本原理。 + +### 一个连接一个协程 + +![goroutine_per_connection](/.resources/user_guide/tnet/goroutine_per_connection_zh_CN.png) + +一个连接一个协程的模式下,服务端 Accept 一个新的连接,就为该连接起一个协程,然后在这个协程中从连接读数据、处理数据、向连接发数据。 + +百万连接场景通常指的是长连接场景,虽然连接总数巨大,但是活跃的连接数量只占少数,活跃的连接指的是某一时刻连接上有数据可读/写,相对的当连接上没有数据可读/写,此连接被称为空闲连接。空闲连接协程会阻塞在 Read 调用,此时协程虽然不会占用调度资源,但是依然会占用内存资源,最终导致消耗巨大的内存。按照这种模式,在百万连接场景下,为每个连接都分配一个协程成本是昂贵的。 + +例如上图所示,服务端 Accept 了 5 个连接,创建了 5 个协程,在这一时刻,前 3 个连接是活跃连接,可以顺利的从连接中读取得到数据,处理数据后向连接发送数据完成一次数据交互,然后进行第二轮数据读取。而后 2 个连接是空闲连接,从连接中读取数据的时候会阻塞,于是后续的流程没有触发。可以看到,这一时刻,虽然只有 3 个连接是可以成功地读取到数据,但是却分配了 5 个协程,资源浪费了 40%,空闲连接占比越大,资源浪费就越多。 + +### 事件驱动 + +![reactor](/.resources/user_guide/tnet/reactor.png) + +事件驱动模式是指利用多路复用(epoll / kqueue)监听 FD 的可读、可写等事件,当有事件触发的时候做相应的处理。 + +图中 Poller 结构负责监听 FD 上的事件,每个 Poller 占用一个协程,Poller 的数量通常等于 CPU 的数量。我们采用了单独的 Poller 来监听 listener 端口的可读事件来 Accept 新的连接,然后监听每个连接的可读事件,当连接变得可读时,再分配协程从连接读数据、处理数据、向连接发数据。此时不会再有空闲连接占用协程,在百万连接场景下,只为活跃连接分配协程,可以充分利用内存资源。 + +例如上图所示,服务端有 5 个 Poller,其中有 1 个单独的 Poller 负责监听 Listener 事件,接收新连接,其余 4 个 Poller 负责监听连接可读事件,在连接可读时,触发处理过程。在这一时刻,Poller 监听到有 2 个连接可读,于是为每个连接分配一个协程,从连接中读取数据、处理数据、写回数据,因为此时已经知道这两个连接可读,所以 Read 过程不会阻塞,后续的流程可以顺利执行,最终 Write 的时候,会向 Poller 注册可写事件,然后协程退出,Poller 监听连接可写,在连接可写的时候发送数据,完成一轮数据交互。 + +## 快速上手 + +### 使用方法 + +支持两种配置方式,用户选择其一进行配置即可,推荐使用第一种配置方法。 + +(1)在 tRPC-Go 框架配置文件中启用 tnet + +(2)在代码中调用 
WithTransport() 方法启用 tnet + +#### 方法一:配置文件(推荐) + +在 tRPC-Go 的配置文件中的 transport 字段添加 tnet。因为插件现阶段只支持 TCP,所以 UDP 服务请不要配置 tnet 插件。服务端和客户端可以单独开启 tnet,二者互不影响。 + +**服务端**: + +``` yaml +server: + transport: tnet # 对所有 service 全部生效 + service: + - name: trpc.app.server.service + network: tcp + transport: tnet # 只对当前 service 生效 +``` + +服务端启动后,日志提示启用 tnet 成功: + +`INFO tnet/server_transport.go service:trpc.app.server.service is using tnet transport, current number of pollers: 1` + +**客户端**: + +``` yaml +client: + transport: tnet # 对所有 service 全部生效 + service: + - name: trpc.app.server.service + network: tcp + transport: tnet # 只对当前 service 生效 + conn_type: multiplexed # 使用多路复用连接模式 + multiplexed: + enable_metrics: true # 开启多路复用运行状态的监控 +``` + +推荐客户端开启 tnet 的同时使用多路复用连接模式,充分利用 tnet 批量收发包的能力,提高性能。 + +客户端启动服务后通过 log 确认插件启用成功(Trace 级别): + +`Debug tnet/client_transport.go roundtrip to:127.0.0.1:8000 is using tnet transport, current number of pollers: 1` + +#### 方法二:代码配置 + +**服务端**: + +注意:这种方式会对 server 的所有 service 都启动 tnet。 + +``` go +import "trpc.group/trpc-go/trpc-go/transport/tnet" + +func main() { + // 创建一个 ServerTransport + trans := tnet.NewServerTransport() + // 创建一个 trpc 服务 + s := trpc.NewServer(server.WithTransport(trans)) + pb.RegisterGreeterService(s, &greeterServiceImpl{}) + s.Serve() +} +``` + +**客户端**: + +``` go +import "trpc.group/trpc-go/trpc-go/transport/tnet" + +func main() { + proxy := pb.NewGreeterClientProxy() + trans := tnet.NewClientTransport() + rsp, err := proxy.SayHello(trpc.BackgroundContext(), &pb.HelloRequest{Msg: "Hello"}, client.WithTransport(trans)) +} +``` + +## 适用场景 + +我们使用 tnet 进行了压力测试,从测试结果来看,tnet transport 相比 gonet transport 在特定场景下可以提供更好的性能,但是不是所有场景都有优势。在此总结 tnet transport 的优势场景。 + +**tnet 优势场景:** + +- 作为服务端使用 tnet,客户端发送请求使用多路复用的模式,可以充分发挥 tnet 批量收发包的能力,可以提高 QPS,降低 CPU 占用 + +- 作为服务端使用 tnet,存在大量的不活跃连接的场景,可以通过减少协程数等逻辑降低内存占用 + +- 作为客户端使用 tnet,开启多路复用模式,可以充分发挥 tnet 批量收发包的能力,可以提高 QPS。 + +**其他场景:** + +- 作为服务端使用 tnet,客户端发送请求使用连接池模式,性能表现和 gonet 基本持平 + +- 作为客户端使用 
tnet,开启连接池模式,性能表现和 gonet 基本持平 + +## 常见问题 + +#### Q:tnet 支持 HTTP 吗? + +tnet 不支持 HTTP,在使用 HTTP 协议的服务端/客户端开启 tnet 的话,会自动降级使用 golang net 库。 + +#### Q:开启 tnet 之后性能为什么没有提升? + +tnet 并不是万金油,在特定的场景下可以充分利用 Writev 批量发包,减少系统调用,是可以提高服务的性能的。如果在 tnet 的优势场景下服务性能仍不理想,可以按照以下步骤针对自己的服务进行优化。 + +开启客户端的 tnet 多路复用(multiplexed)功能,尽可能利用 Writev 批量发包; + +为整个服务链路开启 tnet 和多路复用,上游使用多路复用的话,当前服务端也可以充分利用 Writev 批量发包; + +如果使用了多路复用功能,可以开启多路复用监控,查看每个连接上有多少虚拟连接,如果并发量较大,导致单连接上的虚拟连接数过多,也会影响性能,添加配置开启多路复用监控上报。 + + +```yaml +client: + service: + - name: trpc.test.helloworld.Greeter1 + transport: tnet + conn_type: multiplexed + multiplexed: + enable_metrics: true # 开启多路复用运行状态的监控 +``` + +每隔3s,就会打印多路复用状态的日志。在日志中可以看到当前的连接数是1个,虚拟连接总数是98个。 + +`DEBUG tnet multiplex status: network: tcp, address: 127.0.0.1:7002, connections number: 1, concurrent virtual connection number: 98` + +同时也会上报自定义监控,监控项格式是: + +并发连接数:`trpc.MuxConcurrentConnections.$network.$address` + +虚拟连接总数:`trpc.MuxConcurrentVirConns.$network.$address` + +假设希望设置每个连接上的最大并发虚拟连接数量为25,可以添加如下配置: + +```yaml +client: + service: + - name: trpc.test.helloworld.Greeter1 + transport: tnet + conn_type: multiplexed + multiplexed: + enable_metrics: true # 开启多路复用监控 + max_vir_conns_per_conn: 25 # 每个连接上的最大并发虚拟连接数量 +``` + +#### Q:开启 tnet 后提示 `switch to gonet default transport, tnet server transport doesn't support network type [udp]`? + +这个报错的意思是,tnet transport 暂时不支持 UDP,自动降级使用 golang net 库,不影响服务正常启动。 +