Add stress tests #227

Merged
merged 1 commit into from
Oct 18, 2024
3 changes: 3 additions & 0 deletions .gitignore
@@ -7,6 +7,9 @@ dist-newstyle
*.eventlog.html
*.hp
*.prof
*.aux
*.ps
*.svg
perf.data
perf.data.*
strace.log
139 changes: 139 additions & 0 deletions docs/stress-tests.md
@@ -0,0 +1,139 @@
# Stress Tests

The stress tests are intended to test the performance of `grapesy`, primarily
ensuring that memory usage stays constant even under extreme load. The building
blocks of the test are the stress test server and client applications. These can
be run manually as separate processes, e.g.:

```
cabal run test-stress -- server ...
cabal run test-stress -- client ...
```

However, to automate the specific extreme performance scenarios we're interested
in, we also have a driver application that will automatically spin up a selection
of servers and clients we care about:

```
cabal run test-stress -- driver
```

The driver will automatically kill and restart certain clients and servers to test
"unstable" performance scenarios (e.g. do we leak memory when we attempt to
reconnect to disconnected servers?).

To enable verbose debugging traces of any of these components, pass `-v` as a
top-level option, e.g.:

```
cabal run test-stress -- -v client ...
```

**Warning:** Passing `-v` and `driver` will result in *a lot* of output.

The rest of this document will explain how to use each of these components on
the command-line and the specifics regarding what each of them is doing.

# The API

The stress test client and server communicate via the four major types of RPCs:

1. **Non-streaming:** Client sends one message, server sends one message back.
2. **Client streaming:** Client sends `N` messages, server sends one message
back.
3. **Server streaming:** Client sends one message specifying `N`, server sends
`N` messages back.
4. **Bidirectional streaming:** Client sends one message specifying `N`, server
and client take turns sending one message back and forth until each has sent
`N` messages.
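
The four call types above can be summarised as a small data type. This is a minimal sketch for illustration only: `CallKind` and `messageCounts` are hypothetical names, not part of `grapesy` or the stress test sources, and the bidirectional case follows the "each has sent `N` messages" wording without deciding whether the initial message that specifies `N` is counted.

```haskell
-- Hypothetical model of the four RPC kinds described above.
-- These names are illustrative and are not taken from grapesy.
data CallKind
  = NonStreaming
  | ClientStreaming Int -- client sends N messages
  | ServerStreaming Int -- server sends N messages
  | BidiStreaming Int   -- client and server each send N messages
  deriving (Show, Eq)

-- (messages sent by the client, messages sent by the server) for one call.
messageCounts :: CallKind -> (Int, Int)
messageCounts NonStreaming        = (1, 1)
messageCounts (ClientStreaming n) = (n, 1)
messageCounts (ServerStreaming n) = (1, n)
messageCounts (BidiStreaming n)   = (n, n)

main :: IO ()
main = mapM_ report
  [NonStreaming, ClientStreaming 1234, ServerStreaming 500, BidiStreaming 10]
  where
    report kind = putStrLn (show kind ++ " -> " ++ show (messageCounts kind))
```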

These are the "atoms" of communication between the stress test clients and
servers. The messages sent back and forth are random lazy bytestrings ranging in
length from 128 to 256 bytes (see [here](../test-stress/Test/Stress/Common.hs)).
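
As a rough illustration of such a payload, the sketch below builds a lazy bytestring whose length falls between 128 and 256 bytes. It is not the code in `Common.hs`: `randomMessage` and `step` are hypothetical names, the range is assumed to be inclusive at both ends, and a tiny linear congruential generator stands in for a real random source purely to keep the example free of external dependencies.

```haskell
import Data.Bits (shiftR)
import qualified Data.ByteString.Lazy as BSL
import Data.List (unfoldr)
import Data.Word (Word64)

-- One step of a small linear congruential generator (illustration only;
-- arithmetic wraps around at 64 bits).
step :: Word64 -> Word64
step s = s * 6364136223846793005 + 1442695040888963407

-- Build a pseudo-random message of 128 to 256 bytes from a seed.
randomMessage :: Word64 -> BSL.ByteString
randomMessage seed = BSL.pack (take len bytes)
  where
    s0    = step seed
    -- `mod` 129 yields 0..128, so the length lies in 128..256.
    len   = 128 + fromIntegral ((s0 `shiftR` 33) `mod` 129)
    -- Take the top byte of each successive generator state.
    bytes = unfoldr next s0
    next s = let s' = step s in Just (fromIntegral (s' `shiftR` 56), s')

main :: IO ()
main = print (BSL.length (randomMessage 42))
```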

# The Server

The server is the simplest to run. Simply specify the port it should bind to and
whether it should use TLS. For example, to start a secure server on port 50051:

```
cabal run test-stress -- server --secure --port 50051
```

Use the `--help` flag to see all available options.

By default, it will use the certificates and keys in the [`../data`](../data/)
directory, just like the demo server. See the [demo server's
documentation](./demo-server.md) for more information.

# The Client

The client takes options that specify the server it should connect to, how many
times it should connect to the server, and what calls it should execute on those
connections. For example, the following command runs a client that opens 3
connections to an insecure server at port 50051 and, on each connection, makes a
client streaming call with 1234 messages and a server streaming call with 500
messages, repeating those calls 5 times:

```bash
cabal run test-stress -- client \
--port 50051 \
--num-connections 3 \
--num-calls 5 \
--client-streaming 1234 \
--server-streaming 500
```
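
To get a feel for the size of such a workload, the following sketch multiplies out the flags in the command. It assumes that `--num-calls` repeats the whole set of requested calls on every connection, as the description suggests; `ClientOpts` and `totalCalls` are hypothetical names, not the stress test's actual option types.

```haskell
-- Hypothetical record mirroring the flags used in the example command.
data ClientOpts = ClientOpts
  { numConnections :: Int -- --num-connections
  , numCalls       :: Int -- --num-calls
  , callKinds      :: Int -- number of distinct calls requested per round
  }

-- Total number of RPCs, assuming every call is repeated `numCalls` times
-- on every connection.
totalCalls :: ClientOpts -> Int
totalCalls opts = numConnections opts * numCalls opts * callKinds opts

main :: IO ()
main = print (totalCalls (ClientOpts 3 5 2)) -- prints 30
```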

Clients can also run their connections concurrently via the `--concurrent`
option. To connect to a secure server, pass the `--secure` option, which uses
the default certificates; non-default certificates can be configured via other
command-line options, just like the demo client. See the client's `--help`
output and the [demo client's documentation](./demo-client.md) for more
information.

# The Driver

The driver spawns a variety of servers and clients in separate processes, and
runs for a total of 60 seconds. Each process is run with a specific heap limit
(via the `-M` RTS flag), and the application will terminate with a non-zero exit
code if any of the processes are killed with a `heap overflow` exception.
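
The two ideas in this paragraph can be sketched as follows. This is not the driver's implementation: `withHeapLimit`, `Outcome`, and `overallExit` are hypothetical names, and the `32m` limit is a made-up value. The only real piece is GHC's `-M` RTS flag, which sets the maximum heap size (the child executable has to accept RTS options for it to take effect).

```haskell
import System.Exit (ExitCode (..))

-- Append RTS options that cap the heap of a child process.
withHeapLimit :: String -> [String] -> [String]
withHeapLimit limit args = args ++ ["+RTS", "-M" ++ limit, "-RTS"]

-- How a single child process ended (hypothetical classification).
data Outcome = Finished | HeapOverflow
  deriving (Eq, Show)

-- Fail the whole run if any child hit its heap limit.
overallExit :: [Outcome] -> ExitCode
overallExit outcomes
  | HeapOverflow `elem` outcomes = ExitFailure 1
  | otherwise                    = ExitSuccess

main :: IO ()
main = do
  print (withHeapLimit "32m" ["server", "--port", "50051"])
  print (overallExit [Finished, HeapOverflow, Finished])
```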

## Servers

The driver spawns four server processes in total. Each server is either secure
or insecure, and either stable or unstable. Secure servers require TLS, while
insecure servers accept only non-TLS connections. Unstable servers are killed
and restarted intermittently, while stable servers are left running for the
duration of the driver's execution.
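
The four servers are simply every combination of the two properties. A minimal sketch, using hypothetical types that are not taken from the driver's source:

```haskell
-- Hypothetical types for the two independent server properties.
data Security = Insecure | Secure
  deriving (Show, Eq, Enum, Bounded)

data Stability = Stable | Unstable
  deriving (Show, Eq, Enum, Bounded)

-- Every (security, stability) combination: 2 * 2 = 4 servers.
servers :: [(Security, Stability)]
servers = [(sec, stab) | sec <- [minBound ..], stab <- [minBound ..]]

main :: IO ()
main = mapM_ print servers
```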

## Clients

The driver spawns 56 total client processes. Similar to the servers, each client
is either secure or insecure and stable or unstable. Each client only
communicates with one of the servers. Naturally, (in)secure clients only
communicate with (in)secure servers. Each client-server pair only
communicates in one of the following "patterns":

* **Many connections:** Open a connection, make a single non-streaming call,
close the connection, and repeat indefinitely. Think of this as calling
`withConnection` over and over.
* **Many non-streaming calls:** Open a single connection, then make
non-streaming calls on it indefinitely. Think of this as calling `withRPC` and
sending a single message back and forth on a single connection over and over.
* **Client streaming:** Open a connection. Make a non-stop client streaming
call.
* **Many client streaming calls:** Open a connection. Make a client streaming
call with a few messages, repeat indefinitely.
* **Server streaming:** Same as client streaming, but the server sends messages
non-stop.
* **Many server streaming calls:** Same as many client streaming calls, but the
server streams.
* **Bidirectional streaming:** Same as client streaming, but both client and
server send messages indefinitely.
* **Many bidirectional streaming calls:** Same as many client streaming calls,
but both client and server stream messages.

## Summary chart generation

When passed the `--gen-charts` flag, the stress test driver creates summary heap
profile charts for the stable components after the test has finished.
This will cause each stable component to emit an event log with heap profiling
events. The driver will parse the event logs and generate SVG plots of the
memory usage over time.
99 changes: 64 additions & 35 deletions grapesy.cabal
@@ -216,8 +216,8 @@ library
, http2-tls >= 0.4.1 && < 0.5
, lens >= 5.0 && < 5.4
, mtl >= 2.2 && < 2.4
, network >= 3.1 && < 3.3
, network-run >= 0.4 && < 0.5
, network >= 3.2.4 && < 3.3
, network-run >= 0.4.1 && < 0.5
, proto-lens >= 0.7 && < 0.8
, proto-lens-runtime >= 0.7 && < 0.8
, random >= 1.2 && < 1.3
@@ -356,7 +356,7 @@ test-suite test-grapesy
, http2 >= 5.3.4 && < 5.4
, lens >= 5.0 && < 5.4
, mtl >= 2.2 && < 2.4
, network >= 3.1 && < 3.3
, network >= 3.2.4 && < 3.3
, prettyprinter >= 1.7 && < 1.8
, prettyprinter-ansi-terminal >= 1.1 && < 1.2
, proto-lens >= 0.7 && < 0.8
@@ -412,16 +412,16 @@ executable demo-client
, grapesy
build-depends:
-- External dependencies
, async >= 2.2 && < 2.3
, bytestring >= 0.10 && < 0.13
, conduit >= 1.3 && < 1.4
, contra-tracer >= 0.2 && < 0.3
, exceptions >= 0.10 && < 0.11
, network >= 3.1 && < 3.3
, optparse-applicative >= 0.16 && < 0.19
, proto-lens-runtime >= 0.7 && < 0.8
, text >= 1.2 && < 2.2
, transformers >= 0.5 && < 0.7
, async >= 2.2 && < 2.3
, bytestring >= 0.10 && < 0.13
, conduit >= 1.3 && < 1.4
, contra-tracer >= 0.2 && < 0.3
, exceptions >= 0.10 && < 0.11
, network >= 3.2.4 && < 3.3
, optparse-applicative >= 0.16 && < 0.19
, proto-lens-runtime >= 0.7 && < 0.8
, text >= 1.2 && < 2.2
, transformers >= 0.5 && < 0.7

if !flag(build-demo)
buildable:
@@ -458,45 +458,74 @@ executable demo-server
, grapesy
build-depends:
-- External dependencies
, aeson >= 1.5 && < 2.3
, bytestring >= 0.10 && < 0.13
, containers >= 0.6 && < 0.8
, exceptions >= 0.10 && < 0.11
, network >= 3.1 && < 3.3
, optparse-applicative >= 0.16 && < 0.19
, proto-lens-runtime >= 0.7 && < 0.8
, text >= 1.2 && < 2.2
, time >= 1.9 && < 1.13
, transformers >= 0.5 && < 0.7
, aeson >= 1.5 && < 2.3
, bytestring >= 0.10 && < 0.13
, containers >= 0.6 && < 0.8
, exceptions >= 0.10 && < 0.11
, network >= 3.2.4 && < 3.3
, optparse-applicative >= 0.16 && < 0.19
, proto-lens-runtime >= 0.7 && < 0.8
, text >= 1.2 && < 2.2
, time >= 1.9 && < 1.13
, transformers >= 0.5 && < 0.7

if !flag(build-demo)
buildable:
False

executable test-stress
test-suite test-stress
import:
, lang
, common-executable-flags
, lang
default-extensions:
RecordWildCards
type:
exitcode-stdio-1.0
hs-source-dirs:
test-stress
, proto
main-is:
Main.hs
other-modules:
Test.Stress.Client
Test.Stress.Cmdline
Test.Stress.Common
Test.Stress.Driver
Test.Stress.Driver.Summary
Test.Stress.Server
Test.Stress.Server.API

Proto.API.Trivial

Paths_grapesy
autogen-modules:
Paths_grapesy
build-depends:
-- Internal dependencies
, grapesy
build-depends:
-- External dependencies
, optparse-applicative >= 0.16 && < 0.19
, async >= 2.2 && < 2.3
, bytestring >= 0.10 && < 0.13
, Chart >= 1.9 && < 1.10
, Chart-diagrams >= 1.9 && < 1.10
, directory >= 1.3 && < 1.4
, exceptions >= 0.10 && < 0.11
, filepath >= 1.4.2.1 && < 1.6
, ghc-events >= 0.17 && < 0.20
, http2 >= 5.3.4 && < 5.4
, network >= 3.2.4 && < 3.3
, optparse-applicative >= 0.16 && < 0.19
, process >= 1.6.12 && < 1.7
, tls >= 1.7 && < 2.2
, random >= 1.2 && < 1.3

if !flag(build-stress-test)
buildable:
False

if flag(snappy)
cpp-options: -DSNAPPY

test-suite grapesy-interop
import:
, lang
@@ -560,14 +589,14 @@ test-suite grapesy-interop
build-depends:
, grapesy
build-depends:
, ansi-terminal >= 1.1 && < 1.2
, bytestring >= 0.10 && < 0.13
, exceptions >= 0.10 && < 0.11
, mtl >= 2.2 && < 2.4
, network >= 3.1 && < 3.3
, optparse-applicative >= 0.16 && < 0.19
, proto-lens-runtime >= 0.7 && < 0.8
, text >= 1.2 && < 2.2
, ansi-terminal >= 1.1 && < 1.2
, bytestring >= 0.10 && < 0.13
, exceptions >= 0.10 && < 0.11
, mtl >= 2.2 && < 2.4
, network >= 3.2.4 && < 3.3
, optparse-applicative >= 0.16 && < 0.19
, proto-lens-runtime >= 0.7 && < 0.8
, text >= 1.2 && < 2.2

benchmark grapesy-kvstore
import:
13 changes: 10 additions & 3 deletions src/Network/GRPC/Client/Connection.hs
@@ -598,13 +598,20 @@ overrideRateLimits connParams clientConfig = clientConfig {

openClientSocket :: HTTP2Settings -> AddrInfo -> IO Socket
openClientSocket http2Settings =
Run.openClientSocketWithOptions socketOptions
Run.openClientSocketWithOpts socketOptions
where
socketOptions :: [(SocketOption, Int)]
socketOptions :: [(SocketOption, SockOptValue)]
socketOptions = concat [
[ (NoDelay, 1)
[ ( NoDelay
, SockOptValue @Int 1
)
| http2TcpNoDelay http2Settings
]
, [ ( Linger
, SockOptValue $ StructLinger { sl_onoff = 1, sl_linger = 0 }
)
| http2TcpAbortiveClose http2Settings
]
]

-- | Write-buffer size
4 changes: 2 additions & 2 deletions src/Network/GRPC/Common/Compression.hs
@@ -101,8 +101,8 @@ only compr = chooseFirst (compr :| [noCompression])

-- | Insist on the specified algorithm, /no matter what the peer offers/
--
-- This is dangerous: if the peer does not supported the specified algorithm,
-- it will be unable to decompress any messages. Primarily used for testing.
-- This is dangerous: if the peer does not support the specified algorithm, it
-- will be unable to decompress any messages. Primarily used for testing.
--
-- See also 'only'.
insist :: Compression -> Negotation
17 changes: 17 additions & 0 deletions src/Network/GRPC/Common/HTTP2Settings.hs
@@ -70,6 +70,22 @@ data HTTP2Settings = HTTP2Settings {
-- TL;DR: leave this at the default unless you know what you are doing.
, http2TcpNoDelay :: Bool

-- | Set @SO_LINGER@ to a value of 0
--
-- Instead of following the normal shutdown sequence to close the TCP
-- connection, this will just send a @RST@ packet and immediately discard
-- the connection, freeing the local port.
--
-- This should /not/ be enabled in the vast majority of cases. It is only
-- useful in specific scenarios, such as stress testing, where resource
-- (e.g. port) exhaustion is a greater concern than protocol adherence.
-- Even in such scenarios, it probably only makes sense to
-- enable this option on the client since they will be using a new
-- ephemeral port for each connection (unlike the server).
--
-- TL;DR: leave this at the default unless you know what you are doing.
, http2TcpAbortiveClose :: Bool

-- | Ping rate limit
--
-- This setting is specific to the [@http2@
@@ -169,6 +185,7 @@ defaultHTTP2Settings = HTTP2Settings {
http2MaxConcurrentStreams = defMaxConcurrentStreams
, http2StreamWindowSize = defInitialStreamWindowSize
, http2ConnectionWindowSize = defInitialConnectionWindowSize
, http2TcpAbortiveClose = False
, http2TcpNoDelay = True
, http2OverridePingRateLimit = Just 100
, http2OverrideEmptyFrameRateLimit = Nothing