- Introduction
- Features
- Deployment
  3.1. Basic
  3.2. Docker
    3.2.1. Standalone
    3.2.2. Distributed
      3.2.2.1. Additional Node
      3.2.2.2. Entry Node
- Configuration
  4.1. Specific Options
  4.2. Tuning
    4.2.1. Concurrency
    4.2.2. Base Storage Driver Usage Warnings
- Usage
  5.1. Create
  5.2. Read
  5.3. Update
  5.4. Delete
  5.5. Key families
- Development
  7.1. Build
  7.2. Test
    7.2.1. Automated
      7.2.1.1. Unit
      7.2.1.2. Integration
      7.2.1.3. Functional
    7.2.2. Manual
This driver is intended to estimate the performance of the Pravega Key-Value Store. To work with streams, refer to mongoose-storage-driver-pravega.
Mongoose and Pravega use quite different concepts, so it is necessary to define how the Pravega KVS-specific terms map to the Mongoose abstractions.
Pravega | Mongoose |
---|---|
Key-Value Table | Item Path or Data Item |
Scope | Storage Namespace |
Key-Value Pair | Data Item |
Table Segment (KVT Partition) | N/A |
TBD
Java 11+ is required to build/run.
- Get the latest mongoose-base jar from the maven repo and put it in your working directory. Note the particular version, which is referred to as BASE_VERSION below.
- Get the latest mongoose-storage-driver-coop jar from the maven repo and put it in the ~/.mongoose/<BASE_VERSION>/ext directory.
- Get the latest mongoose-storage-driver-pravega-kvs jar from the maven repo and put it in the ~/.mongoose/<BASE_VERSION>/ext directory.
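The last two steps can be scripted once the jars are downloaded. The following is a minimal sketch, not part of the driver: the function name install_mongoose_ext is hypothetical, and it assumes both extension jars already sit in a local directory and follow the `<artifact>-<version>.jar` naming pattern.

```shell
#!/bin/sh
# Hypothetical helper: copy the extension jars to the directory where
# Mongoose looks for them (~/.mongoose/<BASE_VERSION>/ext).
# Arguments: <BASE_VERSION> [<directory with the downloaded jars>]
install_mongoose_ext() {
    base_version="$1"
    src_dir="${2:-.}"
    ext_dir="$HOME/.mongoose/$base_version/ext"

    mkdir -p "$ext_dir" || return 1
    for name in mongoose-storage-driver-coop mongoose-storage-driver-pravega-kvs; do
        found=0
        for jar in "$src_dir/$name"-*.jar; do
            # An unmatched glob stays literal, so skip anything that is not a file
            [ -f "$jar" ] || continue
            cp -f "$jar" "$ext_dir/" || return 1
            found=1
        done
        if [ "$found" -eq 0 ]; then
            echo "missing jar: $name-*.jar in $src_dir" >&2
            return 1
        fi
    done
    # Print the target directory so the caller can verify it
    echo "$ext_dir"
}
```

Calling `install_mongoose_ext <BASE_VERSION> .` from the download directory prints the ext directory on success and fails with a message if either jar is missing.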
java -jar mongoose-base-<BASE_VERSION>.jar \
--storage-driver-type=pravega-kvs \
--storage-namespace=scope1 \
--storage-net-node-addrs=<NODE_IP_ADDRS> \
--storage-net-node-port=9090 \
--load-batch-size=100 \
...
docker run \
--network host \
emcmongoose/mongoose-storage-driver-pravega-kvs \
--storage-namespace=scope1 \
--storage-net-node-addrs=<NODE_IP_ADDRS> \
--load-batch-size=100 \
...
docker run \
--network host \
--expose 1099 \
emcmongoose/mongoose-storage-driver-pravega-kvs \
--run-node
docker run \
--network host \
emcmongoose/mongoose-storage-driver-pravega-kvs \
--load-step-node-addrs=<ADDR1,ADDR2,...> \
--storage-net-node-addrs=<NODE_IP_ADDRS> \
--storage-namespace=scope1 \
--load-batch-size=100 \
...
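When the additional node addresses are kept as separate values (for example in a script), the comma-separated value for --load-step-node-addrs can be assembled in the shell. This is only a convenience sketch: join_addrs is a hypothetical helper and the host names in the comment are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: join all arguments with commas.
# "$*" joins the arguments with the first character of IFS; the function
# body runs in a subshell so the caller's IFS is left untouched.
join_addrs() (
    IFS=,
    printf '%s\n' "$*"
)

# Example with placeholder host names:
#   --load-step-node-addrs="$(join_addrs node1 node2 node3)"
```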
Name | Type | Default Value | Description |
---|---|---|---|
storage-driver-control-scope | boolean | true | Allow the driver to try to create the scope |
storage-driver-control-kvt | boolean | true | Allow the driver to try to create the KVT (Key-Value Table) |
storage-driver-control-timeoutMillis | integer | 5000 | The timeout, in milliseconds, for any Pravega Controller API call |
storage-driver-event-key-enabled | boolean | false | Specifies if Mongoose should generate its own routing key during the events creation |
storage-driver-event-key-count | integer | 0 | Specifies the max count of unique routing keys to use during the events creation (may be considered as a routing key period). A value of 0 means a unique routing key is used for each new event |
storage-net-node-addrs | list of strings | 127.0.0.1 | The list of the Pravega storage nodes to use for the load |
storage-net-node-port | integer | 9090 | The default port of the Pravega storage nodes. It should be explicitly set to 9090 (the value used by Pravega by default) |
storage-net-maxConnPerSegmentstore | integer | 5 | The default number of connections per Pravega Segment Store |
storage-driver-family-key-enabled | boolean | false | Specifies if Mongoose should use Key Families |
storage-driver-family-key-count | long | 0 | The default number of Key Families |
storage-driver-family-key-allow-empty | boolean | false | Specifies if Mongoose should allow key-value pairs without Key Families |
storage-driver-scaling-partitions | integer | 1 | The default number of partitions (Table Segments) in a KVT |
storage-net-maxConnPerSegmentstore: this parameter can significantly affect performance, but a higher value also increases the network load.
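One way to see how this parameter behaves in a particular environment is to repeat the same load with several values and compare the results. The sketch below only prints the command lines (a dry run) and uses only options documented in this README; the function name print_conn_sweep is hypothetical, and any limits or item options your scenario needs still have to be appended.

```shell
#!/bin/sh
# Dry run: print one Mongoose command line per candidate value of
# storage-net-maxConnPerSegmentstore. Nothing is executed.
# Arguments: <BASE_VERSION> <NODE_IP_ADDRS> <value> [<value> ...]
print_conn_sweep() {
    base_version="$1"
    node_addrs="$2"
    shift 2
    for conns in "$@"; do
        printf 'java -jar mongoose-base-%s.jar' "$base_version"
        printf ' --storage-driver-type=pravega-kvs'
        printf ' --storage-namespace=scope1'
        printf ' --storage-net-node-addrs=%s' "$node_addrs"
        printf ' --storage-net-node-port=9090'
        printf ' --storage-net-maxConnPerSegmentstore=%s\n' "$conns"
    done
}
```

For example, `print_conn_sweep <BASE_VERSION> <NODE_IP_ADDRS> 5 10 20` prints three command lines that differ only in the connection count.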
See the design notes
java -jar mongoose-base-<BASE_VERSION>.jar \
--storage-driver-type=pravega-kvs \
--storage-namespace=scope1 \
--storage-net-node-addrs=<NODE_IP_ADDRS> \
--storage-net-node-port=9090 \
--item-output-path=items.csv \
...
Currently, reading is supported only from an items file, which is why the --item-output-path=items.csv option is used in the create example.
java -jar mongoose-base-<BASE_VERSION>.jar \
--load-op-type=read \
--storage-driver-type=pravega-kvs \
--storage-namespace=scope1 \
--storage-net-node-addrs=<NODE_IP_ADDRS> \
--storage-net-node-port=9090 \
--item-input-file=items.csv \
...
To run an update load, Mongoose needs to know which keys to update. So far they can only be provided with the --item-input-file=items.csv option. To create the file, use --item-output-path=items.csv in create mode.
Because Mongoose uses a fixed seed, you need to change the seed to upload different data. For a convenient way of setting a new seed for each run, learn more about the expression language.
Note: Pravega uses the same mechanism for creates and updates, so updating a non-existing key simply creates it. An update of a missing key can never fail with a 404; use read mode to detect missing keys.
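As an alternative to the expression language, a fresh seed can also be generated in the shell before each run. This is a sketch under the assumption that Mongoose accepts any 64-bit value written as 16 hex digits, which is the shape of the seed used in the update example; random_seed is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print a random 64-bit value as 16 lowercase hex digits.
# od reads 8 bytes from /dev/urandom and prints them as hex byte values;
# tr strips the separating spaces and the trailing newline.
random_seed() {
    od -An -N8 -tx1 /dev/urandom | tr -d ' \n'
}

# Example:
#   --item-data-input-seed="$(random_seed)"
```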
java -jar mongoose-base-<BASE_VERSION>.jar \
--load-op-type=update \
--storage-driver-type=pravega-kvs \
--storage-namespace=scope1 \
--storage-net-node-addrs=<NODE_IP_ADDRS> \
--storage-net-node-port=9090 \
--item-input-file=items.csv \
--item-data-input-seed=7a42d9c482144167 \
...
To run a delete load, Mongoose needs to know which keys to delete. So far they can only be provided with the --item-input-file=items.csv option. To create the file, use --item-output-path=items.csv in create mode.
Note: Pravega checks the key sent in the request and deletes it if it exists. If the key doesn't exist, Pravega still reports success, so Mongoose counts the operation as successful. This means you can delete the same N keys any number of times and get N successfully finished requests each time, although the keys were actually deleted only the first time.
java -jar mongoose-base-<BASE_VERSION>.jar \
--load-op-type=delete \
--storage-driver-type=pravega-kvs \
--storage-namespace=scope1 \
--storage-net-node-addrs=<NODE_IP_ADDRS> \
--storage-net-node-port=9090 \
--item-input-file=items.csv \
...
Key families are disabled by default.
To create key-value pairs with key families, enable them and set the number of key families (the storage-driver-family-key-* parameters).
If pairs without a key family should also be produced during creates, use the allow-empty flag.
Reads do not require any additional flags for key families as long as the items input file (--item-input-file) is used.
A full example with 10 key families that also allows pairs without a key family looks like this:
java -jar mongoose-base-<BASE_VERSION>.jar \
--storage-driver-type=pravega-kvs \
--storage-namespace=scope1 \
--storage-net-node-addrs=<NODE_IP_ADDRS> \
--storage-net-node-port=9090 \
--storage-driver-family-key-enabled \
--storage-driver-family-key-count=10 \
--storage-driver-family-key-allow-empty \
...
Issue | Description |
---|---|
Note the Pravega commit that should be used to build the corresponding Mongoose plugin and specify it in the build.gradle file. Then run:
./gradlew clean jar
./gradlew clean test
docker run -d --name=storage --network=host pravega/pravega:<PRAVEGA_VERSION> standalone
./gradlew integrationTest
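Because the container above is started in the background, Pravega may not be accepting connections yet when the integration tests start. A small retry helper can bridge that gap. This is a generic sketch, not part of the build: retry_until is a hypothetical name, and the probe in the example comment assumes that nc is installed and that Pravega listens on localhost:9090.

```shell
#!/bin/sh
# Hypothetical helper: run a command until it succeeds.
# Arguments: <max attempts> <delay in seconds> <command> [<args> ...]
# Returns 0 as soon as the command succeeds, 1 if every attempt failed.
retry_until() {
    attempts="$1"
    delay="$2"
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Sleep only if another attempt will follow
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Example (assumes nc is available):
#   retry_until 30 2 nc -z localhost 9090 && ./gradlew integrationTest
```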
TBD
1. Build the storage driver.
2. Copy the storage driver's jar file into Mongoose's ext directory:
cp -f build/libs/mongoose-storage-driver-pravega-kvs-*.jar ~/.mongoose/<MONGOOSE_BASE_VERSION>/ext/
Note that the Pravega storage driver depends on the Coop Storage Driver extension, so it should also be put into the ext directory.
3. Build and install the corresponding Pravega version:
./gradlew pravegaDistInstall
4. Run the Pravega standalone node:
cd build/pravega/build/distributions/
tar -xzf pravega-<version>.tgz
./pravega-<version>/bin/pravega-standalone
5. Run Mongoose's default scenario with some specific command-line arguments:
java -jar mongoose-<MONGOOSE_BASE_VERSION>.jar \
--storage-driver-type=pravega-kvs \
--storage-net-node-port=9090 \
--storage-driver-limit-concurrency=10 \
--item-output-path=goose-events-stream-0