emc-mongoose/mongoose-storage-driver-pravega-kvs

Content

  1. Introduction
  2. Features
  3. Deployment
      3.1. Basic
      3.2. Docker
        3.2.1. Standalone
        3.2.2. Distributed
          3.2.2.1. Additional Node
          3.2.2.2. Entry Node
  4. Configuration
      4.1. Specific Options
      4.2. Tuning
        4.2.1. Concurrency
        4.2.2. Base Storage Driver Usage Warnings
  5. Usage
      5.1. Create
      5.2. Read
      5.3. Update
      5.4. Delete
      5.5. Key families
  6. Open Issues
  7. Development
      7.1. Build
      7.2. Test
        7.2.1. Automated
          7.2.1.1. Unit
          7.2.1.2. Integration
          7.2.1.3. Functional
        7.2.2. Manual

1. Introduction

This driver is intended to estimate the performance of the Pravega Key-Value Store. To work with streams, refer to mongoose-storage-driver-pravega.

Mongoose and Pravega use quite different concepts, so it is necessary to define how the Pravega KVS-specific terms map to the Mongoose abstractions.

| Pravega | Mongoose |
|---|---|
| Key-Value Table | Item Path or Data Item |
| Scope | Storage Namespace |
| Key-Value Pair | Data Item |
| Table Segment (KVT Partition) | N/A |

2. Features

TBD

3. Deployment

3.1. Basic

Java 11+ is required to build/run.

  1. Get the latest mongoose-base jar from the maven repo and put it into your working directory. Note the particular version, which is referred to as BASE_VERSION below.

  2. Get the latest mongoose-storage-driver-coop jar from the maven repo and put it into the ~/.mongoose/<BASE_VERSION>/ext directory.

  3. Get the latest mongoose-storage-driver-pravega-kvs jar from the maven repo and put it into the ~/.mongoose/<BASE_VERSION>/ext directory.

java -jar mongoose-base-<BASE_VERSION>.jar \
    --storage-driver-type=pravega-kvs \
    --storage-namespace=scope1 \
    --storage-net-node-addrs=<NODE_IP_ADDRS> \
    --storage-net-node-port=9090 \
    --load-batch-size=100 \
    ...
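Steps 2 and 3 above can be scripted. The following is a minimal sketch, not part of the project: the install_ext helper name is hypothetical, and the jar paths passed to it are whatever files you downloaded from the maven repo.

```shell
# Sketch: copy extension jars into ~/.mongoose/<BASE_VERSION>/ext.
# install_ext is a hypothetical helper; the first argument is the
# mongoose-base version, the remaining arguments are jar files.
install_ext() {
    base_version=$1
    shift
    ext_dir="$HOME/.mongoose/$base_version/ext"
    # Create the ext directory if it does not exist yet.
    mkdir -p "$ext_dir" || return 1
    for jar in "$@"; do
        cp -f "$jar" "$ext_dir/" || return 1
    done
    # Print the directory the jars were copied to.
    printf '%s\n' "$ext_dir"
}
```

A call would pass the base version first, followed by the coop and pravega-kvs driver jars.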

3.2. Docker

3.2.1. Standalone

docker run \
    --network host \
    emcmongoose/mongoose-storage-driver-pravega-kvs \
    --storage-namespace=scope1 \
    --storage-net-node-addrs=<NODE_IP_ADDRS> \
    --load-batch-size=100 \
    ...

3.2.2. Distributed

3.2.2.1. Additional Node

docker run \
    --network host \
    --expose 1099 \
    emcmongoose/mongoose-storage-driver-pravega-kvs \
    --run-node

3.2.2.2. Entry Node

docker run \
    --network host \
    emcmongoose/mongoose-storage-driver-pravega-kvs \
    --load-step-node-addrs=<ADDR1,ADDR2,...> \
    --storage-net-node-addrs=<NODE_IP_ADDRS> \
    --storage-namespace=scope1 \
    --load-batch-size=100 \
    ...
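The value of --load-step-node-addrs is a comma-separated list of the additional nodes. A small sketch for building it from separate host names follows; join_addrs is a hypothetical helper and the addresses are examples only.

```shell
# Sketch: join the arguments with commas, e.g. for --load-step-node-addrs.
# The function body runs in a subshell, so the IFS change does not leak.
join_addrs() (
    IFS=,
    printf '%s\n' "$*"
)

# Example addresses (assumptions, replace with your additional nodes).
join_addrs 10.0.0.1 10.0.0.2 10.0.0.3
```

The result can be substituted into the entry node command as --load-step-node-addrs="$(join_addrs ...)".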

4. Configuration

4.1. Specific Options

| Name | Type | Default Value | Description |
|---|---|---|---|
| storage-driver-control-scope | boolean | true | Allow to try to create scope |
| storage-driver-control-kvt | boolean | true | Allow to try to create kvt |
| storage-driver-control-timeoutMillis | integer | 5000 | The timeout for any Pravega Controller API call |
| storage-driver-event-key-enabled | boolean | false | Specifies if Mongoose should generate its own routing key during the events creation |
| storage-driver-event-key-count | integer | 0 | Specifies a max count of unique routing keys to use during the events creation (may be considered as a routing key period). 0 value means to use unique routing key for each new event |
| storage-net-node-addrs | list of strings | 127.0.0.1 | The list of the Pravega storage nodes to use for the load |
| storage-net-node-port | integer | 9090 | The default port of the Pravega storage nodes, should be explicitly set to 9090 (the value used by Pravega by default) |
| storage-net-maxConnPerSegmentstore | integer | 5 | The default amount of connections per each Pravega Segmentstore |
| storage-driver-family-key-enabled | boolean | false | Specifies if Mongoose should use Key Families |
| storage-driver-family-key-count | long | 0 | The default amount of Key Families |
| storage-driver-family-key-allow-empty | boolean | false | Specifies if Mongoose should allow KVP w/o Key Families |
| storage-driver-scaling-partitions | int | 1 | The default amount of partitions (Table segments) in KVT |

4.2. Tuning

  • storage-net-maxConnPerSegmentstore: this parameter can significantly affect performance, but raising it also increases the network load.
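One way to find a suitable value is to repeat the same load with several connection counts and compare the results. The sketch below is a dry run that only prints the command lines; print_conn_sweep is a hypothetical helper and the candidate values 1, 5 and 10 are arbitrary examples, not recommendations.

```shell
# Dry run: print (do not execute) one command line per candidate value.
# <BASE_VERSION> is the placeholder used throughout this README.
print_conn_sweep() {
    for conns in "$@"; do
        printf '%s %s\n' \
            'java -jar mongoose-base-<BASE_VERSION>.jar --storage-driver-type=pravega-kvs' \
            "--storage-net-maxConnPerSegmentstore=$conns"
    done
}

print_conn_sweep 1 5 10
```

Each printed line still needs the namespace, node address and load options from the usage examples before it can be run.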

4.2.2. Base Storage Driver Usage Warnings

See the design notes

5. Usage

5.1. Create

java -jar mongoose-base-<BASE_VERSION>.jar \
    --storage-driver-type=pravega-kvs \
    --storage-namespace=scope1 \
    --storage-net-node-addrs=<NODE_IP_ADDRS> \
    --storage-net-node-port=9090 \
    --item-output-file=items.csv \
    ...

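5.2. Read

Currently, only reading from an item list file is supported, so the create run must save its item list to a file (the --item-output-file=items.csv option), which is then passed to the read run as --item-input-file.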

java -jar mongoose-base-<BASE_VERSION>.jar \
    --load-op-type=read \
    --storage-driver-type=pravega-kvs \
    --storage-namespace=scope1 \
    --storage-net-node-addrs=<NODE_IP_ADDRS> \
    --storage-net-node-port=9090 \
    --item-input-file=items.csv \
    ...
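Because read depends on the item list file written by an earlier create run, it can help to check that the file exists and is non-empty before starting the load. A minimal sketch follows; require_items_file is a hypothetical helper, not a Mongoose feature.

```shell
# Sketch: succeed only if the item list file exists and is non-empty.
require_items_file() {
    if [ ! -s "$1" ]; then
        printf 'item list file "%s" is missing or empty\n' "$1" >&2
        return 1
    fi
}
```

The same check applies to the update and delete runs, which also take the item list file as input.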

5.3. Update

To run an update load, Mongoose needs to know which keys to update; so far these can be provided only through the --item-input-file=items.csv option. To create the file, use --item-output-file=items.csv in create mode. Because Mongoose uses a fixed data seed, you need to change the seed to upload different data. For a convenient way to set a new seed on each run, see the Mongoose expression language.

Note that Pravega uses the same mechanism for creates and updates, so updating non-existent keys simply creates them. There is no way to pass a key to update and get a 404; use read mode to check whether a key exists.

java -jar mongoose-base-<BASE_VERSION>.jar \
    --load-op-type=update \
    --storage-driver-type=pravega-kvs \
    --storage-namespace=scope1 \
    --storage-net-node-addrs=<NODE_IP_ADDRS> \
    --storage-net-node-port=9090 \
    --item-input-file=items.csv \
    --item-data-input-seed=7a42d9c482144167 \
    ...
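As an alternative to the expression language, a fresh seed can be generated in the shell before each run. The sketch below assumes the seed is a 16-digit hexadecimal string, like the value in the example above; new_seed is a hypothetical helper.

```shell
# Sketch: print 16 random hexadecimal digits (8 bytes from /dev/urandom).
new_seed() {
    # od prints the bytes as hex pairs; tr keeps only the hex digits.
    od -An -N8 -tx1 /dev/urandom | tr -dc '0-9a-f'
}

seed=$(new_seed)
printf '%s\n' "--item-data-input-seed=$seed"
```

The printed option can replace the fixed --item-data-input-seed value in the update command.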

5.4. Delete

To run a delete load, Mongoose needs to know which keys to delete; so far these can be provided only through the --item-input-file=items.csv option. To create the file, use --item-output-file=items.csv in create mode.

Note that Pravega checks the key sent in the request and deletes it if it exists. If the key does not exist, Pravega still reports success, so Mongoose counts the operation as successful. As a result, you can delete the same N keys any number of times and get N successful requests each time, even though the keys were actually deleted only the first time.

java -jar mongoose-base-<BASE_VERSION>.jar \
    --load-op-type=delete \
    --storage-driver-type=pravega-kvs \
    --storage-namespace=scope1 \
    --storage-net-node-addrs=<NODE_IP_ADDRS> \
    --storage-net-node-port=9090 \
    --item-input-file=items.csv \
    ...

5.5. Key families

Key families are disabled by default.

To run creates with key families, enable them and set the number of key families (the storage-driver-family-key-* options). To also allow key-value pairs without a key family during creates, set the allow-empty flag.

Reads do not require any additional key family flags as long as the item input file is used.

A full example with 10 key families that also allows pairs without a key family:

java -jar mongoose-base-<BASE_VERSION>.jar \
    --storage-driver-type=pravega-kvs \
    --storage-namespace=scope1 \
    --storage-net-node-addrs=<NODE_IP_ADDRS> \
    --storage-net-node-port=9090 \
    --storage-driver-family-key-enabled \
    --storage-driver-family-key-count=10 \
    --storage-driver-family-key-allow-empty \
    ...

6. Open Issues

Issue Description

7. Development

7.1. Build

Note the Pravega commit that should be used to build the corresponding Mongoose plugin, and specify it in the build.gradle file. Then run:

./gradlew clean jar

7.2. Test

7.2.1. Automated

7.2.1.1. Unit

./gradlew clean test

7.2.1.2. Integration

docker run -d --name=storage --network=host pravega/pravega:<PRAVEGA_VERSION> standalone
./gradlew integrationTest
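The Pravega standalone container may need some time before it accepts connections, so the tests can fail if they start immediately after docker run. A minimal polling sketch follows; wait_until is a hypothetical helper, and the usage note assumes that nc (netcat) is installed and that Pravega listens on port 9090 as configured above.

```shell
# Sketch: retry a command until it succeeds or the attempts run out.
# Usage: wait_until <attempts> <delay_seconds> <command> [args...]
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    while [ "$attempts" -gt 0 ]; do
        # Run the probe command quietly; stop as soon as it succeeds.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep "$delay"
    done
    return 1
}
```

For example, wait_until 30 2 nc -z 127.0.0.1 9090 polls the port for up to about a minute, and ./gradlew integrationTest can be chained after it with &&.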

7.2.1.3. Functional

TBD

7.2.2. Manual

  1. Build the storage driver.

  2. Copy the storage driver's jar file into Mongoose's ext directory:

cp -f build/libs/mongoose-storage-driver-pravega-kvs-*.jar ~/.mongoose/<MONGOOSE_BASE_VERSION>/ext/

     Note that the Pravega storage driver depends on the Coop Storage Driver extension, so it should also be put into the ext directory.

  3. Build and install the corresponding Pravega version:

./gradlew pravegaDistInstall

  4. Run the Pravega standalone node:

cd build/pravega/build/distributions/
tar -xzf pravega-<version>.tgz
./pravega-<version>/bin/pravega-standalone

  5. Run Mongoose's default scenario with some specific command-line arguments:

java -jar mongoose-<MONGOOSE_BASE_VERSION>.jar \
    --storage-driver-type=pravega-kvs \
    --storage-net-node-port=9090 \
    --storage-driver-limit-concurrency=10 \
    --item-output-path=goose-events-stream-0
