
v3.1 IO Path

Andrey Kurilov edited this page Feb 22, 2017 · 1 revision

Mongoose does not distinguish between the directories, buckets, or containers used to create, read, update, or delete items. It uses an abstract "path" entity instead. How that path is interpreted, however, depends on the target storage:

Case          Meaning/Effect
Filesystem    Relative directory path (may be without a leading slash)
HTTP/Atmos    The directory path, if the namespace REST interface is used
HTTP/S3       Bucket with a leading slash and an optional directory path
HTTP/Swift    Container with a leading slash and an optional directory path

Item Output Path

Note:

  • The item output path is specified with the "--item-output-path" option.
  • If the load type is "create" and no item output path is specified, a path is created from a timestamp value, for example "2016.10.25.12.35.28.436".

After processing, the item output path value is prepended to the item name. This can be seen in the item output file (if configured). For example, if the item output path is set to "path/to/target", the item output file will contain records like the following:

...
/path/to/target/0crewxu9yvxe0,174f7682f2c000c8,1048576,0/0
/path/to/target/1dnd9jbc676aj,5ab0a3085060019b,1048576,0/0
/path/to/target/1nh38h71swjky,6ca3398403bf1b22,1048576,0/0
...

Because each record already carries the full path, it is not necessary to specify the path (directory, bucket, etc.) when an item input file is used.
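As a rough illustration of how the path and the item name can be recovered from such a record, here is a minimal Python sketch. The helper name `split_record` is hypothetical, and the sketch only assumes what the sample above shows: fields are comma-separated and the first field is the full item path. The remaining fields are not described on this page, so they are passed through unchanged.

```python
import posixpath


def split_record(line):
    """Split one item output record into (directory, item name, other fields)."""
    # Assumption: the first comma-separated field is the full item path.
    # The other fields are not documented here, so they are kept verbatim.
    full_path, *rest = line.strip().split(",")
    directory, name = posixpath.split(full_path)
    return directory, name, rest


record = "/path/to/target/0crewxu9yvxe0,174f7682f2c000c8,1048576,0/0"
directory, name, rest = split_record(record)
print(directory)  # /path/to/target
print(name)       # 0crewxu9yvxe0
```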

Variable Output Path

Since Mongoose supports variable configuration values, it's possible to specify a variable output path. To define the variable path, the "width" and "depth" of the resulting hierarchy must be specified. The pattern is:

%p{<width>;<depth>}

For example:

java -jar mongoose-next/mongoose.jar \
    --item-output-path=/tmp/test/%p{16\;2} \
    --item-data-size=10KB --load-limit-count=1000 --storage-type=fs \
    --item-output-file=/tmp/test.csv

will write 1000 files of 10KB each into the path hierarchy defined by the pattern (16 subdirectories on each level, 2 levels).

This can be seen in the resulting item output file from this example:

/tmp/test/5/6/0comnfkhnqbp1,172b494afe0718b5,10240,0/0
/tmp/test/f/8/0un0n3vnpl470,37f769767af92d7c,10240,0/0
/tmp/test/8/15bqjt2p0qq0l,4b7ca52770436a35,10240,0/0
/tmp/test/4/1fs55nwiajke2,5e95e2eb2a407ffa,10240,0/0
/tmp/test/7/0/075lplpyyd1dm,d121467cc3b423a,10240,0/0
/tmp/test/1/7/0xmcdo3kpvnrf,3d698022965c430b,10240,0/0
/tmp/test/2/1xbqvxh9lp2ti,7ea1f70a94692386,10240,0/0
/tmp/test/c/a/153rnnt6x3acj,4b15215b3e9f8023,10240,0/0
/test/c/0/15wmz8fdh5sl7,4c8c21a742f114fb,10240,0/0
/test/2/4/0n0srwf4pdpm9,2a0d82785a79fbf1,10240,0/0
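To get a feel for how large such a hierarchy can become, the following Python sketch computes the number of directories per level. It is an upper bound under the assumption that every directory on every level gets exactly "width" subdirectories. The function name `directory_counts` is hypothetical, and nothing here reflects how Mongoose actually picks a directory for each item. Some records in the listing above show only one directory level, so real paths may be shallower than the configured depth.

```python
def directory_counts(width, depth):
    """Return the number of directories on each level of a full hierarchy."""
    # Level 1 has `width` directories, level 2 has width * width, and so on.
    return [width ** level for level in range(1, depth + 1)]


counts = directory_counts(16, 2)
print(counts)       # [16, 256]
print(sum(counts))  # 272
```

With 1000 items spread over at most 256 second-level directories, each directory would hold only a handful of files on average.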

Warning:

The ";" character used to define the path should be escaped with "\" when used from the CLI.
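The effect of the escaping can be checked with Python's `shlex` module, which approximates POSIX shell word splitting. This sketch assumes the backslash is needed only so that the shell does not treat ";" as a command separator, and that Mongoose itself receives the plain ";". Under that assumption, single-quoting the whole argument would be an equivalent alternative.

```python
import shlex

# Backslash-escaped form, as in the example command above.
escaped = shlex.split(r"--item-output-path=/tmp/test/%p{16\;2}")

# Single-quoted form: an assumed alternative, not taken from the Mongoose docs.
quoted = shlex.split("'--item-output-path=/tmp/test/%p{16;2}'")

print(escaped)  # ['--item-output-path=/tmp/test/%p{16;2}']
print(escaped == quoted)  # True
```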

Item Input Path

Note:

  • The item input path is specified with the "--item-input-path" option.
  • The option has no effect if an item input file is configured.

If an item input path is used, the behavior depends on the configured load type:

  • Create

    Using the item input path enables Copy Mode for the Create load type. Mongoose will try to copy the items from the input path to the output path.

  • Read/Update/Delete

    The input path will be listed, and the read/update/delete operations will be performed on the items from that listing.
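The two cases above can be summarized with a small Python sketch. It is a conceptual model only, not Mongoose's implementation: the function `plan_operations`, the tuple layout, and the lowercase load type names are all hypothetical, and the listing of the input path is passed in as a plain list.

```python
import posixpath


def plan_operations(load_type, listed_items, output_path=None):
    """Map a listing of the input path to (operation, source, destination) tuples."""
    if load_type == "create":
        # Copy Mode: every listed item is copied under the output path.
        if output_path is None:
            raise ValueError("copy mode needs an output path")
        return [
            ("copy", item, posixpath.join(output_path, posixpath.basename(item)))
            for item in listed_items
        ]
    if load_type in ("read", "update", "delete"):
        # The listed items are processed in place, so there is no destination.
        return [(load_type, item, None) for item in listed_items]
    raise ValueError("unknown load type: " + load_type)


listing = ["/src/item0", "/src/item1"]
print(plan_operations("create", listing, "/dst"))
print(plan_operations("read", listing))
```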
