Iterative JSON filtering tool based on ijson.
Loading, filtering, and output are performed incrementally, without loading the whole file into memory, which makes it possible to filter large JSON files.
pip install ijson-filter
ijson uses the YAJL2 library for fast parsing, so installing it as well is highly recommended. ijson-filter still works without YAJL2, but significantly slower.
Usage: ijson-filter [OPTIONS] [INPUT]
Streaming JSON filter.
Options:
-o, --output FILENAME Output filename, defaults to STDOUT.
-f, --filter JSON_PATH_FILTER Filter a JSON path, format:
"PREFIX_PATH[=INT|~REGEX]" Examples: get last
50 elements of data.rows - "data.rows=-50",
get only data.rows and data.description keys
- "data~(rows|description)"
-v, --verbose Verbose output.
--help Show this message and exit.
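The filter format in the help text reads as a prefix path optionally followed by `=INT` or `~REGEX`. The following is a minimal sketch of splitting such a string, assuming the first `=` or `~` ends the prefix. `FilterSpec` and `parse_filter` are hypothetical names used for illustration only and are not part of ijson-filter.

```python
import re
from typing import NamedTuple, Optional, Pattern


class FilterSpec(NamedTuple):
    prefix: str                      # dotted path, e.g. "data.rows"
    limit: Optional[int]             # integer given after "=", if any
    pattern: Optional[Pattern[str]]  # compiled regex given after "~", if any


def parse_filter(spec: str) -> FilterSpec:
    # Assumption: the first "=" or "~" separates the prefix from its argument.
    match = re.match(r"^([^=~]*)(?:([=~])(.*))?$", spec, re.DOTALL)
    prefix, op, arg = match.groups()
    if op == "=":
        return FilterSpec(prefix, int(arg), None)
    if op == "~":
        return FilterSpec(prefix, None, re.compile(arg))
    return FilterSpec(prefix, None, None)


print(parse_filter("data.rows=-50"))
print(parse_filter("data~(rows|description)"))
```

Under these assumptions, the first call yields the prefix `data.rows` with a limit of -50, and the second yields the prefix `data` with a compiled key pattern.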
data.json:
{
"name": "Primary data set #1",
"table": {
"description": "Users",
"rows": [
{ "name": "User1", ... },
...
]
}
}
- Limit the rows field of data.json to its last 50 items and output to STDOUT:
$ ijson-filter -f "table.rows=-50" data.json
- Remove fields whose names contain a digit from the table object of data.json (using a regular expression) and write the result to filtered.json:
$ ijson-filter -f 'table~[^\d]+' data.json -o filtered.json
- Filter the output of Unix commands and pipe it to other commands (limit the array to its first 3 items):
$ echo '[1,2,3,4,5]' | ijson-filter -f 3 | python -m json.tool
[
1,
2,
3
]
- Multiple filters can be applied at once by specifying the --filter parameter multiple times:
$ ijson-filter -f 'table~rows' -f 'table.rows=5' data.json > filtered.json
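
The effect of the two filter kinds in the examples above can be approximated in plain Python. This is only a sketch, not ijson-filter's implementation. It assumes a positive limit keeps the first N items, a negative limit keeps the last N, and a key regex must match the whole key (an assumption inferred from the `[^\d]+` example). `limit_items` and `keep_keys` are hypothetical names.

```python
import re
from collections import deque
from itertools import islice


def limit_items(items, n):
    """Return an iterator over the first n items (n >= 0) or the last abs(n) items."""
    if n >= 0:
        return islice(items, n)
    # A bounded deque holds at most abs(n) items while the stream is consumed,
    # so memory use does not grow with the length of the input.
    return iter(deque(items, maxlen=-n))


def keep_keys(pairs, pattern):
    """Return an iterator over (key, value) pairs whose key fully matches the regex."""
    regex = re.compile(pattern)
    return ((key, value) for key, value in pairs if regex.fullmatch(key))


print(list(limit_items(iter([1, 2, 3, 4, 5]), 3)))   # [1, 2, 3]
print(list(limit_items(iter([1, 2, 3, 4, 5]), -2)))  # [4, 5]
print(dict(keep_keys({"rows": [], "col1": 0}.items(), r"[^\d]+")))  # {'rows': []}
```

Both helpers consume their input lazily, which mirrors the streaming approach described at the top: only the retained items are ever held in memory.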