ftmq

An attempt at a followthemoney query DSL.

This library provides methods to query and filter entities formatted as followthemoney data, either from a JSON file or stream, or from a SQL backend via followthemoney-store.

It also provides a Query class that other libraries can use to build SQL queries or API queries.

Minimum Python version: 3.11

Installation

pip install ftmq

Usage

ftmq accepts either a line-based input stream or an argument with a file URI. (For integration with followthemoney-store, see below.)

Input stream:

cat entities.ftm.json | ftmq <filter expression> > output.ftm.json

URI argument:

Under the hood, ftmq uses smart_open, so the -i argument accepts arbitrary file URIs:

ftmq <filter expression> -i ~/Data/entities.ftm.json
ftmq <filter expression> -i https://example.org/data.json.gz
ftmq <filter expression> -i s3://data-bucket/entities.ftm.json
ftmq <filter expression> -i webhdfs://host:port/path/file

...and so on

The same is possible for output via -o:

cat data.json | ftmq <filter expression> -o s3://data-bucket/output.json

Filter for a dataset:

cat entities.ftm.json | ftmq -d ec_meetings

Filter for a schema:

cat entities.ftm.json | ftmq -s Person

Filter for a schema and all its descendants or ancestors:

cat entities.ftm.json | ftmq -s LegalEntity --schema-include-descendants
cat entities.ftm.json | ftmq -s LegalEntity --schema-include-ancestors
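To illustrate what including descendants or ancestors means, here is a minimal sketch on a toy hierarchy. The PARENTS mapping is a simplified excerpt and the function names are hypothetical; this is not how ftmq resolves schemata internally.

```python
# Toy excerpt of a schema hierarchy: child -> list of direct parents.
# Simplified for illustration; the real followthemoney model is larger.
PARENTS = {
    "Thing": [],
    "LegalEntity": ["Thing"],
    "Person": ["LegalEntity"],
    "Organization": ["LegalEntity"],
    "Company": ["Organization"],
}


def ancestors(schema: str) -> set[str]:
    """All schemata that `schema` inherits from, directly or indirectly."""
    found: set[str] = set()
    stack = list(PARENTS[schema])
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(PARENTS[parent])
    return found


def descendants(schema: str) -> set[str]:
    """All schemata that inherit from `schema`, directly or indirectly."""
    return {name for name in PARENTS if schema in ancestors(name)}


def expand(
    schema: str,
    include_descendants: bool = False,
    include_ancestors: bool = False,
) -> set[str]:
    """Set of schema names an entity may have to pass the filter."""
    names = {schema}
    if include_descendants:
        names |= descendants(schema)
    if include_ancestors:
        names |= ancestors(schema)
    return names
```

With this toy mapping, filtering for LegalEntity with descendants would also accept Person, Organization and Company, while including ancestors would also accept Thing.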

Filter for properties:

Properties are passed as options in the form --<prop>=<value>:

cat entities.ftm.json | ftmq -s Company --country=de
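For readers who want to see the idea behind such a filter, the following sketch does the same thing on a line-based JSON stream with the standard library only. It assumes the usual followthemoney entity layout (a "schema" string and a "properties" mapping of lists); filter_stream is a hypothetical helper, not part of ftmq.

```python
import io
import json
from typing import Iterator, TextIO


def filter_stream(lines: TextIO, schema: str, **props: str) -> Iterator[dict]:
    """Yield entities of `schema` whose properties contain all given values."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the stream
        entity = json.loads(line)
        if entity.get("schema") != schema:
            continue
        values = entity.get("properties", {})
        # Properties are multi-valued: each name maps to a list of strings.
        if all(value in values.get(prop, []) for prop, value in props.items()):
            yield entity


# Three sample entities, one JSON object per line.
sample = io.StringIO(
    '{"id": "a", "schema": "Company", "properties": {"country": ["de"]}}\n'
    '{"id": "b", "schema": "Company", "properties": {"country": ["fr"]}}\n'
    '{"id": "c", "schema": "Person", "properties": {"country": ["de"]}}\n'
)
matched = [entity["id"] for entity in filter_stream(sample, "Company", country="de")]
print(matched)
```

Only the entity that is both a Company and has "de" among its country values passes.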

Comparison lookups for properties:

cat entities.ftm.json | ftmq -s Company --incorporationDate__gte=2020 --address__ilike=berlin

Possible lookups:

gt - greater than
lt - less than
gte - greater than or equal
lte - less than or equal
like - SQLish LIKE (use % placeholders)
ilike - SQLish ILIKE, case-insensitive (use % placeholders)
[] - usage: prop[]=foo evaluates if foo is member of array prop
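The semantics of these lookups can be sketched in plain Python. This is an illustration under stated assumptions, not ftmq's implementation: values are compared as strings, a lookup passes if any value of the property satisfies it, and like/ilike must match the whole value as in SQL. The names check and like_to_regex are hypothetical.

```python
import operator
import re

# Comparison lookups mapped to their Python operators.
OPERATORS = {
    "gt": operator.gt,
    "lt": operator.lt,
    "gte": operator.ge,
    "lte": operator.le,
}


def like_to_regex(pattern: str, ignore_case: bool = False) -> re.Pattern:
    """Translate a LIKE pattern with % placeholders into a regex."""
    # Escape the literal parts and join them with "match anything".
    body = ".*".join(re.escape(part) for part in pattern.split("%"))
    flags = (re.IGNORECASE | re.DOTALL) if ignore_case else re.DOTALL
    return re.compile(body, flags)


def check(values: list[str], lookup: str, expected: str) -> bool:
    """Return True if any value of the property satisfies the lookup."""
    if lookup == "[]":
        return expected in values  # array membership
    if lookup in ("like", "ilike"):
        regex = like_to_regex(expected, ignore_case=(lookup == "ilike"))
        return any(regex.fullmatch(value) for value in values)
    compare = OPERATORS[lookup]
    # Assumption: plain string comparison, which orders ISO dates correctly.
    return any(compare(value, expected) for value in values)
```

Under the string comparison assumption, a value such as "2021-03-01" satisfies gte against "2020" because ISO dates sort lexicographically.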

ftmq apply

"Uplevel" an entity input stream to nomenklatura.entity.CompositeEntity and optionally apply a dataset.

ftmq apply -i ./entities.ftm.json -d <additional_dataset>

Overwrite datasets:

ftmq apply -i ./entities.ftm.json -d <additional_dataset> --replace-dataset
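The difference between adding and replacing a dataset can be sketched on plain dictionaries. This assumes the serialized entity carries a "datasets" list; apply_dataset is a hypothetical helper and not the code behind ftmq apply.

```python
def apply_dataset(entity: dict, dataset: str, replace: bool = False) -> dict:
    """Return a copy of `entity` with `dataset` added to, or replacing, its datasets."""
    # Start from an empty list when replacing, otherwise keep what is there.
    datasets = [] if replace else list(entity.get("datasets", []))
    if dataset not in datasets:
        datasets.append(dataset)
    # Build a new dict so the input entity is left untouched.
    return {**entity, "datasets": datasets}


entity = {"id": "a", "schema": "Person", "datasets": ["ec_meetings"]}
added = apply_dataset(entity, "extra")
replaced = apply_dataset(entity, "extra", replace=True)
```

Here added keeps "ec_meetings" alongside "extra", while replaced carries only "extra".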

Coverage / Statistics

In ftm scripting we often iterate through all the proxies anyway (e.g. during aggregation), so statistics can be collected along the way. There is a context manager for this, which produces the Coverage model:

Print coverage to stdout (and discard the filtered entities):

cat entities.ftm.json | ftmq -s Event -o /dev/null --coverage-uri -

Within code:

from ftmq.coverage import Collector

fragments = [...]
buffer = {}

c = Collector()
for proxy in fragments:
    if proxy.id in buffer:
        buffer[proxy.id].merge(proxy)
    else:
        buffer[proxy.id] = proxy
        # here collect stats:
        c.collect(proxy)

coverage = c.export()
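To give a rough idea of what collecting statistics during a single pass can look like, here is a standard-library sketch. The SimpleStats class and the fields it exports are assumptions made for illustration; they do not describe the actual Collector or Coverage model.

```python
from collections import Counter


class SimpleStats:
    """Hypothetical stand-in that counts schemata and countries in one pass."""

    def __init__(self) -> None:
        self.total = 0
        self.schemata: Counter = Counter()
        self.countries: Counter = Counter()

    def collect(self, entity: dict) -> None:
        """Update the counters with one entity (plain dict form)."""
        self.total += 1
        self.schemata[entity["schema"]] += 1
        self.countries.update(entity.get("properties", {}).get("country", []))

    def export(self) -> dict:
        """Return the collected numbers as plain data."""
        return {
            "total": self.total,
            "schemata": dict(self.schemata),
            "countries": dict(self.countries),
        }


entities = [
    {"id": "a", "schema": "Event", "properties": {"country": ["be"]}},
    {"id": "b", "schema": "Event", "properties": {"country": ["be", "de"]}},
    {"id": "c", "schema": "Person", "properties": {}},
]
stats = SimpleStats()
for entity in entities:
    stats.collect(entity)
summary = stats.export()
```

The point of the pattern is that no second pass over the data is needed: the counters are updated in the same loop that already touches every entity.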

ftmstore (database read)

NOT IMPLEMENTED YET

The same CLI logic applies:

ftmq store iterate -d ec_meetings -s Event --date__gte=2019 --date__lte=2020

Python Library

from ftmq import Query

q = Query() \
    .where(dataset="ec_meetings", date__lte=2020) \
    .where(schema="Event") \
    .order_by("date", ascending=False)

# `proxy` is a followthemoney entity proxy obtained elsewhere
assert q.apply(proxy)

Support

This project is part of investigraph

Media Tech Lab Bayern batch #3
