`classy-bench` is a low-code Python library that simplifies the process of training and evaluating baseline models for real-world multi-label classification applications. Simply provide your datasets, and quickly get a benchmark of multiple models tailored to your specific use case.
Features and benefits:
- Ready-to-use pipelines: 7 built-in configurable pipelines (BM25, Class TF-IDF, Doc TF-IDF, Bi-Encoder, Cross-Encoder, RoBERTa and T5)
- Customizable: support for bring-your-own AWS SageMaker pipeline code
- Scalable: run training pipelines on any instance type and any number of instances
- Faster experimentation: quickly understand which model performs best on your data
- Low-code: any member of the team can run the benchmark with confidence
This library was created as part of the research project "The Right Model for the Job: An Evaluation of Legal Multi-Label Classification Baselines" (Forster, M., Schulz, C., Nokku, P., Mirsafian, M., Kasundra, J. and Skylaki, S.). See the paper on arXiv for more details.
pip install git+https://github.com/thomsonreuters/classy_bench.git
PyPI link coming soon!
- A dataset that is split into 3 files (`train.csv`, `dev.csv` and `test.csv`) that contain the train, validation and test sets respectively. Each file must have the following columns:
  - `id`: an identifier for each sample, e.g. a document id
  - `text`: the input text
  - `labels`: the labels list as a string (e.g. `"[LabelA, OtherLabel, LabelB]"`)
- Provide a `config.json` file that specifies which classifiers you want to run. In this config file, you can also set hyperparameters for training and evaluation. We recommend that you start by using the default_config.json and adjust it as needed.
- Run the benchmark as shown in the `notebooks/example.ipynb` notebook.
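As an illustration of the dataset layout described above, here is a minimal sketch that writes the three split files with the standard library. The helper names (`format_labels`, `write_split`, `write_dataset`) are hypothetical and not part of `classy-bench`; the label string follows the `"[LabelA, OtherLabel, LabelB]"` example and assumes label names contain no commas or brackets.

```python
import csv
from pathlib import Path

# Column names required by the built-in pipelines, as listed above.
REQUIRED_COLUMNS = ["id", "text", "labels"]


def format_labels(labels):
    """Render a list of label names as a string such as "[LabelA, OtherLabel]".

    Assumption: label names contain no commas or brackets.
    """
    return "[" + ", ".join(labels) + "]"


def write_split(path, samples):
    """Write (id, text, labels) tuples to a CSV file with the required header."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=REQUIRED_COLUMNS)
        writer.writeheader()
        for sample_id, text, labels in samples:
            writer.writerow(
                {"id": sample_id, "text": text, "labels": format_labels(labels)}
            )


def write_dataset(output_dir, train, dev, test):
    """Create train.csv, dev.csv and test.csv inside output_dir."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    for name, samples in (("train", train), ("dev", dev), ("test", test)):
        write_split(output_dir / f"{name}.csv", samples)
```

For example, `write_dataset("data", train_samples, dev_samples, test_samples)` would produce `data/train.csv`, `data/dev.csv` and `data/test.csv`, where each sample is a tuple like `("doc-1", "Some input text", ["LabelA", "LabelB"])`.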
- If you are planning to use any of the included pipelines, you must have a dataset split into 3 files (`train.csv`, `dev.csv` and `test.csv`) that contain the train, validation and test sets respectively. Each file must have the following columns:
  - `id`: an identifier for each sample, e.g. a document id
  - `text`: the input text
  - `labels`: the labels list as a string (e.g. `"[LabelA, OtherLabel, LabelB]"`)

  If you are planning to only use custom pipelines, you can set your own rules. :)

- Provide a `config.json` file that specifies which classifiers you want to run. Please refer to the Custom Pipeline page for an example of how to set up the config file. We recommend that you start by using the default_config.json and adjust it as needed.

- Run the benchmark as shown in the `notebooks/example.ipynb` notebook.
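Before launching a benchmark, it can help to confirm that the dataset folder matches the layout the included pipelines expect. The following is a rough sketch of such a check; `check_dataset` is a hypothetical helper, not a `classy-bench` function, and it only verifies that the files exist, that the required columns are present, and that each `labels` value is a bracketed string.

```python
import csv
from pathlib import Path

# File and column names taken from the dataset requirements above.
SPLIT_FILES = ("train.csv", "dev.csv", "test.csv")
REQUIRED_COLUMNS = {"id", "text", "labels"}


def check_dataset(data_dir):
    """Return a list of problems found; an empty list means the layout looks fine."""
    problems = []
    for filename in SPLIT_FILES:
        path = Path(data_dir) / filename
        if not path.is_file():
            problems.append(f"{filename}: file not found")
            continue
        with open(path, newline="", encoding="utf-8") as f:
            reader = csv.DictReader(f)
            missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
            if missing:
                problems.append(f"{filename}: missing columns {sorted(missing)}")
                continue
            # Data rows are counted from 1, not including the header row.
            for row_number, row in enumerate(reader, start=1):
                labels = (row["labels"] or "").strip()
                if not (labels.startswith("[") and labels.endswith("]")):
                    problems.append(
                        f"{filename}: data row {row_number} has labels "
                        "that are not a bracketed list"
                    )
    return problems
```

Calling `check_dataset("data")` returns an empty list when all three files are present and well-formed, so it can be used as a quick guard before starting a run.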
See the Wiki for more information on how to use the library.
- Claudia Schulz
- Edoardo Abati
- Laura Skylaki
- Martina Forster
- Prudhvi Nokku
- Sammy Hannat
Please note we are unable to promise immediate support and/or regular updates to this project at this stage. However, if you need help or have any improvement ideas, please feel free to open an issue or a PR. We'd love to hear your feedback!
Requirements: `hatch`, `pre-commit`
- Clone the repository
- Run `hatch shell` to create and activate a virtual environment
- Run `pre-commit install` to install the pre-commit hooks. This will enforce the linting and formatting checks.
- Linting and formatting checks: `hatch run lint:fmt`
- Unit tests: `hatch run test`
`classy-bench` is distributed under the terms of the MIT license.