Experimental Standards for Deep Learning in Natural Language Processing Research

This repository contains supplementary material for our paper of the same title. Since the paper can only capture the state of affairs at the time of publication, the aim is to maintain a more up-to-date version of the resources from its appendix here and to invite the community to collaborate in a transparent manner.

We maintain a version of Table 1 from the original paper, giving an overview of useful resources for the different stages of the research process, namely Data, Codebase & Models, Experiments & Analysis, and Publication.

In CHECKLIST.md, we distil the actionable points at the end of the core paper sections into a reusable and modifiable checklist to ensure replicability.

In CHANGELOG.md, we transparently document changes to the repository and versioning. The current version is v0.1.

🎓 Citing

If you find the resources helpful or are using the checklist for one of your academic projects, please cite us in the following way:

@inproceedings{ulmer-etal-2022-experimental,
title = "Experimental Standards for Deep Learning in Natural Language Processing Research",
author = {Ulmer, Dennis  and
  Bassignana, Elisa  and
  M{\"u}ller-Eberstein, Max  and
  Varab, Daniel  and
  Zhang, Mike  and
  van der Goot, Rob  and
  Hardmeier, Christian  and
  Plank, Barbara},
booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2022",
month = dec,
year = "2022",
address = "Abu Dhabi, United Arab Emirates",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2022.findings-emnlp.196",
pages = "2673--2692",
}

In your paper, you could, for instance, cite our work as follows:

For our experimental design, we follow many of the guidelines laid out by \citet{ulmer2022experimental}.

🧩 Contributing

Contributions can come in two forms: opening an issue to correct mistakes or improve existing content, or adding new content via a pull request.

When opening an issue, please label the issue accordingly:

  • enhancement-resources for issues improving or correcting entries in RESOURCES.md.
  • enhancement-standards for issues improving or correcting entries in CHECKLIST.md.
  • duplicate for indicating duplicate entries.
  • general for general questions / issues with the repository.

To contribute new content, please first read the contributing guidelines in CONTRIBUTING.md before opening a pull request. Use the label

  • enhancement-resources for pull requests adding new resources and
  • enhancement-standards for pull requests adding new points to the checklist.

The pull request template can be found in PULL_REQUEST_TEMPLATE.md.

Resources

We split Table 1 from the paper into section-specific resources below.

📊 Data

| Name | Description | Link / Reference |
| --- | --- | --- |
| Data Version Control (DVC) | Command-line tool to version datasets and models. | Link / Paper |
| Hugging Face datasets | Hub to store and share (NLP) datasets. | Link / Paper |
| European Language Resources Association | Public institution for language and evaluation resources. | About / Link |
| LINDAT/CLARIN | Open access to language resources and other data and services supporting research in digital humanities and social sciences. | Link / Paper |
| Zenodo | General-purpose open-access repository for research papers, datasets, research software, reports, and any other research-related digital artifacts. | Link |
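
To give a flavor of how such resources slot into a workflow, here is a minimal sketch of loading a shared dataset with Hugging Face datasets; the dataset name "imdb" is only an illustrative placeholder.

```python
# Minimal sketch (illustrative dataset name): loading a shared dataset
# with Hugging Face datasets for a reproducible data pipeline.
from datasets import load_dataset

dataset = load_dataset("imdb")   # downloads and caches the dataset from the Hub
print(dataset)                   # shows the available splits and their sizes
print(dataset["train"][0])       # inspect the first training example
```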

💻 Codebase & Models

| Name | Description | Link / Reference |
| --- | --- | --- |
| Anonymous GitHub | Website to double-anonymize a GitHub repository. | Link |
| Bitbucket | Website and cloud-based service that helps developers store and manage their code, as well as track and control changes to it. | Link |
| Conda | Open-source package management and environment management system. | Link |
| codecarbon | Python package for estimating and tracking the carbon emissions of various kinds of computer programs. | Link |
| ONNX | Open format built to represent machine learning models. | Link |
| Pipenv | Virtual environment tool for managing Python packages. | Link |
| Releasing Research Code | GitHub repository with many tips and templates for releasing research code. | Link |
| Virtualenv | Tool to create isolated Python environments. | Link |
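
As an example from this table, the sketch below shows how the emissions of a training run could be tracked with codecarbon; `train_model()` is a hypothetical placeholder for your own training code.

```python
# Minimal sketch: estimating the carbon footprint of an experiment with codecarbon.
# train_model() is a hypothetical placeholder for your own training loop.
from codecarbon import EmissionsTracker

tracker = EmissionsTracker()   # writes an emissions report (CSV) by default
tracker.start()
# train_model()
emissions = tracker.stop()     # estimated emissions in kg CO2-equivalent
print(f"Estimated emissions: {emissions} kg CO2eq")
```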

🔬 Experiments & Analysis

| Name | Description | Link / Reference |
| --- | --- | --- |
| baycomp | Python implementation of Bayesian tests for the comparison of classifiers. | Link / Paper |
| BayesianTestML | Like baycomp, but also including Julia and R implementations. | Link / Paper |
| confidenceinterval | Python package that computes confidence intervals for common evaluation metrics. | Link |
| deep-significance | Python package implementing the ASO test by Dror et al. (2019) and other utilities. | Link |
| HyBayes | Python package implementing a variety of frequentist and Bayesian significance tests. | Link |
| Hugging Face evaluate | Library that implements standardized versions of evaluation metrics and significance tests. | Link |
| pingouin | Python package implementing various parametric and non-parametric statistical tests. | Link / Paper |
| Protocol Buffers | Data structure format for storing model predictions. | Link |
| RankingNLPSystems | Python package to create a fair global ranking of models across multiple tasks (and evaluation metrics). | Link / Paper |
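
For instance, significance testing across random seeds could look roughly like the following sketch using deep-significance (the `deepsig` package); the score values are made up purely for illustration.

```python
# Minimal sketch: comparing two models over multiple random seeds with the
# ASO test from deep-significance. The scores below are made-up examples.
from deepsig import aso

scores_a = [0.82, 0.80, 0.83, 0.79, 0.81]   # e.g. accuracy of model A over 5 seeds
scores_b = [0.78, 0.77, 0.80, 0.76, 0.79]   # e.g. accuracy of model B over 5 seeds

min_eps = aso(scores_a, scores_b, seed=1234)  # lower values indicate A dominates B
print(f"eps_min = {min_eps}")                 # e.g. eps_min < 0.5 suggests A is better
```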

📄 Publication

| Name | Description | Link / Reference |
| --- | --- | --- |
| dblp | Computer science bibliography to find the correct versions of papers. | Link |
| impact | Online calculator of carbon emissions based on GPU type. | Link / Paper |
| Google Scholar | Scientific publication search engine. | Link |
| Semantic Scholar | Scientific publication search engine. | Link |
| rebiber | Python tool to check and normalize BibTeX entries to the official published versions of the cited papers. | Link |
