IndoNLP

We are researchers working to raise the baseline for Indonesian NLP. We collaborate to release new data resources and benchmarks.

Pinned

  1. indonlu Public

    The first large-scale natural language processing benchmark for the Indonesian language. We provide multiple downstream tasks, pre-trained IndoBERT models, and starter code! (AACL-IJCNLP 2020)

    Jupyter Notebook, 556 stars, 195 forks

  2. nusa-crowd Public

    A collaborative project to collect datasets in Indonesian languages.

    Jupyter Notebook, 262 stars, 62 forks

  3. nusax Public

    High-quality parallel resource on sentiment analysis for 10 low-resource Indonesian languages, English, and Indonesian (Outstanding Paper at EACL 2023)

    Jupyter Notebook, 93 stars, 10 forks

  4. indonlg Public

    The first large-scale natural language generation benchmark for Indonesian, Sundanese, and Javanese. We provide multiple downstream tasks, pre-trained IndoGPT and IndoBART models, and starter code!…

    Python, 70 stars, 12 forks

Repositories

Showing 10 of 10 repositories
  • .github Public

    Landing page

    1 star, 0 forks, 0 issues, 0 pull requests, updated Nov 26, 2024
  • indonlg Public

    The first large-scale natural language generation benchmark for Indonesian, Sundanese, and Javanese. We provide multiple downstream tasks, pre-trained IndoGPT and IndoBART models, and starter code! (EMNLP 2021)

    Python, 70 stars, Apache-2.0, 12 forks, 1 issue, 0 pull requests, updated Nov 16, 2024
  • indonlu Public

    The first large-scale natural language processing benchmark for the Indonesian language. We provide multiple downstream tasks, pre-trained IndoBERT models, and starter code! (AACL-IJCNLP 2020)

    Jupyter Notebook, 556 stars, Apache-2.0, 195 forks, 5 issues, 1 pull request, updated Nov 16, 2024
  • nusa-writes Public

    NusaWrites is an in-depth analysis of corpora collection strategy and a comprehensive language modeling benchmark for underrepresented and extremely low-resource Indonesian local languages.

    Jupyter Notebook, 24 stars, Apache-2.0, 2 forks, 0 issues, 0 pull requests, updated Sep 27, 2024
  • cendol Public

    Indonesian T0 | Instruction-tuning for low-resource and extremely low-resource Austronesian languages

    Jupyter Notebook, 11 stars, Apache-2.0, 1 fork, 0 issues, 1 pull request, updated Jun 24, 2024
  • nusa-crowd Public

    A collaborative project to collect datasets in Indonesian languages.

    Jupyter Notebook, 262 stars, Apache-2.0, 62 forks, 35 issues (5 issues need help), 2 pull requests, updated Jun 2, 2024
  • nusa-catalogue Public

    Dataset Catalogue Homepage for Indonesian Languages

    JavaScript, 7 stars, Apache-2.0, 8 forks, 1 issue, 0 pull requests, updated Feb 19, 2024
  • nusax Public

    High-quality parallel resource on sentiment analysis for 10 low-resource Indonesian languages, English, and Indonesian (Outstanding Paper at EACL 2023)

    Jupyter Notebook, 93 stars, Apache-2.0, 10 forks, 0 issues, 0 pull requests, updated May 8, 2023
  • nusacrowd-asr Public

    NusaCrowd ASR Experiment

    Jupyter Notebook, 2 stars, Apache-2.0, 0 forks, 0 issues, 0 pull requests, updated Jan 5, 2023
  • indonlp.github.io Public

    SCSS, 1 star, Apache-2.0, 1 fork, 0 issues, 0 pull requests, updated Jun 12, 2022