lab-v2/langdiversity

LangDiversity

Elevate your language models with insightful diversity metrics.

Links

Paper: https://arxiv.org/abs/2308.11189

Video: https://www.youtube.com/watch?v=BekDOLm6qBI&t=10s&ab_channel=NeuroSymbolic

Check out LangDiversity Hello World if you're new.

Introduction

LangDiversity is a package that provides tools to calculate diversity measures for a given set of data. Specifically, it can compute measures like Shannon's entropy and Gini impurity. It also offers utilities to select prompts based on their diversity scores when interacting with models like OpenAI's GPT-3.5 Turbo.
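To make the two measures concrete, here is a minimal standard-library sketch of Shannon's entropy and Gini impurity over a list of responses. It is not the LangDiversity API: the function names `shannon_entropy` and `gini_impurity` and the sample `answers` list are assumptions made for illustration only.

```python
import math
from collections import Counter


def shannon_entropy(responses):
    """Shannon entropy (in bits) of the empirical distribution of responses."""
    if not responses:
        raise ValueError("responses must not be empty")
    total = len(responses)
    counts = Counter(responses)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def gini_impurity(responses):
    """Gini impurity: 1 minus the sum of squared response frequencies."""
    if not responses:
        raise ValueError("responses must not be empty")
    total = len(responses)
    counts = Counter(responses)
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


# Hypothetical answers returned by a model for the same question.
answers = ["4", "4", "5", "4"]
print(shannon_entropy(answers))
print(gini_impurity(answers))
```

Both measures are zero when every response is identical and grow as the responses spread over more distinct values.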

The primary goal of this project is to assist researchers and developers in analyzing the diversity of responses generated by language models, thereby aiding in the evaluation and fine-tuning of such models.
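One way to picture prompt selection by diversity score is the sketch below. It assumes that several responses have already been collected for each candidate prompt and that the prompt with the least diverse responses is the one to keep; whether low or high diversity is preferable depends on the use case, so that choice is exposed as a parameter. The names `select_prompt`, `prefer_low`, and the hard-coded `samples` are hypothetical and do not come from the package.

```python
import math
from collections import Counter


def entropy(responses):
    """Shannon entropy (in bits) of a non-empty list of responses."""
    total = len(responses)
    counts = Counter(responses)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def select_prompt(responses_by_prompt, measure=entropy, prefer_low=True):
    """Score each prompt's responses and return (prompt, score) for the best one."""
    scores = {p: measure(r) for p, r in responses_by_prompt.items()}
    pick = min if prefer_low else max
    best = pick(scores, key=scores.get)
    return best, scores[best]


# Hard-coded stand-ins for responses that would normally come from a model.
samples = {
    "Prompt A": ["12", "12", "12", "12"],
    "Prompt B": ["12", "15", "9", "12"],
}
best_prompt, score = select_prompt(samples)
print(best_prompt, score)
```

Passing a different `measure` function, such as a Gini impurity implementation, changes the scoring without touching the selection logic.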

Installation

pip install langdiversity

Usage

Detailed documentation is available here.

Bibtex

If you use this software in your work, please cite our paper:

@misc{ngu2023diversity,
      title={Diversity Measures: Domain-Independent Proxies for Failure in Language Model Queries},
      author={Noel Ngu and Nathaniel Lee and Paulo Shakarian},
      year={2023},
      eprint={2308.11189},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

License

This repository is licensed under the BSD-3-Clause license.

Contacts

For any inquiries or feedback, please contact: