This is the official repository for the EMNLP 2024 (Findings) paper "Improving Quotation Attribution with Fictional Character Embeddings". We train LUAR models on drama plays to distinguish the utterances of fictional characters, and use the resulting models to derive character representations. These representations are then injected into a quotation attribution model to improve accuracy on unseen literary works.
This repository contains two subfolders:
- UAR, which includes all the code and data used to train LUAR on drama plays.
- quotation_attribution, a clone of BookNLP+ in which we modify the original quotation attribution model. This folder contains all the code needed to train the models and reproduce our quotation attribution experiments.
Run the following commands to create an environment and install all the required packages:
python3 -m venv charemb
. ./charemb/bin/activate
pip3 install -U pip
pip3 install -r requirements.txt
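To confirm that the environment is active before running anything from the subfolders, a quick check along these lines may help. This is a minimal sketch and not part of the repository: the helper name `in_virtualenv` is hypothetical, and it relies only on the standard `sys.prefix` and `sys.base_prefix` attributes.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter prefix:", sys.prefix)
```

If the first line reports that no virtual environment is active, re-run the activation command above in the current shell before installing the requirements.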
Each folder has its own README file, with instructions for running the code.