Hierarchical Text Classification with BERT

An approach to hierarchical text classification using BERT-based models. It explicitly limits the classes that can be predicted for lower tiers by masking the logit outputs of the prediction layer with a binary vector designating the dependencies between different levels of the hierarchy.
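A minimal sketch of this masking step is shown below, assuming PyTorch tensors; the function and variable names (mask_child_logits, dependency_mask, parent_ids) are illustrative and not taken from the repository.

```python
import torch

def mask_child_logits(child_logits, parent_ids, dependency_mask):
    """Restrict lower-tier predictions to children of the given upper-tier class.

    child_logits:    (batch, n_child) raw logits from the lower-tier head
    parent_ids:      (batch,) upper-tier class indices (predicted or gold)
    dependency_mask: (n_parent, n_child) binary matrix; entry [p, c] is 1
                     if child class c belongs to parent class p
    """
    allowed = dependency_mask[parent_ids]  # (batch, n_child) binary dependency vectors
    # Disallowed classes are pushed to -inf so softmax/argmax can never select them.
    return child_logits.masked_fill(allowed == 0, float("-inf"))

# Example: choose the lower-tier class only among children of the predicted parent.
# child_pred = mask_child_logits(child_logits, parent_logits.argmax(-1), dependency_mask).argmax(-1)
```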

Based on a project for the course L665 - Applying Machine Learning Techniques in Computational Linguistics at Indiana University Bloomington. Because the original implementation used a dataset that is not yet publicly available, this model was trained on the Blurb Genre Collection dataset.

Architecture

Dataset

The model input consisted of the book blurbs from the Blurb Genre Collection; the target labels are the genre subcategories drawn from the first three levels of the hierarchy. The binary dependency mask used to constrain lower-tier predictions can be derived from this hierarchy, as sketched below.
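One way such a mask could be assembled is from the taxonomy's parent-to-child relationships; the mapping and label dictionaries below are hypothetical placeholders, not the repository's actual data structures.

```python
import torch

def build_dependency_mask(parent2children, parent_label2id, child_label2id):
    """Build the (n_parent, n_child) binary matrix used to mask lower-tier logits."""
    mask = torch.zeros(len(parent_label2id), len(child_label2id))
    for parent, children in parent2children.items():
        for child in children:
            # Mark child class as reachable from its parent class.
            mask[parent_label2id[parent], child_label2id[child]] = 1.0
    return mask

# Hypothetical excerpt of a genre hierarchy (names for illustration only):
# parent2children = {"Fiction": ["Mystery & Suspense", "Romance"],
#                    "Nonfiction": ["Biography & Memoir", "Cooking"]}
```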

Methodology

  • RoBERTa as the pre-trained base encoder

  • Fine-tune the tier-1 (t1) classifier on the top-level categories

  • Train the tier-2 (t2) classifier, masking its logits with the binary dependency vector of the corresponding tier-1 class

  • Training is run through the Hugging Face Trainer API (a sketch of this two-stage setup follows below)
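A rough sketch of how the first training stage might be wired through the Trainer API follows. The roberta-base checkpoint matches the bullet above, but NUM_T1_CLASSES and the tokenized dataset variable are placeholders; the tier-2 stage would swap in a model whose forward pass applies the dependency mask to its logits before the loss, as sketched earlier.

```python
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("roberta-base")

# Stage 1: fine-tune a tier-1 classifier on the top-level genres.
t1_model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=NUM_T1_CLASSES)  # NUM_T1_CLASSES: placeholder

training_args = TrainingArguments(
    output_dir="t1_checkpoints",
    num_train_epochs=3,
    per_device_train_batch_size=16,
)

trainer = Trainer(
    model=t1_model,
    args=training_args,
    train_dataset=t1_train_dataset,  # tokenized blurbs with tier-1 labels (placeholder)
)
trainer.train()

# Stage 2 would repeat this with a tier-2 model that masks its logits
# using the binary dependency vectors before computing the loss.
```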

Results
