chore: Typos in readme #1

Open · wants to merge 1 commit into `main`
15 changes: 10 additions & 5 deletions README.md
@@ -13,7 +13,7 @@ trainer = ClassifierTrainer()
 trained_model = trainer.train_classification_model(
 - train_df: pd.DataFrame: Training DataFrame with 2 columns: ["excerpt", "target_classification"],
 - val_df: pd.DataFrame: Validation DataFrame with 2 columns: ["excerpt", "target_classification"],
-- architecture_setup: str: one of ["base_architecture", "multiabel_architecture"], default='multiabel_architecture',
+- architecture_setup: str: one of ["base_architecture", "multilabel_architecture"], default='multilabel_architecture',
 - backbone_name: str: Backbone Name in HuggingFace, default='nlp-thedeep/humbert',
 - results_dir: str: Results directory, default='results',
 - enable_checkpointing: bool: Whether or not to save model checkpoints while training, default=True,
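The two-DataFrame contract in the hunk above can be sketched as follows. This is a minimal illustration: the example excerpts and label strings are invented, and only the column names (`"excerpt"`, `"target_classification"`) come from the README.

```python
import pandas as pd

# Hypothetical rows illustrating the two-column layout that
# train_classification_model expects; only the column names
# ("excerpt", "target_classification") come from the README.
train_df = pd.DataFrame(
    {
        "excerpt": [
            "Flooding has displaced thousands of households.",
            "Cholera cases are rising in the affected districts.",
        ],
        "target_classification": [
            ["sectors->health->nutrition"],
            ["sectors->health->epidemics"],
        ],
    }
)

print(list(train_df.columns))  # ['excerpt', 'target_classification']
```

A validation DataFrame with the same two columns would be passed as `val_df`.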
@@ -68,8 +68,8 @@ and treats all the categories with the same encoded embedding.
 - **Our proposed Architecture**: our approach shares N-1 Transformer layers of the LLM across all the categories,
 while the Nth layer is replicated K times, K being the number of classification tasks.
 A linear classification head is then defined on top of each Transformer
-sub-layer, predicting only the labels belonging to the corresponding task.
-The resulting architecture is a combination of the shared parameters and specific components and follows
+sub-layer, predicting only the labels belonging to the corresponding task.
+The resulting architecture is a combination of the shared parameters and specific components and follows
 the relations and hierarchy of the analytical framework's label space.
 <p float="center">
 <img src="img/architectures/multilabel-architecture.png" width="450" />
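The layer-sharing scheme described in this hunk can be sketched structurally. The sketch below is not the repository's implementation; `N`, `K`, and all layer names are placeholders chosen for illustration.

```python
# Structural sketch of the proposed architecture (not the repo's code):
# N-1 Transformer layers are shared across all categories, the Nth layer
# is replicated K times (one per classification task), and each replica
# gets its own linear classification head.
N = 12  # hypothetical number of Transformer layers
K = 3   # hypothetical number of classification tasks

shared_layers = [f"transformer_layer_{i}" for i in range(N - 1)]
task_specific = {
    f"task_{t}": {
        "final_layer": f"transformer_layer_{N - 1}_copy_{t}",
        "head": f"linear_head_{t}",
    }
    for t in range(K)
}

# One copy of the shared trunk, K copies of the final layer + head.
print(len(shared_layers), len(task_specific))  # 11 3
```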
Expand All @@ -85,7 +85,7 @@ import pandas as pd
trainer = ClassifierTrainer()

train_df = pd.read_csv(TRAINING DATAFRAME PATH)
val_df = pd.red_csv(VALIDATION DATAFRAME PATH)
val_df = pd.read_csv(VALIDATION DATAFRAME PATH)

trained_model = trainer.train_classification_model(train_df, val_df)
```
Expand All @@ -105,4 +105,9 @@ test_set_predictions = trainer.generate_test_predictions(test_df.excerpt.tolist(
**Generate test set results**
```
test_set_results = trainer.generate_test_results(test_df)
```
```

**multilabel_architecture**

When using the `multilabel_architecture` make sure you have a nested hierarchy
of exactly 3 levels separated with `->`
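The 3-level `->` requirement added in this hunk can be checked with a small validation snippet. The label strings below are hypothetical examples, not labels from the project's framework.

```python
# Hypothetical labels following the required 3-level hierarchy,
# with levels separated by "->", as the README note specifies.
labels = [
    "sectors->health->nutrition",
    "impact->population->displacement",
]

for label in labels:
    levels = label.split("->")
    # Each label must decompose into exactly 3 hierarchy levels.
    assert len(levels) == 3, f"expected 3 levels, got {len(levels)}: {label}"

print("all labels valid")
```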