Update EvaLatin.md
RacheleSprugnoli authored Dec 5, 2023
1 parent 2965735 commit 6d65aad
Showing 1 changed file with 5 additions and 12 deletions.
17 changes: 5 additions & 12 deletions 2024/EvaLatin.md
@@ -10,8 +10,6 @@
- [Introduction](#introduction)
- [Important Dates](#important-dates)
- [Data](#data)
* [Training Data](#training-data)
* [Test Data](#test-data)
- [How to participate](#how-to-participate)

___
@@ -22,7 +20,7 @@ The LT4HALA 2024 workshop will also be the venue of the third edition of *EvaLat
- How can we promote the development of resources and language technologies for the Latin language?
- How can we foster collaboration among scholars working on Latin and attract researchers from different disciplines?

EvaLatin 2024 edition will have 2 tasks (i.e. Syntactic Parsing and Sentiment Analysis) each with 3 sub-tasks (i.e. Classical, Cross-Genre, Cross-Time). These sub-tasks are designed to measure the impact of genre and diachrony on the NLP tools performances, a relevant aspect to keep in mind when dealing with the diachronic and diatopic diversity of Latin. Shared data and an evaluation script will be provided to the participants who will choose to participate in either one or all tasks and subtasks.
The EvaLatin 2024 edition will have 2 tasks, i.e. Dependency Parsing and Emotion Polarity Detection. Shared test data and an evaluation script will be provided to the participants, who may choose to participate in either one or both tasks.

EvaLatin 2024 is organized by Rachele Sprugnoli, Federica Iurescia and Marco Passarotti.

@@ -40,18 +38,13 @@ EvaLatin 2024 is organized by Rachele Sprugnoli, Federica Iurescia and Marco Pas


### DATA
TBA
Dependency parsing will be based on the Universal Dependencies framework. **No** specific training data will be released but participants will be allowed to use the Latin treebanks already available: the main challenge will be to understand which treebank (or combination of treebanks) is the most suitable to deal with new test data. Test data will be both prose and poetic texts from different time periods. Even for the emotion polarity detection task, **no** training data will be released but the organizers will provide an annotation sample and a manually created polarity lexicon. In this task participants will have the opportunity to test unsupervised or cross-language approaches. Test data will be poetic texts from different time periods.

#### Training Data
TBA
### HOW TO PARTICIPATE
Participants will be required to submit their runs and to provide a technical report that should include a brief description of their approach, focusing on the adopted algorithms, models and resources, a summary of their experiments, and an analysis of the obtained results. Technical reports will be included in the proceedings as short papers: the maximum length is 4 pages (excluding references) and they should follow the LREC-COLING 2024 official format. Reports will receive a light review (we will check for the correctness of the format, the exactness of results and ranking, and overall exposition).

#### Test Data
TBA
Participants are allowed to use any approach (e.g. from traditional machine learning algorithms to Large Language Models) and any resource (annotated and non-annotated data, embeddings): all approaches and resources are expected to be described in the systems' reports.

### HOW TO PARTICIPATE
Participants will be required to submit their runs and to provide a technical report that should include a brief description of their approach, focusing on the adopted algorithms, models and resources, a summary of their experiments, and an analysis of the obtained results.

The first run will be produced according to the *closed modality*: the only annotated data to be used for training and tuning the systems are those distributed by the organizers. Other non-annotated resources, e.g. word embeddings, are instead allowed. The second run will be produced according to the *open modality*: annotated external data, such as the Latin datasets of the Universal Dependencies initiative, can also be employed. All external resources are expected to be described in the systems' reports. The closed run is compulsory, while the open run is optional.

***
<p style="text-align: center;">Back to the <a href="https://circse.github.io/LT4HALA/"><b>Main Page</b></a></p>