Merge branch 'main' into add-text2sql
perlitz authored Jan 8, 2025
2 parents 52982a6 + 43f12bc commit cac3983
Showing 43 changed files with 202 additions and 157 deletions.
13 changes: 13 additions & 0 deletions docs/docs/examples.rst
@@ -47,10 +47,23 @@ Evaluate a custom dataset - with existing predictions
These examples demonstrate how to evaluate datasets of different tasks when predictions are already available and no inference is required.

`Example code for QA task <https://github.com/IBM/unitxt/blob/main/examples/evaluate_qa_dataset_with_given_predictions.py>`__

`Example code for classification task <https://github.com/IBM/unitxt/blob/main/examples/evaluate_classification_dataset_with_given_predictions.py>`__

Related documentation: :ref:`Evaluating datasets <evaluating_datasets>`

Evaluate a Named Entity Recognition (NER) dataset
===================================================

This example demonstrates how to evaluate a named entity recognition task.
The ground truth entities are provided as spans within the provided texts,
and the model is prompted to identify these entities.
Classical f1_micro, f1_macro, and per-entity-type f1 metrics are reported.

`Example code <https://github.com/IBM/unitxt/blob/main/examples/ner_evaluation.py>`__

Related documentation: :ref:`Add new dataset tutorial <adding_dataset>`, :ref:`Open NER task in catalog <catalog.tasks.ner.all_entity_types>`, :ref:`Inference Engines <inference>`.
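
As a rough illustration of the reported scores, the sketch below computes f1_micro, f1_macro, and per-entity-type f1 over lists of (entity, entity_type) pairs. It is not the ``metrics.ner`` implementation: the function names ``f1`` and ``ner_f1``, the ``f1_<type>`` key naming, and the exact-match counting are assumptions made for this example, and Unitxt's aggregation details may differ.

```python
from collections import Counter
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (entity text, entity type)


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def ner_f1(predictions: List[List[Pair]], references: List[List[Pair]]) -> Dict[str, float]:
    """Hypothetical helper: exact-match F1 per entity type, plus micro and macro."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for pred, ref in zip(predictions, references):
        pred_set, ref_set = set(pred), set(ref)
        for _, entity_type in pred_set & ref_set:
            tp[entity_type] += 1
        for _, entity_type in pred_set - ref_set:
            fp[entity_type] += 1
        for _, entity_type in ref_set - pred_set:
            fn[entity_type] += 1

    types = sorted(set(tp) | set(fp) | set(fn))
    scores = {f"f1_{t}": f1(tp[t], fp[t], fn[t]) for t in types}
    # Micro: pool the counts over all types, then compute one F1.
    scores["f1_micro"] = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: unweighted mean of the per-type F1 values.
    scores["f1_macro"] = sum(scores[f"f1_{t}"] for t in types) / len(types) if types else 0.0
    return scores


references = [
    [("John", "Person"), ("Texas", "Location")],
    [("Phil", "Person"), ("Apple", "Organization")],
]
predictions = [
    [("John", "Person"), ("Texas", "Location")],
    [("Phil", "Person"), ("apple", "Organization")],  # wrong span: the fruit
]
scores = ner_f1(predictions, references)
print(scores)
```

In this toy run the single Organization mistake counts as one false positive and one false negative, so f1_micro is 0.75 while f1_macro drops to 2/3, because the macro average weights each entity type equally.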

Evaluation usecases
-----------------------

30 changes: 15 additions & 15 deletions docs/docs/saving_and_loading_from_catalog.rst
@@ -42,7 +42,7 @@ It's also possible to add artifacts to the library's default catalog:
Using Catalog Assets
--------------------

To use catalog objects, simply specify their name in the Unitxt object that will use them.
To use catalog objects, simply specify their name in the Unitxt object that will use them.

.. code-block:: python
@@ -56,8 +56,8 @@ To use catalog objects, simply specify their name in the Unitxt object that will
Modifying Catalog Assets on the Fly
-----------------------------------

To modify a catalog asset's fields dynamically, upon fetching the asset from the catalog, use the syntax: ``artifact_name[key_to_modify=new_value]``.
To assign lists, use: ``artifact_name[key_to_modify=[new_value_0, new_value_1]]``.
To modify a catalog asset's fields dynamically, upon fetching the asset from the catalog, use the syntax: ``artifact_name[key_to_modify=new_value]``.
To assign lists, use: ``artifact_name[key_to_modify=[new_value_0, new_value_1]]``.
To assign dictionaries, use: ``artifact_name[key_to_modify={new_key_0=new_value_0,new_key_1=new_value_1}]``.
Note that the whole new value of the field has to be specified; not just one item of a list, or one key of the dictionary.
For instance, to change the metric specification of a task:
@@ -85,20 +85,20 @@ Use ``get_from_catalog`` to directly access catalog assets, and obtain an asset
A Catalog Asset Linking to Another Catalog Asset
------------------------------------------------

A catalog asset can be just a link to another asset.
This feature comes in handy when, for some reason, we want to change the catalog name
of an existing asset (e.g. ``asset1`` to ``asset2``), while there is already code
A catalog asset can be just a link to another asset.
This feature comes in handy when, for some reason, we want to change the catalog name
of an existing asset (e.g. ``asset1`` to ``asset2``), while there is already code
that uses the old name of the asset and we want to avoid non-backward compatible changes.

In such a case, we can save the asset as ``asset2``, create an asset of type
In such a case, we can save the asset as ``asset2``, create an asset of type
:class:`ArtifactLink <unitxt.artifact.ArtifactLink>` that links to ``asset2``, and save
that one as ``asset1``.
When ``asset1`` is accessed from existing code, the Unitxt catalog detects that the asset fetched from position ``asset1``
is an ``ArtifactLink``, so it continues and fetches ``asset2`` -- the Artifact linked to by ``asset1``.
When ``asset1`` is accessed from existing code, the Unitxt catalog detects that the asset fetched from position ``asset1``
is an ``ArtifactLink``, so it continues and fetches ``asset2`` -- the Artifact linked to by ``asset1``.
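
The lookup described above can be pictured with a toy, dictionary-backed catalog. This is only a sketch of the idea, not Unitxt's code: ``ToyAsset``, ``ToyLink``, and ``fetch`` are hypothetical names, and the cycle check is an extra safeguard added for the example rather than something the text above describes.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class ToyAsset:
    payload: str


@dataclass
class ToyLink:
    to: str  # catalog name of the asset this link points at


def fetch(catalog: Dict[str, Union[ToyAsset, ToyLink]], name: str) -> ToyAsset:
    """Return the asset stored under name, following links until a real asset is found."""
    visited = set()
    while True:
        if name in visited:
            raise ValueError(f"link cycle detected at {name!r}")
        visited.add(name)
        item = catalog[name]
        if not isinstance(item, ToyLink):
            return item
        name = item.to  # keep going: fetch whatever the link points at


toy_catalog = {
    "asset2": ToyAsset("the real asset"),
    "asset1": ToyLink(to="asset2"),  # old name kept alive as a link
}
resolved = fetch(toy_catalog, "asset1")
print(resolved.payload)
```

Code that still asks for ``asset1`` receives the same object as code that asks for ``asset2``, which is what keeps the rename backward compatible.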

.. code-block:: python
link_to_asset2 = ArtifactLink(artifact_linked_to="asset2")
link_to_asset2 = ArtifactLink(to="asset2")
add_to_catalog(
link_to_asset2,
"asset1",
@@ -109,8 +109,8 @@ Deprecated Asset
----------------

Every asset has a special field named ``__deprecated_msg__`` of type ``str``, whose default value is None.
When None, the asset is considered non-deprecated. When not None, the asset is considered deprecated, and
its ``__deprecated_msg__`` is logged at level WARN upon its instantiation. (Other than this logging,
When None, the asset is considered non-deprecated. When not None, the asset is considered deprecated, and
its ``__deprecated_msg__`` is logged at level WARN upon its instantiation. (Other than this logging,
the artifact is instantiated normally.)
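
The behaviour can be mimicked with the standard ``logging`` module. The following is a minimal sketch under stated assumptions, not the Unitxt implementation: ``ToyArtifact`` is a hypothetical class, its ``deprecated_msg`` field stands in for ``__deprecated_msg__``, and ``ListHandler`` exists only so the example can show what was logged.

```python
import logging
from dataclasses import dataclass
from typing import Optional

logger = logging.getLogger("toy_catalog")


class ListHandler(logging.Handler):
    """Collects (level name, message) tuples so the example can inspect them."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append((record.levelname, record.getMessage()))


handler = ListHandler()
logger.addHandler(handler)


@dataclass
class ToyArtifact:
    name: str
    deprecated_msg: Optional[str] = None  # stands in for __deprecated_msg__

    def __post_init__(self):
        # Deprecated artifacts only warn; construction proceeds as usual.
        if self.deprecated_msg is not None:
            logger.warning(self.deprecated_msg)


old = ToyArtifact("asset1", deprecated_msg="'asset1' is deprecated; use 'asset2'.")
new = ToyArtifact("asset2")
print(handler.messages)
```

Only the deprecated artifact produces a log record, and both objects are created normally.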

Example of a deprecated catalog asset:
@@ -123,12 +123,12 @@ Example of a deprecated catalog asset:
"text": "You are an agent in charge of answering a boolean (yes/no) question. The system presents you with a passage and a question. Read the passage carefully, and then answer yes or no. Think about your answer, and make sure it makes sense. Do not explain the answer. Only say yes or no."
}
Combining this feature with ``ArtifactLink`` in the above example, we can also log a warning to the accessing code that
the name ``asset1`` is to be replaced by ``asset2``.
Combining this feature with ``ArtifactLink`` in the above example, we can also log a warning to the accessing code that
the name ``asset1`` is to be replaced by ``asset2``.

.. code-block:: python
link_to_asset2 = ArtifactLink(artifact_linked_to="asset2",
link_to_asset2 = ArtifactLink(to="asset2",
__deprecated_msg__="'asset1' is going to be deprecated. In future uses, please access 'asset2' instead.")
add_to_catalog(
link_to_asset2,
70 changes: 70 additions & 0 deletions examples/ner_evaluation.py
@@ -0,0 +1,70 @@
import json

from unitxt import get_logger
from unitxt.api import create_dataset, evaluate
from unitxt.inference import (
    CrossProviderInferenceEngine,
)

logger = get_logger()
entity_types = ["Person", "Location", "Organization"]


# Each record holds the text, the candidate entity types, and the
# ground-truth spans (start and end offsets) with their labels.
test_set = [
    {
        "text": "John lives in Texas.",
        "entity_types": entity_types,
        "spans_starts": [0, 14],
        "spans_ends": [5, 19],
        "labels": ["Person", "Location"],
    },
    {
        "text": "Phil works at Apple and eats an apple.",
        "entity_types": entity_types,
        "spans_starts": [0, 14],
        "spans_ends": [5, 19],
        "labels": ["Person", "Organization"],
    },
]


dataset = create_dataset(
    task="tasks.ner.all_entity_types",
    test_set=test_set,
    split="test",
    format="formats.chat_api",
)

# To infer locally through the Hugging Face pipeline API instead, import
# HFPipelineBasedInferenceEngine from unitxt.inference and use:
# model = HFPipelineBasedInferenceEngine(
#     model_name="Qwen/Qwen1.5-0.5B-Chat", max_new_tokens=32
# )

# Infer through an external API provider:
model = CrossProviderInferenceEngine(model="llama-3-8b-instruct", provider="watsonx")
# The provider can be one of: ["watsonx", "together-ai", "open-ai", "aws", "ollama", "bam"]


predictions = model(dataset)
results = evaluate(predictions=predictions, data=dataset)

print("Global Results:")
print(results.global_scores.summary)

print("Example prompt:")

print(json.dumps(results.instance_scores[0]["source"], indent=4))

print("Instance Results:")
print(
    results.instance_scores.to_df(
        columns=[
            "text",
            "prediction",
            "processed_prediction",
            "processed_references",
            "score",
            "score_name",
        ]
    ).to_markdown()
)
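
To see what the span offsets in ``test_set`` select, the sketch below slices them out of the two example texts. It assumes the offsets are character indices used as a half-open Python slice ``text[start:end]``; that reading is an assumption, and ``spans_to_pairs`` is a hypothetical helper, not a Unitxt function.

```python
from typing import List, Tuple


def spans_to_pairs(
    text: str, spans_starts: List[int], spans_ends: List[int], labels: List[str]
) -> List[Tuple[str, str]]:
    """Slice each span out of text and pair it with its label.

    Assumption: offsets form a half-open slice text[start:end].
    """
    if not (len(spans_starts) == len(spans_ends) == len(labels)):
        raise ValueError("spans_starts, spans_ends and labels must have equal length")
    return [
        (text[start:end], label)
        for start, end, label in zip(spans_starts, spans_ends, labels)
    ]


records = [
    {
        "text": "John lives in Texas.",
        "spans_starts": [0, 14],
        "spans_ends": [5, 19],
        "labels": ["Person", "Location"],
    },
    {
        "text": "Phil works at Apple and eats an apple.",
        "spans_starts": [0, 14],
        "spans_ends": [5, 19],
        "labels": ["Person", "Organization"],
    },
]
all_pairs = [spans_to_pairs(**record) for record in records]
print(all_pairs)
```

Under this reading, the second span in each record selects exactly ``Texas`` and ``Apple``, while the first span (0 to 5) also covers the space after the four-letter name. Whether Unitxt trims such whitespace before scoring is not stated here.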
10 changes: 5 additions & 5 deletions prepare/cards/atis.py
@@ -8,7 +8,7 @@
from unitxt.span_lableing_operators import IobExtractor
from unitxt.test_utils.card import test_card

classes = [
entity_types = [
"aircraft_code",
"airline_code",
"airline_name",
@@ -103,9 +103,9 @@
},
),
IobExtractor(
labels=classes,
begin_labels=["B-" + c for c in classes],
inside_labels=["I-" + c for c in classes],
labels=entity_types,
begin_labels=["B-" + c for c in entity_types],
inside_labels=["I-" + c for c in entity_types],
outside_label="O",
),
Copy(
@@ -117,7 +117,7 @@
get_default=[],
not_exist_ok=True,
),
Set(fields={"classes": classes}),
Set(fields={"entity_types": entity_types}),
],
task="tasks.span_labeling.extraction",
templates="templates.span_labeling.extraction.all",
2 changes: 1 addition & 1 deletion prepare/cards/universal_ner.py
@@ -76,7 +76,7 @@
),
Set(
fields={
"classes": ["Person", "Organization", "Location"],
"entity_types": ["Person", "Organization", "Location"],
}
),
],
20 changes: 5 additions & 15 deletions prepare/tasks/ner.py
@@ -1,7 +1,7 @@
from typing import List, Tuple

from unitxt.blocks import Task
from unitxt.catalog import add_to_catalog
from unitxt.catalog import add_link_to_catalog, add_to_catalog

add_to_catalog(
Task(
@@ -20,19 +20,9 @@
overwrite=True,
)

add_to_catalog(
Task(
input_fields={"text": str, "entity_types": List[str]},
reference_fields={
"spans_starts": List[int],
"spans_ends": List[int],
"text": str,
"labels": List[str],
},
prediction_type=List[Tuple[str, str]],
metrics=["metrics.ner"],
augmentable_inputs=["text"],
),
"tasks.ner.all_entity_types",
add_link_to_catalog(
artifact_linked_to="tasks.span_labeling.extraction",
name="tasks.ner.all_entity_types",
deprecate=False,
overwrite=True,
)
13 changes: 10 additions & 3 deletions prepare/tasks/span_labeling.py
@@ -5,11 +5,17 @@

add_to_catalog(
Task(
__description__="""This is an entity extraction task in which multiple entity types are to be extracted.
The input is the 'text' and the 'entity_types' to extract (e.g. ["Organization", "Location", "Person"]).
By default, the classical f1 metric is used, which expects a list of <entity, entity_type> pairs.
Multiple f1 scores are reported, including f1_micro, f1_macro, and f1 per entity_type.
The template's post processors must convert the model's textual predictions into the expected list format.
""",
input_fields={
"text": str,
"text_type": str,
"class_type": str,
"classes": List[str],
"entity_types": List[str],
},
reference_fields={
"text": str,
@@ -22,7 +28,8 @@
"metrics.ner",
],
augmentable_inputs=["text"],
defaults={"text_type": "text", "class_type": "entity type"},
defaults={"text_type": "text"},
default_template="templates.span_labeling.extraction.detailed",
),
"tasks.span_labeling.extraction",
overwrite=True,
29 changes: 22 additions & 7 deletions prepare/templates/span_labeling/templates.py
@@ -7,7 +7,7 @@
add_to_catalog(
SpanLabelingTemplate(
input_format="{text_type}: {text}",
instruction="From the following {text_type}, extract the objects for which the {class_type} expressed is one of {classes}.",
instruction="From the following {text_type}, extract the objects for which the entity type expressed is one of {entity_types}.",
postprocessors=["processors.to_span_label_pairs"],
),
"templates.span_labeling.extraction.extract",
@@ -17,7 +17,7 @@
add_to_catalog(
SpanLabelingTemplate(
input_format="{text_type}: {text}",
instruction="From the following {text_type}, extract spans having a {class_type}: {classes}.",
instruction="From the following {text_type}, extract spans having an entity type: {entity_types}.",
postprocessors=["processors.to_span_label_pairs"],
),
"templates.span_labeling.extraction.having",
@@ -26,7 +26,7 @@

add_to_catalog(
SpanLabelingTemplate(
input_format="{text_type}: {text}\nFrom this {text_type}, extract entities that carry one of the following types: {classes}.",
input_format="{text_type}: {text}\nFrom this {text_type}, extract entities that carry one of the following types: {entity_types}.",
postprocessors=["processors.to_span_label_pairs"],
),
"templates.span_labeling.extraction.carry",
@@ -36,7 +36,7 @@
add_to_catalog(
SpanLabelingTemplate(
input_format="{text_type}: {text}",
instruction="From the following {text_type}, identify spans with {class_type}:{classes}.",
instruction="From the following {text_type}, identify spans with entity type:{entity_types}.",
postprocessors=["processors.to_span_label_pairs"],
),
"templates.span_labeling.extraction.identify",
@@ -55,19 +55,34 @@
add_to_catalog(
SpanLabelingTemplate(
input_format="{text_type}:\n{text}",
instruction="From the following {text_type}, extract the objects for which the {class_type} expressed is one of {classes}.",
target_prefix="{class_type}:\n",
instruction="From the following {text_type}, extract the objects for which the entity type expressed is one of {entity_types}.",
target_prefix="entity type:\n",
postprocessors=["processors.to_span_label_pairs"],
title_fields=["text_type", "class_type"],
title_fields=["text_type"],
),
"templates.span_labeling.extraction.title",
overwrite=True,
)


add_to_catalog(
SpanLabelingTemplate(
instruction="""From the given {text_type}, extract all the entities of the following entity types: {entity_types}.
Return the output in this exact format:
The output should be a comma separated list of pairs of entity and corresponding entity_type.
Use a colon to separate between the entity and entity_type. """,
input_format="{text_type}:\n{text}",
postprocessors=["processors.to_span_label_pairs"],
),
"templates.span_labeling.extraction.detailed",
overwrite=True,
)
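
The output format requested by the ``detailed`` template (comma-separated ``entity: entity_type`` pairs) can be turned back into the list of pairs the metric expects. The parser below is a simplified sketch of that idea; it is not the code behind ``processors.to_span_label_pairs``, ``parse_pairs`` is a hypothetical name, and it would mis-split entities that themselves contain commas.

```python
from typing import List, Tuple


def parse_pairs(output: str) -> List[Tuple[str, str]]:
    """Parse 'entity: type, entity: type' text into (entity, type) tuples."""
    pairs = []
    for chunk in output.split(","):
        if ":" not in chunk:
            continue  # skip pieces that do not follow the requested format
        # Split on the last colon so a colon inside the entity text is kept.
        entity, _, entity_type = chunk.rpartition(":")
        entity, entity_type = entity.strip(), entity_type.strip()
        if entity and entity_type:
            pairs.append((entity, entity_type))
    return pairs


parsed = parse_pairs("John: Person, Texas: Location")
print(parsed)
```

Malformed output, such as free text without any colon, yields an empty list, which a metric would then score as having no predicted entities.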


add_to_catalog(
TemplatesList(
items=[
"templates.span_labeling.extraction.detailed",
"templates.span_labeling.extraction.extract",
"templates.span_labeling.extraction.having",
"templates.span_labeling.extraction.carry",