Merge branch 'main' into add-text2sql

IBM · Jan 8, 2025 · cac3983
2 parents 52982a6 + 43f12bc
Showing 43 changed files with 202 additions and 157 deletions.
diff --git a/docs/docs/examples.rst b/docs/docs/examples.rst
@@ -47,10 +47,23 @@ Evaluate a custom dataset - with existing predictions
 These examples demonstrate how to evaluate datasets for different tasks when predictions are already available and no inference is required.
 
 `Example code for QA task  <https://github.com/IBM/unitxt/blob/main/examples/evaluate_qa_dataset_with_given_predictions.py>`__
+
 `Example code for classification task  <https://github.com/IBM/unitxt/blob/main/examples/evaluate_classification_dataset_with_given_predictions.py>`__  
 
 Related documentation: :ref:`Evaluating datasets <evaluating_datasets>`
 
+Evaluate a Named Entity Recognition (NER) dataset
+===================================================
+
+This example demonstrates how to evaluate a named entity recognition task.
+The ground truth entities are provided as spans within the provided texts, 
+and the model is prompted to identify these entities.
+Classical f1_micro, f1_macro, and per-entity-type f1 metrics are reported.
+
+`Example code <https://github.com/IBM/unitxt/blob/main/examples/ner_evaluation.py>`__
+
+Related documentation: :ref:`Add new dataset tutorial <adding_dataset>`, :ref:`Open NER task in catalog <catalog.tasks.ner.all_entity_types>`, :ref:`Inference Engines <inference>`.
+
 Evaluation usecases
 -----------------------
 

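The NER section added to `docs/docs/examples.rst` says that f1_micro, f1_macro, and per-entity-type f1 are reported. As a rough illustration of how those three numbers differ, here is a minimal sketch over lists of (entity, entity_type) pairs. It is not the implementation behind `metrics.ner`; the function names `f1` and `ner_f1` and the exact-match rule are assumptions made for this example.

```python
from collections import Counter


def f1(tp, fp, fn):
    """F1 from raw counts; returns 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def ner_f1(predictions, references):
    """Score one list of (entity, entity_type) pairs per text (hypothetical helper)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for pred, ref in zip(predictions, references):
        pred_set, ref_set = set(pred), set(ref)
        for _, etype in pred_set & ref_set:   # exact matches
            tp[etype] += 1
        for _, etype in pred_set - ref_set:   # predicted but not in the reference
            fp[etype] += 1
        for _, etype in ref_set - pred_set:   # in the reference but missed
            fn[etype] += 1
    types = sorted(set(tp) | set(fp) | set(fn))
    per_type = {t: f1(tp[t], fp[t], fn[t]) for t in types}
    # Micro pools the counts over all types; macro averages the per-type scores.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    macro = sum(per_type.values()) / len(per_type) if per_type else 0.0
    return {"f1_micro": micro, "f1_macro": macro, "per_type": per_type}


references = [
    [("John", "Person"), ("Texas", "Location")],
    [("Phil", "Person"), ("Apple", "Organization")],
]
predictions = [
    [("John", "Person"), ("Texas", "Location")],
    [("Phil", "Person"), ("apple", "Organization")],  # wrong span: the fruit
]
scores = ner_f1(predictions, references)
print(scores)
```

In this toy run the single Organization error lowers the macro average more than the micro average, because macro weights every entity type equally regardless of how many entities it has.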
diff --git a/docs/docs/saving_and_loading_from_catalog.rst b/docs/docs/saving_and_loading_from_catalog.rst
@@ -42,7 +42,7 @@ It's also possible to add artifacts to the library's default catalog:
 Using Catalog Assets
 --------------------
 
-To use catalog objects, simply specify their name in the Unitxt object that will use them. 
+To use catalog objects, simply specify their name in the Unitxt object that will use them.
 
 .. code-block:: python
 
@@ -56,8 +56,8 @@ To use catalog objects, simply specify their name in the Unitxt object that will
 Modifying Catalog Assets on the Fly
 -----------------------------------
 
-To modify a catalog asset's fields dynamically, upon fetching the asset from the catalog, use the syntax: ``artifact_name[key_to_modify=new_value]``. 
-To assign lists, use: ``artifact_name[key_to_modify=[new_value_0, new_value_1]]``. 
+To modify a catalog asset's fields dynamically, upon fetching the asset from the catalog, use the syntax: ``artifact_name[key_to_modify=new_value]``.
+To assign lists, use: ``artifact_name[key_to_modify=[new_value_0, new_value_1]]``.
 To assign dictionaries, use: ``artifact_name[key_to_modify={new_key_0=new_value_0,new_key_1=new_value_1}]``.
 Note that the whole new value of the field has to be specified; not just one item of a list, or one key of the dictionary.
 For instance, to change the metric specification of a task:
@@ -85,20 +85,20 @@ Use ``get_from_catalog`` to directly access catalog assets, and obtain an asset
 A Catalog Asset Linking to Another Catalog Asset
 ------------------------------------------------
 
-A catalog asset can be just a link to another asset. 
-This feature comes handy when for some reason, we want to change the catalog name 
-of an existing asset (e.g. ``asset1`` to ``asset2``), while there is already code 
+A catalog asset can be just a link to another asset.
+This feature comes in handy when, for some reason, we want to change the catalog name
+of an existing asset (e.g. ``asset1`` to ``asset2``), while there is already code
 that uses the old name of the asset and we want to avoid non-backward compatible changes.
 
-In such a case, we can save the asset as ``asset2``, create an asset of type 
+In such a case, we can save the asset as ``asset2``, create an asset of type
 :class:`ArtifactLink <unitxt.artifact.ArtifactLink>` that links to ``asset2``, and save
 that one as ``asset1``.
-When ``asset1`` is accessed from an existing code, Unixt Catalog realizes that the asset fetched from position ``asset1`` 
-is an ``ArtifactLink``, so it continues and fetches ``asset2`` -- the Artifact linked to by ``asset1``. 
+When ``asset1`` is accessed from existing code, the Unitxt Catalog recognizes that the asset fetched from position ``asset1``
+is an ``ArtifactLink``, so it continues and fetches ``asset2`` -- the Artifact linked to by ``asset1``.
 
 .. code-block:: python
 
-    link_to_asset2 = ArtifactLink(artifact_linked_to="asset2")
+    link_to_asset2 = ArtifactLink(to="asset2")
     add_to_catalog(
         link_to_asset2,
         "asset1",
@@ -109,8 +109,8 @@ Deprecated Asset
 ----------------
 
 Every asset has a special field named ``__deprecated_msg__`` of type ``str``, whose default value is None.
-When None, the asset is cocnsidered non-deprecated. When not None, the asset is considered deprecated, and 
-its ``__deprecated_msg__`` is logged at level WARN upon its instantiation. (Other than this logging, 
+When None, the asset is considered non-deprecated. When not None, the asset is considered deprecated, and
+its ``__deprecated_msg__`` is logged at level WARN upon its instantiation. (Other than this logging,
 the artifact is instantiated normally.)
 
 Example of a deprecated catalog asset:
@@ -123,12 +123,12 @@ Example of a deprecated catalog asset:
         "text": "You are an agent in charge of answering a boolean (yes/no) question. The system presents you with a passage and a question. Read the passage carefully, and then answer yes or no. Think about your answer, and make sure it makes sense. Do not explain the answer. Only say yes or no."
     }
 
-Combining this feature with ``ArtifactLink`` in the above example, we can also log a warning to the accessing code that 
-the name ``asset1`` is to be replaced by ``asset2``. 
+Combining this feature with ``ArtifactLink`` in the above example, we can also log a warning to the accessing code that
+the name ``asset1`` is to be replaced by ``asset2``.
 
 .. code-block:: python
 
-    link_to_asset2 = ArtifactLink(artifact_linked_to="asset2",
+    link_to_asset2 = ArtifactLink(to="asset2",
            __deprecated_msg__="'asset1' is going to be deprecated. In future uses, please access 'asset2' instead.")
     add_to_catalog(
         link_to_asset2,

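The link-following and deprecation behaviour described in `saving_and_loading_from_catalog.rst` can be pictured with a toy in-memory catalog. This is only a sketch under stated assumptions: `Asset`, `Link`, `fetch`, and the `deprecated_msg` attribute are hypothetical stand-ins for illustration, not the Unitxt classes or their actual resolution code, and the cycle check is an addition of this example.

```python
import logging
from dataclasses import dataclass
from typing import Dict, Optional, Union

logger = logging.getLogger("toy_catalog")


@dataclass
class Asset:
    payload: str
    deprecated_msg: Optional[str] = None  # stand-in for __deprecated_msg__


@dataclass
class Link:
    to: str  # catalog name of the asset this link points at
    deprecated_msg: Optional[str] = None


def fetch(catalog: Dict[str, Union[Asset, Link]], name: str) -> Asset:
    """Follow links until a real asset is reached, warning on deprecated entries."""
    seen = set()
    while True:
        if name in seen:
            raise ValueError(f"circular link at {name!r}")
        seen.add(name)
        item = catalog[name]
        if item.deprecated_msg is not None:
            logger.warning(item.deprecated_msg)
        if isinstance(item, Asset):
            return item
        name = item.to  # keep going: the entry was only a link


catalog = {
    "asset2": Asset(payload="the real asset"),
    "asset1": Link(
        to="asset2",
        deprecated_msg="'asset1' is going to be deprecated. "
        "In future uses, please access 'asset2' instead.",
    ),
}
resolved = fetch(catalog, "asset1")
print(resolved.payload)
```

Old code that asks for `asset1` still receives the asset stored under `asset2`, and the only visible difference is the logged warning.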
diff --git a/examples/ner_evaluation.py b/examples/ner_evaluation.py
@@ -0,0 +1,70 @@
+import json
+
+from unitxt import get_logger
+from unitxt.api import create_dataset, evaluate
+from unitxt.inference import (
+    CrossProviderInferenceEngine,
+)
+
+logger = get_logger()
+entity_types = ["Person", "Location", "Organization"]
+
+
+test_set = [
+    {
+        "text": "John lives in Texas.",
+        "entity_types": entity_types,
+        "spans_starts": [0, 14],
+        "spans_ends": [4, 19],
+        "labels": ["Person", "Location"],
+    },
+    {
+        "text": "Phil works at Apple and eats an apple.",
+        "entity_types": entity_types,
+        "spans_starts": [0, 14],
+        "spans_ends": [4, 19],
+        "labels": ["Person", "Organization"],
+    },
+]
+
+
+dataset = create_dataset(
+    task="tasks.ner.all_entity_types",
+    test_set=test_set,
+    split="test",
+    format="formats.chat_api",
+)
+
+# To infer locally with HF, import HFPipelineBasedInferenceEngine from unitxt.inference and use:
+# model = HFPipelineBasedInferenceEngine(
+#   model_name="Qwen/Qwen1.5-0.5B-Chat", max_new_tokens=32
+# )
+# By default, infer with an external API provider:
+
+model = CrossProviderInferenceEngine(model="llama-3-8b-instruct", provider="watsonx")
+# The provider can be one of: ["watsonx", "together-ai", "open-ai", "aws", "ollama", "bam"]
+
+
+predictions = model(dataset)
+results = evaluate(predictions=predictions, data=dataset)
+
+print("Global Results:")
+print(results.global_scores.summary)
+
+print("Example prompt:")
+
+print(json.dumps(results.instance_scores[0]["source"], indent=4))
+
+print("Instance Results:")
+print(
+    results.instance_scores.to_df(
+        columns=[
+            "text",
+            "prediction",
+            "processed_prediction",
+            "processed_references",
+            "score",
+            "score_name",
+        ]
+    ).to_markdown()
+)
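When writing a test set like the one in `examples/ner_evaluation.py`, it can help to check that each span offset really selects the intended entity text. The sketch below assumes half-open `[start, end)` character offsets, as in Python slicing; that convention is an assumption of this example, and `spans_to_pairs` is a hypothetical helper, not part of Unitxt.

```python
def spans_to_pairs(instance):
    """Turn parallel span offsets and labels into (entity_text, label) pairs.

    Assumes half-open [start, end) character offsets, as in Python slicing.
    """
    text = instance["text"]
    starts = instance["spans_starts"]
    ends = instance["spans_ends"]
    labels = instance["labels"]
    if not (len(starts) == len(ends) == len(labels)):
        raise ValueError("spans_starts, spans_ends and labels must have equal length")
    pairs = []
    for start, end, label in zip(starts, ends, labels):
        if not 0 <= start < end <= len(text):
            raise ValueError(
                f"span ({start}, {end}) is out of range for text of length {len(text)}"
            )
        pairs.append((text[start:end], label))
    return pairs


instance = {
    "text": "John lives in Texas.",
    "spans_starts": [0, 14],
    "spans_ends": [4, 19],
    "labels": ["Person", "Location"],
}
pairs = spans_to_pairs(instance)
for entity, label in pairs:
    # Leading or trailing whitespace usually means an offset is off by one.
    if entity != entity.strip():
        print(f"warning: span {entity!r} ({label}) has surrounding whitespace")
print(pairs)
```

Under the half-open assumption, an end offset of 5 for the first span would select `"John "` with a trailing space, which the whitespace check reports.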
diff --git a/prepare/cards/atis.py b/prepare/cards/atis.py
@@ -8,7 +8,7 @@
 from unitxt.span_lableing_operators import IobExtractor
 from unitxt.test_utils.card import test_card
 
-classes = [
+entity_types = [
     "aircraft_code",
     "airline_code",
     "airline_name",
@@ -103,9 +103,9 @@
             },
         ),
         IobExtractor(
-            labels=classes,
-            begin_labels=["B-" + c for c in classes],
-            inside_labels=["I-" + c for c in classes],
+            labels=entity_types,
+            begin_labels=["B-" + c for c in entity_types],
+            inside_labels=["I-" + c for c in entity_types],
             outside_label="O",
         ),
         Copy(
@@ -117,7 +117,7 @@
             get_default=[],
             not_exist_ok=True,
         ),
-        Set(fields={"classes": classes}),
+        Set(fields={"entity_types": entity_types}),
     ],
     task="tasks.span_labeling.extraction",
     templates="templates.span_labeling.extraction.all",

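The `IobExtractor` step in `prepare/cards/atis.py` is configured with `B-`/`I-` label prefixes and an `"O"` outside label. For readers unfamiliar with IOB tagging, the following is a minimal sketch of how such tags can be grouped into entity spans. It is not the `IobExtractor` implementation: `iob_to_spans` is a hypothetical function, and joining tokens with a single space and dropping stray `I-` tags are choices made for this example.

```python
def iob_to_spans(tokens, tags, outside_label="O"):
    """Group IOB-tagged tokens into (entity_text, entity_type) pairs."""
    spans = []
    current_tokens, current_type = [], None

    def flush():
        # Close the span that is currently open, if any.
        nonlocal current_tokens, current_type
        if current_tokens:
            spans.append((" ".join(current_tokens), current_type))
        current_tokens, current_type = [], None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
        else:
            # The outside label, or an "I-" tag that does not continue the
            # open span; this sketch simply drops such stray tokens.
            flush()
    flush()
    return spans


tokens = ["show", "american", "airlines", "flights", "on", "delta"]
tags = ["O", "B-airline_name", "I-airline_name", "O", "O", "B-airline_name"]
spans = iob_to_spans(tokens, tags)
print(spans)
```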
diff --git a/prepare/cards/universal_ner.py b/prepare/cards/universal_ner.py
@@ -76,7 +76,7 @@
             ),
             Set(
                 fields={
-                    "classes": ["Person", "Organization", "Location"],
+                    "entity_types": ["Person", "Organization", "Location"],
                 }
             ),
         ],

diff --git a/prepare/tasks/ner.py b/prepare/tasks/ner.py
@@ -1,7 +1,7 @@
 from typing import List, Tuple
 
 from unitxt.blocks import Task
-from unitxt.catalog import add_to_catalog
+from unitxt.catalog import add_link_to_catalog, add_to_catalog
 
 add_to_catalog(
     Task(
@@ -20,19 +20,9 @@
     overwrite=True,
 )
 
-add_to_catalog(
-    Task(
-        input_fields={"text": str, "entity_types": List[str]},
-        reference_fields={
-            "spans_starts": List[int],
-            "spans_ends": List[int],
-            "text": str,
-            "labels": List[str],
-        },
-        prediction_type=List[Tuple[str, str]],
-        metrics=["metrics.ner"],
-        augmentable_inputs=["text"],
-    ),
-    "tasks.ner.all_entity_types",
+add_link_to_catalog(
+    artifact_linked_to="tasks.span_labeling.extraction",
+    name="tasks.ner.all_entity_types",
+    deprecate=False,
     overwrite=True,
 )
diff --git a/prepare/tasks/span_labeling.py b/prepare/tasks/span_labeling.py
@@ -5,11 +5,17 @@
 
 add_to_catalog(
     Task(
+        __description__="""This is an entity extraction task in which multiple entity types are to be extracted.
+The input is the 'text' and the 'entity_types' to extract (e.g. ["Organization", "Location", "Person"]).
+
+By default, the classical f1 metric is used, which expects a list of <entity,entity_type> pairs.
+Multiple f1 scores are reported, including f1_micro, f1_macro, and f1 per entity_type.
+The template's post processors must convert the model's textual predictions into the expected list format.
+""",
         input_fields={
             "text": str,
             "text_type": str,
-            "class_type": str,
-            "classes": List[str],
+            "entity_types": List[str],
         },
         reference_fields={
             "text": str,
@@ -22,7 +28,8 @@
             "metrics.ner",
         ],
         augmentable_inputs=["text"],
-        defaults={"text_type": "text", "class_type": "entity type"},
+        defaults={"text_type": "text"},
+        default_template="templates.span_labeling.extraction.detailed",
     ),
     "tasks.span_labeling.extraction",
     overwrite=True,

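This change renames the `classes` input field of `tasks.span_labeling.extraction` to `entity_types` and removes `class_type`. Data prepared for the old field names would need the same rename. A minimal sketch of such a conversion follows; `migrate_instance` is a hypothetical helper written for this note, not something shipped with Unitxt.

```python
def migrate_instance(instance):
    """Return a copy of an instance that uses the renamed task fields."""
    migrated = dict(instance)  # leave the caller's dict untouched
    if "classes" in migrated:
        if "entity_types" in migrated:
            raise ValueError("instance has both 'classes' and 'entity_types'")
        migrated["entity_types"] = migrated.pop("classes")
    # 'class_type' is no longer a task field; the templates now say "entity type".
    migrated.pop("class_type", None)
    return migrated


old = {
    "text": "John lives in Texas.",
    "class_type": "entity type",
    "classes": ["Person", "Location"],
}
new = migrate_instance(old)
print(new)
```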
diff --git a/prepare/templates/span_labeling/templates.py b/prepare/templates/span_labeling/templates.py
@@ -7,7 +7,7 @@
 add_to_catalog(
     SpanLabelingTemplate(
         input_format="{text_type}: {text}",
-        instruction="From the following {text_type}, extract the objects for which the {class_type} expressed is one of {classes}.",
+        instruction="From the following {text_type}, extract the objects for which the entity type expressed is one of {entity_types}.",
         postprocessors=["processors.to_span_label_pairs"],
     ),
     "templates.span_labeling.extraction.extract",
@@ -17,7 +17,7 @@
 add_to_catalog(
     SpanLabelingTemplate(
         input_format="{text_type}: {text}",
-        instruction="From the following {text_type}, extract spans having a {class_type}: {classes}.",
+        instruction="From the following {text_type}, extract spans having an entity type: {entity_types}.",
         postprocessors=["processors.to_span_label_pairs"],
     ),
     "templates.span_labeling.extraction.having",
@@ -26,7 +26,7 @@
 
 add_to_catalog(
     SpanLabelingTemplate(
-        input_format="{text_type}: {text}\nFrom this {text_type}, extract entities that carry one of the following types: {classes}.",
+        input_format="{text_type}: {text}\nFrom this {text_type}, extract entities that carry one of the following types: {entity_types}.",
         postprocessors=["processors.to_span_label_pairs"],
     ),
     "templates.span_labeling.extraction.carry",
@@ -36,7 +36,7 @@
 add_to_catalog(
     SpanLabelingTemplate(
         input_format="{text_type}: {text}",
-        instruction="From the following {text_type}, identify spans with {class_type}:{classes}.",
+        instruction="From the following {text_type}, identify spans with entity type:{entity_types}.",
         postprocessors=["processors.to_span_label_pairs"],
     ),
     "templates.span_labeling.extraction.identify",
@@ -55,19 +55,34 @@
 add_to_catalog(
     SpanLabelingTemplate(
         input_format="{text_type}:\n{text}",
-        instruction="From the following {text_type}, extract the objects for which the {class_type} expressed is one of {classes}.",
-        target_prefix="{class_type}:\n",
+        instruction="From the following {text_type}, extract the objects for which the entity type expressed is one of {entity_types}.",
+        target_prefix="entity type:\n",
         postprocessors=["processors.to_span_label_pairs"],
-        title_fields=["text_type", "class_type"],
+        title_fields=["text_type"],
     ),
     "templates.span_labeling.extraction.title",
     overwrite=True,
 )
 
 
+add_to_catalog(
+    SpanLabelingTemplate(
+        instruction="""From the given {text_type}, extract all the entities of the following entity types: {entity_types}.
+Return the output in this exact format:
+The output should be a comma separated list of pairs of entity and corresponding entity_type.
+Use a colon to separate between the entity and entity_type. """,
+        input_format="{text_type}:\n{text}",
+        postprocessors=["processors.to_span_label_pairs"],
+    ),
+    "templates.span_labeling.extraction.detailed",
+    overwrite=True,
+)
+
+
 add_to_catalog(
     TemplatesList(
         items=[
+            "templates.span_labeling.extraction.detailed",
             "templates.span_labeling.extraction.extract",
             "templates.span_labeling.extraction.having",
             "templates.span_labeling.extraction.carry",
@@ -76,7 +76,7 @@
                 ),
                 Set(
                     fields={
-                        "classes": ["Person", "Organization", "Location"],
+                        "entity_types": ["Person", "Organization", "Location"],
                     }
                 ),
             ],
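The new `templates.span_labeling.extraction.detailed` template asks the model for a comma-separated list of pairs, with a colon between the entity and its entity_type. To make that output format concrete, here is a small sketch of parsing such a string into pairs. It is an illustration only and not the code behind `processors.to_span_label_pairs`; `parse_pairs` and its handling of malformed chunks are assumptions.

```python
def parse_pairs(prediction):
    """Parse 'entity: type, entity: type' text into (entity, type) pairs."""
    pairs = []
    for chunk in prediction.split(","):
        # Split on the last colon so an entity such as "10:30 flight" survives.
        entity, sep, entity_type = chunk.rpartition(":")
        entity, entity_type = entity.strip(), entity_type.strip()
        if sep and entity and entity_type:
            pairs.append((entity, entity_type))
        # Chunks without a colon, or with an empty side, are skipped.
    return pairs


parsed = parse_pairs("John: Person, Texas: Location")
print(parsed)
```

A format like this cannot represent entities that themselves contain commas, so a real post-processor may need an escaping rule or a stricter output format.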