Removing quotes from udf_metadata_key #1026

Merged
Changes from 11 commits
4 changes: 2 additions & 2 deletions README.md
@@ -185,8 +185,8 @@ SELECT ChatGPT('Is this video summary related to Ukraine russia war', text)
CREATE UDF IF NOT EXISTS PredictHouseRent FROM
( SELECT * FROM HomeRentals )
TYPE Ludwig
- 'predict' 'rental_price'
- 'time_limit' 120;
+ PREDICT 'rental_price'
+ TIME_LIMIT 120;
```

</details>
8 changes: 4 additions & 4 deletions benchmark/text_summarization/text_summarization_with_evadb.py
@@ -16,10 +16,10 @@
cursor.query("DROP UDF IF EXISTS TextSummarizer;").df()
cursor.query("""CREATE UDF IF NOT EXISTS TextSummarizer
TYPE HuggingFace
- 'task' 'summarization'
- 'model' 'sshleifer/distilbart-cnn-12-6'
- 'min_length' 5
- 'max_length' 100;""").df()
+ TASK 'summarization'
+ MODEL 'sshleifer/distilbart-cnn-12-6'
+ MIN_LENGTH 5
+ MAX_LENGTH 100;""").df()


cursor.query("DROP TABLE IF EXISTS cnn_news_summary;").df()
8 changes: 4 additions & 4 deletions docs/source/benchmarks/text_summarization.rst
@@ -47,10 +47,10 @@ Creating Text Summarization Function in EvaDB

CREATE UDF IF NOT EXISTS TextSummarizer
TYPE HuggingFace
- 'task' 'summarization'
- 'model' 'sshleifer/distilbart-cnn-12-6'
- 'min_length' 5
- 'max_length' 100;
+ TASK 'summarization'
+ MODEL 'sshleifer/distilbart-cnn-12-6'
+ MIN_LENGTH 5
+ MAX_LENGTH 100;


Tuning EvaDB for Maximum GPU Utilization
4 changes: 2 additions & 2 deletions docs/source/overview/concepts.rst
@@ -24,8 +24,8 @@ Here is set of illustrative EvaQL queries for a ChatGPT-based video question ans
--- After creating the function, we can use the function in any future query
CREATE UDF SpeechRecognizer
TYPE HuggingFace
- 'task' 'automatic-speech-recognition'
- 'model' 'openai/whisper-base';
+ TASK 'automatic-speech-recognition'
+ MODEL 'openai/whisper-base';

-- EvaDB automatically extracts the audio from the videos
--- We only need to run the SpeechRecognizer UDF on the 'audio' column
4 changes: 2 additions & 2 deletions docs/source/reference/ai/hf.rst
@@ -15,8 +15,8 @@ EvaDB supports UDFS similar to `Pipelines <https://huggingface.co/docs/transform

CREATE UDF IF NOT EXISTS HFObjectDetector
TYPE HuggingFace
- 'task' 'object-detection'
- 'model' 'facebook / detr-resnet-50'
+ TASK 'object-detection'
+ MODEL 'facebook / detr-resnet-50'

EvaDB supports all arguments supported by HF pipelines. You can pass those using a key value format similar to task and model above.

8 changes: 4 additions & 4 deletions docs/source/reference/ai/model-train.rst
@@ -12,8 +12,8 @@ Training and Finetuning
CREATE UDF IF NOT EXISTS PredictHouseRent FROM
( SELECT sqft, location, rental_price FROM HomeRentals )
TYPE Ludwig
- 'predict' 'rental_price'
- 'time_limit' 120;
+ PREDICT 'rental_price'
+ TIME_LIMIT 120;

In the above query, you are creating a new customized UDF by automatically training a model from the `HomeRentals` table. The `rental_price` column will be the target column for prediction, while `sqft` and `location` are the inputs.

@@ -24,8 +24,8 @@ You can also simply give all other columns in `HomeRentals` as inputs and let th
CREATE UDF IF NOT EXISTS PredictHouseRent FROM
( SELECT * FROM HomeRentals )
TYPE Ludwig
- 'predict' 'rental_price'
- 'time_limit' 120;
+ PREDICT 'rental_price'
+ TIME_LIMIT 120;

.. note::

2 changes: 1 addition & 1 deletion docs/source/reference/ai/openai.rst
@@ -15,7 +15,7 @@ To create a chat completion UDF in EvaDB, use the following SQL command:

CREATE UDF IF NOT EXISTS OpenAIChatCompletion
IMPL 'evadb/udfs/openai_chat_completion_udf.py'
- 'model' 'gpt-3.5-turbo'
+ MODEL 'gpt-3.5-turbo'

EvaDB supports the following models for chat completion task:

2 changes: 1 addition & 1 deletion docs/source/reference/ai/yolo.rst
@@ -13,7 +13,7 @@ To create a YOLO UDF in EvaDB using Ultralytics models, use the following SQL co

CREATE UDF IF NOT EXISTS Yolo
TYPE ultralytics
- 'model' 'yolov8m.pt'
+ MODEL 'yolov8m.pt'

You can change the `model` value to specify any other model supported by Ultralytics.

6 changes: 3 additions & 3 deletions docs/source/reference/evaql/create.rst
@@ -75,9 +75,9 @@ To register an user-defined function by training a predication model.
CREATE UDF IF NOT EXISTS PredictHouseRent FROM
(SELECT * FROM HomeRentals)
TYPE Ludwig
- 'predict' 'rental_price'
- 'time_list' 120;
- 'tune_for_memory' False;
+ PREDICT 'rental_price'
+ TIME_LIST 120;
+ TUNE_FOR_MEMORY False;

CREATE MATERIALIZED VIEW
------------------------
2 changes: 1 addition & 1 deletion docs/source/usecases/object-detection.rst
@@ -39,7 +39,7 @@ To create a custom ``Yolo`` function based on the popular ``YOLO-v8m`` model, us

CREATE UDF IF NOT EXISTS Yolo
TYPE ultralytics
- 'model' 'yolov8m.pt';
+ MODEL 'yolov8m.pt';

Object Detection Queries
------------------------
4 changes: 2 additions & 2 deletions docs/source/usecases/question-answering.rst
@@ -41,8 +41,8 @@ To create a custom ``SpeechRecognizer`` function based on the popular ``Whisper`

CREATE FUNCTION SpeechRecognizer
TYPE HuggingFace
- 'task' 'automatic-speech-recognition'
- 'model' 'openai/whisper-base';
+ TASK 'automatic-speech-recognition'
+ MODEL 'openai/whisper-base';

.. note::

8 changes: 4 additions & 4 deletions docs/source/usecases/text-summarization.rst
@@ -41,13 +41,13 @@ To create custom ``TextSummarizer`` and ``TextClassifier`` functions, use the ``

CREATE FUNCTION IF NOT EXISTS TextSummarizer
TYPE HuggingFace
- 'task' 'summarization'
- 'model' 'facebook/bart-large-cnn';
+ TASK 'summarization'
+ MODEL 'facebook/bart-large-cnn';

CREATE FUNCTION IF NOT EXISTS TextClassifier
TYPE HuggingFace
- 'task' 'text-classification'
- 'model' 'distilbert-base-uncased-finetuned-sst-2-english';
+ TASK 'text-classification'
+ MODEL 'distilbert-base-uncased-finetuned-sst-2-english';

.. note::

4 changes: 3 additions & 1 deletion evadb/parser/create_udf_statement.py
@@ -86,7 +86,9 @@ def __str__(self) -> str:

if self._metadata is not None:
for key, value in self._metadata:
- s += f" '{key}' '{value}'"
+ # NOTE :- Removing quotes around key and making it upper case
+ # Since in tests we are doing a straight string comparison
+ s += f" {key.upper()} '{value}'"
return s

@property
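
As a rough illustration of the serialization change in `__str__` above, the following standalone sketch reproduces only the metadata loop. `render_metadata` is a hypothetical helper name, not part of EvaDB; the loop body is taken from the hunk, so every value is wrapped in single quotes, including numeric ones.

```python
from typing import Iterable, Tuple, Union

MetadataValue = Union[str, int, float]


def render_metadata(metadata: Iterable[Tuple[str, MetadataValue]]) -> str:
    """Serialize (key, value) pairs as " KEY 'value'" fragments (hypothetical helper)."""
    s = ""
    for key, value in metadata:
        # Keys are emitted unquoted and upper-cased; values stay single-quoted,
        # mirroring the f-string introduced in the hunk above.
        s += f" {key.upper()} '{value}'"
    return s


if __name__ == "__main__":
    print(render_metadata([("task", "summarization"), ("min_length", 5)]))
```

Because the tests compare the generated SQL as a plain string, the upper-cased key is what lets `TASK '...'` in the expected query match the output regardless of how the key was originally typed.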
2 changes: 1 addition & 1 deletion evadb/parser/evadb.lark
@@ -49,7 +49,7 @@ udf_impl: string_literal

udf_metadata: udf_metadata_key udf_metadata_value

- udf_metadata_key: string_literal
+ udf_metadata_key: uid

udf_metadata_value: string_literal | decimal_literal

4 changes: 3 additions & 1 deletion evadb/parser/lark_visitor/_functions.py
@@ -97,7 +97,9 @@ def create_udf(self, tree):
value = key_value_pair[1]
if isinstance(value, ConstantValueExpression):
value = value.value
- metadata.append((key_value_pair[0].value, value)),
+ # Removing .value from key_value_pair[0] since key is now an ID_LITERAL
+ # Adding lower() to ensure the key is in lowercase
+ metadata.append((key_value_pair[0].lower(), value)),

return CreateUDFStatement(
udf_name,
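
Taken together, the grammar change (`udf_metadata_key: uid`) and the visitor change (`.lower()` on the key) mean metadata keys are written as bare identifiers and normalized to lowercase. The sketch below imitates that behavior with a regular expression rather than the Lark grammar; `parse_metadata` and the identifier pattern are assumptions, since the definition of `uid` is not shown in this diff.

```python
import re
from typing import List, Tuple, Union

# One pair: a bare identifier key, then a single-quoted string or a number.
# The identifier pattern is an assumption standing in for the grammar's `uid`.
_PAIR = re.compile(r"([A-Za-z_][A-Za-z0-9_]*)\s+(?:'([^']*)'|(\d+(?:\.\d+)?))")


def parse_metadata(clause: str) -> List[Tuple[str, Union[str, int, float]]]:
    """Extract (key, value) pairs from a metadata clause (illustrative only)."""
    pairs: List[Tuple[str, Union[str, int, float]]] = []
    for key, text, number in _PAIR.findall(clause):
        if number:
            value: Union[str, int, float] = float(number) if "." in number else int(number)
        else:
            value = text
        # Lowercase the key, as the visitor does after this change.
        pairs.append((key.lower(), value))
    return pairs


if __name__ == "__main__":
    print(parse_metadata("TASK 'summarization' Min_Length 5"))
```

In this sketch a key written the old way, such as `'task' 'summarization'`, produces no pairs, because a quoted string is not accepted in the key position.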
2 changes: 1 addition & 1 deletion evadb/parser/utils.py
@@ -70,7 +70,7 @@ def parse_create_udf(
mock_query += f" TYPE {type}"
task, model = kwargs["task"], kwargs["model"]
if task is not None and model is not None:
- mock_query += f" 'task' '{task}' 'model' '{model}'"
+ mock_query += f" TASK '{task}' MODEL '{model}'"
else:
mock_query += f" IMPL '{udf_file_path}'"
mock_query += ";"
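
For callers that assemble `CREATE UDF` statements in Python, as `parse_create_udf` does above, a small builder can emit the unquoted-key form. This is a hypothetical helper (`build_create_udf_query` does not exist in EvaDB) and only a minimal sketch: it does not escape single quotes inside values.

```python
from typing import Union


def build_create_udf_query(
    name: str, udf_type: str, **metadata: Union[str, int, float]
) -> str:
    """Build a CREATE UDF statement with unquoted, upper-cased metadata keys."""
    query = f"CREATE UDF IF NOT EXISTS {name} TYPE {udf_type}"
    for key, value in metadata.items():
        # Strings are single-quoted; numbers are written bare, as in MIN_LENGTH 5.
        rendered = f"'{value}'" if isinstance(value, str) else str(value)
        query += f" {key.upper()} {rendered}"
    return query + ";"


if __name__ == "__main__":
    print(
        build_create_udf_query(
            "TextSummarizer", "HuggingFace", task="summarization", min_length=5
        )
    )
```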
4 changes: 2 additions & 2 deletions evadb/udfs/udf_bootstrap_queries.py
@@ -119,7 +119,7 @@

Yolo_udf_query = """CREATE UDF IF NOT EXISTS Yolo
TYPE ultralytics
- 'model' 'yolov8m.pt';
+ MODEL 'yolov8m.pt';
"""

face_detection_udf_query = """CREATE UDF IF NOT EXISTS FaceDetector
@@ -185,7 +185,7 @@

yolo8n_query = """CREATE UDF IF NOT EXISTS Yolo
TYPE ultralytics
- 'model' 'yolov8n.pt';
+ MODEL 'yolov8n.pt';
"""


6 changes: 3 additions & 3 deletions test/benchmark_tests/test_benchmark_pytorch.py
@@ -109,7 +109,7 @@ def test_automatic_speech_recognition(benchmark, setup_pytorch_tests):
udf_name = "SpeechRecognizer"
create_udf = (
f"CREATE UDF {udf_name} TYPE HuggingFace "
- "'task' 'automatic-speech-recognition' 'model' 'openai/whisper-base';"
+ "TASK 'automatic-speech-recognition' MODEL 'openai/whisper-base';"
)
execute_query_fetch_all(setup_pytorch_tests, create_udf)

@@ -135,14 +135,14 @@ def test_summarization_from_video(benchmark, setup_pytorch_tests):
asr_udf = "SpeechRecognizer"
create_udf = (
f"CREATE UDF {asr_udf} TYPE HuggingFace "
- "'task' 'automatic-speech-recognition' 'model' 'openai/whisper-base';"
+ "TASK 'automatic-speech-recognition' MODEL 'openai/whisper-base';"
)
execute_query_fetch_all(setup_pytorch_tests, create_udf)

summary_udf = "Summarizer"
create_udf = (
f"CREATE UDF {summary_udf} TYPE HuggingFace "
- "'task' 'summarization' 'model' 'philschmid/bart-large-cnn-samsum' 'min_length' 10 'max_length' 100;"
+ "TASK 'summarization' MODEL 'philschmid/bart-large-cnn-samsum' MIN_LENGTH 10 MAX_LENGTH 100;"
)
execute_query_fetch_all(setup_pytorch_tests, create_udf)

@@ -231,7 +231,7 @@ def test_create_udf_with_relational_api(self):
query = create_speech_recognizer_udf_if_not_exists.sql_query()
self.assertEqual(
query,
- """CREATE UDF IF NOT EXISTS SpeechRecognizer TYPE HuggingFace 'task' 'automatic-speech-recognition' 'model' 'openai/whisper-base'""",
+ """CREATE UDF IF NOT EXISTS SpeechRecognizer TYPE HuggingFace TASK 'automatic-speech-recognition' MODEL 'openai/whisper-base'""",
)
create_speech_recognizer_udf_if_not_exists.execute()

@@ -242,7 +242,7 @@ def test_create_udf_with_relational_api(self):
query = create_speech_recognizer_udf.sql_query()
self.assertEqual(
query,
- "CREATE UDF SpeechRecognizer TYPE HuggingFace 'task' 'automatic-speech-recognition' 'model' 'openai/whisper-base'",
+ "CREATE UDF SpeechRecognizer TYPE HuggingFace TASK 'automatic-speech-recognition' MODEL 'openai/whisper-base'",
)
with self.assertRaises(ExecutorError):
create_speech_recognizer_udf.execute()
@@ -58,7 +58,7 @@ def test_ray_error_populate_to_all_stages(self):
udf_name, task = "HFObjectDetector", "image-classification"
create_udf_query = f"""CREATE UDF {udf_name}
TYPE HuggingFace
- 'task' '{task}'
+ TASK '{task}'
"""

execute_query_fetch_all(self.evadb, create_udf_query)
30 changes: 15 additions & 15 deletions test/integration_tests/long/test_huggingface_udfs.py
@@ -55,7 +55,7 @@ def test_io_catalog_entries_populated(self):
udf_name, task = "HFObjectDetector", "image-classification"
create_udf_query = f"""CREATE UDF {udf_name}
TYPE HuggingFace
- 'task' '{task}'
+ TASK '{task}'
"""

execute_query_fetch_all(self.evadb, create_udf_query)
@@ -79,7 +79,7 @@ def test_raise_error_on_unsupported_task(self):
task = "zero-shot-object-detection"
create_udf_query = f"""CREATE UDF {udf_name}
TYPE HuggingFace
- 'task' '{task}'
+ TASK '{task}'
"""
# catch an assert

@@ -95,8 +95,8 @@ def test_object_detection(self):
udf_name = "HFObjectDetector"
create_udf_query = f"""CREATE UDF {udf_name}
TYPE HuggingFace
- 'task' 'object-detection'
- 'model' 'facebook/detr-resnet-50';
+ TASK 'object-detection'
+ MODEL 'facebook/detr-resnet-50';
"""
execute_query_fetch_all(self.evadb, create_udf_query)

@@ -141,7 +141,7 @@ def test_image_classification(self):
udf_name = "HFImageClassifier"
create_udf_query = f"""CREATE UDF {udf_name}
TYPE HuggingFace
- 'task' 'image-classification'
+ TASK 'image-classification'
"""
execute_query_fetch_all(self.evadb, create_udf_query)

@@ -180,7 +180,7 @@ def test_text_classification(self):
udf_name = "HFTextClassifier"
create_udf_query = f"""CREATE UDF {udf_name}
TYPE HuggingFace
- 'task' 'text-classification'
+ TASK 'text-classification'
"""
execute_query_fetch_all(self.evadb, create_udf_query)

@@ -216,7 +216,7 @@ def test_automatic_speech_recognition(self):
udf_name = "SpeechRecognizer"
create_udf = (
f"CREATE UDF {udf_name} TYPE HuggingFace "
- "'task' 'automatic-speech-recognition' 'model' 'openai/whisper-base';"
+ "TASK 'automatic-speech-recognition' MODEL 'openai/whisper-base';"
)
execute_query_fetch_all(self.evadb, create_udf)

@@ -247,14 +247,14 @@ def test_summarization_from_video(self):
asr_udf = "SpeechRecognizer"
create_udf = (
f"CREATE UDF {asr_udf} TYPE HuggingFace "
- "'task' 'automatic-speech-recognition' 'model' 'openai/whisper-base';"
+ "TASK 'automatic-speech-recognition' MODEL 'openai/whisper-base';"
)
execute_query_fetch_all(self.evadb, create_udf)

summary_udf = "Summarizer"
create_udf = (
f"CREATE UDF {summary_udf} TYPE HuggingFace "
- "'task' 'summarization' 'model' 'philschmid/bart-large-cnn-samsum' 'min_length' 10 'max_new_tokens' 100;"
+ "TASK 'summarization' MODEL 'philschmid/bart-large-cnn-samsum' MIN_LENGTH 10 'max_new_tokens' 100;"
)
execute_query_fetch_all(self.evadb, create_udf)

@@ -279,8 +279,8 @@ def test_toxicity_classification(self):
udf_name = "HFToxicityClassifier"
create_udf_query = f"""CREATE UDF {udf_name}
TYPE HuggingFace
- 'task' 'text-classification'
- 'model' 'martin-ha/toxic-comment-model'
+ TASK 'text-classification'
+ MODEL 'martin-ha/toxic-comment-model'
"""
execute_query_fetch_all(self.evadb, create_udf_query)

@@ -328,8 +328,8 @@ def test_multilingual_toxicity_classification(self):
udf_name = "HFMultToxicityClassifier"
create_udf_query = f"""CREATE UDF {udf_name}
TYPE HuggingFace
- 'task' 'text-classification'
- 'model' 'EIStakovskii/xlm_roberta_base_multilingual_toxicity_classifier_plus'
+ TASK 'text-classification'
+ MODEL 'EIStakovskii/xlm_roberta_base_multilingual_toxicity_classifier_plus'
"""
execute_query_fetch_all(self.evadb, create_udf_query)

@@ -376,7 +376,7 @@ def test_named_entity_recognition_model_all_pdf_data(self):
udf_name = "HFNERModel"
create_udf_query = f"""CREATE UDF {udf_name}
TYPE HuggingFace
- 'task' 'ner'
+ TASK 'ner'
"""
execute_query_fetch_all(self.evadb, create_udf_query)

@@ -411,7 +411,7 @@ def test_named_entity_recognition_model_no_ner_data_exists(self):
udf_name = "HFNERModel"
create_udf_query = f"""CREATE UDF {udf_name}
TYPE HuggingFace
- 'task' 'ner'
+ TASK 'ner'
"""
execute_query_fetch_all(self.evadb, create_udf_query)

4 changes: 2 additions & 2 deletions test/integration_tests/long/test_model_train.py
@@ -60,8 +60,8 @@ def test_ludwig_automl(self):
CREATE UDF IF NOT EXISTS PredictHouseRent FROM
( SELECT * FROM HomeRentals )
TYPE Ludwig
- 'predict' 'rental_price'
- 'time_limit' 120;
+ PREDICT 'rental_price'
+ TIME_LIMIT 120;
"""
execute_query_fetch_all(self.evadb, create_predict_udf)

4 changes: 2 additions & 2 deletions test/integration_tests/long/test_reuse.py
@@ -42,8 +42,8 @@ def _load_hf_model(self):
udf_name = "HFObjectDetector"
create_udf_query = f"""CREATE UDF {udf_name}
TYPE HuggingFace
- 'task' 'object-detection'
- 'model' 'facebook/detr-resnet-50';
+ TASK 'object-detection'
+ MODEL 'facebook/detr-resnet-50';
"""
execute_query_fetch_all(self.evadb, create_udf_query)
