feat: Llm operator #1313
base: staging
Conversation
```python
def exec(self, *args, **kwargs) -> Iterator[Batch]:
    child_executor = self.children[0]
    for batch in child_executor.exec(**kwargs):
        llm_result = self.llm_expr.evaluate(batch)
```
Is the batch optimization done in the LLMExecutor, and will it be added in future PRs?
Yes
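For context, a minimal sketch of what batched evaluation could look like once that lands (the `frames` accessor, the column names, and the `generate` interface are assumptions for illustration, not the actual implementation):

```python
from typing import Iterator, List

def exec_batched(child_batches: Iterator["Batch"], llm) -> Iterator["Batch"]:
    # Hypothetical sketch: gather all prompts in a batch and issue a single
    # generate() call per batch, instead of one model call per row.
    for batch in child_batches:
        prompts: List[str] = batch.frames["prompt"].tolist()  # assumed column name
        batch.frames["response"] = llm.generate(prompts)      # one batched call
        yield batch
```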
```python
llm_exprs = []
for expr in exprs:
    if is_llm_expression(expr):
        llm_exprs.append(expr.copy())
```
Add a note here: chained function calls will not work. For example, STRTODATAFRAME(LLM('EXTRACT SOME COLUMN', data)).
Yes, I'll add that in the next PR.
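Until then, a hedged sketch of a guard that could reject chained calls early (assuming the is_llm_expression helper from the snippet above and a `children` attribute on expression nodes):

```python
def contains_nested_llm_expression(expr) -> bool:
    # Hypothetical recursive walk: catches LLM calls nested inside other
    # function calls, e.g. STRTODATAFRAME(LLM('EXTRACT SOME COLUMN', data)),
    # which the top-level extraction loop above would miss.
    return any(
        is_llm_expression(child) or contains_nested_llm_expression(child)
        for child in getattr(expr, "children", [])
    )
```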
```python
new_root.append_child(plan_root)
plan_root = new_root
self._plan = plan_root
```
IMO, the generic way is to do it in the optimizer with apply and merge. What will the plan look like if we have SELECT id, LLM(...) FROM some_table;?
It will be Project(id, llm.response) -> LLMExec() -> Get
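A tiny self-contained sketch of that tree, using a stand-in node class (the real planner node types differ; this only visualizes the shape):

```python
class PlanNode:
    # Minimal stand-in for the real logical plan nodes, for illustration only.
    def __init__(self, label: str):
        self.label = label
        self.children = []

    def append_child(self, child: "PlanNode") -> None:
        self.children.append(child)

# Plan for: SELECT id, LLM(...) FROM some_table;
get = PlanNode("Get(some_table)")
llm_exec = PlanNode("LLMExec(llm_expr)")
llm_exec.append_child(get)
project = PlanNode("Project(id, llm.response)")
project.append_child(llm_exec)
plan_root = project  # Project -> LLMExec -> Get
```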
```python
def generate(self, prompts: List[str]) -> List[str]:
    import openai

    @retry(tries=6, delay=20)
```
It might be a good time to also add logging to the retry logic: https://tenacity.readthedocs.io/en/latest/#before-and-after-retry-and-logging
This will log the retry attempts in our logger so the user knows when rate-limiting errors occur. I found this helpful when waiting for a long time. The downside is that we must add the tenacity library to the requirements.
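For reference, a minimal sketch of the tenacity version (the logger name and the mapping from tries=6, delay=20 are assumptions):

```python
import logging

from tenacity import before_sleep_log, retry, stop_after_attempt, wait_fixed

logger = logging.getLogger(__name__)  # assumed: replace with the project logger

@retry(
    stop=stop_after_attempt(6),  # mirrors tries=6
    wait=wait_fixed(20),         # mirrors delay=20 (seconds)
    before_sleep=before_sleep_log(logger, logging.WARNING),  # log each retry wait
)
def completion_with_backoff(**kwargs):
    # Hypothetical helper wrapping the OpenAI call that gets rate-limited.
    import openai
    return openai.ChatCompletion.create(**kwargs)
```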
```python
try_to_import_tiktoken()
import tiktoken

encoding = tiktoken.encoding_for_model(self.model_name)
```
If we already have the response, can we directly compute the cost using the response["usage"] field?
Tiktoken would be good for estimating the cost before executing the query (helpful for query optimization). Maybe we can have two functions, estimate_cost and get_cost. Estimating the cost is not simple, though, because we do not know the completion tokens a priori. We would then need a heuristic for the estimated completion tokens.
Yes, a valid concern. I was also thinking about it.
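To make the split concrete, a rough sketch of the two functions (the price constants and the completion-token heuristic are assumptions; real prices depend on the model):

```python
import tiktoken

PROMPT_PRICE_PER_1K = 0.0015      # assumed per-1K-token price
COMPLETION_PRICE_PER_1K = 0.002   # assumed per-1K-token price

def get_cost(response: dict) -> float:
    # Exact cost after the call, straight from the API's usage stats.
    usage = response["usage"]
    return (
        usage["prompt_tokens"] / 1000 * PROMPT_PRICE_PER_1K
        + usage["completion_tokens"] / 1000 * COMPLETION_PRICE_PER_1K
    )

def estimate_cost(prompt: str, model_name: str, completion_ratio: float = 1.0) -> float:
    # Pre-execution estimate: count prompt tokens with tiktoken and guess the
    # completion tokens with a crude ratio heuristic, since they are unknown a priori.
    encoding = tiktoken.encoding_for_model(model_name)
    prompt_tokens = len(encoding.encode(prompt))
    est_completion_tokens = prompt_tokens * completion_ratio
    return (
        prompt_tokens / 1000 * PROMPT_PRICE_PER_1K
        + est_completion_tokens / 1000 * COMPLETION_PRICE_PER_1K
    )
```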
```diff
@@ -21,3 +21,4 @@
 IFRAMES = "IFRAMES"
 AUDIORATE = "AUDIORATE"
 DEFAULT_FUNCTION_EXPRESSION_COST = 100
+LLM_FUNCTIONS = ["chatgpt", "completion"]
```
Are we not adding the LLM operator to the parser? If so, the allowed LLM names are restricted to this list? For example: SELECT DummyLLM({prompt}, data) FROM fruitTable;
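If that is the case, the check presumably reduces to a name lookup; a hedged sketch (the helper name and expression attributes are assumptions):

```python
LLM_FUNCTIONS = ["chatgpt", "completion"]

def is_llm_expression(expr) -> bool:
    # Hypothetical name-based check: with no parser-level LLM operator, only
    # functions whose names appear in LLM_FUNCTIONS are routed to LLMExec;
    # DummyLLM would fall through to the normal function-expression path.
    return getattr(expr, "name", "").lower() in LLM_FUNCTIONS
```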