Commit e260e45 (1 parent: c074c4b)
Showing 4 changed files with 153 additions and 11 deletions.
@@ -0,0 +1,54 @@
Setting Configurations
=======================

The Settings module is a central configuration system for managing application-wide settings.
It provides consistent, thread-safe access to configurations and allows settings to be adjusted
dynamically and temporarily overridden within specific contexts. Most of the examples so far have
used the settings to configure the LM.

Using the Settings module
--------------------------
.. code-block:: python

    import lotus
    from lotus.models import LM

    # Create the language model and register it as the global default.
    lm = LM(model="gpt-4o-mini")
    lotus.settings.configure(lm=lm)
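
The introduction mentions thread-safe access and temporary overrides. As a rough illustration of what that can mean, here is a minimal, self-contained sketch of a settings object. The class ``SketchSettings`` and its ``get`` and ``override`` methods are hypothetical names invented for this example; they are not the lotus implementation or API. In this sketch the lock only guards individual reads and writes, and an override is visible to every thread until it is restored.

```python
import threading
from contextlib import contextmanager


class SketchSettings:
    """Hypothetical stand-in for a settings object; not the lotus implementation."""

    def __init__(self, **defaults):
        self._lock = threading.RLock()
        self._values = dict(defaults)

    def configure(self, **kwargs):
        # Reject unknown keys so typos fail loudly instead of being ignored.
        with self._lock:
            for key in kwargs:
                if key not in self._values:
                    raise ValueError(f"Invalid setting: {key}")
            self._values.update(kwargs)

    def get(self, key):
        with self._lock:
            return self._values[key]

    @contextmanager
    def override(self, **kwargs):
        # Remember the current values, apply the temporary ones,
        # and restore the previous values when the block exits.
        with self._lock:
            previous = {k: self._values[k] for k in kwargs if k in self._values}
        self.configure(**kwargs)  # raises ValueError on unknown keys
        try:
            yield self
        finally:
            self.configure(**previous)


settings = SketchSettings(enable_cache=False, cascade_IS_weight=0.5)
with settings.override(enable_cache=True):
    inside = settings.get("enable_cache")   # temporary value
outside = settings.get("enable_cache")      # restored value
```

Restoring in a ``finally`` clause means the previous values come back even if the code inside the ``with`` block raises.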

Configurable Parameters
--------------------------
1. enable_cache:
    * Description: Enables or disables caching mechanisms.
    * Default: False

.. code-block:: python

    lotus.settings.configure(enable_cache=True)

2. cascade_IS_weight:
    * Description: Specifies the weight for importance sampling in cascade operators.
    * Default: 0.5

.. code-block:: python

    lotus.settings.configure(cascade_IS_weight=0.8)

3. cascade_num_calibration_quantiles:
    * Description: Number of quantiles used for calibrating sem_filter.
    * Default: 50

.. code-block:: python

    lotus.settings.configure(cascade_num_calibration_quantiles=100)

4. min_join_cascade_size:
    * Description: Minimum size of a join cascade to trigger additional processing.
    * Default: 100

.. code-block:: python

    lotus.settings.configure(min_join_cascade_size=200)

5. cascade_IS_max_sample_range:
    * Description: Maximum range for sampling during cascade IS operations.
    * Default: 250

.. code-block:: python

    lotus.settings.configure(cascade_IS_max_sample_range=500)

6. cascade_IS_random_seed:
    * Description: Seed value for randomization in cascade IS. Use None for non-deterministic behavior.
    * Default: None

.. code-block:: python

    lotus.settings.configure(cascade_IS_random_seed=42)
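
To give some intuition for how a weight, a sample range, and a seed can interact in importance sampling, here is a small standard-library sketch. It is an illustration under stated assumptions, not the lotus algorithm: the function ``sample_indices``, the mixing formula, and the reading of "sample range" as a cap on the candidate pool are all hypothetical choices made for this example.

```python
import random


def sample_indices(proxy_scores, sample_size, is_weight=0.5,
                   max_sample_range=250, seed=None):
    """Hypothetical sketch: mix score-proportional and uniform sampling."""
    if not proxy_scores or max_sample_range < 1:
        raise ValueError("need at least one candidate to sample from")
    # Assumption: only the first `max_sample_range` items are eligible.
    candidates = list(range(min(len(proxy_scores), max_sample_range)))
    total = sum(proxy_scores[i] for i in candidates)
    uniform = 1.0 / len(candidates)
    weights = []
    for i in candidates:
        # Fall back to uniform if all scores are zero.
        proportional = proxy_scores[i] / total if total > 0 else uniform
        # is_weight=1.0 is purely score-driven; is_weight=0.0 is purely uniform.
        weights.append(is_weight * proportional + (1.0 - is_weight) * uniform)
    rng = random.Random(seed)  # seed=None gives non-deterministic draws
    # Sampling is with replacement in this sketch.
    return rng.choices(candidates, weights=weights, k=sample_size)


scores = [0.9, 0.1, 0.5, 0.7]
first = sample_indices(scores, sample_size=3, is_weight=0.8, seed=42)
second = sample_indices(scores, sample_size=3, is_weight=0.8, seed=42)
print(first == second)  # prints True: the same seed reproduces the same draw
```

Mixing in a uniform component keeps low-scoring items from being excluded entirely, which is one common reason for exposing such a weight as a setting.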
@@ -0,0 +1,60 @@
Prompt Strategies
===================

In addition to calling the semantic operators, advanced prompt strategies can be used to potentially
obtain or improve the desired output. Two prompt strategies that can be used are Chain of Thought (CoT) and
Demonstrations.

Chain of Thought + Demonstrations:
----------------------------------
Chain of Thought reasoning refers to structuring prompts in a way that guides the model through a step-by-step process
to arrive at a final answer. By breaking down complex tasks into intermediate steps, CoT can produce more accurate and
logical output.

Here is a simple example of using chain of thought with the Semantic Filter operator:

.. code-block:: python

    import pandas as pd

    import lotus
    from lotus.models import LM

    lm = LM(model="gpt-4o-mini")
    lotus.settings.configure(lm=lm)

    # Rows to be filtered.
    data = {
        "Course Name": [
            "Probability and Random Processes",
            "Optimization Methods in Engineering",
            "Digital Design and Integrated Circuits",
            "Computer Security",
        ]
    }
    df = pd.DataFrame(data)
    user_instruction = "{Course Name} requires a lot of math"

    # Demonstrations: each example pairs an answer with the reasoning behind it.
    example_data = {
        "Course Name": ["Machine Learning", "Reaction Mechanisms", "Nordic History"],
        "Answer": [True, True, False],
        "Reasoning": [
            "Machine Learning requires a solid understanding of linear algebra and calculus",
            "Reaction Mechanisms requires Ordinary Differential Equations to solve reactor design problems",
            "Nordic History has no math involved",
        ],
    }
    examples = pd.DataFrame(example_data)

    df = df.sem_filter(user_instruction, examples=examples, strategy="cot")
    print(df)

When calling the Semantic Filter operator, we pass in an examples DataFrame as well as the CoT strategy. The examples act as a guide
for how the model should reason and respond to the given instruction. For instance, in the examples DataFrame:

* "Machine Learning" has an answer of True, with reasoning that it requires a solid understanding of linear algebra and calculus.
* "Reaction Mechanisms" also has an answer of True, justified by its reliance on ordinary differential equations for solving reactor design problems.
* "Nordic History" has an answer of False, as it does not involve any mathematical concepts.
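
To make the role of the demonstrations more concrete, the sketch below shows one way such examples could be rendered into a few-shot Chain of Thought prompt. This is only an illustration: ``fill_template``, ``build_cot_prompt``, and the ``Claim`` / ``Reasoning`` / ``Answer`` layout are hypothetical, and the prompt that lotus actually sends to the model may look quite different.

```python
def fill_template(instruction, row):
    # Substitute each {column} placeholder with that row's value.
    text = instruction
    for column, value in row.items():
        text = text.replace("{" + column + "}", str(value))
    return text


def build_cot_prompt(instruction, examples, row):
    """Hypothetical prompt builder; not the lotus prompt format."""
    parts = []
    for example in examples:
        # Each demonstration shows the reasoning before the final answer.
        parts.append(f"Claim: {fill_template(instruction, example)}")
        parts.append(f"Reasoning: {example['Reasoning']}")
        parts.append(f"Answer: {example['Answer']}")
        parts.append("")
    # The new row ends at "Reasoning:" so the model continues step by step.
    parts.append(f"Claim: {fill_template(instruction, row)}")
    parts.append("Reasoning:")
    return "\n".join(parts)


examples = [
    {"Course Name": "Machine Learning", "Answer": True,
     "Reasoning": "Machine Learning requires a solid understanding of linear algebra and calculus"},
    {"Course Name": "Reaction Mechanisms", "Answer": True,
     "Reasoning": "Reaction Mechanisms requires Ordinary Differential Equations to solve reactor design problems"},
    {"Course Name": "Nordic History", "Answer": False,
     "Reasoning": "Nordic History has no math involved"},
]
prompt = build_cot_prompt(
    "{Course Name} requires a lot of math",
    examples,
    {"Course Name": "Computer Security"},
)
print(prompt)
```

Ending the prompt at ``Reasoning:`` invites the model to follow the same reason-then-answer pattern that the demonstrations establish.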

Using the CoT strategy produces output like the following:

+---+----------------------------------------+------------------------------------------------------------+
|   | Course Name                            | explanation_filter                                         |
+===+========================================+============================================================+
| 0 | Probability and Random Processes       | Probability and Random Processes is heavily based on...    |
+---+----------------------------------------+------------------------------------------------------------+
| 1 | Optimization Methods in Engineering    | Optimization Methods in Engineering typically involves...  |
+---+----------------------------------------+------------------------------------------------------------+
| 2 | Digital Design and Integrated Circuits | Digital Design and Integrated Circuits typically covers... |
+---+----------------------------------------+------------------------------------------------------------+
@@ -32,4 +32,3 @@ Example
out = df.sem_agg("Summarize all {Course Name}")._output[0]
print(out)
Output