Commit 6724dd2

chore: update code docs verifier workflow

adeprez committed Oct 1, 2024
1 parent 1770885 commit 6724dd2
Showing 2 changed files with 33 additions and 45 deletions.
7 changes: 2 additions & 5 deletions .github/workflows/docs-code.yaml
@@ -38,14 +38,11 @@ jobs:
```yaml
      - name: Install dependencies
        run: |
          pip install ./lavague-core
          pip install ./lavague-integrations/contexts/lavague-contexts-gemini
          pip install ./lavague-integrations/contexts/lavague-contexts-openai
          pip install ./lavague-integrations/drivers/lavague-drivers-selenium
          pip install ./lavague-sdk
          sudo apt-get install -y jq
          pip freeze | grep lavague
          git log | head -1
          export DISABLE_LAVAGUE_ANIMATION=1
          export LAVAGUE_TELEMETRY=NONE
      - name: Check code consistency with docs/index.md
        run: |
```
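The diff above is truncated before the body of the "Check code consistency with docs/index.md" step, so the actual verifier script is not visible here. As a rough illustration of what such a docs-code check can look like (every name below is an assumption for this sketch, not LaVague's actual implementation), one can extract the fenced Python blocks from a Markdown file and syntax-check each one:

```python
import ast
import re

# Matches the body of every ```python fenced block in a Markdown document.
FENCE = re.compile(r"```python\n(.*?)```", re.DOTALL)

def extract_python_blocks(markdown: str) -> list[str]:
    """Return the body of each ```python fenced block."""
    return FENCE.findall(markdown)

def check_blocks(markdown: str) -> list[str]:
    """Syntax-check each extracted block; return an error per failing block."""
    errors = []
    for i, block in enumerate(extract_python_blocks(markdown)):
        try:
            ast.parse(block)
        except SyntaxError as exc:
            errors.append(f"block {i}: {exc}")
    return errors

doc = "Intro\n```python\nx = 1 + 1\n```\nmore\n```python\ndef f(:\n```\n"
print(check_blocks(doc))  # the second block fails to parse
```

A real verifier would likely go further (e.g. executing the snippets against the installed `lavague` packages), but a syntax check alone already catches docs that drift out of date with an API rename.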
71 changes: 31 additions & 40 deletions README.md
@@ -64,19 +64,10 @@ pip install lavague
2. Use our framework to build a Web Agent and implement the objective:

```diff
-from lavague.core import WorldModel, ActionEngine
-from lavague.core.agents import WebAgent
-from lavague.drivers.selenium import SeleniumDriver
-
-selenium_driver = SeleniumDriver(headless=False)
-world_model = WorldModel()
-action_engine = ActionEngine(selenium_driver)
-agent = WebAgent(world_model, action_engine)
-agent.get("https://huggingface.co/docs")
-agent.run("Go on the quicktour of PEFT")
-
-# Launch Gradio Agent Demo
-agent.demo("Go on the quicktour of PEFT")
+from lavague.sdk import WebAgent
+
+agent = WebAgent()
+trajectory = agent.run("https://huggingface.co/docs", "Go on the quicktour of PEFT")
```

For more information on this example and how to use LaVague, see our [quick-tour](https://docs.lavague.ai/en/latest/docs/get-started/quick-tour/).
@@ -156,33 +147,33 @@ The cost of these LLM calls depends on:

Please see our [dedicated documentation on token counting and cost estimations](https://docs.lavague.ai/en/latest/docs/get-started/token-usage/) to learn how you can track all tokens and estimate costs for running your agents.
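The linked documentation covers actual token tracking; as a back-of-the-envelope illustration of the arithmetic (the per-token prices below are placeholder assumptions, not real provider rates), the cost of a run is a sum over token categories:

```python
def estimate_cost(usage: dict[str, int], prices: dict[str, float]) -> float:
    """Sum cost across token categories; prices are USD per 1,000 tokens."""
    return sum(usage[k] / 1000 * prices[k] for k in usage)

# Placeholder prices -- check your model provider's pricing page.
prices = {"prompt_tokens": 0.01, "completion_tokens": 0.03}

# Example usage totals for one agent run.
usage = {"prompt_tokens": 12000, "completion_tokens": 2000}
print(round(estimate_cost(usage, prices), 4))  # 0.18
```

Because agents make several LLM and multimodal LLM calls per step, summing usage across every call of every step is what makes the estimate meaningful.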

## 📈 Data collection

We want to build a dataset that can be used by the AI community to build better Large Action Models for better Web Agents. You can see our work so far on building community datasets on our [BigAction HuggingFace page](https://huggingface.co/BigAction).

This is why LaVague collects the following user data telemetry by default:

- Version of LaVague installed
- Code / list of actions generated for each web action step
- The past actions
- The "observations" (method used to check the current page)
- LLM used (e.g. GPT-4)
- Multimodal LLM used (e.g. GPT-4)
- Randomly generated anonymous user ID
- Whether you are using a CLI command (for example, lavague-qa), the Gradio demo, or our library directly
- The objective used
- The agent's chain of thought
- The interaction zone on the page (bounding box)
- The viewport size of your browser
- The current step
- The instruction(s) generated and the current engine used
- The token costs and usage
- The URL you performed an action on
- Whether the action failed or succeeded
- The extra user data specified
- Error message, where relevant
- The source nodes (chunks of HTML code retrieved from the web page to perform the action)

**Be careful to NEVER include personal information in your objectives or extra user data. If you intend to include personal information in your objectives/extra user data, it is HIGHLY recommended to turn off telemetry.**

### 🚫 Turn off all telemetry
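The body of this section is cut off in the diff, but the workflow changed earlier in this commit hints at the mechanism: it exports `LAVAGUE_TELEMETRY=NONE` (alongside `DISABLE_LAVAGUE_ANIMATION=1`). A sketch of setting the same variables from Python, presumably before `lavague` is imported so the library sees them at load time (that ordering is an assumption, not something this diff confirms):

```python
import os

# Environment variables taken from the CI workflow in this commit.
os.environ["LAVAGUE_TELEMETRY"] = "NONE"       # opt out of telemetry
os.environ["DISABLE_LAVAGUE_ANIMATION"] = "1"  # also set in the workflow

print(os.environ["LAVAGUE_TELEMETRY"])  # NONE
```

In CI or a shell session, the equivalent is the `export` lines shown in the workflow diff above.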