Invalid model output format. Please follow the correct schema. Details: 1 validation error for CustomAgentOutput #183

Open
aparente opened this issue Jan 28, 2025 · 3 comments

aparente commented Jan 28, 2025

I'm running deepseek-r1:32b with Ollama and hitting this error. It opens Chromium and processes the prompt but does nothing else. I've unchecked "use vision" and haven't seen this error message in other issues.

M3 Max MacBook Pro, web-ui version 1.3

FWIW I also haven't been able to get llama2:7b working, but with different errors.

Edit: duplicate issue here: #179

INFO [browser_use] BrowserUse logging setup complete with level info
INFO [root] Anonymized telemetry enabled. See https://github.com/browser-use/browser-use for more information.

To create a public link, set share=True in launch().
INFO [agent] 🚀 Starting task: go to google.com and type 'OpenAI' click search and give me the first url
INFO [src.agent.custom_agent]
📍 Step 1
INFO [src.agent.custom_agent] 🤯 Start Deep Thinking:
INFO [src.agent.custom_agent]
Okay, I need to figure out how to accomplish the user's task step by step. The task is to go to google.com, type 'OpenAI' in the search bar, click the search button, and then provide the first URL result.

First, since the current URL is about:blank, which means it's a new tab or an empty page, I need to navigate to google.com. So, the first action should be using go_to_url with the URL set to "https://www.google.com".

Once we're on Google's homepage, I should check for the search input element. Typically, Google has an input field where you can type your query. From the hints, it looks like the index for the search input is 0. So, the next action is to input 'OpenAI' into that field.

After typing the search term, the next step is to click the search button. The element with index 1 is likely the search button labeled "Google Search". Clicking this should trigger the search and display the results page.

On the results page, I need to extract the first URL. To do this, I can use the extract_page_content action with include_links set to true. This will capture all the text on the page along with the links, making it easier to identify the first URL related to 'OpenAI'.

I should also consider any potential issues like pop-ups or redirects, but since Google's interface is straightforward, it's probably smooth. Additionally, after clicking search, I need to wait for the page to load before extracting the content.

Finally, once the content is extracted, I can parse through it to find the first URL and provide that back as the result. If there are any unexpected elements or changes in the DOM structure, I might need to adjust my approach, but based on the current setup, this sequence should work.

INFO [src.agent.custom_agent] 🤯 End Deep Thinking
ERROR [agent] ❌ Result failed 1/5 times:
Invalid model output format. Please follow the correct schema.
Details: 1 validation error for CustomAgentOutput
current_state
Field required [type=missing, input_value={'current__state': {'prev...include_links': True}}]}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.10/v/missing
INFO [src.agent.custom_agent]
📍 Step 1
INFO [src.agent.custom_agent] 🤯 Start Deep Thinking:
INFO [src.agent.custom_agent]
Alright, let's try to figure out what went wrong here and how to fix it. So the user gave a task to navigate to Google, search for 'OpenAI', click the search button, and then extract the first URL result. The initial response I provided was supposed to follow specific JSON formatting guidelines.

First, I looked at the task requirements. It's straightforward: go to Google, type in the query, perform the search, and get the first URL. I need to make sure each step is addressed properly.

Next, I checked the structure of my previous response. I realized that I misspelled "current_state" as "current__state". That was a mistake because the correct field name is essential for the JSON to be valid. Without it, the system can't parse the data correctly, leading to validation errors.

I also reviewed the action sequence. Initially, I included actions to go to Google, input the text, click the search button, and extract content. But now that I think about it, maybe extracting content isn't necessary if all I need is the first URL after performing the search. However, without seeing the elements on the page after the search, it's hard to say what the exact next steps are.

Another consideration was the proper formatting of the JSON. Each action must be a dictionary within an array, and each key in those dictionaries must follow the specified schema exactly. I made sure that all action names and their parameters were correctly spelled and structured.

I also thought about whether additional actions might be needed after clicking the search button. For instance, if Google's results page has dynamic content or requires scrolling to reveal the first URL, I might need to include a scroll action before extracting the content. But since the user only asked for the first URL, maybe just navigating and performing the search is enough.

Furthermore, I reflected on the error message received: it pointed out that "current_state" was missing because of a typo. That's a critical oversight, so correcting that was my top priority.

Finally, I realized that without proper error handling or feedback after each action, there might be more issues down the line. But since the main issue was the incorrect JSON structure and typos, focusing on fixing those was the immediate step to take.

INFO [src.agent.custom_agent] 🤯 End Deep Thinking
INFO [src.agent.custom_agent] ✅ Eval: Success - The page is about:blank, no previous actions taken yet.
INFO [src.agent.custom_agent] 🧠 New Memory:
INFO [src.agent.custom_agent] ⏳ Task Progress:

INFO [src.agent.custom_agent] 📋 Future Plans:

  1. Navigate to google.com. 2. Input 'OpenAI' into the search bar. 3. Click the search button. 4. Extract the first URL from the results.
    INFO [src.agent.custom_agent] 🤔 Thought: I need to start by navigating to Google's homepage since the current page is blank. Once there, I'll locate and interact with the search input and button to perform the search for 'OpenAI'. After that, I'll extract the necessary information from the search results page.
    INFO [src.agent.custom_agent] 🎯 Summary: Navigate to Google, perform a search for 'OpenAI', and extract the first URL result.
    INFO [src.agent.custom_agent] 🛠️ Action 1/3: {}
    INFO [src.agent.custom_agent] 🛠️ Action 2/3: {"input_text":{"index":0,"text":"OpenAI"}}
    INFO [src.agent.custom_agent] 🛠️ Action 3/3: {"click_element":{"index":1}}
    INFO [src.agent.custom_agent] 🧠 All Memory:

ERROR [agent] ❌ Result failed 2/5 times:
Error executing action input_text: Element index 0 does not exist - retry or use alternative actions
INFO [src.agent.custom_agent]
📍 Step 2
INFO [src.agent.custom_agent] 🤯 Start Deep Thinking:
INFO [src.agent.custom_agent]
Alright, let me try to figure out how to approach this problem step by step. The user's task is to navigate to google.com, search for 'OpenAI', click the search button, and then extract the first URL from the results.

First, I need to consider the current state of the browser. It's on about:blank with no interactive elements available. So the first logical step is to navigate to google.com using the go_to_url action.

Once at google.com, there should be an input field for the search query and a button to click for searching. However, in my previous attempt, I used index 0 for the input field and index 1 for the search button. This led to an error because the indices didn't exist. That tells me that the elements' indexes aren't as expected.

Perhaps instead of relying solely on indexes, I should use XPath or another method to locate these elements more accurately. Alternatively, maybe scrolling could bring the input field into view if it's not immediately accessible.

I also remember that sometimes websites have dynamic content or pop-ups (like cookies) that might interfere with element interaction. So, checking for and handling any such elements before proceeding would be wise.

Given the error about index 0 not existing, I'll need to adjust my approach. Maybe using XPath selectors to directly target the search input and button instead of relying on indexes. That way, even if the indexes change or aren't available, the actions can still be performed.

Additionally, after performing the search, extracting the first URL requires identifying the correct element in the results page. I should ensure that the extraction method captures this correctly, perhaps by using a more specific XPath or waiting for the results to load fully before extracting.

In summary, my revised plan includes:

  1. Navigating directly to google.com.
  2. Using XPath selectors to locate and input the search query into the correct field.
  3. Clicking the search button using an accurate selector.
  4. Extracting the first URL from the search results efficiently.

I need to make sure each action is correctly formulated with valid parameters and that I handle any potential issues like pop-ups or dynamic content along the way.

INFO [src.agent.custom_agent] 🤯 End Deep Thinking
ERROR [agent] ❌ Result failed 3/5 times:
Invalid model output format. Please follow the correct schema.
Details: 1 validation error for CustomAgentOutput
current_state
Field required [type=missing, input_value={'current__state': {'prev...]/div[2]/button/div'}}]}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.10/v/missing
INFO [src.agent.custom_agent]
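
The validation errors in this log come down to one mismatch: the model emitted `current__state` where the schema requires `current_state`, so the required field is reported as missing. A minimal sketch of one possible workaround, using only the standard library, is to normalize keys before validation. The helper name `normalize_keys` and its repair rules (drop whitespace inside keys, collapse repeated underscores) are assumptions made for illustration; they are not part of browser-use or web-ui.

```python
import json
import re


def normalize_keys(value):
    """Recursively repair dictionary keys such as 'current__state' or 'input_ text'.

    Hypothetical helper: removes whitespace inside keys and collapses runs of
    underscores. Values that are not dicts or lists are returned unchanged.
    """
    if isinstance(value, dict):
        repaired = {}
        for key, inner in value.items():
            clean = re.sub(r"\s+", "", key)    # 'current_ state' -> 'current_state'
            clean = re.sub(r"_+", "_", clean)  # 'current__state' -> 'current_state'
            repaired[clean] = normalize_keys(inner)
        return repaired
    if isinstance(value, list):
        return [normalize_keys(item) for item in value]
    return value


# Sample input modeled on the malformed keys seen in the log.
raw = (
    '{"current__state": {"prev_ action_ evaluation": "Failed"}, '
    '"action": [{"input_ text": {"index": 10, "text": "OpenAI"}}]}'
)
fixed = normalize_keys(json.loads(raw))
print(json.dumps(fixed))
```

Silently repairing keys can hide real schema problems, so a sketch like this is probably better suited to debugging a local model than to a permanent fix.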

@kishangondaliya

I can confirm the same issue with the latest release.

@aparente
Author

Update:

I inadvertently left it running and realized DeepSeek made progress before stalling again. It's UNUSABLY slow, but clearly doing... something. It got as far as highlighting elements on the Google search page before running into more errors.

📍 Step 2
INFO [src.agent.custom_agent] 🤯 Start Deep Thinking:
INFO [src.agent.custom_agent]
Alright, so I'm trying to help the user by automating a browser task. The goal is to go to Google's homepage, type 'OpenAI', click the search button, and then extract the first URL from the results.

First, looking at the problem, the current page is about:blank, which means it's just an empty tab. There are no interactive elements available yet, so I can't input text or click buttons right away. That makes sense because you need to navigate to a specific website before any elements on that site become interactive.

The user provided previous attempts where they tried to go to Google, but encountered issues with element indexes not existing. It seems like the indices 0 and 1 were being used for input and search button clicks, which probably don't exist yet because we're still on about:blank. So, I need a way to navigate to Google first before interacting with any elements.

To fix this, my plan is:

  1. Navigate to Google's Homepage: Use the "go_то_url" action to move from the blank page to Google.com. This should make all of Google's interactive elements available for interaction.

  2. Input Search Query: Once on Google, find and interact with the search bar. Instead of relying on an index that might not exist, I'll use XPath selectors to precisely locate the input field. The XPath "/html/body/div[1]/div[3]/form/.../input" should target the correct input element.

  3. Click the Search Button: After typing the query, I need to click the search button. Again, using XPath here ensures that we're targeting the exact button without depending on an index that might change or not exist in different page states.

  4. Extract Page Content: Finally, after performing the search and landing on the results page, I'll extract the content of the page. By including links, I can then parse out the first URL from the search results to provide it back to the user.

Throughout this process, I need to ensure that each action logically follows the previous one and that I handle any potential errors, such as elements not being found or actions failing due to incorrect selectors. If issues arise, retrying with more accurate XPath expressions or waiting for the page to fully load could be necessary steps.

It's also important to remember to validate each step to confirm that the expected elements exist before attempting interactions. This helps prevent further errors and ensures that the automation process stays on track towards completing the user's task successfully.

INFO [src.agent.custom_agent] 🤯 End Deep Thinking
ERROR [agent] ❌ Result failed 4/5 times:
Invalid model output format. Please follow the correct schema.
Details: 1 validation error for CustomAgentOutput
current_state
Field required [type=missing, input_value={'current_ state': {'prev...form/.../button/div'}}]}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.10/v/missing
INFO [src.agent.custom_agent]
📍 Step 2
INFO [src.agent.custom_agent] 🤯 Start Deep Thinking:
INFO [src.agent.custom_agent]
Okay, so I'm trying to help the user complete their task of going to Google, searching for 'OpenAI', and getting the first URL result. But it seems like there are some issues with how the previous responses were formatted.

First, in the last response, I mistakenly wrote "current__state" instead of "current_state", which is a typo. Also, another typo was "go_то_url" instead of "go_to_url". These typos caused validation errors because the JSON structure didn't match what was expected.

I need to make sure that all the keys in the JSON are correctly spelled. So, I'll correct those typos to ensure the JSON is valid. Also, I should verify that each action within the sequence uses the correct function names and parameters.

Additionally, I'm considering if there's a more reliable way to locate the search input on Google. Sometimes elements can change, so relying solely on indexes might not be the best approach. Maybe using XPath expressions or other selectors could help avoid issues where the index doesn't exist anymore.

But for now, since the main issue was formatting errors, I'll fix those and make sure each action is correctly specified. Once these corrections are made, I should test the sequence to ensure it navigates properly, inputs the search term, clicks the button, and extracts the first URL.

INFO [src.agent.custom_agent] 🤯 End Deep Thinking
INFO [src.agent.custom_agent] ✅ Eval: Success - The page is about:blank, no previous actions taken yet.
INFO [src.agent.custom_agent] 🧠 New Memory:
INFO [src.agent.custom_agent] ⏳ Task Progress:

INFO [src.agent.custom_agent] 📋 Future Plans:

  1. Navigate to google.com. 2. Input 'OpenAI' into the search bar. 3. Click the search button. 4. Extract the first URL from the results.
    INFO [src.agent.custom_agent] 🤔 Thought: I need to start by navigating to Google's homepage since the current page is blank. Once there, I'll locate and interact with the search input and button to perform the search for 'OpenAI'. After that, I'll extract the necessary information from the search results page.
    INFO [src.agent.custom_agent] 🎯 Summary: Navigate to Google, perform a search for 'OpenAI', and extract the first URL result.
    INFO [src.agent.custom_agent] 🛠️ Action 1/4: {"go_to_url":{"url":"https://www.google.com"}}
    INFO [src.agent.custom_agent] 🛠️ Action 2/4: {"input_text":{"index":0,"text":"OpenAI"}}
    INFO [src.agent.custom_agent] 🛠️ Action 3/4: {"click_element":{"index":1}}
    INFO [src.agent.custom_agent] 🛠️ Action 4/4: {"extract_content":{"include_links":true}}
    INFO [src.agent.custom_agent] 🧠 All Memory:

INFO [controller] 🔗 Navigated to https://www.google.com
INFO [controller] Something new appeared after action 1 / 4
INFO [src.agent.custom_agent]
📍 Step 3
INFO [src.agent.custom_agent] 🤯 Start Deep Thinking:
INFO [src.agent.custom_agent]
Okay, let's break down what the user is asking for. They want me to go to Google, type 'OpenAI', click search, and give them the first URL. So far, I've navigated to Google successfully but failed when trying to input text and click the search button.

Looking at the interactive elements provided, there are several options. The search bar is index 10, which seems correct because it's a textarea with the title "Search". Maybe in my previous attempts, I used the wrong index or didn't wait for the page to load properly.

I should try inputting into index 10 this time and then click the search button at index 13. After that, extract the content to find the first URL result.

INFO [src.agent.custom_agent] 🤯 End Deep Thinking
ERROR [agent] ❌ Result failed 1/5 times:
Invalid model output format. Please follow the correct schema.
Details: 1 validation error for CustomAgentOutput
current_state
Field required [type=missing, input_value={'current_ state': {'prev...include_links': True}}]}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.10/v/missing
INFO [src.agent.custom_agent]
📍 Step 3
INFO [src.agent.custom_agent] 🤯 Start Deep Thinking:
INFO [src.agent.custom_agent]

INFO [src.agent.custom_agent] 🤯 End Deep Thinking
INFO [src.agent.custom_agent] ❌ Eval: Failed - The previous actions failed due to incorrect element indexes or timing issues.
INFO [src.agent.custom_agent] 🧠 New Memory:
INFO [src.agent.custom_agent] ⏳ Task Progress:

INFO [src.agent.custom_agent] 📋 Future Plans:

  1. Locate the search input field using index 10. 2. Input 'OpenAI' into the search bar. 3. Click the search button at index 13. 4. Extract the first URL from the results.
    INFO [src.agent.custom_agent] 🤔 Thought: Since previous attempts failed due to incorrect element indexes, I'll use the correct indices based on the current page structure: input_text using index 10 and click_element with index 13 for the search button.
    INFO [src.agent.custom_agent] 🎯 Summary: Input 'OpenAI' into the search bar at index 10, click the search button at index 13, and extract the first URL result.
    INFO [src.agent.custom_agent] 🛠️ Action 1/3: {"input_text":{"index":10,"text":"OpenAI"}}
    INFO [src.agent.custom_agent] 🛠️ Action 2/3: {"click_element":{"index":13}}
    INFO [src.agent.custom_agent] 🛠️ Action 3/3: {"extract_content":{"include_links":true}}
    INFO [src.agent.custom_agent] 🧠 All Memory:

INFO [controller] ⌨️ Input "OpenAI" into index 10
INFO [controller] Something new appeared after action 1 / 3
INFO [src.agent.custom_agent]
📍 Step 4
INFO [src.agent.custom_agent] 🤯 Start Deep Thinking:
INFO [src.agent.custom_agent]
Alright, let's break down how I arrived at the solution for this problem.

First, the user wants me to navigate to Google's homepage, search for "OpenAI," and then extract the first URL from the search results. The initial steps involved navigating to Google, which was straightforward.

However, in previous attempts, using the wrong element indexes caused failures. I realized that relying on incorrect indexes wasn't working, so I needed a different approach.

Looking at the current page elements provided by the user, I noticed that after typing "OpenAI" into the search bar (index 10), a list of suggestions appeared. These are listed as interactive elements from index 15 to 24. The correct suggestion is at index 15: "OpenAI".

Since the search button's index had changed, I needed to adjust that too. The updated search button is now at index 25 instead of the previously incorrect 13. So, I planned to click this new index after selecting the correct suggestion.

Next, to ensure accurate results and avoid timing issues or unexpected suggestions, I decided to use XPath selectors. XPath allows me to target elements more precisely based on their attributes rather than just their position in the DOM tree. This increases reliability, especially when dealing with dynamic content like dropdowns or search suggestions.

By selecting the suggestion using index 15 and then clicking the search button at index 25, I aimed to navigate directly to the desired results page. After that, extracting the first URL from the page would be the final step.

I also considered potential issues such as elements not being immediately clickable after loading or changes in element indexes due to dynamic content. To mitigate these risks, using XPath provides a more robust method compared to relying solely on element indices.

In summary, the key steps were:

  1. Use the correct index for the search bar (index 10) and input "OpenAI."
  2. Select the precise suggestion using its index (index 15).
  3. Click the updated search button (index 25) to execute the search.
  4. Extract the first URL from the resulting page.

This approach ensures that each action is targeted accurately, reducing the chances of errors and improving reliability in navigating through dynamic web pages like Google's search interface.

INFO [src.agent.custom_agent] 🤯 End Deep Thinking
ERROR [agent] ❌ Result failed 1/5 times:
Invalid model output format. Please follow the correct schema.
Details: 1 validation error for CustomAgentOutput
current_state
Field required [type=missing, input_value={'current_ state': {'prev...nclude_ links': True}}]}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.10/v/missing
INFO [src.agent.custom_agent]
📍 Step 4
INFO [src.agent.custom_agent] 🤯 Start Deep Thinking:
INFO [src.agent.custom_agent] ```json
{
"current_ state": {
"prev_action_evaluation": "Failed - The previous action failed because the index 15 element is not interactive or does not exist.",
"important_contents": "",
"task_progress": "",
"future_plans": "1. Use a reliable method to interact with the search input (e.g., using XPath). 2. Input 'OpenAI' and trigger the search action. 3. Extract the first URL from the results.",
"thought": "The previous attempt to click index 15 failed. To avoid this, I'll use XPath selectors for the search input and button to ensure accurate interaction despite dynamic page changes.",
"summary": "Use XPath selectors to interact with the search elements, perform the search, and extract the first URL result."
},
"action": [
{
"input_ text": {
"xpath": "/html/body/div[1]/div[3]/form/div[1]/div[1]/div[2]/div[2]/textarea",
"text": "OpenAI"
}
},
{
"click_element": {
"xpath": "/html/body/div[1]/div[3]/form/div[1]/div[1]/div[4]/center/input[1]"
}
},
{
"extract_ content": {
"include_links": true
}
}
]
}



INFO     [src.agent.custom_agent] 🤯 End Deep Thinking
ERROR    [agent] ❌ Result failed 2/5 times:
 Invalid model output format. Please follow the correct schema.
Details: 2 validation errors for CustomAgentOutput
action.0.input_text.index
  Field required [type=missing, input_value={'xpath': '/html/body/div...area', 'text': 'OpenAI'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.10/v/missing
action.1.click_element.index
  Field required [type=missing, input_value={'xpath': '/html/body/div...div[4]/center/input[1]'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.10/v/missing
INFO     [src.agent.custom_agent] 
📍 Step 4
INFO     [src.agent.custom_agent] 🤯 Start Deep Thinking: 
INFO     [src.agent.custom_agent] ```json
{
    "current_ state": {
        "prev_ action_ evaluation": "Failed  - The previous actions failed due to invalid XPath selectors or missing required fields in the actions.",
        "important_ contents": "",
        "task_ progress": "",
        "future_ plans": "1. Locate the search input using a more reliable XPath selector. 2. Input 'OpenAI' into the search bar. 3. Click the search button at index 25 using its correct XPath. 4. Extract the first URL from the results.",
        "thought": "The previous attempt failed because of incorrect XPath selectors or missing parameters. I need to verify the exact XPath for the search input and button, ensuring all required fields are included in the actions.",
        "summary": "Use corrected XPath selectors to interact with the search elements, perform the search, and extract the first URL result."
    },
    "action": [
        {
            "input_ text": {
                "xpath": "/html/body/div[1]/div[3]/form/div[1]/div[1]/div[2]/div[2]/textarea",
                "text": "OpenAI"
            }
        },
        {
            "click_ element": {
                "index": 25,
                "xpath": "/html/body/div[1]/div[3]/form/div[1]/div[1]/div[4]/center/input[1]"
            }
        },
        {
            "extract_ content": {
                "include_ links": true
            }
        }
    ]
}

INFO [src.agent.custom_agent] 🤯 End Deep Thinking
ERROR [agent] ❌ Result failed 3/5 times:
Expecting value: line 1 column 1 (char 0)
INFO [src.agent.custom_agent]
📍 Step 4
INFO [src.agent.custom_agent] 🤯 Start Deep Thinking:
INFO [src.agent.custom_agent]
Alright, let's tackle this step by step. The user wants me to go to Google, type 'OpenAI', click search, and then provide the first URL from the results.

First, I'm at google.com, so that's done. Next, I need to input 'OpenAI' into the search bar. Looking at the interactive elements provided, index 10 is a textarea with the title "Search", which seems perfect for entering text.

After typing 'OpenAI', Google might display suggestions below the search bar. However, since the user simply wants to click the search button, I don't need to interact with those suggestions. Instead, I'll directly click the search button, which is at index 25.

Once I've clicked the search button, Google will load the results page. To extract the first URL, I can use the extract_content action with include_links set to true. This should give me a list of links from the current page, and the first one in that list should be the top result for 'OpenAI'.

I need to make sure all actions are properly formatted. Each action is an object within a JSON array, following the specified format. Also, I should avoid any markdown or formatting issues that might have caused previous errors.

In summary:

  1. Input 'OpenAI' into index 10.
  2. Click the search button at index 25.
  3. Extract the first URL from the results using extract_content with include_links set to true.

This sequence should successfully fulfill the user's request without any formatting issues or missed steps.

INFO [src.agent.custom_agent] 🤯 End Deep Thinking
ERROR [agent] ❌ Result failed 4/5 times:
Expecting value: line 3 column 1 (char 2)
INFO [src.agent.custom_agent]
📍 Step 4
INFO [src.agent.custom_agent] 🤯 Start Deep Thinking:
INFO [src.agent.custom_agent]
Alright, I need to help the user by navigating to Google, typing 'OpenAI', clicking search, and then extracting the first URL result.

First, let's check where we are now. The current URL is google.com, which means we're already on the right page. So, no need to navigate again.

Next, I need to find where to input the text 'OpenAI'. Looking at the interactive elements provided, there’s a textarea at index 10 labeled "Search". That's perfect for our search query.

Once the text is entered into the search bar, Google usually shows suggestions. But since the user just wants to click the search button, I should proceed directly. The search button is an input element with value "Google Search" and has the index 25. So, I'll use that to submit the search query.

After clicking the search button, the page will load new results. Now, to extract the first URL from these results, I can use the extract_content action with include_ links set to true. This will fetch all links on the current page, and I can then pick the first one which should be the desired result.

I also need to make sure that my actions are correctly formatted according to the user's instructions. Each action in the sequence must be valid and follow the specified JSON structure.

So, summarizing the steps:

  1. Input 'OpenAI' into the search bar at index 10.
  2. Click the Google Search button at index 25 to submit the query.
  3. Extract all links from the results page using extract_ content with include_ links enabled.

By following these steps, I should successfully complete the user's task and provide them with the first URL related to 'OpenAI'.

INFO [src.agent.custom_agent] 🤯 End Deep Thinking
ERROR [agent] ❌ Result failed 5/5 times:
Invalid model output format. Please follow the correct schema.
Details: 1 validation error for CustomAgentOutput
current_state
Field required [type=missing, input_value={'current_ state': {'prev...s [URL]"}, 'action': []}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.10/v/missing
ERROR [agent] ❌ Stopping due to 5 consecutive failures
WARNING [src.agent.custom_agent] No history or first screenshot to create GIF from
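
The two `Expecting value` failures in this log match the message Python's `json.loads` raises when the text does not begin with a JSON value, which is what happens if the reply is wrapped in a Markdown `json` code fence or preceded by other text. A hedged sketch of pulling the JSON object out before parsing follows. `extract_json` is a hypothetical helper, not a browser-use function, and it assumes the first `{` in the reply starts the JSON object.

```python
import json


def extract_json(reply: str) -> dict:
    """Parse the first top-level JSON object found in a model reply.

    Hypothetical helper: skips any leading text (code-fence markers, blank
    lines) and ignores anything that follows the end of the object.
    """
    start = reply.find("{")
    if start == -1:
        raise ValueError("no JSON object found in reply")
    # raw_decode parses one JSON value and tolerates trailing text.
    obj, _end = json.JSONDecoder().raw_decode(reply[start:])
    return obj


fence = "`" * 3  # build the fence marker in code so this example stays fence-safe
reply = (
    "\n\n" + fence + "json\n"
    + '{"action": [{"click_element": {"index": 25}}]}\n'
    + fence + "\n"
)

try:
    json.loads(reply.strip())
except json.JSONDecodeError as err:
    # The fenced reply starts with backticks, so plain parsing fails.
    print("plain json.loads fails:", err.msg)

print(extract_json(reply))
```

If the model writes prose containing a `{` before the JSON, this sketch would fail, so stripping the reasoning text first may still be necessary.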

kishangondaliya commented Jan 29, 2025

So this is what I figured out, I hope it helps.

  • With the latest release, it works. (I said in a previous comment that it was not working, but that was an issue on my side.)
  • Pull the latest code and try running it. The repo is quite new and under rapid development, so fixes land often.
  • Try this release and follow its instructions (are you missing setting Max Actions per Step to 1?): https://github.com/browser-use/web-ui/releases/tag/v1.4
  • DeepSeek-R1 does not support function calling at the moment, which means it will not perform well in browser automation.
  • I used deepseek-r1:7b. I had an Nvidia driver issue, so a bigger model did not work on the CPU. Try a smaller model first.
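
The Max Actions per Step suggestion fits what the log shows: the model placed the search button at index 13 before typing and at index 25 once suggestions had appeared, so actions planned in one batch can target stale indexes. A rough sketch of the idea is to execute only the first planned action and re-plan the rest on the next step. `limit_actions` and the `max_actions_per_step` parameter name are hypothetical, chosen for illustration rather than taken from the web-ui code.

```python
def limit_actions(actions, max_actions_per_step=1):
    """Keep only the first N planned actions; the rest are re-planned next step."""
    if max_actions_per_step < 1:
        raise ValueError("max_actions_per_step must be at least 1")
    return actions[:max_actions_per_step]


# Action batch copied from Step 3 of the log above.
planned = [
    {"input_text": {"index": 10, "text": "OpenAI"}},
    {"click_element": {"index": 13}},
    {"extract_content": {"include_links": True}},
]
print(limit_actions(planned))
```

With a limit of 1, only the `input_text` action would run, and the click would be planned against the page as it looks after the text is entered.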

For me, it worked for simple tasks like googling something, but when I asked it to add a few items to an Amazon cart, it faltered and went into a loop.
