[FIR-176] JSONDecodeError for Firecrawl #912

Closed
zahra-teb opened this issue Nov 20, 2024 · 3 comments
Labels
bug Something isn't working question Further information is requested sdk

Comments

@zahra-teb

Describe the Bug
The following code raises JSONDecodeError: Expecting value: line 2 column 1 (char 1)

from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key='fc-*********************')

response = app.scrape_url(url='https://www.zoomit.ir/review/416149-samsung-galaxy-book-3-ultra-review/', params={
	'formats': [ 'markdown' ],
})

{
"name": "JSONDecodeError",
"message": "Expecting value: line 2 column 1 (char 1)",
"stack": "---------------------------------------------------------------------------
JSONDecodeError Traceback (most recent call last)
File ~/newest_software/software/myenv/lib/python3.11/site-packages/requests/models.py:974, in Response.json(self, **kwargs)
973 try:
--> 974 return complexjson.loads(self.text, **kwargs)
975 except JSONDecodeError as e:
976 # Catch JSON-related errors and raise as requests.JSONDecodeError
977 # This aliases json.JSONDecodeError and simplejson.JSONDecodeError

File /usr/lib/python3.11/json/__init__.py:346, in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
343 if (cls is None and object_hook is None and
344 parse_int is None and parse_float is None and
345 parse_constant is None and object_pairs_hook is None and not kw):
--> 346 return _default_decoder.decode(s)
347 if cls is None:

File /usr/lib/python3.11/json/decoder.py:337, in JSONDecoder.decode(self, s, _w)
333 """Return the Python representation of s (a str instance
334 containing a JSON document).
335
336 """
--> 337 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
338 end = _w(s, end).end()

File /usr/lib/python3.11/json/decoder.py:355, in JSONDecoder.raw_decode(self, s, idx)
354 except StopIteration as err:
--> 355 raise JSONDecodeError("Expecting value", s, err.value) from None
356 return obj, end

JSONDecodeError: Expecting value: line 2 column 1 (char 1)

During handling of the above exception, another exception occurred:

JSONDecodeError Traceback (most recent call last)
Cell In[1], line 6
2 from firecrawl import FirecrawlApp
4 app = FirecrawlApp(api_key='fc-*******************')
----> 6 response = app.scrape_url(url='https://www.zoomit.ir/review/416149-samsung-galaxy-book-3-ultra-review/', params={
7 \t'formats': [ 'markdown' ],
8 })

File ~/newest_software/software/myenv/lib/python3.11/site-packages/firecrawl/firecrawl.py:89, in FirecrawlApp.scrape_url(self, url, params)
87 raise Exception(f'Failed to scrape URL. Error: {response}')
88 else:
---> 89 self._handle_error(response, 'scrape URL')

File ~/newest_software/software/myenv/lib/python3.11/site-packages/firecrawl/firecrawl.py:592, in FirecrawlApp._handle_error(self, response, action)
581 def _handle_error(self, response: requests.Response, action: str) -> None:
582 """
583 Handle errors from API responses.
584
(...)
590 Exception: An exception with a message containing the status code and error details from the response.
591 """
--> 592 error_message = response.json().get('error', 'No error message provided.')
593 error_details = response.json().get('details', 'No additional error details provided.')
595 if response.status_code == 402:

File ~/newest_software/software/myenv/lib/python3.11/site-packages/requests/models.py:978, in Response.json(self, **kwargs)
974 return complexjson.loads(self.text, **kwargs)
975 except JSONDecodeError as e:
976 # Catch JSON-related errors and raise as requests.JSONDecodeError
977 # This aliases json.JSONDecodeError and simplejson.JSONDecodeError
--> 978 raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)

JSONDecodeError: Expecting value: line 2 column 1 (char 1)"
}
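
The position in the message (line 2 column 1, char 1) is what Python's standard json module reports when the text begins with a newline followed by something that is not JSON, for example an HTML error page. A minimal standard-library reproduction follows; the body string is made up for illustration, since the actual response body is not shown in this report:

```python
import json

# Hypothetical non-JSON body: a newline followed by HTML.
# The real response body is not included in the report.
body = "\n<html><body>Forbidden</body></html>"

try:
    json.loads(body)
except json.JSONDecodeError as exc:
    # The decoder skips the leading newline, then fails at index 1.
    message = str(exc)
    print(message)  # Expecting value: line 2 column 1 (char 1)
```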

zahra-teb added the bug label Nov 20, 2024
@zahra-teb
Author

Well, I found the reason :)) The response status is 403!
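
The traceback shows `_handle_error` calling `response.json()` on the error response, so a 403 whose body is not JSON surfaces as a JSONDecodeError instead of the status code. A minimal sketch of a more tolerant approach, using only the standard library; `describe_error_body` is a hypothetical helper written for illustration, not part of the Firecrawl SDK and not necessarily how the SDK handles this:

```python
import json


def describe_error_body(status_code: int, body_text: str) -> str:
    """Build an error message without assuming the body is JSON.

    Hypothetical helper for illustration only.
    """
    try:
        payload = json.loads(body_text)
    except json.JSONDecodeError:
        payload = None

    if isinstance(payload, dict):
        # JSON object: use its "error" field when present.
        error = payload.get("error", "No error message provided.")
    else:
        # Non-JSON body (e.g. an HTML 403 page): fall back to a short excerpt.
        excerpt = body_text.strip()[:200]
        error = excerpt or "No error message provided."

    return f"Status code {status_code}. Error: {error}"
```

Called as `describe_error_body(response.status_code, response.text)` on a `requests` response, this would report the 403 directly instead of failing while parsing the body.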

@nickscamara
Member

Oh interesting! @rafaelsideguide did we patch this?

nickscamara added the question and sdk labels Dec 6, 2024
Collaborator

ftonato commented Dec 27, 2024

Hello @zahra-teb,

After conducting tests, I found that the problem is no longer reproducible. As such, I will be closing this issue for now.

If the problem persists or reoccurs, please don't hesitate to open a new ticket, and we'll be happy to assist you further.

For reference, here is the code I used for testing, which consistently returned "statusCode": 200:

from firecrawl import FirecrawlApp
import json

def main():
    app = FirecrawlApp(api_key='fc-****')

    urls = [
        'https://www.zoomit.ir/review/416149-samsung-galaxy-book-3-ultra-review/',
        'https://www.safeschoolsdc.org/financial-assistance-award',
        'https://www.mvcupboard.org/food-assistance'
    ]
    
    params = {
        'formats': ['markdown'],
    }

    for url in urls:
        try:
            response = app.scrape_url(url=url, params=params)
            
            print(f"Scraping successful for {url}!")
            print("\nResponse:")
            print(json.dumps(response, indent=2))
            
        except Exception as e:
            print(f"An error occurred while scraping {url}: {str(e)}")


if __name__ == "__main__":
    main()

ftonato changed the title from "[Bug] JSONDecodeError for Firecrawl" to "JSONDecodeError for Firecrawl" Dec 27, 2024
ftonato changed the title from "JSONDecodeError for Firecrawl" to "[FIR-176] JSONDecodeError for Firecrawl" Dec 27, 2024
ftonato closed this as completed Dec 27, 2024