fatal error, been suspended #48
Referenced commit: "…nto a concatenated string, instead of an array, refs #48"
Hi @dvlp123456, to fix the last 2 cases, please try the current version from the main branch, where I just pushed the fix. If you are using the crawler in WSL, updating is straightforward - see the tutorial: https://crawler.siteone.io/installation-and-requirements/manual-installation/#linux-x64-or-wsl-on-windows

In the first case, I see that the request timed out after 5 seconds. Since you're using random parameter generation in the GET query, this may have bypassed the cache, and if the test site is dynamically generated, it may be very slow. Try setting a higher `--timeout`.
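For illustration (a minimal sketch; the URL is a placeholder and the timeout value is just an example, not a recommendation from this thread), the suggested change is to drop `--add-random-query-params` and raise `--timeout`:

```bash
# Hypothetical re-run: without --add-random-query-params the site's cache
# can serve responses, and a higher --timeout tolerates slow dynamic pages.
./crawler --url=https://www.example.com/ \
  --output=text \
  --workers=2 \
  --timeout=120
```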
Hello @janreges, thank you very much! After I updated the crawler, cases 2 and 3 were solved perfectly. Case 1 still has the same error message, although I set the timeout to 120 and removed the --add-random-query-params parameter:

```bash
./crawler --url=https://www.XXXXX.com/ \
  --output=text \
  --workers=2 \
  --memory-limit=1024M \
  --timeout=120 \
  --max-queue-length=3000 \
  --max-visited-urls=10000 \
  --max-url-length=5000 \
  --max-non200-responses-per-basename=10 \
  --remove-query-params \
  --show-scheme-and-host \
  --do-not-truncate-url \
  --output-html-report=tmp/myreport.html \
  --output-json-file=wr_test_dir/report.json \
  --output-text-file=wr_test_dir/report.txt \
  --add-timestamp-to-output-file \
  --add-host-to-output-file \
  --ignore-store-file-error \
  --sitemap-xml-file=wr_test_dir/sitemap.xml \
  --sitemap-txt-file=wr_test_dir/sitemap.txt \
  --sitemap-base-priority=0.5 \
  --sitemap-priority-increase=0.1
```
Please try using `--resolve`. This will bypass possible problems related to DNS resolving. More info about `--resolve` is in the documentation. If the problem persists, it is possible that the target site or its firewall detects the use of a crawler and DROPs the connection. In this case, try to force a custom user-agent, e.g. using `--user-agent`.
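A hedged sketch of both suggestions combined (the host, IP address, and user-agent string below are placeholders, and the curl-style host:port:ip syntax for `--resolve` is an assumption, since the inline code of the original comment was lost in this copy):

```bash
# Placeholder values throughout. --resolve pins DNS resolution for the
# target host (assumed curl-style host:port:ip syntax), and --user-agent
# sends a browser-like identity so a firewall is less likely to DROP the
# connection as an obvious crawler.
./crawler --url=https://www.example.com/ \
  --resolve='www.example.com:443:203.0.113.10' \
  --user-agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36' \
  --timeout=120
```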
Hi, my friend,

I'm new to the crawler. I crawled some websites with siteone-crawler and got correct reports. However, there are three websites that did not produce a correct report; the error message looks like this:

My environment:
I would be very grateful if you could help solve this problem!