Request is rejected by the web server when headless is true. Code works well when headless is false. #1979
Replies: 2 comments
-
Certain functionality doesn't work in headless mode. For example, requests that are created by JS and require authentication that relies on proper browser emulation don't work in headless mode. Since you didn't provide much context, all I can suggest is that you run it in headful mode.
-
Adding to what @lunden23 mentioned here, my scraping success rate was absolute trash in headless mode (you might as well just use requests with headers and a proxy; I've seen that do better). My workaround is two dedicated workstations that are purely for scraping. I can easily run 12 independent Chrome windows/drivers on each machine and have them scrape in parallel with subprocesses. Note that if you do this, you will need to clean up your Chrome profile data frequently, because starting a Chrome session with undetected-chromedriver creates a new profile. I learned this the hard way when I found 1 TB of Chrome profiles on my PC.
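For the profile cleanup, here is a rough sketch of what a periodic purge could look like. It assumes all of your scraping profiles end up as sub-directories of one folder that you control (for example, by pointing each driver at its own folder there); the function name `purge_stale_profiles` and the one-hour cutoff are made up for illustration and are not part of undetected-chromedriver.

```python
import shutil
import time
from pathlib import Path


def purge_stale_profiles(root, max_age_seconds, now=None):
    """Delete sub-directories of `root` whose mtime is older than the cutoff.

    Hypothetical helper: `root` is assumed to contain one directory per
    Chrome profile. Returns the sorted names of the removed directories.
    """
    root = Path(root)
    now = time.time() if now is None else now
    removed = []
    for entry in root.iterdir():
        if not entry.is_dir():
            continue  # leave stray files alone
        # A directory's mtime only changes when its direct children change,
        # so treat this as a rough "last touched" signal, not an exact one.
        age = now - entry.stat().st_mtime
        if age > max_age_seconds:
            shutil.rmtree(entry, ignore_errors=True)
            removed.append(entry.name)
    return sorted(removed)
```

You could call something like `purge_stale_profiles("/path/to/profiles", 3600)` from a scheduled job on each workstation. Only run it against a folder that holds nothing but throwaway profiles, since it deletes whole directories.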
-
start_urls = ["http://www.chinaunicombidding.cn/bidInformation"]
Response.text
<html><head></head><body></body></html>