Scraping succeeds, but the db file is empty #3
Comments
I just tried crawling Dongcheng (东城), but the database was still empty. I also get this BeautifulSoup warning: `To get rid of this warning, change this: BeautifulSoup([your markup]) to this: BeautifulSoup([your markup], "lxml")`
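The warning quoted above is BeautifulSoup complaining that no parser was explicitly specified, and it is harmless (it does not cause the empty db). A minimal sketch of the fix, using the stdlib `"html.parser"`; the `"lxml"` suggested by the warning also works if the `lxml` package is installed:

```python
from bs4 import BeautifulSoup

html = "<html><body><h1>某小区</h1></body></html>"

# Passing the parser name explicitly silences the
# "No parser was explicitly specified" warning.
soup = BeautifulSoup(html, "html.parser")
print(soup.h1.text)  # → 某小区
```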
Lianjia recently added strict restrictions (CAPTCHAs and rate limiting), and the code hasn't been updated for them yet.
I modified the code and managed to scrape the community (小区) info, but as soon as I scrape the transaction records my IP gets flagged as abnormal. I'll keep looking for a workaround. Thanks!
@lanbing510 @wangbiao92 The first time I ran the crawler my db was also empty. How did you solve the empty-database problem? On my second run I kept hitting CAPTCHAs and rate limits and couldn't log in at all. Could you share the relevant code or your solution? Thanks!
@pfsun Lianjia changed its page structure, so the code needs updating. But I still haven't solved the traffic-anomaly problem; using IP proxies didn't help.
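No confirmed workaround for the rate limiting appears in this thread, but a common mitigation is to slow the crawl down with exponential backoff plus random jitter between retries, rather than (or in addition to) rotating proxies. A minimal sketch; the `fetch`/`resp_ok` names in the usage comment are hypothetical, not from this repo:

```python
import random
import time

def backoff_delay(attempt, base=2.0, cap=60.0):
    """Exponential backoff with jitter: the wait doubles with each
    blocked attempt, is capped at `cap` seconds, and is randomized
    so retries from many workers don't arrive in lockstep."""
    return min(cap, base * (2 ** attempt)) * (0.5 + random.random() / 2)

# Usage sketch (hypothetical helpers):
# for attempt in range(5):
#     resp = fetch(url)              # your request here
#     if resp_ok(resp):              # not blocked / no CAPTCHA page
#         break
#     time.sleep(backoff_delay(attempt))
```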
You can try my scraper instead; it stores the data in MySQL: https://github.com/XuefengHuang/lianjia-scrawler
@wangbiao92 OK, I'll give it another try. Thanks!
@XuefengHuang Thanks, I'll try it out.
Login succeeds and I'm crawling Beijing with no errors reported, but the db file is empty. Where did it go wrong?
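An empty db file with no errors usually means one of two things: the page structure changed so the selectors match nothing (zero rows are ever inserted), or the rows are inserted but never committed before the connection closes. A minimal sqlite3 sanity check for the second case; the `xiaoqu` table and columns are hypothetical stand-ins, not the repo's actual schema:

```python
import sqlite3

# In-memory db as a stand-in for the scraper's db file.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE xiaoqu (name TEXT, price TEXT)")
cur.execute("INSERT INTO xiaoqu VALUES (?, ?)", ("某小区", "50000"))
conn.commit()  # without commit(), the file on disk can stay empty

count = cur.execute("SELECT COUNT(*) FROM xiaoqu").fetchone()[0]
print(count)  # → 1
```

If the count is already zero here, the problem is upstream: log how many items each parsed page yields before blaming the database layer.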