Crawl succeeds, but the db file is empty #3

Open
wangbiao92 opened this issue Apr 9, 2017 · 8 comments

@wangbiao92

Login succeeded too, and I crawled Beijing. No errors were reported, but the db file is empty. Where did it go wrong?
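A quick way to check whether anything was actually written, assuming the spider saves to a SQLite file (the filename below is a guess; point it at whatever .db file the run produced):

import sqlite3

conn = sqlite3.connect("lianjia.db")  # hypothetical path; use the actual db file
# Print every table and its row count to see whether any rows were written.
for (name,) in conn.execute("SELECT name FROM sqlite_master WHERE type='table'"):
    count = conn.execute('SELECT COUNT(*) FROM "%s"' % name).fetchone()[0]
    print("%s: %d" % (name, count))
conn.close()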

@wangbiao92
Author

I tried crawling just Dongcheng, and the database is still empty:
E:\Git\LianJiaSpider-master>python LianJiaSpider.py
d:\Anaconda2\lib\site-packages\bs4\__init__.py:166: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

To get rid of this warning, change this:

BeautifulSoup([your markup])

to this:

BeautifulSoup([your markup], "lxml")

markup_type=markup_type))
Crawled all neighborhood (小区) info for Dongcheng district
done
all done ^_^
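For reference, the fix the warning suggests is to name the parser wherever the spider builds its soup (the URL below is only illustrative; the actual call site is inside LianJiaSpider.py). This silences the warning but, as the warning itself says, it is unrelated to the empty db:

import requests
from bs4 import BeautifulSoup

resp = requests.get("http://bj.lianjia.com/xiaoqu/dongcheng/")  # illustrative URL
# Naming the parser pins bs4 to lxml on every machine and silences the warning.
soup = BeautifulSoup(resp.text, "lxml")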

@lanbing510
Owner

Lianjia recently added strict restrictions (captchas and rate limiting), and the code has not been updated for that yet.
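Until the code is updated, one common mitigation is simply to slow the crawl down. A minimal sketch, assuming requests can be funneled through one helper (not the repo's actual structure):

import random
import time
import requests

session = requests.Session()
session.headers.update({"User-Agent": "Mozilla/5.0"})  # browser-like header

def fetch(url):
    # Random 2-5 s pause between requests to stay under the rate limit.
    time.sleep(random.uniform(2, 5))
    return session.get(url, timeout=10)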

@wangbiao92
Author

I modified the code and got the neighborhood info crawled, but as soon as it reaches the transaction records, the IP is flagged as abnormal. I'll go look for a workaround. Thanks.

@pfsun

pfsun commented Apr 11, 2017

@lanbing510 @wangbiao92 The first time I crawled, my db was empty too. How did you fix the empty database? On my second run I kept hitting captchas and rate limits and couldn't log in at all. Could you share the relevant code or solution? Thanks.

@wangbiao92
Author

@pfsun Lianjia changed its pages, so the code has to change accordingly, but the traffic-abnormality problem is still unsolved; using an IP proxy didn't help.
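For anyone who wants to experiment anyway, a minimal sketch of rotating proxies with requests; the proxy addresses are placeholders, and as noted above, this alone did not get past Lianjia's checks:

import random
import requests

# Placeholder proxy pool; fill in real, working proxies.
PROXIES = [
    "http://10.0.0.1:8080",
    "http://10.0.0.2:8080",
]

def fetch_via_proxy(url):
    proxy = random.choice(PROXIES)
    # Route both http and https traffic through the chosen proxy.
    return requests.get(url, proxies={"http": proxy, "https": proxy}, timeout=10)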

@XuefengHuang

You can try my scraper; it stores the data in MySQL. https://github.com/XuefengHuang/lianjia-scrawler

@pfsun

pfsun commented Apr 14, 2017

@wangbiao92 OK, I'll try again. Thanks.

@pfsun

pfsun commented Apr 14, 2017

@XuefengHuang Thanks, I'll give it a try.
