python_scraber_foxebook_net

My first scraper =))) It is a study project.

I want to make a scraper for the site 'http://www.foxebook.net/' and save all the data to CSV.

My aim is to learn to work with the lxml library. Right now I don't know how to work with urllib or pycurl, so I will use the Grab framework.
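As a starting point, here is a minimal sketch of fetching a page with Grab and selecting nodes by XPath; the URL is the real start page, but the `//a` expression is only a placeholder, not the actual selector for the book listings:

```python
# -*- coding: utf-8 -*-
from grab import Grab

g = Grab()
g.go('http://www.foxebook.net/')  # fetch the start page

# select() takes an XPath expression and returns the matching nodes;
# '//a' is just an example, the real selectors still have to be found
for node in g.doc.select('//a'):
    print(node.attr('href'))
```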

I think it would be correct to do it in the following steps (a rough skeleton is sketched below):

1. Make a list of pages that contain links to the detail pages.
2. Make a dict of the detail pages.
3. Collect the details into a dict.
   3.1 Test that the download link works.
   3.2 Collect the tags from the page.
4. Write the details to CSV.
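A rough skeleton of that plan, assuming Grab for fetching and the standard csv module for output; every XPath below is a placeholder and the pagination URL pattern is only a guess:

```python
# -*- coding: utf-8 -*-
import csv

from grab import Grab


def list_pages(start_url, page_count):
    # Step 1: build the list of listing pages that contain links to detail pages.
    # The 'page/N/' pattern is an assumption about how the site paginates.
    return [start_url + 'page/%d/' % n for n in range(1, page_count + 1)]


def detail_links(g, page_url):
    # Step 2: collect the links to detail pages from one listing page.
    g.go(page_url)
    return [node.attr('href') for node in g.doc.select('//h3/a')]  # placeholder XPath


def collect_details(g, detail_url):
    # Step 3: collect the details of one book into a dict.
    g.go(detail_url)
    return {
        'url': detail_url,
        'title': g.doc.select('//h1').text(),  # placeholder XPath
    }


def write_csv(rows, path):
    # Step 4: write the collected details to CSV.
    with open(path, 'wb') as fh:
        writer = csv.DictWriter(fh, fieldnames=['url', 'title'])
        writer.writeheader()
        writer.writerows(rows)


if __name__ == '__main__':
    g = Grab()
    rows = []
    for page_url in list_pages('http://www.foxebook.net/', 2):
        for link in detail_links(g, page_url):
            rows.append(collect_details(g, link))
    write_csv(rows, 'books.csv')
```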

Problems: UnicodeEncodeError: 'ascii' codec can't encode character u'\xed' in position 3: ordinal not in range(128)
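The traceback points to Python 2's csv module, which only writes byte strings and falls back to the ascii codec when it gets unicode. A common workaround, sketched here for Python 2, is to encode every value to UTF-8 before writing (the sample title is just an illustrative string containing the character from the traceback):

```python
# -*- coding: utf-8 -*-
import csv


def encode_row(row):
    # Encode unicode values to UTF-8 bytes so the csv writer does not
    # fall back to the ascii codec and raise UnicodeEncodeError.
    return {key: value.encode('utf-8') if isinstance(value, unicode) else value
            for key, value in row.items()}


# u'\xed' is the character (í) from the traceback above
row = {'title': u'Econom\xeda pr\xe1ctica'}

with open('books.csv', 'wb') as fh:
    writer = csv.DictWriter(fh, fieldnames=['title'])
    writer.writeheader()
    writer.writerow(encode_row(row))
```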

To test that a download link is valid, we need to find all links to the file storages and fetch the page behind every URL on http://www.embedupload.com/.

XPath to the table with links to the file storages:
/html/body/table[2]/tbody/tr/td[2]/table/tbody/tr[2]/td/div/table/tbody

XPath to the block with the download URL on the file-host page:
/html/body/table[2]/tbody/tr/td[2]/table/tbody/tr[1]/td/table/tbody/tr[3]/td/div/span/b/a
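A sketch of that check (step 3.1), again assuming Grab. The two XPath expressions below are the ones noted above; the trailing `//a` on the first one is my assumption about where the host links sit inside that table, and the `/tbody` steps may have to be dropped if the raw HTML does not actually contain tbody elements (browsers insert them when you copy an XPath from the dev tools):

```python
# -*- coding: utf-8 -*-
from grab import Grab

# XPath to the table with links to file storages, plus '//a' (assumed) to get the links themselves
HOSTS_TABLE_XPATH = ('/html/body/table[2]/tbody/tr/td[2]/table/tbody'
                     '/tr[2]/td/div/table/tbody//a')

# XPath to the block with the final download URL on a file-host page
DOWNLOAD_LINK_XPATH = ('/html/body/table[2]/tbody/tr/td[2]/table/tbody'
                       '/tr[1]/td/table/tbody/tr[3]/td/div/span/b/a')


def check_download_links(embedupload_url):
    # Step 3.1: follow every file-storage link and record the download URL, if any.
    g = Grab()
    g.go(embedupload_url)
    results = {}
    for host_link in g.doc.select(HOSTS_TABLE_XPATH):
        host_url = host_link.attr('href')
        g.go(host_url)
        download = g.doc.select(DOWNLOAD_LINK_XPATH)
        results[host_url] = download.attr('href') if download.exists() else None
    return results


if __name__ == '__main__':
    # The base URL is a placeholder; a real check would use a specific embedupload.com page.
    print(check_download_links('http://www.embedupload.com/'))
```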
