Scientific Python - Final Project

ToDo:

SCRAPER

Task 1

A function to scrape all items matching a search query. E.g. the user wants all entries from pages 1 to 10 containing men's t-shirts.

Reading the config file and creating the respective URL
Iterating through the pages
- Handling the min and max number of articles to read
- Implementing a stop criterion
- Fail case: what happens if there is no next page?
Complete the category and size IDs
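
The page iteration described above could look roughly like the following sketch. It is not the project's implementation: `fetch_page` is a hypothetical callable that returns the list of items on a page (or `None` / an empty list if the page does not exist), and `max_items` is an assumed name for the upper article limit. A minimum amount could be checked by the caller once the iteration has finished.

```python
from typing import Callable, Iterator, List, Optional


def iterate_pages(
    fetch_page: Callable[[int], Optional[List[dict]]],
    first_page: int = 1,
    last_page: int = 10,
    max_items: Optional[int] = None,
) -> Iterator[dict]:
    """Yield items page by page until a stop criterion is met.

    fetch_page is a hypothetical callable (an assumption of this sketch):
    it returns the items found on a page, or None/[] if there is no such page.
    """
    seen = 0
    for page in range(first_page, last_page + 1):
        items = fetch_page(page)
        if not items:
            # Fail case: there is no next page, so stop quietly.
            break
        for item in items:
            if max_items is not None and seen >= max_items:
                # Stop criterion: enough articles have been read.
                return
            yield item
            seen += 1


# A fake three-page store stands in for the real website.
fake_store = {1: [{"id": 1}, {"id": 2}], 2: [{"id": 3}], 3: [{"id": 4}]}
items = list(iterate_pages(fake_store.get, first_page=1, last_page=10, max_items=3))
```

Writing the loop as a generator keeps the stop logic in one place, and the caller can still break out early.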

Task 2

Given a username, find all information about that person. Check the reviews, save links to all listed items, and check the number of likes and the overall prices.

Find the URL for users
- Search for the user, since the URL needs the user's ID
Find useful tags or XPaths for the reviews
Looping over multiple pages is already provided

VISUALIZATION

Plotting the different attributes according to clothing type
Comparing availability between genders
Plot the color diversity between genders
...
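
Before the color diversity can be plotted, the scraped rows have to be aggregated per gender. A minimal sketch of that step follows, assuming each row is a dict (for example from `csv.DictReader`) with `gender` and `color` keys. Those column names are assumptions and the sample rows are made up, not real scraped data.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping, Set


def color_diversity(rows: Iterable[Mapping[str, str]]) -> Dict[str, int]:
    """Count the number of distinct colors listed per gender."""
    colors: Dict[str, Set[str]] = defaultdict(set)
    for row in rows:
        # Normalise the color so "Red" and "red " count as one color.
        colors[row["gender"]].add(row["color"].strip().lower())
    return {gender: len(found) for gender, found in colors.items()}


# Made-up rows shaped like csv.DictReader output (assumed column names).
sample = [
    {"gender": "female", "color": "red"},
    {"gender": "female", "color": "Green"},
    {"gender": "female", "color": "red "},
    {"gender": "male", "color": "black"},
]
diversity = color_diversity(sample)
```

The resulting dict can be handed to any bar-chart function in the plotting library of choice.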

ANNOTATOR

Create the annotation page
- Handling the dataset
  - Save the images in a separate folder when the user tags them as either like or dislike
  - Save all images from one article
  - (Include relabeling of unseen images)
- Ajax interactions
...
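
Saving a tagged image into its own folder could be sketched as below. The folder layout (`<root>/like`, `<root>/dislike`) and the function name `save_annotation` are assumptions made for illustration; the demo uses a temporary directory and a dummy file instead of a real image.

```python
import shutil
import tempfile
from pathlib import Path

LABELS = {"like", "dislike"}


def save_annotation(image_path: Path, label: str, root: Path) -> Path:
    """Copy an image into root/<label>/ and return the path of the copy."""
    if label not in LABELS:
        raise ValueError(f"unknown label: {label!r}")
    target_dir = root / label
    target_dir.mkdir(parents=True, exist_ok=True)  # create the folder on first use
    target = target_dir / image_path.name
    shutil.copy2(image_path, target)  # copy, so the original stays untouched
    return target


# Demo with a dummy file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "article_1_img_0.jpg"
    source.write_bytes(b"not a real image")
    saved = save_annotation(source, "like", Path(tmp) / "annotated")
    saved_exists = saved.exists()
    saved_parent = saved.parent.name
```

Copying instead of moving keeps the original download in place, which makes relabeling possible later.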

Short description

This project will contain an interactive web scraper with a browser-based user interface where the user can filter their search and annotate the scraped data. However, the scraper will only work on the Vinted.com web store.

Requirements

For the scraper to run

Install the Python requirements using pip:
pip install -r requirements.txt

Create the conda environment using:
conda env create -f scrapper.yml

Instructions

Each of the modules has a separate argument. If you want to run the scraper, just type:
python run.py --mode scrapper --page_range [1-15]
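
For illustration, the command-line handling behind such a call could be sketched with `argparse` as follows. This is an assumption about how `run.py` might parse its arguments, not its actual code: the mode names are guessed from the files in `scripts`, and `parse_page_range` is a hypothetical helper.

```python
import argparse
from typing import Tuple


def parse_page_range(text: str) -> Tuple[int, int]:
    """Turn '1-15' or '[1-15]' into the tuple (1, 15)."""
    start, sep, end = text.strip("[] ").partition("-")
    if not sep:
        raise argparse.ArgumentTypeError(f"expected START-END, got {text!r}")
    try:
        first, last = int(start), int(end)
    except ValueError:
        raise argparse.ArgumentTypeError(f"page numbers must be integers: {text!r}")
    if first < 1 or last < first:
        raise argparse.ArgumentTypeError(f"invalid page range: {text!r}")
    return first, last


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Run one of the project modules.")
    # Mode names are assumed to match the files in the scripts folder.
    parser.add_argument("--mode", choices=["scrapper", "annotator", "trainer"], required=True)
    parser.add_argument("--page_range", type=parse_page_range, default=(1, 1))
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--mode", "scrapper", "--page_range", "[1-15]"])
```

Validating the range inside the `type` callback lets `argparse` report malformed input with its usual usage message.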

When running the scraper you have to create or modify a config file, in this case config.yaml. Within this file you can specify which kind of clothes you want and define other parameters such as size, color, brand, etc. The entries in the config file should follow this pattern (config.yaml):

args = [
        {"gender": "female", "category": "shoes", "size": [36, 37], "color": ["red", "green"], "brand": ["nike", "adidas"]},
        {"gender": "male", "category": "shoes", "size": [36, 37], "color": ["red", "green"], "brand": ["nike", "adidas"]},
        {"gender": "male", "category": "pants", "size": ["M", "L"]}
    ]

or using the YAML notation:

args:
    - {"gender": "female", "category": "shoes", "size": [36, 37], "color": ["red", "green"], "brand": ["nike", "adidas"]}
    - {"gender": "male", "category": "shoes", "size": [36, 37], "color": ["red", "green"], "brand": ["nike", "adidas"]}
    - {"gender": "male", "category": "pants", "size": ["M", "L"]}
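
One way a config entry like the ones above might be turned into a search URL is sketched below. Everything specific here is an assumption: the base URL, the query parameter names, and all numeric IDs are placeholders, not values taken from the project or verified against the website. Color and brand filters are left out and would follow the same pattern once their IDs are known.

```python
from typing import List, Tuple
from urllib.parse import urlencode

BASE_URL = "https://www.vinted.com/catalog"  # assumed endpoint

# Placeholder lookup tables; the real category and size IDs still have to be collected.
CATEGORY_IDS = {("male", "shoes"): 101, ("female", "shoes"): 102, ("male", "pants"): 103}
SIZE_IDS = {"36": 1, "37": 2, "M": 3, "L": 4}


def build_url(entry: dict, page: int = 1) -> str:
    """Build a search URL from one config entry (hypothetical parameter names)."""
    params: List[Tuple[str, object]] = [
        ("catalog[]", CATEGORY_IDS[(entry["gender"], entry["category"])]),
    ]
    # Sizes may be ints or strings in the config, so normalise them to str.
    params += [("size_ids[]", SIZE_IDS[str(size)]) for size in entry.get("size", [])]
    params.append(("page", page))
    return BASE_URL + "?" + urlencode(params)


entry = {"gender": "male", "category": "pants", "size": ["M", "L"]}
url = build_url(entry, page=2)
```

An unknown gender/category pair or size raises a `KeyError`, which makes gaps in the ID tables visible early.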

File structure

data
- contains the generated CSV files with information about the seller, price, color, and clothing type
scripts
- to ease the handling of the multiple modules, we created a separate script file for each module, which is called by passing the respective argument when running the main file
src
- the code modules for the different functionalities

├── data
│ ├── run0.csv
│ └── run1.csv
├── scripts
│ ├── annotator.py
│ ├── scrapper.py
│ └── trainer.py

