Estimating the economic value of open source machine learning repositories by scraping GitHub.

This is the code repository for an upcoming paper by Max Langenkamp and Daniel Yue on estimating the value of open source ML repositories.
- `Copy of List of tools for MLOps_v4 - Tools.csv`: the CSV containing the ML repositories of interest, including links to the relevant GitHub repositories.
- `scraped_contributor_information_for_repos.csv`: the file containing the scraped contributor information. Initially this file does not exist; as the script runs (and inevitably has to be rerun, because GitHub rate-limits the scraping), it is used to avoid duplicating effort (see the sketch after this list).
- `scrape_ml_repo.py`: contains all the scraping code. It is very messy, in large part because of time pressure and GitHub rate limiting.
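The resume behaviour and the contributor scraping can be pictured roughly as follows. This is a minimal sketch rather than the actual contents of `scrape_ml_repo.py`: the `repo_url` column name, the helper names, and the use of the `requests` library against the GitHub REST API contributors endpoint are assumptions.

```python
# Hypothetical sketch of the resume-aware scraping; names and columns are assumed.
import csv
import os

import requests

TOOLS_CSV = "Copy of List of tools for MLOps_v4 - Tools.csv"    # input list of repos
OUTPUT_CSV = "scraped_contributor_information_for_repos.csv"    # scraped results / cache
API = "https://api.github.com"


def already_scraped(path=OUTPUT_CSV):
    """Return the set of repo URLs already present in the output CSV (empty if absent)."""
    if not os.path.exists(path):
        return set()
    with open(path, newline="") as f:
        return {row["repo_url"] for row in csv.DictReader(f)}  # "repo_url" column is assumed


def fetch_contributors(owner, repo, token):
    """Fetch (login, commit count) pairs for one repository via the GitHub REST API."""
    resp = requests.get(
        f"{API}/repos/{owner}/{repo}/contributors",
        headers={"Authorization": f"token {token}"},
        params={"per_page": 100},
    )
    resp.raise_for_status()
    return [(c["login"], c["contributions"]) for c in resp.json()]
```

Keeping the already-scraped repo URLs in a set means a rerun after a rate-limit block only requests the repositories that are still missing.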
- Git clone the repo.
- Create a virtualenv and install dependencies:

  ```sh
  python3 -m venv env && source env/bin/activate
  pip3 install -r requirements.txt
  ```
- If you want to run the scraping script from scratch, rename or delete `scraped_contributor_information_for_repos.csv`; otherwise no scraping will happen.
Cautionary note: GitHub rate limits prevent you from scraping all the repos in our list at once. You can either wait an hour and continue, or switch to a VPN once you detect that you have been blocked.
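One way to detect blocking programmatically is to watch the rate-limit headers GitHub returns and sleep until the advertised reset time. The wrapper below is a hypothetical sketch of that approach, not what the script currently does:

```python
# Hypothetical rate-limit-aware GET; the actual script's handling may differ.
import time

import requests


def get_with_rate_limit(url, headers=None):
    """GET a GitHub API URL, sleeping until the rate limit resets when exhausted."""
    while True:
        resp = requests.get(url, headers=headers)
        remaining = resp.headers.get("X-RateLimit-Remaining")
        if resp.status_code == 403 and remaining == "0":
            reset_at = int(resp.headers.get("X-RateLimit-Reset", time.time() + 3600))
            wait = max(reset_at - time.time(), 0) + 5  # small buffer past the reset
            print(f"Rate limited; sleeping {wait:.0f}s until the limit resets")
            time.sleep(wait)
            continue
        return resp
```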