Source code and data for open source data science projects for social good. This repository also serves as a data science portfolio.
-
university_sexcrimes
Analysis of data on sex crimes on US university campuses.
-
heart_disease_risk_prediction
Predicting heart disease risk from open data.
-
cancer_mortality_prediction
Predicting cancer survival using logistic regression from open data.
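As a rough illustration of the technique named above, here is a minimal logistic regression sketch in plain Python. It is not the project's actual code: the toy data, the single feature, and the function names `sigmoid` and `fit_logistic` are all assumptions made for this example.

```python
import math


def sigmoid(z):
    """Map a real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the mean log loss with respect to w and b.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical toy data (not real patient data):
# x is a standardized risk score, y is 1 if the patient survived.
xs = [-2.0, -1.5, -1.0, -0.5, -0.2, 0.0, 0.5, 1.0, 1.5, 2.0]
ys = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

w, b = fit_logistic(xs, ys)
p_low_risk = sigmoid(w * -2.0 + b)   # predicted survival probability at x = -2
p_high_risk = sigmoid(w * 2.0 + b)   # predicted survival probability at x = +2
print(f"w = {w:.2f}, b = {b:.2f}")
print(f"P(survival | x=-2) = {p_low_risk:.2f}, P(survival | x=+2) = {p_high_risk:.2f}")
```

In this toy setup a higher risk score lowers the predicted survival probability, so the fitted weight `w` comes out negative.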
-
predicting_news_popularity
Predicting popularity of news articles from open data.
-
opensource_mapping_project
Open source mapping project.
-
astroinformatics
Analysis of astronomy data using machine learning techniques.
-
scientific_collaboration
Project to analyze planetary-scale scientific collaboration data.
-
accident_prediction
Road accident forecasting and data exploration project.
Interactive website using shiny at:
-
patterns_in_crime
Predicting patterns of crime using data science. Larger cities have disproportionately more crime per capita than smaller cities (super-linear scaling of crime). We used techniques from dynamical systems and complex systems to explain this super-linear scaling in cities and other socio-technological systems.
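Super-linear scaling is commonly written as a power law, Y = Y0 * N^beta with beta > 1, where N is city population and Y is total crime. The sketch below estimates beta by least squares on log-transformed values. It is an illustration only, not the project's analysis: the data are synthetic, and the values 0.001 and 1.2 and the function name `fit_scaling_exponent` are assumptions chosen for the example.

```python
import math


def fit_scaling_exponent(populations, crime_counts):
    """Least-squares fit of log(Y) = log(Y0) + beta * log(N); returns (beta, Y0)."""
    xs = [math.log(n) for n in populations]
    ys = [math.log(y) for y in crime_counts]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    beta = sxy / sxx
    y0 = math.exp(y_mean - beta * x_mean)
    return beta, y0


# Synthetic data generated from Y = 0.001 * N**1.2 (not real crime figures).
populations = [10_000, 50_000, 250_000, 1_000_000, 5_000_000]
crime_counts = [0.001 * n ** 1.2 for n in populations]

beta, y0 = fit_scaling_exponent(populations, crime_counts)
per_capita = [y / n for y, n in zip(crime_counts, populations)]

print(f"estimated exponent beta = {beta:.3f}")
print("per-capita crime by city size:", [f"{r:.4f}" for r in per_capita])
```

An exponent above 1 means per-capita crime (Y / N) grows with population, which is the "disproportionately more crime per capita" pattern described above.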
-
spam_classification
Building an SVM-based spam classifier trained on data from the UCI Machine Learning Repository.
-
breast_cancer_prediction
Downloads data from the UCI Machine Learning Repository and predicts breast cancer using a random forest. A few features, such as epithelial cell size, turn out to be especially important for prediction.
-
funding_trends_science
Project to analyze data on funding trends in biomedical science.
-
infectious_disease_prediction
Project to analyze data on emerging infectious diseases.
-
forecasting_imports
Project to forecast imports and model supply chains.
-
deep_learning_basic
Basic deep learning model using Keras for prediction.
-
ai_healthcare
Machine learning and AI applied to healthcare.
-
ai_social_good
Machine learning, data science and AI for social good.
-
ai_bigdata_biology
Machine learning and bioinformatics for big data in biology.
-
browser_based_data_science
Browser-based data science for democratic access to data science tools.
-
clinical_informatics
Open source privacy-preserving clinical informatics.
-
policy_paper_general_public
Policy paper for the general public on Ethical Artificial Intelligence (EAI) for social good.
-
nlp
Resources, code and data for natural language processing.
-
self_organising_map_wine_dataset
A self-organising map (SOM) on the UCI wine dataset using the Orange data science tool.
-
outreach
Outreach for machine learning and AI for the general public.
-
teaching_resources
Teaching resources for machine learning, data science and AI for a general audience.
-
Quick summary
- Open source code and data for open source data science.
-
If you use this code, please cite the paper and the code:
-
Citizen Data Science for Social Good: Case Studies and Vignettes from Recent Projects https://doi.org/10.13140/RG.2.1.1846.6002
-
Citizen Data Science for Social Good in Complex Systems, Interdisciplinary Description of Complex Systems, 16(1):88-91, 2018 http://indecs.eu/index.php?s=x&y=2018&p=88-91
-
Banerjee, Soumya. (2017, September 3). Citizen Data Science for Social Good: Case Studies and Vignettes from Recent Projects (Supplementary Resources). Zenodo. http://doi.org/10.5281/zenodo.883783
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.883783.svg)](https://doi.org/10.5281/zenodo.883783)
-
-
These projects are an example of my approach to data science for good. I work very closely with domain experts and stakeholders and use computational tools for good. I outline my design and work philosophy below.
Install R, RStudio, MATLAB and Python
Install R and RStudio:
https://www.rstudio.com/products/rstudio/download/preview/
Then run the following in R to install the R dependencies:
source("https://raw.githubusercontent.com/neelsoumya/rlib/master/INSTALL_MANY_MODULES.R")
Install Python dependencies as follows:
pip3 install -r requirements.txt
Soumya Banerjee
https://sites.google.com/site/neelsoumya/