pgadosey/Complete-Life-Cycle-of-a-Data-Science-Project


Complete-Life-Cycle-of-a-Data-Science-Project

CREDITS: All corresponding resources

MOTIVATION: This repository was created to help upcoming aspirants and others in the data science field

Business understanding

1.Data collection

Data comes in 3 kinds

a.Structured data (tabular data,etc...)

b.Unstructured data (images,text,audio,etc...)

c.Semi-structured data (XML,JSON,etc...)

Variable types

a.qualitative (nominal,ordinal,binary)

b.quantitative (discrete,continuous)

a.Web scraping - best articles to refer to: https://towardsdatascience.com/choose-the-best-python-web-scraping-library-for-your-application-91a68bc81c4f

https://www.analyticsvidhya.com/blog/2019/10/web-scraping-hands-on-introduction-python/?utm_source=linkedin&utm_medium=KJ|link|weekend-blogs|blogs|44087|0.875

https://www.bigdatanews.datasciencecentral.com/profiles/blogs/top-30-free-web-scraping-software

https://medium.com/analytics-vidhya/master-web-scraping-completly-from-zero-to-hero-38051423256b

1.Beautifulsoup

2.Scrapy

3.Selenium

4.Requests (to access data)

5.AUTOSCRAPER - https://github.com/alirezamika/autoscraper

webbot https://pypi.org/project/webbot/

6.Twitter scraping tool (twint or tweepy)-https://github.com/twintproject/twint

  https://analyticsindiamag.com/complete-tutorial-on-twint-twitter-scraping-without-twitters-api/
  
  https://developer.twitter.com/en/docs
  
  Scraping Instagram -instaloader  https://thecleverprogrammer.com/2020/07/30/scraping-instagram-with-python/
  
  Scrape Wikipedia  wikipedia
  
  Web Scraping to Create a CSV File  https://thecleverprogrammer.com/2020/08/08/web-scraping-to-create-csv/

7.urllib

8.pattern

9.Octoparse Easy Web Scraping  https://www.octoparse.com/ 

 ParseHub https://www.parsehub.com/  https://analyticsindiamag.com/parsehub-no-code-gui-based-web-scraping-tool/
 
 Diffbot  https://analyticsindiamag.com/diffbot/
 
 Trustpilot
 
 lxml  https://lxml.de/index.html#introduction
 
 ScrapingBee  https://analyticsindiamag.com/scrapingbee-api/
 
 MechanicalSoup https://analyticsindiamag.com/mechanicalsoup-web-scraping-custom-dataset-tutorial/
 
 Scrape HTML tables https://www.youtube.com/watch?v=6U5xJ3mXRKA&feature=youtu.be 
 
 patang (extract product details) https://github.com/tejazz/patang
 
 pandas(read_html)
 
 https://analyticsindiamag.com/complete-learning-path-to-web-scraping-with-all-major-tools/
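
As a minimal sketch of what the scraping libraries above do when they "scrape an HTML table", the following uses only Python's standard-library `html.parser`. The `TableParser` class, the `SAMPLE_HTML` fragment and its values are made up for illustration; a real scraper would first download the page (for example with Requests) and would usually rely on BeautifulSoup, lxml or `pandas.read_html` instead.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows
        self._row = None   # row currently being built
        self._cell = None  # text fragments of the current cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Hypothetical page fragment with made-up values.
SAMPLE_HTML = """
<table>
  <tr><th>product</th><th>price</th></tr>
  <tr><td>pen</td><td>1.50</td></tr>
  <tr><td>notebook</td><td>3.00</td></tr>
</table>
"""

parser = TableParser()
parser.feed(SAMPLE_HTML)

# First row is the header; turn the remaining rows into records.
header, *records = parser.rows
table = [dict(zip(header, record)) for record in records]
print(table)
```

The resulting list of dictionaries can be written to a CSV file with `csv.DictWriter`.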

b.Web Crawling

https://python.libhunt.com/scrapy-alternatives

b.Third-party APIs

c.Creating your own data (manual collection, e.g. Google Docs, surveys, etc...): primary data

d.Databases

Databases are of 2 kinds: SQL and NoSQL databases

SQL,SQLite,MySQL,MongoDB,Hadoop,Elasticsearch,Cassandra,Amazon S3,Hive,Google Bigtable,AWS DynamoDB,HBase,Oracle DB

sql in python https://medium.com/jbennetcodes/how-to-rewrite-your-sql-queries-in-pandas-and-more-149d341fc53e
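
A small sketch of querying a SQL database from Python, using the standard-library `sqlite3` module with an in-memory database. The `sales` table and its rows are invented for illustration only.

```python
import sqlite3

# An in-memory SQLite database: nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")

# Parameterised inserts avoid building SQL strings by hand.
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("north", 120.0), ("south", 80.0), ("north", 30.0)],
)

# Aggregate per region, the SQL counterpart of a pandas groupby().sum().
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(rows)
conn.close()
```

The same query result can be loaded into a DataFrame with `pandas.read_sql_query` when pandas is available.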

Cloud AI Data labeling service https://cloud.google.com/ai-platform/data-labeling/docs?utm_source=youtube&utm_medium=Unpaidsocial&utm_campaign=guo-20200503-Data-Labeling

e.Online resources - ultimate resource https://datasetsearch.research.google.com/

1)kaggle-https://www.kaggle.com/datasets , pip install kaggledatasets

Downloading Kaggle datasets directly into Google Colab -https://towardsdatascience.com/downloading-kaggle-datasets-directly-into-google-colab-c8f0f407d73a

2)movielens-https://grouplens.org/datasets/movielens/latest/

3)data.gov-https://data.gov.in/

4)uci-https://archive.ics.uci.edu/ml/datasets.php     https://github.com/tirthajyoti/UCI-ML-API

5)Group Lens dataset https://grouplens.org/

Wikipedia ML Datasets https://en.wikipedia.org/wiki/List_of_datasets_for_machine-learning_research

6)world3bank  https://data.world/ , worldbank

7)Google Cloud BigQuery public datasets

  Google Public Datasets-cloud.google.com/bigquery/public-data/
  
  Google Cloud Data Catalog  https://cloud.google.com/data-catalog
  
  Academic Torrents-https://academictorrents.com/check.htm?returnto=%2Fbrowse.php

8)online hackathons

9)image data from google_images_download

https://www.visualdata.io/discovery

http://xviewdataset.org/#dataset

https://ai.googleblog.com/2016/09/introducing-open-images-dataset.html

10)image data from Bing_Search

image data from simple_image_download  https://github.com/RiddlerQ/simple_image_download

11)https://www.columnfivemedia.com/100-best-free-data-sources-infographic

12)Reddit:https://lnkd.in/dv5UCD4       https://www.reddit.com/r/datasets/

13)https://datasets.bifrost.ai/?ref=producthunt

14)data.world:https://lnkd.in/gEK897K

15)https://data.world/datasets/open-data

   https://tinyletter.com/data-is-plural

16)FiveThirtyEight :-  https://lnkd.in/gyh-HDj , https://data.fivethirtyeight.com/

17)BuzzFeed :- https://lnkd.in/gzPWyHj

   Buzzfeed News -github.com/BuzzFeedNews
   
   Socrata - https://opendata.socrata.com/

18)Google public datasets :- https://lnkd.in/g5dH8qE

Statistics Canada https://www.statcan.gc.ca/eng/start  https://towardsdatascience.com/how-to-collect-data-from-statistics-canada-using-python-db8a81ce6475

19)Quandl :- https://www.quandl.com  stock data

   statista : https://www.statista.com/ stock data

20)Socrata open data :- https://lnkd.in/gea7JMz

21)Academic Torrents :- https://lnkd.in/g-Ur9Xy

22)labelimage:- https://github.com/wkentaro/labelme  ,  https://github.com/tzutalin/labelImg 

Labelbox-https://labelbox.com/

Playment-https://playment.io/

SuperAnnotate -https://www.superannotate.com/

CVAT-https://github.com/openvinotoolkit/cvat

Lionbridge- https://lionbridge.ai/

LinkedAI: A No-code Data Annotations- https://analyticsindiamag.com/linkedai/

Dataturks

V7 Darwin The Rapid Image Annotator   https://docs.v7labs.com/docs/loading-a-dataset-in-python   https://github.com/v7labs/darwin-py#usage-as-a-python-library

https://waliamrinal.medium.com/top-and-easy-to-use-open-source-image-labelling-tools-for-machine-learning-projects-ffd9d5af4a20

https://github.com/heartexlabs/awesome-data-labeling  

Label a Dataset with a Few Lines of Code https://eric-landau.medium.com/label-a-dataset-with-a-few-lines-of-code-45c140ff119d

https://analyticsindiamag.com/complete-guide-to-data-labelling-tools/

23)tensorflow_datasets as tfds  https://www.tensorflow.org/datasets  (import tensorflow_datasets as tfds)

https://lionbridge.ai/datasets/tensorflow-datasets-machine-learning/

24)https://datasets.bifrost.ai/?ref=producthunt

25)https://ourworldindata.org/

26)https://data.worldbank.org/

27)google open images:https://storage.googleapis.com/openimages/web/download.html

https://cloud.google.com/bigquery/public-data/   https://towardsdatascience.com/bigquery-public-datasets-936e1c50e6bc  

28)https://data.gov.in/

29)imagenet dataset-http://www.image-net.org/

30)https://parulpandey.com/2020/08/09/getting-datasets-for-data-analysis-tasks%e2%80%8a-%e2%80%8aadvanced-google-search/

31)https://storage.googleapis.com/openimages/web/index.html  , 

   https://storage.googleapis.com/openimages/web/visualizer/index.html?set=train&type=segmentation&r=false&c=%2Fm%2F09qck
   
   https://console.cloud.google.com/marketplace/browse?filter=solution-type:dataset&_ga=2.35328417.1459465882.1589693499-869920574.1589693499
   
   https://catalog.data.gov/dataset?groups=education2168#topic=education_navigation
   
   https://vincentarelbundock.github.io/Rdatasets/datasets.html
 
32)coco dataset https://cocodataset.org/#explore
 
33)huggingface datasets-https://github.com/huggingface/datasets  https://huggingface.co/datasets  https://huggingface.co/languages

pip install datasets

34)Big Bad NLP Database-https://datasets.quantumstat.com/

https://github.com/niderhoff/nlp-datasets

nlp-datasets https://github.com/karthikncode/nlp-datasets

https://analyticsindiamag.com/15-most-important-nlp-datasets/      https://medium.com/ai-in-plain-english/25-free-datasets-for-natural-language-processing-57e407402c60

35)https://www.edureka.co/blog/25-best-free-datasets-machine-learning/

36)bigquery public dataset ,Google Public Data Explorer

https://cloud.google.com/public-datasets

37)inbuilt library data eg:iris dataset,mnist dataset,etc...

pandas-datareader  https://github.com/pydata/pandas-datareader

tf.data.Datasets for TensorFlow Datasets 

38)https://data.gov.sg/    https://data.gov.au/   https://data.europa.eu/euodp/en/data   https://data.europa.eu/euodp/en/data    https://data.govt.nz/

data.gov.be ,data.egov.bg/ ,data.gov.cz/english ,portal.opendata.dk,govdata.de,opendata.riik.ee,data.gov.ie,data.gov.gr,datos.gob.es,data.gouv.fr,data.gov.hr

dati.gov.it,data.gov.cy,opendata.gov.lt,data.gov.lv,data.public.lu,data.gov.mt,data.overheid.nl,data.gv.at,danepubliczne.gov.pl,dados.gov.pt,data.gov.ro,podatki.gov.si

data.gov.sk,avoindata.fi,oppnadata.se,https://data.adb.org/ ,https://data.iadb.org/ ,https://www.weforum.org/agenda/2018/03/latin-america-smart-cities-big-data/

https://data.fivethirtyeight.com/ , https://wiki.dbpedia.org/ ,https://www.europeandataportal.eu/en ,https://data.europa.eu/ ,https://www.census.gov/,

https://www.who.int/data/gho ,https://data.unicef.org/open-data/ ,http://data.un.org/ ,https://data.oecd.org/ ,https://data.worldbank.org/  

39.Awesome Public Dataset- https://github.com/awesomedata/awesome-public-datasets

https://github.com/the-pudding/data

datasets  https://github.com/benedekrozemberczki/datasets

kdnuggets  https://www.kdnuggets.com/datasets/index.html

Hub https://github.com/activeloopai/Hub

40.Datasets for Machine Learning on Graphs-https://ogb.stanford.edu/

41.https://www.johnsnowlabs.com/data/

42.30 largest tensorflow datasets-https://lionbridge.ai/datasets/tensorflow-datasets-machine-learning/

43. coco dataset-https://cocodataset.org/#home

Google Open images-https://opensource.google/projects/open-images-dataset  https://storage.googleapis.com/openimages/web/index.html

50+ Object Detection Datasets-https://medium.com/towards-artificial-intelligence/50-object-detection-datasets-from-different-industry-domains-1a53342ae13d

   70+ Image Classification Datasets from different Industry domains-https://medium.com/towards-artificial-intelligence/70-image-classification-datasets-from-different-industry-domains-part-2-cd1af6e48eda
   
bifrost-   https://datasets.bifrost.ai/

https://public.roboflow.com/

https://www.visualdata.io/discovery        http://www.image-net.org/      https://www.cs.toronto.edu/~kriz/cifar.html  
   
tensorflow_datasets.object_detection - https://storage.googleapis.com/openimages/web/index.html

https://github.com/google-research-datasets/Objectron/  https://ai.googleblog.com/2020/11/announcing-objectron-dataset.html?m=1 

http://idd.insaan.iiit.ac.in/   http://database.mmsp-kn.de/koniq-10k-database.html

https://ai.googleblog.com/2020/11/announcing-objectron-dataset.html

https://www.visualdata.io/discovery  https://blogs.bing.com/maps/2019-03/microsoft-releases-12-million-canadian-building-footprints-as-open-data

https://blogs.bing.com/maps/2019-09/microsoft-releases-18M-building-footprints-in-uganda-and-tanzania-to-enable-ai-assisted-mapping

https://datasets.bifrost.ai/     https://storage.googleapis.com/openimages/web/download.html  https://computervisiononline.com/datasets  http://yacvid.hayko.at/

https://www.cogitotech.com/use-cases/biodiversity/

ImageNet data -http://image-net.org/

ApolloScape Dataset-http://apolloscape.auto/

https://github.com/chrieke/awesome-satellite-imagery-datasets

44.https://github.com/fivethirtyeight/data

45.Recommender Systems Datasets-https://cseweb.ucsd.edu/~jmcauley/datasets.html

46.indiadataportal-https://indiadataportal.com/

47.US Government Open Dataset: https://www.data.gov/

https://censusreporter.org/   https://data.census.gov/cedsci/

48.AWS Public Data Sets:https://registry.opendata.aws/    https://aws.amazon.com/opendata/?wwps-cards.sort-by=item.additionalFields.sortDate&wwps-cards.sort-order=desc

49.https://the-eye.eu/public/AI/pile_preliminary_components/

  Reddit -https://www.reddit.com/r/datasets/
  
  wikipedia-https://en.wikipedia.org/wiki/List_of_datasets_for_machine-learning_research
  
  http://opendata.cern.ch/  ,  https://www.imf.org/en/Data
  
  Global Health Observatory data repository-https://apps.who.int/gho/data/node.main
  
  CERN Open Data Portal-http://opendata.cern.ch/
  
  TensorFlow Datasets https://www.tensorflow.org/datasets
  
50.openblender- https://www.openblender.io/#/welcome

51.Top 10 Datasets For Cybersecurity Projects- https://analyticsindiamag.com/top-10-datasets-for-cybersecurity-projects/

52.Datasets from Web Crawl Data (nlp)-http://data.statmt.org/cc-100/

53.https://www.springboard.com/blog/free-public-data-sets-data-science-project/

54.NASA - https://nasa.github.io/data-nasa-gov-frontpage/ace 

55.Academic Torrents,GitHub Datasets,CERN Open Data Portal,Global Health Observatory Data Repository

56.32 Data Sets to Uplift your Skills in Data Science-https://blog.datasciencedojo.com/data-sets-data-science-skills/?utm_content=144243072&utm_medium=social&utm_source=linkedin&hss_channel=lcp-3740012

57.OpenDaL-https://opendatalibrary.com/

Data Is Plural-https://docs.google.com/spreadsheets/d/1wZhPLMCHKJvwOkP4juclhjFgqIY8fQFMemwKL2c64vk/edit#gid=0

VisualData-https://www.visualdata.io/discovery

https://medium.com/towards-artificial-intelligence/best-datasets-for-machine-learning-data-science-computer-vision-nlp-ai-c9541058cf4f
 
58.Pandas Data Reader-https://pandas-datareader.readthedocs.io/en/latest/remote_data.html

59.ieee-dataport-https://ieee-dataport.org/datasets

https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/data/datasets.md#datasets-and-sources-of-raw-data

60.Faker is a Python package that generates fake data-https://github.com/joke2k/faker

Synthetic data library https://github.com/finos/datahub https://github.com/agmmnn/awesome-blender https://opendata.blender.org/ https://www.youtube.com/watch?v=eZwOeBkLL8E

61.Text Data Annotator Tool - Datasaur  https://datasaur.ai/

62.Google Analytics cost data import https://segmentstream.com/google-analytics?utm_source=twitter&utm_medium=cpc&utm_campaign=ga_costs_import_en&utm_content=guide

63.https://lionbridge.ai/services/crowdsourcing/    https://lionbridge.ai/     https://www.clickworker.com/  https://appen.com/  https://www.globalme.net/

64.Azure Open Datasets https://azure.microsoft.com/en-us/services/open-datasets/       https://azure.microsoft.com/en-in/services/open-datasets/catalog/
  
Yelp Open Dataset  https://www.yelp.com/dataset

https://data.world/

ODK Open Data Kit- https://getodk.org/

World Bank Open Data https://data.worldbank.org/

https://analyticsindiamag.com/10-biggest-data-breaches-that-made-headlines-in-2020/

https://data.mendeley.com/

https://github.com/iamtekson/geospatial-data-download-sites

https://eugeneyan.com/writing/data-discovery-platforms/

65.https://medium.com/towards-artificial-intelligence/best-datasets-for-machine-learning-data-science-computer-vision-nlp-ai-c9541058cf4f

https://towardsdatascience.com/data-repositories-for-almost-every-type-of-data-science-project-7aa2f98128b

https://github.com/MTG/freesound-datasets

https://dataform.co/

https://github.com/rfordatascience/tidytuesday https://www.youtube.com/watch?v=vCBeGLpvoYM

https://www.analyticsvidhya.com/blog/2020/12/top-15-datasets-of-2020-that-every-data-scientist-should-add-to-their-portfolio/?utm_source=linkedin&utm_medium=AV|link|high-performance-blog|blogs|44181|0.375

https://cseweb.ucsd.edu/~jmcauley/datasets.html

66.https://en.wikipedia.org/wiki/List_of_datasets_for_machine-learning_research  

https://archive.org/details/datasets

https://commoncrawl.org/

https://www.youtube.com/watch?v=1aUt8zAG09E

67.yfinance for finance data - https://github.com/ranaroussi/yfinance

import yfinance as yf

https://www.analyticsvidhya.com/blog/2021/01/bear-run-or-bull-run-can-reinforcement-learning-help-in-automated-trading/?utm_source=feedburner&utm_medium=email&utm_campaign=Feed%3A+AnalyticsVidhya+%28Analytics+Vidhya%29

Downloading Historical Stock prices with Alpha Vantage  https://medium.com/towards-artificial-intelligence/downloading-historical-stock-prices-with-alpha-vantage-688edad46a6d

Get Financial Data Directly into Python https://www.quandl.com/tools/python 

openml https://www.openml.org/search?type=data

https://registry.opendata.aws/

voice_datasets https://github.com/jim-schwoebel/voice_datasets
 
Dynamically-Generated-Hate-Speech-Dataset https://github.com/bvidgen/Dynamically-Generated-Hate-Speech-Dataset

2.Feature engineering

Validate your Data (Schema) https://towardsdatascience.com/introduction-to-schema-a-python-libary-to-validate-your-data-c6d99e06d56a

Data cleaning-Pyjanitor-https://analyticsindiamag.com/beginners-guide-to-pyjanitor-a-python-tool-for-data-cleaning/

Speed Up Data Cleaning and Exploratory Data Analysis in Python with klib https://github.com/akanz1/klib https://towardsdatascience.com/speed-up-your-data-cleaning-and-preprocessing-with-klib-97191d320f80

Easy to use Python library of customized functions for cleaning and analyzing data https://github.com/akanz1/klib

Remove duplicate data in dataset

a.Handle missing value

 Types of missing values
 
 i.Missing completely at random (MCAR): no correlation between missingness and the observed data; rows can be deleted without disturbing the data distribution
 
 ii.Missing at random (MAR): missingness is correlated with other observed data; rows cannot be deleted because that disturbs the data distribution
 
 iii.Missing not at random (MNAR): there is a reason for the missing value and it is directly related to the value itself

 1.If the amount of missing data is small, delete it: a.row deletion b.column deletion c.pairwise deletion
 
 2.Replace by a statistical method: mean (influenced by outliers),median (not influenced by outliers),mode
 
 3.Apply a classifier algorithm to predict the missing value
 
 4.Iterative imputer,KNN imputer,multivariate imputation
 
 5.Apply unsupervised learning
 
 6.Random Sample Imputation
 
 7.Adding a variable to capture NaN (missing indicator)
 
 8.Arbitrary Value Imputation
 
 9.hot deck Imputation,Cold deck imputation
 
 10.regression Imputation
 
 11.End of Distribution Imputation
 
 12.Arbitrary Value Imputation
 
 13.Frequent Category Imputation
 
 14.MICE Imputation
 
 Extrapolation and Interpolation
 
 Imputation using K-NN
 
 Imputation Using Deep Learning (Datawig)
 
 15.autoimpute-https://github.com/kearnz/autoimpute
 
 https://towardsdatascience.com/6-different-ways-to-compensate-for-missing-values-data-imputation-with-examples-6022d9ca0779
 
 https://stefvanbuuren.name/fimd/want-the-hardcopy.html
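
The statistical replacement and the "variable to capture NaN" ideas above can be sketched without any third-party library. The helper name `impute` and the sample `ages` column are hypothetical; in practice scikit-learn's `SimpleImputer` (with `add_indicator=True`) covers the same ground.

```python
from statistics import mean, median


def impute(values, strategy="median"):
    """Replace None entries with the mean or median of the observed values.

    Returns the filled list plus a 0/1 indicator list that records which
    positions were originally missing.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = median(observed) if strategy == "median" else mean(observed)
    filled = [fill if v is None else v for v in values]
    was_missing = [1 if v is None else 0 for v in values]
    return filled, was_missing


# Made-up column; 90 is an outlier that pulls the mean upwards.
ages = [22, None, 25, 24, None, 90]
by_median, flags = impute(ages, "median")
by_mean, _ = impute(ages, "mean")
print(by_median)  # gaps filled with 24.5
print(by_mean)    # gaps filled with 40.25
print(flags)
```

The gap between 24.5 and 40.25 shows why the median is the safer choice when outliers are present.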

b.Handle imbalance

 1.Under Sampling - mostly not preferred because data is lost
 
 2.Over Sampling  (RandomOverSampler (new points are copies of existing points), SMOTETomek (new points are created from nearest neighbours, so it takes longer),Borderline SMOTE,KMeans SMOTE,SVM SMOTE,SMOTE-NC,ADASYN)    https://towardsdatascience.com/5-smote-techniques-for-oversampling-your-imbalance-data-b8155bdbe2b5
 
 https://towardsdatascience.com/7-over-sampling-techniques-to-handle-imbalanced-data-ec51c8db349f
 
 3.class_weight: give more importance (weight) to the small class
 
 4.Use stratified k-fold to keep the ratio of classes constant
 
 5.Weighted Neural Network
 
 https://machinelearningmastery.com/framework-for-imbalanced-classification-projects/
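
A dependency-free sketch of two of the options above: random over-sampling (duplicating minority rows) and class weights. The function names and the toy labels are assumptions for illustration; the weight formula `n_samples / (n_classes * class_count)` is the one scikit-learn uses for `class_weight="balanced"`, and imbalanced-learn's `RandomOverSampler` is the usual library implementation of the first function.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority rows until all classes are equal in size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))  # an exact copy of an existing row
            out_labels.append(cls)
    return out_rows, out_labels


def balanced_class_weights(labels):
    """weight = n_samples / (n_classes * class_count): rare classes weigh more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}


# Toy data: 8 rows of class 0 and 2 rows of class 1.
rows = list(range(10))
labels = [0] * 8 + [1] * 2

new_rows, new_labels = random_oversample(rows, labels)
weights = balanced_class_weights(labels)
print(Counter(new_labels))
print(weights)
```

Over-sampling should be applied to the training split only, otherwise copies of the same row can end up in both training and test data.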

c.Remove noise data

d.Format data

e.Handle categorical data (ordinal,nominal,cyclic,binary categorical variables)

 1.One Hot Encoding
 
 2.Count Or Frequency Encoding
 
 3.Target Guided Ordinal Encoding
 
 4.Mean Encoding
 
 5.Probability Ratio Encoding
 
 6.label encoding
 
 7.probability ratio encoding
 
 8.woe(Weight_of_evidence)
 
 9.one hot encoding with multi category (keep most frequently repeated only)
 
 10.feature hashing 
 
 11.sparse csr matrix
 
 12.entity embeddings
 
 13.binary encoding
 
 14.Rare label encoding
 
 15.Leave-one-out(Loo) encoding
 
 https://towardsdatascience.com/beyond-one-hot-17-ways-of-transforming-categorical-features-into-numeric-features-57f54f199ea4
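
Three of the encodings listed above (one-hot, frequency and mean encoding) can be sketched in plain Python. The function names and the `colors`/`target` data are hypothetical; libraries such as scikit-learn (`OneHotEncoder`) provide production versions.

```python
from collections import Counter, defaultdict


def one_hot(values):
    """One binary column per category, in sorted category order."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, encoded


def frequency_encode(values):
    """Replace each category with its relative frequency in the column."""
    counts = Counter(values)
    return [counts[v] / len(values) for v in values]


def mean_encode(values, target):
    """Replace each category with the mean of the target for that category."""
    sums, counts = defaultdict(float), Counter(values)
    for v, t in zip(values, target):
        sums[v] += t
    return [sums[v] / counts[v] for v in values]


# Made-up feature and binary target.
colors = ["red", "blue", "red", "green"]
target = [1, 0, 0, 1]

categories, hot = one_hot(colors)
print(categories, hot)
print(frequency_encode(colors))
print(mean_encode(colors, target))
```

Mean encoding uses the target, so it should be fitted on training data only to avoid leaking the label into the features.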

f.Scaling of data

   1.Normalisation  

   2.Standardization
 
   3.Robust Scaler: not influenced by outliers because it uses the median and IQR
   
   4. Min Max Scaling
   
   5.Mean normalization
   
   6.maximum absolute scaling
   
   https://www.analyticsvidhya.com/blog/2020/07/types-of-feature-transformation-and-scaling/?utm_source=linkedin&utm_medium=KJ|link|high-performance-blog|blogs|44204|0.375
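
A minimal sketch of three of the scalers above, written with the standard library so the formulas are visible. The function names and sample data are assumptions; scikit-learn's `MinMaxScaler`, `StandardScaler` and `RobustScaler` are the usual implementations (their quartile interpolation may differ slightly from `statistics.quantiles`).

```python
from statistics import mean, median, pstdev, quantiles


def min_max_scale(xs):
    """Map values into [0, 1]: (x - min) / (max - min)."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]


def standardize(xs):
    """Zero mean, unit variance: (x - mean) / std."""
    mu, sigma = mean(xs), pstdev(xs)
    return [(x - mu) / sigma for x in xs]


def robust_scale(xs):
    """Centre on the median and divide by the IQR: (x - median) / (Q3 - Q1)."""
    q1, _, q3 = quantiles(xs, n=4, method="inclusive")
    med = median(xs)
    return [(x - med) / (q3 - q1) for x in xs]


# Made-up feature with one extreme value.
data = [1, 2, 3, 4, 100]
print(min_max_scale(data))  # the outlier squeezes the other values towards 0
print(standardize(data))
print(robust_scale(data))   # the bulk of the data keeps a usable spread
```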

A Q-Q plot or the Shapiro-Wilk normality test is used to check whether a feature is Gaussian (normally distributed). Linear regression and logistic regression tend to perform better on such features; if a feature is not normally distributed, use the methods below to bring it closer to a Gaussian distribution

normaltest: checks for a normal distribution

Anderson-Darling test: checks against a chosen distribution

       a.Gaussian Transformation
    
       b.Logarithmic Transformation
    
       c.Reciprocal Transformation
    
       d.Square Root Transformation
    
       e.Exponential Transformation
    
       f.Box-Cox Transformation
    
       g.log(1+x) Transformation
       
       h.Johnson Transformation
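
To illustrate why these transformations help, the sketch below measures skewness before and after a log(1+x) transformation. The `skewness` helper and the `incomes` values are made up for illustration; `scipy.stats` offers `skew`, `shapiro`, `normaltest` and `boxcox` for real work.

```python
import math
from statistics import mean, pstdev


def skewness(xs):
    """Population skewness: the mean of the cubed z-scores (0 for symmetric data)."""
    mu, sigma = mean(xs), pstdev(xs)
    return sum(((x - mu) / sigma) ** 3 for x in xs) / len(xs)


# Made-up right-skewed feature (a long tail of large values).
incomes = [20, 22, 25, 27, 30, 35, 40, 60, 120, 400]

# log(1 + x) compresses large values much more than small ones.
logged = [math.log1p(x) for x in incomes]

print(skewness(incomes))
print(skewness(logged))  # smaller: the long right tail has been pulled in
```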

g.Remove low-variance features by using VarianceThreshold

h.If a feature holds the same value in every row (only 1 distinct value), remove the feature

i.Outliers: whether to remove outliers depends on the problem we are solving

  2 types of outliers: global outliers, local outliers

  e.g. in fraud detection, outliers are very important
  
  Methods to find outliers: standard deviation,z-score,boxplot,scatter plot,IQR,TensorFlow_Data_Validation
  
  Automatic Outlier Detection:Isolation Forest,Local Outlier Factor,Minimum Covariance Determinant,Robust Random Cut Forest,DBScan Clustering
  
  Outlier treatment: mean/median/random imputation,drop,discretization (binning)
  
  If outliers are present, use robust scaling
  
  alibi-detect https://github.com/SeldonIO/alibi-detect#adversarial-detection   https://docs.seldon.io/projects/alibi-detect/en/latest/
  
  https://medium.com/towards-artificial-intelligence/outlier-detection-and-treatment-a-beginners-guide-c44af0699754
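
The IQR and z-score methods mentioned above can be sketched as follows. The function names, the thresholds (1.5 x IQR and |z| > 3 are common conventions, not rules from this README) and the data are assumptions for illustration.

```python
from statistics import mean, pstdev, quantiles


def iqr_outliers(xs, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(xs, n=4, method="inclusive")
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [x for x in xs if x < lo or x > hi]


def zscore_outliers(xs, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(xs), pstdev(xs)
    return [x for x in xs if abs(x - mu) / sigma > threshold]


# Made-up readings with one obvious outlier.
data = [10, 12, 11, 13, 12, 11, 10, 12, 95]

print(iqr_outliers(data))
print(zscore_outliers(data))       # the outlier inflates the std and hides itself
print(zscore_outliers(data, 2.5))  # a lower threshold catches it
```

On small samples a single extreme value inflates the standard deviation, which is one reason the IQR rule is often preferred there.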

j.Anomaly

 clustering techniques to find it
 
 Isolation Forest(for Big Data),dbscan
 
 Anomaly detection using PyOD  https://pyod.readthedocs.io/en/latest/   https://www.youtube.com/watch?v=QPjG_313GOw

k.Sampling techniques

 a.biased sampling
 
 b.unbiased sampling

3.Exploratory Data Analysis(eda)

Explore the dataset using Python, Microsoft Excel, Tableau, Power BI, etc...

Data visualization (Matplotlib,Seaborn,Plotly,pyqtgraph,Bokeh,Pygal,Dash,Pydot,Geoplotlib,ggplot,visualizer,etc...)

Scatterplot,multi line plot,bubble chart,bar chart,histogram,boxplot,distplot,bubble charts,area plot,heat map,index plot,violin plot,time series plot,density plot,dot plot,strip plot,plotly,Choropleth Map,Kepler,PDF,Kernel density function,networkx,Scatter_matrix,Bootstrap_plot,functionvis,Higher-Dimensional Plots,3-D Plots,Word Clouds,HoloViz

https://towardsdatascience.com/8-free-tools-to-make-interactive-data-visualizations-in-2021-no-coding-required-2b2c6c564b5b

https://datavizproject.com/   https://datavizcatalogue.com/

https://attachments.convertkitcdnm.com/232198/ee18f415-1406-4e5c-94f1-49a2c6e3ec4e/Statistics-The-Big-Picture-Poster.pdf

HiPlot (high dimensional data)-https://github.com/facebookresearch/hiplot

https://towardsdatascience.com/top-6-python-libraries-for-visualization-which-one-to-use-fe43381cd658

https://www.kaggle.com/abhishekvaid19968/data-visualization-using-matplotlib-seaborn-plotly

๐—ž๐—ฒ๐—ฟ๐—ฎ๐˜€ ๐— ๐—ผ๐—ฑ๐—ฒ๐—น ๐˜ƒ๐—ถ๐˜€๐˜‚๐—ฎ๐—น๐—ถ๐˜‡๐—ฎ๐˜๐—ถ๐—ผ๐—ป ๐—ด๐—ฒ๐—ป๐—ฒ๐—ฟ๐—ฎ๐˜๐—ผ๐—ฟ(ann-visualizer)- ๐—ฝ๐—ถ๐—ฝ๐Ÿฏ ๐—ถ๐—ป๐˜€๐˜๐—ฎ๐—น๐—น ๐—ด๐—ฟ๐—ฎ๐—ฝ๐—ต๐˜ƒ๐—ถ๐˜‡

univariate and bivariate and multivariate analysis

model visualization Tensorboard,netron,playground tensorflow,plotly,TensorDash,Dash,Microscope,Lucid

distributions (discrete,continuous)

data distributions-normal distribution,Standard Normal Distribution,Student's t-Distribution,Bernoulli Distribution,Binomial Distribution,Poisson Distribution,Uniform Distribution,F Distribution,Covariance and Correlation

Types of Statistics  

1.Descriptive

2.Inferential

Types of data

1) Categorical (nominal,ordinal)
 
2) Numerical   (discrete,continuous)

random variable (discrete random variable,continuous random variable)

Central Limit Theorem,Bayes Theorem,Confidence Interval,Hypothesis Testing,z test,t test,F test,1-tailed test,2-tailed test,chi-square test,ANOVA test,A/B testing
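
As a worked example of hypothesis testing and confidence intervals, here is a sketch of a two-tailed one-sample z test using only the standard library. The sample values, the null-hypothesis mean `mu0 = 50` and the known population standard deviation `sigma = 2` are all made up; when sigma is unknown a t test is used instead.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sample_z_test(sample, mu0, sigma):
    """Two-tailed z test of H0: population mean == mu0, with known sigma."""
    z = (mean(sample) - mu0) / (sigma / sqrt(len(sample)))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def confidence_interval(sample, sigma, level=0.95):
    """Interval mean +/- z_crit * sigma / sqrt(n) for the population mean."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    half_width = z_crit * sigma / sqrt(len(sample))
    m = mean(sample)
    return m - half_width, m + half_width


# Made-up measurements.
sample = [52, 49, 51, 53, 50, 54, 52, 51, 53]

z, p = one_sample_z_test(sample, mu0=50, sigma=2)
low, high = confidence_interval(sample, sigma=2)
print(z, p)
print(low, high)
# If p is below the chosen significance level (often 0.05), H0 is rejected.
```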

4.Feature selection

1.Filter methods (correlation,chi-square test,t test,ANOVA test,mutual information,hypothesis test,information gain etc...)

2.Wrapper methods (recursive feature elimination,boruta,forward selection,backward elimination,stepwise selection etc...)

3.Embedded methods (lasso,ridge regression,elasticnet,tree based etc...)

DropConstantFeatures  DropDuplicateFeatures    DropCorrelatedFeatures  

4.Feature Importance

   a.ExtraTreesClassifier,ExtraTreesregressor

   b.SelectKBest

   c.Logistic Regression

   d.Random_forest_importance
   
   e.decision tree
   
   f.Linear Regression
   
   g.xgboost

5.curse of dimensionality (as dimension increases performance decreases)

6.If features are highly correlated, keep any 1 of them (multicollinearity)

7.dimension reduction

8.lasso regression to penalise unimportant features

9.VarianceThreshold 

10.model based selection

11.Mutual Information Feature Selection

12.remove features with very low variance (quasi constant feature dropping)

13.Univariate  feature selection

14.importance of feature (random forest importance)

15.feature importance with decision trees

16.PyImpetus

17.drop constant features (variance=0)

18.variance inflation factor(vif)

19.Recursive Feature Elimination     RecursiveFeatureAddition

20.exhaustive feature selection

21.Statistical Methods , Hypothesis Testing ,Recursive Feature Elimination

22.Boruta https://github.com/scikit-learn-contrib/boruta_py 

https://machinelearningmastery.com/feature-selection-with-real-and-categorical-data/ https://machinelearningmastery.com/statistical-hypothesis-tests-in-python-cheat-sheet/

https://www.analyticsvidhya.com/blog/2020/10/a-comprehensive-guide-to-feature-selection-using-wrapper-methods-in-python/
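
Two of the filter-style steps above (dropping constant features, then dropping one of each highly correlated pair) can be sketched in plain Python. The `select_features` helper, the 0.95 correlation cut-off and the toy columns are assumptions for illustration; scikit-learn's `VarianceThreshold` handles the first step in practice.

```python
from math import sqrt
from statistics import mean, pvariance


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long columns."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def select_features(columns, min_variance=0.0, max_corr=0.95):
    """Drop (quasi-)constant columns, then keep one of each correlated group."""
    # Step 1: variance threshold (variance == 0 means a constant feature).
    kept = {n: c for n, c in columns.items() if pvariance(c) > min_variance}
    # Step 2: keep a column only if it is not highly correlated
    # with a column that was already selected.
    selected = []
    for name in kept:
        if all(abs(pearson(kept[name], kept[s])) < max_corr for s in selected):
            selected.append(name)
    return selected


# Made-up feature table: height_in is height_cm in other units.
columns = {
    "height_cm": [150, 160, 170, 180],
    "height_in": [59.1, 63.0, 66.9, 70.9],
    "constant": [1, 1, 1, 1],
    "weight": [70, 55, 90, 62],
}
print(select_features(columns))
```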

5.Data splitting

 The splitting ratio depends on the size of the dataset available

 Training data,Validation data,Testing data
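
A minimal sketch of a shuffled three-way split. The 70/15/15 ratio is only an assumed example, since the ratio depends on dataset size; with scikit-learn the same result is usually obtained by calling `train_test_split` twice.

```python
import random


def train_val_test_split(rows, val_frac=0.15, test_frac=0.15, seed=42):
    """Shuffle once, then cut the data into train / validation / test parts."""
    rng = random.Random(seed)  # fixed seed makes the split reproducible
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_frac)
    n_val = round(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


rows = list(range(100))  # stand-in for 100 samples
train, val, test = train_val_test_split(rows)
print(len(train), len(val), len(test))
```

For imbalanced classification a stratified split keeps the class ratio the same in every part, and time series data should be split by time rather than shuffled.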

6.Model selection

Machine learning

A.Supervised learning (labelled data)

 1.Regression (output feature is continuous)
 
   Linear regression,polynomial regression,Robust Regression,support vector regression,Decision Tree Regression,Random Forest Regression,
   
   least squares method,xgboost,ridge (L2 regularization),lasso (L1 regularization, more sparse),catboost,gradient boosting,adaboost,
   
   elastic net,LightGBM,ordinary least squares,CART,Stepwise Regression,Multivariate Adaptive Regression Splines 
   
   use cases:

 2.Classification (output feature is categorical)
 
    Binary,Multi-class,Multi-label
 
    Logistic Regression,K-Nearest Neighbors,Support Vector Machine,Kernel SVM,Naive Bayes,Decision Tree Classification,
    
    Random Forest Classification,xgboost,adaboost,Gradient Boost,catboost,Gaussian NB,LGBMClassifier,LinearDiscriminantAnalysis,Extreme Gradient Boosting Machine,passive aggressive classifier,CART,C4.5,C5.0

B.Unsupervised learning(no label(target) data)

 1.Dimensionality reduction - PCA,SVD,LDA,som,tsne,plsr,pcr,autoencoders,kpca,lsa,Factor Analysis,

 2.Clustering :https://scikit-learn.org/stable/modules/clustering.html
 
 https://www.kdnuggets.com/2020/12/algorithms-explained-k-means-k-medoids-clustering.html
 
 K-Means 8x faster, 27x lower error than Scikit-learn in 25 lines   https://www.kdnuggets.com/2021/01/k-means-faster-lower-error-scikit-learn.html#.YAHAAIpnx4A.linkedin

 3.Association Rule Learning - support,lift,confidence,Apriori,Eclat,FP-growth,FP-tree construction,association_rules

 4.Recommendation system -
 
     a.collaborative Recommendation system (model based, memory based(item based,user based))  user-item interaction matrix
    
     b.content based Recommendation system 
     
     similarity based(user-user similarity,item-item similarity)
     
     matrix factorization
     
     c.utility based Recommendation system 
     
     d.knowledge based Recommendation system 
     
     e.demographic based Recommendation system 
     
     f.hybrid based Recommendation system 
     
     g.Average Weighted Recommendation
     
     h.using K Nearest Neighbor
     
     i.cosine distance recommender system
     
     j.TensorFlow Recommenders https://www.tensorflow.org/recommenders
     
     k.Surprise baseline model
     
     l.Tf-Rec https://github.com/Praful932/Tf-Rec
     
     https://analyticsindiamag.com/top-open-source-recommender-systems-in-python-for-your-ml-project/
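
A small memory-based, user-based collaborative filtering sketch built on cosine similarity over a user-item interaction matrix. The users, items and ratings are invented, and scoring unrated items by a similarity-weighted sum of other users' ratings is one simple choice among several.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two rating vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def recommend(user, ratings, items, k=1):
    """Rank the items the user has not rated by similarity-weighted ratings."""
    target = ratings[user]
    sims = {o: cosine(target, vec) for o, vec in ratings.items() if o != user}
    scores = {}
    for idx, item in enumerate(items):
        if target[idx] == 0:  # 0 means "not rated yet"
            scores[item] = sum(sims[o] * ratings[o][idx] for o in sims)
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Made-up user-item matrix: rows are users, columns follow `items`.
items = ["film_a", "film_b", "film_c", "film_d"]
ratings = {
    "ann": [5, 4, 0, 1],
    "bob": [4, 5, 0, 1],
    "cat": [1, 0, 5, 4],
    "dan": [5, 0, 0, 0],  # dan has only rated film_a
}
print(recommend("dan", ratings, items))
```

Here dan is most similar to ann and bob, so the film they both rated highly comes out on top.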

C.Ensemble methods

 1.Stacking models

 2.Bagging models

 3.Boosting models
 
 4.Blending
 
 5.Voting (Hard Voting,Soft Voting)
 
 Shapley value of players (models) in weighted voting games  https://github.com/benedekrozemberczki/shapley
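
The difference between hard and soft voting can be shown with a few lines of plain Python. The three "models" and their class probabilities are made up; scikit-learn's `VotingClassifier` implements both modes for real estimators.

```python
from collections import Counter


def hard_vote(predictions):
    """Majority vote over the class labels predicted by each model."""
    return Counter(predictions).most_common(1)[0][0]


def soft_vote(probabilities):
    """Sum the class probabilities of all models and pick the largest total."""
    totals = Counter()
    for probs in probabilities:
        totals.update(probs)  # adds the probability of each label
    return max(totals, key=totals.get)


# Made-up class probabilities from three models for one e-mail.
model_probs = [
    {"spam": 0.90, "ham": 0.10},
    {"spam": 0.40, "ham": 0.60},
    {"spam": 0.45, "ham": 0.55},
]

# Each model's hard prediction is its most probable label.
labels = [max(p, key=p.get) for p in model_probs]

print(hard_vote(labels))       # two of three models say "ham"
print(soft_vote(model_probs))  # one very confident model tips the sum to "spam"
```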

D.Reinforcement learning

  2 types: a)model-free   b)model-based

  agent,environment,policy (On-Policy vs Off-Policy),reward function,value function,state,action,episode,actor-critic

  The agent applies an action to the environment and receives the corresponding reward, so that it learns the environment
  
  1.Q-Learning
  
  2.Deep Q-Learning
  
  3.Deep Convolutional Q-Learning
  
  Deep Deterministic Policy Gradient
  
  4.Twin Delayed DDPG,DQN
  
  5.A3C  (Actor Critic)
  
  6.Advantage weighted actor critic (AWAC). 
  
  7.XCS
  
  8.genetic algorithm,sarsa
  
  https://simoninithomas.github.io/deep-rl-course/
  
   Environments-OpenAI Gym, DeepMind Lab, Unity ML-Agents
   
   https://data-flair.training/news/python-libraries-for-reinforcement-learning/
   
   https://analyticsindiamag.com/8-best-free-resources-to-learn-deep-reinforcement-learning-using-tensorflow/   
   
   https://analyticsindiamag.com/top-8-autonomous-driving-open-source-projects-one-must-try-hands-on/
   
   https://analyticsindiamag.com/8-toolkits-for-reinforcement-learning-models-that-make-reasoning-explainability-core-to-ai/
   
   https://neptune.ai/blog/best-reinforcement-learning-tutorials-examples-projects-and-courses
   
   Open AI Gym - https://gym.openai.com/
   
   DeepMind's MuZero  https://deepmind.com/blog/article/muzero-mastering-go-chess-shogi-and-atari-without-rules?utm_campaign=Learning%20Posts&utm_content=150411901&utm_medium=social&utm_source=twitter&hss_channel=tw-3018841323
   
   KerasRL https://github.com/keras-rl/keras-rl
   
   pyqlearning
   
   tensorforce https://tensorforce.readthedocs.io/en/latest/index.html  
   
   Practical_RL https://github.com/yandexdataschool/Practical_RL
   
   rl_coach https://github.com/IntelLabs/coach#installation        MushroomRL https://mushroomrl.readthedocs.io/en/latest/
   
   TFAgents  https://github.com/tensorflow/agents (https://www.tensorflow.org/agents)   https://deepmind.com/blog/article/trfl    
   
   Automate The Stock Market Using FinRL (Deep Reinforcement Learning Library)  https://analyticsindiamag.com/stock-market-prediction-using-finrl/
   
   OpenAI Baselines  https://github.com/openai/baselines  (Stable Baselines is a fork of this project)
   
   https://www.youtube.com/playlist?list=PL_iWQOsE6TfURIIhCrlt-wj9ByIVpbfGc
   
   https://neptune.ai/blog/the-best-tools-for-reinforcement-learning-in-python?utm_source=twitter&utm_medium=tweet&utm_campaign=blog-the-best-tools-for-reinforcement-learning-in-python

Semi-supervised learning: a small amount of labeled data is combined with a large amount of unlabeled data during training

E.Deep learning (use when you have a large amount of highly complex data; state of the art for unstructured data)

Frameworks:Pytorch,Tensorflow,Keras,caffe,theano,MXNet,Matlab,Microsoft Cognitive Toolkit,opacus(Train PyTorch models with Differential Privacy)

1.Multilayer perceptron(MLP)

 1.Regression task

 2.Classification task

2.Convolutional neural network (used for image data)

 1.Classification of image
 
   create own model,LeNet,AlexNet,ResNet,GoogLeNet,Inception,VGG16,VGG19,EfficientNet,NASNet,STN,NASNet-A,SENet,AmoebaNet-C,DeiT (tiny,small,base)
 
 2.Localization of object in image
 
 3.Object detection and object segmentation 
 
   rcnn,fast rcnn,faster rcnn,TensorFlow Object Detection,yolo v1,yolo v2,yolo v3,yolo v4,scaled yolov4,efficientdet,fast yolo,yolo tiny,yolo lite,yolo tiny++,YOLACT++,
   
   maskrcnn,DeepLab-v3-plus,ssd,detectron,detectron2,mobilenet,retinanet,R-fcn,detr facebook,pspnet,segnet,U-net,UNet++,EfficientDet,Vision Transformer,deit
   
   Three kinds of segmentation are available: semantic segmentation, instance segmentation, and panoptic segmentation
   
   PyTorch based low code object detection-https://github.com/alankbi/detecto
   
   autogluon 
   
   https://awesomeopensource.com/project/hoya012/deep_learning_object_detection
 
 4.Object tracking  (mean shift, optical flow, Kalman filter)
 
   Tracktor++,Trackrcnn,Jde,DeepSORT,FairMOT
   
   mmtracking https://github.com/open-mmlab/mmtracking
 
 5.Deepdream,Neural style transfer, Pose estimation 
 
 6.DEEP LEARNING METHODS FOR 2D :OpenPose,DeepPose,MultiPoseNet,AlphaPose,VIBE,DeeperCut,Mask RCNN,DeepCut,Convolutional Pose Machines,PoseNet
 
 openpose wrnchai  densepose
 
 3D POSE ESTIMATION
 
 3D Image Classification https://keras.io/examples/vision/3D_image_classification/
 
 TensorFlow 2 Object Detection API tutorial https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/
 
 https://blog.paperspace.com/how-to-train-scaled-yolov4-object-detection/
 
 albumentations https://github.com/albumentations-team/albumentations
 
 TensorFlow2.0-Examples  https://github.com/YunYang1994/TensorFlow2.0-Examples
 
 unadversarial  https://github.com/microsoft/unadversarial/ https://analyticsindiamag.com/microsoft-research-unadversarial/
 
 CNNs 'see' - FilterVisualizations, Heatmaps,Saliency Maps,Heat Map Visualizations,GradCAM,Class Activation Maps,ZFNet,Lucid,Activation Atlas,Blur Integrated Gradients,concept whitening,Integrated Gradients,SmoothGrad
 
 https://github.com/utkuozbulak/pytorch-cnn-visualizations 
 
 Mediapipe for Python https://google.github.io/mediapipe/
 
 imageai.Detection for Object detection
 
 cnn-raccoon  interactive dashboards for your Convolutional Neural Networks with a single line of code https://github.com/lucko515/cnn-raccoon
 
 deit https://github.com/facebookresearch/deit   https://wandb.ai/thibault-neveu/detr-tensorflow-log/reports/Finetuning-DETR-Object-Detection-with-Transformers-on-Tensorflow-A-step-by-step-tutorial--VmlldzozOTYyNzQ  https://github.com/Visual-Behavior/detr-tensorflow
   
 awesome-computer-vision-models https://github.com/nerox8664/awesome-computer-vision-models
 
 EfficientDet https://github.com/ravi02512/efficientdet-keras
 
 Vision Transformer - Pytorch  https://github.com/lucidrains/vit-pytorch   https://github.com/alohays/awesome-visual-representation-learning-with-transformers 
 
 https://github.com/ashishpatel26/Vision-Transformer-Keras-Tensorflow-Pytorch-Examples https://github.com/google-research/vision_transformer 
 
 DeepLab-v3-plus Semantic Segmentation in TensorFlow https://github.com/rishizek/tensorflow-deeplab-v3-plus
 
 DEEP LEARNING METHODS FOR 3D: 3D human pose estimation = 2D pose estimation + matching,Integral Human Pose Regression,Towards 3D Human Pose Estimation in the Wild: a Weakly-supervised Approach,A Simple Yet Effective Baseline for 3d Human Pose Estimation

 Data augmentation is applied to increase the size of the dataset and improve model performance
 
 AutoML  https://github.com/dataloop-ai/AutoML
 
 Object Detection with 10 lines of code-https://www.datasciencecentral.com/profiles/blogs/object-detection-with-10-lines-of-code
 
 OneNet-https://analyticsindiamag.com/onenet/
 
 Norfair https://github.com/tryolabs/norfair
 
 Remo Improves Image Management  https://www.freecodecamp.org/news/manage-computer-vision-datasets-in-python-with-remo/
 
 yolo https://github.com/zzh8829/yolov3-tf2 https://github.com/ultralytics/yolov5 https://github.com/ashishpatel26/Yolov5-King-of-object-Detection  https://github.com/sicara/tf2-yolov4
 
 clip https://github.com/openai/CLIP

3.Recurrent neural network (used for sequential data)

 1.RNN
 
 2.GRU
 
 3.LSTM (has a memory cell, forget gate, etc.)
 
 All 3 models above also have bidirectional variants; use them when the problem statement calls for it

4.Generative adversarial network https://poloclub.github.io/ganlab/ https://developers.google.com/machine-learning/gan/training

 CycleGAN,DCGAN,SRGAN,InfoGAN,StarGAN,AttnGAN,StyleGAN,PixelRNN,StackGAN,DiscoGAN,LSGAN,Conditional GAN (Pix2Pix),Progressive GANs (produce higher resolution images),Image-to-Image Translation,Face Inpainting,Super-resolution
 
 Imaginaire https://analyticsindiamag.com/guide-to-nvidia-imaginaire-gan-library-in-python/
 
 StyleFlow https://github.com/RameenAbdal/StyleFlow 
 
 https://github.com/hindupuravinash/the-gan-zoo

5.Autoencoder

  1.sparse Autoencoder
  
  2.denoising Autoencoder
  
  3.Contractive Autoencoder
  
  4.stacked Autoencoder
  
  5.deep Autoencoder
  
  6.variational autoencoder

6.Boltzmann Machines,Restricted Boltzmann Machine,deep belief network,deep Boltzmann Machines

7.Self Organizing Maps (SOM)

8.Natural language processing

 Clean data (removing stopwords depending on the problem, lowercasing, tokenization, POS tagging, stemming or lemmatization depending on the problem, skip-gram, n-gram, chunking)
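
 A few of the cleaning steps above (lowercasing, tokenization, stopword removal, n-grams) can be sketched with the standard library alone. The stopword set here is a tiny illustrative assumption; real projects usually take a full list from a library such as NLTK or spaCy.

```python
import re

# Tiny illustrative stopword list (an assumption for this example only).
STOPWORDS = {"the", "is", "a", "an", "of", "and", "to", "in"}

def clean_text(text, remove_stopwords=True):
    """Lowercase, tokenize on runs of letters/digits, optionally drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    if remove_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    return tokens

def ngrams(tokens, n):
    """Return consecutive n-token tuples (n=2 gives bigrams)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = clean_text("The quality of the data is KEY to a good model!")
bigrams = ngrams(tokens, 2)
```

 Whether to remove stopwords is a per-problem decision: for sentiment analysis, for example, dropping a word like "not" can flip the meaning of a sentence.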
 
 NLTK,spaCy,gensim,TextBlob,iNLTK,Pattern,Stanza,OpenNLP,CoreNLP,polyglot,PyDictionary,Hugging Face,Spark NLP,AllenNLP,Rasa NLU,Megatron,texthero,Flair,textacy,finetune,gluon-nlp,VnCoreNLP,fastText  libraries
 
 clean-text  https://github.com/jfilter/clean-text https://www.youtube.com/watch?v=i2TjAgga1YU
 
 NLU,NLG,NER,text summarization,Sentiment Analysis,Text Classifications,machine translation,chat bot,Text Generation,Speech Recognition
  
 1.bag of words
 
 2.Tfidf
 
 3.wordembedding
    
    a.using pretrained model 
      
      i)word2vec( cbow,skipgram)
      
      ii)glove
      
      iii)fasttext
    
    b.creating own embedding  (use when you have a large amount of data)
    
      i)word2vec library
      
      ii)keras embedding 
      
  ELMo (contextual embeddings that capture the meaning of a word in its sentence)
    
 4.Document embedding-Doc2vec
  
 5.sentence embedding

   sense2vec,SENT2VEC,Universal sentence encoder
   
 Top2Vec 
 
 6.using rnn,lstm,gru
 
   All 3 models above also have bidirectional variants
 
 7.Encoder and Decoder(sequence to sequence), ProphetNet(new pretrained seq2seq model)
  
 8.attention 
 
   self attention,Global Attention,Multi-Head Attention,Local Attention (monotonic,predictive)    https://github.com/uzaymacar/attention-mechanisms
 
 9.Transformer (big breakthrough in NLP) - http://jalammar.github.io/illustrated-transformer/  
 
    FastFormers  https://medium.com/ai-in-plain-english/fastformers-233x-faster-transformers-inference-on-cpu-4c0b7a720e1
 
    Shrinking Transformers (reduce size): quantization,distillation,pruning
    
    Reformer,Performers,vision transformer
    
    Reformer: The Efficient Transformer
    
    Longformer: The Long-Document Transformer
    
    ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
    
    Tree-Transformer https://github.com/yaushian/Tree-Transformer
  
 10.BERT,ConvBERT,Quantized MobileBERT,ALBERT,ARBERT,MARBERT,ELECTRA,Transformer-XL,Reformer,DistilBERT,ELMo,RoBERTa,XLNet,XLM-RoBERTa,DeBERTa,T5,GPT,GPT2,GPT3,PRADO,PET,BORT,MuRIL
 
    https://analyticsindiamag.com/top-ten-bert-alternatives-for-nlu-projects/
 
    http://jalammar.github.io/    http://jalammar.github.io/illustrated-bert/   http://jalammar.github.io/a-visual-guide-to-using-bert-for-the-first-time/
    
    https://jalammar.github.io/explaining-transformers/    https://jalammar.github.io/hidden-states/
    
 11.Speech
   
    speech to text   
    
    text to speech
    
    Acoustic model,Speaker diarisation,apis
    
 SpeechRecognition
 
 googletrans (google Translator)   https://pypi.org/project/googletrans/
 
 Language identification: Google Compact Language Detector,fastText
 
 gTTS for text to speech conversion, speech_recognition for speech to text
 
 Speech-Transformer-tf2.0 https://github.com/xingchensong/Speech-Transformer-tf2.0
 
 The Super Duper NLP Repo  https://notebooks.quantumstat.com/
 
 ecco https://github.com/jalammar/ecco  https://www.eccox.io/   https://www.youtube.com/watch?v=rHrItfNeuh0&feature=youtu.be 
 
 autonlp  https://analyticsindiamag.com/hands-on-guide-to-using-autonlp-for-automating-sentiment-analysis/
    
 https://medium.com/towards-artificial-intelligence/natural-language-processing-nlp-with-python-tutorial-for-beginners-1f54e610a1a0
 
 https://pakodas.substack.com/p/neural-search-on-indian-languages     
 
 https://www.linkedin.com/pulse/natural-language-processing-2020-year-review-ivan-bilan/?trackingId=CYfd1ZyLStu6x09tjVIoGw%3D%3D
 
 ConvBert https://github.com/yitu-opensource/ConvBert
 
 SentenceTransformers  https://www.sbert.net/
 
 Reformer – The Efficient Transformer  https://analyticsindiamag.com/hands-on-guide-to-reformer-the-efficient-transformer/
 
 Funnel-Transformer https://github.com/laiguokun/Funnel-Transformer
 
 CLIP – Connecting Text To Images  https://analyticsindiamag.com/hands-on-guide-to-openais-clip-connecting-text-to-images/ 
 
 Topic Modeling in One Line with Top2Vec https://towardsdatascience.com/topic-modeling-in-one-line-with-top2vec-a413991aa0ef
 
 MT5-https://venturebeat.com/2020/10/26/google-open-sources-mt5-a-multilingual-model-trained-on-over-101-languages/?utm_content=144321587&utm_medium=social&utm_source=linkedin&hss_channel=lcp-3740012
 
 VADER does not require any training data https://pypi.org/project/vaderSentiment/  https://analyticsindiamag.com/sentiment-analysis-made-easy-using-vader/
 
 APPLICATIONS OF MACHINE TRANSLATION: Text-to-text,Text-to-speech,Speech-to-text,Speech-to-speech,Image (of words)-to-text
 
 Google-GNMT (Tensorflow),Facebook-fairseq (Torch),Amazon-Sockeye (MXNet),NEMATUS (Theano),THUMT (Theano),OpenNMT (PyTorch),StanfordNMT (Matlab),DyNet-lamtram (CMU),EUREKA (MangoNMT)
 
 awesome-gpt3 https://github.com/elyase/awesome-gpt3
 
 Robustness Gym: Evaluation Toolkit for NLP https://github.com/robustness-gym/robustness-gym
 
 https://analyticsindiamag.com/best-nlp-based-seo-tools-for-2021/
 
 https://www.kdnuggets.com/2020/05/best-nlp-deep-learning-course-free.html   https://analyticsindiamag.com/flair-hands-on-guide-to-robust-nlp-framework-built-upon-pytorch/
 
 https://medium.com/modern-nlp/nlp-metablog-a-blog-of-blogs-693e3a8f1e0c

classification,clustering,recommender systems,topic modelling,sentiment analysis,semantic analysis,summarization,machine translation,conversational interface,named entity recognition

F.Time Series

  The data split is different here: train, validation, and test sets are taken in time order, not by random shuffling
  
  Handling missing data is also different here
  
  Methods generally used to impute data in time series:
  
  1.ffill
  
  2.bfill
  
  3.take the mean of the previous or following x samples and impute
  
  4.take previous season value and impute (data with trend)
  
  5.mean,mode,median,random sample imputation (data without trend and without seasonality)
  
  6.linear interpolation(data with trend and without seasonality)
  
  7.seasonal +interpolation(data with trend and with seasonality)
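
  The first imputation options above (ffill, bfill, linear interpolation) can be sketched without any library, as below. The sample series is made up for illustration. In practice pandas offers the same operations as `Series.ffill()`, `Series.bfill()` and `Series.interpolate()`.

```python
def ffill(values):
    """Forward fill: replace each None with the last observed value."""
    filled, last = [], None
    for v in values:
        last = v if v is not None else last
        filled.append(last)
    return filled

def bfill(values):
    """Backward fill: replace each None with the next observed value."""
    return ffill(values[::-1])[::-1]

def linear_interpolate(values):
    """Fill interior gaps on a straight line between the nearest observed neighbours."""
    result = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        slope = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            result[i] = values[left] + slope * (i - left)
    return result

series = [10.0, None, None, 16.0, None, 20.0]  # made-up series with gaps

forward = ffill(series)
backward = bfill(series)
interpolated = linear_interpolate(series)
```

  Forward fill never looks into the future, which matters when the imputed series is later used for forecasting; backward fill and interpolation both use later observations.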
  
  Model selection here depends on properties of the data such as stationarity, trend, seasonality, and cyclic behaviour
  
  Anomaly Detection using Isolation Forest,AutoEncoders
  
  Granger causality test: checks whether one variable is useful for forecasting another
  
  Stationarity tests: ADF (adfuller in statsmodels; null hypothesis is non-stationarity) and KPSS (null hypothesis is stationarity)
  
  Handling Data with Regular Gaps using Facebook Prophet
  
  models 
  
  1.arma,Arima , auto arima ,seasonal arima
  
  2.Autoregressive
  
  3.Moving average,Exponential Moving average,Exponential Smoothing
  
  4.Lstm(neural network)
  
  5.GARCH
  
  atspy	Automated time-series models
  
  6.Naive forecasts
  
  7.Smoothing (moving average,exponential smoothing)
  
  8.Facebook Prophet (note: expects the date column to be named ds and the target column y)
  
  NeuralProphet Model- https://ourownstory.github.io/neural_prophet/model-overview/
  
  hmmlearn https://github.com/ushareng/StockPricePredictionUsingHMM_Byte/blob/master/StockPricePredictionUsingHMM.ipynb
  
  stumpy https://github.com/TDAmeritrade/stumpy
  
  Informer (for Long Sequence Time-Series Forecasting) https://analyticsindiamag.com/informer/ 
  
  DeepAR is a global model (one model trained across many time series)
  
  pmdarima for Auto ARIMA
  
  GluonTS
  
  9.Holt-Winters,Holt's linear trend
  
  10.Auto_Timeseries by auto-ts   
  
  AutoTS-https://analyticsindiamag.com/hands-on-guide-to-autots-effective-model-selection-for-multiple-time-series/  https://github.com/AutoViML/Auto_TS
  
  AutoTS  https://github.com/winedarksea/AutoTS
  
  GluonTS , PytorchTS   https://analyticsindiamag.com/gluonts-pytorchts-for-time-series-forecasting/
  
  11.Temporal Convolutional Network (TCN)
  
  12.Atspy For Automating The Time-Series Forecasting-https://analyticsindiamag.com/hands-on-guide-to-atspy-for-automating-the-time-series-forecasting/
  
  13.Darts-https://analyticsindiamag.com/hands-on-guide-to-darts-a-python-tool-for-time-series-forecasting/
  
  14.Bayesian Neural Network , TsEuler
  
  15.PyFlux (easy way to compare different models)-https://analyticsindiamag.com/pyflux-guide-python-library-for-time-series-analysis-and-prediction/
  
  16.Orbit , DeepAR ,NeuralProphet(https://github.com/ourownstory/neural_prophet    https://ourownstory.github.io/neural_prophet/model-overview/)
  
  best article-https://www.analyticsvidhya.com/blog/2018/02/time-series-forecasting-methods/,
  
  time series visualization tool https://plotjuggler.io/
  
  fastquant: backtest and optimize your trading strategies with only 3 lines of code  https://github.com/enzoampil/fastquant 

  pytorch-forecasting  https://github.com/jdb78/pytorch-forecasting  https://analyticsindiamag.com/guide-to-pytorch-time-series-forecasting/ 
  
  https://pytorch-forecasting.readthedocs.io/en/latest/  https://pytorch-forecasting.readthedocs.io/en/latest/tutorials/ar.html
  
  sktime-https://github.com/alan-turing-institute/sktime  https://analyticsindiamag.com/sktime-library/
  
  atspy  https://github.com/firmai/atspy
  
  tcn https://towardsdatascience.com/farewell-rnns-welcome-tcns-dd76674707c8
  
  https://machinelearningmastery.com/time-series-forecasting-methods-in-python-cheat-sheet/
  
  https://www.machinelearningplus.com/time-series/time-series-analysis-python/
  
  https://github.com/Apress/hands-on-time-series-analylsis-python
  
  https://otexts.com/fpp2/simple-methods.html
      
  https://analyticsindiamag.com/top-time-series-deep-learning-methods/

G.Semi supervised learning,Self-Supervised Learning,Multi-Instance Learning

H.Active learning,Multi-Task Learning,Online Learning

I.Transfer learning(Inductive Transfer learning(similar domain,different task),Unsupervised Transfer Learning(different task,different domain but similar enough) ,Transductive Transfer Learning(similar task,different domain))

https://github.com/artix41/awesome-transfer-learning

https://towardsdatascience.com/a-comprehensive-hands-on-guide-to-transfer-learning-with-real-world-applications-in-deep-learning-212bf3b2f27a

J.Deep dream,Style transfer

K.One-shot learning,Zero-shot learning

l.Incremental Training https://blog.rasa.com/rasa-new-incremental-training/

https://github.com/ChristosChristofidis/awesome-deep-learning

101 Machine Learning Algorithms for Data Science with Cheat Sheets https://blog-datasciencedojo-com.cdn.ampproject.org/c/s/blog.datasciencedojo.com/machine-learning-algorithms/amp/

TYPES OF ACTIVATION FUNCTIONS: LINEAR ACTIVATION,RELU,LEAKY RELU,SIGMOID ACTIVATION,TANH ACTIVATION,elu,PReLU,Softmax,Swish,Softplus
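
Several of the activation functions listed above are short enough to write out directly. The sketch below uses plain Python scalars for readability; the default slope of 0.01 for Leaky ReLU is a common choice but still an assumption of this example.

```python
import math

def relu(x):
    """ReLU: pass positive inputs through, clamp negatives to zero."""
    return max(0.0, x)

def leaky_relu(x, negative_slope=0.01):
    """Leaky ReLU: like ReLU, but negatives keep a small slope instead of zero."""
    return x if x > 0 else negative_slope * x

def sigmoid(x):
    """Sigmoid: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def softmax(values):
    """Softmax: turns a list of scores into probabilities that sum to 1."""
    shift = max(values)  # subtracting the max avoids overflow in exp
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

probabilities = softmax([1.0, 2.0, 3.0])
```

Tanh is available directly as `math.tanh`. Sigmoid and softmax are typical output-layer choices (binary and multi-class classification), while ReLU-style functions are typical for hidden layers.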

Optimizer- Gradient Descent(Batch Gradient Descent,Stochastic Gradient Descent,Mini batch Gradient Descent),sgd with momentum,Adagrad,RMSProp,Adam,AdaBelief

https://analyticsindiamag.com/ultimate-guide-to-pytorch-optimizers/ https://analyticsindiamag.com/guide-to-tensorflow-keras-optimizers/
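
To make the optimizer list more concrete, here is a minimal sketch comparing plain gradient descent with SGD with momentum. It minimizes a made-up one-dimensional loss f(w) = (w - 3)^2; the learning rate, momentum coefficient, and step counts are illustrative assumptions.

```python
def gradient(w):
    """Gradient of the toy loss f(w) = (w - 3)^2, whose minimum is at w = 3."""
    return 2.0 * (w - 3.0)

def gradient_descent(w, lr=0.1, steps=100):
    """Plain gradient descent: step against the gradient."""
    for _ in range(steps):
        w -= lr * gradient(w)
    return w

def sgd_momentum(w, lr=0.1, beta=0.9, steps=300):
    """Momentum: keep a running velocity so past gradients keep pushing."""
    velocity = 0.0
    for _ in range(steps):
        velocity = beta * velocity - lr * gradient(w)
        w += velocity
    return w

w_plain = gradient_descent(0.0)
w_momentum = sgd_momentum(0.0)
```

Batch, stochastic, and mini-batch gradient descent use this same update rule; they differ only in how much of the training data is used to estimate the gradient at each step.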

Regularization- L1, L2, dropout, early stopping, data augmentation, batch normalisation, tree pruning

Learning rate scheduling,Weight Decay,Gradient clipping

Different Normalization Layers - https://towardsdatascience.com/different-normalization-layers-in-deep-learning-1a7214ff71d6

Hyperparameters: number of hidden layers, dropout, activation function, weight initialization, learning rate, epochs, iterations and batch size

DropBlock-Keras-Implementation https://github.com/iantimmis/DropBlock-Keras-Implementation https://github.com/miguelvr/dropblock https://github.com/DHZS/tf-dropblock

Hyperparameter tuning

a.GridSearchCV (checks every given parameter combination, so it takes a long time)

b.RandomizedSearchCV (samples parameter combinations at random, which cuts search time)

c.Bayesian Optimization , Hyperopt

d.Sequential Model Based Optimization(Tuning a scikit-learn estimator with skopt)

e.Optuna

f.Genetic Algorithms

g.Keras tuner

h.Scikit-Optimize

i.ray[tune] and aisaratuners https://towardsdatascience.com/choosing-a-hyperparameter-tuning-library-ray-tune-or-aisaratuners-b707b175c1d7
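
The trade-off between grid search and random search in the list above can be sketched as follows. The `validation_score` function is a hypothetical stand-in for "train a model and return its validation score", and the search space values are made up for the example.

```python
import itertools
import random

def validation_score(learning_rate, depth):
    """Hypothetical stand-in for training a model and scoring it on validation data."""
    return -((learning_rate - 0.1) ** 2) - 0.01 * (depth - 4) ** 2

search_space = {"learning_rate": [0.01, 0.1, 1.0], "depth": [2, 4, 6, 8]}

def grid_search(space):
    """Evaluate every combination; cost multiplies with each added parameter."""
    names = list(space)
    candidates = [dict(zip(names, combo))
                  for combo in itertools.product(*space.values())]
    best = max(candidates, key=lambda params: validation_score(**params))
    return best, len(candidates)

def random_search(space, n_trials, seed=0):
    """Evaluate only n_trials randomly sampled combinations."""
    rng = random.Random(seed)
    candidates = [{name: rng.choice(options) for name, options in space.items()}
                  for _ in range(n_trials)]
    best = max(candidates, key=lambda params: validation_score(**params))
    return best, len(candidates)

best_grid, grid_evaluations = grid_search(search_space)
best_random, random_evaluations = random_search(search_space, n_trials=5)
```

Grid search is guaranteed to find the best point on the grid but needs 3 x 4 = 12 evaluations here; random search uses a fixed budget of 5 and may or may not hit the best combination.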

Milano   https://github.com/NVIDIA/Milano

Auto-PyTorch https://github.com/automl/Auto-PyTorch

https://towardsdatascience.com/10-hyperparameter-optimization-frameworks-8bc87bc8b7e3

Cross validation techniques- https://towardsdatascience.com/understanding-8-types-of-cross-validation-80c935a4976d

 1.Loocv
 
 2.Kfoldcv
 
 3.Stratified cross validation
 
 4.Time Series cross-validation
 
 5.Holdout cross-validation
 
 6.Repeated cross-validation
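
 To show how K-fold and time series cross-validation differ, the sketch below generates the train/test indices for both. It is a simplified, library-free illustration (no shuffling, expanding training window); scikit-learn provides production versions as `KFold` and `TimeSeriesSplit`.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists; each sample is in a test fold exactly once."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size

def time_series_indices(n_samples, n_splits):
    """Expanding-window splits: train on the past, test on the block that follows."""
    test_size = n_samples // (n_splits + 1)
    for split in range(1, n_splits + 1):
        train_end = n_samples - (n_splits - split + 1) * test_size
        yield list(range(train_end)), list(range(train_end, train_end + test_size))

k_fold = list(k_fold_indices(6, 3))
time_series = list(time_series_indices(6, 2))
```

 In the K-fold splits the training set can contain samples that come after the test samples, which leaks future information for time series; the time series splits never do this.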

TensorBoard and Neptune for visualizing model performance

Distributed Training with TensorFlow

6.Testing model

Generally used metrics

 Always check the bias-variance tradeoff to know how the model is performing
 
 A model can be overfitting (low bias, high variance), underfitting (high bias, low variance), or a good fit (low bias, low variance)
 
 https://scikit-learn.org/stable/modules/model_evaluation.html   https://scikit-learn.org/stable/modules/classes.html#module-sklearn.linear_model
 
 https://stanford.edu/~shervine/teaching/cs-229/cheatsheet-machine-learning-tips-and-tricks
 
1.Regression task - mean squared error, root mean squared error, mean absolute error, R², adjusted R², mean percentage error

2.Classification task-Accuracy,confusion matrix,Precision,Recall,F1 Score,Binary Crossentropy,Categorical Crossentropy,AUC-ROC curve,log loss,Average precision,Mean average precision

3.Reinforcement learning - generally  use rewards

4.For machine translation, use the BLEU score

5.Clustering - External: Adjusted Rand index, Jaccard Score, Purity Score    Internal: silhouette_score, Davies-Bouldin Index, Dunn Index

6.Object Detection loss-localization loss,classification loss,Focal Loss,IOU,L2 loss

7.Distance Metrics - Euclidean Distance,Manhattan Distance,Minkowski Distance,Hamming Distance
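
The most common regression and classification metrics from the list above can be computed by hand as a sanity check. This is a minimal sketch with made-up labels and predictions; `sklearn.metrics` offers tested implementations of the same quantities.

```python
import math

def regression_metrics(y_true, y_pred):
    """MSE, RMSE, MAE and R² for two equal-length lists of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    squared_error = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    total_variation = sum((t - mean_true) ** 2 for t in y_true)
    mse = squared_error / n
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1.0 - squared_error / total_variation,
    }

def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Made-up values for illustration.
reg = regression_metrics([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
clf = classification_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0])
```

Precision and recall matter most when classes are imbalanced, where accuracy alone can look good for a model that rarely predicts the minority class.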

metric-Built-in metrics, Custom metric without external parameters,Custom metric with external parameters,Subclassing custom metric layer

Robustness Gym: Evaluation Toolkit for NLP https://github.com/robustness-gym/robustness-gym

https://medium.com/swlh/custom-loss-and-custom-metrics-using-keras-sequential-model-api-d5bcd3a4ff28

loss-Built-in loss, Custom loss without external parameters,Custom loss with external parameters,Subclassing loss layer

https://analyticsindiamag.com/all-pytorch-loss-function/   https://analyticsindiamag.com/ultimate-guide-to-loss-functions-in-tensorflow-keras-api-with-python-implementation/

Docker and Kubernetes

https://towardsdatascience.com/deploy-machine-learning-app-built-using-streamlit-and-pycaret-on-google-kubernetes-engine-fd7e393d99cb

simplest way to serve your ML models on Kubernetes https://towardsdatascience.com/the-simplest-way-to-serve-your-ml-models-on-kubernetes-5323a380bf9f

7.deployment

Platform as a Service (PaaS),Infrastructure as a Service (IaaS),SaaS (Software as a Service)

3 main approaches to saving and reloading an ML model: pickle, joblib, JSON

https://www.datacamp.com/community/tutorials/pickle-python-tutorial
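
A minimal sketch of the pickle and JSON approaches is shown below. The "model" is a hypothetical dictionary of learned parameters standing in for a real estimator, and the file names are made up. joblib follows the same pattern with `joblib.dump(model, path)` and `joblib.load(path)`.

```python
import json
import os
import pickle
import tempfile

# Hypothetical "trained model": just its learned parameters.
model = {"weights": [0.4, -1.2], "bias": 0.1}

def predict(params, features):
    """Linear prediction from the stored parameters."""
    return sum(w * x for w, x in zip(params["weights"], features)) + params["bias"]

with tempfile.TemporaryDirectory() as folder:
    # Pickle approach: stores arbitrary Python objects in a binary file.
    pickle_path = os.path.join(folder, "model.pkl")
    with open(pickle_path, "wb") as f:
        pickle.dump(model, f)
    with open(pickle_path, "rb") as f:
        from_pickle = pickle.load(f)

    # JSON approach: stores only plain parameters, as human-readable text.
    json_path = os.path.join(folder, "model.json")
    with open(json_path, "w") as f:
        json.dump(model, f)
    with open(json_path) as f:
        from_json = json.load(f)

same_prediction = predict(from_pickle, [1.0, 2.0]) == predict(from_json, [1.0, 2.0])
```

Pickle and joblib can store whole estimator objects but should only be loaded from trusted sources, since unpickling can execute arbitrary code. JSON is safer and portable across languages, but the model has to be rebuilt from its parameters.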

1.Azure

2.Heroku

3.Amazon Web Services

4.Google cloud platform

MODEL DEPLOYMENT USING TF SERVING
 
TensorFlow Extended (TFX) is an end-to-end platform for deploying production ML pipelines https://www.tensorflow.org/tfx

Models visualization using Tensorboard,netron, TensorBoard.dev

Python web Frameworks for App Development- Flask,Streamlit,fastapi,Django,Web2py,Pyramid,CherryPy,Voila,Kivy and Kivymd  

streamlit,plotly jupyterdash,h2o wave

https://analyticsindiamag.com/top-8-python-tools-for-app-development/

PyQt and Tkinter , PySimpleGUI are GUI programming in Python  https://github.com/tirthajyoti/DS-with-PySimpleGUI

DearPyGui https://github.com/hoffstadt/DearPyGui

snapyml Deploy AI Models For Free -http://snapyml.snapy.ai/

h20wave-apps https://github.com/h2oai/wave-apps  https://h2oai.github.io/wave/docs/installation/

Web-Based GUI (Gradio)- https://analyticsindiamag.com/guide-to-gradio-create-web-based-gui-applications-for-machine-learning/

Bamboolib https://medium.com/ai-in-plain-english/bamboolib-a-data-warriors-weapon-9f734f4c2553

web application(dash)- https://dash.plotly.com/

https://towardsdatascience.com/pycaret-2-1-is-here-whats-new-4aae6a7f636a

Create a Website with AI https://www.bookmark.com/ 

Jupyter Notebook into an interactive dashboard (voila)-https://voila.readthedocs.io/en/stable/

high-level app and dashboarding solution(Panel)-https://panel.holoviz.org/

https://github.com/gradio-app/gradio

Tensorflow lite:Use of tensorflow lite to reduce size of model https://www.tensorflow.org/lite https://codelabs.developers.google.com/codelabs/recognize-flowers-with-tensorflow-on-android-beta/#0 https://tfhub.dev/s?deployment-format=lite https://www.tensorflow.org/lite/examples https://www.tensorflow.org/lite/microcontrollers https://www.tensorflow.org/lite/models

Six types of methods for reducing model size:

  1. Pruning
  2. Quantization
     1. Post-Training Quantization: Reduce Float16, Hybrid Quantization, Integer Quantization
     2. During-Training Quantization
     3. Post-Training Pruning
     4. Post-Training Clustering
  3. Knowledge distillation
  4. Parameter sharing
  5. Tensor decomposition
  6. Linear Transformer
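
The idea behind quantization in the list above can be sketched in plain Python: floats are mapped to small integer codes plus a scale and an offset. This is a simplified min/max scheme written for illustration and is not how TensorFlow Lite implements it; the weight values are made up.

```python
def quantize(weights, num_bits=8):
    """Map floats to integer codes in [0, 2**num_bits - 1] using min/max scaling."""
    levels = 2 ** num_bits - 1
    w_min, w_max = min(weights), max(weights)
    scale = (w_max - w_min) / levels or 1.0  # fall back to 1.0 if all weights are equal
    codes = [round((w - w_min) / scale) for w in weights]
    return codes, scale, w_min

def dequantize(codes, scale, w_min):
    """Recover approximate floats from the integer codes."""
    return [w_min + c * scale for c in codes]

weights = [-1.0, -0.25, 0.0, 0.5, 1.0]  # made-up float32-style weights
codes, scale, offset = quantize(weights)
restored = dequantize(codes, scale, offset)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Storing 8-bit codes instead of 32-bit floats makes the weights roughly 4 times smaller, at the cost of a rounding error of at most half a quantization step per weight.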

model optimization (architecture)

TinyML https://blog.tensorflow.org/2020/08/the-future-of-ml-tiny-and-bright.html

Post-training Quantization in TensorFlow Lite https://www.tensorflow.org/lite/performance/post_training_quantization

pruning

Automatic Model Compression (AutoMC) framework for developing smaller and faster AI applications https://github.com/Tencent/PocketFlow

leverage of model architecture

Quantization:Use Quantization to reduce size of model

8.Monitoring model

CI/CD pipeline tools: CircleCI, Jenkins

In real-world projects, use a pipeline (https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html) because it gives:

1.easy debugging

2.better readability

BIG DATA: hadoop,apache spark

research paper-https://arxiv.org/ ,https://arxiv.org/list/cs.LG/recent, https://www.kaggle.com/Cornell-University/arxiv

arXiv.org https://arxiv.org/list/cs.AI/recent https://arxiv.org/list/stat.ML/recent https://arxiv.org/list/cs.CL/recent https://arxiv.org/list/cs.CV/recent

https://github.com/amitness/papers-with-video

Semantic Scholar searches: https://www.semanticscholar.org/search?q=%22neural%20networks%22&sort=relevance&ae=false

https://www.semanticscholar.org/search?q=%22machine%20learning%22&sort=relevance&ae=false

https://www.semanticscholar.org/search?q=%22natural%20language%22&sort=relevance&ae=false

https://www.semanticscholar.org/search?q=%22computer%20vision%22&sort=relevance&ae=false

https://www.semanticscholar.org/search?q=%22deep%20learning%22&sort=relevance&ae=false

code for Research Papers-https://chrome.google.com/webstore/detail/find-code-for-research-pa/aikkeehnlfpamidigaffhfmgbkdeheil

Summarise Research Papers - https://www.semanticscholar.org/

Programming languages for data science: Python,R,Julia,Java,Scala,JavaScript (TensorFlow.js)

IDE:jupyter notebook,spyder,pycharm,visual studio

BEST ONLINE COURSES

1.COURSERA

2.UDEMY

3.EDX

4.DATACAMP

5.Udacity

6.https://www.skillbasics.com/

BEST YOUTUBE CHANNEL TO FOLLOW

1.Krish Naik-https://www.youtube.com/user/krishnaik06

2.Codebasics-https://www.youtube.com/channel/UCh9nVJoWXmFb7sLApWGcLPQ  

3.Abhishek thakur-https://www.youtube.com/user/abhisheksvnit

4.AIEngineering-https://www.youtube.com/channel/UCwBs8TLOogwyGd0GxHCp-Dw

5.Ineuron-https://www.youtube.com/channel/UCb1GdqUqArXMQ3RS86lqqOw

6.Ken jee-https://www.youtube.com/c/KenJee1/featured       

7.3Blue1Brown-https://www.youtube.com/c/3blue1brown/featured

8.The AI Guy -https://www.youtube.com/channel/UCrydcKaojc44XnuXrfhlV8Q 

9.Unfold Data Science-https://www.youtube.com/channel/UCh8IuVJvRdporrHi-I9H7Vw

BEST BLOGS TO FOLLOW

https://www.cybrhome.com/topic/data-science-blogs

1.Towards data science-https://towardsdatascience.com/

2.Analyticsvidhya-https://www.analyticsvidhya.com/blog/?utm_source=feed&utm_medium=navbar       https://analyticsindiamag.com/

3.Medium-https://medium.com/

4.Machinelearningmastery-https://machinelearningmastery.com/blog/

5.ML+  -https://www.machinelearningplus.com/

6.analyticsinsight https://www.analyticsinsight.net/category/latest-news/   

7.KDnuggets https://www.kdnuggets.com/  https://www.kdnuggets.com/news/index.html   

https://machinelearningknowledge.ai/   

https://github.com/rushter/data-science-blogs

https://www.datamuni.com/

https://blog.ml.cmu.edu/?utm_source=towardsai.net&utm_medium=referral&utm_campaign=marketing&utm_term=machine-learning-blog&utm_content=best-machine-learning-blogs-to-follow

https://www.amazon.science/blog?utm_source=towardsai.net&utm_medium=referral&utm_campaign=marketing&utm_term=machine+learning+blog&utm_content=machine+learning+blog&f0=0000016e-2ff1-d205-a5ef-aff9651e0000&s=0

https://distill.pub/?utm_source=towardsai.net&utm_medium=referral&utm_campaign=marketing&utm_term=machine-learning-blog&utm_content=best-machine-learning-blogs-to-follow

https://ai.googleblog.com/search/label/Machine%20Learning?utm_source=towardsai.net&utm_medium=referral&utm_campaign=marketing&utm_term=machine-learning-blog&utm_content=best-machine-learning-blogs-to-follow

https://neptune.ai/blog?utm_source=towardsai.net&utm_medium=referral&utm_campaign=marketing&utm_term=machine+learning+blog&utm_content=machine+learning+blog

https://bair.berkeley.edu/blog/?utm_source=towardsai.net&utm_medium=referral&utm_campaign=marketing&utm_term=machine-learning-blog&utm_content=best-machine-learning-blogs-to-follow

https://deepmind.com/research?utm_source=towardsai.net&utm_medium=referral&utm_campaign=marketing&utm_term=machine-learning-blog&utm_content=machine-learning-blogs-to-follow&filters=%7B%22category%22:%5B%22Research%22%5D%7D

https://ai.facebook.com/blog/?utm_source=towardsai.net&utm_medium=referral&utm_campaign=marketing&utm_term=machine-learning-blog&utm_content=machine-learning-blogs-to-follow

https://becominghuman.ai/top-25-ai-and-machine-learning-blogs-for-data-scientists-9f121bcfd9a2

https://medium.com/towards-artificial-intelligence/best-machine-learning-blogs-to-follow-ml-research-ai-3994e01967f9

BEST RESOURCES

https://amitness.com/toolbox/ https://github.com/khuyentran1401/Data-science https://github.com/ml-tooling/best-of-ml-python

https://github.com/ml-tooling/best-of-ml-python#machine-learning-frameworks

https://towardsdatascience.com/data-science-tools-f16ecd91c95d https://mathdatasimplified.com/

1.paperswithcode-https://paperswithcode.com/methods

paperswithcode-client https://github.com/paperswithcode/paperswithcode-client

2.madewithml-https://madewithml.com/topics/ https://madewithml.com/courses/applied-ml-in-production/

Weights & Biases- https://wandb.ai/gallery sotabench-https://sotabench.com/

3.Deep learning-https://course.fullstackdeeplearning.com/#course-content

4.pytorch deep learning-https://atcold.github.io/pytorch-Deep-Learning/

https://www.kdnuggets.com/2019/08/pytorch-cheat-sheet-beginners.html https://www.kdnuggets.com/2019/04/nlp-pytorch.html

PyTorch Lightning-https://github.com/PyTorchLightning/pytorch-lightning

PYTORCH - https://pytorch.org/ https://pytorch.org/ecosystem/ https://pytorch.org/tutorials/ https://pytorch.org/docs/stable/index.html https://github.com/pytorch/pytorch

PyTorch Lightning https://pytorchlightning.ai/community#projects https://seannaren.medium.com/introducing-pytorch-lightning-sharded-train-sota-models-with-half-the-memory-7bcc8b4484f2

Opacus (training PyTorch models with differential privacy)-https://opacus.ai/

light-face-detection https://github.com/borhanMorphy/light-face-detection

DALLE-pytorch https://github.com/lucidrains/DALLE-pytorch

PyTorch JIT -https://lernapparat.de/jit-optimization-intro/

jax- https://github.com/google/jax

incubator-mxnet - https://github.com/apache/incubator-mxnet

ignite-https://github.com/pytorch/ignite

fastText - https://github.com/facebookresearch/fastText

rapidminer-https://rapidminer.com/

5.deep-learning-drizzle-https://deep-learning-drizzle.github.io/ https://deep-learning-drizzle.github.io/index.html

6.Fastaibook-https://github.com/fastai/fastbook , https://course.fast.ai/ https://www.fast.ai/2019/07/08/fastai-nlp/ https://www.fast.ai/2020/08/21/fastai2-launch/

neptune.ai-https://docs.neptune.ai/index.html

Dive into Deep Learning http://d2l.ai/

7.TopDeepLearning-https://github.com/aymericdamien/TopDeepLearning

8.NLP-progress-https://github.com/sebastianruder/NLP-progress

9.EasyOCR-https://github.com/JaidedAI/EasyOCR

10.Awesome-pytorch-list-https://github.com/bharathgs/Awesome-pytorch-list https://shivanandroy.com/awesome-nlp-resources/

11.free-data-science-books-https://github.com/chaconnewu/free-data-science-books

12.arcgis-https://github.com/Esri/arcgis-python-api https://geemap.org/

13.data-science-ipython-notebooks-https://github.com/donnemartin/data-science-ipython-notebooks

14.julia-https://github.com/JuliaLang/julia , https://docs.julialang.org/en/v1/

15.google-research-https://github.com/google-research/google-research

16.reinforcement-learning-https://github.com/dennybritz/reinforcement-learning

17.keras-applications-https://github.com/keras-team/keras-applications , https://github.com/keras-team/keras https://keras.io/examples/

18.opencv-https://github.com/opencv/opencv

19.transformers-https://github.com/huggingface/transformers

20.code implementations for research papers-https://chrome.google.com/webstore/detail/find-code-for-research-pa/aikkeehnlfpamidigaffhfmgbkdeheil

21.regarding satellite images - Geo AI,Arcgis,geemap

Esri ArcGIS-https://www.esri.com/en-us/arcgis/about-arcgis/overview

earthcube-https://www.earthcube.eu/

geemap-https://geemap.org/

22.Monk_Object_Detection-https://github.com/Tessellate-Imaging/Monk_Object_Detection

https://github.com/Tessellate-Imaging/monk_v1

https://analyticsindiamag.com/build-computer-vision-applications-with-few-lines-of-code-using-monk-ai/

pyradox https://github.com/Ritvik19/pyradox

23.NLP-progress - https://github.com/sebastianruder/NLP-progress

24.interview-question-data-science-https://github.com/iNeuronai/interview-question-data-science-

25.recommenders-https://github.com/microsoft/recommenders

26.Awesome-NLP-Resources -https://github.com/Robofied/Awesome-NLP-Resources https://shivanandroy.com/awesome-nlp-resources/ https://github.com/keon/awesome-nlp

27.Tool for visualizing attention in the Transformer model-https://github.com/jessevig/bertviz

28.TransCoder-https://github.com/facebookresearch/TransCoder

29.Tessellate-Imaging-https://github.com/Tessellate-Imaging/monk_v1

Monk_Object_Detection-https://github.com/Tessellate-Imaging/Monk_Object_Detection/tree/master/application_model_zoo

Artificial-Intelligence-Deep-Learning-Machine-Learning-Tutorials- https://github.com/TarrySingh/Artificial-Intelligence-Deep-Learning-Machine-Learning-Tutorials

30.Machine-Learning-with-Python-https://github.com/tirthajyoti/Machine-Learning-with-Python

31.Hugging Face hosts pretrained models for almost every NLP task

https://github.com/huggingface https://github.com/huggingface/transformers https://huggingface.co/transformers/ https://huggingface.co/transformers/master/ https://github.com/huggingface/tokenizers

ktrain https://github.com/amaiya/ktrain

32.multi-task-NLP-https://github.com/hellohaptik/multi-task-NLP

33.gpt-2 - https://github.com/openai/gpt-2

34.Powerful and efficient Computer Vision Annotation Tool (CVAT)-https://github.com/openvinotoolkit/cvat, https://github.com/abreheret/PixelAnnotationTool

https://github.com/UniversalDataTool/universal-data-tool http://www.robots.ox.ac.uk/~vgg/software/via/

35.Data augmentation for NLP-https://github.com/makcedward/nlpaug

36.awesome Data Science-https://github.com/academic/awesome-datascience

37.mlops-https://github.com/visenger/awesome-mlops

https://neptune.ai/blog/mlops-what-it-is-why-it-matters-and-how-to-implement-it-from-a-data-scientist-perspective?utm_source=twitter&utm_medium=tweet&utm_campaign=blog-mlops-what-it-is-why-it-matters-and-how-to-implement-it-from-a-data-scientist-perspective

38.gym-https://github.com/openai/gym

39.Super Duper NLP Repo-https://notebooks.quantumstat.com/ https://models.quantumstat.com/ https://miro.com/app/board/o9J_kqndLls=/ https://datasets.quantumstat.com/

40.papers summarizing the advances in the field-https://github.com/eugeneyan/ml-surveys

41.deep-translator-https://github.com/nidhaloff/deep-translator

42.detext-https://github.com/linkedin/detext

43.nlpaug-https://github.com/makcedward/nlpaug

44.ipython-sql-https://github.com/catherinedevlin/ipython-sql

45.libra-https://github.com/Palashio/libra

46.opencv-https://github.com/opencv/opencv

47.learnopencv-https://github.com/spmallick/learnopencv , https://www.learnopencv.com/

48.math is fun-https://www.mathsisfun.com/ , https://pabloinsente.github.io/intro-linear-algebra, https://hadrienj.github.io/posts/Deep-Learning-Book-Series-Introduction/

49.DEEP LEARNING WITH PYTORCH: A 60 MINUTE BLITZ - https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html

50.https://data-flair.training/blogs/

https://data-flair.training/blogs/python-tutorials-home/ https://data-flair.training/blogs/hadoop-tutorials-home/ https://data-flair.training/blogs/spark-tutorials-home/

https://data-flair.training/blogs/tableau-tutorials-home/ https://data-flair.training/blogs/data-science-tutorials-home/

Spark Release 3.0.1-https://spark.apache.org/releases/spark-release-3-0-1.html

mllib https://spark.apache.org/docs/2.0.0/api/python/pyspark.mllib.html https://spark.apache.org/docs/2.0.0/api/python/index.html

https://data-flair.training/blogs/spark-tutorial/ Spark Core,Spark SQL,Spark Streaming,Spark MLlib,Spark GraphX,etc...

Machine Learning with Optimus on Apache Spark https://www.kdnuggets.com/2017/11/machine-learning-with-optimus.html

BigDL: Distributed Deep Learning Framework for Apache Spark https://github.com/intel-analytics/BigDL

51.for more cheatsheets-https://github.com/FavioVazquez/ds-cheatsheets , https://medium.com/swlh/the-ultimate-cheat-sheet-for-data-scientists-d1e247b6a60c

https://www.theinsaneapp.com/2020/12/machine-learning-and-data-science-cheat-sheets-pdf.html

https://stanford.edu/~shervine/teaching/cs-229/cheatsheet-supervised-learning

52.text2emotion-https://pypi.org/project/text2emotion/

53.ExploriPy-https://analyticsindiamag.com/hands-on-tutorial-on-exploripy-effortless-target-based-eda-tool/

54.TCN-https://github.com/philipperemy/keras-tcn

55.deeplearning-models-https://github.com/rasbt/deeplearning-models

56.earthengine-py-notebooks-https://github.com/giswqs/earthengine-py-notebooks

57.NLP-progress -https://github.com/sebastianruder/NLP-progress

58.numerical-linear-algebra -https://github.com/fastai/numerical-linear-algebra

59.Super Duper NLP Repo- https://notebooks.quantumstat.com/

60.reinforcement learning by using PyTorch-https://github.com/SforAiDl/genrl

61.chatbot- from scratch, Google Dialogflow, Rasa NLU, Azure LUIS, ChatterBot, Amazon Lex, Wit.ai, IBM Watson, etc.

https://github.com/fendouai/Awesome-Chatbot

https://www.analyticsinsight.net/category/chatbots/

https://blog.ubisend.com/optimise-chatbots/chatbot-training-data

62.No Code Machine Learning / Deep Learning

Teachable Machine-https://teachablemachine.withgoogle.com/

Microsoft Lobe -https://lobe.ai/

WEKA - https://www.cs.waikato.ac.nz/ml/weka/

Monk_Gui-https://github.com/Tessellate-Imaging/Monk_Gui

FlashML https://www.flash-ml.com/

igel https://github.com/nidhaloff/igel

obviously https://www.obviously.ai/

machine learning straight from Microsoft Excel https://venturebeat.com/2020/12/30/you-dont-code-do-machine-learning-straight-from-microsoft-excel/

ENNUI-https://math.mit.edu/ennui/ https://github.com/martinjm97/ENNUI https://www.youtube.com/watch?v=4VRC5k0Qs2w

Knime https://www.knime.com/

Accord.net http://accord-framework.net/

H2O Driverless AI https://www.h2o.ai/products/h2o-driverless-ai/

Rapid Miner https://rapidminer.com/

opennn https://www.opennn.net/

datarobot https://www.datarobot.com/

dataiku https://www.dataiku.com/product/get-started/

ludwig https://github.com/ludwig-ai/ludwig

orange https://orange.biolab.si/

OpenBlender https://openblender.io/#/welcome

create neural networks with one line of code https://github.com/PraneetNeuro/nnio.l

Machine Learning in JUST ONE LINE OF CODE libra https://github.com/Palashio/libra/ https://www.youtube.com/watch?v=N_T_ljj5vc4

perceptilabs https://towardsdatascience.com/easy-model-building-with-perceptilabs-interactive-tensorflowvisualization-gui-834d5bb3c973

64.tensorflow development-https://blog.tensorflow.org/

TensorFlow Hub (trained ready-to-deploy machine learning models in one place) - https://tfhub.dev/

TensorBoard.dev - https://tensorboard.dev/

tutorials-https://www.tensorflow.org/tutorials https://www.tensorflow.org/guide

TensorFlow Graphics - https://www.tensorflow.org/graphics Lattice-https://www.tensorflow.org/lattice

TensorFlow Probability-https://www.tensorflow.org/probability TensorFlow Privacy- tensorflow-privacy

63.Data Science in the Cloud-Amazon SageMaker,Amazon Lex,Amazon Rekognition,Azure Machine Learning (Azure ML) Services,Azure Service Bot framework,Google Cloud AutoML

64.platforms to build and deploy ML models -Uber has Michelangelo,Google has TFX,Databricks has MLFlow,Amazon Web Services (AWS) has Sagemaker

65.Time Complexity Of Machine Learning Models -https://www.thekerneltrip.com/machine/learning/computational-complexity-learning-algorithms/

66.ML from scratch-https://dafriedman97.github.io/mlbook/content/introduction.html

https://aihubprojects.com/machine-learning-from-scratch-python/

https://www.datasciencecentral.com/profiles/blogs/a-complete-tutorial-to-learn-data-science-with-python-from

https://medium.com/@mattybv3/learn-data-science-from-scratch-curriculum-with-20-free-online-courses-8cff96d6cbe5

67.turn-on visual training for most popular ML algorithms https://github.com/lucko515/ml_tutor https://pypi.org/project/ml-tutor/

68.mlcourse.ai is a free online course- https://mlcourse.ai/

69.using pretrained model provided by tfhub- https://tfhub.dev/

70.Deep-Learning-with-PyTorch- https://pytorch.org/assets/deep-learning/Deep-Learning-with-PyTorch.pdf

71.MIT 6.S191 Introduction to Deep Learning-http://introtodeeplearning.com/

72.R for Data Science-https://r4ds.had.co.nz/ ,Fundamentals of Data Visualization-https://clauswilke.com/dataviz/

74.machine learning in JavaScript-https://www.tensorflow.org/js https://www.tensorflow.org/js/models https://tensorflow-js-object-detection.glitch.me/

TensorFlow.jl Julia with TensorFlow https://malmaud.github.io/tfdocs/ https://malmaud.github.io/TensorFlow.jl/latest/tutorial.html

Sonnet is a library built on top of TensorFlow 2 https://github.com/deepmind/sonnet

TensorFlow Federated (TFF) ( facilitate open research and experimentation with Federated Learning)-https://www.tensorflow.org/federated

TFX is an end-to-end platform for deploying production ML pipelines https://www.tensorflow.org/tfx https://github.com/tensorflow/tfx

Federated Learning -https://www.tensorflow.org/federated/tutorials/federated_learning_for_image_classification

Neural Structured Learning-https://www.tensorflow.org/neural_structured_learning/tutorials/graph_keras_mlp_cora

Responsible AI-https://www.tensorflow.org/resources/responsible-ai

Multilingual Representations for Indian Languages https://tfhub.dev/google/MuRIL/1

75.free list of AI/ Machine Learning Resources/Courses-https://www.marktechpost.com/free-resources/

https://github.com/kabartay/OpenUnivCourses

https://www.kdnuggets.com/2018/11/10-free-must-see-courses-machine-learning-data-science.html

https://www.kdnuggets.com/2018/12/10-more-free-must-see-courses-machine-learning-data-science.html

https://www.theinsaneapp.com/2020/12/machine-learning-and-data-science-cheat-sheets-pdf.html

https://www.theinsaneapp.com/2020/11/free-machine-learning-data-science-and-python-books.html

65 Machine Learning and Data books for free- https://towardsdatascience.com/springer-has-released-65-machine-learning-and-data-books-for-free-961f8181f189

https://www.deeplearningbook.org/ http://d2l.ai/

https://www.datasciencecentral.com/profiles/blogs/free-500-page-book-on-applications-of-deep-neural-networks-1 https://github.com/jeffheaton/t81_558_deep_learning

https://www.theinsaneapp.com/2020/12/free-data-science-books-pdf.html

https://github.com/chaconnewu/free-data-science-books

https://www.kdnuggets.com/2020/03/24-best-free-books-understand-machine-learning.html

https://www.kdnuggets.com/2020/12/15-free-data-science-machine-learning-statistics-ebooks-2021.html

http://introtodeeplearning.com/

http://d2l.ai/index.html https://www.kdnuggets.com/2020/09/best-free-data-science-ebooks-2020-update.html

https://www.youtube.com/playlist?app=desktop&list=PLypiXJdtIca5ElZMWHl4HMeyle2AzUgVB https://mit6874.github.io/

76.Code for Research Papers-https://chrome.google.com/webstore/detail/find-code-for-research-pa/aikkeehnlfpamidigaffhfmgbkdeheil

77.Natural Language Processing 365- https://ryanong.co.uk/natural-language-processing-365/

78.Top Computer Vision Google Colab Notebooks- https://www.qblocks.cloud/creators/computer-vision-google-colab-notebooks

79.For practice -https://www.confetti.ai/exams

80.Yellowbrick-https://towardsdatascience.com/introduction-to-yellowbrick-a-python-library-to-explain-the-prediction-of-your-machine-learning-d63ecee10ecc

81.Mathematics of Machine Learning,deep learning-https://towardsdatascience.com/the-mathematics-of-machine-learning-894f046c568

https://github.com/hrnbot/Basic-Mathematics-for-Machine-Learning

https://towardsdatascience.com/the-roadmap-of-mathematics-for-deep-learning-357b3db8569b

https://medium.com/towards-artificial-intelligence/basic-linear-algebra-for-deep-learning-and-machine-learning-ml-python-tutorial-444e23db3e9e

https://www.kdnuggets.com/2020/02/free-mathematics-courses-data-science-machine-learning.html

https://towardsai.net/p/data-science/how-much-math-do-i-need-in-data-science-d05d83f8cb19

https://www.mltut.com/how-to-learn-math-for-machine-learning-step-by-step-guide/

https://stanford.edu/~shervine/teaching/cs-229/cheatsheet-machine-learning-tips-and-tricks#

https://www.datasciencecentral.com/profiles/blogs/free-online-book-machine-learning-from-scratch

https://hadrienj.github.io/posts/Essential-Math-for-Data-Science-Introduction_to_matrices_and_matrix_product/?utm_source=linkedin&utm_medium=social&utm_campaign=linkedin_matrices

https://www.youtube.com/playlist?list=PLRDl2inPrWQW1QSWhBU0ki-jq_uElkh2a https://github.com/jonkrohn/ML-foundations

https://ocw.mit.edu/resources/res-18-001-calculus-online-textbook-spring-2005/textbook/

82.Googleai-https://ai.google/education

83.ELI5 is a Python package which helps to debug machine learning classifiers and explain their predictions

PyBrain is a modular Machine Learning Library for Python

84.Best Online Courses for Machine Learning and Data Science-https://www.mltut.com/best-online-courses-for-machine-learning-and-data-science/

Comprehensive Project Based Data Science Curriculum https://julienbeaulieu.github.io/2019/09/25/comprehensive-project-based-data-science-curriculum/

AI Expert Roadmap-https://i.am.ai/roadmap/#data-science-roadmap

85.FastAPI-https://fastapi.tiangolo.com/deployment/deta/

86.Yann LeCun's Deep Learning Course at CDS-https://cds.nyu.edu/deep-learning/ https://atcold.github.io/pytorch-Deep-Learning/

https://www.cs.cmu.edu/~ninamf/courses/601sp15/lectures.shtml

87.Four Important Computer Vision Annotation Tools https://heartbeat.fritz.ai/4-important-computer-vision-annotation-tools-you-need-to-know-in-2020-9f964931ed7

88.Python Data Science Handbook https://jakevdp.github.io/PythonDataScienceHandbook/

89.for low code object detection (detecto)- https://github.com/alankbi/detecto

90.1 line for hundreds of NLP models and algorithms- https://github.com/JohnSnowLabs/nlu

91.AudioFeaturizer for working with audio data- https://pypi.org/project/AudioFeaturizer/

librosa library https://librosa.org/doc/latest/index.html

MAGENTA-https://magenta.tensorflow.org/

92.Palladium-https://palladium.readthedocs.io/en/latest/

93.KNIME-https://www.knime.com/

94.Facebook Open Sourced New Frameworks to Advance Deep Learning Research https://www.kdnuggets.com/2020/11/facebook-open-source-frameworks-advance-deep-learning-research.html

95.Software Engineering for Machine Learning https://github.com/SE-ML/awesome-seml

96.Atlas web-based dashboard -https://www.atlas.dessa.com/

97.Pytest (test code) https://docs.pytest.org/en/latest/index.html
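
As a rough illustration of what "test code" means here, the sketch below shows the shape of a pytest-style test file. The `normalize` helper is hypothetical and exists only for this example; the part that comes from pytest itself is the convention that functions named `test_*` are collected and that a failing plain `assert` marks the test as failed.

```python
# Hypothetical helper under test; the name and behaviour are illustrative only.
def normalize(values):
    """Scale a list of numbers to the 0-1 range (min-max scaling)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Avoid division by zero when every value is identical.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# pytest collects functions whose names start with "test_" and
# reports a failing `assert` as a test failure.
def test_normalize_range():
    assert normalize([2, 4, 6]) == [0.0, 0.5, 1.0]


def test_normalize_constant_input():
    assert normalize([3, 3, 3]) == [0.0, 0.0, 0.0]
```

Saved as, say, `test_normalize.py`, this file can be run by calling `pytest` in the same directory.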

98.keras- https://keras.io/ https://keras.io/api/ https://keras.io/examples/

99.High-Performance Jupyter Notebook - BlazingSQL Notebooks https://blazingsql.com/notebooks

jupyter-tabnine https://github.com/wenmin-wu/jupyter-tabnine

100.CV-pretrained-model- https://github.com/balavenkatesh3322/CV-pretrained-model

101.Kubeflow Machine Learning Toolkit for Kubernetes https://www.kubeflow.org/

102.Daily AI updates to your inbox- https://sago-ai.news/#/

103.Three API styles - Sequential Model,functional API,Model subclassing

104.Deep Learning Toolkit for Medical Image Analysis -https://github.com/DLTK/DLTK

106.Interpret The ML Model

lime(explain black box models)- https://lime-ml.readthedocs.io/en/latest/

SHAP https://medium.com/towards-artificial-intelligence/explain-your-machine-learning-predictions-with-kernel-shap-kernel-explainer-fed56b9250b8

https://github.com/slundberg/shap

Shapash makes Machine Learning models transparent and understandable by everyone https://github.com/MAIF/shapash

interpret https://github.com/interpretml/interpret

Captum Model Interpretability for PyTorch https://captum.ai/ https://github.com/pytorch/captum

ecco https://github.com/jalammar/ecco https://jalammar.github.io/explaining-transformers/ https://www.eccox.io/

dalex https://pypi.org/project/dalex/ https://blog.learningdollars.com/2021/01/02/ai-in-medical-diagnosis/ https://www.kdnuggets.com/2020/11/dalex-explain-tensorflow-model.html

google AI Explanations for AI Platform https://cloud.google.com/ai-platform/prediction/docs/ai-explanations/overview?utm_source=youtube&utm_medium=Unpaidsocial&utm_campaign=guo-20200423-Intro-Aiexp

eli5 https://eli5.readthedocs.io/en/latest/

TabNet: Attentive Interpretable Tabular Learning https://github.com/dreamquark-ai/tabnet

skater https://oracle.github.io/Skater/

what if tool https://pair-code.github.io/what-if-tool/ https://pair-code.github.io/what-if-tool/demos/uci.html

DeepLIFT https://github.com/kundajelab/deeplift

Arena https://medium.com/responsibleml/python-has-now-the-new-way-of-exploring-xai-explanations-4248846426cf

tabnet https://cloud.google.com/blog/products/ai-machine-learning/ml-model-tabnet-is-easy-to-use-on-cloud-ai-platform

explainerdashboard https://towardsdatascience.com/the-quickest-way-to-build-dashboards-for-machine-learning-models-ec769825070d

Responsible AI-https://www.tensorflow.org/resources/responsible-ai

fairlearn https://github.com/fairlearn/fairlearn

Google Facets https://pair-code.github.io/facets/

Google's Model Card Toolkit

Opening the AI Black Box -https://zetane.com/gallery

AI Explainability 360 Toolkit from IBM Research https://aix360.mybluemix.net/

onnx https://github.com/onnx/onnx

torch-dreams https://github.com/Mayukhdeb/torch-dreams

https://github.com/jphall663/awesome-machine-learning-interpretability

https://analyticsindiamag.com/8-explainable-ai-frameworks-driving-a-new-paradigm-for-transparency-in-ai/

https://christophm.github.io/interpretable-ml-book/ https://github.com/christophM/interpretable-ml-book

https://www.kdnuggets.com/2018/12/machine-learning-explainability-interpretability-ai.html

Fairness

How to easily check if your Machine Learning model is fair (dalex) https://www.kdnuggets.com/2020/12/machine-learning-model-fair.html

LinkedIn Fairness Toolkit,Fairlearn,AI Fairness 360,scikit-fairness,Algofairness,Aequitas,CERTIFAI,ML-fairness-gym
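
To make "checking whether a model is fair" a little more concrete, here is a minimal plain-Python sketch of one common group-fairness measure, the demographic parity difference (the gap in positive-prediction rate between groups). It does not use the API of any toolkit listed above; the function names and the toy data are assumptions made up for this example.

```python
def selection_rate(predictions):
    """Fraction of predictions that are positive (equal to 1)."""
    return sum(predictions) / len(predictions)


def demographic_parity_difference(predictions, groups):
    """Return the largest gap in selection rate between groups, plus the per-group rates."""
    by_group = {}
    for pred, group in zip(predictions, groups):
        by_group.setdefault(group, []).append(pred)
    rates = {g: selection_rate(p) for g, p in by_group.items()}
    return max(rates.values()) - min(rates.values()), rates


# Made-up binary predictions for two hypothetical groups "a" and "b".
preds = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

gap, rates = demographic_parity_difference(preds, groups)
print(rates)  # selection rate per group
print(gap)    # 0 would mean both groups receive positive predictions at the same rate
```

A gap near zero only says the positive-prediction rates match; the toolkits above cover further metrics and mitigation methods.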

107.deep-learning-drizzle -https://deep-learning-drizzle.github.io/

108.Machine Learning University - https://aws.amazon.com/machine-learning/mlu/

109.mlflow https://mlflow.org/ An open source platform for the machine learning lifecycle

https://www.kdnuggets.com/2021/01/5-tools-effortless-data-science.html

https://neptune.ai/

https://azure.microsoft.com/en-us/services/machine-learning/

https://github.com/VertaAI/modeldb

110.Data Preparation / ETL https://airflow.apache.org/ https://intake.readthedocs.io/en/latest/

111.fairlearn https://github.com/fairlearn/fairlearn/blob/master/README.md Evaluating fairness of AI/ML models and training data and for mitigating bias in models determined to be unfair.

AI Fairness 360 evaluating fairness of AI/ML models and training data and mitigating bias in current models https://aif360.mybluemix.net/

An ethics checklist for data scientists https://deon.drivendata.org/

112.MONAI Framework For Medical Imaging Research https://analyticsindiamag.com/monai-datatsets-managers/

torchio https://github.com/fepegar/torchio https://analyticsindiamag.com/torchio-3d-medical-imaging/

MolBert: Molecular Representation learning with AI

medicalAI https://github.com/aibharata/medicalAI

Biopython is a set of freely available tools https://github.com/biopython/biopython

DeepIPW https://github.com/ruoqi-liu/DeepIPW

113.OpenVINO https://opencv.org/openvino-model-optimization/ https://opencv.org/how-to-speed-up-deep-learning-inference-using-openvino-toolkit-2/

114.MLOPS https://www.analyticsinsight.net/top-mlops-based-tools-for-enabling-effective-machine-learning-lifecycle/

https://mlops.githubapp.com/

115.Code faster https://www.tabnine.com/

116.Pytest for Data Scientists https://towardsdatascience.com/4-lessor-known-yet-awesome-tips-for-pytest-2117d8a62d9c

117.mlflow https://mlflow.org/docs/latest/index.html

MLOps https://github.com/microsoft/MLOps

DevOps https://github.com/collections/devops-tools

airflow https://github.com/apache/airflow

kubeflow https://github.com/kubeflow/kubeflow

kubernetes https://github.com/kubernetes/kubernetes

pipeline https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html
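
A minimal sketch of what the scikit-learn `Pipeline` linked above does: it chains preprocessing steps and a final estimator so that `fit` and `predict` run them in order. The step names (`"scale"`, `"clf"`) and the toy data are assumptions chosen for this illustration, and scikit-learn must be installed.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Toy, made-up data: one feature, two well-separated classes.
X_train = [[0.0], [1.0], [2.0], [3.0], [10.0], [11.0], [12.0], [13.0]]
y_train = [0, 0, 0, 0, 1, 1, 1, 1]

# Each step is a (name, estimator) pair; every step but the last must be a transformer.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression()),
])

# fit() fits the scaler, transforms the data, then fits the classifier on the result.
pipe.fit(X_train, y_train)

# predict() applies the already-fitted scaler before calling the classifier.
predictions = pipe.predict([[1.5], [11.5]])
print(predictions)
```

Keeping the scaler inside the pipeline means it is fitted only on the training data, which helps avoid leaking test-set statistics during cross-validation.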

118.algorithm to use by problem https://www.datasciencecentral.com/profiles/blogs/which-machine-learning-deep-learning-algorithm-to-use-by-problem

119.Connect the world to your data and fuel your ML.

OpenBlender: enrich ML models by adding new variables from any source to boost performance https://www.youtube.com/channel/UCCFN8DDrA6k7eHYLvZGdNVA https://openblender.io/

120.Google's MuRIL (Multilingual Representations for Indian Languages) https://tfhub.dev/google/MuRIL/1

121.mxnet https://mxnet.apache.org/versions/master/api/python/docs/tutorials/getting-started/crash-course/index.html

122.tools-https://towardsdatascience.com/data-science-tools-f16ecd91c95d

123.Elements of AI free online course https://www.elementsofai.com/

124.Best_AI_paper_2020 https://github.com/louisfb01/Best_AI_paper_2020

125.roadmap https://github.com/graykode/nlp-roadmap

https://www.freecodecamp.org/news/data-science-learning-roadmap/

https://mohammedazeem665.medium.com/plan-to-learn-machine-learning-data-science-in-2021-note-these-assets-from-2020-e84389d94097

https://github.com/AMAI-GmbH/AI-Expert-Roadmap

data-engineer-roadmap https://github.com/datastacktv/data-engineer-roadmap

126.https://neptune.ai/blog/best-data-science-tools-to-increase-machine-learning-model-understanding?utm_source=twitter&utm_medium=tweet&utm_campaign=blog-best-data-science-tools-to-increase-machine-learning-model-understanding

Visualizing the Execution of Python Program http://pythontutor.com/ https://www.youtube.com/watch?v=pCSlWQjfCzA

MLPerf Model performance debugging tools https://mlperf.org/

Model debugging tools Manifold https://eng.uber.com/manifold/

Icecream https://towardsdatascience.com/stop-using-print-to-debug-in-python-use-icecream-instead-79e17b963fcc

Experiment tracking tools WandB https://wandb.ai/site

Comet manage and organize machine learning experiments https://www.comet.ml/site/

MLflow Open-source platform for tracking machine learning experiments https://mlflow.org/

neptune https://neptune.ai/

weights & biases https://wandb.ai/site

127.19 Best JupyterLab Extensions for Machine Learning https://neptune.ai/blog/jupyterlab-extensions-for-machine-learning

128.coreml https://developer.apple.com/machine-learning/core-ml/

129.Protect Your Neural Networks Against Hacking Adversarial Robustness Toolbox (ART) https://analyticsindiamag.com/adversarial-robustness-toolbox-art/

130.https://www.kdnuggets.com/2021/01/10-underappreciated-python-packages-machine-learning-practitioners.html

131.datascience-fails https://github.com/xLaszlo/datascience-fails

132.Jupyter notebook integration for Microsoft Excel https://github.com/pyxll/pyxll-jupyter https://towardsdatascience.com/python-jupyter-notebooks-in-excel-5ab34fc6439

Voilà turns Jupyter notebooks into standalone web applications https://github.com/voila-dashboards/voila https://github.com/voila-dashboards/voila-gridstack

How to Optimize Your Jupyter Notebook https://www.kdnuggets.com/2020/01/optimize-jupyter-notebook.html

TabNet: Attentive Interpretable Tabular Learning https://github.com/dreamquark-ai/tabnet

133.rapidly develop data applications with Python https://github.com/dstackai/dstack

134.Google Research: Looking Back at 2020, and Forward to 2021 https://ai.googleblog.com/2021/01/google-research-looking-back-at-2020.html

135.cortex Run inference at scale https://www.cortex.dev/ https://github.com/cortexlabs/cortex

Follow leaders in the field to stay up to date

1.Linkedin

2.Twitter

CPU/GPU/TPU

1.Google Colab (FREE)

2.Kaggle kernel(read terms and conditions before use) (FREE)

3.Paperspace Gradient(read terms and conditions before use)

4.knime - https://www.knime.com/ (read terms and conditions before use)

5.RapidMiner (read terms and conditions before use)

https://github.com/zszazi/Deep-learning-in-cloud

So what next ?

Participate in online competitions, build projects, apply for internships and jobs, solve real-world problems, etc...

Applications of data science across industries

1.E-commerce- Identifying consumers,Recommending Products,Analyzing Reviews

2.Manufacturing- Predicting potential problems,Monitoring systems,Automating manufacturing units, Maintenance Scheduling,Anomaly Detection

3.Banking- Fraud detection,Credit risk modeling,Customer lifetime value

4.Healthcare- Medical image analysis, Drug discovery,Bioinformatics,Virtual Assistants,image segmentation

5.Transport- Self-driving cars,Enhanced driving experience,Car monitoring system,Enhancing the safety of passengers

6.Finance- Customer segmentation,Strategic decision making,Algorithmic trading,Risk analytics

7.Marketing (Added from comments Credits: Jawad Ali)- LTV predictions,Predictive analytics for customer behavior,Ad targeting

and many more fields - https://www.topbots.com/enterprise-ai-companies-2020/ , https://venturebeat.com/2020/10/21/the-2020-data-and-ai-landscape/

Research blogs

1.https://ai.facebook.com/ https://ai.facebook.com/blog/

2.https://ai.googleblog.com/

3.https://deepmind.com/blog https://deepai.org/definitions

4.https://openai.com/blog/

5.https://www.malongtech.com/en/research.html

6.https://blogs.nvidia.com/blog/tag/artificial-intelligence/

https://ai.googleblog.com/2021/01/google-research-looking-back-at-2020.html?m=1

7.https://blog.tensorflow.org/

8.https://pytorch.org/blog/

kdnuggets.com

https://www.kdnuggets.com/2020/01/top-10-ai-ml-articles-to-know.html

RESEARCH LABS IN THE WORLD

https://ai.facebook.com/ https://ai.googleblog.com/ https://research.google/ https://ai.google/research/

1.The Alan Turing Institute:https://www.turing.ac.uk/

2.J.P. Morgan AI Research Lab:https://www.jpmorgan.com/insights/tec...

3.Oxford ML Research Group:http://www.robots.ox.ac.uk/~parg/proj...

4.Microsoft Research Lab- AI:https://www.microsoft.com/en-us/resea...

5.Berkeley AI Research:https://bair.berkeley.edu/

6.LIVIA:https://en.etsmtl.ca/Unites-de-recher...

7.MIT Computer Science and Artificial Intelligence Laboratory (CSAIL):https://www.csail.mit.edu/

online competitions:

1.Kaggle-https://www.kaggle.com/

2.hackerearth-https://www.hackerearth.com/challenges/

3.machinehack-https://www.machinehack.com/

4.analyticsvidhya-https://datahack.analyticsvidhya.com/contest/all/

5.zindi-https://zindi.africa/competitions

6.crowdai-https://www.crowdai.org/

7.driven data-https://www.drivendata.org/

8.dockship-https://dockship.io/

9.SIGNATE Competition- https://signate.jp/about?rf=competition_about

10.International Data Analysis Olympiad (IDAO)

11.Codalab

12.Iron Viz

13.Data Science Challenges

14.Tianchi Big Data Competition

15.https://www.techgig.com/hackathon/ml_hackathon

16.https://www.openml.org/

https://towardsdatascience.com/12-data-science-ai-competitions-to-advance-your-skills-in-2021-32e3fcb95d8c

Some useful content :

  1. H2O.ai AutoML, Google AutoML, Google ML Kit (https://developers.google.com/ml-kit), Azure Cognitive Services, Azure Machine Learning Service, Amazon ML, Azure Machine Learning Studio, Google Cloud Platform, GCP AutoML Vision, Weka, Microsoft Cognitive Toolkit, Google Cloud AutoML, DataRobot AutoML, Databricks AutoML, IBM Watson ML Studio, AWS SageMaker Studio, AWS Rekognition, Google AI Platform, Databricks, Domino Data Lab, Roboflow

https://docs.microsoft.com/en-us/azure/machine-learning/algorithm-cheat-sheet

https://neptune.ai/blog/best-machine-learning-as-a-service-platforms-mlaas?utm_source=twitter&utm_medium=tweet&utm_campaign=blog-best-machine-learning-as-a-service-platforms-mlaas

https://codegnan.com/blog/35-best-data-sciecne-tools-for-beginners-to-master/

mlkit-https://firebase.google.com/products/ml

  1. Tpot

auto_ml https://github.com/ClimbsRocks/auto_ml

  1. autopandas

  2. AutoGluon https://analyticsindiamag.com/how-to-automate-machine-learning-tasks-using-autogluon/

AutoGL: The First Ever AutoML Framework for Graph Datasets https://analyticsindiamag.com/meet-autogl-the-first-ever-automl-framework-for-graph-datasets/

  1. autosklearn,autokeras,LightAutoML (https://github.com/sberbank-ai-lab/LightAutoML)

AutoNeuro https://autoneuro.challenge-ineuron.in/

  1. autoviml

    GML (automate most of the data science process) https://github.com/Muhammad4hmed/GML

    CodeLess https://pypi.org/project/codeless/ https://github.com/porky5191/codeless_demo_project

  2. autoViz

  3. hyperopt

  4. sweetviz (EDA purpose) - https://pypi.org/project/sweetviz/

  5. pandasprofiling(display whole EDA) - https://pypi.org/project/pandas-profiling/ https://pandas-profiling.github.io/pandas-profiling/docs/master/rtd/index.html

  6. autokeras,AutoSklearn,Neural Network Intelligence

    FeatureTools automated feature engineering.

    MLBox,Lightwood,mindsdb(machine learning models using SQL queries),mljar-supervised,Ludwig(deep learning models without the need to write code)

    AdaNet is a lightweight TensorFlow-based framework

  7. pycaret- https://pycaret.org/

Machine Learning in Power BI using PyCaret https://www.kdnuggets.com/2020/05/machine-learning-power-bi-pycaret.html

https://towardsdatascience.com/build-your-first-anomaly-detector-in-power-bi-using-pycaret-2b41b363244e

mindsdb Machine Learning in 5 Lines of Code https://mindsdb.com/

automated feature engineering https://github.com/alteryx/featuretools

AutoML toolkit https://github.com/microsoft/nni

mljar-supervised Automates Machine Learning Pipeline with Feature Engineering and Hyper-Parameters Tuning https://github.com/mljar/mljar-supervised

MLBox is a powerful Automated Machine Learning python library https://github.com/AxeldeRomblay/MLBox

12.Auto_Timeseries by auto_ts

13.AutoNLP_Sentiment_Analysis by autoviml

14.automl lazypredict https://github.com/shankarpandala/lazypredict

AutoML Toolkit for Graph Datasets & Tasks AutoGL(Auto Graph Learning)https://medium.com/syncedreview/tsinghua-university-releases-first-automl-toolkit-for-graph-datasets-tasks-c61ea0261d78

AutoFeat-https://analyticsindiamag.com/guide-to-automatic-feature-engineering-using-autofeat/

15.bamboolib or pandas-ui or pandas-summary or pandas_visual_analysis or Dtale(get code also) (python package for easy data exploration & transformation)

Automating EDA using Pandas Profiling, Sweetviz, AutoViz, DataPrep, vaex, Datapane, PandasGUI, Datatable, Dora, Pywedge, D-Tale, Lux, Dabl, Pretty pandas, AWS Glue DataBrew, speedML, edaviz, Altair, Voyager, Mito, Facets, KNIME

explainerdashboard https://towardsdatascience.com/the-quickest-way-to-build-dashboards-for-machine-learning-models-ec769825070d

Facets https://github.com/PAIR-code/facets https://towardsdatascience.com/visualize-your-data-with-facets-d11b085409bc

https://github.com/mstaniak/autoEDA-resources

ExploriPy import EDA-https://analyticsindiamag.com/hands-on-tutorial-on-exploripy-effortless-target-based-eda-tool/

Lens- Statistical Analysis of Data https://analyticsindiamag.com/hands-on-tutorial-on-lens-python-tool-for-swift-statistical-analysis/

Dashboard in Less Than 10 Lines of Code https://towardsdatascience.com/build-dashboards-in-less-than-10-lines-of-code-835e9abeae4b

MitoSheets https://analyticsindiamag.com/guide-to-mitosheets-harnessing-power-of-spreadsheets-in-python/

Datacleaner-https://analyticsindiamag.com/tutorial-on-datacleaner-python-tool-to-speed-up-data-cleaning-process/

Datacleaner: Dora; Voilà - turns Jupyter Notebooks quickly into standalone web applications; Plotly Dash - for more advanced, production-level dashboards

featurewiz(Select the best features from your data set fast with a single line of code) - https://github.com/AutoViML/featurewiz

explainerdashboard https://medium.com/analytics-vidhya/explainer-dashboard-build-interactive-dashboards-for-machine-learning-models-fda63e0eab9

Panel - web apps

Automating report generation with Jupyter Notebooks https://medium.com/applied-data-science/full-stack-data-scientist-5-automating-report-generation-with-jupyter-notebooks-919e32e88d18

Datapane ( Build Interactive Reports) https://towardsdatascience.com/introduction-to-datapane-a-python-library-to-build-interactive-reports-4593fd3cb9c8

pomegranate probabilistic modelling in Python https://github.com/jmschrei/pomegranate https://www.kdnuggets.com/2020/12/fast-intuitive-statistical-modeling-pomegranate.html

16.CuPy (NumPy-compatible array processing on the GPU) https://pypi.org/project/cupy/

17.Dabl-automate the known 80% of Data Science which is data preprocessing, data cleaning, and feature engineering https://pypi.org/project/dabl/

18.Dask (parallel computation) https://docs.dask.org/en/latest/ https://medium.com/rapids-ai/reading-larger-than-memory-csvs-with-rapids-and-dask-e6e27dfa6c0f#cid=av01_so-nvsh_en-us

thundergbm Fast GBDTs and Random Forests on GPUs https://github.com/Xtra-Computing/thundergbm

thundersvm https://github.com/Xtra-Computing/thundersvm

pandas chunksize,Modin , Vaex , Dask,cuDF,mars,ray,rapids,joblib,snorkel https://www.youtube.com/watch?v=eJyjB3cNIB0&feature=youtu.be
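
A minimal sketch of the pandas `chunksize` option listed above: passing `chunksize` to `pd.read_csv` yields an iterator of DataFrames, so an aggregate can be built up without loading the whole file at once. The column names and the in-memory CSV are made up for illustration; a real file path would replace the `StringIO` buffer.

```python
import io

import pandas as pd

# Hypothetical data standing in for a file too large to load at once.
csv_text = "city,sales\nA,10\nB,20\nA,30\nB,40\nA,50\n"


def total_sales_by_city(source, chunksize=2):
    """Sum the 'sales' column per 'city', reading `chunksize` rows at a time."""
    totals = {}
    # With chunksize set, read_csv returns an iterator of DataFrames.
    for chunk in pd.read_csv(source, chunksize=chunksize):
        for city, subtotal in chunk.groupby("city")["sales"].sum().items():
            totals[city] = totals.get(city, 0) + int(subtotal)
    return totals


result = total_sales_by_city(io.StringIO(csv_text))
print(result)
```

Only one chunk is held in memory at a time, so the chunk size can be tuned to the available RAM.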

19.dataprep (Understand your data with a few lines of code in seconds)

data-preparation-tools - https://improvado.io/blog/data-preparation-tools

20.Dora library is another data analysis library designed to simplify exploratory data analysis. https://pypi.org/project/Dora/

21.FastAPI is a modern, fast (high-performance), web framework for building APIs. https://fastapi.tiangolo.com/

22.faster Hyper Parameter Tuning(sklearn-nature-inspired-algorithms) https://pypi.org/project/sklearn-nature-inspired-algorithms/

23.FlashText (A library faster than Regular Expressions for NLP tasks) https://pypi.org/project/flashtext/
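
To illustrate why a keyword matcher of this kind can beat a large regular expression, here is a simplified word-level trie in plain Python. It is not FlashText's own code or API, only a sketch of the idea: the lookup cost depends on the length of the text rather than on how many keywords are registered. The keyword names are invented for the example.

```python
END = object()  # sentinel key marking the end of a keyword in the trie


def build_trie(keywords):
    """Store each lowercased keyword phrase word by word in nested dicts."""
    trie = {}
    for phrase, clean_name in keywords.items():
        node = trie
        for word in phrase.lower().split():
            node = node.setdefault(word, {})
        node[END] = clean_name
    return trie


def extract_keywords(trie, text):
    """Return the clean names of the longest keyword matches, in text order."""
    words = [w.strip(".,!?;:").lower() for w in text.split()]
    found, i = [], 0
    while i < len(words):
        node, j, match, match_end = trie, i, None, i
        # Walk the trie as far as the following words allow.
        while j < len(words) and words[j] in node:
            node = node[words[j]]
            j += 1
            if END in node:
                match, match_end = node[END], j
        if match is None:
            i += 1
        else:
            found.append(match)
            i = match_end
    return found


trie = build_trie({"big apple": "New York", "bay area": "Bay Area", "big": "Large"})
print(extract_keywords(trie, "I love Big Apple and the Bay Area."))
# ['New York', 'Bay Area']
```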

24.Guietta (tool that makes simple GUIs simple) https://pypi.org/project/guietta/

pandas-visual-analysis -https://analyticsindiamag.com/hands-on-guide-to-pandas-visual-analysis-way-to-speed-up-data-visualization/

25.Hummingbird (compiles trained traditional machine learning models into tensor computations for faster inference) https://pypi.org/project/Hummingbird/ https://analyticsindiamag.com/guide-to-hummingbird-a-microsofts-library-for-expediting-traditional-machine-learning-models/

CUML- increase the speed of training your machine learning model https://towardsdatascience.com/train-your-machine-learning-model-150x-faster-with-cuml-69d0768a047a

https://docs.rapids.ai/api/cuml/stable/

26.memory-profiler (tell memory consumption line by line) https://pypi.org/project/memory-profiler/

Cython A Speed-Up Tool for your Python Function https://towardsdatascience.com/cython-a-speed-up-tool-for-your-python-function-9bab64364bfd

Python Tricks for Keeping Track of Your Data https://towardsdatascience.com/python-tricks-for-keeping-track-of-your-data-aef3dc817a4e

27.numexpr (increases the execution speed of NumPy expressions) https://github.com/pydata/numexpr

pypolars instead of pandas (beating-pandas-performance) https://www.youtube.com/watch?v=1-O_KnLZEso

50X speed up your Pandas apply function https://github.com/jmcarpenter2/swifter

JAX Autograd and XLA, facilitating high-performance machine learning research https://github.com/google/jax

Numba (high-performance JIT compiler for Python and NumPy code) http://numba.pydata.org/

28.pandarallel (simple and efficient tool to parallelize your pandas computation on all your CPUs) https://pypi.org/project/pandarallel/

29.PDFTableExtract(by PyPDF2) https://github.com/ashima/pdf-table-extract

Camelot-https://towardsdatascience.com/extracting-tabular-data-from-pdfs-made-easy-with-camelot-80c13967cc88

30.PyImpuyte(Python package that simplifies the task of imputing missing values in big datasets) https://pypi.org/project/PyImpuyte/

31.libra(Automates the end-to-end machine learning process in just one line of code) https://pypi.org/project/libra/

32.Debug code with python -m pdb -c continue <script.py>
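
Running a script as `python -m pdb -c continue <script.py>` lets it run until an uncaught exception and then drops into post-mortem debugging at the failing line. The sketch below shows the same idea from inside a program with the standard-library `pdb.post_mortem`; the `average` function and the `run_with_postmortem` helper are hypothetical names used only for this example.

```python
import pdb
import sys


def average(values):
    # Deliberate bug: an empty list raises ZeroDivisionError.
    return sum(values) / len(values)


def run_with_postmortem(func, *args, interactive=False):
    """Call func(*args); on an exception, optionally open pdb at the failing frame."""
    try:
        return func(*args)
    except Exception:
        if interactive:
            # Opens the debugger on the traceback of the exception being handled.
            pdb.post_mortem(sys.exc_info()[2])
        raise


print(run_with_postmortem(average, [2, 4]))
```

Calling `run_with_postmortem(average, [], interactive=True)` in a terminal would stop inside `average`, where `values` can be inspected before the exception propagates.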

33.cURL (This is a useful tool for obtaining data from any server via a variety of protocols including HTTP.) https://stackabuse.com/using-curl-in-python-with-pycurl/

34.csvkit https://pypi.org/project/csvkit/

35.IPython gives access to an enhanced interactive Python shell.

36.Faker (pip install faker) - generate fake data to create your own dataset https://pypi.org/project/Faker/

37.Python debugger %pdb

38.voila - From notebooks to standalone web applications and dashboards https://voila.readthedocs.io/en/stable/ https://github.com/voila-dashboards/voila

39.tslearn for time series data https://github.com/tslearn-team/tslearn

40.texthero text-based dataset in Pandas Dataframe quickly and effortlessly https://github.com/jbesomi/texthero

41.kaleido (static image export for web-based visualization libraries, with zero dependencies) https://pypi.org/project/kaleido/

42.Vaex- Reading And Processing Huge Datasets in seconds https://github.com/vaexio/vaex

43.Uber's Ludwig is an Open Source Framework for Low-Code Machine Learning https://eng.uber.com/introducing-ludwig/

44.Google's TAPAS, a BERT-Based Model for Querying Tables Using Natural Language https://github.com/google-research/tapas

45.RAPIDS open GPU Data Science https://rapids.ai/

RAPIDS cuML

tick is a lightweight machine learning library https://x-datainitiative.github.io/tick/

PyBrain is a modular machine learning framework http://www.pybrain.org/docs/

Shogun is a machine learning framework that supports several programming languages, notably Python, R, Java, Scala, Ruby and Lua https://github.com/shogun-toolbox/shogun/

46.pyforest Lazy-import of all popular Python Data Science libraries. Stop writing the same imports over and over again. https://pypi.org/project/pyforest/0.1.1/

47.Modin Get faster Pandas with Modin https://github.com/modin-project/modin

48.Text2Code for Jupyter notebook - https://github.com/deepklarity/jupyter-text2code , https://towardsdatascience.com/data-analysis-made-easy-text2code-for-jupyter-notebook-5380e89bb493

49.Openrefine Tool-For Data Preprocessing Without Code https://analyticsindiamag.com/openrefine-tutorial-a-tool-for-data-preprocessing-without-code/

50.DeepSpeed, Microsoft's deep learning optimisation library - https://github.com/microsoft/DeepSpeed

https://analyticsindiamag.com/microsoft-releases-latest-version-of-deepspeed-its-python-library-for-deep-learning-optimisation/

51.4-pandas-tricks-https://towardsdatascience.com/4-pandas-tricks-that-most-people-dont-know-86a70a007993

52.tkinter to deploy machine learning model-https://analyticsindiamag.com/complete-tutorial-on-tkinter-to-deploy-machine-learning-model/

53.autoplotter is a python package for GUI based exploratory data analysis-https://github.com/ersaurabhverma/autoplotter

54.3 NLP Interpretability Tools For Debugging Language Models-https://www.topbots.com/nlp-interpretability-tools/

55.New Algorithm For Training Sparse Neural Networks (RigL)-https://analyticsindiamag.com/rigl-google-algorithm-neural-networks/

56.Read Data from pdf and Word-PyPDF2,PDFMiner,PDFQuery,tabula-py,pdflib for Python,PDFTables,PyFPDF2

OpenCV to Extract Information From Table Images-https://analyticsindiamag.com/how-to-use-opencv-to-extract-information-from-table-images/

57.Text Annotation-https://towardsdatascience.com/tortus-e4002d95134b

58.GDMix, A Framework That Trains Efficient Personalisation Models - https://analyticsindiamag.com/linkedin-open-sources-gdmix-a-framework-that-trains-efficient-personalisation-models/

59.Learn Machine Learning Concepts Interactively-https://towardsdatascience.com/learn-machine-learning-concepts-interactively-6c3f64518da2

60.Folium, Python Library For Geographical Data Visualization-https://analyticsindiamag.com/hands-on-tutorial-on-folium-python-library-for-geographical-data-visualization/

61.GPU Technology Conference (GTC) Keynote Oct 2020-https://www.youtube.com/watch?v=Dw4oet5f0dI&list=PLZHnYvH1qtOYOfzAj7JZFwqtabM5XPku1

62.jiant nlp task-https://github.com/nyu-mll/jiant

63.human-learn: draw (paint) your machine learning model-https://koaning.github.io/human-learn/

64.Vector AI-https://github.com/vector-ai/vectorai

65.NVIDIA NeMo(for Conversational AI)-https://github.com/NVIDIA/NeMo

66.Deep Learning Models Without Coding(DeepCognition)-https://analyticsindiamag.com/how-to-use-deepcognition-to-build-drag-and-drop-deep-learning-models-without-coding/

67.100 Machine Learning Projects-https://medium.com/@amankharwal/100-machine-learning-projects-aff22b22dd6e

68.Question generation using Natural Language Processing-https://github.com/ramsrigouthamg/Questgen.ai

69.PixelLib(image segmentation,Blur Background,Gray Background,Background Colour Change,Background Change)-https://github.com/ayoolaolafenwa/PixelLib

70.High-Resolution 3D Human Digitization-https://shunsukesaito.github.io/PIFuHD/

71.AI model that translates 100 languages without relying on English data - https://ai.facebook.com/blog/introducing-many-to-many-multilingual-machine-translation/

72.800 free textbooks - https://open.umn.edu/opentextbooks

73.TensorDash is an application that lets you remotely monitor your deep learning model's metrics and notifies you when your model training is completed or crashed.

https://github.com/CleanPegasus/TensorDash

HyperDash https://towardsdatascience.com/how-to-monitor-and-log-your-machine-learning-experiment-remotely-with-hyperdash-aa7106b15509

74.YellowBrick -select features, tune hyperparameters, select the best models, and understand the performance metrics.

75.Freely Available Python Books-https://rajukumarmishrablog.com/freely-available-python-books/

Collection of Python Cheat Sheets- https://rajukumarmishrablog.com/collection-of-python-cheat-sheets/

76.Add External Data to Your Pandas Dataframe - https://towardsdatascience.com/add-external-data-to-your-pandas-dataframe-with-a-one-liner-f060f80daaa4

https://www.openblender.io/#/welcome

77.visualize the model architecture-https://github.com/PerceptiLabs/PerceptiLabs

78.Train Conversational AI in 3 lines of code with NeMo and Lightning-https://towardsdatascience.com/train-conversational-ai-in-3-lines-of-code-with-nemo-and-lightning-a6088988ae37

79.Machine Learning for Healthcare by mit-https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-s897-machine-learning-for-healthcare-spring-2019/

80.pydot is an interface to Graphviz ,AutoGraph-Easy control flow for graphs,Neo4j-Graph Data Science Library,pyRDF2Vec-Representations of Entities in a Knowledge Graph,igraph,NetworkX,euler,pyvis

https://www.tensorflow.org/neural_structured_learning

AutoGL: The First Ever AutoML Framework for Graph Datasets https://analyticsindiamag.com/meet-autogl-the-first-ever-automl-framework-for-graph-datasets/

https://analyticsindiamag.com/complete-guide-to-autogl-the-latest-automl-framework-for-graph-datasets/

open-source project for analysis of graphs or networks GrasPy / graspologic https://graspy.neurodata.io/

https://www.kdnuggets.com/2019/05/60-useful-graph-visualization-libraries.html

81.HTML tables into Google Sheets -https://towardsdatascience.com/import-html-tables-into-google-sheets-effortlessly-f471eae58ac9

82.Gradio - take input from the user https://gradio.app/getting_started

83.Mito, an editable spreadsheet inside your Jupyter Notebook. - https://trymito.io/

84.Google Introduces Document AI (DocAI) https://www.marktechpost.com/2020/11/05/google-introduces-document-ai-docai-platform-for-automated-document-processing/

85.100 Machine Learning Projects-https://amankharwal.medium.com/100-machine-learning-projects-aff22b22dd6e

86.https://towardsdatascience.com/25-hot-new-data-tools-and-what-they-dont-do-31bf23bd8e56

87.Opacus: A high-speed library for training PyTorch models-https://ai.facebook.com/blog/introducing-opacus-a-high-speed-library-for-training-pytorch-models-with-differential-privacy

88.lazynlp https://github.com/chiphuyen/lazynlp

89.yfinance to get finance data

90.Pseudo-Labeling (deal with small datasets)https://towardsdatascience.com/pseudo-labeling-to-deal-with-small-datasets-what-why-how-fd6f903213af
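
A toy sketch of the pseudo-labeling idea, using only the standard library: fit a model on the small labeled set, label the unlabeled points it is confident about, then refit on the enlarged set. The nearest-centroid classifier on one-dimensional data, the margin-based confidence and the `min_margin` threshold are assumptions chosen to keep the example short; they are not taken from the linked article.

```python
from statistics import mean


def fit_centroids(points, labels):
    """Return {label: centroid} for 1-D points."""
    by_label = {}
    for x, y in zip(points, labels):
        by_label.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_label.items()}


def predict_with_margin(centroids, x):
    """Return (nearest label, margin over the runner-up). Needs at least 2 classes."""
    ranked = sorted(centroids, key=lambda y: abs(x - centroids[y]))
    best, second = ranked[0], ranked[1]
    margin = abs(x - centroids[second]) - abs(x - centroids[best])
    return best, margin


def pseudo_label(labeled_x, labeled_y, unlabeled_x, min_margin=2.0):
    """Add confidently predicted unlabeled points to the training set and refit."""
    centroids = fit_centroids(labeled_x, labeled_y)
    xs, ys = list(labeled_x), list(labeled_y)
    for x in unlabeled_x:
        label, margin = predict_with_margin(centroids, x)
        if margin >= min_margin:  # keep only confident pseudo-labels
            xs.append(x)
            ys.append(label)
    return fit_centroids(xs, ys), xs, ys


model, xs, ys = pseudo_label(
    [1.0, 2.0, 8.0, 9.0], ["a", "a", "b", "b"], [0.5, 5.1, 9.5]
)
print(len(xs), ys)
```

The ambiguous point `5.1` sits almost halfway between the two centroids, so it is left out instead of being given a possibly wrong label.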

91.Project List A (comparatively easy): Wine Quality Analysis, Boston Housing Prediction, Spam Email Classification, Survival Prediction - Titanic Disaster, Stock Market Prediction, Class of Flower Prediction, Bigmart Sales Prediction, Air Pollution Prediction, IMDB Prediction, Optimizing Product Price, Web Traffic Time Series Forecasting, Insurance Purchase Prediction, Tweet Classification

Project List B (comparatively difficult): Domain-Specific Chatbot, Fake News Detection, Human Action Recognition, Video Classification, Driver Drowsiness Detection, Medical Report Generation Using CT Scans, Sign Language Detection, Image Caption Generator, Celebrity Voice Prediction, Speech Emotion Recognition, Job Recommendation System, Interest Level in Rental Properties, Google Ads Keywords Generator

https://www.analyticsvidhya.com/blog/2018/05/24-ultimate-data-science-projects-to-boost-your-knowledge-and-skills/

https://ml-showcase.paperspace.com/ https://github.com/ashishpatel26/500-AI-Machine-learning-Deep-learning-Computer-vision-NLP-Projects-with-code

https://dev.to/hb/30-machine-learning-ai-data-science-project-ideas-gf5

https://medium.com/coders-camp/180-data-science-and-machine-learning-projects-with-python-6191bc7b9db9

https://www.analyticsvidhya.com/blog/2020/12/10-data-science-projects-for-beginners/?utm_source=linkedin&utm_medium=AV|link|high-performance-blog|blogs|44195|0.375

https://medium.com/the-innovation/130-machine-learning-projects-solved-and-explained-605d188fb392

https://thecleverprogrammer.com/machine-learning/ https://www.kdnuggets.com/2020/03/20-machine-learning-datasets-project-ideas.html

https://data-flair.training/blogs/machine-learning-datasets/# https://data-flair.training/blogs/machine-learning-project-ideas/

https://data-flair.training/blogs/artificial-intelligence-ai-tutorial/

https://data-flair.training/blogs/cartoonify-image-opencv-python/ https://data-flair.training/blogs/python-project-calorie-calculator-django/

https://www.theinsaneapp.com/2020/11/machine-learning-projects-with-source-codes.html https://www.theinsaneapp.com/2020/11/data-science-projects-with-source-code.html

https://amankharwal.medium.com/20-machine-learning-projects-on-future-prediction-with-python-93932d9a7f7f

https://medium.com/coders-camp/20-deep-learning-projects-with-python-3c56f7e6a721 https://amankharwal.medium.com/12-machine-learning-projects-on-object-detection-46b32adc3c37

https://amankharwal.medium.com/7-python-gui-projects-for-beginners-87ae2c695d78

https://amankharwal.medium.com/20-machine-learning-projects-for-portfolio-81e3dbd167b1 https://amankharwal.medium.com/4-chatbot-projects-with-python-5b32fd84af37

https://amankharwal.medium.com/30-python-projects-solved-and-explained-563fd7473003

https://www.aiquotient.app/projects https://www.aiquotient.app/ https://www.mltut.com/best-machine-learning-projects-for-beginners/

https://medium.com/coders-camp/20-machine-learning-projects-on-nlp-582effe73b9c

92.Visual Programming (Orange) https://orange.biolab.si/

93.The Linux Command Handbook-https://www.freecodecamp.org/news/the-linux-commands-handbook/

94.130 Machine Learning Projects Solved and Explained-https://medium.com/the-innovation/130-machine-learning-projects-solved-and-explained-605d188fb392

95.DataBrew-do drag-and-drop data cleansing

96.stratascratch- https://www.stratascratch.com/

97.5 ways to celebrate TensorFlow's 5th birthday-https://blog.google/technology/ai/5-ways-celebrate-tensorflows-5th-birthday/

98.TensorFlow.js: Machine Learning in Javascript https://blog.tensorflow.org/2018/03/introducing-tensorflowjs-machine-learning-javascript.html

99.Language Interpretability Tool open-source platform for visualization and understanding of NLP models - https://pair-code.github.io/lit/

100.Deep Learning Hardware Guide https://towardsdatascience.com/another-deep-learning-hardware-guide-73a4c35d3e86

101.johnsnowlabs- https://nlp.johnsnowlabs.com/ https://nlp.johnsnowlabs.com/docs/en/quickstart https://nlp.johnsnowlabs.com/docs/en/licensed_release_notes

103.Mito: edit a spreadsheet, generate Python https://trymito.io/?source=twitter1

104.Clarifai-https://www.clarifai.com/ https://analyticsindiamag.com/clarifai/

105.rapidly build and deploy machine learning models https://analyticsindiamag.com/top-10-datarobot-alternatives-one-must-know/

106.Hive Data full-stack AI https://thehive.ai/hive-data

107.real-time remote service to get the Keras callbacks to the telegram including the details of metrics https://github.com/ksdkamesh99/TensorGram

108.Language Interpretability Tool - https://pair-code.github.io/lit/demos/

109.Docly will handle the comments http://thedocly.io/

110.machine-learning-roadmap-2020 https://whimsical.com/machine-learning-roadmap-2020-CA7f3ykvXpnJ9Az32vYXva

111.Django models https://www.deploymachinelearning.com/#create-django-models https://www.deploymachinelearning.com/

112.freecodecamp - https://www.freecodecamp.org/learn

113.pytesseract image_to_string - extract text from images (OCR)

Extract Tables in PDFs to pandas DataFrames - tabula-py

114.NLP Pipelines in a single line of code https://medium.com/analytics-vidhya/nlp-pipelines-in-a-single-line-of-code-500b3266ac7b

115.Best and Worst Cases of Machine-Learning Models https://medium.com/towards-artificial-intelligence/best-and-worst-cases-of-machine-learning-models-part-1-36cdb9296611

https://www.youtube.com/watch?v=mlumJPFvooQ&list=PLZoTAELRMXVM0zN0cgJrfT6TK2ypCpQdY

116.aitextgen - AI text generation

117.http://introtodeeplearning.com/ http://cs231n.stanford.edu/ http://web.stanford.edu/class/cs224n/index.html#schedule https://www.youtube.com/playlist?list=PLkFD6_40KJIwhWJpGazJ9VSj9CFMkb79A https://www.youtube.com/playlist?list=PLwRJQ4m4UJjPiJP3691u-qWwPGVKzSlNP https://www.youtube.com/playlist?list=PLoROMvodv4rMC6zfYmnD7UG3LVvwaITY5

117.https://data-flair.training/blogs/data-science-tutorials-home

118.Integrating Tableau With Python https://analyticsindiamag.com/tabpy/

Qlib https://analyticsindiamag.com/qlib/

119.Pystiche - Create Your Artistic Image Using Pystiche https://analyticsindiamag.com/pystiche/ https://pystiche.readthedocs.io/en/latest/index.html

120.Low Light Image Enhancement using Python & Deep Learning https://github.com/soumik12345/MIRNet/ https://www.youtube.com/watch?v=b5Uz_c0JLMs

I will be very happy if this repository helps you. Thank you for reading.

                                                    HAPPY LEARNING
