Stars
A collection of utilities for writing labeling functions, transformation functions, and slicing functions.
The Big List of Naughty Strings is a list of strings which have a high probability of causing issues when used as user-input data.
FastAPI framework, high performance, easy to learn, fast to code, ready for production
A collection of tutorials for Snorkel
Ultimate Plumber is a tool for writing Linux pipes with instant live preview.
Hyperparameter Experiments with TensorFlow and Keras
Snorkel MeTaL: A framework for training models with multi-task weak supervision
A library for efficient similarity search and clustering of dense vectors.
A library for Multilingual Unsupervised or Supervised word Embeddings
Open standard for machine learning interoperability
Caffe2 is a lightweight, modular, and scalable deep learning framework.
An open-source C++ library developed and used at Facebook.
An open-source NLP research library, built on PyTorch.
Learning to Compose Domain-Specific Transformations for Data Augmentation
Super simple fit method for PyTorch Modules
Programming exercises for the Stanford Unsupervised Feature Learning and Deep Learning Tutorial
Data and code behind the articles and graphics at FiveThirtyEight
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
Some notes on things I find interesting and important.
Collection of tools for building diachronic/historical word vectors
Tensors and dynamic neural networks in Python with strong GPU acceleration
MacroBase: A Search Engine for Fast Data
High-performance runtime for data analytics applications
A probabilistic programming language in TensorFlow. Deep generative models, variational inference.