GitHub - debajyotidatta/RecurrentArchitectures: See the accompanying blog post

Why this? What is the goal?

The goal of this repository is to write all the recurrent architectures from scratch in TensorFlow for learning purposes. This is a work in progress: I plan to implement more architectures and publish the results and performance for all of them. The inspiration was the last paragraph of the post Understanding LSTMs, where Chris Olah mentions two papers that studied recurrent architectures extensively. I wanted to implement all the architectures in these two papers. A quick Google search showed that Jim Flemming had already done half the work here, so I decided to implement the remaining architectures from Jozefowicz's paper. (I also updated parts of his code so that all the architectures work in the newest version of TensorFlow.) Both papers are fantastic and worth a read. Feel free to send me a pull request if you spot an error or find other papers with recurrent architecture variants; I will implement them as time permits. All the implementations are in TensorFlow (0.12).

Deep Learning Recurrent Architectures

LSTM Network Variants: This tutorial takes a very nice approach to creating variations of LSTM networks. It is a good way to learn how to code a new network architecture and, more importantly, a methodical way to understand the gates in an LSTM.
Empirical Exploration of Recurrent Network Architectures

I started this mainly because I wanted to learn the actual implementations of various recurrent neural network architectures and implement them from scratch, without using predefined LSTM, GRU, etc. cells. This repository is a direct fork of LSTM Network Variants, with code changes to run on the most recent version of TensorFlow (0.12.0 as of this writing). I will keep this repository up to date with new changes.

This repo also includes more network architectures from Empirical Exploration of Recurrent Network Architectures.

The implementations are not optimal. Production implementations of the LSTM, GRU and RNN cells concatenate the state and the input before multiplying, which reduces the number of matrix multiplications, whereas this code implements each network directly as you would see it in a textbook.
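To make the difference concrete, here is a minimal NumPy sketch (not code from this repository) that compares the textbook form with the concatenated form for a single pre-activation. The sizes and the names `W_x`, `W_h`, `b` are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size = 4, 3  # hypothetical sizes for illustration

x = rng.standard_normal(input_size)    # current input
h = rng.standard_normal(hidden_size)   # previous hidden state
W_x = rng.standard_normal((hidden_size, input_size))   # input weights
W_h = rng.standard_normal((hidden_size, hidden_size))  # recurrent weights
b = rng.standard_normal(hidden_size)

# Textbook form: one matrix multiplication for the input, one for the state.
textbook = W_x @ x + W_h @ h + b

# Concatenated form: stack the weights side by side and the vectors end to end,
# so a single matrix multiplication produces the same pre-activation.
W = np.concatenate([W_x, W_h], axis=1)  # shape (hidden_size, input_size + hidden_size)
xh = np.concatenate([x, h])             # shape (input_size + hidden_size,)
fused = W @ xh + b

same = np.allclose(textbook, fused)
print(same)
```

The same idea extends to gated cells: stacking the weight matrices of every gate lets one multiplication replace the separate per-gate products, at the cost of code that no longer mirrors the equations line by line.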

Recurrent Architectures Implemented

Architectures marked with (*) were implemented in LSTM Network Variants; the rest were implemented by me based on Empirical Exploration of Recurrent Network Architectures. The architectures that I implemented follow the conventions and syntax of that paper.

mut1 : Variant 1 from Empirical Exploration of Recurrent Network Architectures
mut2 : Variant 2 from Empirical Exploration of Recurrent Network Architectures
mut3 : Variant 3 from Empirical Exploration of Recurrent Network Architectures
vanillaRNN : Just a vanilla RNN Network
gru : Gated Recurrent Unit
cifg (*) : Coupled input-forget gate
fgr (*) : Full Gate Recurrence
lstm (*) : Long Short Term Memory
nfg (*) : No forget gate
niaf (*) : No input activation function
nig (*) : No input gate
noaf (*) : No output activation function
nog (*) : No output gate
np (*) : No peephole connections
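
As a rough illustration of what "from scratch" means for two of the simplest entries above, the following NumPy sketch writes out a single time step of a vanilla RNN and of a GRU. It is not the repository's TensorFlow code. The parameter names, sizes, and initialization are assumptions, and the GRU update uses one common convention (`z` weights the previous state); some references swap the roles of `z` and `1 - z`.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def rnn_step(x, h, p):
    # Vanilla RNN: h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h)
    return np.tanh(p["W_xh"] @ x + p["W_hh"] @ h + p["b_h"])

def gru_step(x, h, p):
    # Reset gate r and update gate z each see the input and the previous state.
    r = sigmoid(p["W_xr"] @ x + p["W_hr"] @ h + p["b_r"])
    z = sigmoid(p["W_xz"] @ x + p["W_hz"] @ h + p["b_z"])
    # Candidate state uses the reset-gated previous state.
    h_tilde = np.tanh(p["W_xh"] @ x + p["W_hh"] @ (r * h) + p["b_h"])
    # Blend the previous state and the candidate (assumed convention).
    return z * h + (1.0 - z) * h_tilde

def init_params(input_size, hidden_size, rng):
    # Hypothetical initialization: small random weights, zero biases.
    p = {}
    for gate in ("r", "z", "h"):
        p["W_x" + gate] = 0.1 * rng.standard_normal((hidden_size, input_size))
        p["W_h" + gate] = 0.1 * rng.standard_normal((hidden_size, hidden_size))
        p["b_" + gate] = np.zeros(hidden_size)
    return p

rng = np.random.default_rng(0)
params = init_params(input_size=4, hidden_size=3, rng=rng)

h_rnn = np.zeros(3)
h_gru = np.zeros(3)
for x in rng.standard_normal((5, 4)):  # a toy sequence of 5 inputs
    h_rnn = rnn_step(x, h_rnn, params)
    h_gru = gru_step(x, h_gru, params)

print(h_rnn.shape, h_gru.shape)
```

Each product here is kept separate, matching the textbook style of this repository; the other variants in the list differ mainly in which gates or activation functions are added, removed, or coupled.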

Instructions

See the Jupyter notebook here: https://github.com/debajyotidatta/RecurrentArchitectures/blob/master/Empirical%20Exploration%20of%20Recurrent%20Network%20Architectures.ipynb

