Added a setup.py and made path to tables and models configurable #32

Open: wants to merge 8 commits into base: master
10 changes: 8 additions & 2 deletions README.md
@@ -14,6 +14,10 @@ This code is written in python. To use it you will need:
* [Keras](https://github.com/fchollet/keras) (for Semantic-Relatedness experiments only)
* [gensim](https://radimrehurek.com/gensim/) (for vocabulary expansion when training new models)

+Or you can navigate to the skip-thoughts folder and run:
+
+    python setup.py install
+
## Getting started

You will first need to download the model files and word embeddings. The embedding files (utable and btable) are quite large (>2GB) so make sure there is enough space available. The encoder vocabulary can be found in dictionary.txt.
@@ -28,12 +32,14 @@ You will first need to download the model files and word embeddi

NOTE to Toronto users: You should be able to run the code as is from any machine, without having to download.

-Once these are downloaded, open skipthoughts.py and set the paths to the above files (path_to_models and path_to_tables). Now you are ready to go. Make sure to set the THEANO_FLAGS device if you want to use CPU or GPU.
+Now you are ready to go. Make sure to set the THEANO_FLAGS device if you want to use CPU or GPU.

Open up IPython and run the following:

+    path_to_models = '/path/to/models/'
+    path_to_tables = '/path/to/tables/'
     import skipthoughts
-    model = skipthoughts.load_model()
+    model = skipthoughts.load_model(path_to_models, path_to_tables)

Now suppose you have a list of sentences X, where each entry is a string that you would like to encode. To get vectors, just run the following:

Empty file added __init__.py
Empty file.
17 changes: 17 additions & 0 deletions setup.py
@@ -0,0 +1,17 @@
+#!/usr/bin/env python
+import os
+from setuptools import setup, find_packages
+
+# Utility function to read the README file for long description
+def read(fname):
+    return open(os.path.join(os.path.dirname(__file__), fname)).read()
+
+setup(name='Skip Thoughts',
+      version='1.0',
+      description='Sent2Vec encoder and training code from the paper Skip-Thought Vectors.',
+      author='Ryan Kiros',
+      url='https://github.com/ryankiros/skip-thoughts',
+      long_description=read('README.md'),
+      packages=find_packages(exclude=['contrib', 'docs', 'tests']),
+      install_requires=['theano','keras','nltk','scikit-learn','gensim','scipy']
+)
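
A possible variant of the `read` helper in setup.py, sketched under the assumption that the file handle should be closed explicitly and the README decoded as UTF-8. The name `read_text` and the `base_dir` parameter are hypothetical and not part of this PR.

```python
import io
import os


def read_text(fname, base_dir):
    """Return the contents of fname, resolved relative to base_dir.

    Hypothetical helper: the PR's own `read` resolves the path relative
    to the directory of setup.py and does not close the file explicitly.
    """
    # io.open accepts an encoding on both Python 2 and Python 3, and the
    # with-block closes the handle even if reading raises.
    with io.open(os.path.join(base_dir, fname), encoding="utf-8") as f:
        return f.read()
```

In setup.py this could be called as `read_text('README.md', os.path.dirname(os.path.abspath(__file__)))`, which keeps the lookup relative to the script as in the PR.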
20 changes: 6 additions & 14 deletions skipthoughts.py
@@ -17,21 +17,13 @@

 profile = False

-#-----------------------------------------------------------------------------#
-# Specify model and table locations here
-#-----------------------------------------------------------------------------#
-path_to_models = '/u/rkiros/public_html/models/'
-path_to_tables = '/u/rkiros/public_html/models/'
-#-----------------------------------------------------------------------------#
-
-path_to_umodel = path_to_models + 'uni_skip.npz'
-path_to_bmodel = path_to_models + 'bi_skip.npz'
-
-
-def load_model():
+def load_model(path_to_models='./models/', path_to_tables='./models/'):
     """
     Load the model with saved tables
     """
+    path_to_umodel = path_to_models + 'uni_skip.npz'
+    path_to_bmodel = path_to_models + 'bi_skip.npz'
+
     # Load model options
     print 'Loading model parameters...'
     with open('%s.pkl'%path_to_umodel, 'rb') as f:
@@ -56,7 +48,7 @@ def load_model():

     # Tables
     print 'Loading tables...'
-    utable, btable = load_tables()
+    utable, btable = load_tables(path_to_tables)

     # Store everything we need in a dictionary
     print 'Packing up...'
@@ -71,7 +63,7 @@ def load_model():
     return model


-def load_tables():
+def load_tables(path_to_tables):
     """
     Load the tables
     """
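
Both the old and new versions of `load_model` build the model file paths by string concatenation, so a `path_to_models` value without a trailing slash (for example `'./models'`) would produce `'./modelsuni_skip.npz'`. A minimal sketch of one alternative, assuming `os.path.join` is acceptable here; the helper name `model_paths` is hypothetical and not part of this PR.

```python
import os


def model_paths(path_to_models='./models/'):
    """Build the uni-skip and bi-skip file paths under path_to_models.

    Hypothetical helper, not part of the PR.
    """
    # os.path.join adds a separator only when one is missing, so the
    # directory may be given with or without a trailing slash.
    path_to_umodel = os.path.join(path_to_models, 'uni_skip.npz')
    path_to_bmodel = os.path.join(path_to_models, 'bi_skip.npz')
    return path_to_umodel, path_to_bmodel
```

The same approach would apply to the table files loaded in `load_tables(path_to_tables)`.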