Adding Docker and CI Pipeline #70

Open
wants to merge 7 commits into
base: master
48 changes: 48 additions & 0 deletions .github/workflows/poetry_ci.yml
@@ -0,0 +1,48 @@
name: Poetry CI

on:
  push:
    branches:
      - master

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: 3.11

      - name: Install Poetry
        run: curl -sSL https://install.python-poetry.org | python3 -

      - name: Checkout code
        uses: actions/checkout@v2

      - name: Build
        run: |
          poetry install
          poetry build

  deploy:
    needs: build
    runs-on: ubuntu-latest

    steps:
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: 3.9

      - name: Install Poetry
        run: curl -sSL https://install.python-poetry.org | python3 -

      - name: Checkout code
        uses: actions/checkout@v2

      - name: Publish to PyPI
        run: |
          poetry config pypi-token.pypi ${{ secrets.PYPI_PASSWORD }}
          poetry publish --username ${{ secrets.PYPI_USERNAME }} --no-interaction
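
Review note on the publish step: the job configures a PyPI API token and then also passes `--username`, which mixes token authentication with username/password authentication. The snippet below is a minimal sketch of token-only publishing, not the author's method. It assumes the `PYPI_PASSWORD` secret actually holds an API token (the PR does not say) and that it is exposed to the step as an environment variable called `PYPI_TOKEN`; both `PYPI_TOKEN` and the helper name `publish_with_token` are hypothetical.

```bash
# Hypothetical helper: publish with a PyPI API token only.
# PYPI_TOKEN is an assumed environment variable, not defined in this PR.
publish_with_token() {
  # Stop with an error message if the token is missing.
  : "${PYPI_TOKEN:?set PYPI_TOKEN to a PyPI API token}"
  # Store the token for the default "pypi" repository.
  poetry config pypi-token.pypi "$PYPI_TOKEN"
  # --build asks Poetry to build the package before uploading it.
  poetry publish --build --no-interaction
}
```

In a workflow step this would be called as `publish_with_token` after Poetry is installed and the code is checked out. Building inside the same job matters because each GitHub Actions job starts on a fresh runner, so files produced by the `build` job are not available to `deploy` unless they are passed along as artifacts.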
20 changes: 20 additions & 0 deletions .github/workflows/pylint.yml
@@ -0,0 +1,20 @@
name: Pylint

on: [push]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python 3.9
        uses: actions/setup-python@v3
        with:
          python-version: "3.9"
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install pylint
      - name: Analysing the code with pylint
        run: |
          pylint $(git ls-files '*.py')
25 changes: 25 additions & 0 deletions Dockerfile
@@ -0,0 +1,25 @@
# Use an appropriate base image
FROM python:2.7

# Set the working directory inside the container
WORKDIR /app

# Copy the project code into the container
COPY . /app

RUN tar -xjvf data.tar.bz2

# Install dependencies
RUN pip install torch torchvision
RUN pip install -r requirements.txt
RUN pip install future

# Download and extract the glove embedding
RUN bash download_glove.sh
RUN python extract_vocab.py

# Expose any necessary ports
# EXPOSE <port_number>

# Set the command to run when the container starts
CMD ["python", "train.py"]
15 changes: 13 additions & 2 deletions README.md
@@ -18,6 +18,9 @@ This repo provides an implementation of SQLNet and Seq2SQL neural networks for p
```

## Installation
The project can be installed either locally or in a Docker container.

## Local installation
The data is in `data.tar.bz2`. Extract it by running
```bash
tar -xjvf data.tar.bz2
@@ -28,18 +31,26 @@ The code is written using PyTorch in Python 2.7. Check [here](http://pytorch.org
pip install -r requirements.txt
```

## Downloading the glove embedding.
### Downloading the glove embedding.
Download the pretrained glove embedding from [here](https://github.com/stanfordnlp/GloVe) using
```bash
bash download_glove.sh
```

## Extract the glove embedding for training.
### Extract the glove embedding for training.
Run the following command to process the pretrained glove embedding for training the word embedding:
```bash
python extract_vocab.py
```

## Container installation
Make sure Docker is installed on your machine.

To build the Docker image, run the following from the root directory of the project:
```bash
docker build -t sql-net .
```
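
Once the image is built, a container can be started from it. The snippet below is a minimal sketch, assuming the image tag `sql-net` from the build command above; `run_sqlnet` is a hypothetical helper name, not something the project provides. With no arguments the container runs the default command set in the Dockerfile (`python train.py`); any arguments replace that command.

```bash
# Hypothetical helper: start a throwaway container from the `sql-net` image.
# Extra arguments replace the image's default command (`python train.py`).
run_sqlnet() {
  docker run --rm sql-net "$@"
}
```

For example, `run_sqlnet` starts training with the defaults, and `run_sqlnet python train.py --ca` trains with column attention. Because `--rm` removes the container when it exits, files written inside it are lost unless a volume is mounted.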

## Train
The training script is `train.py`. To see the detailed parameters for running:
```bash
Binary file added dist/sqlnet_predict-0.1.0-py3-none-any.whl
Binary file not shown.
Binary file added dist/sqlnet_predict-0.1.0.tar.gz
Binary file not shown.
20 changes: 20 additions & 0 deletions docs/Makefile
@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = source
BUILDDIR = build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
Binary file added docs/build/doctrees/README.doctree
Binary file not shown.
Binary file added docs/build/doctrees/environment.pickle
Binary file not shown.
Binary file added docs/build/doctrees/index.doctree
Binary file not shown.
Binary file added docs/build/doctrees/readme_link.doctree
Binary file not shown.
4 changes: 4 additions & 0 deletions docs/build/html/.buildinfo
@@ -0,0 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: bb8a3b91062c0bdde59c392479c30864
tags: 645f666f9bcd5a90fca523b33c5a78b7
189 changes: 189 additions & 0 deletions docs/build/html/README.html
@@ -0,0 +1,189 @@
<!DOCTYPE html>

<html lang="en">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" /><meta name="viewport" content="width=device-width, initial-scale=1" />

<title>SQLNet &#8212; SQLNet 0.1.0 documentation</title>
<link rel="stylesheet" type="text/css" href="_static/pygments.css" />
<link rel="stylesheet" type="text/css" href="_static/alabaster.css" />
<script data-url_root="./" id="documentation_options" src="_static/documentation_options.js"></script>
<script src="_static/doctools.js"></script>
<script src="_static/sphinx_highlight.js"></script>
<link rel="index" title="Index" href="genindex.html" />
<link rel="search" title="Search" href="search.html" />

<link rel="stylesheet" href="_static/custom.css" type="text/css" />


<meta name="viewport" content="width=device-width, initial-scale=0.9, maximum-scale=0.9" />

</head><body>


<div class="document">
<div class="documentwrapper">
<div class="bodywrapper">


<div class="body" role="main">

<section id="sqlnet">
<h1>SQLNet<a class="headerlink" href="#sqlnet" title="Permalink to this heading">¶</a></h1>
<p>This repo provides an implementation of SQLNet and Seq2SQL neural networks for predicting SQL queries on the <a class="reference external" href="https://github.com/salesforce/WikiSQL">WikiSQL dataset</a>. The paper is available <a class="reference external" href="https://arxiv.org/abs/1711.04436">here</a>.</p>
<section id="citation">
<h2>Citation<a class="headerlink" href="#citation" title="Permalink to this heading">¶</a></h2>
<blockquote>
<div><p>Xiaojun Xu, Chang Liu, Dawn Song. 2017. SQLNet: Generating Structured Queries from Natural Language Without Reinforcement Learning.</p>
</div></blockquote>
</section>
<section id="bibtex">
<h2>Bibtex<a class="headerlink" href="#bibtex" title="Permalink to this heading">¶</a></h2>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="nd">@article</span><span class="p">{</span><span class="n">xu2017sqlnet</span><span class="p">,</span>
<span class="n">title</span><span class="o">=</span><span class="p">{</span><span class="n">SQLNet</span><span class="p">:</span> <span class="n">Generating</span> <span class="n">Structured</span> <span class="n">Queries</span> <span class="n">From</span> <span class="n">Natural</span> <span class="n">Language</span> <span class="n">Without</span> <span class="n">Reinforcement</span> <span class="n">Learning</span><span class="p">},</span>
<span class="n">author</span><span class="o">=</span><span class="p">{</span><span class="n">Xu</span><span class="p">,</span> <span class="n">Xiaojun</span> <span class="ow">and</span> <span class="n">Liu</span><span class="p">,</span> <span class="n">Chang</span> <span class="ow">and</span> <span class="n">Song</span><span class="p">,</span> <span class="n">Dawn</span><span class="p">},</span>
<span class="n">journal</span><span class="o">=</span><span class="p">{</span><span class="n">arXiv</span> <span class="n">preprint</span> <span class="n">arXiv</span><span class="p">:</span><span class="mf">1711.04436</span><span class="p">},</span>
<span class="n">year</span><span class="o">=</span><span class="p">{</span><span class="mi">2017</span><span class="p">}</span>
<span class="p">}</span>
</pre></div>
</div>
</section>
<section id="installation">
<h2>Installation<a class="headerlink" href="#installation" title="Permalink to this heading">¶</a></h2>
<p>The data is in <code class="docutils literal notranslate"><span class="pre">data.tar.bz2</span></code>. Extract it by running</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>tar<span class="w"> </span>-xjvf<span class="w"> </span>data.tar.bz2
</pre></div>
</div>
<p>The code is written using PyTorch in Python 2.7. Check <a class="reference external" href="http://pytorch.org/">here</a> to install PyTorch. You can install the other dependencies by running</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>pip<span class="w"> </span>install<span class="w"> </span>-r<span class="w"> </span>requirements.txt
</pre></div>
</div>
</section>
<section id="downloading-the-glove-embedding">
<h2>Downloading the glove embedding.<a class="headerlink" href="#downloading-the-glove-embedding" title="Permalink to this heading">¶</a></h2>
<p>Download the pretrained glove embedding from <a class="reference external" href="https://github.com/stanfordnlp/GloVe">here</a> using</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>bash<span class="w"> </span>download_glove.sh
</pre></div>
</div>
</section>
<section id="extract-the-glove-embedding-for-training">
<h2>Extract the glove embedding for training.<a class="headerlink" href="#extract-the-glove-embedding-for-training" title="Permalink to this heading">¶</a></h2>
<p>Run the following command to process the pretrained glove embedding for training the word embedding:</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>python<span class="w"> </span>extract_vocab.py
</pre></div>
</div>
</section>
<section id="train">
<h2>Train<a class="headerlink" href="#train" title="Permalink to this heading">¶</a></h2>
<p>The training script is <code class="docutils literal notranslate"><span class="pre">train.py</span></code>. To see the detailed parameters for running:</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>python<span class="w"> </span>train.py<span class="w"> </span>-h
</pre></div>
</div>
<p>Some typical usages are listed below:</p>
<p>Train a SQLNet model with column attention:</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>python<span class="w"> </span>train.py<span class="w"> </span>--ca
</pre></div>
</div>
<p>Train a SQLNet model with column attention and trainable embedding (requires pretraining without training embedding, i.e., executing the command above):</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>python<span class="w"> </span>train.py<span class="w"> </span>--ca<span class="w"> </span>--train_emb
</pre></div>
</div>
<p>Pretrain a <a class="reference external" href="https://arxiv.org/abs/1709.00103">Seq2SQL model</a> on the re-split dataset</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>python<span class="w"> </span>train.py<span class="w"> </span>--baseline<span class="w"> </span>--dataset<span class="w"> </span><span class="m">1</span>
</pre></div>
</div>
<p>Train a Seq2SQL model with Reinforcement Learning after pretraining</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>python<span class="w"> </span>train.py<span class="w"> </span>--baseline<span class="w"> </span>--dataset<span class="w"> </span><span class="m">1</span><span class="w"> </span>--rl
</pre></div>
</div>
</section>
<section id="test">
<h2>Test<a class="headerlink" href="#test" title="Permalink to this heading">¶</a></h2>
<p>The script <code class="docutils literal notranslate"><span class="pre">test.py</span></code> evaluates on the dev split and test split. The parameters for evaluation are roughly the same as those used for training. For example, the commands for evaluating the models trained by the commands above are:</p>
<p>Test a trained SQLNet model with column attention</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>python<span class="w"> </span>test.py<span class="w"> </span>--ca
</pre></div>
</div>
<p>Test a trained SQLNet model with column attention and trainable embedding:</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>python<span class="w"> </span>test.py<span class="w"> </span>--ca<span class="w"> </span>--train_emb
</pre></div>
</div>
<p>Test a trained <a class="reference external" href="https://arxiv.org/abs/1709.00103">Seq2SQL model</a> without RL on the re-split dataset</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>python<span class="w"> </span>test.py<span class="w"> </span>--baseline<span class="w"> </span>--dataset<span class="w"> </span><span class="m">1</span>
</pre></div>
</div>
<p>Test a trained Seq2SQL model with Reinforcement learning</p>
<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>python<span class="w"> </span>test.py<span class="w"> </span>--baseline<span class="w"> </span>--dataset<span class="w"> </span><span class="m">1</span><span class="w"> </span>--rl
</pre></div>
</div>
</section>
</section>


</div>

</div>
</div>
<div class="sphinxsidebar" role="navigation" aria-label="main navigation">
<div class="sphinxsidebarwrapper">
<h1 class="logo"><a href="index.html">SQLNet</a></h1>








<h3>Navigation</h3>
<ul>
<li class="toctree-l1"><a class="reference internal" href="readme_link.html">README</a></li>
</ul>

<div class="relations">
<h3>Related Topics</h3>
<ul>
<li><a href="index.html">Documentation overview</a><ul>
</ul></li>
</ul>
</div>
<div id="searchbox" style="display: none" role="search">
<h3 id="searchlabel">Quick search</h3>
<div class="searchformwrapper">
<form class="search" action="search.html" method="get">
<input type="text" name="q" aria-labelledby="searchlabel" autocomplete="off" autocorrect="off" autocapitalize="off" spellcheck="false"/>
<input type="submit" value="Go" />
</form>
</div>
</div>
<script>document.getElementById('searchbox').style.display = "block"</script>








</div>
</div>
<div class="clearer"></div>
</div>
<div class="footer">
&copy;2023, xiaojunxu.

|
Powered by <a href="http://sphinx-doc.org/">Sphinx 7.0.1</a>
&amp; <a href="https://github.com/bitprophet/alabaster">Alabaster 0.7.12</a>

|
<a href="_sources/README.md.txt"
rel="nofollow">Page source</a>
</div>




</body>
</html>