Running on Ubuntu OS

Nikhil VJ edited this page Oct 13, 2018 · 11 revisions

A. Directly on system Python3

Note: It is recommended to go the virtual environment way instead. Still, if you're fine with it...

~~ nope, do it in a virtual environment only. 
I ain't looking into your bugs if you do this in the main env. 
Scroll to next heading. ~~

B. Using a virtual environment

This is the recommended way, to ensure you have the exact versions of python and dependencies that the programmer had at the time of making the software, and to avoid interference with the rest of your system.

All commands together

git clone https://github.com/WRI-Cities/static-GTFS-manager.git
cd static-GTFS-manager
pip3 install virtualenv --user
virtualenv -p python3.6 ~/VIRTUAL
source ~/VIRTUAL/bin/activate
which python
pip install -r requirements.txt
python GTFSManager.py

One by one

  1. Open Terminal (linux command prompt) and clone this repo to your machine:
    git clone https://github.com/WRI-Cities/static-GTFS-manager.git

  2. Navigate into the folder created.
    cd static-GTFS-manager

  3. Install virtualenv in your system if not already installed:
    pip3 install virtualenv --user

  4. Initiate a python3.6 virtual environment in a new folder and activate it:
    virtualenv -p python3.6 ~/VIRTUAL
    source ~/VIRTUAL/bin/activate
    which python
    The last command shows where the current python environment is running from. It should print [user home folder]/VIRTUAL/bin/python

  5. We'll be using pip, the package installer, to install the required python modules. Proceed with the next steps, and IF you encounter an error, please jump down to the heading "Solving pip pandas installation problem", do one of the workarounds offered, then come back here and proceed.

  6. Install the required python dependencies in the virtual environment, where they will not interfere with your system's main python:
    pip install -r requirements.txt
    Note: you can also just read requirements.txt and install each package manually using pip install package==version. You may have to do this if any one of the modules shows an error during installation and you want to troubleshoot, or if you're using a customized version of one of the modules, etc.

  7. Run GTFSManager.py in the python of the virtual environment (which is python 3.6):
    python GTFSManager.py

  8. The program should load in a new web browser tab. You can now operate the program from your web browser. In case it doesn't load up, see the terminal for the URL; it is most likely http://localhost:5000/ or similar.

  9. See the terminal for instructions and reporting of various processes. There are some recurring warnings which you can ignore, like WARNING:tornado.access:404 GET /favicon.ico (::1) 1.35ms
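
If you are unsure whether the virtual environment is really active before installing or running anything, the `which python` check can be sketched as a small shell helper. This is only an illustration: the function name `python_is_from_env` is made up for this example, and it assumes the environment lives at ~/VIRTUAL as in the steps above.

```shell
# Hypothetical helper (not part of the project): succeeds when an
# interpreter path lies inside the bin/ folder of the given environment.
python_is_from_env() {
    # $1: path printed by `which python`; $2: virtual environment folder
    case "$1" in
        "$2"/bin/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage: compare the python found on PATH with ~/VIRTUAL (assumed location).
if python_is_from_env "$(command -v python)" "$HOME/VIRTUAL"; then
    echo "Virtual environment is active"
else
    echo "Virtual environment is NOT active; run: source ~/VIRTUAL/bin/activate"
fi
```

If it reports that the environment is not active, run the activate command again before using pip or python.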

Note: there is a password input box at top right corner. It's a basic idiot-proofing measure. For any operation involving editing, import or export of data, the password should be typed in. Please scroll down to find ways to change the password or work around it.

Closing

  1. The program will keep running while you operate on the browser. To terminate the program, come back to the Terminal and press Ctrl+C or close the window.

  2. To get out of the python3.6 virtual environment, run: deactivate. To get back in, run: source ~/VIRTUAL/bin/activate. You may navigate to your Home folder and delete the VIRTUAL folder; doing so won't have any effect on the rest of your system.

Note: During this whole time, we did create a virtual environment at ~/VIRTUAL/ but in the Terminal we stayed at the program's working folder only. We didn't navigate anywhere else.


C. Notes, extended explanations

Python 3.6 and not 3.7

As of Oct 2018, one of the modules required to run this program, PyTables, does not install in python 3.7 via the pip install command. Take a look at their page on pypi.org: https://pypi.org/project/tables/#files
There is no "cp37" version there. For this reason, as of now we can only run this program in python up to 3.6.
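
To see quickly whether a given interpreter falls inside that limit, a check along these lines could be used. It is a rough sketch: the function name `python_version_ok` is invented for this example, and the rule it encodes (python 3 with a minor version of 6 or lower) is simply read off the note above, not taken from the program itself.

```shell
# Hypothetical helper: succeeds when a version such as "3.6" or "3.5.2"
# is python 3 with a minor version of 6 or lower (assumption from the note).
python_version_ok() {
    major="${1%%.*}"      # text before the first dot
    minor="${1#*.}"       # text after the first dot
    minor="${minor%%.*}"  # drop any patch number
    [ "$major" -eq 3 ] && [ "$minor" -le 6 ]
}

# Usage: ask the system python3 for its version and report the result.
if command -v python3 >/dev/null 2>&1; then
    version="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
    if python_version_ok "$version"; then
        echo "python3 is $version: fine for this program"
    else
        echo "python3 is $version: create the virtual environment with python3.6 instead"
    fi
fi
```

The system python3 being newer is not a problem in itself, as long as python3.6 is also installed and is the one passed to virtualenv -p.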

Location of virtual environment

You can set up the VIRTUAL folder under your user home folder (indicated by ~/) or anywhere on your system that doesn't have a space in the path name. This is a big current bug in virtual environments: they can't tolerate spaces in the absolute path. That is also why I didn't show a relative path: there's a chance that the folder you're in has a space in its absolute path. This problem occurs only at the virtual env stage, though. Once that is set up, it's ok to run things as you normally would, using "" quotes for filenames with spaces, etc.
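
Before picking a location, you can test the absolute path for spaces with something like the following. This is a minimal sketch; the function name `path_has_no_space` is made up for this example, and ~/VIRTUAL is only the location used on this page.

```shell
# Hypothetical helper: succeeds when the given path contains no space.
path_has_no_space() {
    case "$1" in
        *" "*) return 1 ;;
        *) return 0 ;;
    esac
}

# Usage: check the absolute path where the environment would be created.
venv_dir="$HOME/VIRTUAL"
if path_has_no_space "$venv_dir"; then
    echo "OK to create the virtual environment at: $venv_dir"
else
    echo "Pick another location, this path contains a space: $venv_dir"
fi
```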

Do we really need to be so strict about package versions?

Not really. These are just the recommended settings that we know to work for sure. Go ahead and try with the latest versions; there might even be some improvements in performance! If the program works fine with a later version, then please go to the requirements.txt file in this repo, edit it and make a pull request. We'll have to test the whole thing out properly before proceeding, though.
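
When experimenting with versions, it can help to install the pinned packages one at a time so you can see which one fails or which one you want to change. The sketch below only prints the commands, it does not run them. The function name `print_install_commands` is invented for this example, and it assumes the file contains lines of the form package==version.

```shell
# Hypothetical helper: print one "pip install" command per pinned line
# (package==version) of a requirements file, skipping everything else.
print_install_commands() {
    grep -E '^[A-Za-z0-9_.-]+==' "$1" | while IFS= read -r requirement; do
        echo "pip install $requirement"
    done
}

# Usage: run from the program's folder, with the virtual environment active.
if [ -f requirements.txt ]; then
    print_install_commands requirements.txt
fi
```

Copy the printed commands one by one, changing the version after == on the package you want to try.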


Solving pip pandas installation problem

Update, Sep 2018: The problem described is not happening any more with latest versions of pip (18.0) and pandas (0.23)
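
To find out whether you are likely to need the workarounds at all, you can look at your pip version. The following is a hedged sketch: the function name `pip_major_version` is made up, and the threshold of 18 is an assumption taken from the update note above rather than a documented cut-off.

```shell
# Hypothetical helper: extract the major version from the first line of
# `pip --version`, which looks like: pip 18.0 from /some/path (python 3.6)
pip_major_version() {
    echo "$1" | awk '{ split($2, parts, "."); print parts[1] }'
}

# Usage: compare the installed pip3 against the assumed threshold of 18.
if command -v pip3 >/dev/null 2>&1; then
    major="$(pip_major_version "$(pip3 --version)")"
    if [ "$major" -ge 18 ] 2>/dev/null; then
        echo "pip $major: the pandas workaround is probably not needed"
    else
        echo "pip $major: if pandas fails to install, see the workarounds below"
    fi
fi
```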

As of pip v10.0.1 in April 2018, there is an incompatibility issue when installing one of the modules, pandas, affecting 32-bit linux machines. You might see something like this on trying to install pandas:

Collecting pandas
  Using cached https://files.pythonhosted.org/packages/08/01/803834bc8a4e708aedebb133095a88a4dad9f45bbaf5ad777d2bea543c7e/pandas-0.22.0.tar.gz
  Could not find a version that satisfies the requirement numpy==1.9.3 (from versions: 1.11.1rc1, 1.11.1, 1.11.2rc1, 1.11.2, 1.11.3, 1.12.0b1, 1.12.0rc1, 1.12.0rc2, 1.12.0, 1.12.1rc1, 1.12.1, 1.13.0rc1, 1.13.0rc2, 1.13.0, 1.13.1, 1.13.3, 1.14.0rc1, 1.14.0, 1.14.1, 1.14.2)
No matching distribution found for numpy==1.9.3

This is because pandas v0.22.0 has within its package a file, pandas-0.22.0/pyproject.toml, telling pip to only accept numpy v1.9.3 (as a dependency install), whereas pip now finds only later versions of numpy to install. Welcome to the world of obsoleted dependencies. This is also one of the reasons why it's better to install this stuff in virtual environments rather than in the main system python3, which on Linux systems is also used by the OS itself.

First, install the latest numpy package independently:
pip3 install numpy

Then, do ANY ONE of these two workarounds:

  1. Run this:
    pip3 install pandas --no-build-isolation
    This should tell pip3 to ignore the exact numpy version mandate in the pandas installer and instead go with what's already there.

OR

  2. Set the pip package installer to version 9.0.3. Back then, it did not pay heed to the exact version mandate.
    pip3 install pip==9.0.3
    And then:
    pip3 install pandas

Hopefully this version conflict will get resolved as pip and pandas progress and new versions come out. It is well covered in issues raised on their github repos. Reference: https://github.com/pandas-dev/pandas/issues/20697#issuecomment-384350250