CNTK 2.0 Python API
The first cut of the CNTK v2 Python and C++ APIs is now available. These APIs let you define CNTK models programmatically and drive their training and evaluation, using either built-in data readers or user-supplied data in native Python numpy arrays or C++ arrays.
[Note: This is an alpha release meant for early users to try the bits and provide feedback on the usability and functional aspects of the API. There are currently a few known limitations and rough edges (listed at the bottom of this page) that are being addressed.]
[Note: If you previously installed an earlier version of the CNTK 2.0 Python pip package, you can skip steps 1 through 3 below and jump directly to step 4 to update your existing CNTK 2.0 package installation from your Python 3.4 environment.]
-
Follow the instructions on the CNTK GitHub Wiki page "CNTK Binary Download and Configuration" to install the prerequisites for running the CNTK binaries on your machine.
[Note: Please only follow the prerequisites section – download of the binaries is not required since they are part of the pip package you will install in the next step.]
-
If you have an existing Python 3.4 install with numpy and scipy, you may use that. Otherwise, we recommend installing Anaconda Python 3.5 for Linux or Windows and creating a Python 3.4.4 environment for CNTK by running these commands:
conda create --name cntk-py34 python=3.4.4 numpy scipy
Activate on Windows: activate cntk-py34
Activate on Linux: source activate cntk-py34
[Note: Make sure that the Python version installed above is the one you use for the remainder of the instructions.]
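One way to confirm that the active interpreter matches these instructions is a short check script. The sketch below is only an illustration: the function name `check_environment` is hypothetical, and its defaults (Python 3.4, numpy, scipy) are taken from the step above.

```python
import importlib.util
import sys


def check_environment(required=(3, 4), modules=("numpy", "scipy")):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    # Compare only major.minor, since the environment is pinned to one minor version.
    if tuple(sys.version_info[:2]) != tuple(required):
        problems.append(
            "Python %d.%d is active, expected %d.%d"
            % (sys.version_info[0], sys.version_info[1], required[0], required[1])
        )
    # find_spec locates a top-level module without importing it.
    for name in modules:
        if importlib.util.find_spec(name) is None:
            problems.append("module '%s' is not installed" % name)
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

Running the script inside the activated environment should print nothing if the interpreter and both modules are in place.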
-
Upgrade pip: python -m pip install --upgrade pip
[Note: If you get an error about insufficient permissions, run the command from an elevated command prompt]
-
Install the CNTK 2.0 alpha3 pip package:
Windows: pip install --upgrade https://cntk.ai/PipPackages/gpu/cntk-2.0a3-cp34-cp34m-win_amd64.whl
Linux: pip install --upgrade https://cntk.ai/PipPackages/gpu/cntk-2.0a3-cp34-cp34m-linux_x86_64.whl
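Wheel file names follow the standard wheel naming convention (distribution, version, Python tag, ABI tag, platform tag), so the `cp34` tag in the names above indicates a build for CPython 3.4. The following sketch splits such a name into its tags; the helper `parse_wheel_name` is a hypothetical name used only for illustration.

```python
from collections import namedtuple

WheelTags = namedtuple("WheelTags", "distribution version python abi platform")


def parse_wheel_name(filename):
    """Split a wheel file name into its dash-separated tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: %r" % filename)
    parts = filename[:-len(".whl")].split("-")
    # A wheel name has five parts, or six when an optional build tag is present.
    if len(parts) == 6:
        del parts[2]
    if len(parts) != 5:
        raise ValueError("unexpected wheel name: %r" % filename)
    return WheelTags(*parts)


tags = parse_wheel_name("cntk-2.0a3-cp34-cp34m-win_amd64.whl")
print(tags.python, tags.platform)  # prints: cp34 win_amd64
```

If pip reports that the wheel is not supported on your platform, comparing these tags with your interpreter version and operating system is a reasonable first check.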
-
Optional: run the Python tests included in the CNTK module:
pip install pytest
python -c "import cntk, os; print(os.path.dirname(os.path.abspath(cntk.__file__)))"
pytest [the directory output by the previous command]
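The command above imports the package to find its directory. As an alternative sketch, the standard library can locate a package without importing it, which may help when the import itself fails. The helper name `package_dir` is hypothetical.

```python
import importlib.util
import os


def package_dir(name):
    """Return the directory containing an installed package, or None if absent."""
    spec = importlib.util.find_spec(name)
    # Built-in and namespace modules have no file location on disk.
    if spec is None or not spec.has_location:
        return None
    return os.path.dirname(os.path.abspath(spec.origin))


if __name__ == "__main__":
    # Prints the directory to pass to pytest, or None if cntk is not installed.
    print(package_dir("cntk"))
```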
-
Get a clone (or update your existing clone) of the CNTK repository (master branch) to get the Python examples and training data files used in these examples.
-
Include the examples directory in PYTHONPATH:
Windows: setx PYTHONPATH [CNTK repo root]\bindings\python\examples;%PYTHONPATH%
Linux: export PYTHONPATH=[CNTK repo root]/bindings/python/examples:$PYTHONPATH
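To check from Python whether the variable was picked up, a small helper along these lines can be used. It is a sketch only: `pythonpath_contains` is a hypothetical name, and the comparison is a simple normalized string match rather than a full path resolution.

```python
import os


def pythonpath_contains(directory, env=None):
    """Return True if `directory` is one of the PYTHONPATH entries."""
    env = os.environ if env is None else env
    wanted = os.path.normcase(os.path.normpath(directory))
    # PYTHONPATH entries are separated by ';' on Windows and ':' on Linux.
    entries = env.get("PYTHONPATH", "").split(os.pathsep)
    return any(
        entry and os.path.normcase(os.path.normpath(entry)) == wanted
        for entry in entries
    )
```

On Windows, remember that a value written with setx is only visible to command windows opened afterwards.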
-
Verify PYTHONPATH is appropriately updated (on Windows, using setx, this will require launching a new command window) and run an example from inside the [CNTK clone root]/bindings/python directory to verify your installation:
python examples/NumpyInterop/FeedForwardNet.py
If your build and setup succeeded, you should see the following output on the console:
Minibatch: 0, Train Loss: 0.7915553283691407, Train Evaluation Criterion: 0.48
Minibatch: 20, Train Loss: 0.6266774368286133, Train Evaluation Criterion: 0.48
Minibatch: 40, Train Loss: 1.0378565979003906, Train Evaluation Criterion: 0.64
Minibatch: 60, Train Loss: 0.6558118438720704, Train Evaluation Criterion: 0.56
Note: If you see an error saying "RuntimeError: module compiled against API version 0xa but this version of numpy is 0x9", your numpy version is outdated and needs to be updated:
pip install --upgrade numpy
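If you want to compare the console output programmatically rather than by eye, the progress lines can be parsed with a regular expression. This is a minimal sketch that assumes the line format shown above; `parse_progress` is a hypothetical helper name.

```python
import re

LINE = re.compile(
    r"Minibatch: (\d+), Train Loss: ([0-9.eE+-]+), "
    r"Train Evaluation Criterion: ([0-9.eE+-]+)"
)


def parse_progress(text):
    """Return (minibatch, loss, criterion) tuples found in the console output."""
    return [
        (int(m.group(1)), float(m.group(2)), float(m.group(3)))
        for m in LINE.finditer(text)
    ]


sample = """\
Minibatch: 0, Train Loss: 0.7915553283691407, Train Evaluation Criterion: 0.48
Minibatch: 20, Train Loss: 0.6266774368286133, Train Evaluation Criterion: 0.48
"""
rows = parse_progress(sample)
print(len(rows), "progress lines parsed")
```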
-
If you do not already have a CNTK development environment set up on your machine, follow the instructions on the CNTK GitHub Wiki to do so.
-
Install SWIG, version 3.0.10 or greater.
Windows: SWIG 3.0.10
Linux: Run the [CNTK clone root]/bindings/python/cntk/swig_install.sh script
-
If you have an existing Python 3.4 install with numpy and scipy, you may use that. Otherwise, we recommend installing Anaconda Python 3.5 for Linux or Windows and creating a Python 3.4.4 environment by running these commands:
conda create --name cntk-py34 python=3.4.4 numpy scipy
Activate on Windows: activate cntk-py34
Activate on Linux: source activate cntk-py34
[Note: Make sure that the Python version installed above is what you use for the remainder of the instructions.]
-
If you previously installed any version of the CNTK 2.0 pip-package on your machine, uninstall it:
pip uninstall cntk
-
On Linux:
To configure a build with Python, include these two options when running configure:
--with-swig[=directory]
--with-py34-path[=directory]
Only Release builds are supported at this stage. For example, if you installed SWIG to $HOME/swig-3.0.10 and your Python 3.4 environment is located at $HOME/anaconda3/envs/cntk-py34, provide these additional parameters to configure:
--with-swig=$HOME/swig-3.0.10 --with-py34-path=$HOME/anaconda3/envs/cntk-py34
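If you script your builds, the two options can be assembled from the install locations. The sketch below only builds the option strings named in this section; the function name `configure_options` and the example paths are hypothetical.

```python
def configure_options(swig_dir, py34_env):
    """Build the two Python-related configure options from their directories."""
    return [
        "--with-swig=" + swig_dir,
        "--with-py34-path=" + py34_env,
    ]


# Example paths are placeholders; substitute your own install locations.
options = configure_options(
    "/home/user/swig-3.0.10",
    "/home/user/anaconda3/envs/cntk-py34",
)
print(" ".join(options))
```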
Afterwards, run make as you normally would, which will build the CNTK Python module inside bindings/python/cntk and also produce a package (.whl) in a subfolder python of your build output folder (e.g., build/gpu/release/python).
cd bindings/python
export PYTHONPATH=$PWD:$PYTHONPATH
export LD_LIBRARY_PATH=$PWD/cntk/libs:$LD_LIBRARY_PATH
Note: in contrast to the setup shown for the Pip package installation, here we will load the CNTK module from the CNTK repository clone, not as an installed package in your Python environment.
Run an example to validate:
python examples/NumpyInterop/FeedForwardNet.py
On Windows:
We are going to build the Python module using the CNTK Visual Studio solution, CNTK.sln. To prepare for that we need to set up two environment variables:
setx SWIG_PATH [path to the folder containing swig.exe]
setx CNTK_PY34_PATH [paths for your Python 3.4 environment]
Note: the value for the CNTK_PY34_PATH environment variable can be determined by running
conda ..activate cmd.exe [name-or-path-of-your-environment]
For example, run conda ..activate cmd.exe cntk-py34, if you have set up the Python environment as suggested above.
Make sure that Visual Studio sees the updated environment variables.
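Because setx only affects processes started afterwards, it can be worth confirming from a newly opened command window that both variables are visible before launching Visual Studio. A minimal sketch, with `missing_build_vars` as a hypothetical helper name:

```python
import os

# The two variables this section asks you to set.
REQUIRED_VARS = ("SWIG_PATH", "CNTK_PY34_PATH")


def missing_build_vars(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    for name in missing_build_vars():
        print("not set:", name)
```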
Build a Release configuration in the CNTK Visual Studio solution, CNTK.sln.
This will build the CNTK Python module inside bindings/python/cntk and also produce a package (.whl) in a subfolder Python of your build output folder (e.g., x64\Release\Python).
Make sure that PATH includes the build output folder (e.g., x64\Release) and that PYTHONPATH includes the bindings/python directory:
setx PYTHONPATH [CNTK repo root]\bindings\python;%PYTHONPATH%
setx PATH [CNTK repo root]\[build output directory];%PATH%
Note: in contrast to the setup shown for the Pip package installation, here we will load the CNTK module from the CNTK repository clone, not as an installed package in your Python environment.
Run an example to validate:
python examples/NumpyInterop/FeedForwardNet.py
If your build and setup succeeded, you should see the following output on the console:
Minibatch: 0, Train Loss: 0.7915553283691407, Train Evaluation Criterion: 0.48
Minibatch: 20, Train Loss: 0.6266774368286133, Train Evaluation Criterion: 0.48
Minibatch: 40, Train Loss: 1.0378565979003906, Train Evaluation Criterion: 0.64
Minibatch: 60, Train Loss: 0.6558118438720704, Train Evaluation Criterion: 0.56
Note: If you see an error saying "RuntimeError: module compiled against API version 0xa but this version of numpy is 0x9", your numpy version is outdated and needs to be updated:
pip install --upgrade numpy
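The two hexadecimal numbers in that error are the numpy C API version the module was compiled against and the one provided by the installed numpy. The sketch below extracts and compares them; it assumes the message wording quoted above, and `parse_api_mismatch` is a hypothetical helper name.

```python
import re

PATTERN = re.compile(
    r"API version (0x[0-9a-fA-F]+) but this version of numpy is (0x[0-9a-fA-F]+)"
)


def parse_api_mismatch(message):
    """Return (compiled_against, installed) as integers, or None if no match."""
    match = PATTERN.search(message)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


message = (
    "RuntimeError: module compiled against API version 0xa "
    "but this version of numpy is 0x9"
)
compiled, installed = parse_api_mismatch(message)
print(compiled, installed)  # prints: 10 9
```

Here the installed value (9) is lower than the compiled one (10), which is why upgrading numpy resolves the error.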
The API documentation is currently in progress; detailed operator documentation and tutorials will become available very soon. For now, the main form of documentation is the docstrings, which are available for most of the Python APIs and are displayed by IntelliSense.
The best way to learn about the APIs currently is to look at the following examples in the [CNTK clone root]/bindings/python/examples directory:
-
MNIST: A fully connected feed-forward model for classification of MNIST images. (follow the instructions in Examples/Image/DataSets/MNIST/README.md)
-
CifarResNet: An image classification ResNet model for training on the CIFAR image dataset. (Follow the instructions in Examples/Image/DataSets/CIFAR-10/README.md to get the CIFAR dataset and convert it to the CNTK-supported format.)
-
SequenceClassification: An LSTM sequence classification model for text data.
-
Sequence2Sequence: A sequence-to-sequence grapheme-to-phoneme translation model that trains on the CMUDict corpus.
-
NumpyInterop: A numpy interop example showing how to train a simple feed-forward network with training data fed using numpy arrays.
-
SLUHandson: Language Understanding.
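For readers who want a feel for what the feed-forward example trains before opening it, the sketch below implements a comparable one-hidden-layer classifier in plain numpy. It does not use the CNTK API and is not the code from the examples directory; the synthetic data, layer sizes, learning rate, and function names are all assumptions chosen for illustration.

```python
import numpy as np


def make_batch(rng, size):
    """Synthetic 2-class data: class 1 when the two features sum to > 0."""
    features = rng.randn(size, 2)
    labels = (features.sum(axis=1) > 0).astype(int)
    return features, np.eye(2)[labels]  # one-hot targets


def train(steps=200, batch_size=25, hidden=50, lr=0.5, seed=0):
    """Train a one-hidden-layer network with SGD; return per-minibatch losses."""
    rng = np.random.RandomState(seed)
    w1 = rng.randn(2, hidden) * 0.1
    b1 = np.zeros(hidden)
    w2 = rng.randn(hidden, 2) * 0.1
    b2 = np.zeros(2)
    losses = []
    for _ in range(steps):
        x, t = make_batch(rng, batch_size)
        # Forward pass: sigmoid hidden layer followed by softmax.
        h = 1.0 / (1.0 + np.exp(-(np.dot(x, w1) + b1)))
        z = np.dot(h, w2) + b2
        z = z - z.max(axis=1, keepdims=True)  # for numerical stability
        p = np.exp(z)
        p /= p.sum(axis=1, keepdims=True)
        losses.append(float(-np.mean(np.sum(t * np.log(p + 1e-12), axis=1))))
        # Backward pass for the mean cross-entropy loss.
        dz = (p - t) / batch_size
        dh = np.dot(dz, w2.T) * h * (1.0 - h)
        # Plain gradient-descent updates.
        w2 -= lr * np.dot(h.T, dz)
        b2 -= lr * dz.sum(axis=0)
        w1 -= lr * np.dot(x.T, dh)
        b1 -= lr * dh.sum(axis=0)
    return losses


if __name__ == "__main__":
    history = train()
    print("first loss: %.3f, last loss: %.3f" % (history[0], history[-1]))
```

The sketch uses `np.dot` rather than the `@` operator so that it also runs on the Python 3.4 environment described above.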
This is an alpha release meant for early users to try the bits and provide feedback on the usability and functional aspects of the API. These bits have undergone limited testing so far, so expect some rough edges. Also please expect the API to undergo changes over the coming weeks, which may break backwards compatibility of programs written against the alpha release.
-
Only a subset of the planned functionality is available; features like distributed training, automatic learning rate and minibatch size search, and API extensibility will become available over the next few weeks.
-
Python 2.7 support is currently unavailable but will be part of the upcoming beta release.
-
On Windows, only Python 3.4 is supported, not Python 3.5, since the latter requires Visual Studio 2015, which CNTK has not yet migrated to. This will also be addressed before the upcoming beta release.
-
The core API itself is implemented in C++ for speed and efficiency, and Python bindings are created through SWIG. We are increasingly creating thin Python wrappers for the APIs to attach docstrings to, but this is a work in progress, and for some of the APIs you may directly encounter SWIG-generated API definitions (which are not the prettiest to read).
-
Shape and dimension inference support is currently unavailable and the shapes of all Variable objects have to be fully specified.
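Since shapes are not inferred, it can help to write down every layer's dimensions explicitly and check that they chain together before building a model. The sketch below is a generic illustration in plain Python, not part of the CNTK API; `check_layer_shapes` is a hypothetical helper.

```python
def check_layer_shapes(input_dim, weight_shapes):
    """Validate a chain of fully specified (in_dim, out_dim) weight shapes.

    Returns the output dimension of the last layer, or raises ValueError
    at the first layer whose input dimension does not match.
    """
    current = input_dim
    for index, (rows, cols) in enumerate(weight_shapes):
        if rows != current:
            raise ValueError(
                "layer %d expects input dim %d, got %d" % (index, rows, current)
            )
        current = cols
    return current


# A 2-input network with a 50-unit hidden layer and 2 outputs.
print(check_layer_shapes(2, [(2, 50), (50, 2)]))  # prints: 2
```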