Guidelines for contributing to the open Machine Learning book

Thank you for your interest in contributing to this open-source Machine Learning book! We greatly value feedback and contributions from our community.

Please read through this document before you submit any pull requests or issues. It will help us work together more effectively.

How to contribute

To contribute, send us a pull request. Please review our general Guidelines for contributing and Style guide before you start.

Notes for contributors

This section describes the development environment setup and the workflow to follow when modifying or porting Python code and when making changes to one of the machine learning frameworks in the book. We follow a pre-defined Style guide to keep code quality consistent throughout the book, and we expect the same from our community contributors. You may also need to check chapters written by other contributors for this step.

All the chapter sections are generated by Jupyter Book.

Install Python & Conda

Before you start, you will need Python and Conda on your computer.

Add the following paths (depending on your OS) to the `PATH` environment variable if needed. On Windows, for example:

D:\Python\Python310\Scripts\
D:\Python\Python310\
D:\anaconda3\Scripts
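
As a quick sanity check after editing `PATH`, a minimal sketch like the following can report whether the tools are visible from the current shell. The tool names passed to `find_tools` are an assumption about the executable names on your system and are not prescribed by this guide.

```python
import shutil


def find_tools(names=("python", "conda", "jupyter-book")):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    # The default names are assumptions; adjust them to your installation.
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'NOT FOUND on PATH'}")
```

If a tool is reported as not found, reopen the terminal after changing `PATH` and run the check again.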

Install Jupyter Book

Follow the Jupyter Book official guidance to install the latest version.

Install draw.io

draw.io is needed to generate draw.io-based diagrams at build time. Install the draw.io desktop application on your local machine. By default, the draw.io executable is installed at the platform-appropriate path:

  • Windows: C:\Program Files\draw.io\draw.io.exe (Attention: Don't change the installation path.)
  • Linux: /opt/drawio/drawio or /opt/draw.io/drawio (older versions)
  • macOS: /Applications/draw.io.app/Contents/MacOS/draw.io

In most cases, you don't need to do anything here. The executable will be picked up by sphinxcontrib-drawio automatically.
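
If the diagrams fail to build, it can help to confirm that the executable really is at one of the default locations listed above. The sketch below only checks those listed paths; it does not reproduce how sphinxcontrib-drawio locates the binary, and the `find_drawio` helper is a hypothetical name introduced for illustration.

```python
import os
import sys

# Default install locations listed in this guide, keyed by sys.platform prefix.
DEFAULT_DRAWIO_PATHS = {
    "win32": [r"C:\Program Files\draw.io\draw.io.exe"],
    "linux": ["/opt/drawio/drawio", "/opt/draw.io/drawio"],
    "darwin": ["/Applications/draw.io.app/Contents/MacOS/draw.io"],
}


def find_drawio(platform=sys.platform, exists=os.path.isfile):
    """Return the first default draw.io path that exists, or None."""
    for prefix, candidates in DEFAULT_DRAWIO_PATHS.items():
        if platform.startswith(prefix):
            for candidate in candidates:
                if exists(candidate):
                    return candidate
    return None


if __name__ == "__main__":
    print(find_drawio() or "draw.io not found at a default location")
```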

Initialize the environment

Clone the source code from the remote repository using your preferred protocol.

# through HTTPS
git clone https://github.com/ocademy-ai/machine-learning.git

Move to the working directory.

cd machine-learning/open-machine-learning-jupyter-book/

Initialize the Conda env.

# first time setup
conda env create -f environment.yml
# or update
conda env update -f environment.yml

On macOS,

Warning

You may see the TensorFlow installation failures below, especially on an ARM-based M1 Mac.

ERROR: Could not find a version that satisfies the requirement tensorflow (from versions: none)
ERROR: No matching distribution found for tensorflow

Solution:

  1. Comment out TensorFlow in environment.yml.
  2. Follow Apple's official documentation to install TensorFlow.
  3. Run conda env update -f environment.yml again to install the remaining dependencies.
  4. Optional: try uncommenting TensorFlow in environment.yml.
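
The first step can also be scripted. The following is a minimal sketch that comments out a dependency line in the text of environment.yml; the `comment_out_package` helper and the sample content are hypothetical, and the exact layout of the TensorFlow entry in the real file is an assumption.

```python
import re


def comment_out_package(yaml_text, package="tensorflow"):
    """Prefix '# ' to every list item whose package name is exactly `package`."""
    # The lookahead keeps names such as tensorflow-datasets untouched.
    pattern = re.compile(rf"^\s*-\s*{re.escape(package)}(?![\w.-])")
    out = []
    for line in yaml_text.splitlines():
        if pattern.match(line):
            indent = len(line) - len(line.lstrip())
            line = line[:indent] + "# " + line[indent:]
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    # Hypothetical sample, not the real environment.yml of this repository.
    sample = "dependencies:\n  - numpy\n  - pip:\n    - tensorflow==2.9.0\n"
    print(comment_out_package(sample))
```

Reading environment.yml, passing its text through the function and writing the result back leaves every other dependency unchanged.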

Warning

You may see the error below when you have trouble accessing GitHub.

error: RPC failed; curl 56 LibreSSL SSL_read: error:02FFF03C:system library:func(4095):Operation timed out, errno 60
fatal: expected flush after ref listing

Solution:

Switch to a different network. It is best to solve this problem now so that the later steps proceed smoothly.

On Windows,

Warning

You may see the HTTP error below first.

An HTTP error occurred when trying to retrieve this URL.
HTTP errors are often intermittent, and a simple retry will get you on your way.

Create the .condarc Conda configuration file by running:

conda config --set show_channel_urls yes

This file is in your user directory by default, for example:

C:\Users\gouha\.condarc

Delete the initial content of .condarc, then add the following content.

channels:
  - defaults
show_channel_urls: true
default_channels:
  - http://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
  - http://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
  - http://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/r
custom_channels:
  conda-forge: http://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  msys2: http://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  bioconda: http://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  menpo: http://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  pytorch: http://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  simpleitk: http://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud

Warning

You may see the error below when you have trouble accessing GitHub.

error: RPC failed; curl 56 LibreSSL SSL_read: error:02FFF03C:system library:func(4095):Operation timed out, errno 60
fatal: expected flush after ref listing

Solution:

Switch to a different network. It is best to solve this problem now so that the later steps proceed smoothly.

Warning

You may encounter download or run failures due to lack of administrator privileges.

error: Could not install packages due to an OSError: [WinError 5] Access denied.
Consider using the `--user` option or check the permissions.

Solution:

Turn off administrator privileges from the command prompt.

  1. Run cmd as Administrator.

  2. Enter the command NET USER administrator /active:no and run it.

Warning

When building the book, you may encounter an error when running commands in a terminal (such as PowerShell).

error: Failed building wheel for jupyter-nbextensions-configurator 
       or Unable to load file: C:\Users\87897\Documents\WindowsPowerShell\profile.ps1

Solution:

Enter the command set-ExecutionPolicy RemoteSigned, then enter Y.

Tip: You can use the command get-ExecutionPolicy to check the setting. If RemoteSigned appears, the change was successful.

Activate the Conda environment

conda activate open-machine-learning-jupyter-book

Build the book

# official guidance - https://jupyterbook.org/en/stable/start/build.html

# Windows
jupyter-book build .

# Mac
# if you are using bash
bash ./build.sh
# or you can rebuild everything
bash ./build-force-all.sh

Then you should be able to follow the build success message to view the book locally.

On macOS,

Warning

You may encounter the following problem when working on an ARM-based M1 Mac.

OSError: no library called "cairo-2" was found
no library called "cairo" was found
no library called "libcairo-2" was found

Solution:

  1. Install Homebrew.
  2. Fetch Homebrew sources:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
  3. Install the missing dependencies below through Homebrew:
brew install cairo pango gdk-pixbuf libxml2 libxslt libffi
  4. Find the installation paths of cairo, glib and pango, and export them to DYLD_LIBRARY_PATH:
# for example
export DYLD_LIBRARY_PATH=/opt/homebrew/Cellar/cairo/1.16.0_5/lib/:/opt/homebrew/Cellar/pango/1.50.9/lib/:/opt/homebrew/Cellar/glib/2.72.3_1/lib/

How do you find the paths above? Here is an example for cairo:

  • Run the command which brew.
  • If the response is /opt/homebrew/bin/brew, the Homebrew root path is /opt/homebrew/. (The result may differ depending on your OS.)
  • Check whether cairo, glib and pango exist in /opt/homebrew/Cellar.
  • Find the lib path for each of the libraries above, such as /opt/homebrew/Cellar/cairo/1.16.0_5/lib. (Again, the result may differ depending on your OS.)
  5. Rerun jupyter-book build .
  6. Run pip uninstall xcffib if the error still exists, and then try again.
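
Because the version directories under the Cellar change with every Homebrew upgrade, the path lookup described above can be sketched as a small script. This is only an illustration under stated assumptions: the Cellar root `/opt/homebrew/Cellar` comes from the example above and may differ on your machine, and `brew_lib_dirs` and `dyld_library_path` are hypothetical helper names.

```python
import glob
import os


def brew_lib_dirs(cellar="/opt/homebrew/Cellar", packages=("cairo", "pango", "glib")):
    """Return the lib/ directory of the lexicographically last version of each package."""
    dirs = []
    for package in packages:
        matches = sorted(glob.glob(os.path.join(cellar, package, "*", "lib")))
        if matches:
            dirs.append(matches[-1])  # packages that are not installed are skipped
    return dirs


def dyld_library_path(cellar="/opt/homebrew/Cellar"):
    """Join the discovered lib directories into a DYLD_LIBRARY_PATH value."""
    return ":".join(brew_lib_dirs(cellar))


if __name__ == "__main__":
    # Prints a line you can paste into the shell before rebuilding.
    print(f"export DYLD_LIBRARY_PATH={dyld_library_path()}")
```

The sort is lexicographic, so if several versions of a package are installed, check that the printed directory is the one you intend to use.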

On Windows,

Warning

You may encounter the following problem.

OSError: no library called "cairo-2" was found
no library called "cairo" was found
no library called "libcairo-2" was found

Solution:

Download and install GTK3.

Run the following command.

pip uninstall xcffib

Restart the terminal and build again.

Build the slides (optional)

The slides are implemented as notebooks in slides/ and are powered by RISE.

If you want to edit or preview the slides locally, you need to use Jupyter Notebook. Once you load the project in Jupyter Notebook/JupyterLab, the slides launch in live mode when you open any corresponding notebook.

# Install javascript and css files
jupyter contrib nbextension install --user

# Enabling extensions
jupyter nbextension enable init_cell/main

# Launch the notebook
jupyter notebook

Warning

Please make sure that Jupyter Notebook is running in trusted mode and that init_cell is configured for the first cell of the slide notebook, so that the first cell is executed automatically to load the CSS.

FAQ

Regarding deleting and adding entries in the _toc.yml file:

 *  The _toc.yml file is located in the open-machine-learning-jupyter-book [directory](https://github.com/ocademy-ai/machine-learning/blob/main/open-machine-learning-jupyter-book/_toc.yml).
 *  In Jupyter Book, the _toc.yml file defines the structure of the book: its chapters, sub-chapters and page hierarchy.
 *  When you build the book, Jupyter Book reads the _toc.yml file and generates the navigation bar from that structure.
 *  To speed up a local build, you can keep only the chapters you changed in _toc.yml. This shortens the build and skips errors reported by other chapters.
 *  However, when deleting other chapters, take care to keep the overall book structure intact; otherwise the build may report errors. When you first get started, it is recommended to delete one CAPTION at a time.
 *  After the preview, please restore the original _toc.yml structure.
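
One way to make sure the original _toc.yml is always restored is to back it up before trimming it. The sketch below is a minimal example of that idea and is not part of the project's tooling; `temporary_toc` and the `.bak` suffix are hypothetical names chosen for illustration.

```python
import contextlib
import shutil
from pathlib import Path


@contextlib.contextmanager
def temporary_toc(toc_path="_toc.yml"):
    """Back up the TOC file, let the caller edit it, then restore the original."""
    toc = Path(toc_path)
    backup = toc.with_name(toc.name + ".bak")  # hypothetical backup name
    shutil.copy2(toc, backup)
    try:
        yield toc
    finally:
        backup.replace(toc)  # restore even if the build inside the block fails
```

Inside the `with temporary_toc() as toc:` block you would trim the file and run `jupyter-book build .`; when the block exits, the original content is back in place and the backup file is gone.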

Non-consecutive header level increase

You may see the failure below when building the book:

 WARNING: Non-consecutive header level increase; 0 to 2 [myst.header]

This warning is caused by a non-consecutive heading level increase in the specified Jupyter notebook file. Specifically, the heading level jumps directly from 0 to 2 without going through level 1. To resolve this issue, follow these steps:

1. Open the specified Jupyter notebook file.
2. Check the heading levels, which are set by the number of '#' characters.
3. Ensure that heading levels increase one level at a time without skipping any.
4. If you find a non-consecutive heading level increase, adjust it to a consecutive one.
5. Save the file and rebuild the book to confirm the error has been resolved.
6. Watch out for '---', which will be recognized as a title. If you are converting an md file to an ipynb file and have confirmed that the situation above does not exist, please delete the last '---'.
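
The manual check in the steps above can be approximated with a short script. This is a rough sketch, not the parser MyST uses: it only looks at '#'-style headings in Markdown text, skips fenced code blocks, and does not detect the '---' case. For an ipynb file you would first need to extract the Markdown cells, which is not shown here. `find_heading_jumps` is a hypothetical helper name.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+\S")


def find_heading_jumps(markdown_text):
    """Return (line_number, previous_level, level) for each heading that skips a level."""
    jumps = []
    previous = 0  # level 0 means no heading has been seen yet
    in_fence = False
    for number, line in enumerate(markdown_text.splitlines(), start=1):
        if line.lstrip().startswith("```"):
            in_fence = not in_fence  # ignore '#' comments inside code fences
            continue
        if in_fence:
            continue
        match = HEADING.match(line)
        if match:
            level = len(match.group(1))
            if level > previous + 1:
                jumps.append((number, previous, level))
            previous = level
    return jumps
```

A file that starts with a '##' heading yields a jump from 0 to 2 on its first heading line, which corresponds to the warning text above.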

Can't run the code locally

You may find that the code cannot be run locally. In that case, try uploading the document to Google Colab and running it there, then download the file containing the results and submit a PR from your local machine.

Couldn't find cache key

You may encounter an error like this:

ERROR: Execution Failed: /home/runner/work/machine-learning/machine-learning/open-machine-learning-jupyter-book/data-science/data-visualization/visualization-distributions.md
ERROR: Couldn't find cache key for notebook file data-science/data-visualization/visualization-distributions.md. Outputs will not be inserted.

To resolve this error, find the file and add a ' ' anywhere in it, just so that the file is resubmitted. Then delete the ' ' in the next commit so that the file does not show up as changed in your PR.