docs: add a README for updating notebooks/cassettes #1705

Merged
merged 5 commits into from
Sep 13, 2024
6 changes: 1 addition & 5 deletions .github/workflows/run_notebooks.yml
@@ -49,11 +49,7 @@ jobs:
TAVILY_API_KEY: ${{ secrets.TAVILY_API_KEY }}
LANGSMITH_API_KEY: ${{ secrets.LANGSMITH_API_KEY }}
run: |
- for file in $(find docs/docs/how-tos -name "*.ipynb")
- do
- echo "Executing $file"
- PIP_PRE=1 poetry run jupyter execute "$file"
- done
+ ./docs/_scripts/execute_notebooks.sh

- name: Stop services
run: make stop-services
61 changes: 61 additions & 0 deletions docs/README.md
@@ -0,0 +1,61 @@
# Setup

To set up the requirements for building the docs, run:

```bash
poetry install --with test
```

## Serving documentation locally

To serve the documentation locally, run:

```bash
make serve-docs
```

## Execute notebooks

To execute all of the notebooks automatically, mimicking the "Run notebooks" GitHub Action, run:

```bash
python docs/_scripts/prepare_notebooks_for_ci.py
./docs/_scripts/execute_notebooks.sh
```

**Note**: to run the notebooks without the `%pip install` cells, run:

```bash
python docs/_scripts/prepare_notebooks_for_ci.py --comment-install-cells
./docs/_scripts/execute_notebooks.sh
```

The `prepare_notebooks_for_ci.py` script adds a VCR cassette context manager to each cell in the notebook, so that:
* when the notebook is run for the first time, network requests made by its cells are recorded to a VCR cassette file
* when the notebook is run again, those network requests are replayed from the cassettes

**Note**: this currently applies only to the notebooks in `docs/docs/how-tos`.
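
The record-then-replay behaviour described above can be illustrated with a plain file standing in for a cassette. This is only a sketch of the idea, not how `prepare_notebooks_for_ci.py` or VCR is implemented; the `replay_or_record` function and its arguments are hypothetical.

```bash
# Illustrative only: a plain file plays the role of a VCR cassette.
# replay_or_record is a hypothetical helper, not part of this repository.
replay_or_record() {
  cassette="$1"
  shift
  if [ -f "$cassette" ]; then
    # Replay: the command is not run, the saved output is returned.
    cat "$cassette"
  else
    # Record: run the command once and save its output for later runs.
    mkdir -p "$(dirname "$cassette")"
    "$@" | tee "$cassette"
  fi
}
```

Calling `replay_or_record /tmp/demo/cassette.txt echo hello` twice runs `echo` only the first time. It also shows why stale recordings have to be deleted when the underlying request changes: as long as the file exists, the old output keeps being replayed.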

## Adding new notebooks

If you are adding a notebook with API requests, it's **recommended** to record network requests so that they can be subsequently replayed. If this is not done, the notebook runner will make API requests every time the notebook is run, which can be costly and slow.

To record network requests, first run the `prepare_notebooks_for_ci.py` script.

Then run:

```bash
jupyter execute <path_to_notebook>
```

Once the notebook has been executed, you should see the new VCR cassettes recorded in the `docs/cassettes` directory. You can then discard the updated notebook.

## Updating existing notebooks

If you are updating an existing notebook, first remove any existing cassettes for that notebook from the `docs/cassettes` directory (each cassette is prefixed with the notebook name), and then follow the steps from the "Adding new notebooks" section above.

To delete cassettes for a notebook, you can run:

```bash
rm docs/cassettes/<notebook_name>*
```
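
If you substitute a variable for `<notebook_name>`, an empty value would turn the pattern into `docs/cassettes/*` and delete every cassette. A minimal sketch of a guarded variant follows; the `delete_cassettes` function and the `CASSETTE_DIR` override are hypothetical names used for illustration, not part of this repository.

```bash
# Hypothetical helper: delete the cassettes whose file name starts with the
# given notebook name. CASSETTE_DIR is an illustrative override that defaults
# to the directory named in this README.
delete_cassettes() {
  notebook_name="$1"
  cassette_dir="${CASSETTE_DIR:-docs/cassettes}"
  if [ -z "$notebook_name" ]; then
    # Refuse to run with an empty prefix, which would match every cassette.
    echo "usage: delete_cassettes <notebook_name>" >&2
    return 1
  fi
  # Print each matching file as it is deleted.
  find "$cassette_dir" -maxdepth 1 -type f -name "${notebook_name}*" -print -delete
}
```

For example, `delete_cassettes <notebook_name>` prints the path of each cassette it removes and leaves cassettes belonging to other notebooks untouched.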
5 changes: 5 additions & 0 deletions docs/_scripts/execute_notebooks.sh
@@ -0,0 +1,5 @@
for file in $(find docs/docs/how-tos -name "*.ipynb" | grep -v ".ipynb_checkpoints")
do
echo "Executing $file"
poetry run jupyter execute "$file"
done
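
The loop above relies on word splitting of the `find` output, so a path containing spaces would be split into several words. Also, unless the script runs under `set -e`, a failing notebook that is not the last one does not affect the exit status. The following is a sketch of one possible variant, not what this PR merges; the `execute_notebooks` function and the `NOTEBOOK_CMD` override are hypothetical and exist mainly so the loop can be tried without Poetry or Jupyter installed.

```bash
# Hypothetical variant of execute_notebooks.sh (not part of this PR).
# NOTEBOOK_CMD is an illustrative override; by default the same command
# as in the script above is used.
execute_notebooks() {
  root="${1:-docs/docs/how-tos}"
  # Read one path per line so that spaces in file names survive,
  # and skip anything inside .ipynb_checkpoints directories.
  find "$root" -name "*.ipynb" ! -path "*/.ipynb_checkpoints/*" |
    while IFS= read -r file; do
      echo "Executing $file"
      # Stop at the first failing notebook and report a non-zero status.
      ${NOTEBOOK_CMD:-poetry run jupyter execute} "$file" </dev/null || exit 1
    done
}
```

Running `NOTEBOOK_CMD=true execute_notebooks docs/docs/how-tos` lists the notebooks that would be executed without running them. Paths containing newlines are still not handled by this sketch.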