Add Github action for autogenerating table of contents json files #3133

Merged
merged 13 commits into from
Jan 24, 2025
add github action for autogenerating table of contents files
Blargian committed Jan 23, 2025
commit f1bb14ec407f6b74e5e440429ca04f5bb8ed13bf
56 changes: 56 additions & 0 deletions .github/workflows/table_of_contents.yml
@@ -0,0 +1,56 @@
# This GitHub Action is used for triggering updates of
# the toc.json files present in any directory that
# needs an automatically generated table of contents.

name: Generate Table of Contents files

on:
  push:
    branches: ["main"]
  schedule:
    - cron: '0 0 * * *' # Run daily at midnight (UTC)

jobs:
  generate_toc_formats:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      # Step 1 - Cache directory contents
      # Regenerating the TOC when no files were added or removed is wasteful
      - name: Cache directory contents
        id: cache
        uses: actions/cache@v3
        with:
          path: |
            docs/en/interfaces/formats
          key: toc-cache-${{ hashFiles('docs/en/interfaces/formats/**') }}

      # Step 2 - Set up Python, only if the cache was missed (files have changed)
      - name: Set up Python
        if: steps.cache.outputs.cache-hit != 'true'
        uses: actions/setup-python@v3
        with:
          python-version: '3.x'

      # Step 3 - Install Python dependencies
      - name: Install dependencies
        if: steps.cache.outputs.cache-hit != 'true'
        run: |
          python -m pip install --upgrade pip
          pip install -r 'scripts/knowledgebase-checker/requirements.txt'

      # Step 4 - Run the script to generate the TOC
      - name: Generate TOCs
        id: toc_gen
        if: steps.cache.outputs.cache-hit != 'true'
        run: |
          python3 scripts/table-of-contents-generator/toc_gen.py --dir="docs/en/interfaces/formats" --single-toc
        continue-on-error: true

      # Step 5 - Fail the build if the script returned a non-zero exit code
      - name: Check exit code
        run: |
          if [[ "${{ steps.toc_gen.outcome }}" == "failure" ]]; then
            echo "Ran into trouble generating a table of contents. See the logs for details."
            exit 1
          fi
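
The caching step in this workflow keys on a hash of the directory contents, so the TOC only needs regenerating when a file under docs/en/interfaces/formats changes. A minimal sketch of the same idea in Python, useful for running the check locally. `directory_digest` and `demo` are hypothetical helpers, not part of this PR, and the digest is not the same value that GitHub's `hashFiles` expression computes.

```python
import hashlib
import os
import tempfile


def directory_digest(root_dir: str) -> str:
    """Return a SHA-256 digest over the relative paths and contents of all files under root_dir."""
    digest = hashlib.sha256()
    for current, dirs, files in os.walk(root_dir):
        dirs.sort()  # make the traversal order deterministic
        for name in sorted(files):
            path = os.path.join(current, name)
            # Include the relative path so renames change the digest too
            digest.update(os.path.relpath(path, root_dir).encode("utf-8"))
            with open(path, "rb") as f:
                digest.update(f.read())
    return digest.hexdigest()


def demo() -> tuple:
    """Show that the digest is stable until a file is added."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "CSV.md"), "w") as f:
            f.write("---\ntitle: CSV\n---\n")
        before = directory_digest(tmp)
        unchanged = directory_digest(tmp)
        with open(os.path.join(tmp, "JSON.md"), "w") as f:
            f.write("---\ntitle: JSON\n---\n")
        after = directory_digest(tmp)
    return before == unchanged, before != after


if __name__ == "__main__":
    print(demo())
```

Comparing the digest against a stored value from the previous run gives the same "skip when nothing changed" behaviour as the `cache-hit` check, without depending on the Actions cache.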
43 changes: 0 additions & 43 deletions scripts/autogenerate_table_of_contents.py

This file was deleted.

104 changes: 104 additions & 0 deletions scripts/table-of-contents-generator/toc_gen.py
@@ -0,0 +1,104 @@
#!/usr/bin/env python3
"""
This script automatically generates a table of contents (JSON file) from the markdown files in a directory,
or in multiple directories.
"""

import json
import os
import argparse
import sys

def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
        description="Script to generate .json table of contents from YAML frontmatter title, description and slug",
    )
    parser.add_argument(
        "--single-toc",
        action="store_true",
        help="Generates a single TOC for all files in all sub-directories of provided directory. By default, generates TOC per folder.",
    )
    parser.add_argument(
        "--dir",
        help="Path to a folder containing markdown (.md, .mdx) documents containing YAML with title, description, slug."
    )
    return parser.parse_args()

def extract_title_description_slug(filename):
    """Return the title, description and slug found in a file, or None if any is missing."""
    with open(filename, "r", encoding="utf-8") as f:
        lines = f.readlines()

    title, description, slug = None, None, None
    for line in lines:
        # Split on the first colon only, so values that contain ": " are kept whole
        if line.startswith("title:"):
            title = line.split(":", 1)[1].strip()
        elif line.startswith("description:"):
            description = line.split(":", 1)[1].strip()
        elif line.startswith("slug:"):
            slug = line.split(":", 1)[1].strip()
    if title and slug and description:
        return {"title": title, "description": description, "slug": slug}
    return None

def walk_dirs(root_dir):
    for root, dirs, files in os.walk(root_dir):
        yield root

def write_to_file(json_array, output_path):
    try:
        # Create parent directories if they don't exist ("." when the path has no directory part)
        os.makedirs(os.path.dirname(output_path) or ".", exist_ok=True)
        with open(output_path, "w") as f:
            json.dump(json_array, f, indent=4)
            f.write('\n')
    except OSError as e:
        if e.errno == 21:  # EISDIR: output_path points at a directory, not a file
            print(f"Output path is a directory: {e}")
        else:
            print(f"An error occurred writing {output_path}: {e}")

def main():

    # Extract script arguments
    args = parse_args()
    root_dir = args.dir
    if root_dir is None:
        print("Please provide a directory with argument --dir")
        sys.exit(1)

    if args.single_toc:
        json_items = []  # single list for all directories

    for directory in walk_dirs(root_dir):  # Walk directories

        if not args.single_toc:
            json_items = []  # new list for each directory

        for filename in os.listdir(directory):  # for each entry in the directory
            full_path = os.path.join(directory, filename)
            if not os.path.isfile(full_path):
                continue
            # index.md is ignored as we expect this to be the page for the table of contents
            if filename.endswith((".md", ".mdx")) and filename != "index.md":
                result = extract_title_description_slug(full_path)
                if result is not None:
                    json_items.append(result)

        if not args.single_toc:
            json_array = sorted(json_items, key=lambda x: x.get("title"))

            # don't write toc.json for empty folders
            if len(json_array) != 0:
                # write the sorted list, not the unsorted json_items
                write_to_file(json_array, os.path.join(directory, "toc.json"))

    if args.single_toc:
        json_array = sorted(json_items, key=lambda x: x.get("title"))
        # don't write toc.json for empty folders
        if len(json_array) != 0:
            write_to_file(json_array, os.path.join(root_dir, "toc.json"))

if __name__ == "__main__":
    main()
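
`extract_title_description_slug` scans every line of the file, so a body line that happens to start with `title:` would override the frontmatter value. A minimal sketch of a stricter parser, assuming the YAML frontmatter is delimited by `---` lines (the PR does not state this). `parse_frontmatter` and `toc_entry` are hypothetical helpers, not part of this PR. Unlike the script, the sketch also strips surrounding quotes from values.

```python
def parse_frontmatter(text: str) -> dict:
    """Collect top-level key/value pairs from a leading '---' delimited block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no frontmatter block at the top of the file
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter: stop before the body
            break
        # Split on the first colon only, so values may contain colons
        key, sep, value = line.partition(":")
        if sep and not key.startswith((" ", "\t", "#")):  # skip nested keys and comments
            fields[key.strip()] = value.strip().strip("'\"")
    return fields


def toc_entry(text: str):
    """Return a TOC entry, or None if any of the three required fields is missing."""
    fields = parse_frontmatter(text)
    required = ("title", "description", "slug")
    if all(fields.get(name) for name in required):
        return {name: fields[name] for name in required}
    return None


SAMPLE = """---
title: CSV
slug: /interfaces/formats/CSV
description: 'Comma-separated values: one row per line'
---

title: this body line is not frontmatter
"""

if __name__ == "__main__":
    print(toc_entry(SAMPLE))
```

This handles only flat `key: value` pairs. A full YAML parser would be the safer choice if the frontmatter uses multi-line or nested values.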