Added github workflow for automatic parsing and updated code accordingly #26

Merged 8 commits on Jan 8, 2025
69 changes: 69 additions & 0 deletions .github/workflows/parse_fyrliste.yaml
@@ -0,0 +1,69 @@
name: Run Python Script and Create PR

on:
  schedule:
    - cron: "0 0 * * 0" # Runs weekly at midnight UTC on Sunday
  workflow_dispatch: # Allows manual triggering

jobs:
  generate-and-pr:
    runs-on: ubuntu-latest

    steps:
      # Step 1: Check out the repository
      - name: Checkout Repository
        uses: actions/checkout@v3
        with:
          fetch-depth: 0 # Needed for creating branches
          ref: main

      # Step 2: Set up Python
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.11" # Specify your Python version

      # Step 3: Install dependencies (if any)
      - name: Install Dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r parse_fyrlys/requirements.txt # If you have dependencies

      # Step 4: Run the Python script
      - name: Run Script
        run: |
          python parse_fyrlys/parse.py # Update with your script path

      # Step 5: Configure Git
      - name: Configure Git
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"

      # Step 6: Check for changes and commit
      - name: Commit Changes
        id: commit_changes
        run: |
          git add lighthouses.qml parse_fyrlys/lighthouses.json
          if git diff --cached --quiet; then
            echo "No changes to commit."
            echo "::set-output name=changes::false"
          else
            git commit -m "Update generated files [skip ci]"
            echo "::set-output name=changes::true"
          fi

      # Step 7: Create Pull Request if there are changes
      - name: Create Pull Request
        if: steps.commit_changes.outputs.changes == 'true'
        uses: peter-evans/create-pull-request@v5
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
          commit-message: Update generated files
          branch: update-generated-files-${{ github.run_number }}
          title: "Update Generated Files"
          body: |
            This PR updates the generated files based on the latest run.
          labels: automated-pr
          # You can specify the base branch if different from the default
          base: main
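
A note on the `Commit Changes` step: the `::set-output` workflow command is deprecated in GitHub Actions in favor of appending `name=value` lines to the file named by the `GITHUB_OUTPUT` environment variable. Since this job already sets up Python, a minimal sketch of a helper that writes a step output that way could look like the following. The name `set_step_output` is hypothetical and not part of this PR, and the sketch only handles single-line values.

```python
import os


def set_step_output(name: str, value: str, env=None) -> None:
    """Append `name=value` to the file GitHub Actions exposes as GITHUB_OUTPUT.

    Hypothetical helper, not part of the PR. Single-line values only.
    """
    env = os.environ if env is None else env
    output_path = env.get("GITHUB_OUTPUT")
    if not output_path:
        # Not running inside GitHub Actions: just show what would be written.
        print(f"{name}={value}")
        return
    with open(output_path, "a", encoding="utf-8") as f:
        f.write(f"{name}={value}\n")
```

In the shell step itself the equivalent would be `echo "changes=true" >> "$GITHUB_OUTPUT"`; the `if:` condition on the later step would stay as it is.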
16 changes: 13 additions & 3 deletions parse_fyrlys/parse.py
@@ -1,5 +1,6 @@
 import json
 import time
+import os
 from tqdm import tqdm
 import pdfplumber
 from parse_utils import color_map, dump_qml,extract_character, merge_text_elements, extract_character, find_text, find_element_containing_point, find_text_element_containing_point, extract_text_elements, perform_text_extraction, SCALING_FACTOR
@@ -153,7 +154,15 @@ def should_keep_lighthouse(lighthouse):
     }
     return lighthouses_on_page.values()

-pdf_path = "Fyrliste_HeleLandet.pdf"
+
+pdf_path = "parse_fyrlys/Fyrliste_HeleLandet.pdf"
+if not os.path.exists(pdf_path):
+    print("Downloading Fyrliste_HeleLandet.pdf from https://nfs.kystverket.no/fyrlister/Fyrliste_HeleLandet.pdf")
+    # Download from https://nfs.kystverket.no/fyrlister/Fyrliste_HeleLandet.pdf
+    import requests
+    response = requests.get("https://nfs.kystverket.no/fyrlister/Fyrliste_HeleLandet.pdf")
+    with open(pdf_path, "wb") as f:
+        f.write(response.content)

 total_number_of_lighthouses = 0
 lighthouses = []
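
The download added in this hunk writes `response.content` to disk without checking the HTTP status or setting a timeout, so an error page could be saved under the PDF's name and then skipped on later runs because the file exists. A minimal sketch of a more defensive variant follows. `download_if_missing`, `fetch_with_requests`, and the `fetch` parameter are hypothetical names introduced here for illustration, and the 60-second timeout is an arbitrary assumption.

```python
import os

FYRLISTE_URL = "https://nfs.kystverket.no/fyrlister/Fyrliste_HeleLandet.pdf"


def fetch_with_requests(url: str) -> bytes:
    """Fetch `url` and return the response body, raising on HTTP errors."""
    import requests  # imported lazily, as in the PR

    response = requests.get(url, timeout=60)  # timeout value is an assumption
    response.raise_for_status()  # fail on 4xx/5xx instead of saving an error page
    return response.content


def download_if_missing(path: str, url: str = FYRLISTE_URL, fetch=fetch_with_requests) -> bool:
    """Download `url` to `path` unless the file already exists.

    Returns True if a download happened, False if the file was already there.
    """
    if os.path.exists(path):
        return False
    data = fetch(url)
    # Write under a temporary name first so a failed write never leaves
    # a partial file at the final path.
    tmp_path = path + ".part"
    with open(tmp_path, "wb") as f:
        f.write(data)
    os.replace(tmp_path, path)
    return True
```

In `parse.py` this would be called as `download_if_missing(pdf_path)` before the PDF is opened. The `fetch` parameter exists only so the function can be exercised without network access.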
@@ -170,11 +179,12 @@ def should_keep_lighthouse(lighthouse):
     text_elements = perform_text_extraction(pdf_page)
     lighthouses_on_page = parse_lighthouses(text_elements)
     lighthouses.extend(lighthouses_on_page)
+
 lighthouses_as_dicts = [asdict(lighthouse) for lighthouse in lighthouses]
-with open("lighthouses.json", "w") as f:
+with open("parse_fyrlys/lighthouses.json", "w") as f:
     json.dump(lighthouses_as_dicts, f, indent=2, ensure_ascii=False)
 qml_string = dump_qml(lighthouses_as_dicts)
-with open("../lighthouses.qml", "w") as f:
+with open("lighthouses.qml", "w") as f:
     f.write(qml_string)
 print("total_number_of_lighthouses: ", total_number_of_lighthouses)
 print("total_real_number_of_lighthouses: ", len(lighthouses))
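
With this change the input and output paths are relative to the repository root instead of `parse_fyrlys/`, so the script only finds its files when started from the root, as the workflow does. One possible alternative, sketched below under the assumption that `parse.py` stays in `parse_fyrlys/` directly under the repository root, derives the paths from the script's own location. `resolve_paths` is a hypothetical helper, not code from this PR.

```python
from pathlib import Path


def resolve_paths(script_file: str) -> dict:
    """Derive input and output paths from the location of parse.py.

    Hypothetical helper. Assumes the script lives in <repo>/parse_fyrlys/.
    """
    script_dir = Path(script_file).resolve().parent
    repo_root = script_dir.parent
    return {
        "pdf": script_dir / "Fyrliste_HeleLandet.pdf",
        "json": script_dir / "lighthouses.json",
        "qml": repo_root / "lighthouses.qml",
    }


# Inside parse.py this would be called as: paths = resolve_paths(__file__)
```

The script would then behave the same whether it is run from the repository root or from inside `parse_fyrlys/`.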
3 changes: 3 additions & 0 deletions parse_fyrlys/requirements.txt
@@ -0,0 +1,3 @@
pdfplumber==0.11.4
tqdm==4.67.1
requests==2.32.3