Byaidu committed Nov 23, 2024
2 parents e550f8b + 913a2e9 commit b136e2d
Showing 23 changed files with 679 additions and 420 deletions.
31 changes: 31 additions & 0 deletions .github/workflows/python-build.yml
@@ -0,0 +1,31 @@
name: Build Python Package

on:
  push:
    branches:
      - main
  pull_request:

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v3
        with:
          python-version: '3.x'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install build flake8 black
      - name: Check code format
        run: |
          black --check --diff --color pdf2zh/*.py
          flake8
      - name: Build package
        run: python -m build
14 changes: 14 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,14 @@
# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks
files: '^.*\.py$'
repos:
  - repo: local
    hooks:
      - id: black
        name: black
        entry: black --check --diff --color
        language: python
      - id: flake8
        name: flake8
        entry: flake8
        language: python
6 changes: 4 additions & 2 deletions README.md
@@ -37,13 +37,15 @@ Feel free to provide feedback in [GitHub Issues](https://github.com/Byaidu/PDFMa

<h2 id="updates">Updates</h2>

- [Nov. 23 2024] Firewall for preventing web bots *(by [@Byaidu](https://github.com/Byaidu))*
- [Nov. 22 2024] GUI now supports Italian, and has been improved *(by [@Byaidu](https://github.com/Byaidu), [@reycn](https://github.com/reycn))*
- [Nov. 22 2024] You can now share your deployed service to others *(by [@Zxis233](https://github.com/Zxis233))*
- [Nov. 22 2024] Now supportsTencent Translation *(by [@hellofinch](https://github.com/hellofinch))*
- [Nov. 21 2024] GUI now supports downloading dual-document *(by [@reycn](https://github.com/reycn))*
- [Nov. 20 2024] GUI now supports specifying Ollama and OpenAI models *(by [@IuvenisSapiens](https://github.com/IuvenisSapiens), [@Byaidu](https://github.com/Byaidu))*
- [Nov. 20 2024] 🌟 [Demo](#demo) online! *(by [@reycn](https://github.com/reycn))*
- [Nov. 20 2024] Supports [Docker](#docker) *(by [@Byaidu](https://github.com/Byaidu))*
- [Nov. 20 2024] Supports [multiple-threads translation](#threads) *(by [@Byaidu](https://github.com/Byaidu))*
- [Nov. 19 2024] Provides an [interactive graphical user interface](#gui) *(by [@reycn](https://github.com/reycn))*
- [Nov. 18 2024] Supports [more services: DeepL, DeepLX, and Azure](#services) *(by [@reycn](https://github.com/reycn), [@Hanaasagi](https://github.com/Hanaasagi))*

<h2 id="preview">Preview</h2>

9 changes: 6 additions & 3 deletions README_zh-CN.md
@@ -37,13 +37,16 @@

<h2 id="updates">Recent Updates</h2>


- [Nov. 23 2024] Firewall to block web crawlers *(by [@Byaidu](https://github.com/Byaidu))*
- [Nov. 22 2024] The GUI now supports Italian and has received several updates *(by [@Byaidu](https://github.com/Byaidu), [@reycn](https://github.com/reycn))*
- [Nov. 22 2024] You can now share your self-deployed service with friends *(by [@Zxis233](https://github.com/Zxis233))*
- [Nov. 22 2024] Now supportsTencent Translation *(by [@hellofinch](https://github.com/hellofinch))*
- [Nov. 21 2024] The GUI now supports downloading bilingual documents *(by [@reycn](https://github.com/reycn))*
- [Nov. 20 2024] The GUI now supports specifying Ollama and OpenAI models *(by [@IuvenisSapiens](https://github.com/IuvenisSapiens), [@Byaidu](https://github.com/Byaidu))*
- [Nov. 20 2024] 🌟 [Online demo](#demo) available *(by [@reycn](https://github.com/reycn))*
- [Nov. 20 2024] Supports [containerized deployment](#docker) *(by [@Byaidu](https://github.com/Byaidu))*
- [Nov. 20 2024] Supports faster [multi-threaded translation](#threads) *(by [@Byaidu](https://github.com/Byaidu))*
- [Nov. 19 2024] Provides a [graphical user interface](#gui) *(by [@reycn](https://github.com/reycn))*
- [Nov. 18 2024] Supports more translation services, including [DeepL, DeepLX, and Azure](#services) *(by [@reycn](https://github.com/reycn), [@Hanaasagi](https://github.com/Hanaasagi))*
- [Nov. 20 2024] Supports faster [multi-threaded translation](#threads) *(by [@Byaidu](https://github.com/Byaidu))*

<h2 id="preview">Preview</h2>

28 changes: 18 additions & 10 deletions pdf2zh/cache.py
@@ -3,9 +3,10 @@
 import time
 import hashlib
 import shutil
-cache_dir = os.path.join(tempfile.gettempdir(), 'cache')
+
+cache_dir = os.path.join(tempfile.gettempdir(), "cache")
 os.makedirs(cache_dir, exist_ok=True)
-time_filename = 'update_time'
+time_filename = "update_time"
 max_cache = 5


@@ -16,25 +17,30 @@ def deterministic_hash(obj):


 def get_dirs():
-    dirs = [os.path.join(cache_dir, dir) for dir in os.listdir(cache_dir) if os.path.isdir(os.path.join(cache_dir, dir))]
+    dirs = [
+        os.path.join(cache_dir, dir)
+        for dir in os.listdir(cache_dir)
+        if os.path.isdir(os.path.join(cache_dir, dir))
+    ]
     return dirs


 def get_time(dir):
     try:
         timefile = os.path.join(dir, time_filename)
-        t = float(open(timefile, encoding='utf-8').read())
+        t = float(open(timefile, encoding="utf-8").read())
         return t
     except FileNotFoundError:
         # handle the error as needed, for now we'll just return a default value
-        return float('inf') # This ensures that this directory will be the first to be removed if required
-
+        return float(
+            "inf"
+        ) # This ensures that this directory will be the first to be removed if required


 def write_time(dir):
     timefile = os.path.join(dir, time_filename)
     t = time.time()
-    print(t, file=open(timefile, "w", encoding='utf-8'), end='')
+    print(t, file=open(timefile, "w", encoding="utf-8"), end="")


 def argmin(iterable):
@@ -44,7 +50,9 @@ def argmin(iterable):
 def remove_extra():
     dirs = get_dirs()
     for dir in dirs:
-        if not os.path.isdir(dir): # This line might be redundant now, as get_dirs() ensures only directories are returned
+        if not os.path.isdir(
+            dir
+        ): # This line might be redundant now, as get_dirs() ensures only directories are returned
             os.remove(dir)
         try:
             get_time(dir)
@@ -73,11 +81,11 @@ def create_cache(hash_key):
 def load_paragraph(hash_key, hash_key_paragraph):
     filename = os.path.join(cache_dir, hash_key, hash_key_paragraph)
     if os.path.exists(filename):
-        return open(filename, encoding='utf-8').read()
+        return open(filename, encoding="utf-8").read()
     else:
         return None


 def write_paragraph(hash_key, hash_key_paragraph, paragraph):
     filename = os.path.join(cache_dir, hash_key, hash_key_paragraph)
-    print(paragraph, file=open(filename, "w", encoding='utf-8'), end='')
+    print(paragraph, file=open(filename, "w", encoding="utf-8"), end="")
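
Review note: the hunks above only change quoting and line wrapping, but `load_paragraph` and `write_paragraph` still open files without closing them explicitly. Below is a minimal sketch of the same two helpers using `with` blocks. It is not part of this commit. The `cache_sketch` directory name and the on-demand `os.makedirs` call are assumptions added so the example runs on its own (the diff suggests `create_cache` normally prepares the directory), and it assumes `paragraph` is already a `str`.

```python
import os
import tempfile

# Hypothetical stand-in for the module-level cache_dir in pdf2zh/cache.py.
cache_dir = os.path.join(tempfile.gettempdir(), "cache_sketch")
os.makedirs(cache_dir, exist_ok=True)


def write_paragraph(hash_key, hash_key_paragraph, paragraph):
    """Store one paragraph; the with-block closes the file on exit."""
    entry_dir = os.path.join(cache_dir, hash_key)
    # Assumption: create the per-document directory on demand so the
    # sketch does not depend on a separate create_cache() call.
    os.makedirs(entry_dir, exist_ok=True)
    filename = os.path.join(entry_dir, hash_key_paragraph)
    with open(filename, "w", encoding="utf-8") as f:
        f.write(paragraph)


def load_paragraph(hash_key, hash_key_paragraph):
    """Return the cached paragraph, or None if it was never written."""
    filename = os.path.join(cache_dir, hash_key, hash_key_paragraph)
    try:
        with open(filename, encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        return None
```

Catching `FileNotFoundError` instead of checking `os.path.exists` first avoids a race if another process evicts the cache entry between the check and the read.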