Translated PDF text overlaps the original text (high-quality scan) #235

eliasjudin · 2024-12-15T07:30:42Z

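I tried using this project to translate a high-quality scanned PDF, but the translated text is overlaid on top of the original text, which makes the result unreadable. This may be because the PDF has no text layer and contains only scanned images.

The original and translated PDF files are attached for reference.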

pdf2zh original.pdf -li ru -lo en -p 2 -f "(CM[^R]|(MS|XY|MT|BL|RM|EU|LA|RS)[A-Z]|LINE|LCIRCLE|TeX-|rsfs|txsy|wasy|stmary|.*Mono|.*Code|.*Ital|.*Sym|.*Math)"

Byaidu · 2024-12-15T07:37:29Z

#19

hellofinch · 2024-12-15T07:37:33Z

Scanned files are not well supported. See #19.

eliasjudin · 2024-12-15T07:46:13Z

> Scanned files are not well supported. See #19.

So the file requires a text layer? If I run OCR to add a text layer, will it work?

hellofinch · 2024-12-16T02:24:46Z

I'm not sure. The text is extracted from the PDF itself, not from the image.
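
For anyone hitting the same problem, a rough way to tell whether a PDF is likely image-only before translating it is to look for font resources, since a page with extractable text normally references at least one font. The sketch below is only a heuristic using the Python standard library; `has_font_resources` is a hypothetical helper written for illustration and is not part of pdf2zh. It assumes streams are either uncompressed or Flate-compressed, and a dedicated PDF library would give a more reliable answer.

```python
import re
import zlib

# Matches the raw body of every PDF stream object.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def has_font_resources(pdf_bytes: bytes) -> bool:
    """Heuristic: True if '/Font' appears in the raw or Flate-decoded bytes.

    A scan without a text layer usually contains only image objects,
    so no font is ever referenced. This is an assumption, not a guarantee.
    """
    if b"/Font" in pdf_bytes:
        return True
    # Newer PDFs may keep their dictionaries inside compressed object streams.
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            data = zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not Flate data (for example a JPEG image stream)
        if b"/Font" in data:
            return True
    return False


def file_has_font_resources(path: str) -> bool:
    """Convenience wrapper that reads the file from disk."""
    with open(path, "rb") as handle:
        return has_font_resources(handle.read())
```

If `file_has_font_resources("original.pdf")` returns False, the file is probably a pure scan, which matches the behavior described in this issue. A True result does not prove that the text layer is usable.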

Byaidu closed this as completed Dec 15, 2024
