What's Changed
- fix: resolve issue #715 by @LollipopsAndWine in #971
- docs(README): update GPU hardware recommendations and table recognition options by @myhloli in #973
- docs: improve GPU support list formatting in README_zh-CN.md by @myhloli in #974
- docs: update feature description for table conversion by @myhloli in #975
- docs: update readme by @myhloli in #977
- update ci by @dt-yy in #986
- test(unittest): Restore unit test cases by @myhloli in #998
- refactor(tests): extract common test utilities into test_commons.py by @myhloli in #1001
- feat(ocr): improve handling of angled text boxes by @myhloli in #1010
- refactor(para): improve paragraph splitting logic by @myhloli in #1013
- build(setup): add old_linux specific dependencies by @myhloli in #1016
- refactor(para): adjust right margin threshold based on block width by @myhloli in #1018
- fix: replace old rw api with new data api by @icecraft in #1006
- delete unused pipeline file by @liugongjian in #1024
- refactor: move some constants or enums defs to config folder by @icecraft in #1027
- fix: remove test code by @icecraft in #1036
- fix(tools): handle empty language string in common.py by @myhloli in #1045
- refactor(ocr_dict_merge): add threshold parameter for line merging by @myhloli in #1046
- fix(ocr_mkcontent): improve hyphen handling at line ends by @myhloli in #1047
- fix(remove_overlaps_min_spans): optimize overlap detection in OCR span list modification by @myhloli in #1048
- feat(ocr): improve text detection and OCR accuracy by @myhloli in #1049
- refactor(txt_parse): improve text extraction accuracy with new algorithm by @myhloli in #1050
- fix: use concrete class instead of abstract class by @icecraft in #1052
- fix(pdf_parse): improve line stop flag detection accuracy by @myhloli in #1053
- test: comment out assertions for metascan classify and meta scan tests by @myhloli in #1054
- Add test cases to json compressor util by @liugongjian in #1056
- refactor(para): improve line stop flag and remove unused debug mode by @myhloli in #1058
- fix(table): add null check for OCR result in rapid table prediction by @myhloli in #1060
- refactor(model): move page total time logging to custom model analysis by @myhloli in #1061
- fix(table): add null check for OCR result in rapid table prediction by @myhloli in #1062
- fix(pdf_parse): improve OCR result handling by @myhloli in #1064
New Contributors
- @liugongjian made their first contribution in #1024
Full Changelog: magic_pdf-0.9.3-released...magic_pdf-0.10.0-released