Releases: Layout-Parser/layout-parser
v0.3.4: Patch Release
Bug fixes
- fix one critical bug for visualization mentioned in #131 by @lolipopshock in #132
Full Changelog: v0.3.3...v0.3.4
v0.3.3: Patch Release
Functional Updates
- Robust pdf loading for empty pages by @lolipopshock in #115
- Fix for issue #94: avoid TesseractAgent.detect() inferring any sequence of digits as a float, by @K-for-Code in #95
- Better layout comparison by @lolipopshock in #128
- Better visualization functions by @lolipopshock in #129
Example Updates
- Minor update to Deep Learning Parser example notebook by @Jim-Salmons in #56
- Set `inplace` to True in sorting function by @yusanshi in #104
- Add notebook for customizing LayoutParser Models with Label Studio Annotation by @lolipopshock in #124
New Contributors
- @Jim-Salmons made their first contribution in #56
- @yusanshi made their first contribution in #104
- @K-for-Code made their first contribution in #95
Full Changelog: v0.3.2...v0.3.3
v0.3.2: Patch Release
v0.3.1: Patch Release
v0.3.0: Multi-backend Support, Additional Models, Better Visualizations, and many more
We are excited to release LayoutParser v0.3.0, with a lot of exciting updates and functional improvements.
New Features
- The biggest change in this version is that LayoutParser now supports multiple deep learning backends: Detectron2, effdet, and paddledetection. This allows more flexible use of the `layoutparser` library and makes it easier to implement customized layout models in the future. #54 #67
- Additionally, the newly added `AutoModel` and improved model configuration parsing make it easier to load and use the layout detection models, e.g., `model = lp.AutoLayoutModel("lp://efficientdet/PubLayNet")`. #69
- To support this multi-backend framework, we implemented a dynamic importing mechanism as well as better ways of installing `layoutparser` and the needed dependencies (see instructions). #65 #68
- `layoutparser` now supports directly loading PDF files as `layout` objects: #71
    import layoutparser as lp
    # load_images=True also returns the rendered page images
    pdf_layout, pdf_images = lp.load_pdf("path/to/pdf", load_images=True)
    lp.draw_box(pdf_images[0], pdf_layout[0])
- To support more flexible processing of the layout objects, a set of new toolkits is available: #72
    import layoutparser as lp
    page_layout = lp.load_pdf("tests/fixtures/io/example.pdf")[0]
    pdf_lines = lp.simple_line_detection(page_layout)
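As a rough illustration of what grouping page tokens into text lines can involve, the sketch below clusters word boxes by vertical overlap. It is not the implementation of `lp.simple_line_detection`: the `Box` class, the `group_into_lines` helper, and the 0.5 overlap threshold are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Box:
    """Hypothetical axis-aligned token box (not a layoutparser class)."""
    x1: float
    y1: float
    x2: float
    y2: float
    text: str = ""


def vertical_overlap(a: Box, b: Box) -> float:
    """Overlap of the two y-ranges as a fraction of the shorter box height."""
    overlap = min(a.y2, b.y2) - max(a.y1, b.y1)
    shorter = min(a.y2 - a.y1, b.y2 - b.y1)
    return max(overlap, 0.0) / shorter if shorter > 0 else 0.0


def group_into_lines(boxes: List[Box], min_overlap: float = 0.5) -> List[List[Box]]:
    """Greedily add each box to the first line whose latest box it overlaps vertically."""
    lines: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: (b.y1, b.x1)):
        for line in lines:
            if vertical_overlap(line[-1], box) >= min_overlap:
                line.append(box)
                break
        else:
            # No existing line matched, so this box starts a new one.
            lines.append([box])
    # Order tokens left to right within each line.
    return [sorted(line, key=lambda b: b.x1) for line in lines]


tokens = [
    Box(120, 10, 180, 22, "world"),
    Box(10, 11, 100, 23, "Hello"),
    Box(10, 40, 90, 52, "Second"),
    Box(100, 41, 150, 53, "line"),
]
lines = group_into_lines(tokens)
line_texts = [" ".join(b.text for b in line) for line in lines]
```

A greedy pass like this is easy to follow but assumes roughly horizontal text; skewed or multi-column pages would need a more careful grouping rule.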
New Models
- Add MFD model that can detect (display) equation regions within scientific documents #59
Layout Parser v0.2.0: New features, models, and improvements!
Layout Parser v0.2.0 Release Notes
New Features
- Support for loading and exporting the layout data in `json` and `csv`, see #6
- Add support for `union` and `intersect` operations, see #20 and the detailed explanation
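For readers unfamiliar with the geometry behind `union` and `intersect`, here is a minimal sketch for axis-aligned rectangles. The `Rect` class is hypothetical and only illustrates the idea; the library's own classes and their exact behavior (for example, what they return when the shapes do not overlap) may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rect:
    """Hypothetical axis-aligned rectangle; not layoutparser's own class."""
    x1: float
    y1: float
    x2: float
    y2: float

    def union(self, other: "Rect") -> "Rect":
        """Smallest rectangle that contains both rectangles."""
        return Rect(min(self.x1, other.x1), min(self.y1, other.y1),
                    max(self.x2, other.x2), max(self.y2, other.y2))

    def intersect(self, other: "Rect") -> Optional["Rect"]:
        """Overlapping region, or None when the rectangles do not overlap."""
        x1, y1 = max(self.x1, other.x1), max(self.y1, other.y1)
        x2, y2 = min(self.x2, other.x2), min(self.y2, other.y2)
        if x1 >= x2 or y1 >= y2:
            return None
        return Rect(x1, y1, x2, y2)


a = Rect(0, 0, 10, 10)
b = Rect(5, 5, 20, 15)
merged = a.union(b)       # bounding box of both rectangles
shared = a.intersect(b)   # region covered by both rectangles
```

Returning `None` for disjoint rectangles is a design choice made for this sketch so that callers can test for overlap explicitly.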
Improvements
- Functional improvements:
  - When loading Layout Parser official models, `Detectron2LayoutModel` can automatically detect the `label_map`. For example,
      model = lp.Detectron2LayoutModel("lp://HJDataset/faster_rcnn_R_50_FPN_3x/config")
      model.label_map # {1: 'Page Frame', ... }
  - `Detectron2LayoutModel` now supports the `enforce_cpu` flag that enforces using the CPU even when CUDA devices are available.
  - `visualization.draw_box` now supports a `show_element_type` flag that shows the bbox category name at the top-left corner of the layout objects.
- Improve installation command and documentation, especially for installing Detectron2 on Windows platforms #25
New Models
- Add the table bank detection models that can identify table regions
Fixes
New models and bug fixes
Improvements:
- Supports lazy loading for the Detectron2 module. Now the dependency for Detectron2 will be requested only when you explicitly create a `Detectron2LayoutModel` object. This might be helpful for using the plain `layoutparser` library without installing the Detectron2 module.
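The general lazy-loading pattern can be sketched as follows. This is a generic illustration, not the library's actual code: `require_backend`, `LazyModel`, and `MissingBackendModel` are hypothetical names, and the standard-library `json` module stands in for a heavy backend so the example runs anywhere.

```python
import importlib
from types import ModuleType


def require_backend(module_name: str) -> ModuleType:
    """Import a backend on demand, with a readable error if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"The '{module_name}' backend is required for this model; "
            "install it before creating the model."
        ) from exc


class LazyModel:
    """Hypothetical model wrapper: the backend is imported in __init__,
    not when the enclosing module is imported."""

    backend_name = "json"  # stand-in module so the sketch runs anywhere

    def __init__(self) -> None:
        self.backend = require_backend(self.backend_name)


class MissingBackendModel(LazyModel):
    backend_name = "a_backend_that_is_not_installed"  # deliberately absent


model = LazyModel()  # the import happens only at this point

try:
    MissingBackendModel()
except ImportError as err:
    message = str(err)
```

Defining the classes never touches the backend, so code that only uses the rest of the package keeps working when the optional dependency is absent.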
New models:
- Incorporated a pre-trained model based on the NewspaperNavigator dataset: `lp://NewspaperNavigator/faster_rcnn_R_50_FPN_3x/config`
Fixes:
- Corrected a bug in visualization that might overwrite the original image
New models and improvements
In this version, we released a new model for publaynet and made several improvements:
- We released the `mask_rcnn_X_101_32x8d_FPN_3x` model trained on the `publaynet` dataset. Note: it has been trained on the full training set (while the others are trained only on the validation set), and you can expect a 15% performance improvement from this new model.
- We improved the support for PIL images for both layout modeling and visualization
- We improved the Default Language Settings for the Tesseract OCR model
Model fixes and updates
Fixes
- Fixed a bug that could cause errors in loading Prima Models
Updates
- Updated the Prima Mask R-CNN model for higher accuracy and listed detailed evaluation reports.
v0.1.0: The foundation version that covers four major functionalities
`layoutparser` now supports the following functionalities:
- Coordinate system:
  - Supports the 3 basic coordinate systems and their geometric relationships
  - Supports the TextBlock and Layout system for convenient coordinate and text processing
- OCR System:
  - Supports OCR based on Google Cloud Vision and Tesseract API.
- Layout Modeling:
  - Supports using pre-trained Deep Learning models for layout object detection using Detectron2
- Visualization:
  - Supports highly-customizable presentation of the box coordinates and text in the detected layout
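To make the "geometric relationships" between coordinate types in the Coordinate system item more concrete, the sketch below converts between a 1-D interval, an axis-aligned rectangle, and a four-point quadrilateral using plain tuples. The helper names are hypothetical and do not mirror the library's classes or methods.

```python
from typing import List, Tuple

Point = Tuple[float, float]
RectTuple = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def rect_to_quad(rect: RectTuple) -> List[Point]:
    """Corner points of a rectangle, starting at (x1, y1) and moving along the top edge first."""
    x1, y1, x2, y2 = rect
    return [(x1, y1), (x2, y1), (x2, y2), (x1, y2)]


def quad_to_rect(points: List[Point]) -> RectTuple:
    """Tightest axis-aligned rectangle around four points."""
    xs, ys = zip(*points)
    return (min(xs), min(ys), max(xs), max(ys))


def rect_to_interval(rect: RectTuple, axis: str = "x") -> Tuple[float, float]:
    """Project a rectangle onto one axis, giving a 1-D interval."""
    x1, y1, x2, y2 = rect
    return (x1, x2) if axis == "x" else (y1, y2)


skewed = [(1, 0), (11, 2), (10, 8), (0, 6)]   # a slightly rotated box
bounding = quad_to_rect(skewed)               # enclosing rectangle
rows = rect_to_interval(bounding, axis="y")   # vertical extent only
```

Going from a quadrilateral to a rectangle loses the rotation, and going from a rectangle to an interval loses one axis, so these conversions are one-way unless the dropped information is kept elsewhere.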