Operating System and architecture (arm64, amd64, x86, etc.)
No response
What is your Java version
java 22.0.2 2024-07-16
Log and information
No response
Further information
Hi,
I have multiple PDFs containing tables that span multiple pages. For example:
| Column 1 | Column 2 | Column 3 |
|---|---|---|
| In the quiet town of Elmford, nestled between rolling hills and winding rivers, a peculiar event took place one autumn afternoon. | Children often played near the base of the tower, imagining it to be a gateway to another world. Among them was little Sophie, an adventurous girl with a wild imagination. | One day, while chasing a butterfly near the clock tower, Sophie stumbled upon an old, weathered key hidden beneath a pile of fallen leaves. Her heart raced with excitement as she wondered what the key might unlock. |

----------------------------------------------------End of page 1----------------------------------------------------------------

| Word of Sophie's discovery spread quickly through Elmford, and soon, the entire town was buzzing with anticipation. Some believed the key would unlock the mystery of the clock tower, while others thought it might lead to hidden treasure long forgotten. As dusk settled in and the town prepared for the annual harvest festival, Sophie stood in front of the clock tower, key in hand, ready to uncover the secrets it had kept for so long. | At the center of town stood an old clock tower, its hands frozen at 3:15, a mystery that had puzzled the townsfolk for decades. No one knew when, or why, the clock had stopped ticking, but it had become a symbol of the town's timeless charm. | One day, while chasing a butterfly near the clock tower, Sophie stumbled upon an old, weathered key hidden beneath a pile of fallen leaves. Her heart raced with excitement as she wondered what the key might unlock. |
Problem:
When I extract this information using the grobid_python_client, all of the content on page 2 is assigned to Column 3. The client cannot recover the content of each column when a table spans multiple pages.
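To make the problem easier to reproduce, it may help to list the cells GROBID found in each table row. The snippet below is only a minimal sketch: it assumes the TEI output represents a table as a `figure` element with `type="table"` containing `row` and `cell` elements in the TEI namespace, and the `SAMPLE` string is a hand-written, hypothetical stand-in rather than real GROBID output. The helper name `table_cells` is also made up for this illustration.

```python
import xml.etree.ElementTree as ET

# TEI namespace used by TEI XML documents.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def table_cells(tei_xml: str):
    """Return one entry per table: a list of rows, each a list of cell texts.

    Assumes tables appear as <figure type="table"> with <row>/<cell> children.
    """
    root = ET.fromstring(tei_xml)
    tables = []
    for figure in root.iterfind(".//tei:figure[@type='table']", TEI_NS):
        rows = []
        for row in figure.iterfind(".//tei:row", TEI_NS):
            cells = [
                "".join(cell.itertext()).strip()
                for cell in row.iterfind("tei:cell", TEI_NS)
            ]
            rows.append(cells)
        tables.append(rows)
    return tables


# Hypothetical, hand-written sample (NOT real GROBID output); the last cell
# imitates page-2 text being merged into Column 3.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
<figure type="table" xml:id="tab_0">
<table>
<row><cell>Column 1</cell><cell>Column 2</cell><cell>Column 3</cell></row>
<row><cell>a</cell><cell>b</cell><cell>c page-2 text</cell></row>
</table>
</figure>
</body></text></TEI>"""

for rows in table_cells(SAMPLE):
    for row in rows:
        # Print the cell count next to the cells so merged columns stand out.
        print(len(row), row)
```

Running the same function on the TEI produced for the real PDF (read from the output file into a string) should show whether the page-2 text ends up in a single cell, which would be useful detail to attach to this report.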
Hi @BC-Naman, thanks for reporting this issue. We are currently focusing on tables that are contained within a single page, so your case is a bit of an edge case. However, if you could provide examples, that would be helpful. For this year (2025) we plan to make progress on #963, so more examples are welcome.
The licence matters, as we can use only CC-BY examples as training data. Non-CC-BY examples would be used only as test cases.