Fix/fix table id checking logic #3898

Merged
cragwolfe merged 2 commits into main from fix/fix-table-id-checking-logic
Jan 31, 2025

Conversation

badGarnet
Collaborator

@badGarnet badGarnet commented Jan 31, 2025

  • There is a bug in deciding whether a page has tables before performing table extraction: the logic checks whether the id associated with the Table element type is truthy.
  • However, it should check whether the id is `None`, because the id can legitimately be 0 (when Table is the first element type on the page), and 0 is falsy.
  • The fix updates the check from `if not table_id` to `if table_id is None`.
  • Adds a unit test for this specific case.
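
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the project's code: `TABLE`, `find_table_id`, `has_tables_buggy`, and `has_tables_fixed` are hypothetical stand-ins, and the `{class_id: element_type}` map shape is an assumption inferred from the dict inversion in the diff.

```python
# Hypothetical stand-ins that mimic the shape of the lookup in the diff;
# they are not the actual unstructured / unstructured_inference objects.
TABLE = "Table"


def find_table_id(element_class_id_map):
    """Invert an assumed {class_id: element_type} map and look up TABLE's id."""
    return {v: k for k, v in element_class_id_map.items()}.get(TABLE)


def has_tables_buggy(element_class_id_map):
    # Truthiness check: an id of 0 is falsy, so it looks like a missing id.
    return bool(find_table_id(element_class_id_map))


def has_tables_fixed(element_class_id_map):
    # Identity check: only a missing id (None) means "no tables".
    return find_table_id(element_class_id_map) is not None


table_first = {0: "Table", 1: "Text"}  # Table happens to get id 0
no_tables = {0: "Text", 1: "Title"}    # no Table entry at all

print(has_tables_buggy(table_first))  # False: table extraction would be skipped
print(has_tables_fixed(table_first))  # True
print(has_tables_fixed(no_tables))    # False
```

The two checks only disagree when the Table id is 0, which is why the bug would surface only on pages where Table is the first element type.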

@badGarnet badGarnet requested a review from vangheem January 31, 2025 17:22
@badGarnet badGarnet marked this pull request as ready for review January 31, 2025 17:22
@@ -276,7 +276,7 @@ def supplement_element_with_table_extraction(
from unstructured_inference.models.tables import cells_to_html

table_id = {v: k for k, v in elements.element_class_id_map.items()}.get(ElementType.TABLE)
- if not table_id:
+ if table_id is None:
Contributor

🤦

@badGarnet badGarnet enabled auto-merge January 31, 2025 17:47
@badGarnet badGarnet added this pull request to the merge queue Jan 31, 2025
@cragwolfe cragwolfe removed this pull request from the merge queue due to a manual request Jan 31, 2025
@cragwolfe cragwolfe merged commit 9d58b34 into main Jan 31, 2025
41 checks passed
@cragwolfe cragwolfe deleted the fix/fix-table-id-checking-logic branch January 31, 2025 18:19
3 participants