Additional numeric metadata #37

Open
ribenamaplesyrup opened this issue Jul 12, 2023 · 1 comment
Comments

@ribenamaplesyrup

Thank you for sharing this excellent library, @drkane! I'm currently using it to extract numeric data from Companies House reports, and I'm assessing how feasible it would be to determine which table in a document each numeric object belongs to. Looking at the XBRL format, it seems possible to tell which page a numeric object belongs to, but tables could be trickier, as the HTML table element seems to be used for non-numeric content as well. Is this a feature you have considered?

@drkane
Member

drkane commented Jul 12, 2023

Hi - glad it's useful.

(Deleted my initial response as I got the wrong end of the stick.) I think this is probably possible, although the structure of the XBRL files doesn't always make it easy. HTML tables are used a lot to help with file layout, so there might be some confusion over which are "real" tables.

A good starting point might be the functionality that @avyfain added to get back the HTML tag for a given element. This should allow you to find where in the document the element is situated.
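
To make the idea of locating an element concrete, here is a minimal sketch that uses only Python's standard library, not this library's API. It assumes numeric facts are tagged as `ix:nonFraction` (as in Inline XBRL) and records the innermost enclosing `<table>` for each one. The sample document and its element names (`uk-gaap:Turnover` and so on) are made up for illustration, and `TableLocator` and `locate_facts` are hypothetical names.

```python
from html.parser import HTMLParser


class TableLocator(HTMLParser):
    """Record, for each ix:nonFraction fact, the index of its innermost <table>."""

    def __init__(self):
        super().__init__()
        self.table_count = 0   # tables seen so far, in document order
        self.table_stack = []  # indices of the tables that are currently open
        self.current = None    # the fact being read, if any
        self.facts = []        # collected facts

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.table_stack.append(self.table_count)
            self.table_count += 1
        elif tag == "ix:nonfraction":  # HTMLParser lowercases tag names
            self.current = {
                "name": dict(attrs).get("name"),
                "table": self.table_stack[-1] if self.table_stack else None,
                "text": "",
            }

    def handle_data(self, data):
        if self.current is not None:
            self.current["text"] += data

    def handle_endtag(self, tag):
        if tag == "table" and self.table_stack:
            self.table_stack.pop()
        elif tag == "ix:nonfraction" and self.current is not None:
            self.current["text"] = self.current["text"].strip()
            self.facts.append(self.current)
            self.current = None


def locate_facts(html):
    """Return one dict per numeric fact: its name, raw text and table index."""
    parser = TableLocator()
    parser.feed(html)
    parser.close()
    return parser.facts


# Made-up sample: one layout-only table, one table of figures, one loose fact.
SAMPLE = """
<html><body>
<table><tr><td>Layout only, no numbers</td></tr></table>
<table>
  <tr><td>Turnover</td><td><ix:nonFraction name="uk-gaap:Turnover" contextRef="c1" unitRef="GBP">1,200</ix:nonFraction></td></tr>
  <tr><td>Cash</td><td><ix:nonFraction name="uk-gaap:Cash" contextRef="c1" unitRef="GBP">300</ix:nonFraction></td></tr>
</table>
<p>Employees: <ix:nonFraction name="uk-gaap:Employees" contextRef="c1" unitRef="pure">12</ix:nonFraction></p>
</body></html>
"""

for fact in locate_facts(SAMPLE):
    print(fact["name"], fact["text"], fact["table"])
```

One possible heuristic for the "real" table question is to treat only tables that contain at least one numeric fact as candidates. In the sample, the first table would be ignored as layout and the last fact would have no table at all. Real filings nest layout tables heavily, so the innermost table may not always be the meaningful one.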

Do you have any example files along with what you'd like the output to look like? A first step might be to set up some tests to try and reproduce the expected result.
