Additional numeric metadata #37

ribenamaplesyrup · 2023-07-12T10:05:57Z

Thank you for sharing this excellent library @drkane! I'm currently using it to extract numeric data from Companies House reports, and I'm assessing how feasible it would be to determine which table in a document each numeric object belongs to. Looking at the XBRL format, it seems possible to tell which page a numeric object belongs to, but tables could be trickier, as the HTML table element seems to be used for non-numeric content as well. I'm wondering if this is a feature you have considered?

drkane · 2023-07-12T21:25:05Z

Hi - glad it's useful.

(Deleted my initial response as I got the wrong end of the stick). I think this is probably possible to do, although the structure of the XBRL files doesn't always make it easy. I think HTML tables are used a lot to help with file layout, so there might be some confusion over which are "real" tables.

A good starting point might be the functionality that @avyfain added to get back the HTML tag for a given element. This should allow you to find where in the document the element is situated.
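
As a rough illustration of the idea (not the library's own API), here is a minimal sketch that uses only the standard library to walk up from each numeric tag to its nearest enclosing table. It assumes the file is well-formed XHTML using the Inline XBRL 1.1 namespace. The sample document, the `ex:` fact names and the function `tables_for_numeric_facts` are all made up for the example.

```python
import xml.etree.ElementTree as ET

# Assumed namespaces: Inline XBRL 1.1 and XHTML. Older filings may use
# a different iXBRL namespace, so treat these as placeholders.
IX = "http://www.xbrl.org/2013/inlineXBRL"
XHTML = "http://www.w3.org/1999/xhtml"

# Hypothetical sample: one layout-only table, one table with numeric
# facts, and one numeric fact that sits outside any table.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:ix="http://www.xbrl.org/2013/inlineXBRL">
<body>
  <table><tr><td>Registered number 01234567</td></tr></table>
  <table>
    <tr><td>Fixed assets</td>
        <td><ix:nonFraction name="ex:FixedAssets" contextRef="c1"
            unitRef="GBP" decimals="0">1,000</ix:nonFraction></td></tr>
    <tr><td>Current assets</td>
        <td><ix:nonFraction name="ex:CurrentAssets" contextRef="c1"
            unitRef="GBP" decimals="0">250</ix:nonFraction></td></tr>
  </table>
  <p>Employees: <ix:nonFraction name="ex:Employees" contextRef="c1"
      unitRef="pure" decimals="0">4</ix:nonFraction></p>
</body>
</html>"""


def tables_for_numeric_facts(xhtml):
    """Return (fact name, table index or None) for each ix:nonFraction tag.

    The table index is the position of the nearest enclosing <table>
    among all tables in document order.
    """
    root = ET.fromstring(xhtml)
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    table_tag = f"{{{XHTML}}}table"
    tables = list(root.iter(table_tag))

    results = []
    for fact in root.iter(f"{{{IX}}}nonFraction"):
        node = fact
        # Walk upwards until we reach a table or run out of ancestors.
        while node is not None and node.tag != table_tag:
            node = parent_of.get(node)
        index = tables.index(node) if node is not None else None
        results.append((fact.get("name"), index))
    return results


for name, table_index in tables_for_numeric_facts(SAMPLE):
    print(name, table_index)
```

A table that never shows up in the result (index 0 here) contains no numeric facts, which might be one crude way of separating layout tables from "real" ones. Real filings are not always well-formed XML, so an HTML parser may be needed in practice.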

Do you have any example files along with what you'd like the output to look like? A first step might be to set up some tests to try and reproduce the expected result.
