How to report multiple documents on extract #391

mih · 2023-07-20T10:34:29Z

I am implementing a dataset-level metadata extractor, and I think I need to be able to report multiple, individual metadata records. In principle, one might be able to build these records so that they can be reported in a nested fashion (thereby reporting just a single object). However, in my case I have no control over the nature of these documents, and they might be linked (or not) in different ways.

What is a desirable approach here?

an arbitrary top-level key that maps onto an array?
a JSON-LD style @graph top-level key (as a realization of the above)?
something else?

Related: We might be talking about a lot of stuff to return. If I see things correctly, I need to load many standalone records into memory and report them via immediate_data as a single dict, so that they can be written out as JSON again. I have yet to understand why meta-extract turns a single return value of type ExtractorResult into a result record, rather than dealing with result records directly. Dealing with result records directly would make the standard machinery for seamlessly switching between return values and generator yields applicable to metadata extractors too.
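To illustrate the point about return values versus generator yields, here is a minimal sketch of the general pattern. It does not reproduce DataLad's or metalad's actual machinery: `extract_one`, `extract_many`, `iter_records`, and the record layout are all hypothetical names chosen for this example.

```python
from typing import Any, Dict, Iterable, Iterator, Union

# Hypothetical record type; real result records carry more fields.
Record = Dict[str, Any]


def extract_one() -> Record:
    """Hypothetical extractor that returns a single record."""
    return {"status": "ok", "metadata": {"name": "doc-0"}}


def extract_many(n: int) -> Iterator[Record]:
    """Hypothetical extractor that yields one record per document."""
    for i in range(n):
        # Records are produced lazily, so they need not all be held
        # in memory at the same time.
        yield {"status": "ok", "metadata": {"name": f"doc-{i}"}}


def iter_records(result: Union[Record, Iterable[Record]]) -> Iterator[Record]:
    """Normalize a single return value or a generator into one stream."""
    if isinstance(result, dict):
        yield result
    else:
        yield from result


# The caller handles both styles the same way.
single = list(iter_records(extract_one()))
many = list(iter_records(extract_many(3)))
```

With a wrapper like this, an extractor could report each document as it is read instead of collecting everything into one dict first.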


christian-monch · 2023-07-20T12:12:37Z

With the realization that the principal approach requires fixing, I would opt for the {"@graph": [<objects>]} approach as a quick fix.
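A minimal sketch of what the `@graph` quick fix could look like, assuming the individual documents are already available as plain dicts. The helper name `wrap_as_graph`, the `ex:` identifiers, and the `isPartOf` link are made up for illustration; only the `@graph` keyword comes from the discussion above.

```python
import json
from typing import Any, Dict, List


def wrap_as_graph(documents: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Wrap independent documents in a single JSON-LD style @graph object."""
    # Copy the list so later changes to the input do not leak into the result.
    return {"@graph": list(documents)}


# Hypothetical documents; links between them (if any) stay as they are.
docs = [
    {"@id": "ex:a", "name": "first"},
    {"@id": "ex:b", "name": "second", "isPartOf": {"@id": "ex:a"}},
]

# One dict that could be reported as a single object.
immediate_data = wrap_as_graph(docs)
serialized = json.dumps(immediate_data, sort_keys=True)
```

The documents are passed through unchanged, so the extractor does not need to know whether or how they reference each other.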

christian-monch · 2023-07-21T09:28:45Z

@mih I didn't think about that yesterday afternoon, but another option would be to return a list, which contains the individual results, in immediate_data.
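A rough sketch of the list variant follows. `ExtractorResultSketch` is a hypothetical stand-in defined here so that the example runs on its own; it is not metalad's `ExtractorResult`, and its fields other than `immediate_data` are assumptions. Whether downstream consumers accept a list in `immediate_data` is not confirmed in this thread.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Union


@dataclass
class ExtractorResultSketch:
    """Hypothetical stand-in for an extractor result; not the real class."""
    extractor_version: str
    extraction_success: bool
    # Assumption: immediate_data may hold either a dict or a list.
    immediate_data: Optional[Union[Dict[str, Any], List[Any]]] = None


def build_result(documents: List[Dict[str, Any]]) -> ExtractorResultSketch:
    """Report all individual documents as one list in immediate_data."""
    return ExtractorResultSketch(
        extractor_version="0.0.1",  # placeholder version string
        extraction_success=True,
        immediate_data=list(documents),
    )


result = build_result([{"name": "first"}, {"name": "second"}])
```

Compared with the `@graph` wrapper, this keeps the documents at the top level of `immediate_data` without introducing an extra key.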
