Simplify base_document column usage with auxiliary instructions in pipeline config
#228
Labels: enhancement (New feature or request)
Currently, we expect users who create auxiliary instructions to also create a base_document column that contains the original document, and to ensure that base_document gets set as a dataset_type. An example from our full pipeline config:

Is there a way to simplify this for authors of pipeline config, so that we automatically handle the base_document dataset without the user ever needing to reference that column in their config? That specific dataset_type string has a special meaning in the code, but how would a user know to include it without reading the code?
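One possible direction, as a rough sketch only: the pipeline could fill in the base_document column itself before the auxiliary instructions are applied, so config authors never have to mention it. Everything below is hypothetical and not taken from the codebase: the helper names add_base_document and ensure_base_dataset_type, the "document" source column, and the assumption that dataset types can be treated as a plain list of strings. Only the base_document string itself comes from this issue.

```python
from typing import Any, Dict, List

# The special string discussed in this issue; all other names are assumptions.
BASE_DOCUMENT = "base_document"


def add_base_document(
    rows: List[Dict[str, Any]], document_column: str = "document"
) -> List[Dict[str, Any]]:
    """Return copies of the rows with base_document filled in automatically.

    A value the config author already supplied is left untouched.
    """
    result = []
    for row in rows:
        new_row = dict(row)  # copy so the caller's rows are not mutated
        new_row.setdefault(BASE_DOCUMENT, row[document_column])
        result.append(new_row)
    return result


def ensure_base_dataset_type(dataset_types: List[str]) -> List[str]:
    """Return the dataset types with base_document included exactly once."""
    if BASE_DOCUMENT in dataset_types:
        return list(dataset_types)
    return [BASE_DOCUMENT, *dataset_types]
```

With helpers along these lines, the special string would live in one place in the code rather than in every pipeline config, while an explicit base_document entry in an existing config would keep working.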
This issue is created to track a comment in another PR at #204 (comment) so we don't lose sight of improving this.