The code dealing with differences in input structure and naming for different pipelines is getting a lot more complicated than seems necessary. #55 dealt with some short-term fixes. This issue tracks cleanup work to make things more consistent and, hopefully, reduce the amount of data transformation required.
None of this matches the source format for the dataset (taxonomy). Allowing people to specify a custom pipeline implies specifying their expected sample dataset format somehow.
Another idea instead ...
Always assume a consistent dataset format.
Add a new pipeline capability for dataset transformation -- for example, rename fields if you want, or squash the rows into groups of 3 seed questions/answers per row (for the knowledge case).
I think something like this is going to be necessary to allow more configurable custom pipelines, as we'll need a way for a custom pipeline to declare the dataset format it is expecting from a known starting point.
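To make the idea concrete, here is a rough sketch of what such a transformation step could look like, assuming the consistent dataset format is a list of flat dicts with one seed question/answer per row. The function names (`rename_fields`, `squash_rows`), the field names (`question`, `answer`, `seed_question`), and the numbered output columns are all hypothetical, not part of any existing pipeline API.

```python
from typing import Dict, List, Sequence

Row = Dict[str, str]


def rename_fields(rows: List[Row], mapping: Dict[str, str]) -> List[Row]:
    """Return new rows with keys renamed per `mapping`; other keys are kept."""
    return [{mapping.get(key, key): value for key, value in row.items()} for row in rows]


def squash_rows(
    rows: List[Row],
    group_size: int = 3,
    fields: Sequence[str] = ("question", "answer"),
) -> List[Row]:
    """Merge every `group_size` consecutive rows into a single row.

    Each field in `fields` becomes numbered columns (question_1, question_2, ...).
    A trailing group with fewer than `group_size` rows is dropped (an assumption;
    a real pipeline might pad or raise instead).
    """
    squashed: List[Row] = []
    for start in range(0, len(rows) - group_size + 1, group_size):
        merged: Row = {}
        for index, row in enumerate(rows[start:start + group_size], start=1):
            for field in fields:
                merged[f"{field}_{index}"] = row[field]
        squashed.append(merged)
    return squashed


# Hypothetical usage: 7 seed examples become 2 rows of 3 question/answer pairs each.
seed_rows = [{"question": f"q{i}", "answer": f"a{i}"} for i in range(1, 8)]
knowledge_rows = squash_rows(seed_rows)
renamed_rows = rename_fields(seed_rows, {"question": "seed_question"})
```

With steps like these, a custom pipeline could declare its expected input as a list of transformations applied to the known starting format, rather than requiring callers to pre-shape the data.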
This issue has been automatically marked as stale because it has not had activity within 90 days. It will be automatically closed if no further activity occurs within 30 days.
Added new approved Knowledge submission data sources and updated the status of several existing ones.
Updated documentation with the new process for requesting knowledge sources: open a PR against this dev doc.
Related to issue instructlab#59 which should be closed once this PR is reviewed and merged.
Co-Authored-by: JJ Asghar <[email protected]>
Signed-off-by: Leslie Hawthorn <[email protected]>
From #55