Import functions from the ilab repository #11
Comments
related to #6
@aakankshaduggal @russellb is this something we want to try and preserve commits for? if not I'm happy to take this (and the other issue I guess, seems one PR can resolve both)
In general, yes, I would always prefer keeping history. I can also walk you through how I would do it. I haven't looked at these specifics. If the code in question is only used here, then moving it here seems like an easy decision. If it's used in other places, too, we should discuss further.
ack @russellb - fine either way - AFAIK the code in question is only used in the CLI currently
Question: is it only called from code other than what got moved here?
Okay, looked into this a bit - we import from two locations, config and utils.
From config, we import the following:
From utils, we import the following:
|
You can write rules for Ruff or PyLint to detect these types of imports and raise an error. InstructLab uses Ruff for that:
|
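As a rough illustration of what such a lint rule checks, here is a minimal sketch using Python's standard `ast` module. It is not Ruff's implementation or InstructLab's actual configuration; the function name `find_banned_imports` and the `BANNED_PREFIXES` tuple are assumptions made up for this example.

```python
import ast

# Hypothetical list: modules this package should no longer import from.
BANNED_PREFIXES = ("instructlab.utils", "instructlab.config")


def find_banned_imports(source, banned=BANNED_PREFIXES):
    """Return (line_number, module_name) pairs for imports of banned modules."""

    def is_banned(name):
        # Match the module itself or any of its submodules.
        return any(name == p or name.startswith(p + ".") for p in banned)

    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # e.g. "import instructlab.utils"
            for alias in node.names:
                if is_banned(alias.name):
                    hits.append((node.lineno, alias.name))
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            if is_banned(node.module):
                # e.g. "from instructlab.utils import read_taxonomy"
                hits.append((node.lineno, node.module))
            else:
                # e.g. "from instructlab import config"
                for alias in node.names:
                    full_name = f"{node.module}.{alias.name}"
                    if is_banned(full_name):
                        hits.append((node.lineno, full_name))
    return hits
```

Running it over a file's source text returns the offending line numbers, so a CI step could fail whenever the list is non-empty. A real linter rule handles more cases (relative imports, dynamic imports) than this sketch does.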
This code was only used by instructlab.sdg, so move it over here instead of leaving it back in the `instructlab` repo. Part of issue instructlab#11 Signed-off-by: Russell Bryant <[email protected]>
Part of instructlab#11 sdg appears to be the main user of this, along with `ilab taxonomy diff`. We want to adapt the output of read_taxonomy() to be better suited to what sdg needs. This is the majority of src/instructlab/utils.py from commit 4737feb with read_taxonomy() and TaxonomyReadingException as the public API. Signed-off-by: Mark McLoughlin <[email protected]>
Part of instructlab#11 sdg appears to be the main user of this, along with `ilab taxonomy diff`. We want to adapt the output of read_taxonomy() to be better suited to what sdg needs. This is the majority of src/instructlab/utils.py from commit 4737feb with read_taxonomy() and TaxonomyReadingException as the public API. Temporarily disable logging-fstring-interpolation to get lint passing. Signed-off-by: Mark McLoughlin <[email protected]>
It looks like |
This was the last import from the main `instructlab` package to remove. All it did was return this string constant, so just copy it over. Closes instructlab#11 Signed-off-by: Russell Bryant <[email protected]>
The change to remove the last import from the main |
This was the last import from the main `instructlab` package to remove. All it did was return this string constant, so just copy it over. Closes #11 Signed-off-by: Russell Bryant <[email protected]>
Some functions that are being called in the `generate_data.py` file are in this file - https://github.com/instructlab/instructlab/blob/main/src/instructlab/utils.py

to-do list:

From config, we import the following:
- `DEFAULT_MULTIPROCESSING_START_METHOD`: `lab.py`, but also this is just a variable
- `DEFAULT_API_KEY`
- `DEFAULT_MODEL_OLD`: `lab.py`, but also this is just a variable
- `get_model_family()`: `generate_data.py`, but also `server.py`

From utils, we import the following:
- `chunk_document()`: `generate_data.py`
- `max_seed_example_tokens()`: `generate_data.py`
- `num_chars_from_tokens()`: `generate_data.py`
- `read_taxonomy()`: `generate_data.py`, but also `lab.py`
- `get_sysprompt()`: `generate_data.py`, but also `lab.py`, `chat.py` and `make_data.py`