updates to grounded flow #53

oindrillac · 2024-06-28T21:56:42Z

Stress testing was done on the SDG pipeline

added grounded test script that can be used to test
updates to templates to get full pipeline working

scripts/test_grounded_skills.py

npalaska · 2024-07-01T05:11:41Z

src/instructlab/sdg/default_flows.py

+                    "filter_value": "2.0",
                    "operation": operator.eq,
+                    "convert_dtype": float,


This change is somehow breaking the knowledge generation. The filter_relevancy block is returning a dataset with 0 rows (probably because it filters out all the responses generated by the previous block). Is this expected? Same goes for filter_verify_question block below.

Ah this could be because the file src/instructlab/sdg/configs/knowledge/simple_generate_qa.yaml still contains

{question_1} {response_1} {question_2} {response_2} {question_3} {response_3}

instead of

{icl_query_1} {icl_response_1} {icl_query_2} {icl_response_2} {icl_query_3} {icl_response_3}

#57 addresses this

markmc · 2024-07-01T10:39:57Z

The commit log could be much improved. The logical changes I see, with the explanations I'd expect:

Convert filter values to a float - does this fix a bug? details of how the bug manifested would help others learn
Combine question and context in the grounded skills flow at the end - why? what's the before and after behavior
template updates to get full pipeline working - these look like important, but subtle changes, what details can you provide to help a person in future understand what to look out for if they are making further changes?

oindrillac · 2024-07-01T15:02:47Z

here are some explanations for the changes:

Convert filter values to a float - does this fix a bug? details of how the bug manifested would help others learn

This is addressing an issue that we observed. In some of the cases where the model is being asked to provide a total score, it provides floating point sums like 1.5/2.5 and to accommodate those cases where we want to filter those values out it made sense to keep it float.

Combine question and context in the grounded skills flow at the end - why? what's the before and after behavior

That is a step towards converting it to the final format (messages) required for training.

template updates to get full pipeline working - these look like important, but subtle changes, what details can you provide to help a person in future understand what to look out for if they are making further changes?

The template changes stem from model behavior on the templates and stress testing across various leaf nodes. The changes were mainly made to make sure the responses from the model align best with the expected behavior

markmc · 2024-07-01T16:14:26Z

here are some explanations for the changes:

It would be great to get those explanations into the commit messages. See also "Information in commit messages" in https://www.berrange.com/tags/commit-message/

The commit message must contain all the information required to fully understand & review the patch for correctness. Less is not more. More is more.

shivchander · 2024-07-01T16:24:38Z

here are some explanations for the changes:

It would be great to get those explanations into the commit messages. See also "Information in commit messages" in https://www.berrange.com/tags/commit-message/

The commit message must contain all the information required to fully understand & review the patch for correctness. Less is not more. More is more.

Thanks for the recommendation! We could start doing this here on, this PR needs to go in by today. We are on a very tight deadline

shivchander

a few minor changes, otherwise lgtm

src/instructlab/sdg/default_flows.py

oindrillac · 2024-07-01T17:47:14Z

a few minor changes, otherwise lgtm

done, thank you!

aakankshaduggal

LGTM, thanks @oindrillac

shivchander

LGTM, thanks

npalaska

LGTM

russellb

the last two commits just say "updated to string" but neither are updating something to be a string.

either way it looks like something that should be squashed in to your main commit

mergify · 2024-07-02T14:52:43Z

This pull request has merge conflicts that must be resolved before it can be
merged. @oindrillac please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

oindrillac · 2024-07-02T15:00:41Z

@russellb squashed

russellb · 2024-07-02T16:17:30Z

@russellb squashed

Thanks. Can you also add your explanations of the changes that you gave here to the commit messages as suggested?

oindrillac · 2024-07-02T18:39:25Z

@russellb @markmc updated the commit messages as per your suggestion, ptal thanks

russellb

I see that e2e passed, though the commit messages still seem the same? I'm happy for it to merge once that content is there. Let me know if any help is needed with that. You don't need to wait for e2e to pass again if you only change the commit message. Just merge away.

This is a step towards converting it to the final format (messages) required for training. Signed-off-by: Oindrilla Chatterjee <[email protected]> Co-authored-by: Aakanksha Duggal <[email protected]> Co-authored-by: Shiv <[email protected]>

Changed the prompt templates and alignment with expected outputs. Conducted stress testing across various leaf nodes to ensure accuracy and relevance. Signed-off-by: Oindrilla Chatterjee <[email protected]> Co-authored-by: Aakanksha Duggal <[email protected]> Co-authored-by: Shiv <[email protected]>

russellb · 2024-07-02T20:38:20Z

lgtm!

oindrillac · 2024-07-02T20:40:37Z

Oh for some reason it didn't get synced. Thanks for adding descriptive notes @aakankshaduggal

npalaska reviewed Jun 28, 2024

View reviewed changes

scripts/test_grounded_skills.py Outdated Show resolved Hide resolved

npalaska mentioned this pull request Jul 1, 2024

Replace Iterblock with LLMBlock #58

Closed

npalaska reviewed Jul 1, 2024

View reviewed changes

oindrillac force-pushed the grounded branch from 9a4b684 to 7b00084 Compare July 1, 2024 15:10

shivchander reviewed Jul 1, 2024

View reviewed changes

src/instructlab/sdg/default_flows.py Outdated Show resolved Hide resolved

src/instructlab/sdg/default_flows.py Outdated Show resolved Hide resolved

aakankshaduggal approved these changes Jul 1, 2024

View reviewed changes

aakankshaduggal requested a review from npalaska July 1, 2024 18:00

shivchander approved these changes Jul 1, 2024

View reviewed changes

aakankshaduggal requested a review from russellb July 1, 2024 19:29

npalaska reviewed Jul 1, 2024

View reviewed changes

russellb requested changes Jul 1, 2024

View reviewed changes

oindrillac force-pushed the grounded branch from dfc4710 to ea80afd Compare July 2, 2024 14:51

mergify bot added CI/CD Affects CI/CD configuration testing Relates to testing needs-rebase labels Jul 2, 2024

oindrillac force-pushed the grounded branch from ea80afd to df5c29c Compare July 2, 2024 14:54

mergify bot removed the needs-rebase label Jul 2, 2024

oindrillac force-pushed the grounded branch 3 times, most recently from 96c8f4f to 3c302b0 Compare July 2, 2024 18:37

oindrillac requested a review from russellb July 2, 2024 18:37

russellb approved these changes Jul 2, 2024

View reviewed changes

aakankshaduggal force-pushed the grounded branch from 3c302b0 to 464a393 Compare July 2, 2024 20:31

oindrillac and others added 2 commits July 2, 2024 16:34

aakankshaduggal force-pushed the grounded branch from 464a393 to 1f89b1a Compare July 2, 2024 20:36

russellb merged commit 2788384 into instructlab:main Jul 2, 2024
10 checks passed

russellb added this to the 0.1.0 milestone Jul 8, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

updates to grounded flow #53

updates to grounded flow #53

oindrillac commented Jun 28, 2024

npalaska Jul 1, 2024

npalaska Jul 1, 2024

russellb Jul 1, 2024

markmc commented Jul 1, 2024

oindrillac commented Jul 1, 2024

markmc commented Jul 1, 2024

shivchander commented Jul 1, 2024

shivchander left a comment

oindrillac commented Jul 1, 2024

aakankshaduggal left a comment

shivchander left a comment

npalaska left a comment

russellb left a comment

mergify bot commented Jul 2, 2024

oindrillac commented Jul 2, 2024

russellb commented Jul 2, 2024

oindrillac commented Jul 2, 2024 •

edited

Loading

russellb left a comment

russellb commented Jul 2, 2024

oindrillac commented Jul 2, 2024

updates to grounded flow #53

updates to grounded flow #53

Conversation

oindrillac commented Jun 28, 2024

npalaska Jul 1, 2024

Choose a reason for hiding this comment

npalaska Jul 1, 2024

Choose a reason for hiding this comment

russellb Jul 1, 2024

Choose a reason for hiding this comment

markmc commented Jul 1, 2024

oindrillac commented Jul 1, 2024

markmc commented Jul 1, 2024

shivchander commented Jul 1, 2024

shivchander left a comment

Choose a reason for hiding this comment

oindrillac commented Jul 1, 2024

aakankshaduggal left a comment

Choose a reason for hiding this comment

shivchander left a comment

Choose a reason for hiding this comment

npalaska left a comment

Choose a reason for hiding this comment

russellb left a comment

Choose a reason for hiding this comment

mergify bot commented Jul 2, 2024

oindrillac commented Jul 2, 2024

russellb commented Jul 2, 2024

oindrillac commented Jul 2, 2024 • edited Loading

russellb left a comment

Choose a reason for hiding this comment

russellb commented Jul 2, 2024

oindrillac commented Jul 2, 2024

oindrillac commented Jul 2, 2024 •

edited

Loading