Set gen_kwargs['n'] dynamically in the simple pipelines #144

Merged 1 commit on Jul 16, 2024


russellb
Member

We need a way to allow `--num-instructions` to influence how many
instructions we generate using the simple pipelines. The way to do
this seems to be to set `n` to this value. Since this is a runtime
parameter, and we only want to set it for `n` in certain cases, add a
new value for `gen_kwargs['n']` called `dynamic` which is a hint to use
the runtime parameter here.

Closes #130

Signed-off-by: Russell Bryant [email protected]

@russellb
Member Author

in draft form so I can let CI run.

I'm also not in love with this implementation, but I don't have any better ideas ...

@markmc
Contributor

markmc commented Jul 16, 2024

I think we're saying "the pipeline author needs to tell us they want --sdg-scale-factor to be applied to this block" (as per instructlab/instructlab#1570)

Would n: scaled be more intuitive?

@russellb
Member Author

> I think we're saying "the pipeline author needs to tell us they want --sdg-scale-factor to be applied to this block" (as per instructlab/instructlab#1570)
>
> Would n: scaled be more intuitive?

Thanks, I like that name better.

We need a way to allow `--num-instructions`, or in the future
`--sdg-scale-factor`, to influence how many instructions we generate
using the simple pipelines. The way to do this seems to be to set `n`
to this value. Since this is a runtime parameter, and we only want to
set it for `n` in certain cases, add a new value for gen_kwargs['n']
called `scaled` which is a hint to use the runtime parameter here.

Closes instructlab#130

Signed-off-by: Russell Bryant <[email protected]>
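
The substitution described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the function name `resolve_gen_kwargs` and the `SCALED` constant are hypothetical, and `num_instructions_to_generate` stands in for whatever runtime value `--num-instructions` supplies.

```python
from typing import Any, Dict

# Sentinel value a pipeline author puts in gen_kwargs['n'] (name taken from the PR).
SCALED = "scaled"


def resolve_gen_kwargs(
    gen_kwargs: Dict[str, Any], num_instructions_to_generate: int
) -> Dict[str, Any]:
    """Hypothetical helper: return a copy of gen_kwargs with n == "scaled"
    replaced by the runtime value. Any other n is left untouched."""
    resolved = dict(gen_kwargs)  # copy so the pipeline config is not mutated
    if resolved.get("n") == SCALED:
        resolved["n"] = num_instructions_to_generate
    return resolved
```

Under these assumptions, a block configured with `n: scaled` picks up the runtime value, while a block with a fixed integer `n`, or no `n` at all, is unaffected.
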
@russellb
Member Author

> > I think we're saying "the pipeline author needs to tell us they want --sdg-scale-factor to be applied to this block" (as per instructlab/instructlab#1570)
> > Would n: scaled be more intuitive?
>
> Thanks, I like that name better.

I pushed an update to rename the value to `scaled`, as suggested.

     results = []
     for prompt in prompts:
    -    for _ in range(n):
    +    for _ in range(generate_args["n"]):
Contributor

Why is this not `_get_n()`?

Member Author

@russellb russellb Jul 16, 2024

That would work and make the code clearer, but in this particular code path, the value was already converted several lines up, when it did

    generate_args = self._gen_kwargs(**gen_kwargs)

Otherwise, when it calls `client.completions.create()` just after this, it would break on an invalid value of `n`.
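
To illustrate the point about ordering, here is a rough sketch of the unbatched loop run against a stub. `StubCompletions` and `generate_unbatched` are hypothetical names, and the stub is only a stand-in that assumes the backend rejects a non-integer `n`; it does not reproduce the real client's behavior.

```python
from typing import Any, Dict, List


class StubCompletions:
    """Hypothetical stand-in for client.completions; rejects a non-integer n."""

    def create(self, prompt: str, **kwargs: Any) -> str:
        n = kwargs.get("n", 1)
        if not isinstance(n, int):
            raise TypeError(f"n must be an int, got {n!r}")
        return f"completion for {prompt!r}"


def generate_unbatched(prompts: List[str], generate_args: Dict[str, Any]) -> List[str]:
    """Issue one request per requested completion, using already-resolved args."""
    completions = StubCompletions()
    results = []
    for prompt in prompts:
        # generate_args["n"] must already be an integer at this point.
        for _ in range(generate_args["n"]):
            results.append(completions.create(prompt=prompt, **generate_args))
    return results
```

With resolved arguments such as `{"n": 3}`, two prompts yield six results. With an unresolved `{"n": "scaled"}`, `range("scaled")` raises `TypeError` before any request is made, which is why the conversion has to happen first.
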

Member Author

pro: always use self._get_n() so nobody is trying to think of when to use it or not

con: technically not required here

shrug

Contributor

Thanks, I might tweak a bit when I rebase #138 which changes this anyway

@markmc markmc added this to the 0.1.3 milestone Jul 16, 2024
@russellb russellb marked this pull request as ready for review July 16, 2024 13:44
@russellb
Member Author

I checked the e2e CI output and it looks correct

@markmc markmc merged commit d8e6681 into instructlab:main Jul 16, 2024
11 checks passed
Successfully merging this pull request may close these issues.

Fix applying num_instructions_to_generate to simple pipelines