Experiment run command #49

Merged
merged 4 commits into from
May 9, 2024

Conversation

Collaborator

@rchan26 rchan26 commented May 8, 2024

Fix #10. The filename must be the full path to the file; a bare filename only works if the file is in the current directory.

Example usage:

  • if test.jsonl is in the current directory, it'll get moved to the input folder of the data folder, which is data/input by default:
prompto_run_experiment -f test.jsonl
  • if test.jsonl is not in the current directory, you must provide the full path to it, and it'll get moved to the input folder of the data folder if it isn't already there. for example, if it is already in the input folder:
prompto_run_experiment -f data/input/test.jsonl
  • if the filename passed is not a JSONL file, it will error:
prompto_run_experiment -f test.txt
  • the file can be anywhere; it will just get moved to the input folder for us to run. you can specify the data folder using -d as usual, and there are flags for max queries per minute (-m), max retry attempts (-a), and whether or not to run the experiment in "parallel" (-p) by querying different APIs in parallel, just like the standard prompto_run_pipeline command:
prompto_run_experiment --file some_folder/some_sub_folder/some_exp.jsonl --data pipeline_data --max-queries 20 --max-attempts 3 --parallel
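
For reviewers, here is a minimal sketch of the behaviour described in the list above. It is not the code in this PR: `build_parser` and `prepare_experiment_file` are hypothetical names, and the defaults for `--max-queries` and `--max-attempts` are placeholders. Only the flag names, the JSONL check, and the move into the input folder (data/input by default) come from the description.

```python
import argparse
import shutil
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags listed in the description."""
    parser = argparse.ArgumentParser(prog="prompto_run_experiment")
    parser.add_argument("--file", "-f", required=True, help="path to the experiment JSONL file")
    parser.add_argument("--data", "-d", default="data", help="data folder containing input/")
    # The two numeric defaults below are placeholders, not prompto's real defaults.
    parser.add_argument("--max-queries", "-m", type=int, default=10)
    parser.add_argument("--max-attempts", "-a", type=int, default=5)
    parser.add_argument("--parallel", "-p", action="store_true")
    return parser


def prepare_experiment_file(file_path: str, data_folder: str = "data") -> Path:
    """Validate the experiment file and make sure it sits in <data_folder>/input."""
    source = Path(file_path)
    if source.suffix != ".jsonl":
        raise ValueError(f"Experiment file must be a JSONL file: {file_path}")
    if not source.is_file():
        raise FileNotFoundError(f"Experiment file not found: {file_path}")

    input_folder = Path(data_folder) / "input"
    input_folder.mkdir(parents=True, exist_ok=True)
    destination = input_folder / source.name

    # Only move the file if it is not already in the input folder.
    if source.resolve() != destination.resolve():
        shutil.move(str(source), str(destination))
    return destination
```

With this sketch, passing a path that already points into the input folder is a no-op, and a non-JSONL filename fails before any file is touched.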

Note: the below is not an option anymore but keeping here as a record:

The filename must be either:

  • a path to the file, or
  • the name of a file already in the input folder

Example usage

  • if test.jsonl is not in the current directory, it has to be in the data/input folder, i.e. the full path is data/input/test.jsonl but you just pass in the filename:
prompto_run_experiment -f test.jsonl

In this setting where the file is in the input folder, this is equivalent to just passing the full path, i.e.

prompto_run_experiment -f data/input/test.jsonl

In the setting where test.jsonl is in neither the current directory nor the input folder, there will be an error as the file cannot be located.

@rchan26 rchan26 marked this pull request as draft May 8, 2024 16:06
@rchan26 rchan26 requested a review from fedenanni May 9, 2024 08:40
@rchan26 rchan26 marked this pull request as ready for review May 9, 2024 08:40
@fedenanni
Collaborator

@rchan26 all looks good, but do we really need the option where you pass just the name of the file and, if the file is not in the current directory, it checks whether it's in input? Can't we simplify it by only allowing the correct path to where the file is?

@fedenanni
Collaborator

I mean this:
Screenshot 2024-05-09 at 13 02 59

@rchan26
Collaborator Author

rchan26 commented May 9, 2024

yeah, I was thinking and debating about this. it is a bit messier, but I was thinking about the case where your data folder isn't data which is the case with the project. in that repo, there's already a data folder including the evals, so I make a new folder pipeline_data. in this case, if we only allowed for a valid path, we'd have to specify that different folder twice:

prompto_run_experiment -f pipeline_data/input/test.jsonl -d pipeline_data

in this setup, you could do

prompto_run_experiment -f test.jsonl -d pipeline_data

to run pipeline_data/input/test.jsonl.

But I think I agree with you, it's a bit confusing, especially if for some reason you have test.jsonl in the current directory and in the input folder (it would actually run the thing in the input folder). I'll remove this option and ask for another review
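
To make the ambiguity concrete, here is a rough sketch of the lookup being removed. `resolve_with_input_fallback` is a hypothetical name and this is not the code from the PR; it assumes the input folder is checked first, which is what "it would actually run the thing in the input folder" implies.

```python
from pathlib import Path


def resolve_with_input_fallback(filename: str, data_folder: str = "data") -> Path:
    """Hypothetical version of the removed lookup: input folder first, then the path as given."""
    in_input = Path(data_folder) / "input" / filename
    as_given = Path(filename)

    # Assumed precedence: a file of the same name in the input folder wins.
    if in_input.is_file():
        return in_input
    if as_given.is_file():
        return as_given
    raise FileNotFoundError(
        f"{filename} was not found at the given path or in {Path(data_folder) / 'input'}"
    )
```

If test.jsonl exists both in the current directory and in pipeline_data/input, calling this with `-d pipeline_data` semantics silently picks the input-folder copy, which is the confusing case that motivates only accepting a valid path.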

@fedenanni fedenanni merged commit 0535166 into main May 9, 2024
3 checks passed
@rchan26 rchan26 deleted the experiment-run-command branch May 10, 2024 08:41
rchan26 pushed a commit that referenced this pull request May 20, 2024
rchan26 pushed a commit that referenced this pull request May 20, 2024
Development

Successfully merging this pull request may close these issues.

CLI for running a single experiment