Merge pull request #52 from stoerr/feature/command-files
Since the Java startup time is noticeable, calling aigenpipeline on more than 100 files can become unpleasant. So, if you give exactly one argument that is not an option, it is taken as a file containing the arguments for several invocations of the tool. Empty lines separate invocations. Lines starting with # are comments.
stoerr authored Sep 17, 2024
2 parents 057c9fa + 2ed5cce commit b75d689
Showing 6 changed files with 111 additions and 4 deletions.
@@ -88,7 +88,48 @@ public class AIGenPipeline {
protected List<AIInOut> hintFiles = new ArrayList<>();

public static void main(String[] args) throws IOException {
- new AIGenPipeline().run(args);
+ if (args.length == 1 && !args[0].startsWith("-")) {
+ processCommandFile(args[0]);
+ } else {
+ new AIGenPipeline().run(args);
+ }
}

/**
* Read command lines from the given file. Empty lines separate individual command lines.
* Lines starting with a # are ignored (comments).
* This saves the startup time when calling the tool multiple times. Incompatible with all other options.
*/
protected static void processCommandFile(String cmdfilepath) throws IOException {
File cmdfile = new File(cmdfilepath);
if (!cmdfile.exists() || !cmdfile.isFile() || !cmdfile.canRead()) {
ERR.println("Cannot read command file " + cmdfile.getAbsolutePath());
System.exit(1);
}
try (Scanner scanner = new Scanner(cmdfile, StandardCharsets.UTF_8)) {
// TODO: perhaps handle quoted strings, though arguments that need quoting are unlikely to occur.
StringBuilder cmd = new StringBuilder();
while (scanner.hasNextLine()) {
String line = scanner.nextLine().trim();
if (line.startsWith("#")) continue; // comment lines do not end a command line
if (line.isEmpty()) {
// An empty line ends the current command line; runs of empty lines are skipped.
if (cmd.length() > 0) {
runWithCommandLine(cmd.toString());
cmd.setLength(0);
}
} else {
cmd.append(" ").append(line);
}
}
if (cmd.length() > 0) {
runWithCommandLine(cmd.toString());
}
}
}

protected static void runWithCommandLine(String cmdline) throws IOException {
ERR.println("Processing command line: ");
ERR.println(cmdline.trim());
new AIGenPipeline().run(cmdline.trim().split("\\s+"));
ERR.println();
}

protected void run(String[] args) throws IOException {
@@ -221,6 +262,7 @@ protected void executeTask() {

/**
* Scans for files in {@link #outputScan} and processes them.
*
* @param args the command line arguments
*/
protected void runWithOutputScan(String[] args) {
@@ -358,6 +400,7 @@ protected void parseArguments(String[] args, File dir) throws IOException {
switch (args[i]) {
case "-h":
case "--help":
case "-?":
help = true;
break;
case "-ha":
@@ -515,6 +558,7 @@ protected void parseArguments(String[] args, File dir) throws IOException {
/**
* This reads the collected texts of the website from /helpaitexts.md and gives them to the AI, and then has it
* answer the #helpAIquestion from that.
*
* @throws IOException if the help texts could not be read
*/
protected void answerHelpAIQuestion() throws IOException {
@@ -1,5 +1,13 @@
Usage:
aigenpipeline [options] [<input_files>...]
or
aigenpipeline <command_file>

The AIGenPipeline tool generates content using an AI based on a prompt and input files.
It can also update or improve existing content, and it only calls the AI if the input or prompt files have changed.

If it's called with a command file, it reads a number of command lines from that file. Empty lines separate individual command lines.
Lines starting with a # are ignored (comments). This saves the startup time when calling the tool multiple times.

Options:

@@ -170,7 +170,7 @@ public String execute() {
if (organizationId != null) {
builder.header("OpenAI-Organization", organizationId);
}
- builder.timeout(Duration.ofSeconds(120));
+ builder.timeout(Duration.ofSeconds(300));
HttpRequest request = builder.build();
try {
HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
19 changes: 17 additions & 2 deletions bin/aigenpipeline
@@ -3,9 +3,24 @@
scriptdir="$(dirname $(readlink -f $0))"

# find the jar file. Try first according to project layout, then in scriptdir itself.
- jarFile="$(ls -1tr $scriptdir/../aigenpipeline-commandline/target/aigenpipeline-commandline*.jar | egrep -v 'sources|javadoc' | tail -n 1)"
+ jarFile="$(ls -1tr $scriptdir/../aigenpipeline-commandline/target/aigenpipeline-commandline*.jar | egrep -v 'sources|javadoc')"

# abort if there are several jar files
if [ $(echo "$jarFile" | wc -l) -gt 1 ]; then
echo "Cannot execute: multiple jar files found in $scriptdir/../aigenpipeline-commandline/target" >&2
echo "$jarFile" >&2
exit 1
fi

if [ -z "$jarFile" ]; then
- jarFile="$(ls -1tr $scriptdir/aigenpipeline-commandline*.jar | egrep -v 'sources|javadoc' | tail -n 1)"
+ jarFile="$(ls -1tr $scriptdir/aigenpipeline-commandline*.jar | egrep -v 'sources|javadoc')"
fi

# abort if there are several jar files
if [ $(echo "$jarFile" | wc -l) -gt 1 ]; then
echo "Cannot execute: multiple jar files found in $scriptdir" >&2
echo "$jarFile" >&2
exit 1
fi

if [ -z "$jarFile" ]; then
10 changes: 10 additions & 0 deletions examples/differentialReTranslation/generate-new.sh
@@ -0,0 +1,10 @@
#!/usr/bin/env ../../bin/aigenpipeline
-m gpt-4o-mini -p 0dialogelements.prompt README.md -o dialogelements.txt

-m gpt-4o-mini -p 1html.prompt README.md dialogelements.txt -o differentialReTranslation.html

-m gpt-4o-mini -p 2css.prompt README.md differentialReTranslation.html -o differentialReTranslation.css

-m gpt-4o -p 3js.prompt README.md dialogelements.txt requests.jsonl -o differentialReTranslation.js

-m gpt-4o-mini -p 4examplejs.prompt README.md dialogelements.txt examples.txt -o differentialReTranslationExamples.js
30 changes: 30 additions & 0 deletions src/site/markdown/index.md
@@ -163,6 +163,28 @@ command line arguments. Thus, the later override the earlier one. Explicitly giv
processed at the point where the argument occurs when processing the command line arguments. The option `-cp` /
`--configprint` gives an overview of the used files / sources of configuration.

## Command files

While the startup time of aigenpipeline is low compared to the actual LLM calls, it can still hurt if there
are many files to process and most AI calls can be skipped because neither the inputs nor the prompts have changed.
Thus, if you give just one file as argument, it is read as a command file containing a number of command lines
that are executed in sequence. Empty lines separate individual command lines,
and lines starting with a # are ignored (comments). For example:

```
#!/usr/bin/env ../../bin/aigenpipeline
-m gpt-4o-mini -p 0dialogelements.prompt README.md -o dialogelements.txt

# command lines are separated by empty lines, and comment lines starting with # are ignored
-m gpt-4o-mini -p 1html.prompt README.md dialogelements.txt -o differentialReTranslation.html

# Each command line can be split over several lines if convenient.
-m gpt-4o-mini -p 2css.prompt
README.md differentialReTranslation.html
-o differentialReTranslation.css
```
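
To make the grouping rules concrete, here is a minimal Java sketch of how such a file could be split into invocations. It is an illustration under stated assumptions, not the tool's actual code: the class `CommandFileSketch` and the method `parse` are hypothetical names, the file names in `main` are made up, and arguments are split on whitespace only, so quoted arguments containing spaces are not handled.

```java
import java.util.ArrayList;
import java.util.List;

public class CommandFileSketch {

    /** Hypothetical helper: groups the lines of a command file into one argument array per invocation. */
    public static List<String[]> parse(String content) {
        List<String[]> invocations = new ArrayList<>();
        StringBuilder cmd = new StringBuilder();
        for (String rawLine : content.split("\n", -1)) {
            String line = rawLine.trim();
            if (line.startsWith("#")) {
                continue; // comment lines are skipped and do not end a command line
            }
            if (line.isEmpty()) {
                flush(cmd, invocations); // an empty line ends the current command line
            } else {
                cmd.append(' ').append(line); // continuation lines are joined with a space
            }
        }
        flush(cmd, invocations); // the last command line needs no trailing empty line
        return invocations;
    }

    /** Adds the collected command line (if any) as an argument array and resets the buffer. */
    private static void flush(StringBuilder cmd, List<String[]> invocations) {
        String joined = cmd.toString().trim();
        if (!joined.isEmpty()) {
            invocations.add(joined.split("\\s+"));
        }
        cmd.setLength(0);
    }

    public static void main(String[] args) {
        // Made-up example content: one single-line command and one command spread over three lines.
        String content = String.join("\n",
                "# first invocation",
                "-m gpt-4o-mini -p 1html.prompt README.md -o out.html",
                "",
                "-m gpt-4o-mini -p 2css.prompt",
                "README.md out.html",
                "-o out.css");
        for (String[] invocation : parse(content)) {
            System.out.println(invocation.length + " args: " + String.join(" ", invocation));
        }
    }
}
```

Running `main` prints one line per invocation; the command that is spread over three lines is reported as a single invocation.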

## Other features

If you are not satisfied with the result, the tool can also be used to ask the AI for clarification: ask a question
@@ -199,6 +221,14 @@ You can either:
```
Usage:
aigenpipeline [options] [<input_files>...]
or
aigenpipeline <command_file>

The AIGenPipeline tool generates content using an AI based on a prompt and input files.
It can also update or improve existing content, and it only calls the AI if the input or prompt files have changed.

If it's called with a command file, it reads a number of command lines from that file. Empty lines separate individual command lines.
Lines starting with a # are ignored (comments). This saves the startup time when calling the tool multiple times.

Options:
