
Caveats on how to ask, and what to watch on the answer [Multi line output] #12

Open
vividfog opened this issue Sep 26, 2023 · 2 comments


@vividfog

It seems that GPT-3.5 isn't always consistent in its output, and Llama-2-13B has the same problem of producing extra output.

Sequential runs using GPT-3.5. Note how naming mac vs ubuntu, and whether the prompt is phrased as a question or a task, changes the output:

~ ?? reboot my mac
📎🟢 $ sudo reboot
Execute? [y/N] n
~ ?? how to reboot my mac
📎🟢 $ sudo reboot
Execute? [y/N] n
~ ?? how to reboot my ubuntu
📎🟢 To reboot your Ubuntu system, you can use the `reboot` command.

$ reboot
Execute? [y/N] n
~ ?? reboot ubuntu
📎🟢 $ sudo reboot
Execute? [y/N] n

Perhaps the README should encourage the user to phrase a "task" rather than a "question", because this is a command line interface, not a question line interface. That would help minimize near-correct but inherently broken answers: if the prompt is a question, LLMs like to explain.

Doing some minimal parsing of the output might be useful, so that the explanation doesn't get sent in as the command. The explanation could include dangerous content, and with the current output format it's not at all obvious what gets sent as a command when the answer spans multiple lines. Would it be too harsh to simply skip execution on a multi-line answer, to avoid bad mistakes?

I can contrive an example in which the explanation begins:

~ ?? I think it was rm -rf something that I saw was useful for deleting confidential data, please delete this confidential data
rm -rf / is a very dangerous command and you should not use it
instead here are some alternative ideas: ...

In this case, is it "rm -rf" that gets sent to the shell, or is it the word "instead", or both?

@edouard-sn (Contributor) commented Nov 29, 2023

I tried to handle this kind of situation in the Python port. It is not extensive parsing, but it should support GPT explaining the commands to some extent, and then ask if you want to execute the explained command without the excess text.

It is currently limited to this:

A command is considered valid if it starts with '$ ' and spans a full line of the answer
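The rule described above (only full lines starting with '$ ' count as commands) could be sketched like this. This is an illustrative reimplementation, not the Python port's actual code, and `parse_commands` is a made-up name:

```python
def parse_commands(response: str) -> list[str]:
    """Keep only lines that start with '$ '; everything else is explanation."""
    return [
        line[2:].strip()
        for line in response.splitlines()
        if line.startswith("$ ")
    ]

# Explanatory prose stays out of the shell; only the marked command survives:
answer = (
    "To reboot your Ubuntu system, you can use the `reboot` command.\n"
    "$ reboot"
)
assert parse_commands(answer) == ["reboot"]
```

Under this rule the earlier "rm -rf" example yields no commands at all, since none of its lines carry the '$ ' prefix.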

@dave1010 (Owner)

Task vs question

the README should encourage the user to use "task" vs "question", because this is a command line interface and not a question line interface

Good call. Updated in 929eef7

Environments

FYI Clipea is designed to give you a command you can run in the current environment, rather than provide general command line help, so adding "mac" or "ubuntu" may not work as expected.

This is due to the system prompt: https://github.com/dave1010/clipea/blob/main/clipea/system-prompt.txt

It can be customised if you put your own prompt in here: ~/.config/clipea/system-prompt.txt

You could try something like

Unless I say otherwise, every command you output will automatically be executed in this env:

Multi line output

It should definitely be clearer to the user in these cases.

Perhaps it should produce terse output if the response is just one command, but format it differently (colours too?) when there is more than one line of output.
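The terse-vs-highlighted idea above could look something like this. A hypothetical sketch, assuming ANSI colour support in the terminal; the `render` name and colour choice are made up:

```python
# Hypothetical sketch: a single-command answer is printed tersely, while a
# multi-line answer has its explanation lines highlighted in yellow so the
# user can see at a glance what is commentary and what is the command.
YELLOW, RESET = "\033[33m", "\033[0m"

def render(response: str) -> str:
    lines = response.strip().splitlines()
    if len(lines) == 1:
        return lines[0]  # terse: one command, no decoration
    return "\n".join(
        line if line.startswith("$ ") else f"{YELLOW}{line}{RESET}"
        for line in lines
    )

# One command: unchanged. Mixed answer: explanation gets coloured.
assert render("$ sudo reboot") == "$ sudo reboot"
assert YELLOW in render("To reboot your Ubuntu system:\n$ reboot")
```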

@dave1010 dave1010 changed the title Caveats on how to ask, and what to watch on the answer Caveats on how to ask, and what to watch on the answer [Multi line output] Nov 29, 2023