Documentation for prompts & agents #79

Closed
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -13,7 +13,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: ['3.8', '3.9', '3.10', '3.11']
python-version: ['3.9', '3.10', '3.11']

steps:
- uses: actions/checkout@v4
99 changes: 89 additions & 10 deletions README.md
@@ -8,7 +8,16 @@ You specify what kind of an app you want to build. Then, GPT Pilot asks clarifyi
<!-- TOC -->
* [🔌 Requirements](#-requirements)
* [🚦How to start using gpt-pilot?](#how-to-start-using-gpt-pilot)
* [🧑‍💻️ Other arguments](#%EF%B8%8F-other-arguments)
* [🐳 How to start gpt-pilot in docker?](#how-to-start-gpt-pilot-in-docker)
* [🧑‍💻️ CLI arguments](#%EF%B8%8F-cli-arguments)
* [`app_id` and `workspace`](#app_id-and-workspace)
* [`user_id`, `email` and `password`](#user_id-email-and-password)
* [`app_type` and `name`](#app_type-and-name)
* [`step`](#step)
* [`skip_until_dev_step`](#skip_until_dev_step)
* [`advanced`](#advanced)
* [`delete_unrelated_steps`](#delete_unrelated_steps)
* [`update_files_before_start`](#update_files_before_start)
* [🔎 Examples](#-examples)
* [Real-time chat app](#-real-time-chat-app)
* [Markdown editor](#-markdown-editor)
@@ -49,8 +58,7 @@ https://github.com/Pythagora-io/gpt-pilot/assets/10895136/0495631b-511e-451b-93d

# 🔌 Requirements


- **Python**
- **Python >= 3.9**
- **PostgreSQL** (optional, the project's default is SQLite)
- A database is needed for several reasons: continuing app development if you had to stop at any point or the app crashed, going back to a specific step so you can change later steps in development, and easier debugging. In the future, we will add functionality to update a project (changing things in an existing project, adding new features, and so on).

@@ -76,6 +84,7 @@ All generated code will be stored in the folder `workspace` inside the folder na
**IMPORTANT: To run GPT Pilot, you need to have PostgreSQL set up on your machine**
<br>


# 🐳 How to start gpt-pilot in docker?
1. `git clone https://github.com/Pythagora-io/gpt-pilot.git` (clone the repo)
2. Update the `docker-compose.yml` environment variables
@@ -87,28 +96,96 @@ All generated code will be stored in the folder `workspace` inside the folder na

This will start two containers, one being a new image built by the `Dockerfile` and a postgres database. The new image also has [ttyd](https://github.com/tsl0922/ttyd) installed so you can easily interact with gpt-pilot.

# 🧑‍💻️ Other arguments
- continue working on an existing app

# 🧑‍💻️ CLI arguments

## `app_id` and `workspace`
Continue working on an existing app using **`app_id`**
```bash
python main.py app_id=<ID_OF_THE_APP>
```

- continue working on an existing app from a specific step
_or_ **`workspace`** path:

```bash
python main.py workspace=<PATH_TO_PROJECT_WORKSPACE>
```

Each user can have their own workspace path for each App. (See [`user_id`](#user_id-email-and-password))
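
The `key=value` style used by these arguments can be sketched as follows. This is a minimal illustration, not gpt-pilot's actual parser: the function name `parse_cli_args` and the coercion of `True`/`False` strings to booleans are assumptions made for the example.

```python
def parse_cli_args(argv):
    """Parse `key=value` tokens into a dict (hypothetical helper).

    Bare tokens without `=` are treated as boolean flags, and the
    strings "true"/"false" are coerced to booleans. Both behaviours
    are assumptions, not confirmed gpt-pilot behaviour.
    """
    args = {}
    for token in argv:
        # Split on the first `=` only, so values may themselves contain `=`.
        key, sep, value = token.partition("=")
        if not sep:
            args[key] = True
        elif value.lower() in ("true", "false"):
            args[key] = value.lower() == "true"
        else:
            args[key] = value
    return args


example = parse_cli_args(["app_id=1234", "step=user_tasks", "advanced=True"])
print(example)
```

In a real entry point this would be called with `sys.argv[1:]`.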


## `user_id`, `email` and `password`
These values will be saved to the User table in the DB.

```bash
python main.py user_id=me_at_work
```

If not specified, `user_id` defaults to the OS username, but can be provided explicitly if your OS username differs from your GitHub or work username. This value is used to load the `App` config when the `workspace` arg is provided.

If not specified, `email` will be parsed from `~/.gitconfig` if the file exists.
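
The two defaults described above could look roughly like this. It is a sketch under stated assumptions: `email_from_gitconfig` and `default_identity` are hypothetical names, and the parser only handles the simple `[user]` / `email = ...` layout, not the full git config syntax.

```python
import getpass
import re
from pathlib import Path


def email_from_gitconfig(text):
    """Return the `email` value from the [user] section, or None."""
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        header = re.match(r"\[\s*([^\]\s\"]+)", line)
        if header:
            section = header.group(1).lower()
            continue
        if section == "user":
            key, sep, value = line.partition("=")
            if sep and key.strip().lower() == "email":
                return value.strip()
    return None


def default_identity(gitconfig_path=None):
    """Fall back to the OS username and the git-configured email (hypothetical)."""
    path = Path(gitconfig_path) if gitconfig_path else Path.home() / ".gitconfig"
    email = email_from_gitconfig(path.read_text()) if path.is_file() else None
    return {"user_id": getpass.getuser(), "email": email}
```

An explicitly passed `user_id` or `email` would simply take precedence over the values returned by `default_identity()`.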

See also [What's the purpose of arguments.password / User.password?](https://github.com/Pythagora-io/gpt-pilot/discussions/55)

---

## `app_type` and `name`
If not provided, the ProductOwner will ask for these values.

`app_type` is used as a hint to the LLM as to what kind of architecture, language options and conventions would apply. If not provided, `prompts.prompts.ask_for_app_type()` will ask for it.

See `const.common.ALL_TYPES`: 'Web App', 'Script', 'Mobile App', 'Chrome Extension'

---

## `step`
Continue working on an existing app from a specific **`step`** (eg: `user_tasks`)
```bash
python main.py app_id=<ID_OF_THE_APP> step=<STEP_FROM_CONST_COMMON>
```

- continue working on an existing app from a specific development step

## `skip_until_dev_step`
- Continue working on an existing app from a specific **development step**
```bash
python main.py app_id=<ID_OF_THE_APP> skip_until_dev_step=<DEV_STEP>
```
This is basically the same as `step` but during the actual development process. If you want to play around with gpt-pilot, this is likely the flag you will often use.
<br>
- erase all development steps previously done and continue working on an existing app from start of development

- Erase all development steps previously done and continue working on an existing app from start of development

```bash
python main.py app_id=<ID_OF_THE_APP> skip_until_dev_step=0
```

---

## `advanced`
The Architect by default favours certain technologies including:

- Node.js
- MongoDB
- PeeWee ORM
- Jest & PyUnit
- Bootstrap
- Vanilla JavaScript
- Socket.io

If you have your own preferences, you can have a deeper conversation with the Architect.

```bash
python main.py advanced=True
```


## `delete_unrelated_steps`


## `update_files_before_start`



# 🔎 Examples

Here are a couple of example apps GPT Pilot created by itself:
@@ -155,8 +232,10 @@ Here are the steps GPT Pilot takes to create an app:
4. **Architect agent** writes up technologies that will be used for the app
5. **DevOps agent** checks if all technologies are installed on the machine and installs them if they are not
6. **Tech Lead agent** writes up development tasks that Developer will need to implement. This is an important part because, for each step, Tech Lead needs to specify how the user (real world developer) can review if the task is done (eg. open localhost:3000 and do something)
7. **Developer agent** takes each task and writes up what needs to be done to implement it. The description is in human readable form.
8. Finally, **Code Monkey agent** takes the Developer's description and the currently implement file and implements the changes into it. We realized this works much better than giving it to Developer right away to implement changes.
7. **Developer agent** takes each task and writes up what needs to be done to implement it. The description is in human-readable form.
8. Finally, **Code Monkey agent** takes the Developer's description and the existing file and implements the changes into it. We realized this works much better than giving it to Developer right away to implement changes.

For more details on the roles of the agents employed by GPT Pilot, refer to [AGENTS.md](https://github.com/Pythagora-io/gpt-pilot/blob/main/pilot/helpers/agents/AGENTS.md).

![GPT Pilot Coding Workflow](https://github.com/Pythagora-io/gpt-pilot/assets/10895136/53ea246c-cefe-401c-8ba0-8e4dd49c987b)

64 changes: 64 additions & 0 deletions pilot/helpers/agents/AGENTS.md
@@ -0,0 +1,64 @@
Roles are defined in `const.common.ROLES`.
Each agent's role is described to the LLM by a prompt in `pilot/prompts/system_messages/{role}.prompt`

## Product Owner
`project_description`, `user_stories`, `user_tasks`

- Talk to client, ask detailed questions about what client wants
- Give specifications to dev team


## Architect
`architecture`

- Scripts: Node.js, MongoDB, PeeWee ORM
- Testing: Node.js -> Jest, Python -> pytest, E2E -> Cypress **(TODO - BDD?)**
- Frontend: Bootstrap, vanilla JavaScript **(TODO - TypeScript, Material/Styled, React/Vue/other?)**
- Other: cronjob, Socket.io

TODO:
- README.md
- .gitignore
- .editorconfig
- LICENSE
- CI/CD
- IaC, Dockerfile


## Tech Lead
`development_planning`

- Break down the project into smaller tasks for devs.
- Specify each task as clearly as possible:
  - Description
  - "Programmatic goal", which determines if the task can be marked as done.
    e.g. "server needs to be able to start running on port 3000 and accept API requests
    to the URL `http://localhost:3000/ping`, returning the status code 200"
  - "User-review goal"
    e.g. "run `npm run start` and open `http://localhost:3000/ping`, see "Hello World" on the screen"
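
The task structure described for the Tech Lead could be modelled like this. It is a sketch only: the `DevTask` class and its field names are hypothetical and are not taken from the gpt-pilot code base, and the `description` text is invented for the example. The two goal strings paraphrase the examples above.

```python
from dataclasses import dataclass


@dataclass
class DevTask:
    """One development task as specified by the Tech Lead (hypothetical model)."""
    description: str
    programmatic_goal: str   # machine-checkable condition for "done"
    user_review_goal: str    # manual check the human reviewer performs

    def is_fully_specified(self):
        # A task is only usable if all three parts are non-empty.
        parts = (self.description, self.programmatic_goal, self.user_review_goal)
        return all(part.strip() for part in parts)


ping_task = DevTask(
    description="Add a /ping endpoint to the server",  # invented for illustration
    programmatic_goal=(
        "Server starts on port 3000 and a request to "
        "http://localhost:3000/ping returns status code 200"
    ),
    user_review_goal=(
        'Run `npm run start`, open http://localhost:3000/ping '
        'and see "Hello World" on the screen'
    ),
)
print(ping_task.is_fully_specified())
```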


## Dev Ops
`environment_setup`

**TODO: no prompt**

`debug` functions: `run_command`, `implement_code_changes`


## Developer (full_stack_developer)
`create_scripts`, `coding` **(TODO: No entry in `STEPS` for `create_scripts`)**

- Implement tasks assigned by tech lead
- Modular code, TDD
- Tasks provided as "programmatic goals" **(TODO: consider BDD)**



## Code Monkey
**TODO: not listed in `ROLES`**

`development/implement_changes` functions: `save_files`

- Implement tasks assigned by tech lead
- Modular code, TDD
1 change: 1 addition & 0 deletions pilot/helpers/agents/Architect.py
@@ -39,6 +39,7 @@ def get_architecture(self):
# 'user_tasks': self.project.user_tasks,
'app_type': self.project.args['app_type']}, ARCHITECTURE)

# TODO: Project.args should be a defined class so that all of the possible args are more obvious
if self.project.args.get('advanced', False):
architecture = get_additional_info_from_user(self.project, architecture, 'architect')

6 changes: 4 additions & 2 deletions pilot/helpers/agents/CodeMonkey.py
@@ -10,17 +10,19 @@ def __init__(self, project, developer):
self.developer = developer

def implement_code_changes(self, convo, code_changes_description, step_index=0):
if convo == None:
if convo is None:
convo = AgentConvo(self)

# "... step {i} - {step.description}.
# To do this, you will need to see the local files
# Ask for files relative to project root."
files_needed = convo.send_message('development/task/request_files_for_code_changes.prompt', {
"step_description": code_changes_description,
"directory_tree": self.project.get_directory_tree(True),
"step_index": step_index,
"finished_steps": ', '.join(f"#{j}" for j in range(step_index))
}, GET_FILES)


changes = convo.send_message('development/implement_changes.prompt', {
"step_description": code_changes_description,
"step_index": step_index,
41 changes: 33 additions & 8 deletions pilot/helpers/agents/Developer.py
@@ -2,7 +2,6 @@
import uuid
from termcolor import colored
from utils.questionary import styled_text
from helpers.files import update_file
from utils.utils import step_already_finished
from helpers.agents.CodeMonkey import CodeMonkey
from logger.logger import logger
@@ -13,7 +12,9 @@
from const.function_calls import FILTER_OS_TECHNOLOGIES, DEVELOPMENT_PLAN, EXECUTE_COMMANDS, GET_TEST_TYPE, DEV_TASKS_BREAKDOWN, IMPLEMENT_TASK
from database.database import save_progress, get_progress_steps, save_file_description
from utils.utils import get_os_info
from helpers.cli import execute_command

ENVIRONMENT_SETUP_STEP = 'environment_setup'


ENVIRONMENT_SETUP_STEP = 'environment_setup'

@@ -40,23 +41,44 @@ def start_coding(self):

def implement_task(self):
convo_dev_task = AgentConvo(self)
# TODO: why "This should be a simple version of the app so you don't need to aim to provide a production ready code"?
# TODO: why `no_microservices`? Is that even applicable?
task_description = convo_dev_task.send_message('development/task/breakdown.prompt', {
"name": self.project.args['name'],
"app_type": self.project.args['app_type'],
"app_summary": self.project.project_description,
"clarification": [],
# TODO: why all stories at once?
"user_stories": self.project.user_stories,
# "user_tasks": self.project.user_tasks,
# TODO: "I'm currently in an empty folder" may not always be true?
"technologies": self.project.architecture,
# TODO: `array_of_objects_to_string` does not seem to be used by the prompt template?
"array_of_objects_to_string": array_of_objects_to_string,
# TODO: prompt lists `files` if `current_task_index` != 0
"directory_tree": self.project.get_directory_tree(True),
})

task_steps = convo_dev_task.send_message('development/parse_task.prompt', {}, IMPLEMENT_TASK)
convo_dev_task.remove_last_x_messages(2)
self.execute_task(convo_dev_task, task_steps, continue_development=True)

def execute_task(self, convo, task_steps, test_command=None, reset_convo=True, test_after_code_changes=True, continue_development=False):
def execute_task(self, convo: AgentConvo, task_steps, test_command=None, reset_convo=True, test_after_code_changes=True, continue_development=False):
"""
:param convo:
:param task_steps: [{
type: 'command|code_change|human_intervention',
command: { command: '', timeout: 1000ms }
code_change: { name: 'file name', path: '/path/to/file', content: "console.info('Hello');" },
(or code_change_description: str)
human_intervention_description: 'description of step in debugging'
}, ...]
:param test_command: None
:param reset_convo: True
:param test_after_code_changes: True
:param continue_development: False
:return:
"""
function_uuid = str(uuid.uuid4())
convo.save_branch(function_uuid)

@@ -75,6 +97,7 @@ def execute_task(self, convo, task_steps, test_command=None, reset_convo=True, t
run_command_until_success(data['command'], data['timeout'], convo, additional_message=additional_message)

elif step['type'] == 'code_change' and 'code_change_description' in step:
# DEV_TASKS_BREAKDOWN
# TODO this should be refactored so it always uses the same function call
print(f'Implementing code changes for `{step["code_change_description"]}`')
code_monkey = CodeMonkey(self.project, self)
@@ -83,6 +106,7 @@ def execute_task(self, convo, task_steps, test_command=None, reset_convo=True, t
self.test_code_changes(code_monkey, updated_convo)

elif step['type'] == 'code_change':
# IMPLEMENT_TASK
# TODO fix this - the problem is in GPT response that sometimes doesn't return the correct JSON structure
if 'code_change' not in step:
data = step
@@ -158,7 +182,6 @@ def continue_development(self, iteration_convo):

def set_up_environment(self):
self.project.current_step = ENVIRONMENT_SETUP_STEP
self.convo_os_specific_tech = AgentConvo(self)

# If this app_id already did this step, just get all data from DB and don't ask user again
step = get_progress_steps(self.project.args['app_id'], ENVIRONMENT_SETUP_STEP)
@@ -178,7 +201,9 @@ def set_up_environment(self):
logger.info(f"Setting up the environment...")

os_info = get_os_info()
os_specific_technologies = self.convo_os_specific_tech.send_message('development/env_setup/specs.prompt',

convo_os_specific_tech = AgentConvo(self)
os_specific_technologies = convo_os_specific_tech.send_message('development/env_setup/specs.prompt',
{
"name": self.project.args['name'],
"app_type": self.project.args['app_type'],
@@ -188,7 +213,7 @@ def set_up_environment(self):

for technology in os_specific_technologies:
# TODO move the functions definitions to function_calls.py
cli_response, llm_response = self.convo_os_specific_tech.send_message('development/env_setup/install_next_technology.prompt',
cli_response, llm_response = convo_os_specific_tech.send_message('development/env_setup/install_next_technology.prompt',
{ 'technology': technology}, {
'definitions': [{
'name': 'execute_command',
@@ -215,11 +240,11 @@ def set_up_environment(self):
})

if llm_response != 'DONE':
installation_commands = self.convo_os_specific_tech.send_message('development/env_setup/unsuccessful_installation.prompt',
installation_commands = convo_os_specific_tech.send_message('development/env_setup/unsuccessful_installation.prompt',
{ 'technology': technology }, EXECUTE_COMMANDS)
if installation_commands is not None:
for cmd in installation_commands:
run_command_until_success(cmd['command'], cmd['timeout'], self.convo_os_specific_tech)
run_command_until_success(cmd['command'], cmd['timeout'], convo_os_specific_tech)

logger.info('The entire tech stack needed is installed and ready to be used.')
