feat: Provide traceback and function source code to LLM advice model #48
base: main
Conversation
Hi @PCain02, it looks like this branch now has conflicts that need to be resolved. Can you please investigate this issue when you have time?
Also, @PCain02, can you give an example of a command line that can be run, and a repository on which it can be run, so that we can all quickly test this feature?
Oh, thanks for pointing that out! I can resolve those ASAP.
Absolutely, the command I have been using is |
Oh, also, the toml version will need to be changed before merging. I am not sure what people decided in class about the merge order.
PR #41 for the Windows spacing bug should get merged before this PR because I made this with that fix in mind. |
LGTM! This adds a lot of needed depth to the LLM generation.
Thank you! And when you guys ran |
Hi @rebekahrudd, thanks for running this on Linux; I appreciate your help. Do you think that the advice that the coding mentor provides is now more helpful?
Hello @PCain02, I know that it is not a specific feature of your tool, but I'm wondering whether the collected stack trace (that you are passing to the LLM) should actually be one of the options that you can specify on the command line. With all of those points in mind, as part of the review of this PR and the decision about whether or not to add this feature, can you please give an example of the stack trace that your tool collects and passes along to the LLM? I recognize that your tool does not currently (nor does it absolutely need to) display this information to the user; however, it would be helpful if we could see this information as it is passed to the LLM, since it is a major factor in the type of output that the LLM produces.
Hi @CalebKendra and @rebekahrudd and others who reviewed this PR, can you please respond to @PCain02 about whether or not the test suite passes correctly on your operating system and setup? (It should; however, we've had some problems where running the tests in GitHub Actions behaves differently than running the tests on developer workstations.) Thanks!
Hello! I really like that idea, although I think some formatting would be required to make it useful to the student right now. Here is an example of what the traceback looks like when it is given to the LLM: it is a list of dictionaries that contain the values for each test that failed.
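A rough, hypothetical sketch of that "list of dictionaries" structure and of the kind of formatting pass discussed above (every field name below is an assumption for illustration; the PR's actual dictionary keys are not shown in this conversation):

```python
# Hypothetical shape of one failure record per failed test.
# The keys "test_name", "traceback", and "source" are assumptions,
# not necessarily the names this PR actually uses.
failed_tests = [
    {
        "test_name": "test_add",
        "traceback": "AssertionError: assert add(1, 2) == 4",
        "source": "def add(a, b):\n    return a + b\n",
    },
]

def format_for_llm(failures):
    """Flatten the failure records into a plain-text prompt fragment."""
    parts = []
    for record in failures:
        parts.append(f"Failing test: {record['test_name']}")
        parts.append(f"Traceback: {record['traceback']}")
        parts.append(f"Function source:\n{record['source']}")
    return "\n".join(parts)

print(format_for_llm(failed_tests))
```

Some light flattening like this before the records reach the prompt is probably also what would make the same information readable for a student.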
@PCain02 I just tested the pytest suite with Ubuntu and it passed. |
e475565
Yes this is much more helpful now! |
Pull Request
1. feat: Provide traceback and function source code to LLM advice model
2. List the names of those who contributed to the project.
@PCain02
3. Link the issue the pull request is meant to fix/resolve.
4. Add all labels that apply. (e.g., documentation, ready-for-review)
5. Describe the contents and goal of the pull request.
The goal of this PR is to provide more information to the LLM advice model. This includes the traceback and the actual source code of the function(s) that failed the test.
6. Will coverage be maintained/increased?
Coverage will decrease slightly, but test cases were added to compensate for that; it remains at about 61%.
7. What operating systems has this been tested on? How were these tests conducted?
Windows 10, so please test on macOS and Linux. The tests were run with
poetry run task test
but also by prompting the advice model and looking at the responses to see if they are more valuable than before. Now that the traceback and the failing function's source code are given to the LLM, the responses are a lot more relevant and should no longer try to correct the test cases.
8. Include a code block and/or screenshots displaying the functionality of your feature, if applicable/possible.
I know this is a lot to process, so these figures should help with code comprehension.
This is an example output:
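To complement the figures, here is a minimal, hypothetical sketch of how a traceback and a failing function's source code could be gathered with Python's standard library (this is an illustration under stated assumptions, not necessarily this PR's actual implementation; the function `collect_context` and the stand-in `add` are invented for the example):

```python
import inspect
import traceback

def add(a, b):
    """Stand-in for a student-written function under test."""
    return a + b

def collect_context(func, exc):
    """Gather the two pieces this PR describes sending to the advice
    model: the formatted traceback and the failing function's source."""
    formatted = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__)
    )
    return {"traceback": formatted, "source": inspect.getsource(func)}

context = None
try:
    # Deliberately failing check, standing in for a failed pytest case.
    assert add(1, 2) == 4, "add(1, 2) should equal 4"
except AssertionError as exc:
    context = collect_context(add, exc)

print(context["traceback"])
print(context["source"])
```

Note that `inspect.getsource` only works when the function's source file is available, which is the usual situation when running a student's test suite from disk.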