
feat: Provide traceback and function source code to LLM advice model #48

Open · wants to merge 41 commits into base: main

Conversation

@PCain02 (Collaborator) commented Nov 10, 2024

Pull Request

1. feat: Provide traceback and function source code to LLM advice model

2. List the names of those who contributed to the project.
@PCain02

3. Link the issue the pull request is meant to fix/resolve.

4. Add all labels that apply. (e.g., documentation, ready-for-review)

  • enhancement
  • test
  • bug

5. Describe the contents and goal of the pull request.
The goal of this PR is to provide more information to the LLM advice model, including the traceback and the actual source code of the function(s) that failed the tests.

6. Will coverage be maintained/increased?
Coverage will decrease slightly, but test cases were added to compensate; it still remains at about 61%.

7. What operating systems has this been tested on? How were these tests conducted?
Windows 10, so please test on macOS and Linux. The tests were run with `poetry run task test`, and also by prompting the advice model and checking whether the responses are more valuable than before. Now that the traceback and failing function source code are given to the LLM, the responses are much more relevant and should no longer try to correct the test cases.

8. Include a code block and/or screenshots displaying the functionality of your feature, if applicable/possible.

I know this is a lot to process, so these figures should help with code comprehension.
(screenshots of the implementation omitted)

This is an example output:
(screenshot of example output omitted)
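
As a rough illustration of the feature, here is a minimal sketch of how the traceback text and the failing function's source could be bundled for the advice prompt. This is not the actual implementation; the helper name `build_advice_context` is hypothetical.

```python
import inspect
from typing import Callable, Dict


def build_advice_context(tested_function: Callable, traceback_text: str) -> Dict[str, str]:
    """Bundle the failing function's source code and the traceback for the advice prompt."""
    # inspect.getsource retrieves the source code of the function under test
    return {
        "function_source": inspect.getsource(tested_function),
        "traceback": traceback_text,
    }
```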

@PCain02 added the labels enhancement (New feature or request) and test (Test cases or test running) on Nov 10, 2024
@PCain02 added the label bug (Something isn't working) on Nov 10, 2024
@gkapfham (Collaborator)

Hi @PCain02, it looks like this branch now has conflicts that need to be resolved. Can you please investigate this issue when you have time?

@gkapfham (Collaborator)

Also, @PCain02, can you give an example of a command line that can be run, and a repository on which it can be run, so that we can all quickly test this feature?

@PCain02 (Collaborator, Author) commented Nov 13, 2024

> Hi @PCain02, it looks like this branch now has conflicts that need to be resolved. Can you please investigate this issue when you have time?

Oh, thanks for pointing that out; I can resolve those ASAP!

@PCain02 (Collaborator, Author) commented Nov 13, 2024

> Also, @PCain02, can you give an example of a command line that can be run, and a repository on which it can be run, so that we can all quickly test this feature?

Absolutely. The command I have been using is `poetry run execexam . ./tests/test_question_one.py --report trace --report status --report failure --report code --report setup --advice-method apiserver --advice-model anthropic/claude-3-haiku-20240307 --advice-server https://execexamadviser.fly.dev/ --report advice --fancy --debug`, but the model and server can be swapped out for whatever you would like to test with. Because of the way the advice works, the prompt remains the same no matter the model.

@PCain02 (Collaborator, Author) commented Nov 13, 2024

Also, the version in the TOML file will need to be changed before merging. I am not sure what was decided in class about merge order.

@PCain02 (Collaborator, Author) commented Nov 13, 2024

PR #41 for the Windows spacing bug should be merged before this PR because I made this change with that fix in mind.

@CalebKendra (Collaborator) previously approved these changes Nov 14, 2024 and left a comment:

LGTM! This adds lots of needed depth to the LLM generation

coverage.json: review comment (outdated, resolved)
@rebekahrudd (Collaborator) previously approved these changes Nov 15, 2024 and left a comment:

It works on my Linux computer.

Here's the output that I see:
(screenshot of the output omitted)

@PCain02 (Collaborator, Author) commented Nov 16, 2024

Thank you! And when you ran `poetry run task test`, did it work as well?

@gkapfham (Collaborator)

Hi @rebekahrudd, thanks for running this on Linux; I appreciate your help. Do you think that the advice that the coding mentor provides is now more helpful?

@gkapfham (Collaborator)

Hello @PCain02, I know that it is not a specific feature of your tool, but I'm wondering whether or not the collected stack trace (that you are passing to the LLM) should actually be one of the options that you can specify through the use of `--report stacktrace`? It seems like this might be helpful information that a student would want, right? Please note that if we add this extra report, we then need to be clear about the fact that there is already a `--report trace` that might need to be renamed to `--report testtrace`.

With all of those points in mind, and as a part of the review of this PR (and of whether or not we decide to add this feature), can you please give an example of the stack trace that your tool collects and passes along to the LLM? I recognize that your tool does not currently (nor does it absolutely need to) display this information to the user. However, it would be helpful to see this information as it is passed to the LLM, since it is a major factor in the type of output that the LLM produces.

@gkapfham (Collaborator)

Hi @CalebKendra and @rebekahrudd and others who reviewed this PR, can you please respond to @PCain02 about whether or not the test suite passes correctly on your operating system and setup? (It should; however, we have had some problems where running the tests in GitHub Actions differs from running them on developer workstations.) Thanks!

@PCain02 (Collaborator, Author) commented Nov 21, 2024

> Hello @PCain02, I know that it is not a specific feature of your tool, but I'm wondering whether or not the collected stack trace (that you are passing to the LLM) should actually be one of the options that you can specify through the use of `--report stacktrace`? It seems like this might be helpful information that a student would want, right? Please note that if we add this extra report, we then need to be clear about the fact that there is already a `--report trace` that might need to be renamed to `--report testtrace`.
>
> With all of those points in mind, and as a part of the review of this PR (and of whether or not we decide to add this feature), can you please give an example of the stack trace that your tool collects and passes along to the LLM? I recognize that your tool does not currently (nor does it absolutely need to) display this information to the user. However, it would be helpful to see this information as it is passed to the LLM, since it is a major factor in the type of output that the LLM produces.

Hello! I really like that idea, although I think some formatting would be required to make it useful to the student. Right now, here is an example of what the traceback looks like when given to the LLM; it is a list of dictionaries that contain the values for each test that failed:
```python
[
    {
        'test_path': 'tests/test_question_one.py::test_find_maximum_value',
        'source_file': 'questions/question_one.py',
        'tested_function': 'find_maximum_value',
        'full_traceback': 'E AssertionError: Maximum positive value in matrix\n assert 9 == 0',
        'error_type': 'AssertionError',
        'error_message': 'Maximum positive value in matrix',
        'stack_trace': [],
        'variables': {},
        'assertion_detail': 'assert 9 == 0',
        'expected_value': 0,
        'actual_value': 9,
    },
    {
        'test_path': 'tests/test_question_one.py::test_find_average_value',
        'source_file': 'questions/question_one.py',
        'tested_function': 'find_average_value',
        'full_traceback': 'E AssertionError: Average value in matrix with positive numbers\n assert 5.0 == 0.06111111111111111',
        'error_type': 'AssertionError',
        'error_message': 'Average value in matrix with positive numbers',
        'stack_trace': [],
        'variables': {},
        'assertion_detail': 'assert 5.0 == 0.06111111111111111',
        'expected_value': 0.06111111111111111,
        'actual_value': 5.0,
    },
]
```
In order for this to be usable information for a student, I imagine that giving the `full_traceback` along with the function name would be the most helpful. However, it may not be understandable in this format on the user end, so we could use f-strings to turn it into a phrase or phrases that the user can understand.
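
As a rough sketch of that f-string idea (not part of this PR; the helper name `format_failure` and the variable `failing_tests` are hypothetical, and only the dictionary keys shown above are assumed):

```python
def format_failure(info: dict) -> str:
    """Turn one failure dictionary (as shown above) into a short, student-readable phrase."""
    return (
        f"Test {info['test_path']} failed while checking "
        f"{info['tested_function']} in {info['source_file']}: "
        f"{info['error_type']}: {info['error_message']} "
        f"(expected {info['expected_value']!r}, got {info['actual_value']!r})."
    )


# usage sketch: failing_tests would be the list of dictionaries shown above
# phrases = [format_failure(entry) for entry in failing_tests]
```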

--
I would also like to show how the function source code is now being given to the LLM. It literally looks like this: a list of functions, where each function is a list whose items are the lines of the function as strings.
```python
[
    [
        'def find_maximum_value(matrix: List[List[int]]) -> Union[int, None]:',
        '"""Return the maximum value in the provided matrix."""',
        '# confirm that there is a value in the [0][0] position',
        'if not matrix or not matrix[0]:',
        'return None',
        'maximum_value = matrix[0][0]',
        'for row in matrix:',
        'for value in row:',
        'if value > maximum_value:',
        'maximum_value = value',
        'maximum_value = 0',
        'return maximum_value',
    ],
    [
        'def find_average_value(matrix: List[List[int]]) -> Union[float, None]:',
        '"""Find the average value in the provided matrix."""',
        '# check if the matrix is empty',
        'if not matrix or not matrix[0]:',
        'return None',
        '# initialize sum and count variables',
        'total_sum = 10',
        'count = 0',
        '# iterate over the matrix to calculate the sum and count the number of elements',
        'for row in matrix:',
        'for value in row:',
        'total_sum += value',
        'count += 100',
        '# calculate and return the average value',
        'return total_sum / count',
    ],
]
```
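
Note that the stored lines appear to have lost their original indentation. A minimal sketch (the name `rebuild_source` is hypothetical, not part of this PR) of joining this structure back into readable text before placing it in the prompt could look like this:

```python
from typing import List


def rebuild_source(functions: List[List[str]]) -> str:
    """Join each per-function list of source lines back into one readable text block."""
    # each inner list is one function; each string is one source line
    return "\n\n".join("\n".join(lines) for lines in functions)
```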

@CalebKendra (Collaborator)

> Hi @CalebKendra and @rebekahrudd and others who reviewed this PR, can you please respond to @PCain02 about whether or not the test suite passes correctly on your operating system and setup? (It should; however, we have had some problems where running the tests in GitHub Actions differs from running them on developer workstations.) Thanks!

@PCain02 I just tested the pytest suite with Ubuntu and it passed.

@PCain02 dismissed stale reviews from rebekahrudd and CalebKendra via e475565 on November 21, 2024 at 21:55
@rebekahrudd (Collaborator)

> Hi @rebekahrudd, thanks for running this on Linux; I appreciate your help. Do you think that the advice that the coding mentor provides is now more helpful?

Yes, this is much more helpful now!

@rebekahrudd (Collaborator)

> Thank you! And when you ran `poetry run task test`, did it work as well?

Yes! It worked when I ran `poetry run task test`:
(screenshot from 2024-11-22 of the passing test run omitted)

@rebekahrudd (Collaborator) left a comment:

Here is the output I see on Linux; it looks good!

(screenshot omitted)

Labels: bug (Something isn't working), enhancement (New feature or request), test (Test cases or test running)
Projects: none yet
4 participants