CoT is after the answer #15

Open
Tostino opened this issue Jul 29, 2024 · 6 comments
Comments

@Tostino

Tostino commented Jul 29, 2024

I see that in quite a few examples there is essentially a CoT for the math steps, but it comes after the final answer is provided. This is much harder for the model to learn than having the CoT come before the final answer. It also doesn't help accuracy as much, because the final answer is already in the context window before the model starts working through the calculation steps.

@N8python
Contributor

This is ideally a human-readable database, so answer -> justification may have crept in because human readers have no problem with that order. However, your point is well taken - maybe we should add a contributing guideline going forward to use:

CoT
answer

@Tostino
Author

Tostino commented Jul 31, 2024

I'd err on the "fix the examples that have incorrect ordering" side of things if it were me, as it makes downstream data generation more likely to have a usable format. There aren't that many examples at this point, so it's easier to do it now than later when there are many more in this repo.

That's just my $0.02 though.

@teknium1
Contributor

teknium1 commented Aug 4, 2024

> I'd err on the "fix the examples that have incorrect ordering" side of things if it were me, as it makes downstream data generation more likely to have a usable format. There aren't that many examples at this point, so it's easier to do it now than later when there are many more in this repo.
>
> That's just my $0.02 though.

If you have some time, would you want to help us reorder the CoT in the examples to come first? Thanks!

@Tostino
Author

Tostino commented Aug 8, 2024

I'll take a stab at it. Building a little classifier to help identify them is going to be my first step, then I'll do some manual corrections and try to automate fixing them.

It honestly seems like quite a useful tool if actually fleshed out.
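
To give an idea of the direction, the first pass could be as simple as a heuristic filter before any model-based classification - something like the sketch below (the patterns and function are just illustrative, not anything in this repo yet):

import re

# Illustrative heuristic: flag responses that open with a conclusion-style
# sentence and only bring in reasoning language afterwards.
CONCLUSION_OPENERS = re.compile(
    r"^(the answer is|this scenario contains|yes,|no,|in conclusion|therefore)",
    re.IGNORECASE,
)
REASONING_MARKERS = re.compile(
    r"\b(because|let's|step \d|first,|to determine|if we)\b",
    re.IGNORECASE,
)

def looks_post_hoc(response: str) -> bool:
    """Return True when a conclusion appears before any reasoning language."""
    sentences = re.split(r"(?<=[.!?])\s+", response.strip())
    if not sentences:
        return False
    opens_with_conclusion = bool(CONCLUSION_OPENERS.match(sentences[0]))
    reasons_later = any(REASONING_MARKERS.search(s) for s in sentences[1:])
    return opens_with_conclusion and reasons_later

# Anything flagged here goes to manual review; a trained classifier can
# take over once enough labeled examples exist.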

@Tostino
Author

Tostino commented Aug 18, 2024

After working with gpt-4o and claude-3.5-sonnet on this issue, I'm pretty convinced that they are intentionally structuring their outputs this way by default to confound training other models on purely synthetic data.

The commercial models are just so inclined to put the reasoning steps after saying something that has already steered their output.

I have a pretty good dataset of examples so far though, and now have a pipeline that can identify other instances of post-hoc reasoning somewhat reliably. Going to keep working on it a bit more.

@Tostino
Author

Tostino commented Sep 3, 2024

Alright. My initial attempt at fine-tuning llama3.1 8b didn't work well enough with my small sample size. I changed my approach and went with a very detailed, structured system prompt, which improved accuracy a ton, but still not quite enough to get accurate results on my complicated prompts/responses with llama3.1 8b or gpt-4o-mini. Claude 3.5 Sonnet or gpt-4o work fine with it as long as the query/response aren't too large. I'll be building a golden set of these responses and attempting another training run on a small model. I want to be able to run this type of analysis on large datasets, and that is far too expensive with Anthropic or OpenAI.

Anyways, here is the system prompt with some integrated examples:
reasoning-prompt-enhanced.md

Here is how you would prompt it, in a single user prompt section (you may need to change the user/assistant identifiers for use with OpenAI):

<|user|>
{prompt}
<|assistant|>
{response}
<|user|>
Can you analyze the reasoning in your prior response?
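
For reference, with the OpenAI chat API the same thing can be packed into a single user message under the system prompt, roughly like this (the model name and file handling are placeholders, not a tested snippet):

from openai import OpenAI

client = OpenAI()

system_prompt = open("reasoning-prompt-enhanced.md").read()  # the attached system prompt
prompt = "..."    # original user prompt being analyzed
response = "..."  # original assistant response being analyzed

transcript = (
    f"<|user|>\n{prompt}\n"
    f"<|assistant|>\n{response}\n"
    f"<|user|>\nCan you analyze the reasoning in your prior response?"
)

result = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": transcript},
    ],
)
print(result.choices[0].message.content)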

A full example would look like:

<|user|>
Identify any sarcasm or irony in this scenario:
A student who never studies brags to his friends, "I'm so prepared for this exam, I didn't even open a book!"
<|assistant|>
This scenario contains irony. The student's statement is sarcastic because they are claiming to be prepared for the exam while simultaneously admitting to not studying at all. The irony lies in the contradiction between being prepared and not studying, as these typically don't go together. The student is likely using this sarcastic statement to mask their lack of preparation or to appear nonchalant about the exam.
<|user|>
Can you analyze the reasoning in your prior response?

Response from Claude 3.5 Sonnet:

analysis:
  input:
    user_query:
      q1:
        text: "Identify any sarcasm or irony in this scenario: A student who never studies brags to his friends, \"I'm so prepared for this exam, I didn't even open a book!\""
        type: "literary analysis"
    instructions: []

  response_markers:
    r1:
      start: "This scenario contains irony."
      end: "The student is likely using this sarcastic statement to mask their lack of preparation or to appear nonchalant about the exam."

  response_structure:
    r1:
      - content: "This scenario contains irony."
        part_type: "conclusion"
      - content: "The student's statement is sarcastic because they are claiming to be prepared for the exam while simultaneously admitting to not studying at all."
        part_type: "explanation"
      - content: "The irony lies in the contradiction between being prepared and not studying, as these typically don't go together."
        part_type: "analysis"
      - content: "The student is likely using this sarcastic statement to mask their lack of preparation or to appear nonchalant about the exam."
        part_type: "interpretation"

  individual_analysis:
    a1:
      question_id: "q1"
      response_id: "r1"
      reasoning_analysis:
        context_considerations:
          - factor: "Literary analysis question"
            impact: "Required identification and explanation of literary devices (sarcasm and irony)"
        issues:
          - description: "Conclusion stated before reasoning"
            severity: 3
            explanation: "The response begins with the conclusion that the scenario contains irony before providing the reasoning behind this assessment."
        criteria_violated:
          - "Conclusion presented before supporting evidence"
        criteria_met: 
          - "Identifies both sarcasm and irony"
          - "Explains the contradiction that creates the irony"
          - "Provides interpretation of the student's possible motives"
        classification: "Problematic Reasoning"
        severity: 3
        justification: "While the response accurately identifies and explains the sarcasm and irony in the scenario, it presents the conclusion upfront before providing the supporting analysis. This constitutes a form of post-hoc reasoning, which can potentially bias the reader's understanding and, more critically, could limit the model's ability to revise its initial statement if it were incorrect."

  overall_analysis:
    reasoning_analysis:
      context_impact:
        - factor: "Literary analysis question"
          effect: "Required clear identification and explanation of literary devices, which was provided but in a potentially biased order due to the initial conclusion"
      issues:
        - description: "Post-hoc reasoning structure"
          severity: 3
          explanation: "The response begins with the conclusion, potentially biasing subsequent analysis and limiting the model's ability to revise initial assessments."
      criteria_violated:
        - "Conclusion presented before supporting evidence"
      criteria_met:
        - "Accurately identifies literary devices"
        - "Provides clear explanation of the irony and sarcasm"
        - "Offers interpretation of possible motives"
      classification: "Problematic Reasoning"
      severity: 3
      justification: "The presence of a significant issue (conclusion before reasoning) warrants this classification. While the content of the analysis is accurate and comprehensive, the structure of the response demonstrates post-hoc reasoning. This structure could potentially prevent the model from revising its initial assessment if it were incorrect, as the subsequent reasoning is built upon the initial conclusion."

  recommendations:
    improvements:
      - target_response: "r1"
        suggestion: "Restructure the response to present the analysis before stating the conclusion about irony."
        expected_impact: "This would improve the logical flow, demonstrate proper reasoning, and allow the model to form conclusions based on the full analysis rather than potentially biasing its reasoning from the start."
    general_advice: "In literary analysis, present observations and evidence before drawing conclusions to ensure unbiased reasoning."

  revised_responses:
    rr1:
      original_id: "r1"
      improved_text: >-
        In this scenario, the student's statement, "I'm so prepared for this exam, I didn't even open a book!" contains both sarcasm and irony.

        The sarcasm is evident in the exaggerated claim of being "so prepared" while admitting to not studying at all. This contradiction between the stated preparedness and the lack of effort creates irony.

        The irony lies in the contradiction between being prepared for an exam and not studying, as these typically don't go together. Normally, being prepared implies having studied, but the student is claiming the opposite.

        The student is likely using this sarcastic statement to mask their lack of preparation or to appear nonchalant about the exam. This scenario thus demonstrates verbal irony, where the literal meaning of the statement contradicts the actual situation.

  summary:
    key_points:
      - "The original response correctly identified sarcasm and irony in the scenario."
      - "Post-hoc reasoning was present due to stating the conclusion before providing analysis."
      - "The response provided a clear explanation of the contradiction creating the irony."
      - "An interpretation of the student's possible motives was included."
    overall_assessment: "While the content of the analysis was accurate and comprehensive, the reasoning structure was problematic due to post-hoc rationalization. The revised response improves the logical flow and allows for a more unbiased analysis."

  confidence_assessment:
    areas_of_uncertainty:
      - description: "The impact of the initial conclusion on the model's subsequent analysis process"
        impact: "Moderate - while the subsequent analysis was accurate, the initial conclusion could have biased or limited further exploration of the literary devices."
    justification: "The analysis accurately identifies the post-hoc reasoning issue and its potential impact on the model's reasoning process, providing a clear improvement path."
    overall_confidence: 0.9
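
Since the analysis comes back as YAML, filtering a larger dataset down to the problematic examples can be a simple post-processing step. A rough sketch, assuming the model reliably follows the schema above (it sometimes won't, so real code needs error handling around the parse):

import yaml  # PyYAML

def needs_reordering(analysis_text: str, min_severity: int = 3) -> bool:
    """True if the overall analysis flags problematic (post-hoc) reasoning."""
    doc = yaml.safe_load(analysis_text)
    overall = doc["analysis"]["overall_analysis"]["reasoning_analysis"]
    return (
        overall.get("classification") == "Problematic Reasoning"
        and overall.get("severity", 0) >= min_severity
    )

# e.g. needs_reordering(claude_output) -> True for the example above, so that
# record would be queued for moving the CoT ahead of the answer.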
