
Large-Bagging-Model a.k.a LBM

[Landing image]

Overview

This project runs debates between agents in a system called Agent Arena and evaluates the agents on their reasoning capabilities. The workflow first selects the best-performing agents automatically, then lets users evaluate that shortlist to identify the single best reasoning agent.

Project Flow

  1. General Debate in Agent Arena:

    • Multiple agents engage in a debate.
    • Depending on the setup, the debate follows structured argumentation or free-form reasoning.
  2. Evaluation:

    • After the debate, each agent's performance is evaluated against a set of predefined Grading Notes: criteria and metrics that quantify the quality of reasoning each agent exhibited.
  3. Best K Agents Selected:

    • Based on the Grading Notes scores, the top K agents are selected.
    • This shortlist then moves on to evaluation by users (a minimal sketch of the debate, scoring, and selection steps follows this list).
  4. User Evaluation:

    • Human evaluators assess the reasoning quality of the shortlisted agents in a hands-on manner.
  5. Outcome:

    • The final outcome is the identification of the Best Reasoning Agent, the agent that stands out on both the objective evaluation (Grading Notes) and user feedback.
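To make the flow above concrete, here is a minimal sketch of a debate round, Grading Notes scoring, and top-K selection. Every name in it (Agent, run_debate, GRADING_NOTES, score_agent, select_top_k) and both example criteria are assumptions made for illustration; they are not taken from this repository's code.

    # Illustrative sketch only: the classes, functions, and grading criteria below are
    # assumptions for this README and do not reflect the repository's actual implementation.
    from dataclasses import dataclass
    from typing import Callable, Dict, List


    @dataclass
    class Agent:
        name: str
        respond: Callable[[str, List[str]], str]  # (topic, transcript so far) -> next argument


    def run_debate(agents: List[Agent], topic: str, rounds: int = 3) -> Dict[str, List[str]]:
        """Each agent contributes one argument per round; returns per-agent contributions."""
        transcript: List[str] = []
        contributions: Dict[str, List[str]] = {a.name: [] for a in agents}
        for _ in range(rounds):
            for agent in agents:
                argument = agent.respond(topic, transcript)
                transcript.append(f"{agent.name}: {argument}")
                contributions[agent.name].append(argument)
        return contributions


    # Grading Notes: named criteria, each mapping an agent's arguments to a score in [0, 1].
    GRADING_NOTES: Dict[str, Callable[[List[str]], float]] = {
        "participates": lambda args: 1.0 if args else 0.0,  # placeholder criterion
        "gives_reasons": lambda args: sum("because" in a for a in args) / max(len(args), 1),
    }


    def score_agent(arguments: List[str]) -> float:
        """Average the Grading Notes criteria into a single reasoning score."""
        return sum(note(arguments) for note in GRADING_NOTES.values()) / len(GRADING_NOTES)


    def select_top_k(contributions: Dict[str, List[str]], k: int) -> List[str]:
        """Rank agents by Grading Notes score and keep the best K for user evaluation."""
        ranked = sorted(contributions, key=lambda name: score_agent(contributions[name]), reverse=True)
        return ranked[:k]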

Outcome

The project aims to identify and highlight the best reasoning agent, which excels in logical argumentation and critical thinking, as determined through a combination of automated and user evaluations.
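One plausible way to read "a combination of automated and user evaluations" is a weighted blend of the two signals. The sketch below is an assumption for illustration, including the score dictionaries and the 0.5 default weight; the repository may combine the signals differently.

    # Hypothetical blend of automated Grading Notes scores and user ratings into a final ranking.
    def final_ranking(auto_scores: dict, user_scores: dict, user_weight: float = 0.5) -> list:
        """Combine both signals (each normalised to [0, 1]) and sort agents best-first."""
        combined = {
            name: (1 - user_weight) * auto_scores[name] + user_weight * user_scores.get(name, 0.0)
            for name in auto_scores
        }
        return sorted(combined.items(), key=lambda item: item[1], reverse=True)


    # Example: the agent with the highest blended score is reported as the best reasoning agent.
    best_agent, best_score = final_ranking(
        {"agent_a": 0.82, "agent_b": 0.74},  # Grading Notes scores (illustrative values)
        {"agent_a": 0.70, "agent_b": 0.90},  # user-evaluation scores (illustrative values)
    )[0]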