AI_README_Generator

AI_README_Generator is an AI-powered tool that automates GitHub repository analysis to create comprehensive README documentation.

This project combines LangChain, LangGraph, FastHTML, ColBERT, and tree-sitter parsers with semantic analysis to extract meaningful insights from repository contents. It orchestrates a multi-step process that asks users to confirm each result, which both refines the output iteratively and collects a self-evaluation dataset used to improve the system.

Currently, the system only supports Python projects, but there are plans to expand support to other languages.

System Architecture Flow

LangGraph Agentic Flow

Features

1. Cognitive Architecture for Agents with LangGraph

The system is designed as a graph-based cognitive architecture built with LangGraph.
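As a rough illustration of what a graph-based flow means here, the sketch below models nodes as functions over a shared state and uses a router function to pick the next node. It is plain Python, not the LangGraph API, and the node names (`retrieve`, `answer`, `review`), the state keys, and the "approve the second draft" rule are all hypothetical rather than taken from this repository.

```python
from typing import Callable, Dict

State = Dict[str, object]


def retrieve(state: State) -> State:
    # Placeholder retrieval: attach a dummy snippet for the question.
    return {**state, "snippets": [f"snippet for: {state['question']}"]}


def answer(state: State) -> State:
    # Placeholder generation: count attempts so the loop is observable.
    attempts = int(state.get("attempts", 0)) + 1
    return {**state, "attempts": attempts, "answer": f"draft {attempts}"}


def review(state: State) -> State:
    # Stand-in for the human check: approve the second draft.
    return {**state, "approved": int(state["attempts"]) >= 2}


NODES: Dict[str, Callable[[State], State]] = {
    "retrieve": retrieve,
    "answer": answer,
    "review": review,
}


def next_node(current: str, state: State) -> str:
    # Two fixed edges, plus one conditional edge after the review node.
    if current == "retrieve":
        return "answer"
    if current == "answer":
        return "review"
    return "END" if state["approved"] else "answer"


def run(state: State, start: str = "retrieve") -> State:
    node = start
    while node != "END":
        state = NODES[node](state)
        node = next_node(node, state)
    return state


final = run({"question": "What are the core Python packages?"})
```

The conditional edge after `review` is what lets the flow loop back for another draft instead of running straight through.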

2. Code Retrieval

The system uses ColBERT, a neural information retrieval model based on late interaction between query and document token embeddings (not a cross-encoder), to perform context-aware retrieval of documents and code snippets within the repository. This ensures that the most relevant information is identified and used for analysis and README generation.
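To make the retrieval idea concrete, here is a minimal sketch of the late-interaction (MaxSim) scoring that ColBERT is known for: each query token embedding is matched to its most similar document token embedding, and those maxima are summed. The two-dimensional vectors are toy values chosen for illustration; in practice the embeddings come from the model, and none of the function names below belong to ColBERT or RAGatouille.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def normalize(v: Vector) -> List[float]:
    # Scale to unit length so the dot product equals cosine similarity.
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_tokens: Sequence[Vector], doc_tokens: Sequence[Vector]) -> float:
    # For each query token, keep only its best-matching document token,
    # then sum those best matches over the whole query.
    q = [normalize(v) for v in query_tokens]
    d = [normalize(v) for v in doc_tokens]
    return sum(max(dot(qv, dv) for dv in d) for qv in q)


# Toy embeddings (made up for illustration).
query = [[1.0, 0.0], [0.0, 1.0]]
docs = {
    "doc_a": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    "doc_b": [[1.0, 0.0], [-1.0, 0.0]],
}

ranked = sorted(docs, key=lambda name: maxsim_score(query, docs[name]), reverse=True)
```

Because documents are encoded independently of the query, their token embeddings can be indexed ahead of time, which is the main practical difference from a cross-encoder.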

3. Smart Chunking

Using tree-sitter parsers, the system chunks code into semantically meaningful parts. It also implements “smart chunking”, which adds relevant context, such as the enclosing class definition or function arguments, to chunks that would otherwise not include it.
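The sketch below illustrates the idea of context-carrying chunks. It uses Python's standard `ast` module as a stand-in for tree-sitter so that it runs without extra dependencies, and the function name `smart_chunks` and the `# ...` marker are hypothetical. Each method becomes its own chunk, prefixed with the header line of its enclosing class. It is deliberately simplified: decorators and multi-line class headers are not handled.

```python
import ast
from typing import List

FUNC_TYPES = (ast.FunctionDef, ast.AsyncFunctionDef)


def smart_chunks(source: str) -> List[str]:
    tree = ast.parse(source)
    lines = source.splitlines()

    def segment(node: ast.AST) -> str:
        # Full source lines of the node, indentation preserved.
        return "\n".join(lines[node.lineno - 1 : node.end_lineno])

    chunks: List[str] = []
    for node in tree.body:
        if isinstance(node, FUNC_TYPES):
            # Top-level functions already carry their own signature.
            chunks.append(segment(node))
        elif isinstance(node, ast.ClassDef):
            header = lines[node.lineno - 1].strip()
            for item in node.body:
                if isinstance(item, FUNC_TYPES):
                    # Prepend the class header so the method chunk
                    # still shows where it belongs.
                    chunks.append(f"{header}\n    # ...\n{segment(item)}")
    return chunks


SAMPLE = '''\
class Greeter:
    def __init__(self, name):
        self.name = name

    def greet(self):
        return "Hello, " + self.name

def main():
    print(Greeter("world").greet())
'''

chunks = smart_chunks(SAMPLE)
```

Without the prepended header, a retriever that returns only the `greet` chunk would give the model no indication that the method belongs to `Greeter`.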

4. Human-in-the-Loop Feedback System

A key feature of AI_README_Generator is the integration of human feedback. Users can refine the system's output iteratively, improving the relevance and accuracy of the generated documentation.
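One way such feedback could be captured for a self-evaluation dataset is sketched below. The record fields, the `accepted` flag, and the JSON Lines output are assumptions made for illustration, not the schema this project actually uses, and the example values are invented.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class FeedbackRecord:
    question: str
    retrieved_snippets: List[str]
    generated_answer: str
    user_correction: Optional[str] = None  # None means the user confirmed

    @property
    def final_answer(self) -> str:
        # A correction, when present, overrides the generated answer.
        return self.user_correction or self.generated_answer


def to_jsonl(records: List[FeedbackRecord]) -> str:
    rows = []
    for record in records:
        row = asdict(record)
        row["accepted"] = record.user_correction is None
        rows.append(json.dumps(row))
    return "\n".join(rows)


# Invented example values.
records = [
    FeedbackRecord(
        question="What are the core Python packages?",
        retrieved_snippets=["import langgraph"],
        generated_answer="langgraph",
    ),
    FeedbackRecord(
        question="What is the entry point?",
        retrieved_snippets=["python main.py"],
        generated_answer="run_graph_locally.py",
        user_correction="main.py",
    ),
]

dataset = to_jsonl(records)
```

Keeping the generated answer alongside the correction means each corrected record doubles as a labeled failure case.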

5. Web Interface for User Interaction

A user-friendly web interface built with FastHTML allows users to input GitHub repository URLs for analysis. The system then analyzes the repository and presents the results, potentially generating a README file or other types of documentation. The feedback system enables an iterative process where users can provide feedback on the analysis, guiding the system to refine and improve its results over time.

How It Works

User Inputs GitHub Repository URL: The user provides a GitHub repository URL via the web interface.
Clone the repository and generate metadata: The system clones the repository and creates project metadata, including the directory tree and a list of packages used in the project.
Smart chunking and indexing with ColBERT: The system chunks the code and indexes the chunks for retrieval with ColBERT.
Feedback and Refinement: The system analyzes the repository by following a predefined sequence of instructions. For instance, the first instruction might be "What are the core Python packages?" The system then retrieves relevant code snippets and generates an answer to the question. Using the web application, the user can confirm or correct this analysis.
README Generation: Based on the analyzed content, the system may generate README content that explains the repository's purpose, usage, and key components.
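The metadata step above (a directory tree plus a list of packages used) could look roughly like the following sketch for a Python project. It relies only on the standard library; the function names are hypothetical, and the choices to skip hidden entries in the tree and to ignore relative imports are assumptions. A real implementation would also need to exclude virtual environments and tolerate files that fail to parse.

```python
import ast
from pathlib import Path
from typing import List, Set


def directory_tree(root: Path, prefix: str = "") -> List[str]:
    # Directories first, then files, each group in name order.
    lines: List[str] = []
    for entry in sorted(root.iterdir(), key=lambda p: (p.is_file(), p.name)):
        if entry.name.startswith("."):
            continue  # skip hidden entries such as .git
        lines.append(f"{prefix}{entry.name}{'/' if entry.is_dir() else ''}")
        if entry.is_dir():
            lines.extend(directory_tree(entry, prefix + "  "))
    return lines


def imported_packages(root: Path) -> Set[str]:
    # Collect the top-level name of every absolute import in the project.
    packages: Set[str] = set()
    for path in root.rglob("*.py"):
        tree = ast.parse(path.read_text(encoding="utf-8"))
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                packages.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                packages.add(node.module.split(".")[0])
    return packages
```

Calling `directory_tree(Path("path/to/clone"))` and `imported_packages(Path("path/to/clone"))` on a cloned repository yields the two pieces of metadata; the result of the second still mixes standard-library and third-party names, so it would need filtering before being shown as project dependencies.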

Technologies Used

LangChain/LangGraph: Orchestrates the agentic flow for code retrieval, analysis, and documentation generation.
FastHTML: A framework for building the web interface in Python.
ColBERT (RAGatouille): A neural information retrieval model that enhances context-aware code retrieval.
Tree-sitter: Provides language-specific parsing capabilities for multi-language support.

How to run

# Initialize virtual environment
python -m venv venv
source venv/bin/activate

# Install required packages
pip install -r requirements.txt

# Run the application
python main.py

minki-j/AI_README_Generator
