Recursive Summarizer with OpenAI LLM

This project implements a recursive summarizer built on OpenAI's large language models (LLMs), using Bun and TypeScript. The summarizer handles large texts by recursively dividing them into segments, processing the segments in batches, and summarizing the combined results.

Features

Initialize Input Text: Starts the summarization process by accepting raw text input.
Token Sanitization and Aggregation: Cleans and combines tokens for optimal processing.
Token Allocation and Distribution: Manages token counts and allocates them across multiple API calls.
Parallel API Completions: Executes multiple OpenAI API completions concurrently to maximize efficiency.
Token Segmentation: Divides tokens into segments suitable for batch processing.
Batch Completion Webhook Integration: Integrates with a webhook to manage batch processing asynchronously.
Batch Creation Handler: Facilitates the creation of multiple batches for parallel processing.
Batch Completion Awaiter: Waits for all batch completions before aggregating results.
Finalize Summary: Aggregates results to produce a comprehensive summary of the input text.
Async Batch API or Real-Time Completions: Offers flexibility to choose between using the async OpenAI batch API for handling large datasets efficiently, or real-time completions for quicker, on-the-fly summarizations.
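
The features above can be sketched as a single loop: segment the text, summarize the segments concurrently, join the partial summaries, and repeat until the result fits in one request. This is a minimal sketch under stated assumptions, not the project's actual code. The names countTokens, splitIntoChunks, recursiveSummarize, and the Summarize callback are hypothetical, tokens are approximated by whitespace-separated words, and the callback stands in for an OpenAI completion call.

```typescript
// Hypothetical stand-in for one completion call (for example, to the OpenAI API).
type Summarize = (text: string) => Promise<string>;

// Rough token estimate: whitespace-separated words. A real tokenizer would differ.
function countTokens(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Token segmentation: split text into segments of at most `maxTokens` words.
function splitIntoChunks(text: string, maxTokens: number): string[] {
  if (maxTokens < 1) throw new Error("maxTokens must be at least 1");
  const words = text.trim().split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  for (let i = 0; i < words.length; i += maxTokens) {
    chunks.push(words.slice(i, i + maxTokens).join(" "));
  }
  return chunks;
}

// Summarize segments concurrently, join the partial summaries,
// and recurse until the text fits into a single request.
async function recursiveSummarize(
  text: string,
  summarize: Summarize,
  maxTokens: number,
  maxDepth = 10, // guard against summaries that never shrink
): Promise<string> {
  if (countTokens(text) <= maxTokens) return summarize(text);
  if (maxDepth <= 0) throw new Error("Summaries are not shrinking; giving up");
  const chunks = splitIntoChunks(text, maxTokens);
  const partials = await Promise.all(chunks.map((chunk) => summarize(chunk)));
  return recursiveSummarize(partials.join(" "), summarize, maxTokens, maxDepth - 1);
}
```

The summarize callback is injected so the same loop could sit on top of either real-time completions or a batch-backed implementation, and so it can be exercised with a fake summarizer in tests.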

Installation

To use the recursive summarizer, clone the repository and install the necessary dependencies:

git clone https://github.com/simonorzel26/openai-recursive-summarization.git
cd openai-recursive-summarization
bun install

Usage

To start the summarization process:

bun run src/summarizer.ts --input your_text_file.txt --output summary.txt

This command will process the input text file, summarize it using the recursive summarizer, and save the output in the specified file.
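
For readers wondering how the two flags might be read, here is a small sketch of parsing --input and --output from an argument list. It is an illustration only: parseCliArgs and CliOptions are hypothetical names, and treating both flags as required is an assumption, not documented behavior of src/summarizer.ts.

```typescript
// Hypothetical shape of the parsed command-line options.
interface CliOptions {
  input: string;
  output: string;
}

// Parse `--input <path> --output <path>` from an argument list.
// Assumption: both flags are required; the real CLI may behave differently.
function parseCliArgs(argv: string[]): CliOptions {
  const values = new Map<string, string>();
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === "--input" || arg === "--output") {
      const value = argv[i + 1];
      if (value === undefined || value.startsWith("--")) {
        throw new Error(`Missing value for ${arg}`);
      }
      values.set(arg, value);
      i++; // skip the value that was just consumed
    }
  }
  const input = values.get("--input");
  const output = values.get("--output");
  if (input === undefined) throw new Error("--input is required");
  if (output === undefined) throw new Error("--output is required");
  return { input, output };
}
```

In a Bun or Node script, the argument list would typically come from process.argv.slice(2).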

Configuration Options

Batch vs Real-Time: In the config.json file, you can specify whether to use async batch processing or real-time completions. Set useAsyncBatch to true for batch processing, or false for real-time processing.
Token Limits and Batch Sizes: Customize token limits, batch sizes, and other parameters in the config.json file according to your specific needs.
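
As an illustration of how these options might be loaded and validated, the sketch below parses the contents of a config.json file and fills in defaults. Only useAsyncBatch is named in this README; the keys maxTokensPerRequest and batchSize, the default values, and the function names are hypothetical placeholders for whatever the project actually uses.

```typescript
// Only `useAsyncBatch` appears in the README; the other keys are hypothetical.
interface SummarizerConfig {
  useAsyncBatch: boolean;
  maxTokensPerRequest: number; // hypothetical key
  batchSize: number; // hypothetical key
}

// Illustrative defaults, not the project's real values.
const defaultConfig: SummarizerConfig = {
  useAsyncBatch: false,
  maxTokensPerRequest: 4000,
  batchSize: 20,
};

function isPositiveInteger(value: unknown): value is number {
  return typeof value === "number" && Number.isInteger(value) && value >= 1;
}

// Parse the text of config.json, apply defaults, and validate each field.
function parseConfig(json: string): SummarizerConfig {
  const raw: unknown = JSON.parse(json);
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
    throw new Error("config must be a JSON object");
  }
  const merged: Record<string, unknown> = { ...defaultConfig, ...raw };
  const { useAsyncBatch, maxTokensPerRequest, batchSize } = merged;
  if (typeof useAsyncBatch !== "boolean") {
    throw new Error("useAsyncBatch must be true or false");
  }
  if (!isPositiveInteger(maxTokensPerRequest)) {
    throw new Error("maxTokensPerRequest must be a positive integer");
  }
  if (!isPositiveInteger(batchSize)) {
    throw new Error("batchSize must be a positive integer");
  }
  return { useAsyncBatch, maxTokensPerRequest, batchSize };
}

// Pick the processing mode from the config flag.
function selectMode(config: SummarizerConfig): "batch" | "real-time" {
  return config.useAsyncBatch ? "batch" : "real-time";
}
```

Validating at load time means a mistyped value fails immediately with a clear message instead of surfacing later as a rejected API request.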

Development

If you're contributing or modifying the project:

Install Dependencies: Ensure all dependencies are installed with bun install.
Run Tests: Use bun test to run any existing tests.
Linting: Make sure your code adheres to the project's style guide by running bun lint.

Contributing

Contributions are welcome! Please fork this repository and submit a pull request with your changes.

License

This project is licensed under the MIT License. See the LICENSE file for details.
