Description

This repository contains a Python script, pdf_summarization_app.py, that summarizes a PDF document using natural language processing. The script uses the gradio library to build a simple web interface in which users enter the path to a PDF file and a custom prompt for summarization.

Installation

To use the pdf_summarization_app.py script, follow these steps:

Clone this repository to your local machine using the following command:

git clone https://github.com/your_username/pdf_summarization_app.git

Create the conda environment:

conda env create -f environment.yml

Alternatively, install the required libraries with pip:

pip install -r requirements.txt

Usage

To use the pdf_summarization_app.py script, follow these steps:

Open a terminal and navigate to the pdf_summarization_app directory.
Run the script using the following command:

python pdf_summarization_app.py

Open a web browser and navigate to http://localhost:7860/.
Enter the path to the PDF file and a custom prompt for summarization.
Click the "Summarize" button to generate a summary, or the "Custom Summarize" button to generate a summary guided by your custom prompt.
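
For orientation, here is a minimal sketch of how a gradio interface with these two buttons could be wired up. It is not the code in pdf_summarization_app.py: the functions summarize, custom_summarize, and build_demo are hypothetical names, and the two summarizers are placeholders that only echo their inputs instead of reading the PDF or calling a language model.

```python
def summarize(pdf_path: str) -> str:
    """Placeholder (assumption): the real app would load the PDF and call a model."""
    return f"[summary of {pdf_path}]"


def custom_summarize(pdf_path: str, prompt: str) -> str:
    """Placeholder (assumption): the real app would pass the prompt to the model."""
    return f"[summary of {pdf_path} using prompt: {prompt}]"


def build_demo():
    """Build a gradio Blocks UI with two text inputs, one output, and two buttons."""
    # Imported lazily so the placeholder helpers above work without gradio installed.
    import gradio as gr

    with gr.Blocks() as demo:
        pdf_path = gr.Textbox(label="PDF file path")
        prompt = gr.Textbox(label="Custom prompt")
        output = gr.Textbox(label="Summary")
        summarize_btn = gr.Button("Summarize")
        custom_btn = gr.Button("Custom Summarize")

        # Each button sends the relevant inputs to its function and shows the result.
        summarize_btn.click(fn=summarize, inputs=pdf_path, outputs=output)
        custom_btn.click(fn=custom_summarize, inputs=[pdf_path, prompt], outputs=output)
    return demo
```

Calling build_demo().launch(server_port=7860) would serve this sketch at http://localhost:7860/, the address used in the steps above.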

Notebooks

This repository also contains two Jupyter notebooks entitled summarization_with_langchain.ipynb and talk_to_pdf_with_langchain.ipynb. These notebooks provide additional examples and functionality for summarizing PDF documents using natural language processing.

Custom Summarization

I added a specialized app to construct custom summaries.

Interactive Text Chunk Visualization

I added a text chunk visualization to the summarization app. It shows how the document text is split into chunks, which helps when debugging the chunk size.
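
To illustrate what changing the chunk size does, here is a minimal sketch of fixed-size character chunking with overlap, plus a plain-text rendering of the result. It is an assumption for illustration only: the app may split text differently (for example, with a langchain text splitter), and split_into_chunks and render_chunks are hypothetical helper names.

```python
def split_into_chunks(text: str, chunk_size: int, overlap: int = 0) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def render_chunks(chunks: list[str]) -> str:
    """Return one line per chunk with its index and length, for quick inspection."""
    return "\n".join(
        f"[{i}] ({len(chunk)} chars) {chunk!r}" for i, chunk in enumerate(chunks)
    )


chunks = split_into_chunks("abcdefghij", chunk_size=4, overlap=1)
print(render_chunks(chunks))
```

Trying a few values of chunk_size and overlap on the same text makes it easy to see how many chunks a document produces and where the boundaries fall.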

Running the custom summarization app

Just run:

streamlit run ./custom_summarization_app.py

Credits

This project was built using the gradio, langchain, and streamlit libraries.

TODOs

Add option to produce multiple summaries
Create GitHub repo
Add custom summarization
Add interactive tool to debug chunk size
Add a free option
Add an option to estimate cost when using a paid API such as ChatGPT
Add token count option
Add option to create summaries for multiple papers inside a folder
Integrate map prompt and combine prompt
