Document++

Overview

Document++ is a Django-based web application that lets users upload .docx or .pdf documents and process them. It currently supports two functions:

Summarization: Summarize long documents into concise versions.
Spelling Correction: Correct spelling errors in the uploaded documents.

This application uses various libraries to handle document processing, summarization, and spelling correction.

Features

Upload Support: Supports .docx and .pdf document uploads.
Summarization: Uses the Latent Semantic Analysis (LSA) algorithm to provide concise summaries.
Spelling Correction: Automatically detects and corrects spelling mistakes in documents.
Download: Lets you download the document with the spelling corrections applied.

Dependencies

Core Dependencies

Django: The web framework that powers the application.
- To install Django, run:
```
pip install django
```

Document Handling

PyMuPDF (imported as fitz): Used for reading and processing PDF files.
- Install it using:
```
pip install pymupdf
```
python-docx: For reading and writing .docx files.
- Install it using:
```
pip install python-docx
```
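
As an illustration of how these two libraries can be used to read an upload, here is a minimal sketch, not the project's actual code. The helper name `extract_text` is hypothetical, and the sketch assumes that only body text is needed (tables, headers, and images are ignored).

```python
from pathlib import Path


def extract_text(path):
    """Return the plain text of a .pdf or .docx file (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    if suffix == ".pdf":
        import fitz  # PyMuPDF; imported lazily so this module loads without it

        with fitz.open(str(path)) as pdf:
            # get_text() returns the text content of a single page
            return "\n".join(page.get_text() for page in pdf)
    if suffix == ".docx":
        import docx  # python-docx

        document = docx.Document(str(path))
        # Only body paragraphs are read; tables and headers are skipped here
        return "\n".join(p.text for p in document.paragraphs)
    raise ValueError(f"Unsupported file type: {suffix or 'none'}")
```

Dispatching on the file extension keeps the rest of the pipeline format-agnostic: summarization and spelling correction can both work on the returned string.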

Summarization

Sumy: Provides several summarization algorithms, including Latent Semantic Analysis (LSA).
- Install Sumy:
```
pip install sumy
```
NLTK (Punkt tokenizer): Used for sentence tokenization during summarization.
- Install NLTK:
```
pip install nltk
```
- Then download the Punkt tokenizer models from a Python shell:
```
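import nltk
# One-time download of the Punkt sentence tokenizer models.
# Newer NLTK releases may ask for 'punkt_tab' instead; download that if prompted.
nltk.download('punkt')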
```
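
To show how Sumy's LSA summarizer is typically called, here is a minimal sketch under stated assumptions. The function names `summarize_lsa` and `pick_sentence_count`, and the 20% length ratio, are illustrative choices rather than the project's settings. Note that Sumy's LSA summarizer also relies on NumPy, so it may need to be installed separately.

```python
def pick_sentence_count(total_sentences, ratio=0.2, minimum=1):
    """Choose how many sentences to keep (hypothetical heuristic)."""
    return max(minimum, min(total_sentences, round(total_sentences * ratio)))


def summarize_lsa(text, ratio=0.2, language="english"):
    """Return an LSA summary of `text` as a single string."""
    # Imported lazily so this module loads even before Sumy is installed
    from sumy.nlp.tokenizers import Tokenizer
    from sumy.parsers.plaintext import PlaintextParser
    from sumy.summarizers.lsa import LsaSummarizer

    # Tokenizer uses the NLTK Punkt models to split the text into sentences
    parser = PlaintextParser.from_string(text, Tokenizer(language))
    count = pick_sentence_count(len(parser.document.sentences), ratio)
    summarizer = LsaSummarizer()
    # Calling the summarizer returns the selected Sentence objects
    selected = summarizer(parser.document, count)
    return " ".join(str(sentence) for sentence in selected)
```

Scaling the sentence count with the document length, instead of using a fixed number, keeps summaries of short and long uploads roughly proportional.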

Spelling Correction

TextBlob (or another spell-checking library, if the project uses one): Handles spelling correction. If TextBlob is used, install it with:
```
pip install textblob
```
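
If TextBlob is the library in use, correction could look roughly like the sketch below. This is an assumption, not the project's confirmed approach; `textblob_correct` and `correct_paragraphs` are hypothetical names, and the corrector is passed in as a parameter so that a different spell-checker can be swapped in.

```python
def textblob_correct(text):
    """Correct spelling with TextBlob (assumes the textblob package is installed)."""
    from textblob import TextBlob  # lazy import: optional dependency

    # correct() returns a new TextBlob with its best-guess spelling fixes
    return str(TextBlob(text).correct())


def correct_paragraphs(paragraphs, corrector=textblob_correct):
    """Apply `corrector` to each non-empty paragraph, keeping blank ones as-is."""
    return [corrector(p) if p.strip() else p for p in paragraphs]
```

Working paragraph by paragraph makes it easier to write the corrected text back into the same positions of a .docx file.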

Other Dependencies

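tempfile: Used for creating temporary files during processing (Python standard library; no installation needed).
io: Provides I/O operations for handling file streams (Python standard library; no installation needed).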
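
A rough sketch of how these two modules can support the upload and download steps follows. The helper names are hypothetical, and the sketch assumes the uploaded file is available as bytes.

```python
import io
import os
import tempfile


def save_to_temp(data, suffix):
    """Write bytes to a named temporary file and return its path."""
    # delete=False keeps the file after closing so other libraries can open it by path
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        tmp.write(data)
        return tmp.name


def as_download_stream(data):
    """Copy bytes into an in-memory stream that is ready to be read."""
    stream = io.BytesIO()
    stream.write(data)
    stream.seek(0)  # rewind so the reader starts at the first byte
    return stream


def remove_temp(path):
    """Delete a temporary file, ignoring the case where it is already gone."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass
```

Because `delete=False` is used, the caller is responsible for calling `remove_temp` once processing is finished.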

Additional Packages

time: Used for timing operations (Python standard library; no installation needed).
NLTK: Provides the Punkt tokenizer required for summarization (see Summarization above).

Setting Up the Project

Clone the Repository:

git clone <your-repository-url>
cd document-improver

Install Required Dependencies: Install the required packages with pip (add textblob to the command if it is used for spelling correction):
```
pip install django pymupdf python-docx sumy nltk
```
Run the Development Server: Start the Django development server:
```
python manage.py runserver
```
Access the Application: Open your browser and navigate to:
```
http://127.0.0.1:8000
```

Usage

Upload a .docx or .pdf document.
Select whether you want to summarize or correct spelling in the document.
Click the respective button, and the app will process the document.
Download the summarized or corrected document.

Screenshots

Future Expansions

I plan to add the following features:

Markup Support: Mark up documents with text, shapes, or drawings.
Document Bot: An AI bot that answers questions about the uploaded document.
PDF Merge: Merge two or more PDFs.

Repository: syed-ateeb-naveed/Document-Plus

Folders and files

Latest commit

History

Repository files navigation

Document++

Overview

Features

Dependencies

Core Dependencies

Document Handling

Summarization

Spelling Correction

Other Dependencies

Additional Packages

Setting Up the Project

Usage

Screenshots

Future Expansions

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages