Nougat: Revolutionizing OCR for Scientific Documents


About Nougat

Nougat (Neural Optical Understanding for Academic Documents) is a Transformer-based OCR model that converts scientific documents, typically stored as PDFs, into machine-readable Markdown. Developed by researchers at Meta AI, it is trained end to end, so the content locked inside PDF pages becomes easier to access and reuse.

Key Features

  • Transformer Architecture: Nougat uses a Swin Transformer as a vision encoder and an mBART-based text decoder, allowing for end-to-end transcription of scientific PDFs.

  • End-to-End Training: Nougat needs no multi-stage OCR pipeline. The model takes raw page pixels as input and generates Markdown text as output.

  • Bridging the Gap: By turning pages meant for human readers into machine-readable text, Nougat makes scientific content easier to search, process, and reuse.

    git clone https://github.com/inuwamobarak/nougat.git
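The pixels-in, Markdown-out flow described under Key Features can be sketched in Python. This is a minimal sketch, not necessarily how this repository runs the model: it assumes the Hugging Face `transformers` library (with PyTorch and Pillow installed) and the publicly released `facebook/nougat-base` checkpoint. The function name `transcribe_page` and its parameters are hypothetical, chosen for illustration.

```python
# Minimal sketch: transcribe one rasterized page image with Nougat.
# Assumptions (not stated in this README): Hugging Face `transformers`,
# PyTorch, Pillow, and the `facebook/nougat-base` checkpoint.

DEFAULT_CHECKPOINT = "facebook/nougat-base"


def transcribe_page(image_path, checkpoint=DEFAULT_CHECKPOINT, max_new_tokens=512):
    """Return the Markdown transcription of a single page image.

    `transcribe_page` is a hypothetical helper name used for illustration.
    """
    # Heavy dependencies are imported lazily so that importing this module
    # does not require them to be installed.
    from PIL import Image
    from transformers import NougatProcessor, VisionEncoderDecoderModel

    processor = NougatProcessor.from_pretrained(checkpoint)
    model = VisionEncoderDecoderModel.from_pretrained(checkpoint)

    # Vision encoder input: raw pixels of the page, resized and normalized.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(image, return_tensors="pt").pixel_values

    # Text decoder output: token ids generated autoregressively.
    generated_ids = model.generate(
        pixel_values,
        max_new_tokens=max_new_tokens,
        bad_words_ids=[[processor.tokenizer.unk_token_id]],
    )

    # Decode the ids to text and clean up the generated markup.
    text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
    return processor.post_process_generation(text, fix_markdown=False)


# Example call (downloads the checkpoint on first use):
# markdown = transcribe_page("page_1.png")
```

The sketch takes a page image rather than a PDF, so each PDF page would first need to be rendered to an image. The first call downloads the checkpoint, and a larger `max_new_tokens` may be needed for dense pages.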