Anyparser LangChain: Seamless Integration of Anyparser with LangChain

This package integrates Anyparser's content extraction with LangChain. It lets you use Anyparser's document processing and data extraction features inside your LangChain applications when building AI pipelines.

Installation

pip install anyparser-langchain

Setup

Before running the examples, make sure to set your Anyparser API credentials as environment variables:

export ANYPARSER_API_KEY="your-api-key"
export ANYPARSER_API_URL="https://anyparserapi.com"
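
A script can check that both variables are present before it makes any API call. The following is a minimal sketch using only the standard library; the helper name `load_anyparser_credentials` is hypothetical and not part of the package.

```python
import os
from typing import Mapping, Optional, Tuple

# The two variable names come from the Setup section above.
REQUIRED_VARS = ("ANYPARSER_API_KEY", "ANYPARSER_API_URL")


def load_anyparser_credentials(
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[str, str]:
    """Return (api_key, api_url); raise if either variable is missing or blank.

    Hypothetical helper for illustration, not part of anyparser-langchain.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["ANYPARSER_API_KEY"], env["ANYPARSER_API_URL"]
```

Calling `load_anyparser_credentials()` with no argument reads from the process environment; passing a dictionary makes the check easy to test.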

Anyparser LangChain Examples

The examples directory contains scripts that demonstrate different ways to use the Anyparser LangChain integration. Run them from the repository root:

python examples/01_single_file_json.py
python examples/02_single_file_markdown.py
python examples/03_multiple_files_json.py
python examples/04_multiple_files_markdown.py
python examples/05_load_folder.py
python examples/06_ocr_markdown.py
python examples/07_ocr_json.py
python examples/08_crawler.py

Examples

1. Single File Processing

01_single_file_json.py: Process a single file with JSON output
02_single_file_markdown.py: Process a single file with markdown output

2. Multiple File Processing

03_multiple_files_json.py: Process multiple files with JSON output
04_multiple_files_markdown.py: Process multiple files with markdown output
05_load_folder.py: Load and process all files from a folder (max 5 files)
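
The folder example is described as processing at most 5 files. One way to select such a batch is sketched below with `pathlib`; the helper `pick_files` and the choice to sort by name are assumptions made for illustration, not taken from `05_load_folder.py`.

```python
from pathlib import Path
from typing import List

MAX_FILES = 5  # limit stated in the example description above


def pick_files(folder: str, limit: int = MAX_FILES) -> List[Path]:
    """Return up to `limit` regular files from `folder`, sorted by name.

    Hypothetical helper: subdirectories are skipped, and sorting by name
    is an assumption that keeps the selection deterministic.
    """
    files = sorted(p for p in Path(folder).iterdir() if p.is_file())
    return files[:limit]
```

The resulting paths could then be passed to whatever loader the example uses.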

3. OCR Processing

06_ocr_markdown.py: Process images/scans with OCR (markdown output)
07_ocr_json.py: Process images/scans with OCR (JSON output)

4. Web Crawling

08_crawler.py: Basic web crawling with essential settings

Features Demonstrated

Document Processing

Different output formats (markdown, JSON)
Multiple file handling
Folder processing
Metadata handling

Web Crawling

Basic crawling with depth and scope control
Advanced URL and content filtering
Crawling strategies (BFS, LIFO)
Rate limiting and robots.txt respect
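
To illustrate how the BFS and LIFO strategies differ, the sketch below walks a small in-memory link graph with a depth limit. It is not Anyparser's crawler: the graph `LINKS`, the function `crawl_order`, and its parameters are hypothetical, and the only difference between the two strategies here is which end of the frontier is popped.

```python
from collections import deque
from typing import Dict, List

# Hypothetical link graph standing in for real pages.
LINKS: Dict[str, List[str]] = {
    "/": ["/a", "/b"],
    "/a": ["/a1", "/a2"],
    "/b": ["/b1"],
}


def crawl_order(start: str, strategy: str = "BFS", max_depth: int = 2) -> List[str]:
    """Return the visit order for a frontier popped FIFO (BFS) or LIFO."""
    if strategy not in ("BFS", "LIFO"):
        raise ValueError(f"unknown strategy: {strategy}")
    frontier = deque([(start, 0)])
    seen = {start}
    order: List[str] = []
    while frontier:
        # BFS takes the oldest entry; LIFO takes the most recently added one.
        url, depth = frontier.popleft() if strategy == "BFS" else frontier.pop()
        order.append(url)
        if depth >= max_depth:
            continue  # depth limit reached: do not follow links from this page
        for link in LINKS.get(url, []):
            if link not in seen:
                seen.add(link)
                frontier.append((link, depth + 1))
    return order


print(crawl_order("/", "BFS"))   # ['/', '/a', '/b', '/a1', '/a2', '/b1']
print(crawl_order("/", "LIFO"))  # ['/', '/b', '/b1', '/a', '/a2', '/a1']
```

BFS visits every page at one depth before going deeper, while LIFO follows the most recently discovered link first, which behaves like a depth-first walk.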

Notes

All examples use async/await for better performance
Error handling is included in all examples
Each example includes detailed comments explaining the options used
OCR examples support multiple languages
Crawler examples demonstrate various filtering and control options
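
The async and error-handling pattern mentioned in these notes can be sketched with the standard library alone. Here `fake_parse` is a stand-in coroutine, not the integration's API; it only shows how several files might be processed concurrently while one failure does not abort the rest.

```python
import asyncio
from typing import Dict, List, Union


async def fake_parse(path: str) -> str:
    """Stand-in for a real parsing call; fails on paths ending in '.bad'."""
    await asyncio.sleep(0)  # yield control, as a network call would
    if path.endswith(".bad"):
        raise ValueError(f"cannot parse {path}")
    return f"parsed:{path}"


async def parse_all(paths: List[str]) -> Dict[str, Union[str, BaseException]]:
    """Parse all paths concurrently, mapping each path to its result or error."""
    results = await asyncio.gather(
        *(fake_parse(p) for p in paths),
        return_exceptions=True,  # collect exceptions instead of raising the first
    )
    return dict(zip(paths, results))


if __name__ == "__main__":
    for path, result in asyncio.run(parse_all(["a.pdf", "b.bad"])).items():
        status = "error" if isinstance(result, BaseException) else "ok"
        print(path, status, result)
```

With `return_exceptions=True`, `asyncio.gather` returns results in input order, so each path can be paired with either its output or the exception it raised.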

Additional Features Demonstrated

OCR capabilities with language support
OCR performance presets
Image extraction
Table extraction
Error handling
Async/await usage

License

Apache-2.0
