forked from georgia-tech-db/evadb
Commit
Verified that issue georgia-tech-db#1067 is resolved and added documentation for load pdf functionality
Lohith K S authored and committed on Nov 7, 2023
1 parent 64219f1, commit c552137
Showing 2 changed files with 17 additions and 0 deletions.
@@ -0,0 +1,16 @@
LOAD PDF
==========

.. _load-pdf:

.. code:: mysql

   LOAD PDF 'test_pdf.pdf' INTO MyPDFs;

PDFs can be imported directly into a table. The PDF document is segmented into pages and paragraphs, and each row in the table corresponds to one paragraph extracted from the PDF. The resulting table includes the columns ``name``, ``page``, ``paragraph``, and ``data``.

| ``name`` is the title of the uploaded PDF.
| ``page`` is the page number from which the data is retrieved.
| ``paragraph`` identifies the individual paragraph within that page from which the data is extracted.
| ``data`` is the text extracted from that paragraph on the given page.
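
The row layout described above can be pictured with a small, self-contained Python sketch. This is only a conceptual illustration and not EvaDB's actual loader: the helper ``pdf_pages_to_rows`` is hypothetical, the page text is supplied as plain strings instead of being read from a real PDF, and both the blank-line paragraph separator and the 1-based numbering are assumptions made for the example.

```python
from typing import Dict, List


def pdf_pages_to_rows(name: str, pages: List[str]) -> List[Dict[str, object]]:
    """Build one row per paragraph with name, page, paragraph and data fields.

    Hypothetical helper for illustration only; it does not reflect how
    EvaDB segments PDFs internally.
    """
    rows: List[Dict[str, object]] = []
    for page_no, text in enumerate(pages, start=1):  # assumption: pages count from 1
        # Assumption: paragraphs are separated by blank lines.
        paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
        for para_no, para in enumerate(paragraphs, start=1):
            rows.append(
                {"name": name, "page": page_no, "paragraph": para_no, "data": para}
            )
    return rows


# Two made-up pages standing in for the text of 'test_pdf.pdf'.
pages = [
    "Intro paragraph.\n\nSecond paragraph.",
    "Only paragraph on page two.",
]
rows = pdf_pages_to_rows("test_pdf.pdf", pages)
for row in rows:
    print(row)
```

Each printed dictionary corresponds to one table row, so a two-page document with three paragraphs in total yields three rows.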