Verified that issue georgia-tech-db#1067 is resolved and added documentation for load pdf functionality
Lohith K S authored and Lohith K S committed Nov 7, 2023
1 parent 64219f1 commit c552137
Showing 2 changed files with 17 additions and 0 deletions.
1 change: 1 addition & 0 deletions docs/_toc.yml
@@ -45,6 +45,7 @@ parts:
- file: source/reference/evaql/load_csv
- file: source/reference/evaql/load_image
- file: source/reference/evaql/load_video
- file: source/reference/evaql/load_pdf
- file: source/reference/evaql/select
- file: source/reference/evaql/explain
- file: source/reference/evaql/show_functions
16 changes: 16 additions & 0 deletions docs/source/reference/evaql/load_pdf.rst
@@ -0,0 +1,16 @@
LOAD PDF
==========

.. _load-pdf:

.. code:: mysql

   LOAD PDF 'test_pdf.pdf' INTO MyPDFs;

PDFs can be loaded directly into a table. Each PDF document is segmented into pages and paragraphs.
Each row in the table corresponds to one paragraph extracted from the PDF, and the table has four columns: ``name``, ``page``, ``paragraph``, and ``data``.

| ``name`` identifies the PDF that was loaded.
| ``page`` is the number of the page the paragraph was extracted from.
| ``paragraph`` is the position of the paragraph within that page.
| ``data`` is the text extracted from that paragraph.
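
Once loaded, the table can be queried like any other table. The following is a minimal sketch, not taken from the commit: it assumes the ``MyPDFs`` table created by the ``LOAD PDF`` statement above, and it assumes that page numbering starts at 1, which should be checked against the actual data.

.. code:: mysql

   SELECT name, paragraph, data
   FROM MyPDFs
   WHERE page = 1;

Under those assumptions, the query would return one row per paragraph found on the first page of each loaded PDF.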
