NotImplementedError: File format not supported #63

Open
jatinchhabriya opened this issue Aug 22, 2024 · 5 comments
Labels
bug Something isn't working

Comments

@jatinchhabriya

jatinchhabriya commented Aug 22, 2024

Describe the bug

NotImplementedError: File format not supported
I am hitting this error when the expected result is the list of tables; instead, a NotImplementedError is raised.
One of the maintainers of the package, @bosd, suggested using the main branch of the pypdf-table-extraction fork. However, it is not clear how to import that package the way I previously imported camelot-py[cv], as shown in the steps to reproduce below.

Steps to reproduce the bug

%pip install camelot-py[cv]
import camelot.io
from camelot.io import read_pdf

@bosd
Collaborator

bosd commented Aug 22, 2024

We're still updating the documentation, and the rebranding has not been completed yet.
However, the API has not changed, so the package published on PyPI is a drop-in replacement for camelot.
https://pypi.org/project/pypdf-table-extraction/

You can install it with pip install pypdf-table-extraction
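
To check whether the install behaves as a drop-in replacement, one option is to confirm that the old import name still resolves. The snippet below is a minimal sketch: `module_available` is a hypothetical helper written for this thread, and it assumes (based on the README example, not on anything verified here) that the package is still imported as `camelot` after installing pypdf-table-extraction.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module called `name` can be found for import."""
    return importlib.util.find_spec(name) is not None


# Assumption: the fork keeps `camelot` as its import name.
if module_available("camelot"):
    print("camelot is importable; `from camelot.io import read_pdf` should resolve")
else:
    print("camelot is not importable in this environment")
```

If the second message is printed inside a notebook, the kernel may need a restart after the `%pip install` step before the import is picked up.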

@jatinchhabriya
Author

@bosd Can you provide a working example, starting with how to import the package and use read_pdf?

@bosd
Collaborator

bosd commented Aug 22, 2024

The examples from the docs are still accurate.
The only thing that changes, afaik, is the pip install command.

@jatinchhabriya
Author

@bosd So, as I gather, you have rebranded to pypdf-table-extraction. Installing that package with pip did not fix the NotImplementedError. I also don't see a confirmation from Kushal before the issue reported two weeks ago was closed out on the topic branch.
Please also confirm the compatible versions of opencv, ghostscript, and pandas, @MartinThoma.

@bosd
Collaborator

bosd commented Aug 22, 2024

Did you follow the quickstart as a test?
Example (from the readme):

>>> import camelot
>>> tables = camelot.read_pdf('foo.pdf')
>>> tables
<TableList n=1>
>>> tables.export('foo.csv', f='csv', compress=True) # json, excel, html, markdown, sqlite
>>> tables[0]
<Table shape=(7, 7)>
>>> tables[0].parsing_report
{
    'accuracy': 99.02,
    'whitespace': 12.24,
    'order': 1,
    'page': 1
}
>>> tables[0].to_csv('foo.csv') # to_json, to_excel, to_html, to_markdown, to_sqlite
>>> tables[0].df # get a pandas DataFrame!
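
The thread does not establish what triggers the error. One thing worth ruling out, as an assumption rather than a confirmed diagnosis, is that the path passed to read_pdf does not end in `.pdf` or does not point at a real PDF file. The sketch below uses only the standard library; `looks_like_pdf` is a hypothetical helper, not part of camelot or pypdf-table-extraction.

```python
from pathlib import Path


def looks_like_pdf(path: str) -> bool:
    """Cheap pre-check before handing a path to a PDF table extractor.

    Requires a '.pdf' suffix (case-insensitive), an existing regular file,
    and the '%PDF-' marker somewhere in the first 1024 bytes.
    """
    p = Path(path)
    if p.suffix.lower() != ".pdf" or not p.is_file():
        return False
    with p.open("rb") as fh:
        return b"%PDF-" in fh.read(1024)


# 'foo.pdf' is the placeholder file name from the README example above.
print(looks_like_pdf("foo.pdf"))
```

If this returns False for the file in question, the problem is more likely with the input path or file than with the package version.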
