ExtractTable - API to extract tabular data from images and scanned PDFs

The motivation is to make it easy for developers to extract tabular data from images or scanned PDF files without worrying about the table area, column coordinates, rotation, etc.


Before we talk/boast about the service: a developer needs an API key to use the ExtractTable service. Free credits are available; see the FAQ for data privacy details.
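Since every call needs an API key, one option is to keep the key out of source code and read it from the environment. The sketch below is only an illustration: the helper `load_api_key` and the variable name `EXTRACTTABLE_API_KEY` are assumptions made for this example, not something the ExtractTable package defines.

```python
import os


def load_api_key(env_var: str = "EXTRACTTABLE_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    Both this helper and the default variable name are hypothetical;
    they are not part of the ExtractTable package.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail early with a readable message instead of sending an empty key.
        raise RuntimeError(
            f"Set the {env_var} environment variable to your ExtractTable API key"
        )
    return key
```

The returned string can then be passed as `api_key` when creating the session.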


pip install -U ExtractTable

Basic Usage

from ExtractTable import ExtractTable
et_sess = ExtractTable(api_key=YOUR_API_KEY)        # Replace YOUR_API_KEY with your valid API key
print(et_sess.check_usage())        # Checks the API key's validity and shows the associated plan usage
table_data = et_sess.process_file(filepath=Location_of_Image_with_Tables, output_format="df")

# To process a PDF, pass the pages param ("1", "1,3-4", "all") to the process_file function
table_data = et_sess.process_file(filepath=Location_of_PDF_with_Tables, output_format="df", pages="all")
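The `pages` values shown above ("1", "1,3-4", "all") read as a single page, a comma-separated mix of pages and ranges, or every page. As a rough illustration of that notation, the following sketch expands such a string into page numbers. The helper `expand_pages` and its `total_pages` argument are hypothetical and written only for this example; how the library itself interprets `pages` may differ.

```python
from typing import List


def expand_pages(spec: str, total_pages: int) -> List[int]:
    """Expand a spec such as "1", "1,3-4" or "all" into sorted 1-based page numbers.

    Hypothetical helper for illustration; not part of the ExtractTable package.
    """
    spec = spec.strip().lower()
    if spec == "all":
        return list(range(1, total_pages + 1))

    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            # A range such as "3-4" covers both endpoints.
            start, end = (int(p) for p in part.split("-", 1))
        else:
            start = end = int(part)
        if start < 1 or end > total_pages or start > end:
            raise ValueError(f"invalid page range: {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)
```

Validating a spec this way before uploading a large PDF can catch typos such as a reversed range without spending API credits.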

Detailed Python Notebook Here

Pull Requests & Rewards

Pull requests are most welcome, and accepted contributions are rewarded with API credits.


This project is licensed under the Apache License 2.0, see the LICENSE file for details.

