Receipt OCR Using Python [Guide 2025]

FLAT FIELDS	VALUE
Total	28.23
Establishment	Walmart
Date	2021-05-27 21:04:30
Address	Walmart Supercenter, 1001 Warrior Way, Quincy, WV 25015
Payment Method	Debit

Last Updated on June 26, 2025

Tabscanner simplifies extracting data from receipts and invoices with its cutting-edge OCR technology. This blog demonstrates how to integrate Tabscanner’s API into your Python backend to process receipts and retrieve structured data in JSON format.

Tabscanner processes receipts asynchronously: you upload an image, receive a token, and then poll for the result, which is typically available in about 5 seconds. Let’s explore how to implement this.

Why Use Tabscanner?

Tabscanner supports the following features:

Uploads: Process images in JPG or PNG format, including smartphone photos or screenshots.
Language Support: Handle multiple languages and character sets.
Data Extraction: Retrieve totals, tax breakdowns, line items, merchant details, and more.

Prerequisites

Tabscanner API Key: Obtain this from your Tabscanner account.
Python Environment: Install Python and the requests library (pip install requests).
Backend Integration: Tabscanner is designed for server-side use, not direct integration with mobile apps.
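
Because the API key lives on the server, it can help to keep it out of the source code. The following is a minimal sketch that reads the key from an environment variable; the variable name `TABSCANNER_API_KEY` and the helper `load_api_key` are assumptions made for illustration and are not defined by Tabscanner.

```python
import os

def load_api_key(env_var="TABSCANNER_API_KEY"):
    """
    Read the API key from an environment variable.
    The variable name is a hypothetical choice, not a Tabscanner convention.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your Tabscanner API key")
    return key
```

With this in place, `API_KEY = load_api_key()` could stand in for a hard-coded key string.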

Find out more about Tabscanner OCR

Step 1: Upload a Receipt for Processing

The first step is submitting a receipt image to the /process endpoint. This returns a token that you’ll use to poll for results.

Code Example


    import requests

    # API configuration
    API_KEY = "your_api_key_here"
    PROCESS_ENDPOINT = "https://api.tabscanner.com/api/2/process"

    def upload_receipt(file_path):
        """
        Upload a receipt image to Tabscanner for processing.
        Returns a token to poll for results, or None on failure.
        """
        # Send the image as multipart form data, with the API key in the header.
        with open(file_path, 'rb') as file:
            response = requests.post(
                PROCESS_ENDPOINT,
                headers={"apikey": API_KEY},
                files={"file": file}
            )

        if response.status_code == 200:
            token = response.json().get("token")
            print(f"Token: {token}")
            return token
        else:
            print(f"Error uploading receipt: {response.status_code}, {response.text}")
            return None

Step 2: Poll for Results Using the Token

Once you have the token, poll the /result endpoint to check if the receipt processing is complete. Polling every second is recommended after an initial delay of about 5 seconds.

Code Example


    import time

    RESULT_ENDPOINT_BASE = "https://api.tabscanner.com/api/result/"
    INITIAL_DELAY_SECONDS = 5  # results are typically ready after about 5 seconds
    POLL_INTERVAL_SECONDS = 1

    def poll_for_result(token):
        """
        Poll Tabscanner's result endpoint using the token until processing is complete.
        Returns the extracted data as a JSON object, or None on failure.
        """
        polling_url = f"{RESULT_ENDPOINT_BASE}{token}"

        # Wait before the first request, as recommended above.
        time.sleep(INITIAL_DELAY_SECONDS)

        while True:
            response = requests.get(polling_url, headers={"apikey": API_KEY})

            if response.status_code == 200:
                result_data = response.json()
                status = result_data.get("status")

                if status == "done":
                    print("Processing complete!")
                    return result_data.get("result")
                elif status == "pending":
                    print("Processing... retrying in 1 second.")
                    time.sleep(POLL_INTERVAL_SECONDS)
                else:
                    print(f"Unexpected status: {status}")
                    return None
            else:
                print(f"Error polling for result: {response.status_code}, {response.text}")
                return None
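
The loop above keeps polling for as long as the status stays pending. In a backend it is often safer to give up after a deadline. The sketch below shows one way to bound the wait; `poll_until_done` and its `fetch_status` argument are hypothetical names, and the function only assumes that `fetch_status()` returns a dict shaped like the result body used above (a `status` of "done" or "pending" and a `result` field).

```python
import time

def poll_until_done(fetch_status, timeout=30.0, interval=1.0,
                    sleep=time.sleep, clock=time.monotonic):
    """
    Call fetch_status() until it reports "done" or the timeout expires.
    fetch_status is a hypothetical callable returning the parsed JSON body.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        body = fetch_status()
        status = body.get("status")
        if status == "done":
            return body.get("result")
        if status != "pending":
            raise RuntimeError(f"Unexpected status: {status}")
        if clock() >= deadline:
            raise TimeoutError("Receipt processing did not finish in time")
        sleep(interval)
```

A caller might pass something like `lambda: requests.get(polling_url, headers={"apikey": API_KEY}).json()` as `fetch_status`. The 30-second default is an arbitrary illustration, not a documented limit.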

Step 3: Combine Upload and Polling

Integrate the upload and polling functions into a complete workflow.

Code Example


    def process_receipt(file_path):
        """
        Upload a receipt and retrieve its processed data.
        Returns the result data, or None if any step fails.
        """
        print("Uploading receipt...")
        token = upload_receipt(file_path)

        if not token:
            print("Failed to start receipt processing.")
            return None

        print("Polling for results...")
        result = poll_for_result(token)

        if result:
            print("Receipt Data Retrieved:")
            print(result)
        else:
            print("Failed to retrieve receipt data.")

        # Return the data so callers can use it, not just read the printout.
        return result

Step 4: Run the Script

Provide the path to your receipt image and process it.


    if __name__ == "__main__":
        receipt_file = "path/to/your/receipt.jpg"
        process_receipt(receipt_file)

Example Output

Once the receipt processing is complete, the API returns a JSON object with structured data like:


    {
        "establishment": "SuperMart",
        "date": "2025-01-01 14:32:00",
        "total": 45.67,
        "subTotal": 41.23,
        "tax": 4.44,
        "lineItems": [
            {"desc": "Apple", "qty": 3, "price": 1.5, "lineTotal": 4.5},
            {"desc": "Milk", "qty": 1, "price": 2.5, "lineTotal": 2.5}
        ]
    }
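
Once you have this object, a few sanity checks can catch OCR mistakes before the data reaches your database. The following is a rough sketch that uses the field names from the sample above; `summarize_receipt` is a hypothetical helper, and it assumes every field shown is present, which real responses may not guarantee.

```python
from datetime import datetime
from decimal import Decimal

def to_decimal(value):
    """Convert a JSON number to Decimal via str() to avoid binary float artifacts."""
    return Decimal(str(value))

def summarize_receipt(result):
    """Parse the date and check that the amounts are internally consistent."""
    purchased_at = datetime.strptime(result["date"], "%Y-%m-%d %H:%M:%S")
    total = to_decimal(result["total"])
    expected_total = to_decimal(result["subTotal"]) + to_decimal(result["tax"])
    line_items_consistent = all(
        to_decimal(item["qty"]) * to_decimal(item["price"]) == to_decimal(item["lineTotal"])
        for item in result["lineItems"]
    )
    return {
        "purchased_at": purchased_at,
        "total": total,
        "total_matches": total == expected_total,
        "line_items_consistent": line_items_consistent,
    }

# Sample data copied from the example output above.
sample = {
    "establishment": "SuperMart",
    "date": "2025-01-01 14:32:00",
    "total": 45.67,
    "subTotal": 41.23,
    "tax": 4.44,
    "lineItems": [
        {"desc": "Apple", "qty": 3, "price": 1.5, "lineTotal": 4.5},
        {"desc": "Milk", "qty": 1, "price": 2.5, "lineTotal": 2.5},
    ],
}
summary = summarize_receipt(sample)
```

Using `Decimal` here is a design choice for comparing currency amounts exactly; plain floats would also work for display purposes.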

Error Handling

The API provides detailed error codes. Here are some common ones:

400: API key not found.
403: No file detected.
405: Unsupported file type.
500: OCR Failure.

For debugging, inspect the status code (response.status_code) and the message returned in the response body (response.text).
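
A small lookup table can turn these codes into readable log lines. This is only a sketch: the dictionary contains just the four codes listed above, not the full set the API may return, and `describe_error` is a hypothetical helper.

```python
# Codes and descriptions copied from the list above; not an exhaustive table.
ERROR_MESSAGES = {
    400: "API key not found",
    403: "No file detected",
    405: "Unsupported file type",
    500: "OCR failure",
}

def describe_error(code, message=None):
    """Build a log-friendly description for an error code, with optional detail."""
    known = ERROR_MESSAGES.get(code, "Unknown error")
    if message:
        return f"{code}: {known} ({message})"
    return f"{code}: {known}"
```

For instance, `describe_error(403)` yields the string "403: No file detected", and an unlisted code falls back to "Unknown error".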

Tips for Improving Results

Image Quality: Ensure the receipt is well-lit and in focus.
Format Guidance: Use images with dimensions greater than 720×1280 for best results.
Custom Configurations: Contact Tabscanner for advanced features like custom fields or line-item resolution.
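
The size guidance can be checked before uploading. The sketch below reads the width and height from a PNG header using only the standard library; it does not handle JPG files, and it treats 720×1280 as an orientation-independent minimum, which is an interpretation of the tip above rather than a documented rule. The helper names are hypothetical.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MIN_SHORT_SIDE = 720    # taken from the guidance above
MIN_LONG_SIDE = 1280

def png_dimensions(data):
    """Return (width, height) from the IHDR chunk of PNG bytes, or None if not a PNG."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        return None
    # Width and height are stored as big-endian 32-bit integers at bytes 16 to 23.
    width, height = struct.unpack(">II", data[16:24])
    return width, height

def meets_size_guidance(width, height):
    """Check the image against the assumed minimum, regardless of orientation."""
    return min(width, height) >= MIN_SHORT_SIDE and max(width, height) >= MIN_LONG_SIDE
```

Only the first 24 bytes of the file are needed, so reading `file.read(24)` from a file opened in binary mode is enough to call `png_dimensions`.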

Conclusion

Tabscanner’s receipt OCR API is a powerful tool for extracting data from receipts with minimal setup. By following this guide, you can integrate it into your Python backend and streamline your data processing workflows.

For more details, visit the Tabscanner Documentation. 🚀

Happy coding!
