How to Convert PDF to PDF/A with Python

Learn how to make a document PDF/A compliant by calling the pdfRest Convert to PDF/A API Tool from Python.

Why Use Convert to PDF/A with Python?

The pdfRest Convert to PDF/A API Tool converts PDF files to PDF/A, an ISO-standardized version of PDF designed for the long-term preservation of electronic documents. This tutorial shows how to send an API call to Convert to PDF/A using Python.

Converting to PDF/A is particularly useful for archiving: the format preserves a document's visual appearance over time, so the document can be reliably opened and rendered in the future.

Python Code Sample for PDF/A

from requests_toolbelt import MultipartEncoder
import requests
import json

pdfa_endpoint_url = 'https://api.pdfrest.com/pdfa'

# The /pdfa endpoint can take a single PDF file or id as input.
mp_encoder_pdfa = MultipartEncoder(
    fields={
        'file': ('file_name.pdf', open('/path/to/file', 'rb'), 'application/pdf'),
        'output_type': 'PDF/A-1b',
        'rasterize_if_errors_encountered': 'on',
        'output': 'example_pdfa_out',
    }
)

# Let's set the headers that the pdfa endpoint expects.
# Since MultipartEncoder is used, the 'Content-Type' header gets set to 'multipart/form-data' via the content_type attribute below.
headers = {
    'Accept': 'application/json',
    'Content-Type': mp_encoder_pdfa.content_type,
    'Api-Key': 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx' # place your api key here
}

print("Sending POST request to pdfa endpoint...")
response = requests.post(pdfa_endpoint_url, data=mp_encoder_pdfa, headers=headers)

print("Response status code: " + str(response.status_code))

if response.ok:
    response_json = response.json()
    print(json.dumps(response_json, indent=2))
else:
    print(response.text)

# If you would like to download the file instead of getting the JSON response, please see the 'get-resource-id-endpoint.py' sample.

Source: pdf-rest-api-samples on GitHub
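
The sample above prints the JSON response rather than saving the converted file. As a minimal sketch of a possible next step, the file could be fetched with the standard library as shown below. This assumes the response stores a download link under a key named outputUrl and that the link can be fetched without extra headers; neither is confirmed by this tutorial, so check both against the API Reference. The helper names get_output_url and download_file are hypothetical.

```python
import shutil
import urllib.request


def get_output_url(response_json):
    """Return the download URL from a parsed response, or None if absent.

    Assumption: the URL is stored under the key 'outputUrl'.
    """
    url = response_json.get('outputUrl')
    return url if isinstance(url, str) and url else None


def download_file(url, destination_path):
    """Stream the resource at `url` into a local file and return its path.

    Assumption: the URL is directly downloadable without an Api-Key header.
    """
    with urllib.request.urlopen(url, timeout=60) as remote, \
            open(destination_path, 'wb') as local:
        shutil.copyfileobj(remote, local)
    return destination_path
```

With the tutorial's response_json in hand, you would call get_output_url(response_json) and, if it returns a URL, pass that URL to download_file along with a local path of your choice.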

Breaking Down the Python

The code sample above makes a multipart/form-data POST request to the pdfRest API to convert a PDF to PDF/A.

from requests_toolbelt import MultipartEncoder
import requests
import json

These lines import the necessary modules. MultipartEncoder from requests_toolbelt encodes the multipart form data, requests sends the HTTP request, and json formats the response for printing.

pdfa_endpoint_url = 'https://api.pdfrest.com/pdfa'

This sets the API endpoint URL for the PDF/A conversion.

mp_encoder_pdfa = MultipartEncoder(
    fields={
        'file': ('file_name.pdf', open('/path/to/file', 'rb'), 'application/pdf'),
        'output_type': 'PDF/A-1b',
        'rasterize_if_errors_encountered': 'on',
        'output': 'example_pdfa_out',
    }
)

Here we define the payload for the POST request. The fields include:

'file': The PDF file to convert, opened in binary mode. Replace 'file_name.pdf' and '/path/to/file' with your file's name and path.
'output_type': The type of PDF/A to convert to (e.g., PDF/A-1b).
'rasterize_if_errors_encountered': If set to 'on', the service will rasterize the PDF if it encounters errors during conversion.
'output': The desired name for the output file.
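
The fields described above can be assembled by a small helper that rejects an unexpected output_type before any request is sent. This is only a sketch: build_pdfa_fields is a hypothetical name, and the set of conformance levels is an assumption, since only PDF/A-1b appears in this tutorial. The helper omits the rasterize field when it is not wanted rather than guessing at a value for "off".

```python
import io

# Assumption: conformance levels accepted by the endpoint. Only 'PDF/A-1b'
# is shown in this tutorial; verify the others against the API Reference.
ASSUMED_OUTPUT_TYPES = {'PDF/A-1b', 'PDF/A-2b', 'PDF/A-2u', 'PDF/A-3b', 'PDF/A-3u'}


def build_pdfa_fields(file_obj, file_name, output_type='PDF/A-1b',
                      rasterize=True, output='example_pdfa_out'):
    """Build the form fields for a PDF/A conversion request."""
    if output_type not in ASSUMED_OUTPUT_TYPES:
        raise ValueError(f'Unexpected output_type: {output_type!r}')
    fields = {
        'file': (file_name, file_obj, 'application/pdf'),
        'output_type': output_type,
        'output': output,
    }
    if rasterize:
        # Same value the tutorial's sample sends.
        fields['rasterize_if_errors_encountered'] = 'on'
    return fields


# Usage with an in-memory placeholder instead of a real PDF on disk.
fields = build_pdfa_fields(io.BytesIO(b'%PDF-1.4 placeholder'), 'file_name.pdf')
print(sorted(fields))
```

The returned dictionary has the same shape as the fields argument passed to MultipartEncoder in the sample.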

headers = {
    'Accept': 'application/json',
    'Content-Type': mp_encoder_pdfa.content_type,
    'Api-Key': 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'
}

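These are the headers for the request. 'Accept' asks for a JSON response, and 'Content-Type' is taken from the encoder so that it includes the multipart boundary. Replace 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx' with your actual API key.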
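
Hardcoding the key is fine for a quick test, but you may prefer to keep it out of source files. One option, sketched here, is to read it from an environment variable. The variable name PDFREST_API_KEY and the helper build_headers are hypothetical choices for illustration, not something the pdfRest samples define.

```python
import os


def build_headers(content_type, env_var='PDFREST_API_KEY'):
    """Build request headers, reading the API key from the environment.

    `env_var` is a hypothetical variable name chosen for this sketch.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f'Set the {env_var} environment variable to your API key.')
    return {
        'Accept': 'application/json',
        'Content-Type': content_type,
        'Api-Key': api_key,
    }
```

In the sample, you would call build_headers(mp_encoder_pdfa.content_type) in place of the literal headers dictionary.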

response = requests.post(pdfa_endpoint_url, data=mp_encoder_pdfa, headers=headers)

This line sends the POST request with the encoded data and headers.

if response.ok:
    response_json = response.json()
    print(json.dumps(response_json, indent=2))
else:
    print(response.text)

If the request succeeds (response.ok is true for status codes below 400), the JSON response is pretty-printed. Otherwise, the raw response text is printed.
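
The same branching can be wrapped in a function that also copes with a success status whose body is not valid JSON. The sketch below uses a hypothetical helper, summarize_response, which accepts any object exposing ok, status_code, text, and json(), such as the response returned by requests.post.

```python
import json


def summarize_response(response):
    """Return (succeeded, message) for a requests-style response object."""
    if not response.ok:
        return False, f'HTTP {response.status_code}: {response.text}'
    try:
        body = response.json()
    except ValueError:
        # requests raises a ValueError subclass when the body is not JSON.
        return False, f'HTTP {response.status_code}: response body was not valid JSON'
    return True, json.dumps(body, indent=2)
```

Calling succeeded, message = summarize_response(response) and printing message reproduces the sample's output, while succeeded lets the calling code decide what to do next.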

Beyond this Tutorial

We have now successfully made an API call to the pdfRest Convert to PDF/A endpoint using Python. This allows us to convert PDF documents to the archival PDF/A format programmatically. You can demo all of the pdfRest API Tools in the API Lab and refer to the API Reference documentation for more details.

Note: This is an example of a multipart API call. Code samples using JSON payloads can be found at our GitHub Repository.

