aoptk.literature.convert_pdf_scan

Classes

ConvertPDFScan

Abstract base class for converting PDF scans to text.

Module Contents

class aoptk.literature.convert_pdf_scan.ConvertPDFScan

Bases: abc.ABC

Abstract base class for converting PDF scans to text.

abstractmethod convert_pdf_scan(image64: str, mime_type: str) → str

Return converted text data.

Parameters:
  • image64 – Base64-encoded string of the PDF scan image.

  • mime_type – MIME type of the image (for example, ‘image/png’).
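
Example

A minimal sketch of how a concrete subclass might satisfy this interface. To stay self-contained, the block defines a local stand-in for ConvertPDFScan that mirrors the documented signature rather than importing aoptk. It assumes convert_pdf_scan is an instance method, which the signature above does not show explicitly. ByteCountConverter is a hypothetical name, and it only reports the decoded payload size; a real implementation would pass the image to an OCR or vision backend.

```python
import base64
from abc import ABC, abstractmethod


class ConvertPDFScan(ABC):
    """Local stand-in mirroring the documented interface (assumption: not the aoptk import)."""

    @abstractmethod
    def convert_pdf_scan(self, image64: str, mime_type: str) -> str:
        """Return converted text data."""


class ByteCountConverter(ConvertPDFScan):
    """Hypothetical subclass: reports the decoded size instead of running OCR."""

    def convert_pdf_scan(self, image64: str, mime_type: str) -> str:
        # validate=True rejects characters outside the Base64 alphabet.
        raw = base64.b64decode(image64, validate=True)
        return f"{mime_type}: {len(raw)} bytes"


# Encode placeholder bytes the way a caller would encode a scanned page image.
payload = base64.b64encode(b"not a real PNG").decode("ascii")
result = ByteCountConverter().convert_pdf_scan(payload, "image/png")
print(result)
```

Because convert_pdf_scan is declared abstract, instantiating the base class directly raises TypeError; only subclasses that override the method can be constructed.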