Tag Archives: pdf

[Solved] itextpdf Read PDF File Error: Rebuild failed: trailer not found.

Recently, I used itextpdf to print invoices, but an error occurred when reading the template file stream. The key code is as follows:

ClassPathResource classPathResource = new ClassPathResource("/template/RU_HK_INVOICE_TEMPLATE.pdf");
InputStream inputStream = classPathResource.getInputStream();
reader = new PdfReader(inputStream); // read the PDF template

The call new PdfReader(inputStream) throws the following exception every time:

com.itextpdf.text.exceptions.InvalidPdfException: Rebuild failed: trailer not found.; Original message: xref subsection not found at file pointer

When resource filtering is enabled, Maven's maven-resources-plugin treats every resource as text and rewrites it using the encoding configured in pom.xml. Some files must not be re-encoded this way, such as binary PDF template files. After re-encoding, the structure of the PDF template may be damaged (as shown in the following figure), so the copy of the file produced by the build is unusable.
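To see why re-encoding is destructive, here is a minimal Python sketch that simulates a tool reading a binary file as UTF-8 text and writing it back. It is not Maven's actual code; the sample bytes and the function name reencode_as_text are made up for illustration. A PDF's cross-reference table stores byte offsets, so any change in length moves objects away from where the xref entries point, which is consistent with the "xref subsection not found at file pointer" message above.

```python
# Illustration only: simulates a text-based copy step, not Maven itself.

# A PDF usually starts with a header line followed by a comment line holding
# bytes >= 0x80, which marks the file as binary. These sample bytes are made up.
ORIGINAL = b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\n1 0 obj\n<< >>\nendobj\n"


def reencode_as_text(data: bytes, encoding: str = "utf-8") -> bytes:
    """Decode bytes as text and encode them again, like a text filter would.

    Byte sequences that are invalid in the encoding become U+FFFD, which
    encodes to three bytes in UTF-8, so the output differs from the input.
    """
    return data.decode(encoding, errors="replace").encode(encoding)


damaged = reencode_as_text(ORIGINAL)

# The ASCII parts survive, but the binary bytes are replaced and the
# length changes, so every byte offset after that point shifts.
print("unchanged:", damaged == ORIGINAL)
print("original length:", len(ORIGINAL), "damaged length:", len(damaged))
print("offset of '1 0 obj' before:", ORIGINAL.find(b"1 0 obj"),
      "after:", damaged.find(b"1 0 obj"))
```

The readable ASCII markers such as %PDF- are still present after the round trip, which is why a damaged template can look plausible in a text editor while a parser rejects it.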

Therefore, files that must not be re-encoded have to be excluded from filtering, for example all files with the suffix .pdf or .p8. This is configured in pom.xml with the nonFilteredFileExtension tag of the maven-resources-plugin (the configuration below lists only pdf; add one nonFilteredFileExtension entry for each further suffix):

            <!-- File extensions (here: pdf) that are excluded from filtering so they are not re-encoded -->
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-resources-plugin</artifactId>
                <version>3.0.1</version>
                <configuration>
                    <encoding>UTF-8</encoding>
                    <useDefaultDelimiters>false</useDefaultDelimiters>
                    <nonFilteredFileExtensions>
                        <nonFilteredFileExtension>pdf</nonFilteredFileExtension>
                    </nonFilteredFileExtensions>
                </configuration>
            </plugin>
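One way to confirm that the configuration works is to compare the template in the source tree with the copy in the build output after running the build. A minimal sketch in Python, assuming the standard Maven layout (src/main/resources copied to target/classes); the paths and the helper names are assumptions, not part of the original project:

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def copied_unchanged(source: Path, built: Path) -> bool:
    """True if the build output is byte-for-byte identical to the source."""
    return sha256_of(source) == sha256_of(built)


def check_template(project_dir: Path, relative: str) -> None:
    """Print whether one resource survived the build unmodified.

    Assumes the default Maven directories; adjust them if the project
    overrides its resource or output locations.
    """
    source = project_dir / "src" / "main" / "resources" / relative
    built = project_dir / "target" / "classes" / relative
    if not (source.is_file() and built.is_file()):
        print("missing file, run the build first:", source, built)
        return
    status = "identical" if copied_unchanged(source, built) else "MODIFIED"
    print(f"{relative}: {status}")
```

For the template in this post, the call would be check_template(Path("."), "template/RU_HK_INVOICE_TEMPLATE.pdf"), run from the project root after mvn process-resources. If it reports MODIFIED, the file is still being filtered.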

[Solved] Pdfplumber Read PDF Table Error: AttributeError: function/symbol 'ARC4_stream_init' not found in library

pdfplumber reports an error when reading a PDF table: AttributeError: function/symbol 'ARC4_stream_init' not found in library

Error and solution

The error

When using pdfplumber to extract tables from a PDF, an error is raised saying that ARC4_stream_init is missing:

Traceback (most recent call last):

  File "C:\Users\Stan\Python\ALIRT\pdf extracter\test.py", line 50, in <module>
    text = convert_pdf_to_txt('test_pdf.pdf')

  File "C:\Users\Stan\Python\ALIRT\pdf extracter\test.py", line 40, in convert_pdf_to_txt
    for page in PDFPage.get_pages(fp, pagenos, maxpages=maxpages, password=password,caching=caching, check_extractable=True):

  File "C:\Users\Stan\anaconda3\lib\site-packages\pdfminer\pdfpage.py", line 127, in get_pages
    doc = PDFDocument(parser, password=password, caching=caching)

  File "C:\Users\Stan\anaconda3\lib\site-packages\pdfminer\pdfdocument.py", line 564, in __init__
    self._initialize_password(password)

  File "C:\Users\Stan\anaconda3\lib\site-packages\pdfminer\pdfdocument.py", line 590, in _initialize_password
    handler = factory(docid, param, password)

  File "C:\Users\Stan\anaconda3\lib\site-packages\pdfminer\pdfdocument.py", line 283, in __init__
    self.init()

  File "C:\Users\Stan\anaconda3\lib\site-packages\pdfminer\pdfdocument.py", line 291, in init
    self.init_key()

  File "C:\Users\Stan\anaconda3\lib\site-packages\pdfminer\pdfdocument.py", line 304, in init_key
    self.key = self.authenticate(self.password)

  File "C:\Users\Stan\anaconda3\lib\site-packages\pdfminer\pdfdocument.py", line 354, in authenticate
    key = self.authenticate_user_password(password)

  File "C:\Users\Stan\anaconda3\lib\site-packages\pdfminer\pdfdocument.py", line 361, in authenticate_user_password
    if self.verify_encryption_key(key):

  File "C:\Users\Stan\anaconda3\lib\site-packages\pdfminer\pdfdocument.py", line 368, in verify_encryption_key
    u = self.compute_u(key)

  File "C:\Users\Stan\anaconda3\lib\site-packages\pdfminer\pdfdocument.py", line 326, in compute_u
    result = ARC4.new(key).encrypt(hash.digest())  # 4

  File "C:\Users\Stan\anaconda3\lib\site-packages\Crypto\Cipher\ARC4.py", line 132, in new
    return ARC4Cipher(key, *args, **kwargs)

  File "C:\Users\Stan\anaconda3\lib\site-packages\Crypto\Cipher\ARC4.py", line 60, in __init__
    result = _raw_arc4_lib.ARC4_stream_init(c_uint8_ptr(key),

  File "C:\Users\Stan\anaconda3\lib\site-packages\cffi\api.py", line 912, in __getattr__
    make_accessor(name)

  File "C:\Users\Stan\anaconda3\lib\site-packages\cffi\api.py", line 908, in make_accessor
    accessors[name](name)

  File "C:\Users\Stan\anaconda3\lib\site-packages\cffi\api.py", line 838, in accessor_function
    value = backendlib.load_function(BType, name)

AttributeError: function/symbol 'ARC4_stream_init' not found in library 'C:\Users\Stan\anaconda3\lib\site-packages\Crypto\Util\..\Cipher\_ARC4.cp37-win_amd64.pyd': error 0x7f
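The last frames show that the Python wrapper Crypto\Cipher\ARC4.py cannot find a symbol in the compiled _ARC4 library that sits next to it, which suggests the two files do not come from the same release. One possible cause (an assumption, not something this post confirms) is that several distributions that install into the same Crypto directory are present at once. A minimal sketch that lists which of them are installed; it needs Python 3.8 or later for importlib.metadata, and the list of candidate names is chosen for illustration:

```python
from importlib import metadata  # available from Python 3.8

# Distributions that write to site-packages/Crypto (or "crypto", which is the
# same directory on case-insensitive file systems such as Windows).
# The selection is an assumption made for this sketch.
CANDIDATES = ("pycryptodome", "pycrypto", "crypto")


def installed_versions(names=CANDIDATES) -> dict:
    """Map each installed distribution name to its version string."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # not installed, nothing to report
    return found


if __name__ == "__main__":
    versions = installed_versions()
    for name, version in versions.items():
        print(f"{name}=={version}")
    if len(versions) > 1:
        print("More than one provider is installed; they may overwrite each other.")
```

If more than one name is printed, uninstalling all of them and installing a single one again is a reasonable first thing to try before changing versions.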

Solution:

Downgrade:

pip install pycryptodome==3.0.0

Two alternative methods:

Method 1:

# Installation
$ pip install arc4
# Import ARC4 package
from arc4 import ARC4

Method 2:

# Installation (Crypto.Cipher comes from pycryptodome; the PyPI package named crypto does not provide it)
$ pip install pycryptodome
# Import ARC4 package
from Crypto.Cipher import ARC4
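For the fixes that keep using Crypto.Cipher (the downgrade and Method 2), a quick round trip shows whether ARC4 loads and runs again. A minimal sketch, assuming the ARC4.new(key) interface that appears in the traceback; the function name and the sample key are made up:

```python
def crypto_arc4_works() -> bool:
    """Return True if Crypto.Cipher.ARC4 imports and round-trips data."""
    key = b"0123456789abcdef"  # arbitrary 16-byte sample key
    message = b"pdf test data"
    try:
        from Crypto.Cipher import ARC4

        # ARC4 is a stream cipher, so a fresh cipher object with the same
        # key turns the ciphertext back into the plaintext.
        ciphertext = ARC4.new(key).encrypt(message)
        return ARC4.new(key).decrypt(ciphertext) == message
    except (ImportError, AttributeError, OSError):
        # ImportError: nothing provides Crypto.Cipher.
        # AttributeError: the missing-symbol failure shown in the traceback.
        # OSError: the compiled library could not be loaded at all.
        return False


if __name__ == "__main__":
    print("Crypto.Cipher.ARC4 usable:", crypto_arc4_works())
```

If this prints True in the same environment that runs pdfplumber, the PDF that triggered the error is worth retrying.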