Error when decoding an attachment with python from an email obtained with the Gmail API

  • Thread starter: Adolfo Israel Ramírez Reséndiz (Guest)

I hope someone can help me. I am trying to download the attachments of emails that carry DMARC reports for a domain. I fetch them with the Gmail API; according to the API documentation, the file is returned as a base64-encoded string, which I apparently obtain correctly. However, when I try to decode that string with Python to recover the file, I get an error saying that the base64 string is not valid and may be corrupt. To rule out the string being too long, I tried downloading it in parts and then joining them, but the error persists.

Below are the code I am using and the error it returns (the base64 string is just an example):

Code:
import base64
import gzip
import io

# Base64 encoded gzip file
base64_gzip = "H4sIAAAAAAAAAKVTuW7cMBCt118RuJcoyl5bCzB0XKRMUrhLI1DUaJexeICkNsfXh5f2MIw0aSTOe6N584Yj8vRLzh-OYJ3Q6uMtrpvbJ3pDJoBxYPyV3mxIIWlTY4LWIOAWjLa-l-DZyDwL0IZou- 8Vk0Cfvzx___a1evn8QtAJjBkgmZip0c5L5jzYT0yyP1o5cDXXkqDMx8xSX4x07PCWtVtc8 YHz6v6ha6odG6AadtDhu2FiO94SdM6PX4eWoLdM7ZPshgywF4riR_zYdg_3TUNQRhIJakzUX SAjFeNYBF1VOUlcWCZGz4L_7s0yzMIdoIjr4EJRoSbtYM7GChZpNr4KSR1B-ZAgZ6aExHcEDLXwA7gnyKTYnQGXEcM9xbHbeIjApGmIwjO2-k5fYaJc29yh1T-zdacXy6EXhrbNru62dYu7ehv GeSZSHteLCnoE5UPCigYc2byEQaXK0b1w4X6Fj3uitILg_QIpOdG4YS5YXmeQTE4FLGM4G7kSCTeR-ydiBOXFJMJarld5hFkb6Cer5fUNXFMp-wBsBPtO7iWRBN8IEbb4Q2_BLbMvyhc-_n3_abfjh8V rCbLdUxWyjuC_yq3rhN70G9PyMoSNWX_2v5P8j0QNBAAA"

# Ensure the base64 string has the correct padding
missing_padding = len(base64_gzip) % 4
if missing_padding:
    base64_gzip += '=' * (4 - missing_padding)

# Decode the base64 string
decoded_gzip = base64.b64decode(base64_gzip)

# Decompress the gzip file
with gzip.GzipFile(fileobj=io.BytesIO(decoded_gzip)) as f:
    decompressed_data = f.read().decode('utf-8')

decompressed_data

Code:
---------------------------------------------------------------------------
Error                                     Traceback (most recent call last)
Cell In[2], line 14
     11     base64_gzip += '=' * (4 - missing_padding)
     13 # Decode the base64 string
---> 14 decoded_gzip = base64.b64decode(base64_gzip)
     16 # Decompress the gzip file
     17 with gzip.GzipFile(fileobj=io.BytesIO(decoded_gzip)) as f:

File /usr/local/lib/python3.11/base64.py:88, in b64decode(s, altchars, validate)
     86     assert len(altchars) == 2, repr(altchars)
     87     s = s.translate(bytes.maketrans(altchars, b'+/'))
---> 88 return binascii.a2b_base64(s, strict_mode=validate)

Error: Incorrect padding

What I have tried:
  1. I downloaded the base64 string from the API in parts, to rule out the string being corrupted in the API response.
  2. I used a different decoding library (binascii), to rule out a problem with the base64 library.
  3. I even encoded a smaller file manually, sent it to the mailbox I am using, downloaded it through the same API request, and decoded it successfully with the same code. This might mean that the string I get for the other files is being corrupted, but I don't know why.
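
A possible explanation, offered as a sketch rather than a confirmed fix: the Gmail API returns attachment data as base64url (the URL-safe alphabet, with `-` and `_` in place of `+` and `/`), and the example string above does contain `-` and `_`. By default `base64.b64decode` silently discards characters outside the standard alphabet before the padding check, so the remaining length is no longer a multiple of 4 and the call can fail with "Incorrect padding" even after padding was added by hand. The code below uses `base64.urlsafe_b64decode` instead. The helper names `decode_gmail_attachment` and `gunzip` and the sample payload are made up for illustration; stripping whitespace assumes that any spaces in the string are copy artifacts.

```python
import base64
import gzip
import io


def decode_gmail_attachment(data: str) -> bytes:
    """Decode a base64url string such as the `data` field of an attachment body.

    Hypothetical helper, not part of the Gmail client library.
    """
    # Drop whitespace that may have crept in while copying or joining chunks.
    cleaned = "".join(data.split())
    # Restore the '=' padding, which base64url payloads often omit.
    cleaned += "=" * (-len(cleaned) % 4)
    # urlsafe_b64decode maps '-' to '+' and '_' to '/' before decoding.
    return base64.urlsafe_b64decode(cleaned)


def gunzip(raw: bytes) -> bytes:
    """Decompress gzip bytes held in memory."""
    with gzip.GzipFile(fileobj=io.BytesIO(raw)) as f:
        return f.read()


# Round trip with a made-up payload (not real Gmail data): compress,
# encode as unpadded base64url, then decode and decompress again.
sample = base64.urlsafe_b64encode(gzip.compress(b"<feedback/>")).decode().rstrip("=")
report = gunzip(decode_gmail_attachment(sample)).decode("utf-8")
```

If this is indeed the cause, it would also fit the third point above: a small test file can happen to encode without any `-` or `_`, in which case the standard decoder works by coincidence. Note that DMARC reports are often sent as `.zip` as well as `.gz`, so the decompression step may need to depend on the attachment's file name.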