
Using for loop reading with multiprocessing missing iterables


Sorry if I’m wording this wrong; below is my script. I’m trying to figure out why, when I review the archive file (that I created), I only see 9874 lines when the file I open/read has 10000. I guess I’m trying to understand why some iterations are missing. I’ve tried it a few times and that number always varies. What am I doing wrong?

import multiprocessing
import hashlib
from tqdm import tqdm

archive = open('color_archive.txt', 'w')

def generate_hash(yellow: str) -> str:
    b256 = hashlib.sha256(yellow.encode()).hexdigest()
    x = ' '.join([yellow, b256])
    archive.write(f"{x}\n")

if __name__ == "__main__":
    listofcolors = []   
    with open('x.txt') as f:
        for yellow in tqdm(f, desc="Generating..."):
            listofcolors.append(yellow.strip())
           
    cpustotal = multiprocessing.cpu_count() - 1
    pool = multiprocessing.Pool(cpustotal)
    results = pool.imap(generate_hash, listofcolors)
    pool.close()
    pool.join()
print('DONE')

This script executes without errors, but when I look at the archive file some lines are missing; for example, a file with 10000 lines only wrote 9985 lines to the new file. What am I doing wrong?
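(The accepted answer is not visible on this page, but a commonly cited cause for symptoms like this is having every worker process write to a copy of the same buffered file handle: each child inherits its own `archive` buffer, and concurrent, unsynchronized flushes can drop or clobber lines. A minimal sketch of the usual workaround, under that assumption, is to have workers *return* their result and let only the parent process write the file. The names below are kept from the question; the tiny sample `x.txt` is created just so the sketch is self-contained.)

```python
import multiprocessing
import hashlib

def generate_hash(yellow: str) -> str:
    # Do the CPU-bound hashing in the worker, but return the finished line
    # instead of writing it -- only the parent process touches the file.
    return f"{yellow} {hashlib.sha256(yellow.encode()).hexdigest()}"

if __name__ == "__main__":
    # Tiny sample input so the sketch runs on its own; in the real script
    # x.txt already exists with 10000 lines.
    with open('x.txt', 'w') as f:
        f.write("red\ngreen\nblue\n")

    with open('x.txt') as f:
        listofcolors = [line.strip() for line in f]

    workers = max(1, multiprocessing.cpu_count() - 1)
    # imap yields results in input order, so the archive lines line up
    # with x.txt; the single writer below cannot lose or interleave lines.
    with multiprocessing.Pool(workers) as pool, \
         open('color_archive.txt', 'w') as archive:
        for line in pool.imap(generate_hash, listofcolors):
            archive.write(line + "\n")
    print('DONE')
```

With this shape the output file always has exactly as many lines as the input, because only one process owns the file handle.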



