
Why is Python multiprocessing using class functions slower than in serial for this code?


I am trying to run a multiprocessing pool inside a class to evaluate several parameter sets, using methods of a much larger class. The idea is to take the initial parameter values, perturb each set with a 5% normal spread, and calculate the log likelihood of the perturbed values.

Here is a (somewhat useless) snippet of the code I am using. make_model is very long, but it spits out a velocity model which I compare to the measured values stored in a data container.

import multiprocessing as mp
import time as time_counter

import numpy as np
import astropy.units as u


class fit_time_dependent():
    def __init__(self):
        # set up a bunch of things here
        ...

    def make_model(self, pars):
        # a bunch of things here too
        return velocity_model

    def log_likelihood_pass1(self, pars):
        velocity_model = self.make_model(pars)

        totallogprob = 0
        if self.datum.velocities:
            for inst in self.datum.velocity_instruments:

                vsh_data = self.datum.get_velocity(inst)
                vsh_data_y = vsh_data["vsh"] * u.km / u.s
                vsh_data_y_err = vsh_data["vsh_err"] * u.km / u.s

                sigma2 = vsh_data_y_err ** 2  # + model ** 2
                totallogprob += -0.5 * np.sum((vsh_data_y - velocity_model) ** 2 / sigma2)

        return totallogprob.value

    def log_prob_pass1(self, pars):
        lp = self.log_prior(pars)
        if not np.isfinite(lp):
            return -np.inf
        return lp + self.log_likelihood_pass1(pars)

    def do_fit(self):
        p0 = ...  # initial values from a previous fit of the model to the data

        nsize = 128
        spread = 0.05
        pos = np.array(p0) + spread * np.random.randn(nsize, len(p0))

        time_start_pool = time_counter.time()
        pool = mp.Pool(8)
        results_pool = pool.map(self.log_prob_pass1, pos)
        time_end_pool = time_counter.time()
        time_elapsed_pool = float(time_end_pool) - float(time_start_pool)
        print("Pool - map - %s seconds" % time_elapsed_pool)

        time_start_serial = time_counter.time()
        results_serial = np.asarray(list(map(self.log_prob_pass1, pos)))
        time_end_serial = time_counter.time()
        time_elapsed_serial = float(time_end_serial) - float(time_start_serial)
        print("Serial - map - %s seconds" % time_elapsed_serial)

The issue is that this calculation has to be repeated many times, and running on a single core would take far too long.

When timing the pool version against the serial version, I get a huge performance hit from using the pool:

Pool - map - 296.5006010532379 seconds
Serial - map - 17.647610187530518 seconds
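
In case the size of the class matters: my understanding is that pool.map has to pickle the bound method self.log_prob_pass1, and therefore the whole fit_time_dependent instance, to send work to each worker. A rough check of that cost would be something like the sketch below (fitter just stands in for the instance I actually use, and the dumps call will raise if anything in the instance cannot be pickled):

import pickle
import time

fitter = fit_time_dependent()                    # the instance used for the fit
t0 = time.time()
payload = pickle.dumps(fitter.log_prob_pass1)    # pickles `self` along with the method
print("pickled payload: %d bytes in %.3f s" % (len(payload), time.time() - t0))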

Additionally, I was watching my CPU usage, and it seems that the pool doesn’t use any of the cores that I requested:

[screenshot: CPU usage]

I’ve tried pathos/multiprocess with their different pool options (ProcessPool, ParallelPool, ThreadPool). I would like to keep it simple and not have to use Process directly, but if it comes to that, fine.
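
For reference, the pathos attempts looked roughly like this (just a sketch of the call inside do_fit; I only swapped the pool class and kept the same map call):

from pathos.pools import ProcessPool, ThreadPool   # ParallelPool used the same way

pool = ProcessPool(nodes=8)                # or ThreadPool(nodes=8)
results_pool = pool.map(self.log_prob_pass1, pos)
pool.close()
pool.join()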

This seems similar to my problem but not exactly: https://stackoverflow.com/questions/66790158/how-to-make-use-of-a-multiprocessing-manager-within-a-class

Thanks for the help.


