Python – Using List/Multiple Arguments in Pool Map with Multiprocessing

python, python-multiprocessing

I am trying to pass a list as the argument to pool.map(co_refresh, input_list). However, pool.map never calls co_refresh, and no error is returned. The process just seems to hang.

Original Code:

from multiprocessing import Pool
import pandas as pd
import os

account='xxx'
password='xxx'
threads=5
co_links='file.csv'

input_list=[]

pool = Pool(processes=threads)
def co_refresh(url, account, password, outputfile):

    print(url + ' : ' + account + ' : ' + password + ' : ' + outputfile)

    return;

link_pool = pd.read_csv(co_links, skipinitialspace = True)

for i, row in link_pool.iterrows():

    ln = (row.URL, account, password, os.path.join('e:/', row.File_Name.split('.')[0] + '.csv'))

    input_list.append(ln)

pool.map(co_refresh, input_list)

pool.close()

However, it never triggered the function co_refresh. How can I use the list as a parameter to be passed to my function?

Old Question (Simplified):

I have the following input_list, which is a list of lists:

[a1, b1, c1, d1]
[a2, b2, c2, d2]
[a3, b3, c3, d3]

I have the function as below:

def func(a, b, c, d):
    ###
    return

I would like to run func with multiprocessing:

from multiprocessing import Pool
pool = Pool(processes=5)
pool.map(func, input_list)
pool.close()

However, it never triggered the function func. How can I use the list as a parameter to be passed to my function?

Best Answer

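Define your worker function before you create the Pool. On platforms that fork, the worker processes are forked at the moment the Pool is created, so they only know the names that existed at that point. In your code co_refresh is defined after that line, so the workers cannot find it, which can leave pool.map waiting forever instead of raising an error.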

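Also, replace pool.map with pool.starmap (Python 3.3+). pool.map passes each item of input_list to the function as a single argument, whereas pool.starmap unpacks each item into separate arguments, which is what co_refresh(url, account, password, outputfile) expects.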
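To make the difference concrete, here is a minimal sketch of the two calling conventions. The names describe and describe_packed are hypothetical stand-ins, not part of the original code. Calling pool.map(describe, input_list) directly would fail with a TypeError, because each tuple arrives as one argument.

```python
from multiprocessing import Pool


def describe(a, b, c, d):
    # Stand-in for the real work function; takes four positional arguments.
    return f"{a} {b} {c} {d}"


def describe_packed(args):
    # Wrapper for pool.map: receives one tuple and unpacks it itself.
    return describe(*args)


if __name__ == "__main__":
    input_list = [("a1", "b1", "c1", "d1"), ("a2", "b2", "c2", "d2")]
    with Pool(processes=2) as pool:
        # map calls describe_packed(item) once per item.
        packed = pool.map(describe_packed, input_list)
        # starmap calls describe(*item) once per item.
        starred = pool.starmap(describe, input_list)
    # Both approaches should produce the same list of results.
    print(packed == starred)
```

The wrapper approach is mainly useful on Python versions without starmap; otherwise starmap keeps the work function's signature unchanged.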

A simplified example:

from multiprocessing import Pool

def co_refresh(a, b, c, d):
    print(a, b, c, d)

input_list = [f'a{i} b{i} c{i} d{i}'.split() for i in range(4)]
# [['a0', 'b0', 'c0', 'd0'], ['a1', 'b1', 'c1', 'd1'], ['a2', 'b2', 'c2', 'd2'], ['a3', 'b3', 'c3', 'd3']]

pool = Pool(processes=3)
pool.starmap(co_refresh, input_list)
pool.close()
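
Applied to the code in the question, the same idea might look like the sketch below. It assumes a few things that are not in the original: the CSV rows are replaced by a hypothetical in-memory list of (url, file_name) pairs, build_inputs is a made-up helper, and the output directory is "out". The pool is created inside an if __name__ == "__main__": guard, which is required on platforms that start workers by spawning a fresh interpreter (Windows, and macOS by default), because the workers re-import the main module.

```python
import os
from multiprocessing import Pool


def co_refresh(url, account, password, outputfile):
    # Placeholder body: the real function would fetch url and write outputfile.
    return f"{url} : {account} : {password} : {outputfile}"


def build_inputs(rows, account, password, out_dir):
    # rows is an iterable of (url, file_name) pairs, standing in for the CSV rows.
    return [
        (url, account, password,
         os.path.join(out_dir, file_name.split(".")[0] + ".csv"))
        for url, file_name in rows
    ]


if __name__ == "__main__":
    # Hypothetical sample data instead of reading file.csv.
    rows = [("http://example.com/a", "a.xlsx"), ("http://example.com/b", "b.xlsx")]
    input_list = build_inputs(rows, "xxx", "xxx", "out")
    # The pool is created after co_refresh is defined and inside the guard.
    with Pool(processes=2) as pool:
        for line in pool.starmap(co_refresh, input_list):
            print(line)
```

Returning the string and printing it in the parent process, rather than printing inside the worker, also makes it easier to see whether the function actually ran.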