While developing a parallel version of some code, I noticed that looking up an item in an association seemed much slower than in the serial version of the code. To show the effect, I created the following example:
LaunchKernels[]; (* four kernels will be launched *)
assoc = Association[Table[i -> {i, i^2}, {i, 1, 100}]];
array = Table[{i, i^2}, {i, 1, 100}];
DistributeDefinitions[assoc, array];
To make sure the association and the table are known to the sub-kernels:
In[6]:= ParallelTable[Names["Global`*"], {4}]
Out[6]= {{"array", "assoc"}, {"array", "assoc"}, {"array", "assoc"}, {"array", "assoc"}}
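Checking the names only shows that the symbols exist on the sub-kernels. A further check, sketched here on the assumption that the definitions above have been distributed, is to ask each sub-kernel whether its copy of assoc is a valid association and whether a lookup returns the expected kind of value:

```wolfram
(* Run on every sub-kernel: is assoc a valid Association there,
   how many entries does it have, and what does a lookup return? *)
ParallelEvaluate[{AssociationQ[assoc], Length[assoc], assoc[10]}]
```

If any sub-kernel reports something different from the main kernel, the distributed copy would be the first thing to look at.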
First, I run a serial version of the example code:
In[7]:= AbsoluteTiming[Table[Table[assoc[10], {100000}];, {4}];]
Out[7]= {0.106069, Null}
Next, I run the parallel version of the same code:
In[8]:= AbsoluteTiming[ParallelTable[Table[assoc[10], {100000}];, {4}, Method -> "CoarsestGrained"];]
Out[8]= {4.37772, Null}
I also tried using the Lookup command explicitly:
In[9]:= AbsoluteTiming[ParallelTable[Table[Lookup[assoc, 10], {100000}];, {4}, Method -> "CoarsestGrained"];]
Out[9]= {4.58422, Null}
In both cases the parallel version runs much slower, although (I think) the sub-kernels make no callback to the main kernel during the evaluation that might slow things down. No data is passed into or out of the sub-kernels either, apart from the Null results.
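To test the assumption that the time is spent inside the sub-kernels rather than in communication with the main kernel, the inner loop can be timed on each sub-kernel individually. This is only a diagnostic sketch that reuses assoc from above; it does not explain the slowdown by itself:

```wolfram
(* Each sub-kernel times its own inner loop; only the four timings
   (in seconds) are sent back to the main kernel. *)
ParallelTable[
 First[AbsoluteTiming[Table[assoc[10], {100000}];]],
 {4}, Method -> "CoarsestGrained"]
```

If the per-kernel timings are each close to the total wall-clock time, the lookup itself is slow on the sub-kernels; if they are small, the overhead lies in scheduling or data transfer.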
When I use a table lookup (Part) instead of an association lookup in the code, the slowdown is not present:
In[13]:= AbsoluteTiming[ParallelTable[Table[array[[10]], {100000}];, {4}, Method -> "CoarsestGrained"];]
Out[13]= {0.037059, Null}
I don’t have a clue what is going on. I would be glad if someone could explain to me why this slowdown happens and how to avoid it, or point me to a similar earlier post in case I missed one.