Hi,
We are converting a stochastic simulation program written in Fortran to OpenMP, since the outputs of independent runs can simply be summed. In the simplest version, we have just made the main loop a parallel region with firstprivate variables. No matter how many threads we launch, the wall-clock time is roughly the single-thread time multiplied by the number of threads. The problem seems to be in _kmp_launch_monitor, which sits in 200 ms waits on ManualResetEvents. Removing the atomic and critical sections has little effect on the outcome, and neither does switching to OMP DO.
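
To make the pattern concrete, here is a minimal sketch of the kind of structure described above. It is not our actual program: every name in it (mc_sketch, n_trials, state, total) is hypothetical, and a toy linear congruential generator stands in for the real simulation kernel. It assumes the partial results are combined with a reduction clause, that each thread keeps its own generator state, and that timing is taken with omp_get_wtime so that wall-clock time rather than summed CPU time is reported.

```fortran
! Hypothetical sketch only: names and the toy generator are illustrative,
! not taken from the real simulation.
program mc_sketch
  use omp_lib
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: i8 = selected_int_kind(18)
  integer, parameter :: n_trials = 10000000
  integer :: i, tid
  integer(i8) :: state
  real(dp) :: total, x, t0, t1

  total = 0.0_dp
  t0 = omp_get_wtime()               ! wall-clock time in seconds

  !$omp parallel default(none) private(i, tid, state, x) reduction(+:total)
  tid = omp_get_thread_num()
  state = 12345_i8 + 7919_i8 * tid   ! per-thread generator state, no sharing

  !$omp do schedule(static)
  do i = 1, n_trials
     ! Toy LCG modulo 2**31; the product stays within 64-bit range.
     state = mod(1103515245_i8 * state + 12345_i8, 2147483648_i8)
     x = real(state, dp) / 2147483648.0_dp
     total = total + x               ! summed across threads by the reduction
  end do
  !$omp end do
  !$omp end parallel

  t1 = omp_get_wtime()

  print '(a,i0)',    'max threads = ', omp_get_max_threads()
  print '(a,f10.6)', 'mean        = ', total / real(n_trials, dp)
  print '(a,f10.3)', 'wall time s = ', t1 - t0
end program mc_sketch
```

Built with OpenMP enabled and run with OMP_NUM_THREADS set to 1, 2, 4 and so on, a loop of this shape would be expected to show the wall time falling as threads are added. That makes it a possible baseline to compare against the behaviour we are seeing.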