Skip to content

Comparisson between various OpenMP scheduling policies for multi-thread loop distribution based on the Gaussian Blur algorithm

License

Notifications You must be signed in to change notification settings

FedosCucumber/OpenMPSchedulingTest

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

27 Commits
 
 
 
 
 
 

Repository files navigation

OpenMP Scheduling Policies Test

Comparisson between various OpenMP policies for multi-thread loop distribution based on the Gaussian Blur algorithm using box blur approximation. Algorithm is being run for a .pgm picture.

Algorithm overview.

Gaussian blur is an algorithm that blurs the image using the gaussian function: $ g(x, y) = e^\frac{x^2+y^2}{2\sigma^2} $. Result of the function are used to create a convolution kernel of size $6\sigma -1$ which is moved through the picture and changes each pixel. $\sigma$ is the standart deviation of pixel colour.

Applying gaussian blur directly can proove to be rather time-consuming. What we need is some form of approximation of the blur. This example uses the box blur aaproximation technique. Lets define the horizontal and total blur:

$$ b_h[i, j] = \sum_{x=j-br}^{j+br}{\frac{f[i,x]}{2*br}}$$

$$ b_t[i, j] = \sum_{y=j-br}^{j+br}{\frac{b_h[i, x]}{2*br}}$$

Where $br$ is the box filter radius. seeing that, we can boil down $b_t[i, j]$ to

$$b_t[i,j] = \sum_{y=i-br}^{i+br}{\sum_{x=j-br}^{j+br}{f[x,y]/(2*br)^2}} $$

This makes the time complexity of the algoritm to be O(nr).

Sheduling comparisson

One of the most popular ways to use OpenMP is to parallelize loops. This can be achieve using different verious scheduling options

  • static. This means loop iterations will ber equally distributed between threads
  • static, $x$. This means that each thread will get an $x$ amount of iterations
  • dynamic, $x$. This means that each thread get $x$ iterations and "asks" for new iterations after completing previously assigned ones.

I calculated the run-time of the programm using various policies, carious thread pool size and various chunk_size($x$) values. I also provided test results for single-thread runs just for good measure. The results are:

Graph1.png

Graph2.png

graph3.png

To get thus results, I used the average of 5 runs. The results show that maxing out the thread pool size with out passing over the harware limit of supported threads yields the best results. Dynamic and static scheduling show almost equal results on this algorithm. I think the reason behind this is that I still use a fixed-size thread pool for each run with out any other processes in the backgtound. Maxing out the chunk_size seems to limit the amount of threads used making the programm slower.

About

Comparisson between various OpenMP scheduling policies for multi-thread loop distribution based on the Gaussian Blur algorithm

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages