## Terminology: Concurrency and Parallelism
Give an example of a problem that can be solved concurrently but not in parallel, and one that can be solved in parallel but not concurrently. If you think one of these does not exist, explain why.
## Speedup Arithmetic
Consider two programs, one sequential and one parallel. Executing them on a shared-memory multicore system yields the following results:
| # Cores | Seq. | Par. |
|---|---|---|
| 1 | 27s | 34s |
| 2 | 27s | 18s |
| 4 | 27s | 10s |
| 8 | 27s | 7s |
Compute the speedup and efficiency for 2, 4 and 8 cores and explain the results. Briefly comment on why the sequential version is faster than the parallel version on a single-core system.
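Both quantities are simple ratios; a minimal C sketch (function names are my own) shows how they are computed — interpreting the resulting numbers is left to you:

```c
/* Speedup S(p) = T_seq / T_par(p); efficiency E(p) = S(p) / p. */
double speedup(double t_seq, double t_par)
{
    return t_seq / t_par;
}

double efficiency(double t_seq, double t_par, int p)
{
    return speedup(t_seq, t_par) / p;
}
```

For instance, plugging in the two-core row of the table, `speedup(27.0, 18.0)` evaluates to 1.5 and `efficiency(27.0, 18.0, 2)` to 0.75.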
## Parallel Execution and Data Races
Consider the following tasks and explain whether or not data races could occur. Assume multiple actors are doing the same operation in parallel. If they can occur, suggest the most appropriate means of preventing them.
- Adding a value to a binary tree.
- Finding a value in a binary tree.
- Incrementing an integer that fits in a machine word.
- Creating a new file on a multi-user system.
- Replacing an existing file on a multi-user system.
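To make the integer-increment case concrete, here is a minimal sketch (names are my own, not from the exercise) using OpenMP's `atomic` construct. A plain `counter++` compiles to a load, an add, and a store; two threads can read the same old value and one increment is lost. Whether `atomic` is the *most appropriate* fix is exactly what the exercise asks you to argue:

```c
/* Increments a shared counter n times from parallel threads. */
long parallel_count(int n)
{
    long counter = 0;

    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        #pragma omp atomic   /* read-modify-write becomes indivisible */
        counter++;
    }
    return counter;          /* equals n; without the atomic it may be less */
}
```

Compiled with `-fopenmp`, `parallel_count(100000)` reliably returns 100000; with the `atomic` line removed, it usually does not.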
## OpenMP: Parallelization
Judge whether or not you can correctly parallelize these loops using OpenMP just by adding a `#pragma omp`, without otherwise adjusting the program. Briefly explain your answer, and if you think it is possible, give the pragma you would use.
```c
/* (a) */ for (int i = 0; i < N; ++i)        { a[i] = a[i] * a[i+1]; }

/* (b) */ for (int i = 0; i < N-1; i += 2)   { a[i] = a[i] * a[i+1]; }

/* (c) */ bool any = false;
          for (int i = 0; i < N; ++i)        { any = any || c[i]; }
```
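For contrast, here is a loop (my own example, not one of the three above) where a single pragma suffices, so you can see what "just adding a `#pragma omp`" looks like in the unproblematic case:

```c
/* Each iteration reads and writes only a[i]: there is no
   cross-iteration dependence, so the pragma alone is safe. */
void square_all(double a[], int n)
{
    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
        a[i] = a[i] * a[i];
}
```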
## OpenMP: Reduction
The lectures introduced *reduction*. Explain in your own words what the advantage is of reducing data instead of manually iterating. Specifically go into the dangers and difficulties that you can avoid dealing with if your problem matches the reduction pattern.

To illustrate this, come up with a (simple) problem where you can use reduction, and contrast it with one where you cannot.
**Hint:** What algebraic properties does the reduction operation need to exhibit?
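As a point of reference, the classic array sum fits the pattern; a minimal sketch of the `reduction` clause (my own example, assuming the clause as covered in the lecture):

```c
/* Sums an array. Each thread accumulates into its own private
   copy of 'total'; OpenMP combines the copies with '+' after
   the loop, so no manual locking or merging is needed. */
double array_sum(const double a[], int n)
{
    double total = 0.0;

    #pragma omp parallel for reduction(+:total)
    for (int i = 0; i < n; ++i)
        total += a[i];
    return total;
}
```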
## OpenMP: Schedulers
As you might know, `is_prime(n)` is computationally expensive, while `is_coprime(n, m)` is cheap: to determine whether two positive integers n and m are coprime, you only have to check that gcd(n, m) = 1. Both functions are used in the following program:
```c
#define N 128

void foo(int in[N], bool out[N])
{
    for (int i = 0; i < N; ++i) {
        if (i & 1) { /* i is odd */
            out[i] = is_prime(in[i]);
        } else {     /* i is even */
            out[i] = is_coprime(in[i], in[(i+1) % N]);
        }
    }
}
```
Explain what the general efficiency issue is when you attempt to execute the program in parallel using OpenMP. Which of the three schedulers mentioned in the lecture (static, dynamic and guided) should you use, and why? Write out the OpenMP pragma you would use.
**Hint:** What do the respective schedulers do when a thread runs out of work?
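Without giving the answer away: `schedule(runtime)` defers the choice to the `OMP_SCHEDULE` environment variable, so you can time all three candidates on `foo` without recompiling. A sketch on a placeholder loop of my own:

```c
/* The scheduler is read from OMP_SCHEDULE at run time,
   e.g. OMP_SCHEDULE="static" / "dynamic,4" / "guided,4". */
void double_all(double a[], int n)
{
    #pragma omp parallel for schedule(runtime)
    for (int i = 0; i < n; ++i)
        a[i] = 2.0 * a[i];
}
```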