Processing Units
Describe in your own words the tradeoffs between a CPU and a GPU. Give an example of a task for which each is preferable, and explain why. How does the comparison change if the CPU supports SIMD instructions?
OpenCL Kernels I
Explain why the following kernel could be slow, and briefly sketch how to improve its performance.
__kernel void
discretize(__global float *f_dis, int n)
{
    int i = get_global_id(0);
    int j = get_global_id(1);
    float a = 0;
    float b = 2 * M_PI;
    float h = (b - a) / n;
    if (i < n && j < n) {
        float x = a + i * h;
        float y = a + j * h;
        /* In OpenCL C, sin is type-generic, so no sinf needed. */
        f_dis[i * n + j] = -2 * sin(x + y);
    }
}
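One possible line of answer, sketched under assumptions: on many GPUs, work-items with consecutive get_global_id(0) execute together, and their memory accesses tend to be fastest when they touch consecutive addresses. In the kernel as given, get_global_id(0) selects the row, so neighboring work-items write n elements apart. The plain C sketch below compares the two index mappings. The helper names index_original, index_swapped and stride_dim0 are hypothetical, and the hardware behavior is an assumption about typical GPUs, not something the OpenCL specification guarantees.

```c
#include <stddef.h>

/* Index written by work-item (gid0, gid1) in the kernel as given:
 * i = get_global_id(0) selects the row. */
static size_t index_original(size_t gid0, size_t gid1, size_t n)
{
    return gid0 * n + gid1;
}

/* Index after swapping the roles (hypothetical variant):
 * get_global_id(0) now selects the column. */
static size_t index_swapped(size_t gid0, size_t gid1, size_t n)
{
    return gid1 * n + gid0;
}

/* Distance, in elements, between the writes of two work-items that are
 * neighbors in dimension 0 (gid0 = 0 and gid0 = 1, same gid1). */
static size_t stride_dim0(size_t (*index)(size_t, size_t, size_t), size_t n)
{
    return index(1, 0, n) - index(0, 0, n);
}
```

For n = 1024, stride_dim0(index_original, 1024) is 1024 while stride_dim0(index_swapped, 1024) is 1. In the kernel, the swapped mapping would amount to writing f_dis[j * n + i]; because sin(x + y) is symmetric in x and y, the stored values stay the same in this particular case. Using the single-precision constant M_PI_F instead of M_PI may also help on devices with slow or missing double-precision support.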
OpenCL Kernels II
What concurrency-related mistake has been made in the following implementation of in-place transposition of a square matrix?
__kernel void
transpose(__global float *mat, int m)
{
    int i = get_global_id(0);
    int j = get_global_id(1);
    float tmp = mat[i * m + j];
    mat[i * m + j] = mat[j * m + i];
    mat[j * m + i] = tmp;
}
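One possible answer, sketched under assumptions: work-items (i, j) and (j, i) both read and write the same pair of elements without synchronization, so they race, and even if they ran one after the other the two swaps would cancel out. A minimal fix is to let only the work-item above the diagonal perform the swap. The plain C sketch below emulates the work-item grid sequentially so that it compiles without an OpenCL runtime. The names transpose_item and transpose_all are hypothetical, and i and j stand in for get_global_id(0) and get_global_id(1).

```c
#include <stddef.h>

/* Body of one work-item; i and j stand in for get_global_id(0) and
 * get_global_id(1). Hypothetical helper, not the original kernel. */
static void transpose_item(float *mat, int m, int i, int j)
{
    /* Only the work-item strictly above the diagonal touches the pair
     * (i, j) / (j, i), so no element is accessed by two work-items.
     * The bounds check also guards against an oversized global range. */
    if (i < m && j < m && i < j) {
        float tmp = mat[i * m + j];
        mat[i * m + j] = mat[j * m + i];
        mat[j * m + i] = tmp;
    }
}

/* Sequential emulation of an m-by-m work-item grid (illustration only;
 * on a device the work-items may run in any order or concurrently). */
static void transpose_all(float *mat, int m)
{
    for (int i = 0; i < m; ++i)
        for (int j = 0; j < m; ++j)
            transpose_item(mat, m, i, j);
}
```

Because each pair is now handled by exactly one work-item, the result does not depend on the execution order. The cost is that roughly half of the work-items do nothing; writing into a separate output buffer would be an alternative that avoids the conflict altogether.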