Parallel Computing '26: Exercise Sheet 3
(DRAFT) OpenCL and GPUs

Submission Deadline:

Processing Units

Describe in your own words the tradeoffs between a CPU and a GPU. Give an example of a task for which each is preferable, and explain why! How does your answer change if the CPU supports SIMD instructions?

OpenCL Kernels I

Explain why the following kernel could be slow, and briefly sketch how to improve its performance.

__kernel void
discretize(__global float *f_dis, int n)
{
  int i = get_global_id(0);
  int j = get_global_id(1);

  float a = 0;
  float b = 2 * M_PI;
  float h = (b - a) / n;

  if (i < n && j < n) {
    float x = a + i * h;
    float y = a + j * h;
    /* In OpenCL C, sin is type-generic, so no sinf needed. */
    f_dis[i * n + j] = -2 * sin(x + y);
  }
}

OpenCL Kernels II

What concurrency-related mistake has been made in the following implementation of (square) matrix transposition?

__kernel void
transpose(__global float *mat, int m)
{
  int i = get_global_id(0);
  int j = get_global_id(1);

  float tmp = mat[i * m + j];
  mat[i * m + j] = mat[j * m + i];
  mat[j * m + i] = tmp;
}
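
To see the effect of the mistake in practice, you can compare the kernel's output against the original input on the host. The following is only a minimal sketch in plain C: the helper is_transpose is hypothetical, and it assumes you kept a copy of the input matrix before running the kernel and that both matrices are m * m in row-major order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical host-side check (not part of the exercise): returns true
 * if and only if out is the transpose of in. Both are m * m row-major. */
static bool is_transpose(const float *in, const float *out, int m)
{
    assert(in != NULL && out != NULL && m > 0);

    for (int i = 0; i < m; i++) {
        for (int j = 0; j < m; j++) {
            /* Entry (j, i) of the result must equal entry (i, j) of the input. */
            if (out[(size_t)j * (size_t)m + (size_t)i] !=
                in[(size_t)i * (size_t)m + (size_t)j]) {
                return false;
            }
        }
    }
    return true;
}
```

Exact comparison is fine here because a transposition only moves values and performs no arithmetic on them.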

Remember to submit your answers in groups of two via Brightspace! If you cannot find a group on your own, please reach out and we will try to pair you up.