CUDA

cosmos 30th March 2017 at 3:42pm
GPU computing

GPU computing, Parallel computing

CUDA model:

  1. Move data from CPU to GPU memory with cudaMemcpy
  2. Compute on the GPU with Kernels
    1. Launch blocks of threads (forming a grid of blocks). Why blocks? The GPU assigns each block to a Streaming Multiprocessor (SM); an SM may run more than one block. CUDA makes few guarantees about where and when thread blocks will run
  3. Move data back from GPU to CPU memory
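The three steps above can be sketched as a minimal host program. The kernel here (squaring each element) is just a hypothetical example to make the flow concrete; the real API calls are cudaMalloc, cudaMemcpy, the `<<<blocks, threads>>>` launch, and cudaFree.

```cuda
#include <cuda_runtime.h>
#include <stdio.h>

// Hypothetical kernel: each thread squares one element.
__global__ void square(float *d_out, const float *d_in) {
    int i = threadIdx.x;
    d_out[i] = d_in[i] * d_in[i];
}

int main(void) {
    const int N = 64;
    const size_t bytes = N * sizeof(float);
    float h_in[N], h_out[N];
    for (int i = 0; i < N; ++i) h_in[i] = (float)i;

    float *d_in, *d_out;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);

    // 1. Move data from CPU to GPU memory
    cudaMemcpy(d_in, h_in, bytes, cudaMemcpyHostToDevice);

    // 2. Compute on the GPU: here, 1 block of N threads
    square<<<1, N>>>(d_out, d_in);

    // 3. Move data back from GPU to CPU memory
    cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost);

    printf("%f\n", h_out[3]);  // 3 squared -> 9.0

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```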

Kernels look like serial code, but you specify the parallelism: the number of simultaneous threads, each of which executes a copy of the kernel.
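A sketch of what this looks like: each thread computes its own global index from the built-in variables blockIdx, blockDim, and threadIdx, and the launch configuration in the commented host code (256 threads per block, enough blocks to cover n) is an assumed example, not a required choice.

```cuda
// Each thread handles one array element, selected by its global index.
__global__ void add_one(float *d_x, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n)            // guard: the last block may overshoot n
        d_x[i] += 1.0f;
}

// Host-side launch (assumed configuration):
// int threads = 256;
// int blocks  = (n + threads - 1) / threads;  // ceiling division
// add_one<<<blocks, threads>>>(d_x, n);
```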

Memory model: full model

Need for synchronization!

Barrier
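Within a block, the barrier is `__syncthreads()`. A minimal sketch of why it is needed: reversing an array via shared memory, where every write to shared memory must be visible before any thread reads a neighbour's slot. The block size of 128 is an assumption for illustration.

```cuda
__global__ void reverse_block(float *d_arr) {
    __shared__ float s[128];        // assumes blockDim.x == 128
    int t = threadIdx.x;

    s[t] = d_arr[t];                // each thread writes one element
    __syncthreads();                // barrier: all writes done before any read
    d_arr[t] = s[blockDim.x - 1 - t];  // safe to read another thread's element
}
```

Without the barrier, a thread could read `s[blockDim.x - 1 - t]` before the thread responsible for that slot has written it.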

Introduction to parallel programming by nvidia in Udacity: https://classroom.udacity.com/courses/cs344/lessons/55120467/concepts/671181630923

What happens when many threads try to write to the same memory location? Atomic operations.
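A classic case is a histogram, where many threads increment the same bin. `atomicAdd` makes each read-modify-write indivisible, at the cost of serializing conflicting writes. The kernel below is an illustrative sketch, not from the course.

```cuda
__global__ void histogram(int *d_bins, const int *d_in, int n, int numBins) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        int bin = d_in[i] % numBins;
        atomicAdd(&d_bins[bin], 1);  // safe concurrent increment of a shared bin
    }
}
```

A plain `d_bins[bin] += 1` here would lose counts: two threads could read the same old value and both write old+1.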