GPU computing, Parallel computing
cudaMemcpy: copies data between host (CPU) memory and device (GPU) memory
Kernels look like serial code, but at launch you specify the parallelism: the number of simultaneous threads, each of which executes its own copy of the kernel.
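A minimal sketch of the kernel idea, assuming a single block of 64 threads and a hypothetical kernel named `square` (the names, the array size, and the launch configuration are illustrative choices, not taken from these notes). The kernel body reads like serial code for one element; the `<<<1, N>>>` launch is where the parallelism is specified. Error checking is omitted for brevity.

```cuda
#include <cassert>
#include <cstdio>
#include <cuda_runtime.h>

// Kernel: written as if it handled one element serially.
// Every launched thread runs its own copy of this function.
__global__ void square(float *d_out, const float *d_in) {
    int idx = threadIdx.x;   // this thread's index within the block
    float f = d_in[idx];
    d_out[idx] = f * f;
}

int main() {
    const int N = 64;                        // illustrative size (assumption)
    const size_t bytes = N * sizeof(float);

    float h_in[N], h_out[N];                 // host (CPU) arrays
    for (int i = 0; i < N; ++i) h_in[i] = float(i);

    float *d_in = nullptr, *d_out = nullptr; // device (GPU) arrays
    cudaMalloc((void **)&d_in, bytes);
    cudaMalloc((void **)&d_out, bytes);

    // Host -> device copy, kernel launch, device -> host copy.
    cudaMemcpy(d_in, h_in, bytes, cudaMemcpyHostToDevice);
    square<<<1, N>>>(d_out, d_in);           // 1 block, N threads: N copies of the kernel
    cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost);

    for (int i = 0; i < N; ++i) assert(h_out[i] == float(i) * float(i));
    printf("all %d squares match\n", N);

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

Compile with something like `nvcc square.cu -o square`. The blocking `cudaMemcpy` back to the host also waits for the kernel to finish, so no explicit device synchronization is needed here.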
Need for synchronization!
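Barrier: a point in the program where threads stop and wait until all threads have reached it, then proceed.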
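A small sketch of a barrier in CUDA, assuming the block-level barrier `__syncthreads()` and a hypothetical kernel `shift_left` that shifts a shared array one position to the left (the kernel name and array size are illustrative). Each barrier separates a phase in which threads write shared memory from a phase in which other threads read it; without them, a thread could read a neighbor's slot before it is written, or overwrite a slot before its neighbor has read it.

```cuda
#include <cassert>
#include <cstdio>
#include <cuda_runtime.h>

constexpr int N = 128;   // threads per block; illustrative (assumption)

__global__ void shift_left(int *d_out) {
    __shared__ int s[N];
    int idx = threadIdx.x;

    s[idx] = idx;        // phase 1: every thread writes its own slot
    __syncthreads();     // barrier: all writes finish before any read

    // phase 2: read the neighbor's slot (the last thread keeps its own value)
    int next = (idx < N - 1) ? s[idx + 1] : s[idx];
    __syncthreads();     // barrier: all reads finish before any overwrite

    s[idx] = next;       // phase 3: safe to overwrite now
    d_out[idx] = s[idx]; // each thread reads back only its own slot
}

int main() {
    int h_out[N];
    int *d_out = nullptr;
    cudaMalloc((void **)&d_out, N * sizeof(int));

    shift_left<<<1, N>>>(d_out);
    cudaMemcpy(h_out, d_out, N * sizeof(int), cudaMemcpyDeviceToHost);

    for (int i = 0; i < N - 1; ++i) assert(h_out[i] == i + 1);
    assert(h_out[N - 1] == N - 1);
    printf("shift verified\n");

    cudaFree(d_out);
    return 0;
}
```

Note that `__syncthreads()` only synchronizes threads within one block; it is not a barrier across the whole grid.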
Introduction to Parallel Programming by NVIDIA on Udacity: https://classroom.udacity.com/courses/cs344/lessons/55120467/concepts/671181630923
What happens when many threads try to write to the same memory location? Unsynchronized read-modify-write updates can be lost; atomic operations serialize them.
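A hedged sketch of the problem and the fix, assuming about a million threads incrementing a 16-element counter array (the sizes and the kernel names `increment_naive` and `increment_atomic` are illustrative assumptions). The naive kernel has a data race, so its result is not checked; the atomic version uses `atomicAdd`, which makes each read-modify-write indivisible, so every increment is counted.

```cuda
#include <cassert>
#include <cstdio>
#include <cuda_runtime.h>

constexpr int NUM_THREADS = 1 << 20;  // illustrative sizes (assumptions)
constexpr int BLOCK_SIZE  = 1024;
constexpr int ARRAY_SIZE  = 16;

// Racy: many threads read, add, and write the same slot at once,
// so increments can overwrite each other and be lost.
__global__ void increment_naive(int *g) {
    int i = (blockIdx.x * blockDim.x + threadIdx.x) % ARRAY_SIZE;
    g[i] = g[i] + 1;
}

// Correct: atomicAdd performs the read-modify-write as one indivisible step.
__global__ void increment_atomic(int *g) {
    int i = (blockIdx.x * blockDim.x + threadIdx.x) % ARRAY_SIZE;
    atomicAdd(&g[i], 1);
}

int main() {
    int h[ARRAY_SIZE];
    int *d = nullptr;
    const size_t bytes = ARRAY_SIZE * sizeof(int);
    cudaMalloc((void **)&d, bytes);

    // Naive version: the totals are typically far too small and vary run to run.
    cudaMemset(d, 0, bytes);
    increment_naive<<<NUM_THREADS / BLOCK_SIZE, BLOCK_SIZE>>>(d);
    cudaMemcpy(h, d, bytes, cudaMemcpyDeviceToHost);
    printf("naive  g[0] = %d (expected %d)\n", h[0], NUM_THREADS / ARRAY_SIZE);

    // Atomic version: every increment is counted.
    cudaMemset(d, 0, bytes);
    increment_atomic<<<NUM_THREADS / BLOCK_SIZE, BLOCK_SIZE>>>(d);
    cudaMemcpy(h, d, bytes, cudaMemcpyDeviceToHost);
    for (int i = 0; i < ARRAY_SIZE; ++i) assert(h[i] == NUM_THREADS / ARRAY_SIZE);
    printf("atomic g[0] = %d\n", h[0]);

    cudaFree(d);
    return 0;
}
```

The trade-off is speed: atomics on the same address are serialized, so heavy contention on a few locations slows the kernel down.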