Local, Global, Constant, and Texture memory all reside off chip. Local, Constant, and Texture are all cached. Each SM has an L1 cache for global memory references. Access to shared memory is in the TB/s range, while global memory is an order of magnitude slower.

Each GPU has a constant memory, which is read-only with shorter latency and higher throughput; it is where constants and kernel arguments are stored.

Local memory is just thread-local global memory, used when data does not fit into registers. It is much slower than either registers or shared memory.

When threads in a warp load data from global memory, the system detects whether the accesses are consecutive and combines consecutive accesses into a single access to DRAM.

Shared memory is on-chip and much faster than local and global memory: its latency is roughly 100x lower than uncached global memory latency. Threads can access data in shared memory that was loaded from global memory by other threads within the same thread block, and access can be coordinated with thread synchronization (__syncthreads()) to avoid race conditions. Shared memory can be used for user-managed data caches and highly parallel data reductions, for example in a kernel such as:

__global__ void dynamicReverse(int *d, int n)

Shared memory is accessible by multiple threads. To reduce this potential bottleneck, it is divided into logical banks, with successive sections of memory assigned to successive banks: successive 32-bit words go to successive banks, and the bandwidth is 32 bits per bank per clock cycle. Each bank services only one thread request at a time; multiple simultaneous accesses from different threads to the same bank result in a bank conflict, and the accesses are serialized.

For devices of compute capability 1.x, the warp size is 32 threads and the number of banks is 16, so a warp's shared-memory access is split into 2 requests: one for the first half-warp and one for the second. For devices of compute capability 2.0, the warp size is 32 threads and the number of banks is also 32.
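The bare dynamicReverse signature above is only the start of a shared-memory kernel. A minimal sketch of a body for it, assuming n is no larger than the block size and the kernel is launched with n * sizeof(int) bytes of dynamic shared memory, might look like:

```
__global__ void dynamicReverse(int *d, int n)
{
    // Dynamically sized shared memory: the size is supplied at launch time
    // via the third <<<...>>> parameter, e.g.
    //   dynamicReverse<<<1, n, n * sizeof(int)>>>(d, n);
    extern __shared__ int s[];
    int t  = threadIdx.x;
    int tr = n - t - 1;
    s[t] = d[t];        // each thread stages one element into shared memory
    __syncthreads();    // wait until every thread's store is visible
    d[t] = s[tr];       // read an element that another thread loaded
}
```

This illustrates the two points made in the text: threads read shared-memory data loaded by other threads in the same block, and the __syncthreads() barrier prevents the race between the store to s[t] and the read of s[tr].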