CUB
BlockRadixSort provides variants of parallel radix sorting across a CUDA threadblock.
It can sort keys of built-in numeric types such as unsigned char, int, and double. Within each key, the implementation treats fixed-length bit sequences of RADIX_BITS bits as radix digit places. Although the direct radix sorting method applies only to unsigned integral types, BlockRadixSort can sort signed and floating-point types via simple bit-wise transformations that ensure lexicographic key ordering.
Template Parameters | |
| KeyType | Key type |
| BLOCK_THREADS | The threadblock size in threads |
| ITEMS_PER_THREAD | The number of items per thread |
| ValueType | [optional] Value type (default: cub::NullType) |
| RADIX_BITS | [optional] The number of radix bits per digit place (default: 5 bits) |
| SMEM_CONFIG | [optional] Shared memory bank mode (default: cudaSharedMemBankSizeFourByte) |
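The bit-wise transformations and RADIX_BITS digit places described above can be illustrated on the host. The following is a minimal sketch in plain C++, not the library's own code: the function names `signed_to_radix_bits`, `float_to_radix_bits`, and `extract_digit` are hypothetical, and the sketch assumes 32-bit keys and IEEE 754 floats.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: map a signed 32-bit key to unsigned bits whose
// unsigned ordering matches the signed ordering (flip the sign bit).
inline std::uint32_t signed_to_radix_bits(std::int32_t key) {
    return static_cast<std::uint32_t>(key) ^ 0x80000000u;
}

// Hypothetical helper: map an IEEE 754 float to order-preserving unsigned
// bits. Negative values have all bits flipped; non-negative values have
// only the sign bit flipped.
inline std::uint32_t float_to_radix_bits(float key) {
    std::uint32_t bits;
    std::memcpy(&bits, &key, sizeof bits);
    const std::uint32_t mask = (bits & 0x80000000u) ? 0xFFFFFFFFu : 0x80000000u;
    return bits ^ mask;
}

// Hypothetical helper: extract the radix digit of width radix_bits that
// starts at bit index `bit` (counted from the least-significant bit).
inline std::uint32_t extract_digit(std::uint32_t bits, unsigned bit, unsigned radix_bits) {
    assert(bit < 32 && radix_bits > 0 && radix_bits < 32);
    return (bits >> bit) & ((1u << radix_bits) - 1u);
}
```

With the default of 5 radix bits, a 32-bit key is split into digit places starting at bits 0, 5, 10, and so on, with the final place covering only the remaining 2 bits.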
After any sorting operation, a subsequent __syncthreads() barrier is required if the supplied BlockRadixSort::SmemStorage is to be reused or repurposed by the threadblock.
Sorting is most efficient when BLOCK_THREADS is a multiple of the architecture's warp size and KeyType is an unsigned integral type.
Public Types | |
| typedef _SmemStorage | SmemStorage |
The operations exposed by BlockRadixSort require shared memory of this type. This opaque storage can be allocated directly using the __shared__ keyword. Alternatively, it can be aliased to externally allocated shared memory or union'd with other types to facilitate shared memory reuse. | |
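The shared memory reuse mentioned above comes down to overlaying several storage types in one union. The sketch below shows only the size arithmetic on the host; `SortStorageStandIn` and `ScanStorageStandIn` are hypothetical placeholder types with made-up sizes, standing in for an opaque SmemStorage and some other block-wide storage type.

```cpp
#include <cstddef>

// Hypothetical stand-ins for two opaque storage layouts (sizes are invented).
struct SortStorageStandIn { unsigned int words[1024]; };
struct ScanStorageStandIn { unsigned int words[256]; };

// Overlaying both layouts: the union is as large as its largest member,
// not as large as the sum of its members.
union SharedStorageStandIn {
    SortStorageStandIn sort;
    ScanStorageStandIn scan;
};

constexpr std::size_t separate_bytes =
    sizeof(SortStorageStandIn) + sizeof(ScanStorageStandIn);
constexpr std::size_t unioned_bytes = sizeof(SharedStorageStandIn);
```

In device code such a union would be declared with the __shared__ keyword, and the threadblock would need a __syncthreads() barrier between the phase that uses one member and the phase that uses the other.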
Static Public Methods | |
Keys-only sorting | |
| static __device__ __forceinline__ void | SortBlocked (SmemStorage &smem_storage, KeyType(&keys)[ITEMS_PER_THREAD], unsigned int begin_bit=0, const unsigned int &end_bit=sizeof(KeyType)*8) |
| Performs a threadblock-wide radix sort over a blocked arrangement of keys. | |
| static __device__ __forceinline__ void | SortBlockedToStriped (SmemStorage &smem_storage, KeyType(&keys)[ITEMS_PER_THREAD], unsigned int begin_bit=0, const unsigned int &end_bit=sizeof(KeyType)*8) |
| Performs a radix sort across a blocked arrangement of keys, leaving them in a striped arrangement. | |
| static __device__ __forceinline__ void | SortStriped (SmemStorage &smem_storage, KeyType(&keys)[ITEMS_PER_THREAD], unsigned int begin_bit=0, const unsigned int &end_bit=sizeof(KeyType)*8) |
| Performs a radix sort across a striped arrangement of keys. | |
Key-value pair sorting | |
| static __device__ __forceinline__ void | SortBlocked (SmemStorage &smem_storage, KeyType(&keys)[ITEMS_PER_THREAD], ValueType(&values)[ITEMS_PER_THREAD], unsigned int begin_bit=0, const unsigned int &end_bit=sizeof(KeyType)*8) |
| Performs a radix sort across a blocked arrangement of keys and values. | |
| static __device__ __forceinline__ void | SortBlockedToStriped (SmemStorage &smem_storage, KeyType(&keys)[ITEMS_PER_THREAD], ValueType(&values)[ITEMS_PER_THREAD], unsigned int begin_bit=0, const unsigned int &end_bit=sizeof(KeyType)*8) |
| Performs a radix sort across a blocked arrangement of keys and values, leaving them in a striped arrangement. | |
| static __device__ __forceinline__ void | SortStriped (SmemStorage &smem_storage, KeyType(&keys)[ITEMS_PER_THREAD], ValueType(&values)[ITEMS_PER_THREAD], unsigned int begin_bit=0, const unsigned int &end_bit=sizeof(KeyType)*8) |
| Performs a radix sort across a striped arrangement of keys and values. | |
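The methods above differ in how the block's items are laid out across threads. As a rough illustration, the two arrangements can be written as index mappings from (thread, item) to a position in the block-wide sequence. The helper names `blocked_index` and `striped_index` are hypothetical and exist only for this sketch.

```cpp
#include <cstddef>

// Blocked arrangement: each thread owns ITEMS_PER_THREAD consecutive items,
// so thread t holds positions [t * items_per_thread, (t + 1) * items_per_thread).
constexpr std::size_t blocked_index(std::size_t thread, std::size_t item,
                                    std::size_t items_per_thread) {
    return thread * items_per_thread + item;
}

// Striped arrangement: consecutive positions belong to consecutive threads,
// so thread t holds positions t, t + block_threads, t + 2 * block_threads, ...
constexpr std::size_t striped_index(std::size_t thread, std::size_t item,
                                    std::size_t block_threads) {
    return item * block_threads + thread;
}
```

For example, with 128 threads and 4 items per thread, thread 1 holds positions 4 to 7 in the blocked arrangement and positions 1, 129, 257, and 385 in the striped arrangement.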
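| static __device__ __forceinline__ void SortBlocked (SmemStorage &smem_storage, KeyType(&keys)[ITEMS_PER_THREAD], unsigned int begin_bit=0, const unsigned int &end_bit=sizeof(KeyType)*8) | inline static |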
Performs a threadblock-wide radix sort over a blocked arrangement of keys.
A subsequent __syncthreads() threadblock barrier should be invoked after calling this method if the supplied smem_storage is to be reused or repurposed by the threadblock.
| [in] | smem_storage | Shared reference to opaque SmemStorage layout |
| [in,out] | keys | Keys to sort |
| [in] | begin_bit | [optional] The beginning (least-significant) bit index needed for key comparison |
| [in] | end_bit | [optional] The past-the-end (most-significant) bit index needed for key comparison |
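To make the begin_bit and end_bit parameters concrete, here is a host-side reference sketch of a least-significant-digit radix sort restricted to the bit range [begin_bit, end_bit). It is a sequential illustration under assumed 32-bit unsigned keys, not the threadblock-parallel implementation; `num_digit_passes` and `radix_sort_bit_range` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of digit passes needed to cover the bits [begin_bit, end_bit).
inline unsigned num_digit_passes(unsigned begin_bit, unsigned end_bit, unsigned radix_bits) {
    assert(begin_bit <= end_bit && radix_bits > 0);
    return (end_bit - begin_bit + radix_bits - 1) / radix_bits;
}

// Sequential LSD radix sort that compares keys only on bits [begin_bit, end_bit).
// Keys that are equal on those bits keep their input order.
inline void radix_sort_bit_range(std::vector<std::uint32_t>& keys,
                                 unsigned begin_bit = 0, unsigned end_bit = 32,
                                 unsigned radix_bits = 5) {
    assert(begin_bit <= end_bit && end_bit <= 32);
    assert(radix_bits > 0 && radix_bits <= 16);
    std::vector<std::uint32_t> scratch(keys.size());
    for (unsigned bit = begin_bit; bit < end_bit; bit += radix_bits) {
        // The last pass may cover fewer than radix_bits bits.
        const unsigned width = std::min(radix_bits, end_bit - bit);
        const std::uint32_t mask = (1u << width) - 1u;
        // Histogram of digits, shifted by one so the prefix sum yields start offsets.
        std::vector<std::size_t> offsets((std::size_t(1) << width) + 1, 0);
        for (std::uint32_t k : keys) ++offsets[((k >> bit) & mask) + 1];
        for (std::size_t d = 1; d < offsets.size(); ++d) offsets[d] += offsets[d - 1];
        // Stable scatter of each key to its digit's next free slot.
        for (std::uint32_t k : keys) scratch[offsets[(k >> bit) & mask]++] = k;
        keys.swap(scratch);
    }
}
```

Narrowing the bit range reduces the number of passes: with 5 radix bits, the full 32-bit range takes 7 passes, while keys known to fit in 8 bits need only 2.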
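| static __device__ __forceinline__ void SortBlockedToStriped (SmemStorage &smem_storage, KeyType(&keys)[ITEMS_PER_THREAD], unsigned int begin_bit=0, const unsigned int &end_bit=sizeof(KeyType)*8) | inline static |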
Performs a radix sort across a blocked arrangement of keys, leaving them in a striped arrangement.
A subsequent __syncthreads() threadblock barrier should be invoked after calling this method if the supplied smem_storage is to be reused or repurposed by the threadblock.
| [in] | smem_storage | Shared reference to opaque SmemStorage layout |
| [in,out] | keys | Keys to sort |
| [in] | begin_bit | [optional] The beginning (least-significant) bit index needed for key comparison |
| [in] | end_bit | [optional] The past-the-end (most-significant) bit index needed for key comparison |
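| static __device__ __forceinline__ void SortStriped (SmemStorage &smem_storage, KeyType(&keys)[ITEMS_PER_THREAD], unsigned int begin_bit=0, const unsigned int &end_bit=sizeof(KeyType)*8) | inline static |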
Performs a radix sort across a striped arrangement of keys.
A subsequent __syncthreads() threadblock barrier should be invoked after calling this method if the supplied smem_storage is to be reused or repurposed by the threadblock.
| [in] | smem_storage | Shared reference to opaque SmemStorage layout |
| [in,out] | keys | Keys to sort |
| [in] | begin_bit | [optional] The beginning (least-significant) bit index needed for key comparison |
| [in] | end_bit | [optional] The past-the-end (most-significant) bit index needed for key comparison |
|
inlinestatic |
Performs a radix sort across a blocked arrangement of keys and values.
A subsequent __syncthreads() threadblock barrier should be invoked after calling this method if the supplied smem_storage is to be reused or repurposed by the threadblock.
| [in] | smem_storage | Shared reference to opaque SmemStorage layout |
| [in,out] | keys | Keys to sort |
| [in,out] | values | Values to sort |
| [in] | begin_bit | [optional] The beginning (least-significant) bit index needed for key comparison |
| [in] | end_bit | [optional] The past-the-end (most-significant) bit index needed for key comparison |
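For key-value sorting, each value travels with its key. The following host-side reference sketch shows the expected relationship between the two arrays after a sort. It deliberately uses std::stable_sort on an index permutation instead of a radix sort, and it keeps ties in input order, which is what a least-significant-digit radix sort also produces. The name `sort_pairs_reference` is hypothetical, and 32-bit unsigned keys are assumed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Reorders keys and values together, comparing keys only on bits
// [begin_bit, end_bit). Not a radix sort: std::stable_sort stands in for it.
template <typename ValueType>
void sort_pairs_reference(std::vector<std::uint32_t>& keys,
                          std::vector<ValueType>& values,
                          unsigned begin_bit = 0, unsigned end_bit = 32) {
    assert(keys.size() == values.size());
    assert(begin_bit <= end_bit && end_bit <= 32);
    const unsigned width = end_bit - begin_bit;
    if (width == 0) return;  // no bits to compare: order is unchanged
    const std::uint32_t low = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    const std::uint32_t mask = low << begin_bit;

    // Sort a permutation of indices by the masked key.
    std::vector<std::size_t> order(keys.size());
    std::iota(order.begin(), order.end(), std::size_t(0));
    std::stable_sort(order.begin(), order.end(), [&](std::size_t a, std::size_t b) {
        return (keys[a] & mask) < (keys[b] & mask);
    });

    // Apply the same permutation to both arrays.
    std::vector<std::uint32_t> sorted_keys(keys.size());
    std::vector<ValueType> sorted_values(values.size());
    for (std::size_t i = 0; i < order.size(); ++i) {
        sorted_keys[i] = keys[order[i]];
        sorted_values[i] = values[order[i]];
    }
    keys.swap(sorted_keys);
    values.swap(sorted_values);
}
```

A reference like this can be handy for checking the output of a device-side sort in a unit test, since it accepts the same bit-range arguments.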
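| static __device__ __forceinline__ void SortBlockedToStriped (SmemStorage &smem_storage, KeyType(&keys)[ITEMS_PER_THREAD], ValueType(&values)[ITEMS_PER_THREAD], unsigned int begin_bit=0, const unsigned int &end_bit=sizeof(KeyType)*8) | inline static |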
Performs a radix sort across a blocked arrangement of keys and values, leaving them in a striped arrangement.
A subsequent __syncthreads() threadblock barrier should be invoked after calling this method if the supplied smem_storage is to be reused or repurposed by the threadblock.
| [in] | smem_storage | Shared reference to opaque SmemStorage layout |
| [in,out] | keys | Keys to sort |
| [in,out] | values | Values to sort |
| [in] | begin_bit | [optional] The beginning (least-significant) bit index needed for key comparison |
| [in] | end_bit | [optional] The past-the-end (most-significant) bit index needed for key comparison |
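| static __device__ __forceinline__ void SortStriped (SmemStorage &smem_storage, KeyType(&keys)[ITEMS_PER_THREAD], ValueType(&values)[ITEMS_PER_THREAD], unsigned int begin_bit=0, const unsigned int &end_bit=sizeof(KeyType)*8) | inline static |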
Performs a radix sort across a striped arrangement of keys and values.
A subsequent __syncthreads() threadblock barrier should be invoked after calling this method if the supplied smem_storage is to be reused or repurposed by the threadblock.
| [in] | smem_storage | Shared reference to opaque SmemStorage layout |
| [in,out] | keys | Keys to sort |
| [in,out] | values | Values to sort |
| [in] | begin_bit | [optional] The beginning (least-significant) bit index needed for key comparison |
| [in] | end_bit | [optional] The past-the-end (most-significant) bit index needed for key comparison |