- 19 May, 2020 1 commit
-
-
David Olsen authored
Reviewed-by:
Bryce Adelstein Lelbach aka wash <brycelelbach@gmail.com>
-
- 21 Feb, 2020 2 commits
-
-
Bryce Adelstein Lelbach aka wash authored
-
Bryce Adelstein Lelbach aka wash authored
-
- 12 Feb, 2020 1 commit
-
-
Michał 'Griwes' Dominiak authored
It seems that we're getting these by trying to instantiate the CUB templates on other instantiation paths.
-
- 07 Feb, 2020 2 commits
-
-
Michał 'Griwes' Dominiak authored
Bug 2837899 Bug 2838747 Bug 2838948 Bug 2838990
-
Bryce Adelstein Lelbach aka wash authored
specialization. Bug 2796705 Reviewed-by:
Michał 'Griwes' Dominiak <griwes@griwes.info>
-
- 06 Feb, 2020 1 commit
-
-
Bryce Adelstein Lelbach aka wash authored
calling cudaGetLastError. Otherwise, if the CUDA API call is followed directly by a kernel launch, checking for a synchronous error during the kernel launch by calling cudaGetLastError may potentially return the error code from the CUDA API call. This type of error leakage is very subtle and difficult to trace. Also, update Makefile to remove old architectures and allow you to override the NVCC variant used. Bug 2720132 Bug 2808654 Reviewed-by:
Michał 'Griwes' Dominiak <griwes@griwes.info>
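The error leakage described in this commit can be illustrated with a small host-only model. This is a sketch under stated assumptions, not the CUB implementation: `MockRuntime`, `MockError`, and both `LaunchAfterFailure*` helpers are hypothetical names that only imitate the "last error" slot that `cudaGetLastError` reads and resets.

```cpp
// Hypothetical stand-in for the runtime's per-thread "last error" slot.
enum MockError { kSuccess = 0, kInvalidValue = 1 };

struct MockRuntime {
    MockError last_error = kSuccess;

    // Models an API call that fails: it returns the error and records it.
    MockError FailingApiCall() {
        last_error = kInvalidValue;
        return kInvalidValue;
    }

    // Models cudaGetLastError: returns the recorded error and resets the slot.
    MockError GetLastError() {
        MockError e = last_error;
        last_error = kSuccess;
        return e;
    }

    // Models a kernel launch that succeeds and leaves the slot untouched.
    void LaunchKernelOk() {}
};

// Leaky pattern: the API error stays in the slot, so the check that follows
// the (successful) launch reports the earlier, unrelated error.
inline MockError LaunchAfterFailureLeaky(MockRuntime& rt) {
    rt.FailingApiCall();
    rt.LaunchKernelOk();
    return rt.GetLastError();
}

// Pattern from the commit message: reset the slot right after the API call,
// so the later check only reflects the launch itself.
inline MockError LaunchAfterFailureCleared(MockRuntime& rt) {
    if (rt.FailingApiCall() != kSuccess) {
        rt.GetLastError();  // discard the already-handled API error
    }
    rt.LaunchKernelOk();
    return rt.GetLastError();
}
```

In the model, the leaky variant reports `kInvalidValue` for a launch that succeeded, while the cleared variant reports `kSuccess`.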
-
- 29 Jan, 2020 1 commit
-
-
Bryce Adelstein Lelbach aka wash authored
`cub::PtxVersion` and `cub::SmVersion`. These CUDA APIs acquire locks to CUDA driver/runtime mutex and perform poorly under contention. Caching is only done in C++11 as we need a guarantee of thread-safe initialization of statics. Caching is multi-device aware; a `cub::SwitchDevice` RAII class and `cub::CurrentDevice` function have been added to facilitate this. Bug 2824145 Bug 2808654 Reviewed-by:
Michał 'Griwes' Dominiak <griwes@griwes.info>
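A minimal host-only sketch of the caching idea follows. It is an illustration under assumptions, not CUB's code: the `sketch` namespace, `kMaxDevices`, `SlowQueryPtxVersion`, `QueryCount`, and `CachedPtxVersion` are hypothetical, the "current device" is a plain thread-local integer rather than the CUDA runtime's state, and the per-device atomics are one possible way to store entries. The `SwitchDevice` and `CurrentDevice` names mirror those in the commit message, but their bodies here are mock-ups.

```cpp
#include <array>
#include <atomic>
#include <cstddef>

namespace sketch {

constexpr int kMaxDevices = 4;  // hypothetical upper bound for the cache

// Mock "current device" for the calling thread.
inline int& CurrentDeviceRef() {
    thread_local int device = 0;
    return device;
}
inline int CurrentDevice() { return CurrentDeviceRef(); }

// RAII guard: switches the current device and restores the old one on exit.
class SwitchDevice {
public:
    explicit SwitchDevice(int new_device) : old_device_(CurrentDevice()) {
        CurrentDeviceRef() = new_device;
    }
    ~SwitchDevice() { CurrentDeviceRef() = old_device_; }
    SwitchDevice(const SwitchDevice&) = delete;
    SwitchDevice& operator=(const SwitchDevice&) = delete;

private:
    int old_device_;
};

// Counts how often the slow query runs, so caching can be observed.
inline std::atomic<int>& QueryCount() {
    static std::atomic<int> count{0};
    return count;
}

// Stand-in for an expensive, lock-taking runtime query (never returns 0).
inline int SlowQueryPtxVersion(int device) {
    ++QueryCount();
    return 700 + device * 10;
}

// Per-device cache. The function-local static relies on the C++11 guarantee
// that its initialization is thread-safe; 0 marks "not cached yet".
inline int CachedPtxVersion(int device) {
    static std::array<std::atomic<int>, kMaxDevices> cache{};
    std::atomic<int>& slot = cache.at(static_cast<std::size_t>(device));
    int value = slot.load();
    if (value == 0) {
        SwitchDevice guard(device);  // query on the requested device
        value = SlowQueryPtxVersion(CurrentDevice());
        slot.store(value);           // racing threads store the same value
    }
    return value;
}

}  // namespace sketch
```

Repeated calls to `sketch::CachedPtxVersion(1)` run the slow query once, and the caller's current device is unchanged afterwards because the guard restores it.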
-
- 19 Oct, 2019 1 commit
-
-
Bryce Adelstein Lelbach aka wash authored
version check, because `CUDA_VERSION` is only available if you've included `<cuda.h>`.
-
- 15 Oct, 2019 8 commits
-
-
Bryce Adelstein Lelbach aka wash authored
-
Bryce Adelstein Lelbach aka wash authored
-
Bryce Adelstein Lelbach aka wash authored
`zip_iterator`/`discard_iterator` tests unhappy.
-
Bryce Adelstein Lelbach aka wash authored
-
Bryce Adelstein Lelbach aka wash authored
-
Bryce Adelstein Lelbach aka wash authored
can control the intermediate type.
-
Bryce Adelstein Lelbach aka wash authored
(which is more permissive) in `AgentReduce` to allow a wider range of types to be used with it.
-
Bryce Adelstein Lelbach aka wash authored
handle the case where there are 0 input items per warp.
-
- 17 Jul, 2019 1 commit
-
-
Duane Merrill authored
-
- 31 Jan, 2019 2 commits
-
-
Duane Merrill authored
-
Duane Merrill authored
-
- 11 Jan, 2019 1 commit
-
-
Duane Merrill authored
Fix a race for the CUDA event between DeviceFree() and DeviceAllocate() in the allocator
-
- 20 Dec, 2018 1 commit
-
-
Matti Kortelainen authored
This commit fixes a race for the CUDA event between DeviceFree() and DeviceAllocate(). If the mutex is unlocked before the cudaEventRecord(), there is a short period of time when
  - the memory block is already in the free list (cached_blocks), and
  - the CUDA event status is not yet cudaErrorNotReady

and the DeviceAllocate() may consider that memory block to be free to be used for another CUDA stream.
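The ordering problem can be reproduced deterministically in a host-only model. The sketch below rests on assumptions and is not the allocator's actual code: `MockEvent`, `Block`, `MockAllocator`, and `DeviceFreeRacy` are hypothetical, and the `window` callback stands in for another thread that runs between the unlock and the event record.

```cpp
#include <mutex>
#include <vector>

// Hypothetical stand-in for a CUDA event. Ready() is false only while
// recorded work is still outstanding.
struct MockEvent {
    bool recorded = false;
    bool work_done = false;
    void Record() { recorded = true; work_done = false; }
    bool Ready() const { return !recorded || work_done; }
};

struct Block {
    int stream = 0;   // stream the block was last used on
    MockEvent ready;  // signals when that stream's work has finished
};

class MockAllocator {
public:
    // Fixed ordering: the event is recorded while the mutex is still held,
    // so no other thread can see the block in the free list before that.
    void DeviceFree(Block* b) {
        std::lock_guard<std::mutex> lock(mutex_);
        cached_blocks_.push_back(b);
        b->ready.Record();
    }

    // Racy ordering: the mutex is released before the event is recorded.
    template <class F>
    void DeviceFreeRacy(Block* b, F window) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            cached_blocks_.push_back(b);
        }
        window();           // another thread may allocate here
        b->ready.Record();  // too late: the block may already be reused
    }

    // Reuses a cached block if it belongs to the same stream or its event
    // reports that the previous stream's work is complete.
    Block* DeviceAllocate(int stream) {
        std::lock_guard<std::mutex> lock(mutex_);
        for (auto it = cached_blocks_.begin(); it != cached_blocks_.end(); ++it) {
            Block* b = *it;
            if (b->stream == stream || b->ready.Ready()) {
                cached_blocks_.erase(it);
                b->stream = stream;
                return b;
            }
        }
        return nullptr;
    }

private:
    std::mutex mutex_;
    std::vector<Block*> cached_blocks_;
};
```

With `DeviceFreeRacy`, an allocation for a different stream issued inside the window receives the block even though its work has not been recorded yet. With `DeviceFree`, the same allocation gets nothing until the event reports completion.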
-
- 16 Feb, 2018 1 commit
-
-
dumerrill authored
-
- 15 Feb, 2018 3 commits
-
-
Duane Merrill authored
-
Duane Merrill authored
-
Duane Merrill authored
-
- 14 Feb, 2018 2 commits
-
-
-
https://github.com/NVlabs/cub/issues/112
Duane Merrill authored
The issue was that the shfl constant was not being set up properly. Refactored the shfl scans and reductions so that lane_id is always relative to the logical warp (not the physical warp).
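To make "relative to the logical warp" concrete, here is a small host-side sketch. It assumes a 32-lane physical warp and a power-of-two logical warp width; `LogicalLane`, `ToLogicalLane`, and the member-mask formula are illustrative assumptions, not code taken from the commit.

```cpp
#include <cassert>
#include <cstdint>

struct LogicalLane {
    unsigned warp_id;           // logical warp index inside the physical warp
    unsigned lane_id;           // lane index relative to that logical warp
    std::uint32_t member_mask;  // physical lanes that form the logical warp
};

// Maps a physical lane (0..31) to its position in a logical warp of
// `logical_warp_threads` lanes (a power of two between 1 and 32).
inline LogicalLane ToLogicalLane(unsigned physical_lane,
                                 unsigned logical_warp_threads) {
    assert(physical_lane < 32);
    assert(logical_warp_threads >= 1 && logical_warp_threads <= 32);
    assert((logical_warp_threads & (logical_warp_threads - 1)) == 0);

    LogicalLane out;
    out.warp_id = physical_lane / logical_warp_threads;
    out.lane_id = physical_lane % logical_warp_threads;
    // Low `logical_warp_threads` bits set, shifted to this logical warp.
    out.member_mask = (0xffffffffu >> (32 - logical_warp_threads))
                      << (out.warp_id * logical_warp_threads);
    return out;
}
```

For example, with logical warps of 8 lanes, physical lane 13 becomes lane 5 of logical warp 1, so per-lane bounds in a scan or reduction are computed against 8 rather than 32.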
-
- 09 Feb, 2018 1 commit
-
-
dumerrill authored
-
- 08 Feb, 2018 10 commits
-
-
-
Duane Merrill authored
-
Duane Merrill authored
-
Duane Merrill authored
https://github.com/NVlabs/cub/issues/127 Applied reduction-op instead of RLE-scan-op for the last item, so it was always (instead of conditionally) folding in the last block's prefix count.
-
Duane Merrill authored
-
Duane Merrill authored
-
Duane Merrill authored
Master
-
dumerrill authored
-
dumerrill authored
-
Duane Merrill authored
Master
-
- 07 Feb, 2018 1 commit
-
-
Duane Merrill authored
-