π 2011-Apr-09 ⬩ βοΈ Ashwin Nanjappa ⬩ π·οΈ cuda, device vector, kernel, thrust ⬩ π Archive
Thrust makes it convenient to handle data with its device_vector
. But, things get messy when the device_vector
needs to be passed to your own kernel. Thrust data types are not understood by a CUDA kernel and need to be converted back to its underlying pointer. This is done using thrust::raw_pointer_cast
:
#include <thrust/device_vector.h>
thrust::device_vector<int> iVec;
int* iArray = thrust::raw_pointer_cast(&iVec[0]);
fooKernel<<<x, y>>>(iArray);
Inside a kernel, I typically need not only a pointer to the array, but also the length of the array. Thus, I find it is useful to convert device_vector
to a structure that holds both a pointer to the array and its length. (Very much like a vector itself.) Creating such a structure and its conversion function is easy thanks to templates:
// Template structure to pass to kernel
template <typename T>
struct KernelArray
{
T* _array;
int _size;
};
// Function to convert device_vector to structure
template <typename T>
KernelArray<T> convertToKernel(thrust::device_vector<T>& dVec)
{
KernelArray<T> kArray;
kArray._array = thrust::raw_pointer_cast(&dVec[0]);
kArray._size = (int) dVec.size();
return kArray;
}
Passing device_vector
to kernels and accessing its array inside the kernel is easy thanks to this infrastructure:
thrust::device_vector<int> iVec;
fooKernel<<<x, y>>>(convertToKernel(iVec)); // Explicit conversion from iVec to KernelArray<int>
__global__ fooKernel(KernelArray<int> inArray)
{
for (int i = 0; i < inArray._size; ++i)
something = inArray._array[i];
// ...
return;
}
You can take it a notch higher and make the conversion from device_vector
to KernelArray
to be implicit. This can be done by adding a constructor to KernelArray
that takes one input parameter of type device_vector
. (See Stephenβs comment below the post.)
With such a constructor, you can now pass a device_vector
seamlessly to the kernel:
thrust::device_vector<int> iVec;
fooKernel<<<x, y>>>(iVec); // Implicit conversion from iVec to KernelArray<int>
__global__ fooKernel(KernelArray<int> inArray)
{
// ...
}
Tried with: Thrust 1.3 and CUDA 3.2