📅 2011-Mar-10 ⬩ ✍️ Ashwin Nanjappa ⬩ 🏷️ cpp, cuda ⬩ 📚 Archive
A typical usage of cudaMalloc to allocate memory on the device is:
const int arraySize = 100;
Foo* fooArray = NULL;
const int arraySpace = arraySize * sizeof( *fooArray );
cudaMalloc( &fooArray, arraySpace );
cudaMalloc is a C function, and its usage in C++ is quite messy for two reasons. First, fooArray cannot be defined and assigned the allocated memory in the same statement. Second, the array size has to be calculated in bytes by hand.
A simple C++ wrapper for cudaMalloc using a template function can handle both of these problems:
template< typename T >
T* myCudaMalloc( int size )
{
    // Allocate room for `size` elements of type T on the device.
    T* loc = NULL;
    const int space = size * sizeof( T );
    cudaMalloc( &loc, space );
    return loc;
}
const int arraySize = 100;
Foo* fooArray = myCudaMalloc< Foo >( arraySize );
I wish the function could deduce the type without requiring it to be explicitly specified, as in myCudaMalloc< Foo >. But that would mean deducing the type solely from the function's return type, which template argument deduction in C++ does not support.
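A common C++ workaround for the return-type limitation, sketched here as an assumption rather than something tried with CUDA 3.2, is to return a small proxy object whose templated conversion operator does the allocation. The compiler then deduces T from the pointer variable being initialized. The names AllocRequest, myCudaMallocAuto and rawAlloc are hypothetical, and rawAlloc is a host-side stand-in (std::malloc) so that the sketch builds without the CUDA toolkit; in real code its body would call cudaMalloc instead.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for cudaMalloc so this sketch compiles on the host.
// In real CUDA code, replace the body with a cudaMalloc call.
inline void* rawAlloc( std::size_t bytes )
{
    return std::malloc( bytes );
}

// Proxy returned by myCudaMallocAuto(). It only remembers the element count;
// the allocation happens when it is converted to a concrete pointer type.
class AllocRequest
{
public:
    explicit AllocRequest( std::size_t count ) : count_( count ) {}

    // T is deduced from the pointer type the proxy is converted to.
    template< typename T >
    operator T*() const
    {
        return static_cast< T* >( rawAlloc( count_ * sizeof( T ) ) );
    }

private:
    std::size_t count_;
};

inline AllocRequest myCudaMallocAuto( std::size_t count )
{
    return AllocRequest( count );
}

// Example element type, standing in for the Foo used above.
struct Foo
{
    int   a;
    float b;
};

// Usage: no explicit template argument at the call site.
inline Foo* makeFooArray()
{
    Foo* fooArray = myCudaMallocAuto( 100 );
    return fooArray;
}
```

The trade-off is that the element type now comes from the left-hand side, so the result has to be assigned to a typed pointer right away. Storing it in a void* would not compile, because sizeof( void ) is invalid, and storing it with auto would keep the proxy object without allocating anything.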
Tried with: CUDA 3.2