Code Yarns ‍👨‍💻

Thrust: Remove Duplicates in Multiple Vectors

📅 2011-Apr-21 ⬩ ✍️ Ashwin Nanjappa ⬩ 🏷️ thrust, vectors

With the magical thrust::zip_iterator, duplicates that span multiple vectors can be easily removed in Thrust, and the vectors trimmed to their new length.

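Consider two vectors of equal length, one holding keys and the other holding their values, so that the i-th key and the i-th value form a pair. There can be many values associated with each key. The keys are sorted, and the values associated with each key are also sorted. Finding duplicates in these vectors boils down to finding duplicate pairs and removing them. The sort order matters because thrust::unique removes only consecutive duplicates, and sorting guarantees that identical pairs sit next to each other. Here is how to achieve this easily using thrust::unique and thrust::zip_iterator: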

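#include <thrust/device_vector.h>
#include <thrust/iterator/zip_iterator.h>
#include <thrust/tuple.h>
#include <thrust/unique.h>

typedef thrust::device_vector< int >                IntVector;
typedef IntVector::iterator                         IntIterator;
typedef thrust::tuple< IntIterator, IntIterator >   IntIteratorTuple;
typedef thrust::zip_iterator< IntIteratorTuple >    ZipIterator;

// Parallel vectors: keyVector[i] and valVector[i] form one pair
IntVector keyVector;
IntVector valVector;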

// Remove duplicate pairs
ZipIterator newEnd = thrust::unique( thrust::make_zip_iterator( thrust::make_tuple( keyVector.begin(), valVector.begin() ) ),
                                     thrust::make_zip_iterator( thrust::make_tuple( keyVector.end(), valVector.end() ) ) );

IntIteratorTuple endTuple = newEnd.get_iterator_tuple();

// Trim the vectors
keyVector.erase( thrust::get<0>( endTuple ), keyVector.end() );
valVector.erase( thrust::get<1>( endTuple ), valVector.end() );
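
To make the pair semantics concrete, here is a rough host-only sketch of the same idea in standard C++, without Thrust. It is not the code above and not a Thrust API: removeDuplicatePairs is a hypothetical helper name, and it copies the data into a temporary vector of pairs instead of zipping the iterators in place. Like thrust::unique, std::unique removes only consecutive duplicates, so the input is assumed to be sorted by key and then by value.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (illustration only, not part of Thrust):
// removes duplicate (key, value) pairs from two parallel host vectors.
// Assumes the pairs are sorted, so that identical pairs are adjacent.
void removeDuplicatePairs( std::vector< int >& keys, std::vector< int >& vals )
{
    assert( keys.size() == vals.size() );

    // "Zip" the two vectors into a single vector of pairs
    std::vector< std::pair< int, int > > pairs;
    pairs.reserve( keys.size() );
    for ( std::size_t i = 0; i < keys.size(); ++i )
        pairs.push_back( std::make_pair( keys[ i ], vals[ i ] ) );

    // Remove consecutive duplicate pairs and trim the tail
    pairs.erase( std::unique( pairs.begin(), pairs.end() ), pairs.end() );

    // "Unzip" the surviving pairs back into the two vectors
    keys.resize( pairs.size() );
    vals.resize( pairs.size() );
    for ( std::size_t i = 0; i < pairs.size(); ++i )
    {
        keys[ i ] = pairs[ i ].first;
        vals[ i ] = pairs[ i ].second;
    }
}
```

For example, keys { 1, 1, 1, 2, 2, 3 } with values { 5, 5, 7, 4, 4, 9 } hold the pairs (1,5), (1,5), (1,7), (2,4), (2,4), (3,9). After the call, the keys are { 1, 1, 2, 3 } and the values are { 5, 7, 4, 9 }. Note that key 1 still appears twice, because only whole pairs are compared, not keys alone. The zip_iterator version above gets the same result without the temporary copy.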

Tried with: Thrust 1.3 and CUDA 3.2