Code Yarns ‍👨‍💻

Compute Visual Profiler: Version 4 is Slow due to Analysis

📅 2011-Jun-08 ⬩ ✍️ Ashwin Nanjappa

Problem

The Compute Visual Profiler 4.0.17 that ships with CUDA 4.0 has a new Analysis feature. It is enabled for every session and is always active when the Summary Table is in use. I found this analysis too slow for my kernels: it hogs the CPU, which makes the profiler difficult to interact with. And there seems to be no way to turn off the Analysis feature in this version of the profiler! 😐

Solution

Use the Compute Visual Profiler 3.2.0 from CUDA 3.2 instead. I found that it works fine on CUDA executables compiled with CUDA 4.0. Copy the computeprof directory found in %CUDA_PATH% of your older CUDA installation, or of a CUDA 3.2 installation on another computer. Then run the computeprof.exe found in its bin directory. No additional DLLs are required.
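
If you prefer to script the copy, here is a minimal Python sketch. The function name copy_profiler and both of its parameters are my own for illustration and are not part of CUDA or the profiler. It only assumes the layout described above: a computeprof directory with a bin directory containing computeprof.exe.

```python
import shutil
from pathlib import Path


def copy_profiler(old_cuda_path, dest_dir):
    """Copy the computeprof directory out of an older CUDA installation.

    old_cuda_path: root of the older CUDA install (hypothetical parameter).
    dest_dir:      directory to copy into (hypothetical parameter).
    Returns the path to computeprof.exe inside the copied bin directory.
    """
    src = Path(old_cuda_path) / "computeprof"
    if not src.is_dir():
        raise FileNotFoundError(f"No computeprof directory in {old_cuda_path}")

    dest = Path(dest_dir) / "computeprof"
    # copytree raises FileExistsError if dest already exists,
    # so an earlier copy is never silently overwritten.
    shutil.copytree(src, dest)

    exe = dest / "bin" / "computeprof.exe"
    if not exe.is_file():
        raise FileNotFoundError(f"Expected profiler executable at {exe}")
    return exe
```

Call it with the root of the older installation and a destination of your choice, then launch the executable at the returned path. The script does not touch the CUDA 4.0 installation, so both profiler versions stay available.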

Tried with: Compute Visual Profiler 4.0.17 and CUDA 4.0