I used TensorRT to convert a Caffe model into an engine (plan) file on one computer. When I tried to load this plan file on another computer and run inference with it using TensorRT, I got this error:
customWinogradConvActLayer.cpp:195: virtual void nvinfer1::cudnn::WinogradConvActLayer::allocateResources(const nvinfer1::cudnn::CommonContext&): Assertion 'configIsValid(context)' failed.
It turns out that the first computer had an NVIDIA 1080 Ti GPU and the engine had been built for it. The second computer had an NVIDIA K80 GPU. Although the TensorRT documentation is vague about this, it seems that an engine built on a specific GPU can only be used for inference on the same model of GPU!
When I created a plan file on the K80 computer, inference worked fine.
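To catch this mismatch before TensorRT hits the assertion, one option is to record the GPU name next to the plan file when it is built and compare it on the machine that loads it. Below is a minimal sketch of that idea in Python. The sidecar file (`<plan>.gpu.json`) and the helper names `query_gpu_name`, `write_gpu_tag` and `check_gpu_tag` are my own hypothetical convention, not part of TensorRT. It assumes `nvidia-smi` is on the PATH, and that an exact match on the GPU name is the right test, which is stricter than may be necessary.

```python
import json
import subprocess
from pathlib import Path


def query_gpu_name(index=0):
    """Return the name of GPU `index` as reported by nvidia-smi."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader",
         "--id={}".format(index)],
        universal_newlines=True,
    )
    return out.strip()


def sidecar_path(plan_path):
    """Hypothetical convention: the tag lives in '<plan file>.gpu.json'."""
    return Path(str(plan_path) + ".gpu.json")


def write_gpu_tag(plan_path, gpu_name):
    """Call right after building the plan: remember which GPU it was built on."""
    sidecar_path(plan_path).write_text(json.dumps({"gpu_name": gpu_name}))


def check_gpu_tag(plan_path, gpu_name):
    """Call before loading the plan: fail early if the GPU model differs."""
    tag = json.loads(sidecar_path(plan_path).read_text())
    if tag["gpu_name"] != gpu_name:
        raise RuntimeError(
            "Plan {} was built on '{}' but this machine has '{}'. "
            "Rebuild the plan on this GPU.".format(
                plan_path, tag["gpu_name"], gpu_name)
        )
```

On the build machine, call `write_gpu_tag("model.plan", query_gpu_name())` after serializing the engine. On the inference machine, call `check_gpu_tag("model.plan", query_gpu_name())` before handing the plan file to TensorRT. Here `model.plan` is just a placeholder filename.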
Tried with: TensorRT 2.1, cuDNN 6.0 and CUDA 8.0