[Performance] CUDAExecutionProvider without RoiAlign (opset 16 version) #21990
Labels
ep:CUDA
issues related to the CUDA execution provider
performance
issues related to performance regressions
stale
issues that have not been addressed in a while; categorized by a bot
Describe the issue
I'm using the Cascade Mask R-CNN model from detectron2. When exported to ONNX, the model file contains RoiAlign (opset 16 version).
When running on ONNX Runtime with the CUDA EP, inference is too slow because RoiAlign runs on the CPU EP.
Could anyone provide RoiAlign (opset 16 version) for the CUDA EP?
To reproduce
1. Export Cascade Mask R-CNN from detectron2.
2. Run the model in ONNX Runtime with the CUDA EP.
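One way to confirm that RoiAlign is falling back to the CPU EP is to enable ONNX Runtime profiling and sum the kernel time per provider and operator. The sketch below is an illustration, not part of the original report. It assumes the profile JSON is a list of events where node entries carry `cat == "Node"`, a `dur` in microseconds, and `args.op_name` / `args.provider`. The `model_path` and `feeds` arguments are placeholders for the exported model and its inputs.

```python
import json
from collections import defaultdict


def summarize_profile(events):
    """Sum node duration (microseconds) per (provider, op_name).

    Assumes the ONNX Runtime profile layout described above; events
    without a provider or op_name (e.g. session-level entries) are skipped.
    """
    totals = defaultdict(int)
    for ev in events:
        if ev.get("cat") != "Node":
            continue
        args = ev.get("args", {})
        provider = args.get("provider")
        op_name = args.get("op_name")
        if provider is None or op_name is None:
            continue
        totals[(provider, op_name)] += int(ev.get("dur", 0))
    return dict(totals)


def profile_model(model_path, feeds):
    """Run one inference with profiling on and summarize the result.

    model_path and feeds are hypothetical placeholders for the exported
    Cascade Mask R-CNN model and its input dictionary.
    """
    import onnxruntime as ort  # imported lazily so the summary helper works without it

    opts = ort.SessionOptions()
    opts.enable_profiling = True
    sess = ort.InferenceSession(
        model_path,
        sess_options=opts,
        providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
    )
    sess.run(None, feeds)
    profile_path = sess.end_profiling()  # returns the path of the written JSON file
    with open(profile_path) as f:
        return summarize_profile(json.load(f))
```

If the summary shows `("CPUExecutionProvider", "RoiAlign")` with a large share of the total time, that would be consistent with the slowdown described above.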
Urgency
No response
Platform
Windows
OS Version
Win10
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.18.1
ONNX Runtime API
Python
Architecture
X64
Execution Provider
CUDA
Execution Provider Library Version
CUDA 11.8 and CUDA 12.2
Model File
No response
Is this a quantized model?
No