
Fix layout transformer for FusedConv #24169


Merged: 1 commit merged into main from fs-eire/fix-layout-transform-for-fused-conv on Mar 25, 2025

Conversation

fs-eire (Contributor) commented Mar 25, 2025

Description

Fix layout transformer for FusedConv.

The current layout transformer will transform `FusedConv` (kMSDomain) into `FusedConv` (kMSInternalNHWCDomain) if the EP wants channels_last.

However, kMSInternalNHWCDomain uses OpType `Conv` for both Conv and FusedConv, so `FusedConv` (kMSInternalNHWCDomain) is invalid (an unregistered op).

This PR fixes this by allowing the layout transformer to change `FusedConv` (kMSDomain) into `Conv` (kMSInternalNHWCDomain).
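
In effect, the transformer now has to pick the target op type per node instead of reusing the original name unchanged. A minimal sketch of that mapping, using hypothetical helper names and domain constants rather than the actual onnxruntime code:

```cpp
#include <string>
#include <string_view>

// Domain strings mirroring the ONNX Runtime constants (assumption: these
// are the values behind kMSDomain and kMSInternalNHWCDomain).
constexpr std::string_view kMSDomain = "com.microsoft";
constexpr std::string_view kMSInternalNHWCDomain = "com.ms.internal.nhwc";

// Choose the op type a node gets once it is moved into the internal NHWC
// domain. FusedConv has no NHWC-domain registration of its own, so it maps
// onto the NHWC-domain Conv kernel, which covers both Conv and FusedConv.
std::string NhwcOpType(std::string_view op_type, std::string_view domain) {
  if (domain == kMSDomain && op_type == "FusedConv") {
    return "Conv";  // FusedConv (kMSDomain) -> Conv (kMSInternalNHWCDomain)
  }
  return std::string(op_type);  // other ops keep their op type
}
```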

Motivation and Context

tianleiwu (Contributor) commented Mar 25, 2025

Have you done some parity tests for such a FusedConv conversion? My understanding is that FusedConv may carry an activation, but Conv does not support that.

I noticed that there was NhwcFusedConv:

```cpp
OpKernelRegistryId nhwc_conv_fp16{
    "NhwcFusedConv", kMSDomain, 1, {{"T", {DataTypeImpl::GetTensorType<MLFloat16>()}}}};
```

That could impact the layout transformer as well.

fs-eire (Contributor, Author) commented Mar 25, 2025

Have you done some parity tests for such a FusedConv conversion? My understanding is that FusedConv may carry an activation, but Conv does not support that.

I noticed that there was NhwcFusedConv:

```cpp
OpKernelRegistryId nhwc_conv_fp16{
    "NhwcFusedConv", kMSDomain, 1, {{"T", {DataTypeImpl::GetTensorType<MLFloat16>()}}}};
```

That could impact the layout transformer as well.

Specifically, the `Conv` in kMSInternalNHWCDomain supports activation, and this is one of the reasons why there is no `FusedConv` in kMSInternalNHWCDomain.
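
In other words, the fused activation can travel with the node as a regular attribute when the op type and domain are swapped. A rough illustration, using a hypothetical simplified NodeSpec type rather than the onnxruntime Graph API:

```cpp
#include <map>
#include <string>

// Hypothetical, simplified view of a graph node (illustration only).
struct NodeSpec {
  std::string op_type;
  std::string domain;
  std::map<std::string, std::string> attributes;  // e.g. {"activation": "Relu"}
};

// FusedConv (kMSDomain) with activation=Relu becomes Conv in the internal
// NHWC domain; the activation attribute is carried over unchanged, so the
// fused activation is not lost by the rewrite.
NodeSpec ToInternalNhwc(const NodeSpec& node) {
  NodeSpec out = node;                  // copies attributes, incl. "activation"
  out.domain = "com.ms.internal.nhwc";  // kMSInternalNHWCDomain
  if (out.op_type == "FusedConv") {
    out.op_type = "Conv";
  }
  return out;
}
```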

tianleiwu (Contributor) left a comment


FusedConv is not in the list returned by GetCUDALayoutSensitiveOps for CUDA:

```cpp
const std::unordered_set<std::string_view>& GetCUDALayoutSensitiveOps() {
  static std::unordered_set<std::string_view> cuda_nhwc_ops = []() {
    return std::unordered_set<std::string_view>{
        "BatchNormalization",
        "Conv",
        "ConvTranspose",
        "GlobalMaxPool",
        "MaxPool",
        "GlobalAveragePool",
        "AveragePool",
        "GridSample",
        "DepthToSpace",
        "SpaceToDepth",
        "LRN"};
  }();
  return cuda_nhwc_ops;
}
```

So this change should be fine for CUDA.
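
For context, ops that are not in this set are simply left in NCHW when targeting CUDA, so FusedConv nodes are never handed to the CUDA NHWC path in the first place. A small sketch of how such a check could look (ShouldConvertForCuda is a hypothetical wrapper, not an onnxruntime API):

```cpp
#include <string_view>
#include <unordered_set>

// Assumed available: the GetCUDALayoutSensitiveOps() shown above.
const std::unordered_set<std::string_view>& GetCUDALayoutSensitiveOps();

// Hypothetical helper: only ops in the layout-sensitive set are converted
// to NHWC for CUDA; everything else (including FusedConv) stays NCHW.
bool ShouldConvertForCuda(std::string_view op_type) {
  return GetCUDALayoutSensitiveOps().count(op_type) > 0;
}
```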

fs-eire merged commit 25b06f2 into main on Mar 25, 2025
101 of 103 checks passed
fs-eire deleted the fs-eire/fix-layout-transform-for-fused-conv branch on Mar 25, 2025 at 21:54
quic-zhaoxul pushed a commit to CodeLinaro/onnxruntime that referenced this pull request Apr 17, 2025