[WebNN] Fallback unsupported integer input and output of a WebNN graph to int32 #24425


Merged 3 commits into microsoft:main on Apr 20, 2025

Conversation

@Honry (Contributor) commented Apr 15, 2025

Some WebNN backends support only a limited set of data types for the inputs and outputs of a WebNN graph, yet support more data types for intermediate nodes. To work around this limitation, we implement a data type fallback mechanism. (Note: currently we only support falling back to int32 for certain integer data types.)

If a data type is not supported for a graph's input or output but is supported for intermediate nodes, we will:

  1. Save the input MLTensor with 'int32' data type,
  2. Convert the input data from ORT to int32,
  3. Insert a cast operation into the WebNN graph to convert the input back to its original data type,
  4. Insert a cast operation into the WebNN graph to convert the output to 'int32',
  5. Convert the output data from int32 back to its original data type.
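The data conversions at the ORT boundary (steps 2 and 5) can be sketched as below. This is an illustrative sketch, not the PR's actual code: the helper names `convertToInt32`/`convertFromInt32` and the list of fallback types are assumptions.

```typescript
// Hypothetical helpers for the int32 fallback, not ORT's real implementation.

type FallbackType = 'int8' | 'uint8' | 'uint32' | 'int64';

// Step 2: widen input data from its original integer type to int32.
function convertToInt32(data: Int8Array | Uint8Array | Uint32Array | BigInt64Array): Int32Array {
  const out = new Int32Array(data.length);
  for (let i = 0; i < data.length; i++) {
    // For this sketch, int64 elements are assumed to fit in int32.
    out[i] = Number(data[i]);
  }
  return out;
}

// Step 5: narrow int32 output data back to the tensor's original type.
function convertFromInt32(data: Int32Array, type: FallbackType) {
  switch (type) {
    case 'int8': return Int8Array.from(data);
    case 'uint8': return Uint8Array.from(data);
    case 'uint32': return Uint32Array.from(data);
    case 'int64': return BigInt64Array.from(data, (v) => BigInt(v));
  }
}
```

Steps 3 and 4 correspond to inserting WebNN `cast` operations into the graph so that the rest of the graph still sees the original data type.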

@Honry (Contributor, Author) commented Apr 15, 2025

@fdwr, @guschmue, PTAL, thanks!

@fs-eire (Contributor) commented Apr 15, 2025

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,ONNX Runtime Web CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline

@fs-eire (Contributor) commented Apr 15, 2025

/azp run Linux QNN CI Pipeline,onnxruntime-binary-size-checks-ci-pipeline,Big Models,Linux Android Emulator QNN CI Pipeline,Android CI Pipeline,iOS CI Pipeline,ONNX Runtime React Native CI Pipeline,Linux DNNL CI Pipeline,Linux MIGraphX CI Pipeline,Linux ROCm CI Pipeline

Azure Pipelines successfully started running 7 pipeline(s).

Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command.

@fdwr added the `ep:WebNN` (WebNN execution provider) label on Apr 15, 2025
@fdwr (Contributor) left a comment

Thank you for adding this. I hope it gets pushed down a level into the Chromium CoreML backend so that callers wouldn't need to repeat this manual tensor conversion and cast-node insertion, but this demonstrates that it's possible.

@Honry force-pushed the input-data-type-fallback branch from e5858e4 to 66ba571 on April 16, 2025 09:04
@Honry (Contributor, Author) commented Apr 16, 2025

> Thank you for adding this. I hope it gets pushed down a level into the Chromium CoreML backend so that callers wouldn't need to repeat this manual tensor conversion and cast-node insertion, but this demonstrates that it's possible.

Indeed, according to Reilly's comment, it can't be done in Chromium. :(

@fdwr (Contributor) commented Apr 17, 2025

Merge conflicts 💥.

Honry added 2 commits April 17, 2025 08:12
…h to int32

@Honry force-pushed the input-data-type-fallback branch from 66ba571 to 5962428 on April 17, 2025 00:26
@Honry (Contributor, Author) commented Apr 17, 2025

> Merge conflicts 💥.

@fdwr, fixed. Please help restart the CI.

@fdwr previously approved these changes Apr 17, 2025
@fdwr (Contributor) left a comment

👍

@fdwr (Contributor) commented Apr 17, 2025

`'webnnDataTypeToSize' was used before it was defined` (@typescript-eslint/no-use-before-define)

https://github.com/microsoft/onnxruntime/actions/runs/14505301882/job/40694637717?pr=24425#step:3:333

@Honry (Contributor, Author) commented Apr 17, 2025

> `'webnnDataTypeToSize' was used before it was defined` (@typescript-eslint/no-use-before-define)
>
> https://github.com/microsoft/onnxruntime/actions/runs/14505301882/job/40694637717?pr=24425#step:3:333

Oops, fixed.
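For context, `@typescript-eslint/no-use-before-define` fires when a binding is read above its declaration. A minimal reproduction of the fix is sketched below; the map contents and the `tensorByteLength` helper are assumptions for illustration, not the PR's actual code.

```typescript
// The rule requires a `const` to be declared before the first statement that
// reads it; otherwise the lint fires (and at runtime the read would hit the
// temporal dead zone). Map contents here are assumed, not ORT's real table.

// Fix: declare the lookup table first...
const webnnDataTypeToSize = new Map<string, number>([
  ['int8', 8],
  ['uint8', 8],
  ['int32', 32],
  ['int64', 64],
]);

// ...then reference it afterwards.
function tensorByteLength(dataType: string, elementCount: number): number {
  const bits = webnnDataTypeToSize.get(dataType);
  if (bits === undefined) {
    throw new Error(`Unsupported WebNN data type: ${dataType}`);
  }
  return (bits / 8) * elementCount;
}
```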

@fdwr (Contributor) commented Apr 17, 2025

/azp run ONNX Runtime Web CI Pipeline,Windows GPU CI Pipeline,Linux Android Emulator QNN CI Pipeline,Windows GPU WebGPU CI Pipeline,Windows OpenVINO CI Pipeline

@fdwr (Contributor) commented Apr 17, 2025

/azp run Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,Windows ARM64 QNN CI Pipeline,Windows CPU CI Pipeline

@fdwr (Contributor) commented Apr 17, 2025

/azp run Windows GPU CUDA CI Pipeline,Windows GPU DML CI Pipeline,Windows GPU Doc Gen CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI

@fdwr (Contributor) commented Apr 17, 2025

/azp run Windows GPU TensorRT CI Pipeline,onnxruntime-binary-size-checks-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,Windows x64 QNN CI Pipeline,Big Models

Azure Pipelines successfully started running 1 pipeline(s).

Azure Pipelines successfully started running 2 pipeline(s).

Azure Pipelines successfully started running 3 pipeline(s).


@fdwr (Contributor) left a comment

👍

@Honry (Contributor, Author) commented Apr 18, 2025

(screenshots of the failing CI check)

@fdwr, it looks like this failing task is unrelated to my change. Do I need to rebase onto main?

@fdwr (Contributor) commented Apr 19, 2025

> @fdwr, it looks like this failing task is unrelated to my change. Do I need to rebase onto main?

Restarting the failed job. If it fails again, try re-merging (or rebasing). Alas, it's a required test.

@fs-eire fs-eire merged commit 67c87a1 into microsoft:main Apr 20, 2025
70 of 76 checks passed
ashrit-ms pushed a commit that referenced this pull request Apr 24, 2025
…h to int32 (#24425)

intbf pushed a commit to intbf/onnxruntime that referenced this pull request Apr 25, 2025
…h to int32 (microsoft#24425)


Signed-off-by: bfilipek <bartlomiej.filipek@intel.com>
Labels: ep:WebNN (WebNN execution provider)
Projects: None yet
Development: Successfully merging this pull request may close these issues: None yet
3 participants