Insights: microsoft/onnxruntime
Overview
33 Pull requests merged by 21 people
- Integration with ONNX 1.19 (#25678, merged Sep 6, 2025)
- [QNN EP] fix memory leak in OrtEpFactory::GetSupportedDevices() (#25968, merged Sep 5, 2025)
- Make name an std::string in OrtMemoryInfo (#25960, merged Sep 5, 2025)
- cherry picks for 1.23.0 release (#25959, merged Sep 5, 2025)
- [GH Pages] Add plugin EP library documentation (#25880, merged Sep 5, 2025)
- Upgrade wheel to 0.45.1 (#25957, merged Sep 4, 2025)
- Use vmlaq_f32 in MlasMultipleAddFloat32x4 for Android armeabi-v7a (#25955, merged Sep 4, 2025; see the sketch after this list)
- [NV TensorRT RTX] Handle unsupported data types (#25953, merged Sep 4, 2025)
- [MLAS] Add 8-bit weights ARM64 Gemm implementation (#25110, merged Sep 4, 2025)
- [WebNN] Support IsNaN and IsInf (#25930, merged Sep 4, 2025)
- Compile API: output model and initializer stream write functions (#25455, merged Sep 4, 2025)
- Remove std::string::data() non-const usage from public headers (#25943, merged Sep 4, 2025)
- Fix transpose of empty initializers called from constant folding (#25922, merged Sep 4, 2025)
- [QNN-EP] Add negative padding value support (#25928, merged Sep 4, 2025)
- Improve packaging pipelines RC version support (#25938, merged Sep 4, 2025)
- [Core] Fix debug node input output compilation after Fp4 support was enabled in ORT (#25940, merged Sep 4, 2025)
- Bump next from 15.2.4 to 15.4.7 in /js/web/test/e2e/exports/testcases/nextjs-default (#25904, merged Sep 4, 2025)
- Migrate model tests to ONNX Model ZOO only (#25888, merged Sep 3, 2025)
- upgrade dawn to 13c1635a14574ebb7116b56a69f5519301417fda (#25703, merged Sep 3, 2025)
- Updated MetaHuman testimonial from Epic Games (#23737, merged Sep 3, 2025)
- POWER: Implement MlasGemmQuantKernel using VSX builtins for M = 1 (#25490, merged Sep 3, 2025)
- Revert "[QNN EP] Remove workaround for syntax error macro (#25923)" (#25939, merged Sep 3, 2025)
- [Core] Support fp4 type in ORT (#25767, merged Sep 3, 2025)
- [CXX] Introduce C++ API for new C entry points (#25897, merged Sep 3, 2025)
- Compile API: disable optimizations by default (#25474, merged Sep 3, 2025)
- [TRT RTX EP] Add support for RTX runtime caches (#25917, merged Sep 3, 2025)
- [TRT RTX EP] Memory map the engine buffer (#25909, merged Sep 3, 2025)
- [TRT RTX EP] Add sync method (#25898, merged Sep 2, 2025)
- [QNN EP] Remove workaround for syntax error macro (#25923, merged Sep 2, 2025)
- Support DynamicQuantizeLinear op (#25905, merged Sep 2, 2025)
- Replace vmlaq_f32 with vfmaq_f32 (fused multiply-add) (#25669, merged Sep 2, 2025; see the sketch after this list)
- [webgpu] Add back missing code comments for flash decoding (#25879, merged Sep 1, 2025)
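
Two of the merged entries above touch the same NEON detail: #25669 replaced vmlaq_f32 with the fused vfmaq_f32, and #25955 restored vmlaq_f32 in MlasMultipleAddFloat32x4 for Android armeabi-v7a (see also closed issue #25949 further down, which reports vfmaq_f32 as undeclared for that ABI). The C sketch below is illustrative only, not ONNX Runtime code; the helper name and the feature guard are assumptions, shown to clarify what the two intrinsics compute and why one of them cannot be used on baseline armeabi-v7a.

```c
// Illustrative sketch (hypothetical helper, not ORT's MLAS code).
// vfmaq_f32 is a fused multiply-add (single rounding step) and requires
// VFPv4/ARMv8; vmlaq_f32 (multiply, then add) is available on baseline
// armeabi-v7a NEON.
#include <arm_neon.h>

static inline float32x4_t multiply_accumulate(float32x4_t acc,
                                               float32x4_t a,
                                               float32x4_t b) {
#if defined(__aarch64__) || defined(__ARM_FEATURE_FMA)
    return vfmaq_f32(acc, a, b);  // acc + a * b, fused
#else
    return vmlaq_f32(acc, a, b);  // acc + a * b, multiply then add
#endif
}
```

On 32-bit ARM toolchains that do not define `__ARM_FEATURE_FMA`, arm_neon.h does not declare vfmaq_f32, which is consistent with the "vfmaq_f32 undeclared" build error tracked in #25949.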
24 Pull requests opened by 15 people
- [CUDA] Support SwiGlu in MoE and qMoE (#25530) (#25903, opened Aug 30, 2025)
- Bump ruff from 0.12.9 to 0.12.11 (#25906, opened Sep 1, 2025)
- Bump clang-format from 20.1.8 to 21.1.0 (#25907, opened Sep 1, 2025)
- [webgpu] Manually normalize `Conv-Transpose` dispatch group size for Intel GPUs (#25908, opened Sep 1, 2025)
- Enables Matmul and Gemm for float16 on CPU (#25913, opened Sep 1, 2025)
- [TRT RTX EP] Remove unused legacy code from TRT EP that is no longer supported for TRT RTX (#25916, opened Sep 2, 2025)
- [Web] Magic import comment missing for Vite (/* @vite-ignore */) (#25919, opened Sep 2, 2025)
- Update AppendExecutionProvider docs (#25924, opened Sep 2, 2025)
- Fix Local Attention off by 1 bug (#25927, opened Sep 2, 2025)
- [webgpu] Optimize flash decoding by merging QKT and SplitVx shader (#25929, opened Sep 3, 2025)
- [WebNN] Fix bug in quantizeLinear and dequantizeLinear (#25931, opened Sep 3, 2025)
- [webgpu] Add dispatchWorkgroupsIndirect support (#25934, opened Sep 3, 2025)
- [WebGPU] allow async shader compilation (#25941, opened Sep 3, 2025)
- Add onnxruntime_plugin_ep_onnx_test for plugin EPs to run onnx test (#25942, opened Sep 4, 2025)
- [webgpu] Unify the present_sequence_length in flash attention (#25945, opened Sep 4, 2025)
- Bump electron from 28.3.2 to 35.7.5 in /js/web (#25950, opened Sep 4, 2025)
- [CPU] MoE Kernel (#25958, opened Sep 4, 2025)
- [webgpu] Move copy_kv_cache to template (#25961, opened Sep 5, 2025)
- Split large inputs into smaller buffers to bypass maxStorageBufferBindingSize limit (#25962, opened Sep 5, 2025)
- [TRT RTX EP] Rename TRT RTX libs to not collide with TRT (#25964, opened Sep 5, 2025)
- Fix Attention GQA implementation on CPU (#25966, opened Sep 5, 2025)
- Update React Native iOS CI build (#25967, opened Sep 5, 2025)
- Clean up CUDA contrib ops folder (#25969, opened Sep 5, 2025)
- [WIP] Fix Mac Catalyst build options (#25970, opened Sep 5, 2025)
21 Issues closed by 7 people
- GetEpDevices() does not Detect Intel NPU via OpenVINO EP (#25557, closed Sep 5, 2025)
- [Build] compiling errors for armeabi-v7a: vfmaq_f32 undeclared (#25949, closed Sep 4, 2025)
- [BUG] Non-zero status code returned while running Resize node in DirectML backend (#24928, closed Sep 3, 2025)
- Incorrect Use of CUDA Constants in MIGraphXExecutionProvider::CreatePreferredAllocators (Should Use HIP) (#25268, closed Sep 3, 2025)
- [Performance] How to use pinned memory in onnxruntime (#20947, closed Sep 2, 2025)
- [Build] can't build CUDA (+ OpenVINO and DirectML) for latest v1.22 on Windows (#25081, closed Sep 2, 2025)
- Initializers use wrong allocator (#25108, closed Sep 2, 2025)
- [Build] CMake configuration files for bin release 1.22.0 are broken (#25242, closed Sep 2, 2025)
- [DirectML EP] Error when validating attributes of `Slice` operator (#25252, closed Sep 2, 2025)
- [Build] CMake configuration files for bin release 1.22.0 are broken for Linux (#25279, closed Sep 2, 2025)
- Invalid MIGraphX EP option: migraphx_load_compiled_path (#25379, closed Sep 2, 2025)
- [Mobile] OnnxTensor FLOAT16 creation works, but runtime fails with "unsupported type FLOAT16" (#25891, closed Sep 1, 2025)
- [Feature Request] Unable to run microsoft/Phi-3.5-vision-instruct-onnx on GPU with C# (#25892, closed Sep 1, 2025)
- [Build] CCCL API migration issue (#24774, closed Aug 31, 2025)
- [Build] How to build ONNX Runtime as a dynamic framework (.dylib/.framework) for iOS? (#25256, closed Aug 31, 2025)
- Memory exception during OCR inference (#25258, closed Aug 31, 2025)
- [Web] Error using opus-mt-mul fp16 models with WebGPU (#25125, closed Aug 30, 2025)
16 Issues opened by 15 people
- [Performance] Long-context OOM on newer architectures (#25965, opened Sep 5, 2025)
- [Web] Static WASM library: Actually selecting WebGpuExecutionProvider (#25952, opened Sep 4, 2025)
- Crash in `MlasRotaryEmbedOneRow` when running EmbeddingGemma (#25951, opened Sep 4, 2025)
- [Runtime bug on Windows] resize op exception. Tensor type mismatch. (#25948, opened Sep 4, 2025)
- [Build] Wheel file not present after build from source with `--build_wheel` (#25947, opened Sep 4, 2025)
- [Performance] Performance Regression in ReduceMean Operator on Specific Axes (#25946, opened Sep 4, 2025)
- Latest release v1.22.2 not available on pypi (#25944, opened Sep 4, 2025)
- [Build] CUDA 13 Failed (#25936, opened Sep 3, 2025)
- Could not find an implementation for QuantizeLinear(23) (#25932, opened Sep 3, 2025)
- LayerNormalization - The parameter is incorrect error (#25926, opened Sep 2, 2025)
- [Build] build for catalyst support will generate invalid compile command (#25920, opened Sep 2, 2025)
- [Web] Magic import comment missing for Vite (/* @vite-ignore */) (#25918, opened Sep 2, 2025)
- Deadlock Bug in Win10 + ORT + Specific Execution Mode (#25915, opened Sep 2, 2025)
- No opset import for domain 'com.microsoft.nchwc' error (#25914, opened Sep 2, 2025)
- Windows 7 compatibility for Microsoft.ML.OnnxRuntime NuGet package with C# (#25910, opened Sep 1, 2025)
57 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- [AIX] test-suites failure fixes (#25791, commented on Sep 5, 2025; 17 new comments)
- FP16 inference performance improvement on CPU (#25680, commented on Sep 2, 2025; 10 new comments)
- NEON kernels for NCHWc Convolution and Pooling (#25580, commented on Sep 5, 2025; 9 new comments)
- Support int4 and uint4 for reshape on opset 21+ (#25645, commented on Sep 6, 2025; 6 new comments)
- Extend ORT perf test to report submit vs inference time (#25895, commented on Sep 5, 2025; 6 new comments)
- Integrate SME1 SGEMM KleidiAI kernels (#25760, commented on Sep 5, 2025; 6 new comments)
- [Don't review][webgpu] Make graph capture work on LLM (#25868, commented on Sep 3, 2025; 6 new comments)
- [ARM CPU] SVE support for Elementwise kernels (#25238, commented on Sep 2, 2025; 5 new comments)
- [CANN] Fix the ACL_ERROR_REPEAT_INITIALIZE error that occurs when coexisting with torch_npu (#25902, commented on Aug 30, 2025; 5 new comments)
- Fix incorrect symbolic shape inference for ConstantOfShape (#25857, commented on Sep 1, 2025; 1 new comment)
- Add model compilation in ORT perf test (#25797, commented on Sep 5, 2025; 1 new comment)
- [Feature Request] Making the cuda version explicit in the python package name (#19438, commented on Sep 5, 2025; 0 new comments)
- onnxruntime 1.17.0: transformers benchmarking failing for int8 quantized inference (#19409, commented on Sep 5, 2025; 0 new comments)
- Compile API: output EPContext binary data to write function (#25471, commented on Sep 4, 2025; 0 new comments)
- LayerNormalization shape check error (#25896, commented on Sep 5, 2025; 0 new comments)
- [Web] Avoid unnecessary data copy for pre-allocated tensors (#25571, commented on Sep 5, 2025; 0 new comments)
- Error messages from QNN are turned into verbose level messages (#24876, commented on Sep 5, 2025; 0 new comments)
- [CANN] When using onnxruntime-cann for inference, it failed to utilize the NPU for inference (#22229, commented on Sep 5, 2025; 0 new comments)
- Update ORT to handle explicit OpSchemaRegisterOnce API in ONNX >= 1.18.0 for fluent chaining (#24561, commented on Aug 30, 2025; 0 new comments)
- Bump transformers from 4.50.0 to 4.53.0 in /onnxruntime/python/tools/transformers/models/stable_diffusion/requirements (#25672, commented on Sep 6, 2025; 0 new comments)
- Add `onnxruntime_show_ep_devices` (#25780, commented on Sep 2, 2025; 0 new comments)
- webgpu: support bool type for Flatten and Unsqueeze operators (#25804, commented on Sep 4, 2025; 0 new comments)
- Fix CPU EP Tile 0D overvalidation (#25821, commented on Sep 2, 2025; 0 new comments)
- Add ort api for CustomOp compute events (#25843, commented on Sep 1, 2025; 0 new comments)
- [webgpu] Optimize MatMul Op (#25854, commented on Sep 5, 2025; 0 new comments)
- [QNN EP] Add Case-2 LPBQ pattern support for Gemm and Matmul nodes (#25865, commented on Sep 5, 2025; 0 new comments)
- [TRT-RTX EP] add missing thread header include to TRT-RTX provider (#25875, commented on Sep 3, 2025; 0 new comments)
- [DRAFT] Attention and MHA CUDA BFloat16 Support - not ready for review (#25885, commented on Sep 5, 2025; 0 new comments)
- [Documentation] Update ONNX Runtime Documentation to Position WinML as Preferred Windows Path and DirectML in Sustained Engineering (#25901, commented on Sep 3, 2025; 0 new comments)
- [Build] add presets for CMake configuration (#22628, commented on Aug 30, 2025; 0 new comments)
- [Auto EP] How to set graph optimization level specifically for a registered EP? (#25593, commented on Aug 30, 2025; 0 new comments)
- [Bug] Models that used to run on onnxruntime 1.16.3 causes AccessViolationException when loading on onnxruntime 1.22.1 (#25588, commented on Aug 30, 2025; 0 new comments)
- [OpenVINO] SessionOptionsAppendExecutionProvider_OpenVINO API loads NULL config file (#23871, commented on Aug 30, 2025; 0 new comments)
- [Bug] [Performance] Cannot write_calibration_table for per channel quantization calibration (#25621, commented on Aug 31, 2025; 0 new comments)
- Segmentation Fault running model (#25613, commented on Aug 31, 2025; 0 new comments)
- [Bug] CUDAExecutionProvider fails to load due to missing libcudnn.so.9 in LD_LIBRARY_PATH when using onnxruntime-gpu==1.22.0 (#25609, commented on Aug 31, 2025; 0 new comments)
- Unable to load gpt-oss onnx model with dotnet (#25793, commented on Aug 31, 2025; 0 new comments)
- quantize_dynamic causes shape mismatch when model has Gemm with transB = 1 (#25890, commented on Sep 1, 2025; 0 new comments)
- what(): Exception during loading: MLDataType for: tensor(int32) is not currently registered or supported (#25812, commented on Sep 1, 2025; 0 new comments)
- [Mobile] CoreML Execution Provider not listed in supported values (#25869, commented on Sep 2, 2025; 0 new comments)
- [Build] (#13606, commented on Sep 3, 2025; 0 new comments)
- Does BatchNormalization support 2D shape of `X` input (#25230, commented on Sep 3, 2025; 0 new comments)
- [Feature Request] Python support for AddExternalInitializersFromFilesInMemory (#25873, commented on Sep 3, 2025; 0 new comments)
- IExecutionProvider::FusedNodeAndGraph set intermediate unused results as model outputs (#25647, commented on Sep 3, 2025; 0 new comments)
- [Build] cmake cannot find KLEIDIAI - Windows 11 ARM (#24865, commented on Sep 3, 2025; 0 new comments)
- [Mobile] MatMulNbits Q8 Errors out on Android (#24769, commented on Sep 4, 2025; 0 new comments)
- Cannot run bfloat16 convolution (#25740, commented on Sep 4, 2025; 0 new comments)
- [Mobile] libonnxruntime4j_jni.so 16KB page size compatibility on Android ARM64 (#25859, commented on Sep 4, 2025; 0 new comments)
- [Web] WebGPU first-run model warm-up causes long GPU-blocking operations (maxDispatchNumber & synchronous pipeline creation) (#25882, commented on Sep 4, 2025; 0 new comments)
- Inference fails with 4 bit quantization (#25631, commented on Sep 4, 2025; 0 new comments)
- Cannot use Microsoft.ML.OnnxRuntime NuGet package 1.22.1 with Microsoft.SemanticKernel.Connectors.Onnx (#25287, commented on Sep 4, 2025; 0 new comments)
- Onnx Runtime for Java is packaged with 200MB onnxruntime.pdb in the win-x64 native package (#12084, commented on Sep 4, 2025; 0 new comments)
- [Feature Request] Windows ARM64 native wheels (#19161, commented on Sep 4, 2025; 0 new comments)
- [Performance] cudaMemcpyAsync dominates runtime for batch inference with FP32 inputs (#25852, commented on Sep 5, 2025; 0 new comments)
- [Performance] ONNX FP16 model is having performance bottle neck when compared to FP32 variant (#25824, commented on Sep 5, 2025; 0 new comments)
- Type mismatch error when loading a Float16 model (#25522, commented on Sep 5, 2025; 0 new comments)
- [WebGPU] Subgroups feature is not enabled for ort-web WebGPU EP (#25595, commented on Sep 5, 2025; 0 new comments)