Insights: microsoft/onnxruntime
Overview
33 Pull requests merged by 21 people
- Integration with ONNX 1.19 (#25678, merged Sep 6, 2025)
- [QNN EP] fix memory leak in OrtEpFactory::GetSupportedDevices() (#25968, merged Sep 5, 2025)
- Make name an std::string in OrtMemoryInfo (#25960, merged Sep 5, 2025)
- cherry picks for 1.23.0 release (#25959, merged Sep 5, 2025)
- [GH Pages] Add plugin EP library documentation (#25880, merged Sep 5, 2025)
- Upgrade wheel to 0.45.1 (#25957, merged Sep 4, 2025)
- Use vmlaq_f32 in MlasMultipleAddFloat32x4 for Android armeabi-v7a (#25955, merged Sep 4, 2025; see the sketch after this list)
- [NV TensorRT RTX] Handle unsupported data types (#25953, merged Sep 4, 2025)
- [MLAS] Add 8-bit weights ARM64 Gemm implementation (#25110, merged Sep 4, 2025)
- [WebNN] Support IsNaN and IsInf (#25930, merged Sep 4, 2025)
- Compile API: output model and initializer stream write functions (#25455, merged Sep 4, 2025)
- Remove std::string::data() non-const usage from public headers (#25943, merged Sep 4, 2025)
- Fix transpose of empty initializers called from constant folding (#25922, merged Sep 4, 2025)
- [QNN-EP] Add negative padding value support (#25928, merged Sep 4, 2025)
- Improve packaging pipelines RC version support (#25938, merged Sep 4, 2025)
- [Core] Fix debug node input output compilation after Fp4 support was enabled in ORT (#25940, merged Sep 4, 2025)
- Bump next from 15.2.4 to 15.4.7 in /js/web/test/e2e/exports/testcases/nextjs-default (#25904, merged Sep 4, 2025)
- Migrate model tests to ONNX Model ZOO only (#25888, merged Sep 3, 2025)
- upgrade dawn to 13c1635a14574ebb7116b56a69f5519301417fda (#25703, merged Sep 3, 2025)
- Updated MetaHuman testimonial from Epic Games (#23737, merged Sep 3, 2025)
- POWER: Implement MlasGemmQuantKernel using VSX builtins for M = 1 (#25490, merged Sep 3, 2025)
- Revert "[QNN EP] Remove workaround for syntax error macro (#25923)" (#25939, merged Sep 3, 2025)
- [Core] Support fp4 type in ORT (#25767, merged Sep 3, 2025)
- [CXX] Introduce C++ API for new C entry points (#25897, merged Sep 3, 2025)
- Compile API: disable optimizations by default (#25474, merged Sep 3, 2025)
- [TRT RTX EP] Add support for RTX runtime caches (#25917, merged Sep 3, 2025)
- [TRT RTX EP] Memory map the engine buffer (#25909, merged Sep 3, 2025)
- [TRT RTX EP] Add sync method (#25898, merged Sep 2, 2025)
- [QNN EP] Remove workaround for syntax error macro (#25923, merged Sep 2, 2025)
- Support DynamicQuantizeLinear op (#25905, merged Sep 2, 2025)
- Replace vmlaq_f32 with vfmaq_f32 (fused multiply-add) (#25669, merged Sep 2, 2025; see the sketch after this list)
- [webgpu] Add back missing code comments for flash decoding (#25879, merged Sep 1, 2025)
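
Two of the merged entries above touch the same NEON detail: #25669 replaced vmlaq_f32 with the fused vfmaq_f32, and #25955 restored vmlaq_f32 in MlasMultipleAddFloat32x4 for Android armeabi-v7a (see also closed issue #25949 further down, which reports vfmaq_f32 as undeclared for that ABI). The C sketch below is illustrative only, not ONNX Runtime code; the helper name and the feature guard are assumptions, shown to clarify what the two intrinsics compute and why one of them cannot be used on baseline armeabi-v7a.

```c
// Illustrative sketch (hypothetical helper, not ORT's MLAS code).
// vfmaq_f32 is a fused multiply-add (single rounding step) and requires
// VFPv4/ARMv8; vmlaq_f32 (multiply, then add) is available on baseline
// armeabi-v7a NEON.
#include <arm_neon.h>

static inline float32x4_t multiply_accumulate(float32x4_t acc,
                                               float32x4_t a,
                                               float32x4_t b) {
#if defined(__aarch64__) || defined(__ARM_FEATURE_FMA)
    return vfmaq_f32(acc, a, b);  // acc + a * b, fused
#else
    return vmlaq_f32(acc, a, b);  // acc + a * b, multiply then add
#endif
}
```

On 32-bit ARM toolchains that do not define `__ARM_FEATURE_FMA`, arm_neon.h does not declare vfmaq_f32, which is consistent with the "vfmaq_f32 undeclared" build error tracked in #25949.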
24 Pull requests opened by 15 people
- [CUDA] Support SwiGlu in MoE and qMoE (#25530) (#25903, opened Aug 30, 2025)
- Bump ruff from 0.12.9 to 0.12.11 (#25906, opened Sep 1, 2025)
- Bump clang-format from 20.1.8 to 21.1.0 (#25907, opened Sep 1, 2025)
- [webgpu] Manually normalize `Conv-Transpose` dispatch group size for Intel GPUs (#25908, opened Sep 1, 2025)
- Enables Matmul and Gemm for float16 on CPU (#25913, opened Sep 1, 2025)
- [TRT RTX EP] Remove unused legacy code from TRT EP that is no longer supported for TRT RTX (#25916, opened Sep 2, 2025)
- [Web] Magic import comment missing for Vite (/* @vite-ignore */) (#25919, opened Sep 2, 2025)
- Update AppendExecutionProvider docs (#25924, opened Sep 2, 2025)
- Fix Local Attention off by 1 bug (#25927, opened Sep 2, 2025)
- [webgpu] Optimize flash decoding by merging QKT and SplitVx shader (#25929, opened Sep 3, 2025)
- [WebNN] Fix bug in quantizeLinear and dequantizeLinear (#25931, opened Sep 3, 2025)
- [webgpu] Add dispatchWorkgroupsIndirect support (#25934, opened Sep 3, 2025)
- [WebGPU] allow async shader compilation (#25941, opened Sep 3, 2025)
- Add onnxruntime_plugin_ep_onnx_test for plugin EPs to run onnx test (#25942, opened Sep 4, 2025)
- [webgpu] Unify the present_sequence_length in flash attention (#25945, opened Sep 4, 2025)
- Bump electron from 28.3.2 to 35.7.5 in /js/web (#25950, opened Sep 4, 2025)
- [CPU] MoE Kernel (#25958, opened Sep 4, 2025)
- [webgpu] Move copy_kv_cache to template (#25961, opened Sep 5, 2025)
- Split large inputs into smaller buffers to bypass maxStorageBufferBindingSize limit (#25962, opened Sep 5, 2025)
- [TRT RTX EP] Rename TRT RTX libs to not collide with TRT (#25964, opened Sep 5, 2025)
- Fix Attention GQA implementation on CPU (#25966, opened Sep 5, 2025)
- Update React Native iOS CI build (#25967, opened Sep 5, 2025)
- Clean up CUDA contrib ops folder (#25969, opened Sep 5, 2025)
- [WIP] Fix Mac Catalyst build options (#25970, opened Sep 5, 2025)
21 Issues closed by 7 people
- GetEpDevices() does not Detect Intel NPU via OpenVINO EP (#25557, closed Sep 5, 2025)
- [Build] compiling errors for armeabi-v7a: vfmaq_f32 undeclared (#25949, closed Sep 4, 2025)
- [BUG] Non-zero status code returned while running Resize node in DirectML backend (#24928, closed Sep 3, 2025)
- Incorrect Use of CUDA Constants in MIGraphXExecutionProvider::CreatePreferredAllocators (Should Use HIP) (#25268, closed Sep 3, 2025)
- [Performance] How to use pinned memory in onnxruntime (#20947, closed Sep 2, 2025)
- [Build] can't build CUDA (+ OpenVINO and DirectML) for latest v1.22 on Windows (#25081, closed Sep 2, 2025)
- Initializers use wrong allocator (#25108, closed Sep 2, 2025)
- [Build] CMake configuration files for bin release 1.22.0 are broken (#25242, closed Sep 2, 2025)
- [DirectML EP] Error when validating attributes of `Slice` operator (#25252, closed Sep 2, 2025)
- [Build] CMake configuration files for bin release 1.22.0 are broken for Linux (#25279, closed Sep 2, 2025)
- Invalid MIGraphX EP option: migraphx_load_compiled_path (#25379, closed Sep 2, 2025)
- [Mobile] OnnxTensor FLOAT16 creation works, but runtime fails with "unsupported type FLOAT16" (#25891, closed Sep 1, 2025)
- [Feature Request] Unable to run microsoft/Phi-3.5-vision-instruct-onnx on GPU with C# (#25892, closed Sep 1, 2025)
- [Build] CCCL API migration issue (#24774, closed Aug 31, 2025)
- [Build] How to build ONNX Runtime as a dynamic framework (.dylib/.framework) for iOS? (#25256, closed Aug 31, 2025)
- Memory exception during OCR inference (#25258, closed Aug 31, 2025)
- [Web] Error using opus-mt-mul fp16 models with WebGPU (#25125, closed Aug 30, 2025)
16 Issues opened by 15 people
- [Performance] Long-context OOM on newer architectures (#25965, opened Sep 5, 2025)
- [Web] Static WASM library: Actually selecting WebGpuExecutionProvider (#25952, opened Sep 4, 2025)
- Crash in `MlasRotaryEmbedOneRow` when running EmbeddingGemma (#25951, opened Sep 4, 2025)
- [Runtime bug on Windows] resize op exception. Tensor type mismatch. (#25948, opened Sep 4, 2025)
- [Build] Wheel file not present after build from source with `--build_wheel` (#25947, opened Sep 4, 2025)
- [Performance] Performance Regression in ReduceMean Operator on Specific Axes (#25946, opened Sep 4, 2025)
- Latest release v1.22.2 not available on pypi (#25944, opened Sep 4, 2025)
- [Build] CUDA 13 Failed (#25936, opened Sep 3, 2025)
- Could not find an implementation for QuantizeLinear(23) (#25932, opened Sep 3, 2025)
- LayerNormalization - The parameter is incorrect error (#25926, opened Sep 2, 2025)
- [Build] build for catalyst support will generate invalid compile command (#25920, opened Sep 2, 2025)
- [Web] Magic import comment missing for Vite (/* @vite-ignore */) (#25918, opened Sep 2, 2025)
- Deadlock Bug in Win10 + ORT + Specific Execution Mode (#25915, opened Sep 2, 2025)
- No opset import for domain 'com.microsoft.nchwc' error (#25914, opened Sep 2, 2025)
- Windows 7 compatibility for Microsoft.ML.OnnxRuntime NuGet package with C# (#25910, opened Sep 1, 2025)
57 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- [AIX] test-suites failure fixes (#25791, commented on Sep 5, 2025; 17 new comments)
- FP16 inference performance improvement on CPU (#25680, commented on Sep 2, 2025; 10 new comments)
- NEON kernels for NCHWc Convolution and Pooling (#25580, commented on Sep 5, 2025; 9 new comments)
- Support int4 and uint4 for reshape on opset 21+ (#25645, commented on Sep 6, 2025; 6 new comments)
- Extend ORT perf test to report submit vs inference time (#25895, commented on Sep 5, 2025; 6 new comments)
- Integrate SME1 SGEMM KleidiAI kernels (#25760, commented on Sep 5, 2025; 6 new comments)
- [Don't review][webgpu] Make graph capture work on LLM (#25868, commented on Sep 3, 2025; 6 new comments)
- [ARM CPU] SVE support for Elementwise kernels (#25238, commented on Sep 2, 2025; 5 new comments)
- [CANN] Fix the ACL_ERROR_REPEAT_INITIALIZE error that occurs when coexisting with torch_npu (#25902, commented on Aug 30, 2025; 5 new comments)
- Fix incorrect symbolic shape inference for ConstantOfShape (#25857, commented on Sep 1, 2025; 1 new comment)
- Add model compilation in ORT perf test (#25797, commented on Sep 5, 2025; 1 new comment)
- [Feature Request] Making the cuda version explicit in the python package name (#19438, commented on Sep 5, 2025; 0 new comments)
- onnxruntime 1.17.0: transformers benchmarking failing for int8 quantized inference (#19409, commented on Sep 5, 2025; 0 new comments)
- Compile API: output EPContext binary data to write function (#25471, commented on Sep 4, 2025; 0 new comments)
- LayerNormalization shape check error (#25896, commented on Sep 5, 2025; 0 new comments)
- [Web] Avoid unnecessary data copy for pre-allocated tensors (#25571, commented on Sep 5, 2025; 0 new comments)
- Error messages from QNN are turned into verbose level messages (#24876, commented on Sep 5, 2025; 0 new comments)
- [CANN] When using onnxruntime-cann for inference, it failed to utilize the NPU for inference (#22229, commented on Sep 5, 2025; 0 new comments)
- Update ORT to handle explicit OpSchemaRegisterOnce API in ONNX >= 1.18.0 for fluent chaining (#24561, commented on Aug 30, 2025; 0 new comments)
- Bump transformers from 4.50.0 to 4.53.0 in /onnxruntime/python/tools/transformers/models/stable_diffusion/requirements (#25672, commented on Sep 6, 2025; 0 new comments)
- Add `onnxruntime_show_ep_devices` (#25780, commented on Sep 2, 2025; 0 new comments)
- webgpu: support bool type for Flatten and Unsqueeze operators (#25804, commented on Sep 4, 2025; 0 new comments)
- Fix CPU EP Tile 0D overvalidation (#25821, commented on Sep 2, 2025; 0 new comments)
- Add ort api for CustomOp compute events (#25843, commented on Sep 1, 2025; 0 new comments)
- [webgpu] Optimize MatMul Op (#25854, commented on Sep 5, 2025; 0 new comments)
- [QNN EP] Add Case-2 LPBQ pattern support for Gemm and Matmul nodes (#25865, commented on Sep 5, 2025; 0 new comments)
- [TRT-RTX EP] add missing thread header include to TRT-RTX provider (#25875, commented on Sep 3, 2025; 0 new comments)
- [DRAFT] Attention and MHA CUDA BFloat16 Support - not ready for review (#25885, commented on Sep 5, 2025; 0 new comments)
- [Documentation] Update ONNX Runtime Documentation to Position WinML as Preferred Windows Path and DirectML in Sustained Engineering (#25901, commented on Sep 3, 2025; 0 new comments)
- [Build] add presets for CMake configuration (#22628, commented on Aug 30, 2025; 0 new comments)
- [Auto EP] How to set graph optimization level specifically for a registered EP? (#25593, commented on Aug 30, 2025; 0 new comments)
- [Bug] Models that used to run on onnxruntime 1.16.3 causes AccessViolationException when loading on onnxruntime 1.22.1 (#25588, commented on Aug 30, 2025; 0 new comments)
- [OpenVINO] SessionOptionsAppendExecutionProvider_OpenVINO API loads NULL config file (#23871, commented on Aug 30, 2025; 0 new comments)
- [Bug] [Performance] Cannot write_calibration_table for per channel quantization calibration (#25621, commented on Aug 31, 2025; 0 new comments)
- Segmentation Fault running model (#25613, commented on Aug 31, 2025; 0 new comments)
- [Bug] CUDAExecutionProvider fails to load due to missing libcudnn.so.9 in LD_LIBRARY_PATH when using onnxruntime-gpu==1.22.0 (#25609, commented on Aug 31, 2025; 0 new comments)
- Unable to load gpt-oss onnx model with dotnet (#25793, commented on Aug 31, 2025; 0 new comments)
- quantize_dynamic causes shape mismatch when model has Gemm with transB = 1 (#25890, commented on Sep 1, 2025; 0 new comments)
- what(): Exception during loading: MLDataType for: tensor(int32) is not currently registered or supported (#25812, commented on Sep 1, 2025; 0 new comments)
- [Mobile] CoreML Execution Provider not listed in supported values (#25869, commented on Sep 2, 2025; 0 new comments)
- [Build] (#13606, commented on Sep 3, 2025; 0 new comments)
- Does BatchNormalization support 2D shape of `X` input (#25230, commented on Sep 3, 2025; 0 new comments)
- [Feature Request] Python support for AddExternalInitializersFromFilesInMemory (#25873, commented on Sep 3, 2025; 0 new comments)
- IExecutionProvider::FusedNodeAndGraph set intermediate unused results as model outputs (#25647, commented on Sep 3, 2025; 0 new comments)
- [Build] cmake cannot find KLEIDIAI - Windows 11 ARM (#24865, commented on Sep 3, 2025; 0 new comments)
- [Mobile] MatMulNbits Q8 Errors out on Android (#24769, commented on Sep 4, 2025; 0 new comments)
- Cannot run bfloat16 convolution (#25740, commented on Sep 4, 2025; 0 new comments)
- [Mobile] libonnxruntime4j_jni.so 16KB page size compatibility on Android ARM64 (#25859, commented on Sep 4, 2025; 0 new comments)
- [Web] WebGPU first-run model warm-up causes long GPU-blocking operations (maxDispatchNumber & synchronous pipeline creation) (#25882, commented on Sep 4, 2025; 0 new comments)
- Inference fails with 4 bit quantization (#25631, commented on Sep 4, 2025; 0 new comments)
- Cannot use Microsoft.ML.OnnxRuntime NuGet package 1.22.1 with Microsoft.SemanticKernel.Connectors.Onnx (#25287, commented on Sep 4, 2025; 0 new comments)
- Onnx Runtime for Java is packaged with 200MB onnxruntime.pdb in the win-x64 native package (#12084, commented on Sep 4, 2025; 0 new comments)
- [Feature Request] Windows ARM64 native wheels (#19161, commented on Sep 4, 2025; 0 new comments)
- [Performance] cudaMemcpyAsync dominates runtime for batch inference with FP32 inputs (#25852, commented on Sep 5, 2025; 0 new comments)
- [Performance] ONNX FP16 model is having performance bottle neck when compared to FP32 variant (#25824, commented on Sep 5, 2025; 0 new comments)
- Type mismatch error when loading a Float16 model (#25522, commented on Sep 5, 2025; 0 new comments)
- [WebGPU] Subgroups feature is not enabled for ort-web WebGPU EP (#25595, commented on Sep 5, 2025; 0 new comments)