Insights: microsoft/onnxruntime
Overview
46 Pull requests merged by 28 people
-
Add API to compile a model
#24207 merged
Apr 12, 2025 -
ORT-OVEP Doc update
#24395 merged
Apr 12, 2025 -
Update protobuf-java to 3.25.5
#24333 merged
Apr 12, 2025 -
Bump vite from 6.2.5 to 6.2.6 in /js/web/test/e2e/exports/testcases/vite-default
#24396 merged
Apr 11, 2025 -
[Native WebGPU EP] Add InstranceNormalization
#24369 merged
Apr 11, 2025 -
Migrate OpenVino Pipeline to Github Actions
#24297 merged
Apr 11, 2025 -
[webgpu] fix 2 bugs in Conv/ConvTranspose
#24388 merged
Apr 11, 2025 -
[QNN EP] Add support for int64 shape input of Expand Op
#24389 merged
Apr 11, 2025 -
[webgpu] Use workgroup memory to reduce register pressure
#24286 merged
Apr 11, 2025 -
MlasTranspose multi-threads support.
#24261 merged
Apr 11, 2025 -
[web] allow NPM tests to run nodejs binding for webgpu
#24370 merged
Apr 11, 2025 -
Make test CApiTest.RequestLoadCancellation deterministic
#24348 merged
Apr 10, 2025 -
Remove build-nuget from dml-vs-2022.yml
#24372 merged
Apr 10, 2025 -
[WebNN EP] Support GroupQueryAttention(GQA)
#23416 merged
Apr 10, 2025 -
Pin wheel version to 0.45.1 (#24349)
#24367 merged
Apr 10, 2025 -
fix compile on latest 24.02
#24364 merged
Apr 9, 2025 -
Update unknown provider error message with current providers
#24352 merged
Apr 9, 2025 -
[EP Perf] Extension to post benchmark perf from local devices
#24236 merged
Apr 9, 2025 -
[WebNN] Support MatMulNBits op
#24142 merged
Apr 9, 2025 -
[web] fix TypeScript typing and add a test case
#24354 merged
Apr 9, 2025 -
Support WebGPU build for android and ios
#24308 merged
Apr 9, 2025 -
[QNN EP] Add support for Int64 tensors
#24351 merged
Apr 9, 2025 -
[QNN-EP] LoRAv2 Document update
#24205 merged
Apr 9, 2025 -
Pin wheel version to 0.45.1
#24349 merged
Apr 9, 2025 -
Merge 'main' into 'win-ort-main' @ 39e585ff2b
#24353 merged
Apr 9, 2025 -
[webgpu][dawn API optimization] workgroup dispatch
#24329 merged
Apr 8, 2025 -
Group build args
#24337 merged
Apr 8, 2025 -
[WebGPU EP] Exclude zero-dim input test case for WebGPU EP.
#24350 merged
Apr 8, 2025 -
[web] revise flag ort.env.wasm.simd
#24314 merged
Apr 8, 2025 -
ROCm: Remove -Wno-interference-size compiler flag
#24326 merged
Apr 8, 2025 -
[webgpu] optimize SkipLayerNormalization operator
#24164 merged
Apr 8, 2025 -
[webgpu] fix bias-add
#24336 merged
Apr 8, 2025 -
[webgpu] Fix bias_split_gelu
#24342 merged
Apr 8, 2025 -
Remove explicit batch network flag for TRT 10+
#24298 merged
Apr 8, 2025 -
[webgpu] fix the reflect mode issue of Pad
#24202 merged
Apr 8, 2025 -
webgpu support for DequantizeLinear
#24268 merged
Apr 8, 2025 -
Use WASM f32x4 relaxed min/max for relaxed simd build
#24324 merged
Apr 8, 2025 -
[webgpu] Flash attention for generation
#23808 merged
Apr 8, 2025 -
Bump version to 1.21.1
#24328 merged
Apr 8, 2025 -
[VitisAI EP] export InferShapes to VitisAIEP
#23881 merged
Apr 8, 2025 -
[Native WebGPU] Exclude WebGPU EP from Conv3D tests.
#24327 merged
Apr 7, 2025 -
[webgpu] Fix ROUND_PREFER_CEIL issue of Resize operator
#24229 merged
Apr 7, 2025 -
Update Vitis-AI-ExecutionProvider.md
#24254 merged
Apr 7, 2025 -
Implement load cancellation ability
#24257 merged
Apr 7, 2025 -
[webgpu][dawn API optimization] reduce number of calls to buffer APIs
#24315 merged
Apr 7, 2025 -
[webgpu] Use 1D dispatch groups for attention
#24228 merged
Apr 7, 2025
29 Pull requests opened by 27 people
-
Remove Unnecessary List Conversions in Input Validation
#24331 opened
Apr 7, 2025 -
Add GQA fusion for CUDA EP
#24335 opened
Apr 7, 2025 -
Enable SME for sgemm and sbgemm through KleidiAI
#24346 opened
Apr 8, 2025 -
[WebGPU EP] Add EINSUM implementation
#24358 opened
Apr 9, 2025 -
[VitisAI] enable weights sharing
#24359 opened
Apr 9, 2025 -
Fix Windows_CI_GPU_DML_Dev_x86 and Windows_CI_GPU_DML_Dev_arm64 pipeline steps
#24365 opened
Apr 9, 2025 -
[nodejs] support Node.js binding in multi env
#24366 opened
Apr 9, 2025 -
[ort-build] Pass ORT_EXTRA_INTERFACE_FLAGS to onnxruntime_session
#24368 opened
Apr 9, 2025 -
Implement experimental intermediate cross CPU EP allocation
#24371 opened
Apr 9, 2025 -
Fix MatmulTransposeFusion when input A and B are the same
#24373 opened
Apr 10, 2025 -
[WIP] Support export of Llama with DynamicCache and transformers>=4.51
#24379 opened
Apr 10, 2025 -
Update whisper transformer module to 4.48.0
#24382 opened
Apr 10, 2025 -
[CPU] Add 8bit support to matmulnbits quantizer
#24384 opened
Apr 10, 2025 -
Add Resize cubic mode without antialias (scales = [1, ≥1, ≥1, 1])
#24385 opened
Apr 10, 2025 -
[MacOS] Add MLProgram Gather op for CoreML EP
#24387 opened
Apr 10, 2025 -
[webgpu] Supports batch and zero points in MatMulNBits WideTileProgram
#24390 opened
Apr 11, 2025 -
[DML EP] Support in-memory external data TensorProto
#24391 opened
Apr 11, 2025 -
Replace gsl::narrow with narrow in xnnpack code
#24392 opened
Apr 11, 2025 -
No float from chars on gcc9
#24393 opened
Apr 11, 2025 -
ONNXRuntime OpenVINO - Release 1.22
#24394 opened
Apr 11, 2025 -
Add LSX support for S8S8 and S8U8 GEMM kernels
#24397 opened
Apr 11, 2025 -
[Native WebGPU] Support shared memory version of ReduceOps
#24399 opened
Apr 11, 2025 -
[webgpu] move comments out from WGSL in FlashAttention impl
#24400 opened
Apr 12, 2025 -
Support mixed precision in quantization for RTN
#24401 opened
Apr 12, 2025 -
Add ability to disable Model Editor API
#24402 opened
Apr 12, 2025 -
coreml: fix wrong C++ code in documentation
#24403 opened
Apr 12, 2025 -
[webgpu] Fix batch-norm for ort-web-tests
#24404 opened
Apr 13, 2025 -
Bump ruff from 0.9.5 to 0.11.5
#24405 opened
Apr 14, 2025 -
[OpenVINO EP] Implement new overload of CreateProvider() for OpenVINO EP
#24406 opened
Apr 14, 2025
16 Issues closed by 13 people
-
Inputs are reordered by TensorRT provider
#22729 closed
Apr 11, 2025 -
[Build] Cuda Execution Provider library is needed despite we only use TensoRT Execution provider
#22960 closed
Apr 11, 2025 -
[Build] pip Installation Failure on aarch64 (NVIDIA Jetson Orin)
#24380 closed
Apr 10, 2025 -
[Build] cublasLt64_11.dll is required even when building cuda 12
#24360 closed
Apr 10, 2025 -
[Web] TypeScript typings are not available with moduleResolution: bundler
#24343 closed
Apr 9, 2025 -
[Web] Upgrading from 1.20.1 to 1.21.* breaks Segment Anything models on WebGPU
#23183 closed
Apr 9, 2025 -
[Web] Allow to disable the SIMD detection check
#24292 closed
Apr 8, 2025 -
[Build] Build python interface for Onnxruntime-qnn on aarch64 Open Embedded Linux
#24102 closed
Apr 8, 2025 -
Access violation in nvcuda64.dll on Session::Run
#24339 closed
Apr 8, 2025 -
Python 3.11.11 can't install onnxruntime==1.20.1
#24338 closed
Apr 8, 2025 -
ONNX cannot save the XGBoost binary classifier properly when trained on an imbalanced dataset.
#24334 closed
Apr 7, 2025
8 Issues opened by 7 people
-
[Performance][Regression] Numerical mismatch between ORT 1.21.0 and PyTorch
#24398 opened
Apr 11, 2025 -
[Build] Build for android with xnnpack enabled and Exceptions disabled has error with gsl::narrow
#24383 opened
Apr 10, 2025 -
GPU Memory Leak on CUDAExecutionProvider with Specific Batch Size Sequence
#24376 opened
Apr 10, 2025 -
How to confirm if the current hardware supports operators such as fp16/int8/int4?
#24375 opened
Apr 10, 2025 -
quantize onnx models to INT8
#24374 opened
Apr 10, 2025 -
[Build] CUDA Minimal build still needs CUDNN_HOME to be specified
#24361 opened
Apr 9, 2025 -
onnxruntime errors out due to ORT_ENABLE_EXTENDED optimization: Error merging shape info for output
#24340 opened
Apr 8, 2025
42 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
[WebNN EP] Support MultiHeadAttention(MHA)
#24079 commented on
Apr 14, 2025 • 16 new comments -
Fix an issue in wasm nortti build and add minimal build support for vcpkg
#24012 commented on
Apr 10, 2025 • 16 new comments -
[WebNN EP] Automatically use ml-tensor for outputs
#24282 commented on
Apr 13, 2025 • 13 new comments -
Add static quantization runner
#24114 commented on
Apr 9, 2025 • 1 new comment -
in cmake/CMakeList.txt all avx related option all set off, do we need do anything to use avx features?
#11833 commented on
Apr 11, 2025 • 0 new comments -
ONNXRuntime-Node v1.21: "Specified device is not supported" Error on Ubuntu 22.04.4 LTS During session.run Execution
#24264 commented on
Apr 11, 2025 • 0 new comments -
[Build] compilation error: invalid instruction mnemonic 'vcvtneeph2ps'
#22519 commented on
Apr 12, 2025 • 0 new comments -
[Mobile] I After onnx is compiled using NDK, the compiled result is then linked to the C++ program
#24024 commented on
Apr 12, 2025 • 0 new comments -
[Web] No way to prevent the default wasm from being bundled
#24009 commented on
Apr 12, 2025 • 0 new comments -
Turning on coreml and turning off coreml are two results
#24033 commented on
Apr 13, 2025 • 0 new comments -
[Web] How to use JSEP and WebGPU in static library (missing jsepAlloc or jsepInit)
#23072 commented on
Apr 13, 2025 • 0 new comments -
Bump @babel/helpers from 7.25.6 to 7.26.10 in /js/react_native/e2e
#23993 commented on
Apr 11, 2025 • 0 new comments -
Bump @babel/runtime from 7.25.6 to 7.26.10 in /js/react_native/e2e
#23994 commented on
Apr 11, 2025 • 0 new comments -
[VSINPU]Fix gather OP with scalar indice issue
#24061 commented on
Apr 11, 2025 • 0 new comments -
[webgpu] Use 64 as the workgroup size of DP4AMatMulQuantize
#24129 commented on
Apr 7, 2025 • 0 new comments -
Fuse Initializers Graph Transform
#24175 commented on
Apr 11, 2025 • 0 new comments -
Validate CreateSessionFromArray with ep.context_enable enabled
#24176 commented on
Apr 11, 2025 • 0 new comments -
Enable Inference Results Saving in onnx-test-runner
#24210 commented on
Apr 14, 2025 • 0 new comments -
Add ConvSymKernelAvx2 assembly saturation checker
#24220 commented on
Apr 12, 2025 • 0 new comments -
[WIP] Support export of Llama with DynamicCache and transformers>=4.48
#24291 commented on
Apr 10, 2025 • 0 new comments -
Upgrade transformers to 4.48.0 for llama2
#24302 commented on
Apr 12, 2025 • 0 new comments -
GLU Operator gives different Results on Dml EP compared to CPU EP
#24311 commented on
Apr 7, 2025 • 0 new comments -
Error when I use cuda_runtime.h and OpenVINO EP at the same time
#23941 commented on
Apr 7, 2025 • 0 new comments -
SIGSEGV when calling OrtSession.run()
#24288 commented on
Apr 7, 2025 • 0 new comments -
OnnxRuntime 1.21.0 Java package failed to load on Windows
#24287 commented on
Apr 7, 2025 • 0 new comments -
how to get memory allocate detail in model infer?
#24323 commented on
Apr 8, 2025 • 0 new comments -
[DO NOT UNPIN] ORT Nightly Package Name Change
#22541 commented on
Apr 8, 2025 • 0 new comments -
[Feature Request] Please publish 3.13t wheels for Windows
#24318 commented on
Apr 8, 2025 • 0 new comments -
Using directML to inference accelerate onnxruntime, a crash occurred.
#22514 commented on
Apr 8, 2025 • 0 new comments -
[DO NOT UNPIN] ORT 1.21.0 Release Candidates available for testing
#23885 commented on
Apr 8, 2025 • 0 new comments -
[Web] Different model output with WebGPU vs WASM (or Python with the CUDA EP)
#24070 commented on
Apr 9, 2025 • 0 new comments -
[Feature Request] use Onnxruntime TensorRT execution provider with lean tensorRT runtime
#23082 commented on
Apr 10, 2025 • 0 new comments -
[Build] Unable to Compile ONNX Runtime 1.20.1 with ARMNN Provider on ARM Cortex A78
#23014 commented on
Apr 10, 2025 • 0 new comments -
[Performance] CUDAExecutionProvider without RoiAlign (opset 16 version)
#21990 commented on
Apr 10, 2025 • 0 new comments -
[Performance] does onnxruntime 1.19.0 support sve?
#23983 commented on
Apr 10, 2025 • 0 new comments -
[preprocess] Pad is not folded in Conv when opset_import is > 20
#23973 commented on
Apr 10, 2025 • 0 new comments -
[Build] .pc file asks for -lonnxruntime but onnxruntime.a isn't installed
#23959 commented on
Apr 10, 2025 • 0 new comments -
[Build] memory leaked
#23915 commented on
Apr 10, 2025 • 0 new comments -
[Build] Build script for MacOS fails for targets older than 13.4 because tests can not be built
#24277 commented on
Apr 10, 2025 • 0 new comments -
[Build] error: array index 7 is past the end of the array (that has type '__m256[4]')
#23180 commented on
Apr 10, 2025 • 0 new comments -
[Build] Missmatch between CMake config and folder structure of onnxruntime-linux-x64-1.21.0.tgz
#24003 commented on
Apr 11, 2025 • 0 new comments -
Bad Allocation Error in ONNX Runtime on Windows x86 CPU When Processing Multiple Images Sequentially
#23938 commented on
Apr 11, 2025 • 0 new comments