segmentation fault while using onnxruntime==1.21.0 #24144
Here is the exception message that is issued. I am not seeing a segmentation fault on the main build:

unknown file: error: C++ exception with description "Load model from D:/dev/data/SegmentationFault_gh_24144/hf_Qwen2-7B-Instruct_model.onnx failed:Load model D:/dev/data/SegmentationFault_gh_24144/hf_Qwen2-7B-Instruct_model.onnx failed" thrown in the test body.
Hi @yuslepukhin, thanks for looking into it.
Hi @yuslepukhin, I'm using the model from the following location: https://huggingface.co/Qwen/Qwen2-7B-Instruct/tree/main
Please share exactly what you did, along with any console messages. Enable logging, share the output, and explain specifically what makes you think there is a segmentation fault. Please also fill out the issue template, including the version of your Linux OS, etc.
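For reference, a minimal sketch of the kind of verbose logging the maintainer is asking for, set at session creation. The helper name is hypothetical (not from the thread), and the import is guarded so the sketch degrades gracefully when onnxruntime is not installed:

```python
# Hypothetical sketch: turn on the most verbose ONNX Runtime logging so that
# diagnostics are emitted before a potential native crash.
try:
    import onnxruntime as ort
    HAVE_ORT = True
except ImportError:
    HAVE_ORT = False

def make_verbose_session_options():
    """Return SessionOptions with maximum log verbosity, or None when
    onnxruntime is unavailable in this environment."""
    if not HAVE_ORT:
        return None
    so = ort.SessionOptions()
    # 0 = VERBOSE, 1 = INFO, 2 = WARNING (default), 3 = ERROR, 4 = FATAL
    so.log_severity_level = 0
    return so
```

The returned options would then be passed as the `sess_options` argument of `InferenceSession`, so the load/initialization phase logs every step leading up to a crash.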
@yuslepukhin I'm trying to create a script which can reproduce the issue at your end. I'll try to share it with you soon.
Steps to reproduce:
Please let me know if you need any information from my side in this regard. Thanks.
I have followed the procedure and got the model. I produced a debug build from the tip of main.
Hi @yuslepukhin, were you able to run the complete script without any segmentation fault? If yes, can you please check the onnxruntime version?
While running the inference step, I was getting a segmentation fault:
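As a general diagnostic aid (a sketch, not something from the thread), the standard library's faulthandler module can be enabled before the session is created, so that a Python-level traceback is printed even when the crash happens inside a native library:

```python
import faulthandler

# Dump the Python traceback to stderr when the process receives SIGSEGV,
# SIGFPE, SIGABRT, or SIGBUS; this shows which Python call was in flight
# when a native library (such as onnxruntime) crashed.
faulthandler.enable()
```

This does not prevent the crash, but it usually pinpoints the exact call (e.g. the `InferenceSession` constructor) that triggered it.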
The bug reproduces with 1.21.0, but is not present with the latest code.

D:\dev\data\SegmentationFault_gh_24144$ pip list
certifi 2025.1.31
@yuslepukhin It is great that you are able to reproduce the issue. I think we should add this as a test case to avoid such a regression in the future. What do you say? Let me know if you want me to add it. If yes, can you please share some documentation or guide me on how to add and verify it?
As you mentioned, the problem was fixed in the maintenance release in this PR, in one of the optimizations. Feel free to add a test if you so desire. There is no documentation on how to add tests, but you can check any of the test files and see how they are written.
@yuslepukhin I'm unable to verify whether this has been fixed in the latest onnxruntime binary. I tried installing a nightly build (onnxruntime-1.22.0.dev20250321002), but I am still getting the segmentation fault.
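When testing a nightly, it is worth confirming which build is actually imported (e.g. by printing `onnxruntime.__version__` at runtime). The helper below is a hypothetical stdlib-only sketch for comparing such version strings, ignoring the `.dev` suffix that nightlies carry:

```python
def release_tuple(version):
    """Extract the numeric release prefix of a version string, ignoring
    dev/rc suffixes: "1.22.0.dev20250321002" -> (1, 22, 0)."""
    parts = []
    for p in version.split("."):
        if p.isdigit():
            parts.append(int(p))
        else:
            break
    return tuple(parts)

# A 1.22.0.dev nightly carries a newer release prefix than 1.21.0:
print(release_tuple("1.22.0.dev20250321002") > release_tuple("1.21.0"))  # → True
```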
Everything that is fixed in the maintenance release is fixed in main. You are welcome to debug it; it may be a new issue. I do not have a repro.
Any details? What platform?

Please add this information to the GH issue along with the hardware you are using. I do not have time allocated for this now, but I will get to it eventually.
--
Dmitri

From: Dmytro Varich
Sent: Tuesday, April 29, 2025 4:41
To: microsoft/onnxruntime
Subject: Re: [microsoft/onnxruntime] segmentation fault while using onnxruntime==1.21.0 (Issue #24144)
@yuslepukhin, Hi!
I am writing a bachelor's thesis in which I am developing a custom package for ROS 2. In this package, my node uses ONNX Runtime to run a segmentation model with CUDA (GPU acceleration).
I get a segmentation fault when trying to run inference using ONNX Runtime 1.21.1 with the CUDAExecutionProvider (CUDA version 12.2). The crash occurs during initialization of the InferenceSession.
Here is the relevant code snippet:
# imports at module level
import os
import onnxruntime as ort

if not os.path.exists(self.model_path):
    self.get_logger().error(f"Model file not found at {self.model_path}")
    return  # bail out instead of falling through to InferenceSession
self.session = ort.InferenceSession(
    self.model_path,
    providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])
self.get_logger().info(f"ONNX model loaded from {self.model_path}")
Environment:
* ONNX Runtime: 1.21.1
* CUDA: 12.2
* OS: Ubuntu 22.04
* Python: 3.10
* Execution Providers: ['CUDAExecutionProvider', 'CPUExecutionProvider']
* Model: peoplesemsegnet_shuffleseg.onnx (Nvidia NGC Catalog: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/peoplesemsegnet/files?version=deployable_shuffleseg_unet_onnx_v1.0)
* Model Zip File: peoplesemsegnet_deployable_shuffleseg_unet_onnx.zip (https://github.com/user-attachments/files/19957865/peoplesemsegnet_deployable_shuffleseg_unet_onnx.zip)
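Given that the crash occurs with `CUDAExecutionProvider` in the providers list, one isolation step (a sketch of my own, with a hypothetical helper name; the import is guarded so it degrades when onnxruntime is absent) is to first check whether the installed build ships the CUDA provider at all:

```python
# Hypothetical sketch: check whether this onnxruntime build actually
# contains the CUDA execution provider before requesting it.
try:
    import onnxruntime as ort
except ImportError:
    ort = None

def cuda_available():
    """True/False when onnxruntime is installed, None otherwise."""
    if ort is None:
        return None
    return "CUDAExecutionProvider" in ort.get_available_providers()
```

If this reports False, the pip package in use is CPU-only and the session silently falls back to CPU; creating the session with only `['CPUExecutionProvider']` is then a second isolation step to tell a CUDA-specific crash apart from a general model-loading one.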
🤔 I am not sure whether the problem is that I'm using a model from NVIDIA that was originally designed for TensorRT (from Isaac ROS), or whether it's still an issue with the ONNX Runtime version, even though you said it would be fixed in the new release compared to 1.21.0.
Thank you in advance!
onnxruntime crashes with a segmentation fault when using version 1.21.0. It does not crash with the 1.20.1 release.
Steps to reproduce:
hf_Qwen2-7B-Instruct_model.onnx.gz
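The attached model is gzip-compressed, so it has to be decompressed before `InferenceSession` can load it. A minimal stdlib sketch (the helper name and paths are assumptions based on the attachment name):

```python
import gzip
import shutil

def gunzip(src, dst):
    """Stream-decompress a .gz file to dst, so a multi-GB model
    never has to sit fully in memory."""
    with gzip.open(src, "rb") as f_in, open(dst, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)

# Usage (paths assumed):
# gunzip("hf_Qwen2-7B-Instruct_model.onnx.gz", "hf_Qwen2-7B-Instruct_model.onnx")
```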