🔥 Run DeepSeek R1 Distilled Models Locally with ONNX Runtime – Faster & Private 🔥
AI inference doesn't have to rely on the cloud. With ONNX Runtime, you can run DeepSeek R1 distilled models on your CPU, GPU, or NPU, keeping your data private while achieving up to 6.3x faster performance than PyTorch.
Why run on-device?
🔹 Privacy – No data leaves your machine
🔹 Speed – Optimized performance across Intel, AMD, NVIDIA, and Qualcomm hardware
🔹 Flexibility – Easy deployment on PCs via Hugging Face + ORT
🔗 Check out the blog here: https://lnkd.in/gQgYwKJ8
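For a flavor of what on-device generation looks like, here is a minimal sketch assuming the onnxruntime-genai Python package and a locally downloaded ONNX build of a DeepSeek R1 distilled model (the folder path is a placeholder, and the exact API surface varies between onnxruntime-genai releases):

```python
# pip install onnxruntime-genai  (or onnxruntime-genai-cuda / -directml for GPU)
import onnxruntime_genai as og

# Placeholder path: a folder containing an ONNX build of a DeepSeek R1 distilled model
model = og.Model("./deepseek-r1-distill-qwen-1.5b-onnx")
tokenizer = og.Tokenizer(model)
stream = tokenizer.create_stream()

params = og.GeneratorParams(model)
params.set_search_options(max_length=2048)

generator = og.Generator(model, params)
generator.append_tokens(tokenizer.encode("Why is the sky blue?"))

# Token-by-token generation loop, decoded and printed as it streams
while not generator.is_done():
    generator.generate_next_token()
    print(stream.decode(generator.get_next_tokens()[0]), end="", flush=True)
```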
ONNX Runtime
Software Development
Redmond, Washington · 2,401 followers
Run fast, run anywhere: ONNX Runtime is a machine learning accelerator for cloud, edge, web and mobile
About us
ONNX Runtime is a cross-platform machine-learning model accelerator with a flexible interface for integrating hardware-specific libraries. ONNX Runtime can be used with models from PyTorch, TensorFlow/Keras, TFLite, scikit-learn, and other frameworks.
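For a flavor of that workflow, here is a minimal sketch: a toy PyTorch model exported to ONNX and run back through ONNX Runtime (the model, names, and shapes are illustrative):

```python
import torch
import onnxruntime as ort

# Toy model standing in for any PyTorch network
model = torch.nn.Linear(4, 2).eval()
example = torch.randn(1, 4)

# Export to ONNX, then run with ONNX Runtime on CPU
torch.onnx.export(model, example, "model.onnx",
                  input_names=["x"], output_names=["y"])
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
outputs = session.run(None, {"x": example.numpy()})
print(outputs[0])  # matches model(example) within float tolerance
```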
- Website: https://onnxruntime.ai
- Industry: Software Development
- Company size: 51-200 employees
- Headquarters: Redmond, Washington
- Type: Public Company
Locations
- Primary: Redmond, Washington 98052, US
Updates
-
ONNX Runtime reposted this
You can now experiment with Phi-4 models using #ONNXRuntime for faster on-device inference. Try it now: https://lnkd.in/guKkAAmA https://lnkd.in/gJVQNUzt https://lnkd.in/g2fpic-H
Introducing Phi-4-multimodal and Phi-4-mini! Phi-4-multimodal integrates speech, vision, and text processing, while Phi-4-mini excels in text-based tasks. Discover these models on Azure AI Foundry: https://msft.it/6048U7H48
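For reference, pulling an ONNX build of one of these models from Hugging Face can look like the sketch below (the repo id is illustrative; check Hugging Face for the exact ONNX variant you want):

```python
from huggingface_hub import snapshot_download

# Illustrative repo id: substitute the actual ONNX build of Phi-4-mini from Hugging Face
local_dir = snapshot_download("microsoft/Phi-4-mini-instruct-onnx")
print("Model downloaded to:", local_dir)
# The folder can then be loaded with onnxruntime-genai, e.g. og.Model(local_dir)
```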
-
Powered by ONNX Runtime 🔥🚀
WOW, check this out: it's LMStudio.ai running ENTIRELY locally on an NPU from Qualcomm!
-
Are you a developer in the NYC region? Interested in building applications using on-device AI? Want to score a Windows Copilot PC?? Register for this exclusive hackathon onsite at NYU in Brooklyn this weekend, 12/7-8, hosted by LMStudio, Qualcomm, and Microsoft 🛠️🔥 https://lu.ma/41hfiu79
-
AI Dev Gallery https://lnkd.in/gVvYcTZe is powered by ONNX Runtime! #onnxruntime #ai-dev-gallery #windows #ignite24
AI Dev Gallery (Preview) - An app designed to help Windows developers integrate AI capabilities within their own apps and projects. https://buff.ly/4ezw47e #ai #windowsdev #aidevgallery
-
🚀 Simplifying AI Model Customization with MultiLoRA on ONNX Runtime 🎯
AI practitioners, are you looking to serve multiple fine-tuned tasks from a single model while optimizing performance and reducing resource consumption? The latest update to ONNX Runtime introduces MultiLoRA, support for multiple Low-Rank Adaptation (LoRA) adapters, designed to streamline model customization across diverse applications.
🔑 Key highlights from the blog (see the sketch below):
💡 Serve multiple task-specific LoRA adapters on the same base model and switch between them at inference time.
⚡ Benefit from optimized inference performance and minimized memory overhead.
🌍 Empower customization for various industries, from healthcare to customer service.
MultiLoRA is a game-changer for developers and researchers leveraging ONNX Runtime, making it easier to adapt models to specific needs without the heavy lifting of full fine-tuning for every task.
👉 Dive into the details and let MultiLoRA with ORT transform your AI workflows: https://lnkd.in/gpUSyyBS
Let's shape the future of efficient AI customization, together! 💻✨
#AI #ONNXRuntime #MachineLearning #MultiLoRA #DeepLearning #ModelOptimization #LoraAdapters
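In onnxruntime-genai, switching between adapters looks roughly like this sketch (the model path, adapter files, and adapter names are hypothetical, and the adapter API may differ slightly between releases):

```python
import onnxruntime_genai as og

# Hypothetical paths: one base model plus per-task LoRA adapter files
model = og.Model("./base-model-onnx")
adapters = og.Adapters(model)
adapters.load("./medical_qa.onnx_adapter", "medical")
adapters.load("./support_chat.onnx_adapter", "support")

tokenizer = og.Tokenizer(model)
params = og.GeneratorParams(model)
params.set_search_options(max_length=256)

generator = og.Generator(model, params)
generator.set_active_adapter(adapters, "medical")  # pick the task-specific adapter
generator.append_tokens(tokenizer.encode("Summarize the patient notes..."))
while not generator.is_done():
    generator.generate_next_token()
print(tokenizer.decode(generator.get_sequence(0)))
```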
-
🔧 Learn about AI Model Optimization with Olive!
Check out Olive's CLI, a streamlined way to prepare and optimize your AI models for inference using ONNX Runtime. In this blog, you'll find:
- Step-by-step commands for optimizing AI models with Olive's "auto-opt" feature (see the sketch below)
- Flexible quantization options for customizing performance
- Support for LoRA and QLoRA fine-tuning
🔗 Read the full blog: https://lnkd.in/gaRdEr4M
#ONNX #ONNXRuntime #MachineLearning #ModelOptimization #AIModels #OnDeviceAI #Quantization #EdgeComputing #Olive #MicrosoftAI
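For orientation, an auto-opt run looks roughly like this sketch (the model id and output path are placeholders, and the flag spellings are assumptions that may differ across Olive releases; verify with `olive auto-opt --help`):

```bash
# pip install olive-ai
# Flag names below are assumptions; confirm with: olive auto-opt --help
olive auto-opt \
    --model_name_or_path microsoft/Phi-3.5-mini-instruct \
    --output_path ./optimized-model \
    --device cpu \
    --provider CPUExecutionProvider \
    --precision int4
```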
-
🚀 Boost team efficiency in model optimization with Olive's Shared Cache feature!
In the fast-paced world of machine learning, time and resource efficiency are key. Dive into how Olive's Shared Cache feature streamlines model optimization, slashing processing time and cutting down costs, paving the way for a more resource-efficient ML workflow.
Read the full post below ⬇️ to discover how Olive can enhance your team's productivity.
🔗 https://lnkd.in/ggFF5MtC
#MachineLearning #ModelOptimization #ONNX #ONNXRuntime #Olive #Azure #SharedCache #AItools #ModelDeployment
-
🚀 Looking for high-performing ONNX models, ready to deploy? We've got you covered with thousands of pre-converted, optimized models available for your favorite device or platform!
💡 Start exploring on our brand-new models page: onnxruntime.ai/models, or dive directly into over 15,000 ONNX models trending on Hugging Face: https://lnkd.in/gB79bG-x.
ONNX Runtime makes it easy to integrate optimized models directly into your workflow, whether you're building for the web, edge devices, or any other platform!
#ONNX #ONNXRuntime #AI #Phi #Llama #Qualcomm #HuggingFace #LLM #LLMs #ONNXModels #ONNXZoo #TransformerModels #LanguageModels #AIModels #GenerativeAI
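Loading one of those pre-converted models takes only a few lines; here is a minimal sketch using the Hugging Face hub client (the repo id and file path are examples, substitute any ONNX model from the hub):

```python
import onnxruntime as ort
from huggingface_hub import hf_hub_download

# Example repo/file: substitute any pre-converted ONNX model from the hub
model_path = hf_hub_download("sentence-transformers/all-MiniLM-L6-v2",
                             "onnx/model.onnx")

session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
print("inputs:", [(i.name, i.shape) for i in session.get_inputs()])
```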
-
Exciting news! We've just launched our new roadmap page, giving you a clear view of upcoming features and release plans - check it out and stay updated on what's next at: https://lnkd.in/guMMAR3X #ONNXRuntime #ONNX #AI #Roadmap #MachineLearning #ModelOptimization #MultiLoRA #CoreML #GenAI #GenerativeAI #Phi #Llama #Whisper