
Microsoft Unveils Lightning-Fast DeepSeek R1 for Copilot+ PCs | Image Source: blogs.windows.com
Redmond, Washington, February 3, 2025 – Microsoft announced a significant advance in the AI landscape with the release of ONNX-optimized distilled DeepSeek R1 models, now available for Copilot+ PCs powered by Snapdragon. This development marks a pivotal moment for on-device AI, improving speed, efficiency and power consumption for next-generation personal computers.
According to Microsoft’s Windows developer blog, the DeepSeek R1 models deliver impressive performance figures, with a time to first token of less than 70 milliseconds for short prompts (under 64 tokens) and a throughput of up to 40 tokens per second. This leap in efficiency is poised to change how developers and end users interact with AI applications on their devices.
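Taken at face value, those two figures are enough for a back-of-envelope estimate of end-to-end response time: the wait for the first token, plus the remaining tokens at the sustained rate. The Python sketch below is a simple illustration of that arithmetic, not a published benchmark, and works for either set of figures quoted in this article.

```python
# Back-of-envelope response-time estimate from the published figures:
# time to first token (TTFT), then the remaining tokens at the sustained rate.

def response_time(num_tokens: int, ttft_s: float, tokens_per_s: float) -> float:
    """Estimated wall-clock seconds to stream `num_tokens` of output."""
    return ttft_s + (num_tokens - 1) / tokens_per_s

# Using the headline numbers for short prompts (<64 tokens):
print(f"{response_time(100, 0.070, 40):.2f} s")  # ~2.5 s for a 100-token reply
```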
What is the DeepSeek R1 model?
The DeepSeek R1 model represents Microsoft’s latest push toward on-device AI. Distilled and optimized using the Open Neural Network Exchange (ONNX) format, these models are designed to run efficiently on Copilot+ PCs equipped with neural processing units (NPUs). The initial release, DeepSeek-R1-Distill-Qwen-1.5B, is accessible through the AI Toolkit extension for Visual Studio Code, with larger variants such as the 7B and 14B models expected soon.
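Microsoft’s blog describes running the model through the AI Toolkit playground, but a programmatic equivalent is also possible. The following is a rough sketch assuming the onnxruntime-genai Python package and a locally downloaded copy of the ONNX model; the folder path is a placeholder, and exact method names vary between releases.

```python
import onnxruntime_genai as og

# Placeholder path to a locally downloaded copy of the ONNX model.
model = og.Model("models/deepseek-r1-distill-qwen-1.5b")
tokenizer = og.Tokenizer(model)

params = og.GeneratorParams(model)
params.set_search_options(max_length=256)

generator = og.Generator(model, params)
generator.append_tokens(tokenizer.encode("Explain ONNX in one sentence."))

# Stream tokens until the model signals completion.
while not generator.is_done():
    generator.generate_next_token()

print(tokenizer.decode(generator.get_sequence(0)))
```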
These models are not just faster; they are smarter. By leveraging advanced quantization techniques and silicon-specific optimizations, Microsoft says DeepSeek R1 can deliver robust AI experiences while keeping energy consumption to a minimum. According to Microsoft’s announcement, this optimization allows seamless integration into everyday workflows without compromising battery life or system performance.
How does DeepSeek R1 improve performance?
The key to DeepSeek R1’s performance lies in its architectural refinements. Microsoft focused on optimizing the model for the NPU, enabling faster inference and reduced latency. The model uses a combination of int4 per-channel quantization for weights and int16 activations, improving both speed and efficiency. In addition, the ONNX QDQ format makes it easier to scale across different Windows devices.
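To make the weight-quantization idea concrete, here is an illustrative NumPy sketch of symmetric int4 per-channel quantization. It is a simplification of the concept, not Microsoft’s actual kernel; in the ONNX QDQ format the same step is expressed as QuantizeLinear/DequantizeLinear node pairs.

```python
import numpy as np

def quantize_int4_per_channel(w: np.ndarray):
    """Symmetric int4 quantization with one scale per output channel (row)."""
    # Symmetric int4 covers [-8, 7]; map each row's max magnitude to 7.
    scales = np.abs(w).max(axis=1, keepdims=True) / 7.0
    q = np.clip(np.round(w / scales), -8, 7).astype(np.int8)  # int4 stored in int8
    return q, scales

def dequantize(q: np.ndarray, scales: np.ndarray) -> np.ndarray:
    return q.astype(np.float32) * scales

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 8)).astype(np.float32)
q, s = quantize_int4_per_channel(w)
print("max round-trip error:", np.abs(w - dequantize(q, s)).max())
```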
Microsoft’s earlier work on the Phi Silica token iterator, a previous milestone in its AI optimization efforts, demonstrated a 56% improvement in power efficiency when moved from the CPU to the NPU. Building on that success, the DeepSeek R1 models incorporate similar strategies, achieving a time to first token of 130 milliseconds and a throughput of 16 tokens per second for short prompts. The result is faster responses with less strain on system resources.
Why does this matter for developers?
Developers stand to benefit greatly from DeepSeek R1’s capabilities. With the AI Toolkit extension for Visual Studio Code, experimenting with these models becomes simple. Developers can download the model directly from Azure AI Foundry, load it into the playground, and start testing with real-time prompts. This streamlined process lowers the barriers to entry for AI development, fostering innovation at every level.
Moreover, the ability to run sophisticated models locally removes the dependence on cloud infrastructure for real-time applications. This not only reduces latency but also improves privacy and data security, since sensitive information no longer needs to be transmitted over the Internet for processing.
What makes NPU optimization so special?
The neural processing unit (NPU) is at the heart of this technological leap. Unlike traditional CPUs and GPUs, NPUs are purpose-built for AI workloads, offering unmatched efficiency in handling complex computations. Microsoft’s optimization techniques focus on distributing workloads intelligently between the CPU and the NPU, ensuring maximum performance with minimal energy consumption.
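In practice, that CPU/NPU split surfaces to developers through ONNX Runtime’s execution providers. The sketch below, assuming an onnxruntime build with Qualcomm’s QNN support, requests the NPU first and lets any unsupported operators fall back to the CPU; the model path and backend library name are placeholders.

```python
import onnxruntime as ort

# Ask ONNX Runtime to place supported subgraphs on the Qualcomm NPU via the
# QNN execution provider; anything it cannot handle falls back to the CPU.
session = ort.InferenceSession(
    "model.onnx",  # placeholder path to an NPU-ready ONNX model
    providers=[
        ("QNNExecutionProvider", {"backend_path": "QnnHtp.dll"}),
        "CPUExecutionProvider",
    ],
)
print(session.get_providers())  # shows which providers were actually engaged
```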
One of DeepSeek R1’s notable features is the QuaRot quantization scheme. This innovative approach uses Hadamard rotations to remove outliers from weights and activations, improving quantization accuracy and enabling low-bit processing without sacrificing performance. The technique considerably outperforms traditional methods such as GPTQ, particularly at low granularity settings.
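The underlying trick is easy to demonstrate: because a normalized Hadamard matrix is orthogonal, a weight matrix and its input can both be rotated without changing the output, while any outlier gets spread evenly across coordinates. The toy NumPy sketch below illustrates the principle only; the real QuaRot scheme fuses these rotations into neighboring weights rather than applying them at runtime.

```python
import numpy as np

def hadamard(n: int) -> np.ndarray:
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    h = np.array([[1.0]])
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    return h / np.sqrt(n)  # normalized, so H @ H.T == I

rng = np.random.default_rng(1)
x = rng.normal(size=8)
x[3] = 25.0                      # inject an activation outlier
H = hadamard(8)
w = rng.normal(size=(4, 8))

# The rotation is exact: (W H)(H^T x) == W x, yet the rotated input has no spike.
assert np.allclose((w @ H) @ (H.T @ x), w @ x)
print("max |x| before:", np.abs(x).max(), "after:", np.abs(H.T @ x).max())
```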
How does this impact everyday PCs?
For everyday users, the implications of DeepSeek R1 are profound. Imagine virtual assistants that respond instantly, productivity tools that anticipate your needs before you articulate them, and real-time translation services that work flawlessly offline. These are just some of the possibilities unlocked by AI models optimized for NPUs.
Moreover, the power efficiency of these models means users can enjoy enhanced AI features without worrying about battery drain or overheating. This is particularly beneficial for mobile professionals and remote workers who rely on their devices throughout the day.
What is next for DeepSeek R1 and Copilot+ PCs?
Microsoft has ambitious plans for DeepSeek R1’s future. While the current release focuses on Copilot+ PCs powered by Snapdragon, the company has announced that support for Intel Core Ultra 200V processors will follow soon. This cross-platform compatibility will ensure that a broader range of devices can benefit from these advances.
As Microsoft continues to refine its AI models, we can expect further improvements in performance, efficiency, and scalability. The integration of AI into the Windows ecosystem is not just a feature; it is a fundamental shift in how personal computing will evolve in the coming years.
In conclusion, the release of the DeepSeek R1 models marks an important step on the road to smarter, faster, and more efficient computing. With Microsoft leading the charge, the future of the AI-powered PC looks brighter than ever.