Rockchip: Major Update to RK182X | RKNN3 SDK V1.0.0 Released, Fully Supporting Multimodal Models

Rockchip has officially released the RK1820/RK1828 AI coprocessor RKNN3 SDK V1.0.0, which is perfectly compatible with the RK3588/RK3576 + RK1820/RK1828 hardware configuration. This SDK provides full-stack software support for edge-side AI model deployment, delivering comprehensive upgrades in performance, model compatibility, functionality, and accuracy, while offering outstanding advantages in high performance, broad compatibility, and excellent energy efficiency.

        This SDK package includes a PC-side development kit, board-side runtime APIs, and model conversion and deployment examples, supporting Android and Linux operating systems and leveraging high-speed PCIe/USB interfaces to achieve low-latency data exchange. Key optimizations in core functionality are particularly noteworthy:

1. Enhanced inference efficiency: Supports concurrent data transmission and inference, optimizes core operators, and enables multi-core, multi-model simultaneous inference to handle high concurrency.

2. Enhanced large-model compatibility: supports mRoPE and Function Calls, and is compatible with the features of mainstream large models;

3. More convenient development and deployment: supports board-level precision analysis, provides a lightweight Python API toolkit, enables custom post-processing for coprocessor models, and adds support for embedding models on the RKLLM3 server.
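The "concurrent data transmission and inference" in point 1 is essentially a producer-consumer pipeline: the host streams data over PCIe/USB while the coprocessor consumes it. The toy sketch below illustrates only that overlap; the queue, the `transfer`/`infer_loop` names, and the doubling stand-in for a model forward pass are illustrative assumptions, not part of the RKNN3 API.

```python
# Toy illustration of overlapping data transmission with inference.
# A bounded queue stands in for the PCIe/USB link; none of this is the real SDK API.
import queue
import threading

def transfer(frames, q):
    """Simulates host-to-coprocessor transfer feeding the inference queue."""
    for f in frames:
        q.put(f)
    q.put(None)  # end-of-stream marker

def infer_loop(q, out):
    """Simulates the coprocessor consuming frames as soon as they arrive."""
    while (f := q.get()) is not None:
        out.append(f * 2)  # stand-in for a model forward pass

q, out = queue.Queue(maxsize=2), []
t = threading.Thread(target=transfer, args=(range(5), q))
t.start()          # transfer of later frames proceeds while earlier ones are inferred
infer_loop(q, out)
t.join()
print(out)  # [0, 2, 4, 6, 8]
```

Because the queue is bounded, the producer blocks once it runs ahead of inference, which is the same back-pressure behavior a real transfer/compute pipeline needs.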

A Leap in Core Performance: LLM Decoding Efficiency Up by Over 15%, 3B Models Break 100 Decode TPS, with Support for Models Up to 8B Parameters

        One of the key breakthroughs in this official SDK release is an overall improvement of more than 15% in LLM decoding performance. Deep adaptation and optimization have been carried out for LLM models across a range of parameter sizes, from 0.5 billion to 8 billion. The RK1820 and RK1828 chips have been tailored to their respective computational characteristics, enabling them to meet the inference requirements of LLM applications in diverse edge-side scenarios.

        Test data show that the RK182X delivers particularly strong performance on lightweight LLMs, with 3B-scale models achieving a key breakthrough by exceeding 100 decode TPS. The Qwen2.5-3B model reaches a decode TPS of 102.01, enabling efficient real-time large-model interaction on edge devices. The ultra-lightweight Qwen2.5-0.5B model also performs impressively, with a TTFT (time to first token) of just 21.89 ms, a TPOT (time per output token) of 4.63 ms, and a decode TPS of 215.86. As for medium and large LLMs, Qwen3-8B runs stably on the RK1828 with a decode TPS of 61.11, fully meeting the requirements for deploying such models at the edge.
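As a sanity check on the figures above, decode TPS and TPOT are two views of the same quantity: steady-state decode throughput is roughly the reciprocal of per-token latency. This is a back-of-envelope relation, not a formula from the SDK; the small gap versus the reported number would come from measurement overhead.

```python
# Back-of-envelope relation between the reported LLM latency metrics:
# decode TPS is approximately 1000 / TPOT(ms).
def decode_tps_from_tpot(tpot_ms: float) -> float:
    """Tokens per second implied by a per-token latency in milliseconds."""
    return 1000.0 / tpot_ms

# Qwen2.5-0.5B figures from the text: TPOT 4.63 ms, reported decode TPS 215.86.
implied = decode_tps_from_tpot(4.63)
print(f"implied decode TPS: {implied:.2f}")  # ~215.98, close to the reported 215.86
```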

*Core performance data for LLM models with different parameter sizes on the RK182X are as follows:

 

VLM Model Performance: Efficient and Stable Multimodal Reasoning

        Under the same standard test environment, the RK182X has undergone deep inference optimization for multiple mainstream VLM and multimodal models. Leveraging their respective compute characteristics, the RK1820 and RK1828 have received differentiated adaptation, maintaining stable visual inference latency across image resolutions while delivering highly efficient LLM decoding. The SDK fully supports the Qwen3-VL series, with Qwen3-VL-4B reaching an LLM decode rate of nearly 90 TPS. In addition, the RK1828 fully supports on-device inference for medium and large VLM models, delivering strong multimodal interaction performance.

*Core performance data for different VLM models on the RK182X are as follows:

The RK1828 is fully compatible with the Qwen2.5-Omni-3B multimodal model, achieving efficient end-to-end inference across the audio, vision, and language modalities with a decode TPS of 102.63, surpassing the 100 mark. All inference runs independently on the coprocessor: visual inference latency is stable at a 392×392 input resolution, and audio inference takes only 98.91 ms, demonstrating excellent multimodal processing efficiency.

CNN Model Performance: Solid Single-Core Compute Power, with a Dramatic Leap in Multi-Core, Multi-Batch Throughput

        The RK182X has undergone deep optimization for single-core and multi-batch, multi-core inference of classic classification and detection CNN models, fully leveraging its capability for concurrent inference across multiple cores and multiple models to maximize computing power. DINOv3’s ViT demonstrates particularly outstanding performance. While maintaining stable single-core performance, the multi-batch, multi-core mode delivers several-fold improvements in frame rate. Its high-throughput characteristics enable efficient adaptation to high-concurrency computer vision applications such as intelligent surveillance and industrial inspection, offering an excellent balance of performance and energy efficiency.

* Core performance data for different CNN models on the RK182X are as follows:

 

Near-lossless accuracy! Quantization optimization achieves “performance boost without accuracy loss.”

        The RKNN3 SDK V1.0.0 applies differentiated quantization strategies by model type: LLM/VLM models use the W4A16 G32 scheme (4-bit weights, 16-bit activations, one scale per group of 32 weights), while CNN models use W8A8 (8-bit weights and activations). This significantly boosts inference performance while keeping accuracy essentially on par with the original float32 models; in some cases quantized models even score higher. Precision loss is effectively controlled, delivering real performance gains without compromising model quality.

Building a comprehensive, end-to-end ecosystem to strengthen AIoT 2.0’s perception–decision–execution capabilities.

        The RKNN3 SDK V1.0.0 is tightly aligned with the core AIoT 2.0 architecture of perception–decision–execution, achieving comprehensive multi-dimensional adaptation for more than 30 mainstream AI models. It fosters deep collaboration with key upstream and downstream partners, breaking down ecosystem barriers between hardware computing power, software stacks, and algorithmic models. With full model support and excellent adaptation performance, the SDK fully unlocks the computational value of the RK182X coprocessor.

At the perception layer: multimodal data acquisition is fully leveraged through deep adaptation of leading partners' models, building a multimodal intelligent data gateway at the edge. On the vision side, the SDK offers comprehensive support for classic CNN models such as MobileNet and the YOLO series, as well as depth estimation models. On the audio side, once leading perception-layer partners such as iFlytek, Sinovation Ventures, and Daxiang Acoustics complete adaptation of core speech models such as ASR and TTS, multimodal perception capabilities can be deployed efficiently.

At the decision-making layer: a closed-loop ecosystem covering the full range of model sizes, with leading vendors' core models adapted one by one. The SDK is compatible with mainstream open-source large models such as Qwen3-VL and GLM Edge, supports LLMs across the entire 0.5B to 8B range, and provides deep adaptation for core models from leading vendors such as Qwen2.5-Omni-3B, Zhipu MiniCPM, and Step-GUI-Edge, enabling efficient on-device deployment and full-modal intelligent decision-making.

At the execution layer: software-hardware synergy empowers scenario deployment and enables end-to-end model capability translation. Leveraging the RK3588/RK3576 + RK182X hardware-and-software platform, the solution supports custom post-processing for coprocessor inference models, enabling flexible adaptation to diverse model decision-making and execution logic. It is compatible with Android and Linux operating systems, facilitating seamless deployment of comprehensive AI capabilities across a wide range of applications—including smart hardware and industrial inspection—thus achieving end-to-end transformation from algorithm development to practical application.
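The "custom post-processing for coprocessor inference models" mentioned above is the kind of logic a developer would attach after a detection model's raw output. A typical example is confidence filtering plus greedy IoU-based non-maximum suppression, sketched below; the box format, thresholds, and function names are assumptions for illustration, not part of the SDK.

```python
# Example of typical custom post-processing for a detection model:
# confidence filtering followed by greedy IoU-based NMS. Illustrative only.

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def postprocess(dets, conf_thr=0.5, iou_thr=0.45):
    """dets: list of (box, score); returns detections kept after NMS."""
    dets = sorted((d for d in dets if d[1] >= conf_thr), key=lambda d: -d[1])
    kept = []
    for box, score in dets:
        # keep a box only if it does not heavily overlap an already-kept one
        if all(iou(box, k[0]) < iou_thr for k in kept):
            kept.append((box, score))
    return kept

dets = [((0, 0, 10, 10), 0.9), ((1, 1, 11, 11), 0.8), ((50, 50, 60, 60), 0.7)]
print(postprocess(dets))  # the 0.8 box overlaps the 0.9 box and is suppressed
```

Running post-processing like this on the coprocessor side keeps raw tensors off the host link and returns only the final, compact detections.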

 

        In addition, the SDK is fully compatible with open-source platforms such as Hugging Face, ModelScope, and GitHub. Users can obtain pre-converted RKNN models directly from GitHub:

• Model Zoo address: https://github.com/airockchip/rknn3-model-zoo;

• Tool download link: https://github.com/airockchip/rknn3-toolkit;

 

        The official release of the RK1820/RK1828 RKNN3 SDK V1.0.0 marks a significant breakthrough for Rockchip in the field of edge AI coprocessors. From performance leaps and model scalability to precision optimization, every update is closely aligned with developers’ real-world deployment needs, fully unleashing the high computing power and exceptional energy-efficiency advantages of the RK182X series. Moving forward, Rockchip will continue to iteratively refine and optimize the RKNN3 SDK, expanding model support, enhancing computational performance, and building a more efficient edge AI development toolchain to help more innovative AI applications be successfully deployed at the edge.

 

(Source: Rockchip official website)
