RTMPose: The All-In-One Real-time Pose Estimation Solution for Application and Research!

OpenMMLab
Mar 16, 2023


Try now:

https://github.com/open-mmlab/mmpose/tree/dev-1.x/projects/rtmpose

Tech Report:

https://arxiv.org/abs/2303.07399

Pose estimation algorithms have made significant progress in recent years and now reach high accuracy on public datasets. For example, state-of-the-art models achieve more than 80% AP on the MS COCO dataset. However, many of the algorithms used in real-world industrial applications are still several years old. Although newer SOTA algorithms and models are available, their complexity and high latency make them hard to deploy in practice and drive up hardware costs. Many applications also have strict real-time requirements, which these algorithms struggle to meet.

Figure 1. Prone sit-ups

This is where RTMPose comes in. With the release of MMPose 1.0, the RTMPose team focused on developing high-performance algorithms that are widely applicable in industry. The team studied five aspects of multi-person pose estimation: paradigm, backbone network, localization algorithm, training strategy, and deployment inference. As a result, the RTMPose-m model achieves 75.8% AP on COCO and runs at 90+ FPS on an Intel i7-11700 CPU using ONNXRuntime and at 430+ FPS on an NVIDIA GTX 1660 Ti GPU using TensorRT. The RTMPose-s model achieves 72.2% AP on COCO and runs at 70+ FPS on a Snapdragon 865 mobile chip using ncnn.
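For readers who think in latency budgets rather than frame rates, the throughput figures above convert directly to time per frame. The snippet below is a minimal sketch of that arithmetic; the helper name `fps_to_latency_ms` is ours, and because the figures are reported as lower bounds ("90+ FPS"), the results should be read as upper bounds on latency.

```python
def fps_to_latency_ms(fps: float) -> float:
    """Convert a frame rate (frames per second) to milliseconds per frame."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Throughput figures quoted in this post (each reported as a lower bound).
reported_fps = {
    "RTMPose-m, Intel i7-11700, ONNXRuntime": 90,
    "RTMPose-m, NVIDIA GTX 1660 Ti, TensorRT": 430,
    "RTMPose-s, Snapdragon 865, ncnn": 70,
}

for setup, fps in reported_fps.items():
    print(f"{setup}: at most {fps_to_latency_ms(fps):.1f} ms per frame")
```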

With the help of MMDeploy, our project supports multiple platforms such as CPU, GPU, Jetson, and mobile devices, and multiple deployment frameworks such as ONNXRuntime, TensorRT, ncnn, OpenVINO, and RKNN.

Table 1. RTMPose Inference Speed Overview

Performance

Academic Benchmark

During our research, we found that mainstream pose estimation projects, such as PP-TinyPose (based on PaddleDetection), AlphaPose (released by Shanghai Jiao Tong University), and MoveNet and MediaPipe (released by Google), lack a unified comparison. The validation sets and hardware behind their reported accuracy differ from project to project. Even on the same COCO val2017 dataset, projects applied different manual filtering standards, and those standards were not publicly disclosed.

Therefore, we deployed these pose estimation models on the same hardware and tested their performance on a unified COCO val2017 dataset, comparing them with RTMPose.

Figure 2. Performance Comparison of Mainstream Pose Estimation Algorithms (COCO val2017)
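As background for the accuracy numbers: COCO keypoint AP is built on Object Keypoint Similarity (OKS), which scores how close predicted keypoints are to the ground truth relative to the size of the person. Below is a minimal sketch of OKS for a single person, following the formula used by the COCO keypoint evaluation. The function name `oks` and its argument layout are ours and are not part of MMPose.

```python
import math

# Per-keypoint constants used by the COCO keypoint evaluation
# (17 body keypoints: nose, eyes, ears, shoulders, elbows, wrists,
# hips, knees, ankles).
COCO_SIGMAS = [
    0.026, 0.025, 0.025, 0.035, 0.035, 0.079, 0.079, 0.072, 0.072,
    0.062, 0.062, 0.107, 0.107, 0.087, 0.087, 0.089, 0.089,
]


def oks(pred, gt, visibility, area, sigmas=COCO_SIGMAS):
    """Object Keypoint Similarity between one prediction and one ground truth.

    pred, gt:    sequences of (x, y) keypoint coordinates in pixels
    visibility:  ground-truth visibility flags (0 means "not labeled")
    area:        ground-truth object area in square pixels (must be > 0)
    """
    if area <= 0:
        raise ValueError("area must be positive")
    total, labeled = 0.0, 0
    for (px, py), (gx, gy), v, sigma in zip(pred, gt, visibility, sigmas):
        if v <= 0:
            continue  # unlabeled keypoints do not count
        d2 = (px - gx) ** 2 + (py - gy) ** 2
        k2 = (2.0 * sigma) ** 2
        total += math.exp(-d2 / (2.0 * area * k2))
        labeled += 1
    return total / labeled if labeled else 0.0
```

AP is then computed by treating a prediction as correct when its OKS exceeds a threshold, and averaging precision over thresholds from 0.50 to 0.95.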

These projects are designed for different application scenarios. For example, PP-TinyPose and MoveNet are both designed for mobile devices and mainly target single-person pose estimation. We therefore also constructed a single-person validation dataset from COCO val2017 for a closer comparison.

Figure 3. Performance Comparison of Mainstream Single-Person Pose Estimation Algorithms (COCO-SinglePerson)
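This post does not spell out how the single-person subset was selected. As an illustration only, here is one plausible way to derive such a subset from a COCO keypoint annotation file. The selection rule is our assumption: keep images that contain exactly one person annotation, where that person is not a crowd region and has at least one labeled keypoint. The function name `single_person_subset` is also ours.

```python
from collections import defaultdict

PERSON_CATEGORY_ID = 1  # "person" in the COCO category list


def single_person_subset(coco: dict) -> dict:
    """Return a copy of a COCO annotation dict restricted to single-person images.

    Assumed rule (not taken from the RTMPose report): an image qualifies if it
    has exactly one person annotation, and that annotation is not a crowd
    region and has at least one labeled keypoint.
    """
    persons_by_image = defaultdict(list)
    for ann in coco["annotations"]:
        if ann["category_id"] == PERSON_CATEGORY_ID:
            persons_by_image[ann["image_id"]].append(ann)

    keep = set()
    for image_id, anns in persons_by_image.items():
        if len(anns) != 1:
            continue
        only = anns[0]
        if only.get("iscrowd", 0) == 0 and only.get("num_keypoints", 0) > 0:
            keep.add(image_id)

    return {
        **coco,
        "images": [img for img in coco["images"] if img["id"] in keep],
        "annotations": [
            ann
            for ann in coco["annotations"]
            if ann["image_id"] in keep
            and ann["category_id"] == PERSON_CATEGORY_ID
        ],
    }
```

To use it, load the val2017 keypoint annotations with `json.load`, pass the resulting dict to `single_person_subset`, and write the result back out with `json.dump`.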

RTMPose also performs strongly on whole-body pose estimation, surpassing even the long-established OpenPose.

Table 2. Performance Comparison of Whole-Body Pose Estimation Algorithms (COCO-WholeBody V1.0)

The comparisons above show that RTMPose offers a better balance of accuracy and speed than mainstream pose estimation libraries. In the future, pruning, distillation, and quantization algorithms based on MMRazor will be integrated into the RTMPose project to further improve its performance.

Figure 4. Preview of Pruning Algorithms

Evaluation on Internal Business Dataset

Since beta testing began, RTMPose has drawn attention from the community. We invited a group of trial users from industry; below is their feedback after running RTMPose on their own business data and devices.

Speed on Various Platforms

Table 3. Community Performance Evaluation of Hardware Deployment (Unit: Milliseconds per Person)

Table 4. Performance Comparison on Business Datasets from the Community

Feedback from the community shows that RTMPose can be easily deployed on various hardware devices and can directly improve accuracy by more than ten percentage points in business applications :)

User-Friendly Tutorial for Quick Start

We have prepared detailed tutorials to guide users step by step through model training, deployment, and inference. Whether you use a CPU, GPU, mobile device, or Jetson platform, and whether you program in Python, C++, or Java, you can quickly deploy RTMPose.

Figure 5. Preview of Deployment Tutorial

With the MMDeploy precompiled package, users can skip complex environment configuration and installation and quickly experience the high-speed performance of RTMPose.

Acknowledgement

Thanks to the community members who actively participated in the internal testing: @zwfcrazy, @RangeKing, @tongda, and @52thanos.
