Unlocking the Potential of AIGC with T2I-Adapter and MMPose!

Mar 21, 2023

Game Changer for Generative AI

Stable Diffusion is currently one of the most popular open-source generative models, letting users generate high-quality images conditioned on input prompts. Until recently, users had to be “prompt magicians”, chasing desirable results by combining various “prompt spells”: a fascinating but elusive way to explore and control what Stable Diffusion can do.

However, the emergence of T2I-Adapter (based on MMPose) and ControlNet (based on OpenPose) has changed this situation. Both let users directly control the generated results with “high-level” conditions that humans can intuitively understand, such as human poses, sketches and semantic segmentation maps. They can be mounted on various pre-trained Stable Diffusion models as “plugins” to adapt to different styles of generative tasks. Moreover, T2I-Adapter is lighter and easier to mount than ControlNet.

In the case of pose-guided image generation, T2I-Adapter uses pose detection results from MMPose to generate control signals for a pretrained text-to-image (T2I) model (e.g. Stable Diffusion), where the pose conditions and prompts are combined to synthesize the desired visual content.

Jump right into MMPose

Have a try: https://github.com/open-mmlab/mmpose/tree/dev-1.x/projects/mmpose4aigc

Step 1: Preparation

Run the following commands to install the project:

# install mmpose mmdet
pip install openmim
git clone -b 1.x https://github.com/open-mmlab/mmpose.git
cd mmpose
mim install -e .
mim install "mmdet>=3.0.0rc6"

# download models (run from the project directory, where the scripts live)
cd projects/mmpose4aigc
bash download_models.sh

Step 2: Generate a Skeleton Image

Run the following command to generate a skeleton image:

# generate a skeleton image
bash mmpose_openpose.sh ../../tests/data/coco/000000000785.jpg
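
Got a whole folder of photos? The same command can be looped. Below is a minimal sketch, assuming mmpose_openpose.sh takes a single image path as its only argument, as in the command above. The function name generate_skeletons and the SKELETON_CMD variable are our own hypothetical additions for illustration and are not part of the project.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of mmpose4aigc): run the skeleton script
# on every .jpg file in a directory, one image at a time.
# SKELETON_CMD defaults to the Step 2 command; override it for a dry run.
generate_skeletons() {
    local img_dir="$1"
    local cmd="${SKELETON_CMD:-bash mmpose_openpose.sh}"
    local count=0
    local img
    for img in "$img_dir"/*.jpg; do
        # If the glob matched nothing, the literal pattern is left; skip it
        [ -e "$img" ] || continue
        # $cmd is intentionally unquoted so "bash mmpose_openpose.sh" splits into words
        $cmd "$img" || return 1
        count=$((count + 1))
    done
    echo "processed $count image(s)"
}
```

For example, `generate_skeletons ../../tests/data/coco` would process every .jpg in that folder, while `SKELETON_CMD=echo generate_skeletons ../../tests/data/coco` only prints the paths that would be processed, which is handy for checking the file list first.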

The input image and its skeleton are as follows:

Step 3: Upload to T2I-Adapter

The demo page of T2I-Adapter is here.

Please feel free to share interesting pose-guided AIGC projects with us: https://discord.gg/73vwx8tBZj

Say hello to RTMPose

And last but not least, we have one more surprise for you!

RTMPose is a high-performance real-time multi-person pose estimation framework based on MMPose. It is a long-term project dedicated to the training, optimization and deployment of high-performance real-time pose estimation algorithms in practical scenarios. We provide a series of models with t/s/m/l sizes to cover different application scenarios with the optimum performance-speed trade-off.

Major Features

High efficiency and high accuracy
Easy to deploy
Designed for practical applications

Let’s dive into the world of MMPose and discover the endless possibilities of pose-guided image generation! What’s more, unlike OpenPose, MMPose comes with an entirely business-friendly open-source license. So, what are you waiting for?