MMClassification 1.0: Newly upgraded image classification toolbox

Nov 11, 2022

The OpenMMLab open-source ecosystem provides rich support for computer vision backbone models through MMClassification, our open-source library for image classification tasks. In this blog, we introduce the main properties and features of the newly upgraded MMClassification 1.0.

MMClassification is the image classification library of the OpenMMLab ecosystem, covering a rich set of neural network architectures in computer vision.
The first version of MMClassification was released in October 2020, integrating several milestone vision backbone models. In September 2021, MMClassification v0.16.0 provided complete support for downstream tasks (e.g., detection, segmentation). In September 2022, we released MMClassification v1.0.0rc, which upgraded the architecture design and adapted the library to the new OpenMMLab 2.0 system.
We will introduce the newly upgraded MMClassification 1.0 from four aspects: new framework design, amazing new features, a richer algorithms library, and easier model deployment.

New Framework Design

In the original MMClassification framework, the construction of the modules used in a typical training process was scattered across different locations, including auxiliary scripts, MMClassification library functions, and MMCV library functions (as shown in Figure 1). This made the training process hard to follow and made it difficult for users to implement custom functions or modules.

In the new framework, MMClassification is built on MMEngine, a fundamental library of the OpenMMLab ecosystem. MMEngine's executor (the Runner) is responsible for constructing all the modules of a training process, and the training script is only used for configuration parsing (as shown in Figure 2). The new training process is therefore more logical and needs much less code. It also gives users a more convenient debugging experience and lets them flexibly define the forward and backward passes of the model.

Figure 1: Previous training pipeline of MMClassification

Figure 2: Current training pipeline of MMClassification 1.0
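
The idea of one executor building every module from a config, with the training script reduced to config parsing, can be sketched in plain Python. This is a toy written for illustration: the registry, ToyRunner, ToyModel, and ToyOptimizer are hypothetical names and are not MMEngine's API.

```python
# Toy illustration only: a tiny registry and runner written for this post.
# None of these names are MMEngine APIs.
REGISTRY = {}


def register(cls):
    """Make a class buildable from a config dict via its 'type' key."""
    REGISTRY[cls.__name__] = cls
    return cls


def build(cfg):
    """Instantiate the class named by cfg['type'] with the remaining keys."""
    args = dict(cfg)
    return REGISTRY[args.pop('type')](**args)


@register
class ToyModel:
    def __init__(self, num_classes):
        self.num_classes = num_classes


@register
class ToyOptimizer:
    def __init__(self, lr):
        self.lr = lr


class ToyRunner:
    """Builds every component in one place, then owns the training loop."""

    def __init__(self, cfg):
        self.model = build(cfg['model'])
        self.optimizer = build(cfg['optimizer'])
        self.max_epochs = cfg['max_epochs']
        self.epochs_run = 0

    def train(self):
        for _ in range(self.max_epochs):
            # A real loop would run forward, backward, and optimizer steps.
            self.epochs_run += 1
        return self.epochs_run


# The "training script" reduces to: load a config, hand it to the runner.
cfg = dict(
    model=dict(type='ToyModel', num_classes=1000),
    optimizer=dict(type='ToyOptimizer', lr=0.1),
    max_epochs=3,
)
runner = ToyRunner(cfg)
print(runner.train())
```

Because all construction happens inside the runner, a debugger breakpoint in one class is enough to inspect how every component was built.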

Amazing New Features

Cross-library Configuration Call

In OpenMMLab 2.0, all libraries support cross-library configuration calls, which means we can use another library's configuration file without explicitly copying it. For example, when exploring backbone networks, we can easily benchmark a trained backbone on various downstream tasks (e.g., detection, segmentation) from within MMClassification. The following example shows how to quickly set up object detection with a Swin Transformer backbone on Faster R-CNN, using a cross-library configuration call in MMClassification.

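# Inherit a Faster R-CNN config from MMDetection via the 'mmdet::' prefix
_base_ = 'mmdet::faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py'
# Import mmcls.models so that the MMClassification backbones are registered
custom_imports = dict(imports='mmcls.models')
model = dict(
    backbone=dict(
        type='mmcls.SwinTransformer',
        arch='tiny',
        img_size=224,
        out_indices=(0, 1, 2, 3),
        init_cfg=dict(
            type='Pretrained',
            checkpoint=...,
            prefix='backbone'
        ),
        # Replace the inherited backbone settings instead of merging into them
        _delete_=True),
    # Match the neck input channels to the Swin-Tiny stage outputs
    neck=dict(in_channels=[96, 192, 384, 768]))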
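
To make the inheritance behavior concrete, here is a minimal sketch of how a `scope::path` base reference can be split and how an override dict is merged into an inherited one, with `_delete_=True` replacing a sub-dict instead of merging into it. It is a simplified stand-in, not MMEngine's actual config code: `split_scope` and `merge` are hypothetical helpers, and `base_model` is an abbreviated stand-in for the inherited detector settings.

```python
# Simplified sketch of config inheritance; not MMEngine's actual code.
def split_scope(base):
    """Split 'mmdet::path/to/cfg.py' into ('mmdet', 'path/to/cfg.py')."""
    scope, sep, path = base.partition('::')
    return (scope, path) if sep else (None, base)


def merge(base, override):
    """Recursively merge `override` into `base`, honouring `_delete_`."""
    override = dict(override)
    if override.pop('_delete_', False):
        # Drop the inherited dict entirely and keep only the new keys.
        base = {}
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        elif isinstance(value, dict):
            # New sub-dict: still strip any `_delete_` markers inside it.
            merged[key] = merge({}, value)
        else:
            merged[key] = value
    return merged


# Abbreviated stand-in for the inherited detector settings (assumption).
base_model = dict(
    backbone=dict(type='ResNet', depth=50),
    neck=dict(type='FPN', in_channels=[256, 512, 1024, 2048], out_channels=256),
)
override = dict(
    backbone=dict(type='mmcls.SwinTransformer', arch='tiny', _delete_=True),
    neck=dict(in_channels=[96, 192, 384, 768]),
)
model = merge(base_model, override)

print(split_scope('mmdet::faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py'))
print(model['backbone'])  # the ResNet-only 'depth' key is gone
print(model['neck'])      # 'type' and 'out_channels' are inherited
```

Without `_delete_=True`, the ResNet-specific `depth` key would survive the merge and be passed to the Swin Transformer constructor, which is why the backbone override needs it while the neck override does not.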

Powerful Visualization Tools

In the newly upgraded MMClassification, we have integrated powerful visualization tools for inspecting input images and optimizer parameter schedules. For input data, we support multiple image visualization modes, such as concatenation mode and pipeline mode.

Figure 3: Concatenation mode (original image on the left, transformed image on the right)

Figure 4: Pipeline mode (original image on the left; the other images are transformed images)

We also support visualizing the optimizer parameter schedule, so users can plot the learning rate, momentum, and other parameters before training, as shown in the following two figures. This lets users check in advance that the optimizer parameters are reasonable.

Figure 5: Optimizer momentum curve

Figure 6: Learning rate curve
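
As a rough illustration of what previewing a schedule before training means, the sketch below computes a learning rate for every step of a linear warmup followed by cosine decay and prints a crude text plot. It is plain Python, not the MMClassification visualization tool, and the schedule shape, `lr_at`, and all parameter values are assumptions chosen for the example.

```python
import math


def lr_at(step, total_steps, base_lr=0.1, warmup_steps=5, min_lr=0.0):
    """Learning rate at `step` for linear warmup followed by cosine decay."""
    if step < warmup_steps:
        # Ramp linearly from base_lr / warmup_steps up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))


total = 20
schedule = [lr_at(s, total) for s in range(total)]
for step, lr in enumerate(schedule):
    bar = '#' * round(lr * 200)  # crude text plot: 20 characters == base_lr
    print(f'{step:2d} {lr:.4f} {bar}')
```

Inspecting the values this way catches mistakes such as a warmup that never reaches the base learning rate, without spending any GPU time.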

Data Preprocessor Module

To speed up data preprocessing, the OpenMMLab 2.0 architecture introduces the data_preprocessor module. This module moves the data obtained from the dataloader to the target device and then preprocesses it.

The dataset transforms process one image at a time, while the data preprocessor processes a whole batch of images.
The dataset transforms can contain complex logic that performs different operations on each image and usually run on the CPU, while the data preprocessor can process a batch of images at once with GPU acceleration.

Why does it accelerate?

The data preprocessor contains the normalization and channel conversion operations, which are less efficient on the CPU and more efficient as batched GPU operations. In our experiments with 224-size images and a batch size of 128, this reduced the time per iteration from 31 ms to 12 ms (on a GTX 1660 Super GPU and an i7-8700 CPU).

What is the difference in usage?

The Normalize transformation operation in the original dataset pipeline can be removed and replaced with a separate data_preprocessor configuration field.

data_preprocessor = dict(
    # convert the input from BGR format to RGB format
    to_rgb=True,
    # normalization in RGB format
    mean=[123.675, 116.28, 103.53],
    std=[58.395, 57.12, 57.375],
)
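
To show what these three fields do arithmetically, here is a plain-Python sketch that applies the channel flip and normalization to tiny one-pixel "images". It only illustrates the math: the real module works on batched tensors on the GPU, and `preprocess_pixel` and `preprocess_batch` are hypothetical helper names.

```python
MEAN = [123.675, 116.28, 103.53]  # per-channel mean, RGB order
STD = [58.395, 57.12, 57.375]     # per-channel std, RGB order


def preprocess_pixel(pixel, to_rgb=True):
    """Optionally flip BGR -> RGB, then normalize each channel."""
    if to_rgb:
        pixel = pixel[::-1]
    return [(value - m) / s for value, m, s in zip(pixel, MEAN, STD)]


def preprocess_batch(batch, to_rgb=True):
    """Apply the same operation to every pixel of every image in a batch."""
    return [[preprocess_pixel(p, to_rgb) for p in image] for image in batch]


# Two one-pixel "images" stored in BGR order.
batch = [
    [[103.53, 116.28, 123.675]],  # exactly the mean colour
    [[160.905, 116.28, 123.675]],  # blue channel one std above the mean
]
print(preprocess_batch(batch))
```

The first pixel maps to approximately zero in every channel and the second to approximately one in the blue channel, which shows that the mean and std values are listed in RGB order and are applied after the BGR-to-RGB flip.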

Easy-to-use Inference API

We provide an easy-to-use inference API so that users can run image classification inference with our released models. Here is an example:

>>> from mmcls.apis import ImageClsInferencer
>>> 
>>> inferencer = ImageClsInferencer(model='resnet50_8xb32_in1k')
>>> inferencer('./demo/demo.jpg')
{'pred_label': 65, 
 'pred_score': 0.6649363040924072,
 'pred_class': 'sea snake'}

https://youtu.be/2B5jDpByno4

Video 1: Demonstration of inference API

Multi-backend Visualizer

This version introduces ClsVisualizer, which inherits from MMEngine's Visualizer and can be used to visualize and store data during training and testing. Its main features are:

Visualization and storage of data samples.
Plotting, visualization and storage of images.
Visualization and storage of feature maps.
Storage of scalar data such as loss function (Loss) and classification accuracy (Accuracy).

We also support a variety of storage backends, such as local storage, Tensorboard, and Wandb.

The visualizer can be called from anywhere in the codebase, and its storage backends can be extended.

The following example shows how to use ClsVisualizer.

import torch
import mmcv
from mmcls.visualization import ClsVisualizer
from mmcls.structures import ClsDataSample

# Build the visualizer
vis = ClsVisualizer(
    save_dir="./outputs",
    vis_backends=[dict(type='TensorboardVisBackend')],
)
vis.dataset_meta = {'classes': ['cat', 'bird', 'dog']}

# Visualize an image classification result
img = mmcv.imread("./demo/bird.JPEG", channel_order='rgb')
data_sample = ClsDataSample()
data_sample.set_gt_label(1).set_pred_label(1).set_pred_score(torch.tensor([0.1, 0.8, 0.1]))
vis.add_datasample('res', img, data_sample)

# Record scalar information, e.g., loss
for i in range(10):
    vis.add_scalar('loss', step=i, value=10-i)
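
The multi-backend design can be pictured as one front-end object that forwards each logging call to every configured backend. The sketch below is a toy in plain Python: `InMemoryBackend` and `ToyVisualizer` are hypothetical classes written for this post, not the MMEngine Visualizer or its backend classes.

```python
# Toy sketch of the multi-backend idea; these classes are hypothetical and
# are not the MMEngine Visualizer or its VisBackend classes.
class InMemoryBackend:
    """Stands in for a storage backend such as local files or Tensorboard."""

    def __init__(self, name):
        self.name = name
        self.records = []

    def add_scalar(self, name, value, step):
        self.records.append((name, step, value))


class ToyVisualizer:
    """Forwards every logging call to all configured backends."""

    def __init__(self, backends):
        self.backends = list(backends)

    def add_scalar(self, name, value, step=0):
        for backend in self.backends:
            backend.add_scalar(name, value, step)


local = InMemoryBackend('local')
board = InMemoryBackend('tensorboard-like')
toy_vis = ToyVisualizer([local, board])
for i in range(3):
    toy_vis.add_scalar('loss', value=10 - i, step=i)
print(local.records == board.records, local.records)
```

Under this design, supporting a new storage target only requires a new backend class with the same methods. The calling code stays unchanged.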

We have also partnered with Wandb to provide Wandb support. Wandb is an easy-to-use experiment management tool that supports monitoring and data collection for all kinds of experiments and works with multiple open-source frameworks and system environments.

Figure 7: Features of Wandb

Richer Algorithms Library

MMClassification currently covers a rich set of model architectures, including CNN and Transformer backbones, ranging from standard models to lightweight models, as shown in the figure below. So far, we support 34 model architectures and provide 220 pre-trained model weights.

Figure 8: Model zoo of MMClassification

We have also recently added support for some cutting-edge algorithms, such as Swin-V2, MobileOne, EdgeNeXt, and EfficientFormer. Their basic structures are shown in the following figure.

Figure 9: Architectures of the newly supported models

Easier Model Deployment

In MMClassification 1.0, we have improved the support for model deployment. Users can quickly obtain deployable classification models with the help of the MMDeploy tool. Here is a simple deployment process for MMDeploy-based classification models.

1. Download the MMDeploy code

git clone --recursive -b dev-1.x https://github.com/open-mmlab/mmdeploy.git
cd mmdeploy

2. One-click installation of MMDeploy and the ONNX Runtime backend (Ubuntu system)

python3 tools/scripts/build_ubuntu_x64_ort.py $(nproc)

3. Convert the classification model

# Download the model and configuration files
mim download mmcls --config resnet18_8xb32_in1k --dest .

# Convert the model to ONNX. Positional arguments, in order: deployment
# config, model configuration file, model weight, demo image.
# --work-dir is the location to store the converted model.
python tools/deploy.py \
    configs/mmcls/classification_onnxruntime_dynamic.py \
    resnet18_8xb32_in1k.py \
    resnet18_8xb32_in1k_20210831-fbbb1da6.pth \
    tests/data/tiger.jpeg \
    --work-dir mmdeploy_models/mmcls/ort

4. Validate the converted model with the ONNX Runtime backend

First, set up the environment variables required by the ONNX Runtime backend:

export PYTHONPATH=$(pwd)/build/lib:$PYTHONPATH
export LD_LIBRARY_PATH=$(pwd)/../mmdeploy-dep/onnxruntime-linux-x64-1.8.1/lib/:$LD_LIBRARY_PATH

Then run inference with the deployed model:

>>> from mmdeploy_python import Classifier
>>> import cv2
>>> 
>>> img = cv2.imread('tests/data/tiger.jpeg')
>>> # build onnx classification model
>>> classifier = Classifier(model_path='./mmdeploy_models/mmcls/ort', device_name='cpu', device_id=0)
>>> # run inference
>>> classifier(img)
[(292, 17.54273223876953)]

https://youtu.be/pVzLbaORWSg

Video 2: Demonstration of model deployment

Summary

We have given a brief overview of MMClassification 1.0, which serves as the fundamental model library for OpenMMLab 2.0 and is now widely used in the open-source community. Built on the new architecture and ecosystem, it can easily support multiple downstream tasks with scalable and easily configurable features. It also provides a rich set of algorithmic models, convenient research tools, and easy model inference.

Figure 10: Summary of MMClassification 1.0 features

Follow-up Planning

We will keep updating MMClassification to bring the community a better library of foundation models. We plan to introduce easier-to-use inference interfaces, richer model libraries, better training support, and more powerful feature analysis tools. We will also continue to expand the capabilities of MMClassification to support more classification datasets and a wider range of tasks (e.g., retrieval, classification on imbalanced distributions, semi-supervised classification, and weakly supervised classification). We welcome the community to try out the new version of MMClassification, share suggestions, and contribute code.

Maintenance Timeline

We will officially release MMClassification 1.0 to the main branch on January 1, 2023. If you are interested in the new version, you can try it on the 1.x branch. We will maintain both branches simultaneously in the future, and the main maintenance timeline is shown below.