More than Editing, Unlock the Magic!

Apr 26, 2023

Since its inception, MMEditing has been the preferred algorithm library for many image super-resolution, editing, and generation tasks, helping research teams win more than 10 top international competitions and supporting over 100 GitHub ecosystem projects.

After iterations in OpenMMLab 2.0 and the code merge with MMGeneration, MMEditing has evolved into an incredibly robust tool that supports both GAN-based and traditional CNN-based low-level algorithms.

Today, MMEditing will embrace Generative AI and officially change its name to MMagic (Multimodal Advanced, Generative, and Intelligent Creation), committed to creating a more advanced and comprehensive open-source algorithm library for AIGC (AI Generated Content).

With excellent training and experiment management support from MMEngine, MMagic will provide more agile and flexible experimental support for researchers and AIGC enthusiasts, helping you on your AIGC exploration journey.

MMagic supports tasks such as fine-tuning Stable Diffusion, image editing, and image and video generation. It also supports xFormers-based optimization strategies that improve training and inference efficiency.

For diffusion models, we provide the following “magic”:

Support for image generation based on Stable Diffusion and Disco Diffusion
Support for fine-tuning methods such as DreamBooth and DreamBooth LoRA
Support for controllable text-to-image generation using ControlNet
Support for xFormers acceleration
Support for video generation based on MultiFrame Render
Support for calling base models and sampling strategies through a Wrapper

To improve your “spellcasting” efficiency, we have made the following adjustments to the “magic circuit”:

Support for 33 algorithms accelerated by PyTorch 2.0
Refactored DataSample to support combining and splitting along the batch dimension
Refactored DataPreprocessor to unify the data format across tasks during training and inference
Refactored MultiValLoop and MultiTestLoop to support evaluating both generation-type metrics (e.g. FID) and reconstruction-type metrics (e.g. SSIM), and to evaluate multiple datasets at once
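
The idea of combining and splitting along the batch dimension can be illustrated with a small sketch. This is not MMagic's DataSample implementation: `stack_samples` and `split_batch` are hypothetical helpers built on plain dictionaries, and they assume every sample carries the same set of fields.

```python
from typing import Any, Dict, List

Sample = Dict[str, Any]


def stack_samples(samples: List[Sample]) -> Dict[str, List[Any]]:
    """Combine per-sample dicts into one batched dict (hypothetical helper)."""
    if not samples:
        return {}
    keys = samples[0].keys()
    for sample in samples:
        if sample.keys() != keys:
            raise ValueError("all samples must share the same fields")
    # Each field becomes a list whose i-th entry belongs to the i-th sample.
    return {key: [sample[key] for sample in samples] for key in keys}


def split_batch(batch: Dict[str, List[Any]]) -> List[Sample]:
    """Split a batched dict back into a list of per-sample dicts."""
    if not batch:
        return []
    sizes = {len(values) for values in batch.values()}
    if len(sizes) != 1:
        raise ValueError("all fields must have the same batch size")
    batch_size = sizes.pop()
    return [{key: values[i] for key, values in batch.items()}
            for i in range(batch_size)]


# Illustrative field names only; real samples would hold tensors and metadata.
samples = [
    {"img_path": "a.png", "scale": 4},
    {"img_path": "b.png", "scale": 4},
]
batch = stack_samples(samples)
restored = split_batch(batch)
```

Stacking followed by splitting returns the original list of samples, which is the round-trip property a batched data container would be expected to preserve.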

Talk is cheap. Show me the demo!

1. Supporting Inferencer: quickly run model inference with just a few lines of code

MMagic provides a fast inference API: specify a model by name and call it directly.

from mmagic.apis import MMagicInferencer
# create an inferencer!
magician = MMagicInferencer(model_name='stable_diffusion')
text_prompts = 'A mecha robot in a favela in expressionist style'
result_out_dir = 'robot.png'
magician.infer(text=text_prompts, result_out_dir=result_out_dir)
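
To generate images for several prompts, the same `infer` call can be wrapped in a loop. The following is a minimal sketch rather than part of MMagic: `generate_images` and the output naming scheme are hypothetical, and the only API it relies on is `infer(text=..., result_out_dir=...)` as used in the snippet above. A stand-in object is used so the sketch runs without MMagic installed.

```python
import os
from typing import Any, Dict, List


def generate_images(inferencer: Any, prompts: List[str],
                    out_dir: str) -> Dict[str, str]:
    """Call `infer` once per prompt and return a prompt -> output path map."""
    outputs = {}
    for index, prompt in enumerate(prompts):
        # Hypothetical naming scheme: prompt_000.png, prompt_001.png, ...
        out_path = os.path.join(out_dir, f"prompt_{index:03d}.png")
        inferencer.infer(text=prompt, result_out_dir=out_path)
        outputs[prompt] = out_path
    return outputs


class RecordingInferencer:
    """Stand-in that only records calls; it does not generate any image."""

    def __init__(self) -> None:
        self.calls = []

    def infer(self, text: str, result_out_dir: str) -> None:
        self.calls.append((text, result_out_dir))


stand_in = RecordingInferencer()
paths = generate_images(
    stand_in,
    ["A mecha robot in a favela in expressionist style",
     "A lighthouse at dawn in watercolor style"],
    "outputs",
)

# With MMagic installed, pass the real inferencer instead of the stand-in:
# generate_images(magician, prompts, "outputs")
```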

The demo can also be performed through the command line:

# --result-out-dir: save images and videos to `eg3d_output`
# --interpolation camera: interpolate the camera position only
# --num-images 100: generate 100 images during interpolation
python demo/mmediting_inference_demo.py --model-name eg3d \
    --model-config configs/eg3d/eg3d_cvt-official-rgb_afhq-512x512.py \
    --model-ckpt https://download.openmmlab.com/mmediting/eg3d/eg3d_cvt-official-rgb_afhq-512x512-ca1dd7c9.pth \
    --result-out-dir eg3d_output \
    --interpolation camera \
    --num-images 100

2. Supporting MultiLoop: evaluate multiple datasets in one run

To let users evaluate multiple metrics on multiple datasets at once, we provide MultiValLoop and MultiTestLoop.

# 1. Use `MultiValLoop` to replace `ValLoop` provided by MMEngine
val_cfg = dict(type='MultiValLoop')
 
# 2. Set evaluation metrics for different datasets
div2k_evaluator = dict(
    type='EditEvaluator',
    metrics=dict(type='SSIM', crop_border=2, prefix='DIV2K'))
set5_evaluator = dict(
    type='EditEvaluator',
    metrics=[
        dict(type='PSNR', crop_border=2, prefix='Set5'),
        dict(type='SSIM', crop_border=2, prefix='Set5'),
    ])
val_evaluator = [div2k_evaluator, set5_evaluator]

# 3. Define datasets
div2k_dataloader = dict(...)
set5_dataloader = dict(...)
val_dataloader = [div2k_dataloader, set5_dataloader]
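
Because the evaluators and dataloaders are given as two parallel lists, it can help to check that they line up before launching a run. The sketch below is a hypothetical sanity check, not part of MMagic: it assumes the i-th evaluator is paired with the i-th dataloader, as the config above suggests, and the `prefix/type` labels are just this sketch's own display format. The dataloader dicts are placeholders, since the real ones are elided above.

```python
from typing import Any, Dict, List

Config = Dict[str, Any]


def metric_labels(evaluator: Config) -> List[str]:
    """Return 'prefix/type' labels for the metrics of one evaluator config."""
    metrics = evaluator["metrics"]
    if isinstance(metrics, dict):  # a single metric may be given without a list
        metrics = [metrics]
    return [f"{metric['prefix']}/{metric['type']}" for metric in metrics]


def summarize_eval_plan(dataloaders: List[Config],
                        evaluators: List[Config]) -> List[List[str]]:
    """Check the two lists have equal length and list metrics per dataset."""
    if len(dataloaders) != len(evaluators):
        raise ValueError(
            f"{len(dataloaders)} dataloaders but {len(evaluators)} evaluators")
    return [metric_labels(evaluator) for evaluator in evaluators]


div2k_evaluator = dict(
    type='EditEvaluator',
    metrics=dict(type='SSIM', crop_border=2, prefix='DIV2K'))
set5_evaluator = dict(
    type='EditEvaluator',
    metrics=[
        dict(type='PSNR', crop_border=2, prefix='Set5'),
        dict(type='SSIM', crop_border=2, prefix='Set5'),
    ])

# Placeholder dataloader configs, used only to make the sketch runnable.
div2k_dataloader = dict(name='div2k')
set5_dataloader = dict(name='set5')

plan = summarize_eval_plan(
    [div2k_dataloader, set5_dataloader],
    [div2k_evaluator, set5_evaluator])
```

Here `plan` holds one list of metric labels per dataset, so a mismatch between the two lists is caught before any evaluation starts.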

3. MMagic supports ControlNet and multi-frame rendering for creating stunning images/long videos.

Take a break from your research and play basketball!

4. SAM + MMagic = Generate Anything!

SAM (Segment Anything Model) has been very popular lately, and it can be combined with MMagic as well! If you want to create your own animation, head over to OpenMMLab PlayGround. With MMagic, experience more magic in generation!

Let’s open a new era beyond editing together. More than Editing, Unlock the Magic!

Written by OpenMMLab
