Awesome Datasets for Super-Resolution: Introduction and Pre-processing

OpenMMLab
Mar 31, 2023


In image/video super-resolution research, a good understanding of the datasets is crucial. As a toolbox for low-level vision tasks, MMEditing supports a large number of SOTA super-resolution models together with the popular datasets used to train and evaluate them. This article gives an overview of commonly used super-resolution datasets and how they are used in image and video super-resolution tasks.

https://github.com/open-mmlab/mmediting

1. Image Super-Resolution Datasets

1.1 DIV2K

The DIV2K dataset is one of the most popular datasets for image super-resolution. It was collected for the NTIRE 2017 and NTIRE 2018 Super-Resolution Challenges and is composed of 800 images for training, 100 images for validation, and 100 images for testing, each at 2K resolution.

1.1.1 Download

The DIV2K dataset can be downloaded from its homepage. For general image super-resolution, the folder structure should look like:

data
├── DIV2K
│ ├── DIV2K_train_HR
│ │ ├── 0001.png
│ │ ├── 0002.png
│ │ ├── ...
│ ├── DIV2K_train_LR_bicubic
│ │ ├── X2
│ │ ├── X3
│ │ ├── X4
│ ├── DIV2K_valid_HR
│ ├── DIV2K_valid_LR_bicubic
│ │ ├── X2
│ │ ├── X3
│ │ ├── X4

1.1.2 Usage

MMEditing provides tutorials on the use of the DIV2K dataset.
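
As a rough, hypothetical illustration, a DIV2K training-dataset entry in an MMEditing 1.x-style config might look like the sketch below. The type names, fields, and pipeline here are assumptions based on example configs rather than copied from the official tutorials, so please verify them against the MMEditing documentation.

# Hypothetical sketch of a DIV2K training-dataset entry in an MMEditing
# 1.x-style config; field names are assumptions and should be verified.
train_pipeline = [
    dict(type='LoadImageFromFile', key='img'),   # LR input
    dict(type='LoadImageFromFile', key='gt'),    # HR ground truth
    dict(type='PairedRandomCrop', gt_patch_size=128),
    dict(type='PackEditInputs'),
]

train_dataloader = dict(
    batch_size=16,
    num_workers=4,
    sampler=dict(type='InfiniteSampler', shuffle=True),
    dataset=dict(
        type='BasicImageDataset',
        data_root='./data/DIV2K',
        data_prefix=dict(
            img='DIV2K_train_LR_bicubic/X4', gt='DIV2K_train_HR'),
        filename_tmpl=dict(img='{}x4', gt='{}'),
        pipeline=train_pipeline))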

If you want to use LMDB datasets for faster IO speed, you can make LMDB files by running the following command:

python tools/dataset_converters/div2k/preprocess_div2k_dataset.py --data-root ./data/DIV2K --make-lmdb
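
The speed-up comes from storing all images as encoded bytes in a single memory-mapped file, so reading a training sample becomes a key lookup instead of many small file opens. The snippet below is a hypothetical sketch of reading one entry back; the LMDB path and key format are assumptions modeled on BasicSR-style LMDB files (keys are image file stems, values are encoded image bytes), not the exact output of the script above.

# Hypothetical sketch: read one image from a super-resolution LMDB file.
# Path and key naming are assumptions (BasicSR-style layout).
import cv2
import lmdb
import numpy as np

env = lmdb.open('./data/DIV2K/DIV2K_train_HR.lmdb',
                readonly=True, lock=False, readahead=False)
with env.begin(write=False) as txn:
    buf = txn.get('0001'.encode('ascii'))  # look up one image by key
img = cv2.imdecode(np.frombuffer(buf, dtype=np.uint8), cv2.IMREAD_COLOR)
print(img.shape)  # HWC uint8 array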

MMEditing also supports cropping DIV2K images to sub-images for faster IO. You can run the following command:

python tools/dataset_converters/div2k/preprocess_div2k_dataset.py --data-root ./data/DIV2K
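
Cropping to sub-images trades a few large 2K files for many small patches, so random crops during training touch far less data per read. The snippet below is a simplified, hypothetical illustration of the idea; the patch size and stride are arbitrary choices, and the official script additionally handles the LR/HR pairs, naming, and edge cases for you.

# Simplified, hypothetical illustration of cropping HR images into
# overlapping sub-images; patch size and stride are arbitrary choices.
import os
import cv2

src_dir = './data/DIV2K/DIV2K_train_HR'
dst_dir = './data/DIV2K/DIV2K_train_HR_sub'
patch, stride = 480, 240

os.makedirs(dst_dir, exist_ok=True)
for name in sorted(os.listdir(src_dir)):
    img = cv2.imread(os.path.join(src_dir, name))
    h, w = img.shape[:2]
    stem, ext = os.path.splitext(name)
    idx = 0
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            idx += 1
            sub = img[y:y + patch, x:x + patch]
            cv2.imwrite(os.path.join(dst_dir, f'{stem}_s{idx:03d}{ext}'), sub)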

1.2 DF2K

The DF2K dataset is a merger of DIV2K and Flickr2K, proposed in Enhanced Deep Residual Networks for Single Image Super-Resolution. The Flickr2K dataset contains 2,650 images at 2K resolution and can be downloaded from here.

1.2.1 Download

You can download DIV2K and Flickr2K separately and then merge them.

1.2.2 Usage

The Flickr2K dataset has the same folder structure as the DIV2K dataset, so you can simply merge the two (a small sketch of one way to do this follows the folder structure below).

data
├── Flickr2K
│ ├── Flickr2K_HR
│ │ ├── 000001.png
│ │ ├── 000002.png
│ │ ├── ...
│ ├── Flickr2K_LR_bicubic
│ │ ├── X2
│ │ ├── X3
│ │ ├── X4
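
Since the layouts mirror each other, "merging" can be as simple as symlinking (or copying) the images from both sources into a combined folder. The script below is a hypothetical sketch of the symlink approach for the HR images; the same loop can be repeated for the LR_bicubic/X2, X3, and X4 folders. Filenames don't clash because DIV2K uses 4-digit names and Flickr2K uses 6-digit names.

# Hypothetical sketch: merge DIV2K and Flickr2K HR images into one DF2K
# folder by symlinking both sources; names do not collide, so no renaming.
import os

sources = ['./data/DIV2K/DIV2K_train_HR', './data/Flickr2K/Flickr2K_HR']
dst = './data/DF2K/DF2K_train_HR'
os.makedirs(dst, exist_ok=True)

for src in sources:
    for name in sorted(os.listdir(src)):
        link = os.path.join(dst, name)
        if not os.path.exists(link):
            os.symlink(os.path.abspath(os.path.join(src, name)), link)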

1.3 DF2K_OST

The DF2K_OST dataset is a merger of DF2K and OST, proposed by Real-ESRGAN.

1.3.1 Download

You can download DIV2K, Flickr2K, and OST separately, or download the merged dataset from here.

1.3.2 Usage

MMEditing provides tutorials on the use of the DF2K_OST dataset.

If you want to use LMDB datasets for faster IO speed, you can make LMDB files by running the following command:

python tools/dataset_converters/df2k_ost/preprocess_df2k_ost_dataset.py --data-root ./data/df2k_ost --make-lmdb

MMEditing also supports cropping DF2K_OST images to sub-images for faster IO. You can run the following command:

python tools/dataset_converters/df2k_ost/preprocess_df2k_ost_dataset.py --data-root ./data/df2k_ost

1.4 Face Super-resolution datasets

Some face image datasets, such as FFHQ and CelebA-HQ, are used for face super-resolution. The FFHQ (Flickr-Faces-HQ) dataset is a high-quality dataset of human faces, consisting of 70,000 images at 1,024x1,024 resolution. The CelebA-HQ dataset is a high-quality version of CelebA, consisting of 30,000 images at 1,024x1,024 resolution.

1.4.1 Download

The FFHQ dataset can be downloaded from its homepage. The CelebA-HQ dataset can be prepared following tutorials from its homepage.

1.4.2 Usage

MMEditing provides tutorials on the use of the FFHQ and CelebA-HQ datasets. You can simply run the following commands to generate downsampled images.

python tools/dataset_converters/glean/preprocess_ffhq_celebahq_dataset.py --data-root ./data/ffhq/images
python tools/dataset_converters/glean/preprocess_ffhq_celebahq_dataset.py --data-root ./data/CelebA-HQ/GT

1.5 Common Benchmark datasets

The common benchmark datasets used to evaluate image super-resolution performance include Set5, Set14, Urban100, BSDS100, Manga109, and so on. The Set5 dataset is composed of 5 images (“baby”, “bird”, “butterfly”, “head”, “woman”). The Set14 dataset is composed of 14 images (“baboon”, “barbara”, “bridge”, “coastguard”, “comic”, “face”, “flowers”, “foreman”, “lenna”, “man”, “monarch”, “pepper”, “ppt3”, “zebra”). The Urban100 dataset contains 100 images of buildings in urban scenes. The BSDS100 dataset contains 100 natural images covering a wide range of content, such as plants, people, and food. The Manga109 dataset contains 109 Japanese manga volumes.

These datasets can be downloaded from here.
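These benchmarks are typically used at test time: the LR image is super-resolved, a border of scale pixels is cropped, and PSNR/SSIM are computed on the Y channel of YCbCr. Below is a minimal sketch of that convention using OpenCV; the file paths are placeholders, and note that numbers reported in papers usually follow MATLAB’s rgb2ycbcr conversion, which differs slightly from OpenCV’s.

# Minimal sketch of the common SR evaluation convention on benchmark sets:
# crop a border of `scale` pixels and compute PSNR on the Y (luma) channel.
import cv2
import numpy as np

def y_channel(img_bgr):
    # OpenCV loads BGR; convert to YCrCb and keep the luma channel.
    return cv2.cvtColor(img_bgr, cv2.COLOR_BGR2YCrCb)[..., 0].astype(np.float64)

def psnr_y(sr_path, gt_path, scale=4):
    sr = y_channel(cv2.imread(sr_path))[scale:-scale, scale:-scale]
    gt = y_channel(cv2.imread(gt_path))[scale:-scale, scale:-scale]
    mse = np.mean((sr - gt) ** 2)
    return float('inf') if mse == 0 else 10 * np.log10(255.0 ** 2 / mse)

# Placeholder paths for illustration only.
print(psnr_y('./results/Set5/baby_SRx4.png', './data/Set5/GT/baby.png'))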

2. Video Super-Resolution Datasets

2.1 REDS

The REDS (REalistic and Dynamic Scenes) dataset was proposed in the NTIRE 2019 Challenge and is often used for video deblurring and super-resolution. The dataset is composed of 240 video sequences for training, 30 for validation, and 30 for testing. Each video sequence has 100 consecutive frames at a resolution of 720x1,280.

2.1.1 Download

The REDS dataset can be downloaded from its homepage. For video super-resolution, you should download train_sharp, train_sharp_bicubic, val_sharp, and val_sharp_bicubic.

2.1.2 Usage

In the official REDS dataset, the training and validation subsets are publicly available, but the testing subset is not. The most common practice is to merge the original training and validation subsets into one training set and to select four sequences (‘000’, ‘011’, ‘015’, and ‘020’) from the original training subset as a test set, named REDS4 (a sketch of this split follows the folder structure below).

The folder structure should look like:

data
├── REDS
│ ├── train_sharp
│ │ ├── 000
│ │ ├── 001
│ │ ├── ...
│ ├── train_sharp_bicubic
│ │ ├── X4
│ │ | ├── 000
│ │ | ├── 001
│ │ | ├── ...
├── REDS4
│ ├── sharp
│ ├── sharp_bicubic
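
To make the REDS4 convention concrete, here is an illustrative, hypothetical sketch of moving the four test clips out of the training set; the folder names follow the layout above and are assumptions, and the official preprocessing command below handles all of this for you.

# Illustrative sketch only: move the four REDS4 clips out of the training
# set so they can be used for testing; the official script does this for you.
import shutil
from pathlib import Path

reds4_clips = ['000', '011', '015', '020']
root = Path('./data')

for subset, dst_name in [('train_sharp', 'sharp'),
                         ('train_sharp_bicubic/X4', 'sharp_bicubic')]:
    for clip in reds4_clips:
        src = root / 'REDS' / subset / clip
        dst = root / 'REDS4' / dst_name / clip
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dst))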

MMEditing provides tutorials on the use of the REDS dataset. You can just run a command to prepare the REDS dataset.

python tools/dataset_converters/reds/preprocess_reds_dataset.py --root-path ./data/REDS

If you want to use LMDB datasets for faster IO speed, you can make LMDB files by running the following command:

python tools/dataset_converters/reds/preprocess_reds_dataset.py --root-path ./data/REDS --make-lmdb

MMEditing also supports cropping REDS images to sub-images for faster IO. You can run the following command:

python tools/dataset_converters/reds/crop_sub_images.py --data-root ./data/REDS --scales 4

2.2 Vimeo-90K

Vimeo-90K is a large-scale, high-quality video dataset proposed in Video Enhancement with Task-Oriented Flow. It is designed for four video processing tasks: temporal frame interpolation, video denoising, video deblocking, and video super-resolution. The dataset is composed of the Triplet dataset (for temporal frame interpolation) and the Septuplet dataset (for video denoising, deblocking, and super-resolution). The Septuplet dataset consists of 91,701 7-frame sequences at a resolution of 256x448.

2.2.1 Download

The Vimeo-90K dataset can be downloaded from its homepage. For video super-resolution, you should download the Septuplet dataset (82 GB).

The folder structure should look like:

vimeo_septuplet
├── sequences
│ ├── 00001
│ │ ├── 0001
│ │ │ ├── im1.png
│ │ │ ├── im2.png
│ │ │ ├── ...
│ │ ├── 0002
│ │ ├── 0003
│ │ ├── ...
│ ├── 00002
│ ├── ...
├── sep_trainlist.txt
├── sep_testlist.txt

2.2.2 Usage

The original Vimeo-90K dataset doesn’t provide downsampled LR video sequences, so we need to generate them before use.

MMEditing provides tutorials on the use of the Vimeo-90K dataset. You can just run a command to generate downsampled images.

python tools/dataset_converters/vimeo90k/preprocess_vimeo90k_dataset.py --data-root ./data/vimeo90k
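
Conceptually, the LR generation walks every 7-frame sequence and bicubically downsamples each frame by 4x while mirroring the directory layout. The snippet below is a simplified, hypothetical version of that step using Pillow, assuming the HR frames have been gathered under ./data/vimeo90k/GT as in the LMDB command below (you could equally point gt_root at vimeo_septuplet/sequences); the official script may use a different bicubic implementation (e.g. MATLAB-style), which can slightly change results.

# Simplified, hypothetical sketch of generating BIx4 LR frames for Vimeo-90K
# with Pillow's bicubic resampling; the output mirrors the input layout.
from pathlib import Path
from PIL import Image

gt_root = Path('./data/vimeo90k/GT')     # 256x448 HR frames
lq_root = Path('./data/vimeo90k/BIx4')   # 64x112 LR frames
scale = 4

for frame in sorted(gt_root.rglob('im*.png')):
    img = Image.open(frame)
    lr = img.resize((img.width // scale, img.height // scale), Image.BICUBIC)
    out = lq_root / frame.relative_to(gt_root)
    out.parent.mkdir(parents=True, exist_ok=True)
    lr.save(out)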

If you want to use LMDB datasets for faster IO speed, you can make LMDB files by running the following command:

python tools/dataset_converters/vimeo90k/preprocess_vimeo90k_dataset.py --data-root ./data/vimeo90k --train_list ./data/vimeo90k/sep_trainlist.txt --gt-path ./data/vimeo90k/GT --lq-path ./data/vimeo90k/BIx4 --make-lmdb

2.3 Vid4

The Vid4 dataset is one of the most popular testing datasets used for video super-resolution. The dataset consists of four video sequences: ‘calendar’ (41 frames with a resolution of 576x720), ‘city’ (34 frames with a resolution of 576x704), ‘foliage’ (49 frames with a resolution of 480x720), and ‘walk’ (47 frames with a resolution of 480x720).

2.3.1 Download

The Vid4 dataset can be downloaded from here.

2.3.2 Usage

MMEditing provides tutorials on the use of the Vid4 dataset.

2.4 UDM10

The UDM10 dataset is a common testing dataset used for video super-resolution. The dataset is composed of 10 video sequences. Each video sequence has 32 consecutive frames with a resolution of 720x1,272.

2.4.1 Download

The UDM10 dataset can be downloaded from here.

2.4.2 Usage

MMEditing provides tutorials on the use of the UDM10 dataset.

2.5 SPMCS

The SPMCS dataset is composed of 30 video sequences. Each video sequence has 31 consecutive frames at a resolution of 540x960 and includes bicubic-downsampled inputs for x2, x3, and x4 scale factors as well as the high-resolution ground truth.

2.5.1 Download

The SPMCS dataset can be downloaded from its homepage or from here.

2.5.2 Usage

MMEditing provides tutorials on the use of the SPMCS dataset.

Let’s dive into the world of super-resolution! MMEditing has got your back with an array of top-notch super-resolution models and popular datasets. Want to train a super-resolution model on your own data? No problem! MMEditing supports custom datasets too, and we’ve got you covered with all the documentation and preprocessing scripts you’ll need to get started.
