Awesome Datasets for Super-Resolution: Introduction and Pre-processing
In image/video super-resolution research, a solid understanding of the datasets is crucial. As a toolbox for low-level vision tasks, MMEditing supports a large number of SOTA super-resolution models as well as the popular super-resolution datasets they rely on. This article provides an overview of commonly used super-resolution datasets and how they are used in both image and video super-resolution tasks.
https://github.com/open-mmlab/mmediting
1. Image Super-Resolution Datasets
1.1 DIV2K
The DIV2K dataset is one of the most popular datasets used for image super-resolution. It was collected for the NTIRE 2017 and NTIRE 2018 Super-Resolution Challenges. The dataset is composed of 800 images for training, 100 images for validation, and 100 images for testing. Each image has a 2K resolution.
1.1.1 Download
The DIV2K dataset can be downloaded from its homepage. For general image super-resolution, the folder structure should look like:
data
├── DIV2K
│ ├── DIV2K_train_HR
│ │ ├── 0001.png
│ │ ├── 0002.png
│ │ ├── ...
│ ├── DIV2K_train_LR_bicubic
│ │ ├── X2
│ │ ├── X3
│ │ ├── X4
│ ├── DIV2K_valid_HR
│ ├── DIV2K_valid_LR_bicubic
│ │ ├── X2
│ │ ├── X3
│ │ ├── X4
1.1.2 Usage
MMEditing provides tutorials on the use of the DIV2K dataset.
If you want to use LMDB datasets for faster IO speed, you can make LMDB files by running the following command:
python tools/dataset_converters/div2k/preprocess_div2k_dataset.py --data-root ./data/DIV2K --make-lmdb
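To give a feel for why LMDB helps: instead of opening thousands of small files during training, the data loader reads encoded image bytes from a single memory-mapped database. Below is a minimal sketch of that idea using the lmdb and OpenCV packages; it is not MMEditing's actual LMDB format (which also stores metadata), and the paths and key scheme are only illustrative.

# Minimal sketch (not MMEditing's implementation) of storing and reading
# encoded PNG bytes in LMDB. Paths and key names are illustrative only.
import glob
import os

import cv2
import lmdb
import numpy as np

def build_lmdb(img_dir, lmdb_path, map_size=1 << 34):
    """Write every PNG under img_dir into an LMDB file, keyed by file name."""
    env = lmdb.open(lmdb_path, map_size=map_size)
    with env.begin(write=True) as txn:
        for path in sorted(glob.glob(os.path.join(img_dir, '*.png'))):
            key = os.path.splitext(os.path.basename(path))[0].encode('ascii')
            with open(path, 'rb') as f:
                txn.put(key, f.read())  # store the encoded bytes as-is
    env.close()

def read_from_lmdb(lmdb_path, key):
    """Read one image back and decode it with OpenCV."""
    env = lmdb.open(lmdb_path, readonly=True, lock=False)
    with env.begin() as txn:
        buf = txn.get(key.encode('ascii'))
    env.close()
    return cv2.imdecode(np.frombuffer(buf, dtype=np.uint8), cv2.IMREAD_COLOR)

# Example usage (paths are placeholders):
# build_lmdb('./data/DIV2K/DIV2K_train_HR', './data/DIV2K/DIV2K_train_HR.lmdb')
# img = read_from_lmdb('./data/DIV2K/DIV2K_train_HR.lmdb', '0001')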
MMEditing also supports cropping DIV2K images to sub-images for faster IO. You can run the following command:
python tools/dataset_converters/div2k/preprocess_div2k_dataset.py --data-root ./data/DIV2K
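Sub-image cropping trades a one-off preprocessing pass for much smaller files at training time, since random crops then come from small patches rather than full 2K images. The sketch below illustrates the idea; the patch size, step, and naming scheme are assumptions rather than the script's actual defaults.

# Rough sketch of sub-image cropping; patch size, step, and naming are assumptions.
import glob
import os

import cv2

def crop_to_sub_images(img_dir, out_dir, crop_size=480, step=240):
    """Slide a crop_size window with the given step and save each patch."""
    os.makedirs(out_dir, exist_ok=True)
    for path in sorted(glob.glob(os.path.join(img_dir, '*.png'))):
        img = cv2.imread(path, cv2.IMREAD_COLOR)
        h, w = img.shape[:2]
        name = os.path.splitext(os.path.basename(path))[0]
        idx = 0
        for top in range(0, h - crop_size + 1, step):
            for left in range(0, w - crop_size + 1, step):
                idx += 1
                patch = img[top:top + crop_size, left:left + crop_size]
                cv2.imwrite(os.path.join(out_dir, f'{name}_s{idx:03d}.png'), patch)

# crop_to_sub_images('./data/DIV2K/DIV2K_train_HR', './data/DIV2K/DIV2K_train_HR_sub')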
1.2 DF2K
The DF2K dataset is a merger of DIV2K and Flickr2K, proposed in Enhanced Deep Residual Networks for Single Image Super-Resolution (EDSR). The Flickr2K dataset contains 2,650 images at 2K resolution and can be downloaded from here.
1.2.1 Download
Download DIV2K and Flickr2K separately, then merge them.
1.2.2 Usage
The Flickr2K dataset has the same folder structure as the DIV2K dataset, so the two can simply be merged; a sketch of one possible way to do this follows the folder layout below.
data
├── Flickr2K
│ ├── Flickr2K_HR
│ │ ├── 000001.png
│ │ ├── 000002.png
│ │ ├── ...
│ ├── Flickr2K_LR_bicubic
│ │ ├── X2
│ │ ├── X3
│ │ ├── X4
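For illustration, merging can be as simple as copying (or symlinking) the HR images from both datasets into one combined folder. The DF2K folder name and the file-name prefixing below are assumptions, not an official layout.

# Hypothetical sketch: gather DIV2K and Flickr2K HR images into one DF2K folder.
import glob
import os
import shutil

src_dirs = ['./data/DIV2K/DIV2K_train_HR', './data/Flickr2K/Flickr2K_HR']
dst_dir = './data/DF2K/DF2K_HR'  # combined folder name is an assumption
os.makedirs(dst_dir, exist_ok=True)

for src in src_dirs:
    prefix = os.path.basename(os.path.dirname(src))  # e.g. 'DIV2K', 'Flickr2K'
    for path in sorted(glob.glob(os.path.join(src, '*.png'))):
        # Prefix file names with the source dataset so '0001.png' and
        # '000001.png' style names cannot collide.
        dst_name = f'{prefix}_{os.path.basename(path)}'
        shutil.copy(path, os.path.join(dst_dir, dst_name))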
1.3 DF2K_OST
The DF2K_OST dataset is a merger of DF2K and OST, proposed in Real-ESRGAN.
1.3.1 Download
You can download DIV2K, Flickr2K, and OST separately, or download the merged dataset from here.
1.3.2 Usage
MMEditing provides tutorials on the use of the DF2K_OST dataset.
If you want to use LMDB datasets for faster IO speed, you can make LMDB files by running the following command:
python tools/dataset_converters/df2k_ost/preprocess_df2k_ost_dataset.py --data-root ./data/df2k_ost --make-lmdb
MMEditing also supports cropping DF2K_OST images to sub-images for faster IO. You can run the following command:
python tools/dataset_converters/df2k_ost/preprocess_df2k_ost_dataset.py --data-root ./data/df2k_ost
1.4 Face Super-Resolution Datasets
Some face image datasets, such as FFHQ and CelebA-HQ, are used for face super-resolution. The FFHQ (Flickr-Faces-HQ) dataset is a high-quality dataset of human faces, consisting of 70,000 images at 1,024x1,024 resolution. The CelebA-HQ dataset is a high-quality version of CelebA that consists of 30,000 images at 1,024x1,024 resolution.
1.4.1 Download
The FFHQ dataset can be downloaded from its homepage. The CelebA-HQ dataset can be prepared following tutorials from its homepage.
1.4.2 Usage
MMEditing provides tutorials on the use of the FFHQ and CelebA-HQ datasets. You can generate the downsampled images with the following commands:
python tools/dataset_converters/glean/preprocess_ffhq_celebahq_dataset.py --data-root ./data/ffhq/images
python tools/dataset_converters/glean/preprocess_ffhq_celebahq_dataset.py --data-root ./data/CelebA-HQ/GT
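For reference, the bicubic downsampling itself is straightforward. The sketch below shows one way to produce LR face images with OpenCV, assuming a 4x scale; the output path is hypothetical, and the actual preprocessing script may use a different scale factor and a different (e.g. MATLAB-style) bicubic kernel.

# Minimal sketch of generating bicubic LR face images with OpenCV; the 4x
# scale and output paths are assumptions, and OpenCV's bicubic kernel differs
# slightly from MATLAB's, which some benchmarks assume.
import glob
import os

import cv2

def downsample_bicubic(gt_dir, lq_dir, scale=4):
    os.makedirs(lq_dir, exist_ok=True)
    for path in sorted(glob.glob(os.path.join(gt_dir, '*.png'))):
        img = cv2.imread(path, cv2.IMREAD_COLOR)
        h, w = img.shape[:2]
        lr = cv2.resize(img, (w // scale, h // scale), interpolation=cv2.INTER_CUBIC)
        cv2.imwrite(os.path.join(lq_dir, os.path.basename(path)), lr)

# downsample_bicubic('./data/ffhq/images', './data/ffhq/BIx4')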
1.5 Common Benchmark Datasets
Common benchmark datasets for evaluating image super-resolution performance include Set5, Set14, Urban100, BSDS100, Manga109, and others. The Set5 dataset is composed of 5 images (“baby”, “bird”, “butterfly”, “head”, “woman”). The Set14 dataset is composed of 14 images (“baboon”, “barbara”, “bridge”, “coastguard”, “comic”, “face”, “flowers”, “foreman”, “lenna”, “man”, “monarch”, “pepper”, “ppt3”, “zebra”). The Urban100 dataset contains 100 building images of urban scenes. The BSDS100 dataset contains 100 images ranging from natural scenes to object-specific images such as plants, people, and food. The Manga109 dataset contains 109 Japanese manga volumes.
These datasets can be downloaded from here.
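Since these sets are used purely for evaluation, a typical workflow is to super-resolve their LR inputs with a trained model and compare the results against the ground truth using metrics such as PSNR. Here is a minimal PSNR sketch on RGB images; note that many papers instead evaluate on the Y channel and crop a border of scale pixels, which is omitted here.

# Minimal PSNR sketch on RGB uint8 images; Y-channel conversion and border
# cropping, which many papers apply, are intentionally omitted.
import cv2
import numpy as np

def psnr(img1, img2):
    mse = np.mean((img1.astype(np.float64) - img2.astype(np.float64)) ** 2)
    if mse == 0:
        return float('inf')
    return 10 * np.log10(255.0 ** 2 / mse)

# gt = cv2.imread('Set5/GT/baby.png')   # paths are placeholders
# sr = cv2.imread('results/baby.png')
# print(psnr(gt, sr))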
2. Video Super-Resolution Datasets
2.1 REDS
The REDS (Realistic and Dynamic Scenes) dataset was proposed in the NTIRE 2019 Challenge. It is often used for video deblurring and super-resolution. The dataset is composed of 240 video sequences for training, 30 video sequences for validation, and 30 video sequences for testing. Each video sequence has 100 consecutive frames with a resolution of 720x1,280.
2.1.1 Download
The REDS dataset can be downloaded from its homepage. For video super-resolution, you should download train_sharp, train_sharp_bicubic, val_sharp, and val_sharp_bicubic.
2.1.2 Usage
In the official REDS dataset, the training and validation subsets are publicly available, but the testing subset is not. The most common practice is to merge the original training and validation subsets into one training set and to hold out four sequences (‘000’, ‘011’, ‘015’, and ‘020’) from the original training subset as the test set, known as REDS4.
The folder structure should look like:
data
├── REDS
│ ├── train_sharp
│ │ ├── 000
│ │ ├── 001
│ │ ├── ...
│ ├── train_sharp_bicubic
│ │ ├── X4
│ │ │ ├── 000
│ │ │ ├── 001
│ │ │ ├── ...
├── REDS4
│ ├── sharp
│ ├── sharp_bicubic
MMEditing provides tutorials on the use of the REDS dataset. You can prepare the REDS dataset with the following command:
python tools/dataset_converters/reds/preprocess_reds_dataset.py --root-path ./data/REDS
If you want to use LMDB datasets for faster IO speed, you can make LMDB files by running the following command:
python tools/dataset_converters/reds/preprocess_reds_dataset.py --root-path ./data/REDS --make-lmdb
MMEditing also supports cropping REDS images to sub-images for faster IO. You can run the following command:
python tools/dataset_converters/reds/crop_sub_images.py --data-root ./data/REDS --scales 4
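The preparation script above handles this regrouping for you. Conceptually, building REDS4 just means moving the four held-out sequences out of the merged training data, as in this hypothetical sketch (the folder names are assumptions):

# Hypothetical sketch of carving REDS4 ('000', '011', '015', '020') out of
# the merged training data; folder names are assumptions.
import os
import shutil

reds4_clips = ['000', '011', '015', '020']
pairs = [
    ('./data/REDS/train_sharp', './data/REDS4/sharp'),
    ('./data/REDS/train_sharp_bicubic/X4', './data/REDS4/sharp_bicubic'),
]

for src_root, dst_root in pairs:
    os.makedirs(dst_root, exist_ok=True)
    for clip in reds4_clips:
        shutil.move(os.path.join(src_root, clip), os.path.join(dst_root, clip))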
2.2 Vimeo-90K
The Vimeo-90K dataset is a large-scale, high-quality video dataset proposed in Video Enhancement with Task-Oriented Flow (TOFlow). It is designed for four video processing tasks: temporal frame interpolation, video denoising, video deblocking, and video super-resolution. The dataset is composed of the Triplet dataset (for temporal frame interpolation) and the Septuplet dataset (for video denoising, deblocking, and super-resolution). The Septuplet dataset consists of 91,701 seven-frame sequences with a resolution of 256x448.
2.2.1 Download
The Vimeo-90K dataset can be downloaded from its homepage. For video super-resolution, you should download the Septuplet dataset (82 GB).
The folder structure should look like:
vimeo_septuplet
├── sequences
│ ├── 00001
│ │ ├── 0001
│ │ │ ├── im1.png
│ │ │ ├── im2.png
│ │ │ ├── ...
│ │ ├── 0002
│ │ ├── 0003
│ │ ├── ...
│ ├── 00002
│ ├── ...
├── sep_trainlist.txt
├── sep_testlist.txt
2.2.2 Usage
The original Vimeo-90K dataset does not provide downsampled video sequences, so the LR sequences need to be generated before use.
MMEditing provides tutorials on the use of the Vimeo-90K dataset. You can generate the downsampled images with the following command:
python tools/dataset_converters/vimeo90k/preprocess_vimeo90k_dataset.py --data-root ./data/vimeo90k
If you want to use LMDB datasets for faster IO speed, you can make LMDB files by running the following command:
python tools/dataset_converters/vimeo90k/preprocess_vimeo90k_dataset.py --data-root ./data/vimeo90k --train_list ./data/vimeo90k/sep_trainlist.txt --gt-path ./data/vimeo90k/GT --lq-path ./data/vimeo90k/BIx4 --make-lmdb
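The sep_trainlist.txt and sep_testlist.txt files list sequences as clip/sequence pairs (e.g. 00001/0001), one per line. The sketch below shows how such a list can be expanded into the seven frame paths of each septuplet, following the folder layout shown above; the helper name is just for illustration.

# Sketch: expand sep_trainlist.txt entries (e.g. '00001/0001') into the seven
# frame paths of each septuplet, following the folder layout shown above.
import os

def list_septuplet_frames(data_root, list_file):
    frames = []
    with open(list_file) as f:
        for line in f:
            seq = line.strip()           # e.g. '00001/0001'
            if not seq:
                continue
            seq_dir = os.path.join(data_root, 'sequences', seq)
            frames.append([os.path.join(seq_dir, f'im{i}.png') for i in range(1, 8)])
    return frames

# septuplets = list_septuplet_frames('./data/vimeo90k', './data/vimeo90k/sep_trainlist.txt')
# print(len(septuplets), septuplets[0])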
2.3 Vid4
The Vid4 dataset is one of the most popular testing datasets used for video super-resolution. The dataset consists of four video sequences: ‘calendar’ (41 frames with a resolution of 576x720), ‘city’ (34 frames with a resolution of 576x704), ‘foliage’ (49 frames with a resolution of 480x720), and ‘walk’ (47 frames with a resolution of 480x720).
2.3.1 Download
The Vid4 dataset can be downloaded from here.
2.3.2 Usage
MMEditing provides tutorials on the use of the Vid4 dataset.
2.4 UDM10
The UDM10 dataset is a common testing dataset used for video super-resolution. The dataset is composed of 10 video sequences. Each video sequence has 32 consecutive frames with a resolution of 720x1,272.
2.4.1 Download
The UDM10 dataset can be downloaded from here.
2.4.2 Usage
MMEditing provides tutorials on the use of the UDM10 dataset.
2.5 SPMCS
The SPMCS dataset is composed of 30 video sequences. Each video sequence has 31 consecutive frames with a resolution of 540x960. Each video sequence comes with bicubically downsampled inputs for x2, x3, and x4 scale factors as well as the high-resolution ground truth.
2.5.1 Download
The SPMCS dataset can be downloaded from its homepage or from here.
2.5.2 Usage
MMEditing provides tutorials on the use of the SPMCS dataset.
Let’s dive into the world of super-resolution! MMEditing has got your back with an array of top-notch super-resolution models and popular datasets. Want to train your own super-resolution model? No problem! MMEditing supports custom dataset designs too. And we’ve got you covered with all the documents and preprocessing scripts you’ll need to get started.