Announcing MMFlow: OpenMMLab Optical Flow Toolbox and Benchmark
MMFlow is an open-source optical flow toolbox based on PyTorch and the first toolbox to provide a unified framework for implementing and evaluating optical flow algorithms. It is part of the OpenMMLab project.
Optical flow is a 2D velocity field representing the apparent 2D image motion of pixels from the reference image to the target image [1]. Here is an example of a visualized flow map from the Sintel dataset [2, 3]. The character moving leftward produces optical flow that is rendered mainly in blue, according to the color wheel encoding the direction-color relationship.
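To make the direction-color relationship concrete, here is a minimal sketch assuming a standard HSV encoding: direction maps to hue and magnitude to brightness. This is not MMFlow’s own implementation, and flow_to_color is a hypothetical helper.

import numpy as np
import cv2

def flow_to_color(flow: np.ndarray) -> np.ndarray:
    """Render an (H, W, 2) flow field as an RGB image."""
    u = flow[..., 0].astype(np.float32)
    v = flow[..., 1].astype(np.float32)
    magnitude, angle = cv2.cartToPolar(u, v, angleInDegrees=True)
    hsv = np.zeros((*flow.shape[:2], 3), dtype=np.uint8)
    hsv[..., 0] = (angle / 2).astype(np.uint8)  # direction -> hue (OpenCV hue range is 0-179)
    hsv[..., 1] = 255                           # full saturation
    hsv[..., 2] = cv2.normalize(magnitude, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)  # speed -> brightness
    return cv2.cvtColor(hsv, cv2.COLOR_HSV2RGB)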
Optical flow estimation is a fundamental building block in many real-world applications. In 2015, FlowNet marked a milestone by using a CNN for optical flow estimation: its accuracy approached that of state-of-the-art energy-minimization approaches while running several orders of magnitude faster.
Yet implementations of optical flow models tend to be standalone, making it hard to obtain or reproduce quality baselines. In particular, flow models are implemented across different deep-learning frameworks, forcing researchers to switch between frameworks or libraries and making it hard to compare the speed and accuracy of models fairly.
Here, we’re excited to announce our new project — MMFlow, which strives to overcome these challenges! Thanks to the generic framework from OpenMMLab, MMFlow is able to unify the implementation and evaluation of optical flow algorithms. There are many more features out there, including but not limited to:
- The First Unified Framework for Optical Flow: MMFlow is the first toolbox that provides a framework for unified implementation and evaluation of optical flow algorithms.
- Flexible and Modular Design: We decompose the flow estimation framework into different components, making it much easier and more flexible to build a new model by combining different modules (see the config sketch after this list).
- Plenty of Algorithms and Datasets Out of the Box: The toolbox directly supports popular and contemporary optical flow models, e.g. FlowNet, PWC-Net, RAFT, etc., and representative datasets, e.g. FlyingChairs, FlyingThings3D, Sintel, KITTI, etc.
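As a taste of the modular design highlighted above, here is a hedged sketch of an OpenMMLab-style config that assembles a flow estimator from interchangeable parts. The type names and fields are illustrative assumptions; the real schemas live in the files under configs/ in the repo.

# Hedged sketch of a model config; type names and fields are assumptions.
model = dict(
    type='PWCNet',                       # the flow estimator wrapping the parts below
    encoder=dict(type='PWCNetEncoder'),  # extracts feature pyramids from both frames
    decoder=dict(type='PWCNetDecoder'),  # estimates flow from the encoded features
)

Building a new model is then largely a matter of swapping these dicts while reusing the same dataset and schedule settings.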
If these are exactly what you need, don’t hesitate to check out our project on GitHub: https://github.com/open-mmlab/mmflow! Feel free to watch, star, and clone the project. You are also welcome to help it grow by raising issues and even making PRs!
Run a Demo
We provide two types of demos, which give a sense of how MMFlow works. The first predicts the optical flow between two adjacent frames and renders the flow map:
python demo/image_demo.py ${IMAGE1} ${IMAGE2} \
    ${CONFIG_FILE} ${CHECKPOINT_FILE} ${OUTPUT_DIR} \
    [--out_prefix ${OUTPUT_PREFIX}] [--device ${DEVICE}]
Optional arguments:
- --out_prefix: The prefix of the output results, including the raw flow files and the visualized flow maps.
- --device: Device used for inference.
Example:
Assume that you have already downloaded the checkpoints to the directory checkpoints/, and the output will be saved in the directory raft_demo.
python demo/image_demo.py demo/frame_0001.png demo/frame_0002.png \
    configs/raft/raft_8x2_100k_mixed_368x768.py \
    checkpoints/raft_8x2_100k_mixed_368x768.pth raft_demo
You will get the raw flow file raft_demo.flo and the visualized flow map raft_demo.png.
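The same prediction can also be scripted. Here is a minimal sketch, assuming the high-level helpers exposed in mmflow.apis; the exact signatures may differ, so consult the demo scripts in the repo.

# Scripted inference mirroring the shell example above; the helper
# signatures are assumptions based on the OpenMMLab API convention.
from mmflow.apis import inference_model, init_model

model = init_model('configs/raft/raft_8x2_100k_mixed_368x768.py',
                   'checkpoints/raft_8x2_100k_mixed_368x768.pth',
                   device='cuda:0')
flow = inference_model(model, 'demo/frame_0001.png', 'demo/frame_0002.png')
print(flow.shape)  # (H, W, 2): per-pixel (dx, dy) displacement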
The second demo predicts the optical flow for a video and renders the result as a flow video:
python demo/video_demo.py ${VIDEO} ${CONFIG_FILE} ${CHECKPOINT_FILE} ${OUTPUT_FILE} \
    [--gt ${GROUND_TRUTH}] [--device ${DEVICE}]
Optional arguments:
- --gt: The ground-truth video for the input video. If specified, the ground-truth frames will be concatenated with the predicted flow maps for comparison.
- --device: Device used for inference.
Example:
Assume that you have already downloaded the checkpoints to the directory checkpoints/, and the output will be saved as raft_demo.mp4.
python demo/video_demo.py demo/demo.mp4 \
    configs/raft/raft_8x2_100k_mixed_368x768.py \
    checkpoints/raft_8x2_100k_mixed_368x768.pth \
    raft_demo.mp4 --gt demo/demo_gt.mp4
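Under the hood, a video demo boils down to running the model on consecutive frame pairs. Here is a hedged sketch of that loop, assuming inference_model also accepts in-memory frames; demo/video_demo.py is the authoritative version.

# Hypothetical scripted video inference: estimate flow for each consecutive
# frame pair, as the video demo is assumed to do internally.
import cv2
from mmflow.apis import inference_model, init_model

model = init_model('configs/raft/raft_8x2_100k_mixed_368x768.py',
                   'checkpoints/raft_8x2_100k_mixed_368x768.pth',
                   device='cuda:0')

cap = cv2.VideoCapture('demo/demo.mp4')
ok, prev = cap.read()
flows = []
while ok:
    ok, frame = cap.read()
    if not ok:
        break
    flows.append(inference_model(model, prev, frame))  # (H, W, 2) per pair
    prev = frame
cap.release()
print(f'estimated flow for {len(flows)} frame pairs')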
MMFlow Framework
Modular Design is one of MMFlow’s major features, and here is the whole framework of MMFlow:
MMFlow consists of 4 main parts: datasets, models, core, and apis.
- datasets is for dataset loading and data augmentation. In this part, we support various datasets for supervised optical flow algorithms, useful data augmentation transforms in pipelines, and samplers for data loading in samplers.
- models is the most vital part, containing the models of learning-based optical flow. We implement each model as a flow estimator and decompose it into two components: encoder and decoder. The loss functions for training flow models are in this module as well.
- apis provides high-level APIs for model training, testing, and inference.
- core provides evaluation tools and customized hooks for model training.
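To show how this modular split is meant to be extended, here is a hedged sketch of registering a custom decoder. The registry import and usage follow the general OpenMMLab convention; the actual registry names live in mmflow/models/builder.py, so treat everything below as an assumption.

import torch.nn as nn
from mmflow.models.builder import DECODERS  # assumed registry location

@DECODERS.register_module()
class TinyDecoder(nn.Module):
    """Toy decoder: predicts a 2-channel flow map from input features."""

    def __init__(self, in_channels: int = 128):
        super().__init__()
        self.predict = nn.Conv2d(in_channels, 2, kernel_size=3, padding=1)

    def forward(self, feat):
        return self.predict(feat)

A config would then select it by name, e.g. decoder=dict(type='TinyDecoder', in_channels=128).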
References
1. Michael Black. Optical Flow: The “Good Parts” Version. Machine Learning Summer School (MLSS), Tübingen, 2013.
2. Black, M. J. Robust Incremental Optical Flow. PhD thesis, Yale University, 1992.
3. Butler, D. J., Wulff, J., Stanley, G. B., et al. A Naturalistic Open Source Movie for Optical Flow Evaluation. In European Conference on Computer Vision (ECCV), pp. 611–625. Springer, 2012.