🕖9:00 AM–12:00 PM, Sunday, June 18th, 2023
📍Vancouver, Canada
CVPR, the premier annual computer vision event, is about to be held. OpenMMLab invites all CVers to join us, explore the unknown, and ignite the spark of ideas. At this year's CVPR, OpenMMLab will host a tutorial: Boosting Computer Vision Research with OpenMMLab and OpenDataLab.
Since their inception, OpenMMLab and OpenDataLab have been committed to promoting research and applications in computer vision. We not only provide a rich library of open-source toolboxes and open datasets, but also actively promote the development and innovation of computer vision technology.
This tutorial will introduce two open platforms that can significantly accelerate computer vision research: OpenMMLab and OpenDataLab. The session is hosted by young scientist Kai Chen, joined by young scientist Conghui He and young researchers Songyang Zhang, Yanhong Zeng, and Wenwei Zhang from Shanghai AI Laboratory.
Whether you are an experienced researcher or a newcomer passionate about computer vision, we sincerely invite you to join this tutorial.
Time
UTC-7 (Canada): 9:00 AM–12:00 PM (June 18th)
UTC+8 (China): 12:00–3:00 AM (June 19th)
Onsite: CVPR venue (registration required)
Virtual: Zoom online meeting (registration required)
Talk 1
OpenMMLab 2.0: A General, Unified and Flexible Open-source Algorithm Platform
In this part, the following topics will be covered:
1. An overall introduction to OpenMMLab, including its architecture, modules, and impact.
2. The basic usage of toolboxes and some example research projects.
3. Model deployment with OpenMMLab toolchains and practices of using OpenMMLab for research and production.
Kai Chen is currently a Research Scientist & PI at Shanghai AI Laboratory. He leads the OpenMMLab team, which has received more than 85,000 stars on GitHub. OpenMMLab targets developing state-of-the-art computer vision algorithms for research and industrial applications, as well as building influential open-source projects. He also leads an engineering and product team building open platforms for AI. Kai Chen received his Ph.D. from The Chinese University of Hong Kong in 2019, under the supervision of Prof. Dahua Lin and Prof. Chen Change Loy at MMLab. Before that, he received his B.Eng. degree from Tsinghua University in 2015. He has published more than 30 papers at top-tier conferences and in journals on computer vision.
Talk 2
Learning Fundamental Vision Models with OpenMMLab
Introduction to fundamental vision models and the related library (MMPreTrain)
- We will first introduce MMPreTrain, our deep learning foundation model library. Milestone works on foundation models will be presented, and we will then elaborate on the design of the codebase and highlight its key properties.
- Next, we will highlight the newly updated features related to multimodal learning. We will discuss the community's recent progress in multimodal learning, followed by practical examples of how to use MMPreTrain for various multimodal tasks.
- We will then show how to work with foundation models through image classification, illustrating how to use MMPreTrain for this classic task and introducing recent progress in vision backbones.
- Finally, we will share recent work on self-supervised learning. We will introduce the concept and pipeline of self-supervised learning, and then show how to get started with self-supervised learning using MMPreTrain.
Songyang Zhang is a Young Researcher at OpenMMLab, Shanghai AI Laboratory, where he leads a team working on foundation models, spanning both research and open-source platforms. His team develops and maintains the OpenMMLab projects MMClassification and MMSelfSup. He obtained his Ph.D. in Computer Science from the University of Chinese Academy of Sciences in 2022, in the joint program at PLUS Lab, ShanghaiTech University, supervised by Prof. Xuming He. He received his B.Sc. degree in 2017 and worked at MC² Lab, Beihang University, under the supervision of Prof. Mai Xu. He has also worked as a research intern at TuSimple, Tencent Youtu Lab, and Megvii Research. He has published 13 papers at top-tier conferences and in journals on computer vision. Selected works: Distribution Alignment: A Unified Framework for Long-tail Visual Recognition (CVPR 2021); SGTR: End-to-end Scene Graph Generation with Transformer (CVPR 2022); Dynamic Grained Encoder for Vision Transformers (NeurIPS 2021).
Talk 3
General Object Detection with MMDetection 3.0
This tutorial will introduce how to conduct research projects related to object detection efficiently with MMDetection 3.0. The contents will cover the following three parts:
- This part will introduce object detection, instance segmentation, and panoptic segmentation. We will go through problem formulation, challenges, and representative methods in these fields.
- In this part, we will introduce MMDetection, the OpenMMLab Detection Toolbox and Benchmark, which is also one of the most popular toolboxes in object detection and is the foundation of many toolboxes like MMDetection3D and MMTracking. We will go through the modular design and model zoo of MMDetection 3.0.
- A hands-on tutorial showing how to use MMDetection 3.0 to conduct model inference and training to facilitate research projects.
Wenwei Zhang is a final-year Ph.D. student at the School of Computer Science and Engineering, Nanyang Technological University, Singapore. He is a member of NTU MMLab, affiliated with the NTU S-Lab, supervised by Prof. Chen Change Loy. He also works closely with Jiangmiao Pang and Kai Chen, focusing on object recognition and scene understanding tasks. His main work lies in unified frameworks for X, including unified image, video, and point cloud segmentation, dense unsupervised learning, and robust multi-modality multi-object tracking. He has published eight papers at top conferences and won several international competitions. Selected works: Dense Siamese Network for Dense Unsupervised Learning (ECCV 2022); Video K-Net: A Simple, Strong, and Unified Baseline for Video Segmentation (CVPR 2022, Oral); K-Net: Towards Unified Image Segmentation (NeurIPS 2021).
Talk 4
Learning to Generate, Edit, and Enhance Images and Videos with MMagic
This tutorial introduces the state-of-the-art open-source toolbox MMagic to AI researchers and practitioners who want to work with advanced research achievements in image and video generation, editing, and enhancement. This talk includes:
- A brief introduction to the tasks of image and video generation, editing and enhancement. We introduce classical and popular tasks, challenges, and state-of-the-art approaches in this field.
- An introduction to MMagic, an open-source toolbox and benchmark that provides a unified and flexible solution to various tasks (e.g., image super-resolution, text-to-image, etc.) and models (e.g., GANs, diffusion models, etc.) in image and video generation/editing/enhancement. We introduce the overall design and implementation of MMagic.
- A hands-on tutorial showing how to quickly run pre-trained models and design new models for various tasks with MMagic.
Yanhong Zeng is a Researcher at Shanghai AI Laboratory. She received her Ph.D. from Sun Yat-sen University (SYSU) in 2022, under the joint Ph.D. program between Microsoft Research Asia (MSRA) and SYSU. Her research interests include image/video synthesis and editing, and multi-modal language and vision. She has published papers at top-tier conferences and in transactions such as CVPR, ECCV, NeurIPS, and TVCG. She has served as a reviewer for CVPR, NeurIPS, ICLR, AAAI, ICML, SIGGRAPH, TIP, TVCG, etc., and was awarded an ICML Outstanding Reviewer award.
Talk 5
Introduction to OpenDataLab: An Open Data Platform for Artificial Intelligence
The following topics will be covered in this talk:
1. An overall introduction to OpenDataLab, including the open dataset platform, open-source data processing toolkits, and a dataset description language (DSDL).
2. Research work on open-source data toolkits supported by OpenDataLab.
3. The exploration of defining a dataset description language, and how to use it to standardize datasets across tasks, formats, and modalities.
Conghui He received his Ph.D. from Tsinghua University and was a visiting Ph.D. student at Stanford University and Imperial College London. He was previously a senior researcher at Tencent WeChat. He is currently a Research Scientist & PI at Shanghai AI Laboratory and a Research Director at SenseTime. He leads the OpenDataLab team, which targets research on data in AI, as well as building influential open-data and open-source projects.
His main research interests are high-performance computing, reconfigurable computing, distributed computing, and computer vision. He has published a series of papers at academic conferences such as ICCV, TC, SC, FPGA, FCCM, FPL, and FPT, and has served as a reviewer for conferences such as FPT. He was the global champion of the IEEE/IBM Smarter Earth Challenge and won the Gordon Bell Prize, the highest award in the field of high-performance computing applications.
In addition, we would like to extend an invitation to the OpenMMLab Academic Club, hosted by Shanghai AI Laboratory's OpenMMLab team. The club will host a range of academic activities to facilitate scholarly connections and the exchange of ideas. Join us to share insights and network with peers in our mission to drive AI research forward.
Join us with the link: OpenMMLab Academic Club Invitation