There is a growing wave of research in few-shot learning (FSL), which adapts previously learned knowledge to learn new concepts from only a few training examples. This tutorial comprises several talks: an overview of few-shot learning by Dr. Da Li; a discussion of seminal and state-of-the-art meta-learning methods for FSL by Prof. Timothy Hospedales, covering both gradient-based and amortised meta-learners as well as some theory for meta-learning; an introduction by Dr. Yanwei Fu to recent FSL techniques that use statistical methods, such as exploiting the support of unlabeled instances for few-shot visual recognition and causal inference for few-shot learning; and a discussion by Dr. Yu-Xiong Wang of FSL applications in fields beyond computer vision, such as natural language processing, reinforcement learning, and robotics.
Deep learning models have excelled in many computer vision tasks such as image recognition. However, this exceptional performance relies heavily on the availability of sufficient labelled training data and does not hold for vision problems with limited training samples. For example, an object recognizer well trained on some known categories, such as cat and dog, can fail to learn to recognize objects of a novel category, such as mouse, given only one or a few training samples. This severely limits the scalability of a deployed model to open-ended learning in the real world, where limited training data is very common due to long-tailed data distributions and other factors. In contrast, learning from extremely limited (e.g., one or a few) examples is an important human ability. For example, kids have no problem recognizing a 'giraffe' after only glancing at a picture of a 'giraffe' beforehand. Motivated by the above observations, there has been a growing wave of research in few-shot learning (FSL), which aims to learn new concepts by adapting learned knowledge using a limited number of few-shot training (support) examples.
This tutorial will have three long talks and two short talks. We summarize the main contents of each talk below.
This talk is given by Dr. Da Li. It will give an overview of few-shot learning, including but not limited to its motivation, challenges, related generalized tasks (such as domain adaptation, long-tailed recognition, zero-shot learning, and open-set recognition), and general methodologies for solving FSL. In particular, we will discuss few-shot learning both using conventional methods and from the perspective of building on large pre-trained foundation models. We will also provide an extensive summary of benchmarks and evaluation protocols for few-shot learning.
Meta-learning aims to transfer knowledge from the experience of learning a set of known tasks in order to improve the sample efficiency of learning new tasks from the same distribution. This learning-to-learn is achieved by optimising the learning algorithm with respect to the sample efficiency on seen tasks. Since its modern renaissance, with roots in meta-learned initialisation (MAML) and metric learners (ProtoNet), the field has blossomed with a plethora of approaches, including curvature learning, non-linear metric learning, generalizable feature embeddings, feature representations generated conditionally on the few-shot support set, and generalizable feature alignment modules.
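To give a concrete feel for the metric-learning flavour, a prototypical-network-style classifier reduces to computing one prototype (mean embedding) per class from the support set and assigning each query to its nearest prototype. The following is a minimal NumPy sketch; the 2-D embeddings are hypothetical stand-ins for the output of a learned embedding network:

```python
import numpy as np

def prototypes(support_feats, support_labels, n_classes):
    """Mean embedding per class, computed from the few-shot support set."""
    return np.stack([support_feats[support_labels == c].mean(axis=0)
                     for c in range(n_classes)])

def classify(query_feats, protos):
    """Assign each query to the nearest prototype (squared Euclidean distance)."""
    d = ((query_feats[:, None, :] - protos[None, :, :]) ** 2).sum(-1)
    return d.argmin(axis=1)

# Toy 2-way 2-shot episode with hypothetical 2-D embeddings.
support = np.array([[0., 0.], [0.2, 0.], [5., 5.], [5., 4.8]])
labels = np.array([0, 0, 1, 1])
protos = prototypes(support, labels, n_classes=2)
queries = np.array([[0.1, 0.1], [4.9, 5.1]])
print(classify(queries, protos))  # → [0 1]
```

In the full method the embedding network is trained episodically so that this nearest-prototype rule generalizes to novel classes.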
In this tutorial, we will review various seminal and state-of-the-art meta-learning methods for FSL and discuss their pros and cons. We will cover both gradient-based and amortised meta-learners. We will introduce some theory for meta-learning, in terms of generalisation bounds, and discuss what insights these hold for algorithm design. Finally, we will discuss current challenges, including generalisation beyond the distribution of the training tasks, and applications to tasks beyond image recognition such as segmentation, pose estimation, and reinforcement learning.
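As a sketch of the gradient-based family, the toy loop below implements a first-order approximation of MAML-style meta-training on synthetic linear-regression tasks. The task distribution, learning rates, and model are illustrative assumptions, not any speaker's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, beta = 0.1, 0.01        # inner / outer learning rates (assumed values)
w = np.zeros(1)                # meta-learned initialization for y_hat = w * x

def grad(w, x, y):
    """Gradient of mean-squared error for the linear model y_hat = w * x."""
    return 2 * np.mean((w * x - y) * x)

for step in range(1000):
    a = rng.uniform(-2, 2)                        # sample a task: y = a * x
    x_s = rng.uniform(-1, 1, size=10)             # support set
    w_task = w - alpha * grad(w, x_s, a * x_s)    # inner adaptation step
    x_q = rng.uniform(-1, 1, size=10)             # query set of the same task
    # First-order MAML: apply the query-set gradient, evaluated at the
    # adapted weights, directly to the meta-parameters (no second-order term).
    w = w - beta * grad(w_task, x_q, a * x_q)
```

Full MAML differentiates through the inner step, which adds a second-order term that this first-order variant drops for efficiency.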
This talk is delivered by Prof. Timothy Hospedales.
Deep learning based models require an avalanche of expensive human-labeled training data and many iterations to train their large number of parameters. This severely limits their scalability to real-world, long-tail distributed categories, some of which have a large number of instances but only a few manual annotations. In contrast to prior art that leverages meta-learning or data augmentation strategies to alleviate this extreme data scarcity, this talk will introduce recent few-shot learning methods based on statistical techniques such as ICI and MFL. Additionally, we will introduce methods that exploit the support of unlabeled instances for few-shot visual recognition. Theoretically, under some mild conditions, we show that these methods are guaranteed to improve few-shot learning performance by using unlabeled instances.
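As a rough illustration of how unlabeled instances can support few-shot recognition, the sketch below performs generic confidence-based self-training on top of a nearest-prototype classifier: it repeatedly pseudo-labels the unlabeled instance with the largest distance margin and folds it into the support set. This is an illustrative stand-in only, not the actual ICI or MFL procedure:

```python
import numpy as np

def fit_prototypes(feats, labels, n_classes):
    """Mean embedding per class."""
    return np.stack([feats[labels == c].mean(0) for c in range(n_classes)])

def self_train(support, s_labels, unlabeled, n_classes, rounds=3):
    """Iteratively pseudo-label the most credible unlabeled instance and
    add it to the support set (generic self-training, not actual ICI)."""
    support, s_labels = support.copy(), s_labels.copy()
    pool = list(range(len(unlabeled)))
    for _ in range(min(rounds, len(pool))):
        protos = fit_prototypes(support, s_labels, n_classes)
        # Squared Euclidean distance of each pooled instance to each prototype.
        d = ((unlabeled[pool][:, None, :] - protos[None]) ** 2).sum(-1)
        srt = np.sort(d, axis=1)
        margin = srt[:, 1] - srt[:, 0]      # distance margin as credibility
        i = int(margin.argmax())            # most credible pseudo-label
        support = np.vstack([support, unlabeled[pool[i]]])
        s_labels = np.append(s_labels, d[i].argmin())
        pool.pop(i)
    return fit_prototypes(support, s_labels, n_classes)

# Toy 2-way 1-shot episode augmented with three unlabeled points.
support = np.array([[0., 0.], [5., 5.]])
s_labels = np.array([0, 1])
unlabeled = np.array([[0.2, 0.], [4.8, 5.], [0., 0.3]])
protos = self_train(support, s_labels, unlabeled, n_classes=2)
```

The refined prototypes average over both labeled and confidently pseudo-labeled points, which is the intuition behind the theoretical guarantee mentioned above.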
We will also discuss causal inference for few-shot learning. We are particularly interested in understanding causal inference for few-shot learning, as this is potentially one of the most promising research directions for FSL. Specifically, despite extensive prior efforts on few-shot learning tasks, we emphasize that most existing methods do not take into account the distributional shift caused by sample selection bias in the FSL scenario. Such selection bias can induce spurious correlations between the semantic causal features, which are causally and semantically related to the class label, and other non-causal features. Critically, the former should be invariant across changes in distribution, highly related to the classes of interest, and thus generalize well to novel classes, while the latter are not stable under distributional changes. It is therefore interesting and inspiring to utilize causal inference for few-shot learning.
This long talk is given by Dr. Yanwei Fu.
While few-shot learning was initiated in the context of object recognition, the lack of data is a fundamental challenge, pervasive and entrenched, in nearly every field of computer vision and machine learning. In recent years, various few-shot learning methods have been proposed to address a much richer space of few-shot learning tasks, ranging from discriminative recognition tasks (e.g., object and scene classification and detection, fine-grained recognition, video action recognition, domain adaptation, and image retrieval), to generative tasks (e.g., image synthesis and human motion prediction), to cross-modality tasks (e.g., with LiDAR and RGB data), to motor control and robotics tasks (e.g., navigation), and to natural language processing and vision-language tasks (e.g., machine translation, program induction, and visual question answering). In addition, going beyond artificial few-shot settings, recent work has started to investigate a variety of more realistic, large-scale scenarios, including cross-domain, long-tail, open-world, and continual settings.
Taking the few-shot generative task as a concrete example: an amortized probabilistic meta-learner has been exploited to generate multiple views of an object from just a single image; a generative query network has been proposed to render scenes from novel views; talking heads have been synthesized from little data by learning to initialize an adversarial model for rapid adaptation; and, in the video domain, a weight generator has been meta-learned to synthesize videos given a few reference images.
This long talk is given by Dr. Yu-Xiong Wang.
Due to visa issues, several speakers are unable to attend CVPR for an in-person talk.
| Sessions | Title | Slides | Speakers | Mode |
| --- | --- | --- | --- | --- |
| 9:00 - 9:05 | Opening | | Yanwei Fu | Virtual |
| 9:05 - 9:35 | Introduction of Few-shot Learning | slides | Da Li | Virtual |
| 9:35 - 10:35 | Meta-Learning for Few-shot Learning | slides | Timothy Hospedales | In-person |
| 10:35 - 10:50 | Break | | | |
| 10:50 - 11:40 | Few-shot Learning by Statistical Methods | slides | Yanwei Fu and Yikai Wang | In-person |
| 11:40 - 12:30 | Few-Shot Learning in the Wild | | Yu-Xiong Wang | Virtual |
Contact the Organizing Committee: yanweifu@fudan.edu.cn, yi-kai.wang@outlook.com