This year’s virtual ICML conference hosted more than 10,800 attendees from 75 countries. The virtual format evidently makes big research conferences such as ICML more accessible to the AI community around the world.
Of the almost 5,000 research papers submitted to ICML 2020, 1,088 were presented at the conference, for an acceptance rate of 21.8%. As usual, the Outstanding Paper awards recognized exemplary papers at this year’s ICML. To help you stay aware of the most prominent AI research breakthroughs, we’ve summarized the key ideas of these papers.
Subscribe to our AI Research mailing list at the bottom of this article to be alerted when we release new summaries.
If you’d like to skip around, here are the papers we featured:
- On Learning Sets of Symmetric Elements
- Tuning-free Plug-and-Play Proximal Algorithm for Inverse Imaging Problems
- Generative Pretraining from Pixels
- Efficiently Sampling Functions from Gaussian Process Posteriors
ICML 2020 Best Paper Awards
1. On Learning Sets of Symmetric Elements, by Haggai Maron, Or Litany, Gal Chechik, Ethan Fetaya
Original Abstract
Learning from unordered sets is a fundamental learning setup, recently attracting increasing attention. Research in this area has focused on the case where elements of the set are represented by feature vectors, and far less emphasis has been given to the common case where set elements themselves adhere to their own symmetries. That case is relevant to numerous applications, from deblurring image bursts to multi-view 3D shape recognition and reconstruction.
In this paper, we present a principled approach to learning sets of general symmetric elements. We first characterize the space of linear layers that are equivariant both to element reordering and to the inherent symmetries of elements, like translation in the case of images. We further show that networks that are composed of these layers, called Deep Sets for Symmetric Elements layers (DSS), are universal approximators of both invariant and equivariant functions. DSS layers are also straightforward to implement. Finally, we show that they improve over existing set-learning architectures in a series of experiments with images, graphs, and point-clouds.
Our Summary
The research paper focuses on learning with sets whose elements exhibit certain symmetries, a setting that arises when learning with sets of images, sets of point clouds, or sets of graphs. The research team from NVIDIA Research, Stanford University, and Bar-Ilan University introduces a principled approach to learning such sets. They first characterize the space of linear layers that are equivariant both to element reordering and to the inherent symmetries of the elements, and then show that networks composed of these layers are universal approximators of both invariant and equivariant functions. The experiments demonstrate that the proposed approach achieves significant improvements over previous approaches.
What’s the core idea of this paper?
- The research paper introduces a new principled approach to learning from unordered sets by utilizing the symmetries that the set elements exhibit:
- The authors describe the symmetry group of such sets and characterize the space of linear layers that are equivariant to this group. These layers are called Deep Sets for Symmetric Elements (DSS) layers; a minimal sketch of one such layer appears after this list.
- Then, the researchers prove that if invariant networks for the elements of interest are universal, the corresponding invariant DSS networks on sets of such elements are also universal. The same result holds for equivariant networks and equivariant DSS networks.
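To make the construction concrete, here is a minimal PyTorch sketch of a DSS-style layer for sets of images, where the element symmetry is translation and is handled by convolutions. The class name, kernel size, and the exact aggregation (summing over the other set elements) are illustrative assumptions, not the authors’ reference implementation.

```python
# Minimal sketch of a DSS-style equivariant layer for sets of images (assumed PyTorch).
import torch
import torch.nn as nn

class DSSConvLayer(nn.Module):
    """Linear layer equivariant to both set permutations and image translations.

    Each set element is processed by a 'Siamese' convolution, and information is
    shared across the set through a second convolution applied to the sum over
    the other elements.
    """
    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        self.conv_element = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)
        self.conv_aggregate = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, set_size, channels, height, width)
        b, n, c, h, w = x.shape
        per_element = self.conv_element(x.view(b * n, c, h, w)).view(b, n, -1, h, w)
        # Summing over the other set elements keeps the layer permutation-equivariant.
        set_sum = x.sum(dim=1, keepdim=True) - x
        aggregated = self.conv_aggregate(set_sum.view(b * n, c, h, w)).view(b, n, -1, h, w)
        return per_element + aggregated

# Example: a batch of two sets, each containing four 32x32 RGB images.
layer = DSSConvLayer(3, 16)
out = layer(torch.randn(2, 4, 3, 32, 32))
print(out.shape)  # torch.Size([2, 4, 16, 32, 32])
```

Stacking such layers (with nonlinearities in between) yields the DSS networks whose universality the paper proves.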
What’s the key achievement?
- The experimental results show that DSS layers outperform previous approaches in a series of tasks, including classification, frame selection in images and shapes, highest quality image selection, color-channel matching, and burst image deblurring.
What does the AI community think?
- The paper received the Outstanding Paper Award at ICML 2020.
2. Tuning-free Plug-and-Play Proximal Algorithm for Inverse Imaging Problems, by Kaixuan Wei, Angelica Aviles-Rivero, Jingwei Liang, Ying Fu, Carola-Bibiane Schönlieb, Hua Huang
Original Abstract
Plug-and-play (PnP) is a non-convex framework that combines ADMM or other proximal algorithms with advanced denoiser priors. Recently, PnP has achieved great empirical success, especially with the integration of deep learning-based denoisers. However, a key problem of PnP based approaches is that they require manual parameter tweaking. It is necessary to obtain high-quality results across the high discrepancy in terms of imaging conditions and varying scene content. In this work, we present a tuning-free PnP proximal algorithm, which can automatically determine the internal parameters including the penalty parameter, the denoising strength and the terminal time. A key part of our approach is to develop a policy network for automatic search of parameters, which can be effectively learned via mixed model-free and model-based deep reinforcement learning. We demonstrate, through numerical and visual experiments, that the learned policy can customize different parameters for different states, and often more efficient and effective than existing handcrafted criteria. Moreover, we discuss the practical considerations of the plugged denoisers, which together with our learned policy yield state-of-the-art results. This is prevalent on both linear and nonlinear exemplary inverse imaging problems, and in particular, we show promising results on Compressed Sensing MRI and phase retrieval.
Our Summary
A key issue with plug-and-play (PnP) approaches is the need to manually tweak parameters. The PnP algorithm introduced in this paper is tuning-free and can automatically determine internal parameters, including the penalty parameter, the denoising strength, and the terminal time. The parameters are optimized with a reinforcement learning (RL) algorithm, where a high reward is given if the policy leads to faster convergence and better restoration accuracy. The extensive numerical and visual experiments demonstrate the effectiveness of the suggested approach on compressed sensing MRI and phase retrieval problems.
What’s the core idea of this paper?
- PnP algorithms offer promising image recovery results. However, their performance is very sensitive to the internal parameter selection (i.e., the penalty parameter, the denoising strength, and the terminal time). The common approach is manual parameter tweaking for each specific problem setting, which is very cumbersome and time-consuming.
- To address this problem, the researchers introduce an RL-based method with a policy network that customizes well-suited parameters for different images (see the sketch after this list):
- an automated parameter selection problem is formulated as a Markov decision process;
- a policy agent gets higher rewards for faster convergence and better restoration accuracy;
- the discrete terminal time and the continuous denoising strength and penalty parameters are optimized jointly.
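To illustrate how a learned policy replaces manual tuning, here is a minimal Python sketch of a PnP-style loop in which the policy chooses the per-iteration parameters and the stopping point. The function names and the simple quadratic data-fidelity term (`toy_policy`, `toy_prox`, `toy_denoiser`) are hypothetical placeholders, not the paper’s implementation.

```python
# Minimal sketch of a tuning-free PnP loop (assumed PyTorch); placeholders only.
import torch

def tuning_free_pnp(y, policy_net, denoiser, data_prox, x_init, max_iters=30):
    """PnP iterations in which a learned policy picks the internal parameters
    (denoising strength, penalty) and decides when to terminate."""
    x = x_init
    for t in range(max_iters):
        state = torch.cat([x.flatten(), y.flatten(), torch.tensor([float(t)])])
        sigma, mu, stop = policy_net(state)   # per-iteration parameters + terminate flag
        if stop:                              # discrete terminal-time decision
            break
        z = data_prox(x, y, mu)               # proximal step on the data-fidelity term
        x = denoiser(z, sigma)                # plugged-in denoiser acts as the image prior
    return x

# Toy stand-ins so the sketch runs end to end (replace with real networks).
def toy_policy(state):
    return 0.1, 1.0, False

def toy_prox(x, y, mu):
    return (x + mu * y) / (1.0 + mu)          # prox of a simple quadratic fidelity term

def toy_denoiser(z, sigma):
    return z                                   # identity "denoiser" placeholder

y = torch.randn(8, 8)
x_hat = tuning_free_pnp(y, toy_policy, toy_denoiser, toy_prox, x_init=torch.zeros_like(y))
```

During training, the policy receives a higher reward when its choices lead to faster convergence and better restoration accuracy, which is what removes the need for manual tweaking at test time.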
What’s the key achievement?
- An extensive range of numerical and visual experiments demonstrate that the introduced tuning-free PnP algorithm:
- outperforms state-of-the-art techniques by a large margin on a linear inverse imaging problem, compressed sensing MRI (especially in difficult settings);
- demonstrates state-of-the-art performance on the non-linear inverse imaging problem, namely phase retrieval, where it produces cleaner and clearer results than competing techniques;
- often reaches a level of performance comparable to the “oracle” parameters tuned via the inaccessible ground truth.
What does the AI community think?
- The paper received the Outstanding Paper Award at ICML 2020.
What are possible business applications?
- The introduced tuning-free PnP proximal algorithm can be applied to different inverse imaging problems, including magnetic resonance imaging (MRI), computed tomography (CT), microscopy, and inverse scattering.
Where can you get implementation code?
- The implementation of this research paper will be released on GitHub.
3. Generative Pretraining from Pixels, by Mark Chen, Alec Radford, Rewon Child, Jeff Wu, Heewoo Jun, Prafulla Dhariwal, David Luan, Ilya Sutskever
Original Abstract
Inspired by progress in unsupervised representation learning for natural language, we examine whether similar models can learn useful representations for images. We train a sequence Transformer to auto-regressively predict pixels, without incorporating knowledge of the 2D input structure. Despite training on low-resolution ImageNet without labels, we find that a GPT-2 scale model learns strong image representations as measured by linear probing, fine-tuning, and low-data classification. On CIFAR-10, we achieve 96.3% accuracy with a linear probe, outperforming a supervised Wide ResNet, and 99.0% accuracy with full finetuning, matching the top supervised pre-trained models. An even larger model trained on a mixture of ImageNet and web images is competitive with self-supervised benchmarks on ImageNet, achieving 72.0% top-1 accuracy on a linear probe of our features.
Our Summary
Generative pre-training methods have had a substantial impact on natural language processing over the last few years. The OpenAI research team re-evaluates these techniques on images and demonstrates that generative pre-training is competitive with other self-supervised approaches. The introduced approach consists of a pre-training stage, in which both autoregressive and BERT objectives are explored, and a fine-tuning stage. The authors apply a sequence Transformer architecture to predict pixels instead of language tokens. The experiments demonstrate that generative image modeling learns state-of-the-art representations for low-resolution datasets and achieves results comparable to other self-supervised methods on ImageNet.
What’s the core idea of this paper?
- The authors claim that generative pre-training methods for images can be competitive with other self-supervised approaches when using a flexible architecture such as Transformer, an efficient likelihood-based objective, and significant computational resources (2048 TPU cores).
- They introduce Image GPT, or iGPT, which is based on GPT-2 but uses the sequence Transformer to predict pixels instead of language tokens (a toy sketch of this pipeline follows the list):
- First, raw images are resized to low resolution and reshaped into a 1D sequence.
- Second, autoregressive next pixel prediction or masked pixel prediction (BERT) is chosen as the pre-training objective.
- Finally, the representations learned with these objectives are evaluated through linear probes or fine-tuning.
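Here is a toy PyTorch sketch of the pipeline. The actual iGPT reduces colors to a 9-bit palette and trains a GPT-2 scale Transformer; the snippet below uses a naive grayscale quantization and a tiny Transformer purely for illustration, and all names and sizes are assumptions rather than the authors’ code.

```python
# Toy sketch of generative pre-training from pixels (assumed PyTorch).
import torch
import torch.nn as nn

RES, LEVELS = 32, 16                          # low resolution and toy palette size

def image_to_sequence(img):
    """img: (3, H, W) float in [0, 1] -> 1D token sequence of length RES*RES."""
    small = nn.functional.interpolate(img[None], size=(RES, RES), mode="bilinear")[0]
    gray = small.mean(dim=0)                      # toy: grayscale instead of a color palette
    tokens = (gray * (LEVELS - 1)).round().long() # quantize to LEVELS discrete values
    return tokens.flatten()                       # raster-order 1D sequence

class TinyPixelGPT(nn.Module):
    def __init__(self, vocab=LEVELS, d_model=128, n_layers=2, n_heads=4, seq_len=RES * RES):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)
        self.pos = nn.Parameter(torch.zeros(seq_len, d_model))
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, vocab)

    def forward(self, tokens):
        # Causal mask enforces autoregressive next-pixel prediction.
        L = tokens.size(1)
        mask = torch.triu(torch.full((L, L), float("-inf")), diagonal=1)
        h = self.blocks(self.embed(tokens) + self.pos[:L], mask=mask)
        return self.head(h)

img = torch.rand(3, 64, 64)
seq = image_to_sequence(img)[None]                # (1, 1024)
logits = TinyPixelGPT()(seq[:, :-1])              # predict pixel t+1 from pixels <= t
loss = nn.functional.cross_entropy(logits.reshape(-1, LEVELS), seq[:, 1:].reshape(-1))
```

Swapping the causal mask for random masking of input pixels gives the BERT-style variant of the objective; the learned representations are then assessed with linear probes or full fine-tuning.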
What’s the key achievement?
- The experiments demonstrate that iGPT:
- outperforms a supervised WideResNet on CIFAR-10, CIFAR-100, and STL-10 datasets;
- achieves 72.0% top-1 accuracy on ImageNet with a linear probe, which is competitive with recent contrastive learning approaches that require fewer parameters but work with higher-resolution inputs and exploit knowledge of the 2D input structure;
- after full fine-tuning, achieves 99.0% accuracy on CIFAR-10, matching GPipe, the top model pre-trained using ImageNet labels.
What does the AI community think?
- The paper received an Honorable Mention at ICML 2020.
What are future research areas?
- Exploring more efficient self-attention approaches.
- Revisiting the representation learning capabilities of other families of generative models (e.g., flows, VAEs).
Where can you get implementation code?
- TensorFlow implementation of iGPT by the OpenAI team is available here.
- PyTorch implementation of the model is available here.
4. Efficiently Sampling Functions from Gaussian Process Posteriors, by James T. Wilson, Viacheslav Borovitskiy, Alexander Terenin, Peter Mostowsky, Marc Peter Deisenroth
Original Abstract
Gaussian processes are the gold standard for many real-world modeling problems, especially in cases where a model’s success hinges upon its ability to faithfully represent predictive uncertainty. These problems typically exist as parts of larger frameworks, wherein quantities of interest are ultimately defined by integrating over posterior distributions. These quantities are frequently intractable, motivating the use of Monte Carlo methods. Despite substantial progress in scaling up Gaussian processes to large training sets, methods for accurately generating draws from their posterior distributions still scale cubically in the number of test locations. We identify a decomposition of Gaussian processes that naturally lends itself to scalable sampling by separating out the prior from the data. Building off of this factorization, we propose an easy-to-use and general-purpose approach for fast posterior sampling, which seamlessly pairs with sparse approximations to afford scalability both during training and at test time. In a series of experiments designed to test competing sampling schemes’ statistical properties and practical ramifications, we demonstrate how decoupled sample paths accurately represent Gaussian process posteriors at a fraction of the usual cost.
Our Summary
In this paper, the authors explore techniques for efficiently sampling from Gaussian process (GP) posteriors. After investigating the behavior of naive sampling approaches and fast approximation strategies based on Fourier features, they find that these strategies are largely complementary. They therefore introduce an approach that combines the strengths of both: first, they decompose the posterior as the sum of a prior and an update; then, drawing on techniques from the literature on approximate GPs, they obtain an easy-to-use, general-purpose approach for fast posterior sampling. The experiments demonstrate that decoupled sample paths accurately represent GP posteriors at a much lower cost.
What’s the core idea of this paper?
- The introduced approach to sampling functions from GP posteriors centers on the observation that it is possible to implicitly condition Gaussian random variables by combining them with an explicit corrective term.
- The authors translate this intuition to Gaussian processes and suggest decomposing the posterior as the sum of a prior and an update (see the sketch after this list).
- Building on this factorization, the researchers suggest an efficient approach for fast posterior sampling that seamlessly pairs with sparse approximations to achieve scalability both during training and at test time.
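As an illustration of the prior-plus-update decomposition (Matheron’s rule), here is a NumPy sketch for a 1D GP with an RBF kernel, where the prior draw is approximated with random Fourier features. The hyperparameters and variable names are illustrative assumptions, not the authors’ released code.

```python
# Minimal sketch of decoupled GP posterior sampling via Matheron's rule (NumPy).
import numpy as np

rng = np.random.default_rng(0)
lengthscale, noise_var, n_features = 0.5, 1e-2, 1000

def rbf_kernel(a, b):
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / lengthscale**2)

# Training data and test locations (1D inputs for simplicity).
X = rng.uniform(-3, 3, size=20)
y = np.sin(X) + np.sqrt(noise_var) * rng.standard_normal(20)
X_star = np.linspace(-3, 3, 200)

# 1) Draw an approximate prior function with random Fourier features:
#    f(x) ~ sqrt(2/M) * sum_i w_i * cos(omega_i * x + b_i),  w_i ~ N(0, 1).
omega = rng.standard_normal(n_features) / lengthscale
b = rng.uniform(0, 2 * np.pi, n_features)
w = rng.standard_normal(n_features)

def prior(x):
    return np.sqrt(2.0 / n_features) * np.cos(np.outer(x, omega) + b) @ w

# 2) Matheron update: correct the prior draw using the observed data.
eps = np.sqrt(noise_var) * rng.standard_normal(len(X))
K = rbf_kernel(X, X) + noise_var * np.eye(len(X))
v = np.linalg.solve(K, y - prior(X) - eps)
posterior_sample = prior(X_star) + rbf_kernel(X_star, X) @ v
```

Because each additional sample only needs a fresh prior draw and one small linear solve at the training points, the cost no longer grows cubically in the number of test locations, which is the source of the speedups reported in the paper.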
What’s the key achievement?
- Introducing an easy-to-use and general-purpose approach to sampling from GP posteriors.
- Demonstrating, with a series of experiments, that decoupled sample paths:
- avoid many shortcomings of the alternative sampling strategies;
- accurately represent GP posteriors at a much lower cost; for example, simulation of a well-known model of a biological neuron required only 20 seconds using decoupled sampling, while the iterative approach required 10 hours.
What does the AI community think?
- The paper received an Honorable Mention at ICML 2020.
Where can you get implementation code?
- The authors released the implementation of this paper on GitHub.
If you like these research summaries, you might be also interested in the following articles:
- Best Research Papers From ACL 2020
- The Highest-Trending Research Papers From CVPR 2020
- Top 2019 ML Research Papers You Should Know About
- 10 Cutting-Edge Research Papers In Computer Vision From 2019
- What Are Major NLP Achievements & Papers From 2019?
Enjoy this article? Sign up for more AI research updates.
We’ll let you know when we release more summary articles like this one.