More and more industries and organizations are leveraging artificial intelligence to delight customers and outpace the competition. However, developing and deploying deep learning models is time-consuming and costly – often prohibitively so. That’s where automated machine learning (AutoML) comes into play. AutoML solutions can significantly increase the efficiency of ML model development. Even more importantly, they lower the barriers to leveraging AI in business settings by allowing people without IT backgrounds to use the most advanced ML algorithms. Thanks to AutoML, even small startups without ML engineers on the team can take advantage of machine learning.
But what is the current state of AutoML solutions? He, Zhao, and Chu from Hong Kong Baptist University provide a comprehensive survey of the state of the art in AutoML. In this article, we want to summarize the key ideas of this research paper in case you don’t have time to deep-dive into the topic. Following the structure of the original paper, we’ll touch upon the available AutoML techniques, summarize existing approaches to Neural Architecture Search (NAS), present the results of comparing NAS algorithms with human-designed models, and finally list some ideas for future research.
If these accessible AI research analyses are useful for you, subscribe to receive our regular industry updates below.
AutoML techniques for the Preparation Stage
- Data preparation is the first step in the machine learning pipeline, so a powerful AutoML system should be able to handle it automatically.
- Data collection. To assist data scientists at this step, AutoML may include components for data augmentation and data searching, including tools for filtering out unrelated data, re-ranking relevant results, self-labeling, and solving the problem of unbalanced data.
- Data cleaning. At this important stage of the machine learning process, robust AutoML systems assist by applying such pre-processing techniques as standardization, scaling, one-hot encoding of categorical features, filling in missing values, etc.
- Feature engineering is essential for maximizing the value of raw data for ML modeling. This step may be divided into three sub-topics: feature selection, feature construction, and feature extraction.
- Feature selection reduces feature redundancy by keeping only the most relevant features. AutoML can automate feature selection using complete, heuristic, or random search algorithms; the resulting feature subset is then evaluated with a filter, wrapper, or embedded method.
- Feature construction is used to expand original feature spaces by constructing new features from the basic feature space. The types of transformation operations used by AutoML systems will depend on the feature type and may include conjunctions, disjunctions, negation, min, max, addition, subtraction, mean, etc.
- Feature extraction reduces the dimensionality of the feature space through some functional mapping. In contrast to feature selection, feature extraction alters the original features. Possible approaches include principal component analysis (PCA), independent component analysis (ICA), isomap, nonlinear dimensionality reduction, linear discriminant analysis (LDA), etc.
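The preparation steps described above – cleaning, feature selection, and feature extraction – can be sketched as a single scikit-learn pipeline. This is a minimal illustration, not a tool from the survey; the column names and toy data are our own assumptions.

```python
# Minimal scikit-learn sketch of the data preparation stage:
# cleaning (imputation, scaling, one-hot encoding), filter-based
# feature selection, and PCA-based feature extraction.
# All column names and values below are illustrative.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy dataset with missing numeric values and one categorical column.
X = pd.DataFrame({
    "age":    [25.0, 32.0, None, 41.0, 29.0, 55.0],
    "income": [40e3, 60e3, 52e3, None, 48e3, 90e3],
    "city":   ["NY", "SF", "NY", "SF", "SF", "NY"],
})
y = [0, 1, 0, 1, 0, 1]

# Data cleaning: impute and scale numeric columns, one-hot encode categories.
clean = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="mean")),
                      ("scale", StandardScaler())]), ["age", "income"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["city"]),
])

prep = Pipeline([
    ("clean", clean),                         # data cleaning
    ("select", SelectKBest(f_classif, k=3)),  # filter-based feature selection
    ("extract", PCA(n_components=2)),         # feature extraction
])

X_prepared = prep.fit_transform(X, y)
print(X_prepared.shape)  # 6 samples mapped to 2 extracted features
```

An AutoML system would search over choices like these (which imputer, how many components, which selection method) instead of hard-coding them as we do here.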
Model Development via Neural Architecture Search
- Model structures. The system generates a model by selecting and combining a set of primitive operations (e.g., convolution, pooling, concatenation, element-wise addition, skip connections). The NAS literature introduces different approaches to model generation, including entire-structure, cell-based, hierarchical, and network morphism-based structures.
- Hyperparameter optimization (HPO). After defining the representation of the network structure, the system proceeds with finding the best-performing architecture through optimization of the model’s hyperparameters. Commonly used HPO algorithms include grid and random search, reinforcement learning, evolutionary algorithms, Bayesian optimization, and gradient descent.
- Model estimation. Finally, we need to evaluate the performance of the network. The basic idea is to train the network to convergence, and then evaluate its performance based on the final results. However, this approach requires too much time and computing power, and several algorithms have been proposed to solve this problem. Possible approaches include:
- Low fidelity – accelerating model evaluation by reducing dataset or model size (e.g., using a subset of the training dataset, using images with lower resolution, training with fewer filters per layer, or reducing training time).
- Transfer learning – accelerating the process of NAS by leveraging the knowledge from prior tasks, or sharing parameters among child networks, or inheriting the weights from previous architectures.
- Surrogate – training a cheap surrogate model that approximates true performance, then searching for the best configurations on that approximation.
- Early stopping – stopping the evaluations which are predicted to perform poorly on the validation set.
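To make the HPO and model estimation ideas above concrete, here is a toy random-search loop that combines two of the speed-up tricks: low-fidelity evaluation (few training epochs as a cheap proxy) and early stopping of candidates whose proxy score looks hopeless. The objective function is a made-up stand-in for real model training, and the hyperparameter names are purely illustrative.

```python
# Toy sketch: random-search HPO with low-fidelity proxy evaluation
# and early stopping. `toy_objective` simulates validation accuracy;
# it is NOT a real model and exists only to make the loop runnable.
import random

random.seed(0)

def toy_objective(lr, width, epochs):
    """Stand-in for validation accuracy after `epochs` of training."""
    # Peaks near lr=0.1 and width=64; more epochs get closer to the peak.
    base = 1.0 - abs(lr - 0.1) * 2 - abs(width - 64) / 256
    return base * (1 - 0.5 / (epochs + 1))

def random_search(n_trials=20, low_fidelity_epochs=2, full_epochs=20):
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Random search: sample a hyperparameter configuration.
        config = {"lr": 10 ** random.uniform(-3, 0),
                  "width": random.choice([16, 32, 64, 128])}
        # Low fidelity: cheap proxy evaluation with very few epochs.
        proxy = toy_objective(**config, epochs=low_fidelity_epochs)
        # Early stopping: skip full training if the proxy trails the best.
        if proxy < best_score - 0.1:
            continue
        score = toy_objective(**config, epochs=full_epochs)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

config, score = random_search()
print(config, round(score, 3))
```

Real NAS systems apply the same pattern to architectures rather than two scalar hyperparameters, and replace random sampling with reinforcement learning, evolutionary search, Bayesian optimization, or gradient-based methods.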
NAS vs. Humans
In their survey of the state of the art in AutoML, the authors also examined how automatically generated models perform relative to manually designed ones. Specifically, they evaluated several popular types of NAS algorithms, including random search, reinforcement learning, evolutionary algorithms, and gradient descent-based algorithms. The findings demonstrate that:
- In image classification, NAS algorithms outperform human-designed models. In particular, the top three positions on the CIFAR-10 leaderboard are taken by automatically generated models (GPIPE, sharpDARTS, and Proxyless-G).
- In language modeling, there is still a big gap between automatically generated models and models designed by experts, with the top four models on the PTB leaderboard designed manually (GPT-2, FRAGE, AWD-LSTM-DOC, and Transformer-XL).
Future Research
The researchers identified the following avenues for future research:
- Completing the AutoML pipeline with data collection procedures and automated feature engineering.
- Exploring ways to improve the interpretability of AutoML output.
- Fostering reproducibility for all processes in the AutoML pipeline.
- Increasing flexibility of the encoding scheme to enable automated generation of novel network architectures.
- Extending AutoML to new areas and tasks.
- Incorporating “lifelong learning” into AutoML solutions.
Enjoy this article? Sign up for more AI research updates.
We’ll let you know when we release more summary articles like this one.