Automated Artificial Intelligence (AutoAI) is a variation of automated machine learning (AutoML) technology that extends the automation of model building toward automation of the full life cycle of a machine learning model. It applies intelligent automation to the task of building predictive machine learning models by preparing data for training, identifying the best type of model for the given data, then choosing the features, or columns of data, that best support the problem the model is solving. Finally, automation tests a variety of tuning options to reach the best result as it generates, then ranks, model-candidate pipelines. The best-performing pipelines can be put into production to process new data and deliver predictions based on the model training. Automated artificial intelligence can also be applied to checking that the model does not have inherent bias and to automating the tasks for continuous improvement of the model. Managing an AutoAI model requires frequent monitoring and updating, handled by a process known as model operations, or ModelOps.

The Automated Machine Learning and Data Science Team (AMLDS), a small team within IBM Research formed to “apply techniques from Artificial Intelligence (AI), Machine Learning (ML), and data management to accelerate and optimize the creation of machine learning and data science workflows,” is credited with advancing the development of AutoAI.

Use Case

A typical use case for AutoAI would be training a model to predict how customers might respond to a sales incentive. The model is first trained with actual data on how customers responded to the promotion. Presented with new data, the model can provide a prediction of how a new customer might respond, with a confidence score for the prediction. Prior to AutoML, data scientists had to build these predictive models by hand, testing various combinations of algorithms, then testing to see how predictions compared to actual results. Where AutoML automated some of the process of preparing the data for training, applying algorithms to process the data and then further optimizing the results, AutoAI provides greater intelligent automation that allows for testing significantly more combinations of factors to generate model candidate pipelines that more accurately reflect and address the problem being solved. Once built, a model can be tested for bias and updated to improve performance.
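The train-then-predict loop described above can be illustrated with a minimal sketch. It is not AutoAI itself: it hand-rolls a one-feature logistic regression in plain Python, and the feature (`prior_purchases`), the training values, and the function names are all hypothetical, chosen only to show a prediction accompanied by a confidence score.

```python
import math

def train_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit p(yes | x) = sigmoid(w * x + b) with plain batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            grad_w += (p - y) * x
            grad_b += p - y
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def predict(model, x):
    """Return the predicted response and the model's confidence in it."""
    w, b = model
    p_yes = 1.0 / (1.0 + math.exp(-(w * x + b)))
    label = "yes" if p_yes >= 0.5 else "no"
    confidence = p_yes if label == "yes" else 1.0 - p_yes
    return label, confidence

# Hypothetical history: prior purchases per customer, and whether each
# customer accepted the promotion (1) or not (0).
prior_purchases = [0, 1, 2, 3, 4, 5, 6, 7]
accepted = [0, 0, 0, 1, 0, 1, 1, 1]

model = train_logistic(prior_purchases, accepted)
label, confidence = predict(model, 7)  # a new customer with 7 prior purchases
print(f"predicted response: {label} (confidence {confidence:.2f})")
```

In this sketch the model form, the single feature, and the training settings are fixed by hand; those are the choices that AutoML and AutoAI are described as automating.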

The AutoAI process

The user initiates the process by providing a set of training data and identifying the prediction column, which sets up the problem to solve. For example, the prediction column might contain possible values of yes or no in response to an offered incentive.

In the ''data pre-processing'' stage, AutoAI applies numerous algorithms, or estimators, to analyze, clean (for example, remove redundant information or impute missing data), and prepare structured raw data for machine learning (ML).

The next stage is ''automated model selection'', which matches the data with a model type, such as classification or regression. For example, if there are only two distinct values in the prediction column, AutoAI prepares to build a binary classification model. If the set of possible answers is open-ended, AutoAI prepares a regression model, which employs a different set of algorithms, or problem-solving transformations. AutoAI ranks candidate algorithms after testing them against small subsets of the data, gradually increasing the size of the subset for the algorithms that prove most promising until it arrives at the best match. This process of iterative and incremental machine learning is what sets AutoAI apart from earlier versions of AutoML.

''Feature engineering'' transforms the raw data into the combination of features that best represents the problem, in order to arrive at the most accurate prediction. Part of this process is to evaluate how data in the training data source can best support an accurate prediction. Using algorithms, it weights some data as more important than others to achieve the desired result. AutoAI automates the consideration of numerous feature construction options in a non-exhaustive, structured manner while progressively maximizing the accuracy of the model using reinforcement learning. The result is an optimized sequence of data transformations that matches the best algorithms from the model selection step.

Finally, AutoAI applies the ''hyperparameter optimization'' step to refine and advance the best-performing model pipelines. Pipelines are model candidates that are evaluated and ranked by metrics such as accuracy and precision. At the end of the process, the user can review the pipelines and choose the pipeline or pipelines to put into production to deliver predictions on new data.
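The incremental selection idea, in which candidates are scored on small subsets and only the promising ones advance to larger subsets, can be sketched as follows. This is an illustrative approximation in plain Python, not IBM's implementation: the three toy estimators, the synthetic data, the doubling schedule, and the names `select_incrementally`, `start_size`, and `keep_fraction` are all assumptions made for the example.

```python
import math
import random

def fit_majority(rows):
    """Baseline: always predict the most common label in the training rows."""
    ones = sum(label for _, label in rows)
    guess = 1 if 2 * ones >= len(rows) else 0
    return lambda x: guess

def fit_threshold(rows):
    """Decision stump: split halfway between the two class means."""
    pos = [x for x, label in rows if label == 1]
    neg = [x for x, label in rows if label == 0]
    if not pos or not neg:
        return fit_majority(rows)
    mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    cut = (mean_pos + mean_neg) / 2
    if mean_pos >= mean_neg:
        return lambda x: 1 if x > cut else 0
    return lambda x: 1 if x < cut else 0

def fit_nearest(rows):
    """1-nearest-neighbour: copy the label of the closest training row."""
    return lambda x: min(rows, key=lambda row: abs(row[0] - x))[1]

def accuracy(model, rows):
    return sum(model(x) == label for x, label in rows) / len(rows)

def select_incrementally(candidates, train, holdout, start_size=20, keep_fraction=0.5):
    """Score candidates on growing training subsets, dropping the weakest each round."""
    survivors = dict(candidates)
    size = min(start_size, len(train))
    while True:
        subset = train[:size]
        scores = {name: accuracy(fit(subset), holdout) for name, fit in survivors.items()}
        ranked = sorted(scores, key=scores.get, reverse=True)
        if len(ranked) == 1 or size >= len(train):
            return [(name, scores[name]) for name in ranked]
        n_keep = max(1, math.ceil(len(ranked) * keep_fraction))
        survivors = {name: survivors[name] for name in ranked[:n_keep]}
        size = min(len(train), size * 2)  # promising candidates see more data

# Synthetic binary-classification data: one numeric feature, noisy yes/no label.
rng = random.Random(0)

def make_rows(n):
    rows = []
    for _ in range(n):
        x = rng.uniform(0, 10)
        rows.append((x, 1 if x + rng.gauss(0, 1) > 5 else 0))
    return rows

train, holdout = make_rows(160), make_rows(100)
candidates = {"majority": fit_majority, "threshold": fit_threshold, "nearest": fit_nearest}
ranking = select_incrementally(candidates, train, holdout)
for name, score in ranking:
    print(f"{name}: holdout accuracy {score:.2f}")
```

The point of the schedule is cost: weak candidates are discarded after seeing only a small subset, so the larger and more expensive training runs are spent on the candidates most likely to rank first.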

History

In August 2017, AMLDS announced that they were researching the use of automated feature engineering to eliminate guesswork in data science. AMLDS members Udayan Khurana, Horst Samulowitz, Gregory Bramble, Deepak Turaga, and Peter Kirchner, along with Fatemeh Nargesian of the University of Toronto and Elias Khalil of Georgia Tech, presented their preliminary research at IJCAI that same year. Called “Learning-based Feature Engineering,” their method learned the correlations between feature distributions, target distributions, and transformations, built meta-models that used past observations to predict viable transformations, and generalized across thousands of data sets spanning different domains. To address feature vectors of different sizes, it used a Quantile Sketch Array to capture the essential character of a feature.

In 2018, IBM Research announced Deep Learning as a Service, which opened popular deep learning libraries such as Caffe, Torch, and TensorFlow to developers in the cloud. AMLDS continued their work and used it in a well-known Kaggle competition, where it finished in the top ten percent. Jean-Francois Puget, PhD, a distinguished engineer specializing in machine learning (ML) and optimization at IBM, also entered the competition. He learned of the result and decided that the technology was ready for IBM AI and data science platforms such as IBM Watson.

In December 2018, IBM Research announced NeuNetS, a new capability that automated neural network model synthesis as part of automated AI model development and deployment. In 2020, Liu et al. proposed a method for AutoML that used the alternating direction method of multipliers (ADMM) to configure multiple stages of an ML pipeline, such as transformations, feature engineering and selection, and predictive modeling. This was the first recorded time that IBM Research publicly applied the term “Auto” to machine learning.

AutoAI: The evolution of AutoML

2019 was the year that AutoML became more widely discussed as a concept. “The Forrester New Wave™: Automation-Focused Machine Learning Solutions, Q2 2019” evaluated AutoML solutions and found that the more powerful versions offered feature engineering. A Gartner Technical Professional Advice report from August 2019 found that AutoML could augment data science and machine learning, and described AutoML as the automation of data preparation, feature engineering, and model engineering tasks.

AutoAI is the evolution of AutoML. One of AutoAI's principal inventors, Jean-Francois Puget, PhD, describes it as automatically performing data preparation, feature engineering, machine learning algorithm selection, and hyper-parameter optimization to find the best possible machine learning model. The hyper-parameter optimization algorithm used in AutoAI differs from the hyper-parameter tuning of AutoML. The algorithm is optimized for costly function evaluations, such as the model training and scoring that are typical in machine learning, which enables rapid convergence to a good solution even though each iteration takes a long time to evaluate.

Research scientists at IBM Research published a paper, "Towards Automating the AI Operations Lifecycle", which describes the advantages and available technologies for automating more of the process, with the goal of limiting the human involvement required to build, test, and maintain a machine learning application. However, some HCI researchers argue that a machine learning application's recommendations are inevitably acted on by human decision makers, so human involvement in the process cannot be eliminated. Rather, a more transparent and interpretable AutoAI design is the key to gaining the trust of human users, although achieving such a design is itself a considerable challenge.

Awards for AutoAI

*Winner, Best Innovation in Intelligent Automation Award at the AIconics AI Summit (2019), San Francisco.
*Winner, iF Design Guide award for Communication in a Software Application (2020).

References