For most businesses, machine learning seems close to rocket science: expensive and talent-demanding. And if you’re aiming at building another Netflix recommendation system, it really is. But the everything-as-a-service trend has affected this sophisticated sphere, too. You can jump-start an ML initiative without much investment, which is the right move if you are new to data science and just want to grab the low-hanging fruit.
One of ML’s most inspiring stories is the one about a Japanese farmer who decided to sort cucumbers automatically to help his parents with this painstaking operation. Unlike the stories that abound about large enterprises, this farmer had neither machine learning expertise nor a big budget. But he did manage to get familiar with TensorFlow and employed deep learning to recognize different classes of cucumbers.
By using machine learning cloud services, you can start building your first working models, yielding valuable insights from predictions with a relatively small team. We’ve already discussed machine learning strategy. Now let’s have a look at the best machine learning platforms on the market and consider some of the infrastructural decisions to be made.
What is machine learning as a service?
Machine learning as a service (MLaaS) is an umbrella definition of various cloud-based platforms that cover most infrastructure issues such as data pre-processing, model training, and model evaluation, with further prediction. Prediction results can be bridged with your internal IT infrastructure through REST APIs.
Amazon Machine Learning services, Azure Machine Learning, Google Cloud AI, and IBM Watson are four leading cloud MLaaS services that allow for fast model training and deployment. These should be considered first if you assemble a homegrown data science team out of available software engineers. Have a look at our data science team structures story to have a better idea of roles distribution.
Within this article, we’ll first give an overview of the main machine-learning-as-a-service platforms by Amazon, Google, Microsoft, and IBM, and will follow it by comparing machine learning APIs that these vendors support. Please note that this overview isn’t intended to provide exhaustive instructions on when and how to use these platforms, but rather what to look for before you start reading through their documentation.
Machine learning services for custom predictive analytics tasks
Predictive analytics with Amazon ML
Amazon Machine Learning services are available on two levels: predictive analytics with Amazon ML and the SageMaker tool for data scientists.
Amazon Machine Learning for predictive analytics is one of the most automated solutions on the market and the best fit for deadline-sensitive operations. The service can load data from multiple sources, including Amazon RDS, Amazon Redshift, CSV files, etc. All data preprocessing operations are performed automatically: The service identifies which fields are categorical and which are numerical, and it doesn’t ask a user to choose the methods of further data preprocessing (dimensionality reduction and whitening).
Prediction capacities of Amazon ML are limited to three options: binary classification, multiclass classification, and regression. In other words, this Amazon ML service doesn’t support any unsupervised learning methods, and a user must select a target variable to label in a training set. A user isn’t required to know any machine learning methods, though, because Amazon chooses them automatically after looking at the provided data.
Predictions can be generated in real time or in batch mode, with two separate APIs available. One thing to consider is that Amazon currently seems to be capitalizing on its more powerful ML-based services, like SageMaker, discussed below. Keep in mind that the documentation for this ML service was last updated in 2016.
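For illustration, here’s a minimal sketch of requesting a real-time prediction from an Amazon ML model with the boto3 SDK. The model ID and feature names below are placeholders; you get the real values after training a model and enabling a real-time endpoint:

```python
import boto3

# Amazon ML is exposed through the "machinelearning" boto3 client.
client = boto3.client("machinelearning", region_name="us-east-1")

response = client.predict(
    MLModelId="ml-exampleModelId",            # hypothetical model ID
    Record={"age": "42", "plan": "premium"},  # feature values are passed as strings
    PredictEndpoint="https://realtime.machinelearning.us-east-1.amazonaws.com",
)

# Binary/multiclass models return a predicted label and class scores;
# regression models return a numeric predictedValue.
print(response["Prediction"])
```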
The latest updates to the platform touched on improvements of separate modules, SDKs, and frameworks. The major part of the update concerns the Deep Learning AMI (a pre-configured Amazon Machine Image for EC2 instances), its pre-built deep learning frameworks with CUDA support, and minor improvements for Linux integration. More on that can be found in the AWS release notes.
This high automation level acts both as an advantage and disadvantage for Amazon ML use. If you need a fully automated yet limited solution, the service can match your expectations. If not, there’s SageMaker.
Amazon SageMaker and frameworks-based services
SageMaker is a machine learning environment that’s supposed to simplify the work of data scientists by providing tools for quick model building and deployment. For instance, it provides Jupyter notebooks to simplify data exploration and analysis without the hassle of server management. Amazon also has built-in algorithms that are optimized for large datasets and computations in distributed systems. These include:
- Linear learner is a supervised method for classification and regression
- Factorization machines is a classification and regression method designed for sparse datasets
- XGBoost is a supervised boosted trees algorithm that increases prediction accuracy in classification, regression, and ranking by combining the predictions of simpler algorithms
- Image classification is based on ResNet, which can also be applied for transfer learning
- Seq2seq is a supervised algorithm for predicting sequences (e.g. translating sentences, converting strings of words into shorter ones as a summary, etc.)
- K-means is an unsupervised learning method for clustering tasks
- Principal component analysis is used for dimensionality reduction
- Latent Dirichlet allocation is an unsupervised method used for finding categories in documents
- Neural topic model (NTM) is an unsupervised method that explores documents, reveals top ranking words, and defines the topics (users can’t predefine topics, but they can set the expected number of them)
- DeepAR forecasting is a supervised learning algorithm used for forecasting time series that employs recurrent neural networks (RNN)
- BlazingText is a natural language processing (NLP) algorithm built on Word2vec, which allows it to map words in large collections of texts to vector representations
- Random Cut Forest is an anomaly detection unsupervised algorithm capable of assigning anomaly scores to each data point
Built-in SageMaker methods largely intersect with the ML APIs that Amazon offers, but here data scientists can play with them and use their own datasets.
If you don’t want to use these, you can add your own methods and run models via SageMaker leveraging its deployment features. Or you can integrate SageMaker with TensorFlow, Keras, Gluon, Torch, MXNet, and other machine learning libraries.
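As an example of the workflow, here’s a minimal sketch of training the built-in K-means algorithm with the SageMaker Python SDK. It assumes a SageMaker notebook environment; the bucket name is a placeholder, and argument names differ slightly between SDK versions (older releases use train_instance_count and train_instance_type):

```python
import numpy as np
import sagemaker
from sagemaker import KMeans

role = sagemaker.get_execution_role()             # IAM role of the notebook instance
data = np.random.rand(1000, 8).astype("float32")  # stand-in for your feature matrix

kmeans = KMeans(
    role=role,
    instance_count=1,
    instance_type="ml.m5.large",
    k=10,                                           # number of clusters to learn
    output_path="s3://your-bucket/kmeans-output/",  # placeholder bucket
)

# record_set() converts the array to the RecordIO-protobuf format the
# built-in algorithms expect; fit() launches a managed training job.
kmeans.fit(kmeans.record_set(data))

# Deploy the trained model to a real-time inference endpoint.
predictor = kmeans.deploy(initial_instance_count=1, instance_type="ml.m5.large")
```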
Generally, Amazon machine learning services provide enough freedom for both experienced data scientists and those who just need things done without digging deeper into dataset preparations and modeling. This would be a solid choice for companies that already use Amazon cloud services and don’t plan to transition to another cloud provider.
Microsoft Azure Machine Learning Studio
The Azure Machine Learning platform is aimed at setting up a powerful playground for both newcomers and experienced data scientists. The roster of Microsoft machine learning products is similar to Amazon’s, but Azure, as of today, seems more flexible in terms of out-of-the-box algorithms.
Services from Azure can be divided into two main categories: Azure Machine Learning Studio and Bot Service. Let’s find out what’s under the hood of Azure ML Studio. We’ll return to Bot Service in the section dedicated to specific APIs and tools.
ML Studio is the main MLaaS package to look at. Almost all operations in Azure ML Studio must be completed using a graphical drag-and-drop interface. This includes data exploration, preprocessing, choosing methods, and validating modeling results.
Approaching machine learning with Azure entails a learning curve. But it eventually leads to a deeper understanding of all the major techniques in the field. The Azure ML graphical interface visualizes each step within the workflow and supports newcomers. Perhaps the main benefit of using Azure is the variety of algorithms available to play with. The Studio supports around 100 methods that address classification (binary and multiclass), anomaly detection, regression, recommendation, and text analysis. It’s worth mentioning that the platform has one clustering algorithm (K-means).
Another big part of Azure ML is Cortana Intelligence Gallery. It’s a collection of machine learning solutions provided by the community to be explored and reused by data scientists. The Azure product is a powerful tool for starting with machine learning and introducing its capabilities to new employees.
The 2019 platform updates focus mainly on the Python Machine Learning SDK and a preview of a new web experience for Azure ML Workspaces (basically a UI for the ML platform). It will allow developers to deploy models, visualize data, and work on dataset preparation in one place. The interface isn’t yet fully functional, but the documentation describes some of it. The web interface will be supported by all popular browsers, offering easy-to-use navigation for both experienced engineers and beginners.
Microsoft Azure Machine Learning Services
September 2017 saw Microsoft introducing a new set of ML-focused products under the umbrella name Azure Machine Learning Services. The release caused some confusion in the Azure developer community, as it required engineers to choose between two different platforms that can’t be cross-integrated. So, we asked Matt Winkler, a group program manager at Microsoft who works with Azure AI products, to give us some inside info about the platform:
“Azure ML Services is our next generation infrastructure for building and deploying models at scale, using any tool or framework. Azure ML Services provides end-to-end lifecycle management, keeping track of all of your experiments across your entire team, storing code, config, parameter settings, and environment details to make it easy to rank, search, and replicate any experiment done by your team. Once you have a model you like, you can easily encapsulate it in a container and deploy to Azure, on-prem, or IOT devices, and it’s easy to scale and manage as it is “just another container” that runs on Kubernetes. Azure ML Services makes it easy to start locally, in a Python editor or notebook of your choice, and then easily compute in Azure and scale up/out when you need it,” says Matt.
Basically, the services provide a supporting environment to build models, experiment with them, and use a broad variety of open source components and frameworks. Unlike ML Studio, it doesn’t have built-in methods and requires custom model engineering. The platform is aimed at experienced data scientists. If you have the right team, ML Services offers a powerful toolset to manage ML experiments, use popular frameworks like TensorFlow, scikit-learn, etc. (which isn’t available with ML Studio), and deploy models into production with containerization tools like Docker.
Let’s have a closer look at what the ML Services platform offers.
Python packages. These proprietary packages have libraries and functions that aim at four main groups of tasks: computer vision, forecasting, text analysis, and hardware acceleration.
Experimentation. With any Python tools and frameworks, engineers can build different models, compare them, roll a project back to a specific historic configuration, and continue development from any point in its history.
Model management. The tool provides an environment to host, version, manage, and monitor models that run on Azure, on-premises, or even Edge devices.
Workbench. This product is a convenient desktop and command-line environment with dashboards and evaluation tools to track model development.
Visual Studio Tools for AI. Basically, this extension adds tools to the VS IDE to work with deep learning and other AI products.
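To make the experimentation capability described above more concrete, here’s a minimal sketch of run tracking with the Azure ML Python SDK (azureml-core). It assumes you’ve downloaded a workspace config.json from the portal; the experiment name and logged values are placeholders:

```python
from azureml.core import Workspace, Experiment

ws = Workspace.from_config()    # reads config.json with subscription details
exp = Experiment(workspace=ws, name="churn-model-experiments")

run = exp.start_logging()       # start an interactive run
run.log("alpha", 0.03)          # record a parameter setting
run.log("accuracy", 0.91)       # record a metric so runs can be ranked and compared
run.complete()
```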
If you still don’t know whether you should stick with Azure ML Studio or ML Services, Matt Winkler suggests, “We think of them as two different capabilities of the same service – Azure Machine Learning – that serve different types of customers. People newer to the ML space, and those not familiar with coding (analysts, data folks, etc.) love the convenience of Azure ML Studio, whereas professional data scientists and AI developers who are comfortable with Python prefer the capabilities in Azure ML services.”
Google Cloud AutoML
Similar to Amazon, Google provides its machine learning and AI services on two levels: Google Cloud Machine Learning for tech-savvy data specialists and the Cloud AutoML platform, which replaced the deprecated Prediction API.
Google Cloud AutoML is a cloud-based ML platform tailored for inexperienced users. Customers can upload their datasets, train custom models, and deploy them via the web interface. Needless to say, AutoML is fully integrated with all of Google’s services and stores data in the cloud. Trained models can be served via a REST API.
There are several products available with AutoML that you can access via a graphical interface. To briefly describe them, these are image and video processing services, a natural language processing and translation engine, and training models on structured data. Since each product can be accessed via an API, we’ll cover them separately in the API section.
Google Cloud Machine Learning Engine
Cloud AutoML is a successor to the Prediction API, offering automation at the cost of flexibility. Google ML Engine is the direct opposite. It caters to experienced data scientists, it’s very flexible, and it suggests using cloud infrastructure with TensorFlow as a machine learning driver. In addition, Google is testing a number of other popular frameworks like XGBoost, scikit-learn, and Keras. So, in principle, ML Engine is pretty similar to SageMaker. With a recent update, built-in algorithms were introduced, so you can consult the documentation to determine whether they suit your dataset and task. Keep in mind that this capability is still in beta.
TensorFlow is another Google product: an open source machine learning library of various data science tools rather than ML-as-a-service. It doesn’t have a visual interface, and the learning curve for TensorFlow is quite steep. However, the library is also targeted at software engineers who plan on transitioning to data science. TensorFlow is quite powerful, but aimed mostly at deep neural network tasks.
Basically, the combination of TensorFlow and the Google Cloud service represents infrastructure-as-a-service and platform-as-a-service solutions according to the three-tier model of cloud services. We talked about this concept in our whitepaper on digital transformation. Have a look if you aren’t familiar with it.
IBM Watson Machine Learning Studio
IBM offers a single machine learning platform for both experienced data scientists and newcomers to the sphere. Technically, the system supports two approaches: automated, and manual for expert practitioners. Similar to the deprecated Google Prediction API or the still-operational Amazon ML, Watson Studio includes AutoAI, which brings a fully automated data processing and model building interface that needs little to no training to start processing data, preparing models, and deploying them into production.
The automated part can solve three main types of tasks: binary classification, multiclass classification, and regression. You can choose either a fully automated approach or manually pick the ML method to be used. Currently, IBM has ten methods to cover these three groups of tasks:
- Logistic regression
- Decision tree classifier
- Random forest classifier
- Gradient boosted tree classifier
- Naive Bayes
- Linear regression
- Decision tree regressor
- Random forest regressor
- Gradient boosted tree regressor
- Isotonic regression
Besides AutoAI, there are two other services that you can use for building models:
SPSS Modeler. SPSS is a software package used to transform data into statistical business information. Acquired by IBM in 2009 and integrated as a stand-alone ML service, it’s now a product that allows you to upload datasets, use SQL statements to manipulate data, and train models to operate with business information.
Neural Network Modeler. This is similar to the SPSS Modeler except that it focuses on neural network models and enables the processing of visual as well as textual data. The network modeler uses a flow editor to train models, deploy them, and move transformed data between databases, e.g. IBM Cloud Object Storage.
Separately, IBM offers a deep neural network training workflow with a flow editor interface similar to the one used in Azure ML Studio.
If you’re looking for advanced capabilities, IBM ML has notebooks such as Jupyter to program models manually using popular frameworks like TensorFlow, scikit-learn, PyTorch, and others.
To wrap up with machine learning as a service (MLaaS) platforms, it seems that Azure currently has the most versatile toolset on the MLaaS market. It covers the majority of ML-related tasks, provides two distinct products for building custom models, and has a solid set of APIs for those who don’t want to attack data science with their bare hands.
One of IBM’s latest updates, made in 2019, is the discontinuation of the old model builder, which was replaced by AutoAI. The models trained with the model builder are still operable within Watson Studio, but new models are now trained in AutoAI. Other updates concern support for the latest versions of TensorFlow and Python.
Machine learning APIs from Amazon, Microsoft, Google, and IBM comparison
Besides full-blown platforms, you can use high-level APIs. These are the services with trained models under the hood that you can feed your data into and get results. APIs don’t require machine learning expertise at all. Currently, the APIs from these four vendors can be broadly divided into three large groups:
1) text recognition, translation, and textual analysis
2) image + video recognition and related analysis
3) other, which includes specific uncategorized services
Speech and text processing APIs: Amazon
Amazon provides multiple APIs that aim at popular tasks within text analysis. These are also highly automated in terms of machine learning and just need proper integration to work.
Amazon Lex. The Lex API is created to embed chatbots in your applications as it contains automatic speech recognition (ASR) and natural language processing (NLP) capacities. These are based on deep learning models. The API can recognize written and spoken text and the Lex interface allows you to hook the recognized inputs to various back-end solutions. Obviously, Amazon encourages use of its Lambda cloud environment. So, prior to subscribing to Lex, get acquainted with Lambda as well. Besides standalone apps, Lex currently supports deploying chatbots for Facebook Messenger, Slack, and Twilio.
Amazon Transcribe. While Lex is a complex chatbot-oriented tool, Transcribe is created solely for recognizing spoken text. The tool can recognize multiple speakers and works with low-quality telephony audio. This makes the API a go-to solution for cataloging audio archives or a good support for the further text analysis of call-center data.
Amazon Polly. The Polly service is kind of a reverse of Lex. It turns text into speech, which will allow your chatbots to respond with voice. It’s not going to compose the text though, just make the text sound close to human. If you’ve ever tried Alexa, you’ve got the idea. Currently, it supports both female and male voices for 25 languages, mostly English and Western European ones. Some languages have multiple female and male voices, so there’s even a variety to choose from. Like Lex, Polly is recommended for use with Lambda.
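A minimal sketch of Polly with boto3, assuming the stock “Joanna” US English voice; the output is an MP3 stream you can save or pipe to a player:

```python
import boto3

polly = boto3.client("polly", region_name="us-east-1")

response = polly.synthesize_speech(
    Text="Your order has shipped and will arrive on Friday.",
    OutputFormat="mp3",
    VoiceId="Joanna",  # one of Polly's stock US English female voices
)

# AudioStream is a streaming body; write it out as an MP3 file.
with open("speech.mp3", "wb") as f:
    f.write(response["AudioStream"].read())
```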
Amazon Comprehend. Comprehend is another NLP set of APIs that, unlike Lex and Transcribe, aims at different text analysis tasks. Currently, Comprehend supports:
- Entities extraction (recognizing names, dates, organizations, etc.)
- Key phrase detection
- Language recognition
- Sentiment analysis (how positive, neutral, or negative a text is)
- Topic modeling (defining dominant topics by analyzing keywords)
This service will help you analyze social media responses, comments, and other big textual data that’s not amenable to manual analysis, e.g. the combo of Comprehend and Transcribe will help analyze sentiment in your telephony-driven customer service.
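As an illustration, here’s a minimal boto3 sketch running sentiment analysis and entity extraction on a sample comment (the text is made up):

```python
import boto3

comprehend = boto3.client("comprehend", region_name="us-east-1")
text = "The support team in Boston resolved my issue within an hour. Great service!"

sentiment = comprehend.detect_sentiment(Text=text, LanguageCode="en")
print(sentiment["Sentiment"], sentiment["SentimentScore"])  # e.g. POSITIVE plus scores

entities = comprehend.detect_entities(Text=text, LanguageCode="en")
for entity in entities["Entities"]:
    print(entity["Type"], entity["Text"])  # e.g. LOCATION Boston
```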
Amazon Translate. As the name states, the Translate service translates texts. Amazon claims that it uses neural networks which – compared to rule-based translation approaches – provides better translation quality. Unfortunately, the current version supports translation from only six languages into English and from English into those six. The languages are Arabic, Chinese, French, German, Portuguese, and Spanish.
Speech and text processing APIs: Microsoft Azure Cognitive Services
Just like Amazon, Microsoft offers high-level APIs, Cognitive Services, that can be integrated with your infrastructure and perform tasks with no data science expertise needed.
Speech. The speech set contains four APIs that apply different types of natural language processing (NLP) techniques for natural speech recognition and other operations:
- Translator Speech API
- Bing Speech API to convert text into speech and speech into text
- Speaker Recognition API for voice verification tasks
- Custom Speech Service to apply Azure NLP capacities using your own data and models
Language. The language group of APIs focuses on textual analysis similar to Amazon Comprehend:
- Language Understanding Intelligent Service (LUIS) is an API that analyzes intentions in text to be recognized as commands (e.g. “run YouTube app” or “turn on the living room lights”)
- Text Analysis API for sentiment analysis and defining topics
- Bing Spell Check
- Translator Text API
- Web Language Model API that estimates probabilities of word combinations and supports word autocompletion
- Linguistic Analysis API used for sentence separation, tagging the parts of speech, and dividing texts into labeled phrases
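To give a feel for the integration effort, here’s a minimal sketch of calling the Text Analytics sentiment endpoint over plain REST with Python’s requests library. The region, subscription key, and API version are placeholders; check the current docs, since Cognitive Services endpoints have been revised over time:

```python
import requests

endpoint = "https://westus.api.cognitive.microsoft.com/text/analytics/v2.1/sentiment"
headers = {"Ocp-Apim-Subscription-Key": "YOUR_KEY"}  # placeholder subscription key
payload = {
    "documents": [
        {"id": "1", "language": "en", "text": "The new dashboard is fast and intuitive."}
    ]
}

response = requests.post(endpoint, headers=headers, json=payload)
print(response.json())  # v2.1 returns a sentiment score from 0 (negative) to 1 (positive)
```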
Speech and text processing APIs: Google Cloud ML Services/ Cloud AutoML
While this set of APIs largely intersects with what Amazon and Microsoft Azure offer, it has some interesting and unique things to look at. Since the AutoML platform replaced the Prediction API, it now extends the capabilities of Google Cloud ML services. So, every Google API concerning automated machine learning is a viable option for training custom models.
Dialogflow. With various chatbots topping today’s trends, Google also has something to offer. Dialogflow is powered by NLP technologies and aims at defining intents in text and interpreting what a person wants. The API can be tweaked and customized for the needed intents using Java, Node.js, and Python.
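A minimal sketch of intent detection with the Dialogflow Python client; the project and session IDs are placeholders, and the package layout has shifted across releases (older versions ship as dialogflow, newer ones as google-cloud-dialogflow):

```python
from google.cloud import dialogflow

session_client = dialogflow.SessionsClient()
session = session_client.session_path("your-project-id", "session-123")  # placeholders

text_input = dialogflow.TextInput(text="turn on the living room lights",
                                  language_code="en-US")
query_input = dialogflow.QueryInput(text=text_input)

response = session_client.detect_intent(
    request={"session": session, "query_input": query_input}
)

# The matched intent and the agent's canned reply, if any.
print(response.query_result.intent.display_name)
print(response.query_result.fulfillment_text)
```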
Cloud natural language API. This one is almost identical in its core features to Comprehend by Amazon and Language by Microsoft. It supports:
- Defining entities in text
- Recognizing sentiment
- Analyzing syntax structures
- Categorizing topics (e.g. food, news, electronics, etc.)
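For instance, a minimal sketch with the google-cloud-language client (the sample sentence is made up):

```python
from google.cloud import language_v1

client = language_v1.LanguageServiceClient()
document = language_v1.Document(
    content="The new Pixel camera is excellent in low light.",
    type_=language_v1.Document.Type.PLAIN_TEXT,
)

sentiment = client.analyze_sentiment(request={"document": document})
print(sentiment.document_sentiment.score)  # ranges from -1.0 (negative) to 1.0 (positive)

entities = client.analyze_entities(request={"document": document})
for entity in entities.entities:
    print(entity.name)  # e.g. "Pixel camera"
```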
Cloud speech API. This service recognizes natural speech, and perhaps its main benefit compared to similar APIs is the abundance of languages supported by Google. Currently, it works with over 110 languages and their variants. It also has some additional features:
- Word hints allow for customizing recognition to specific contexts and words that can be spoken (e.g. for better understanding of local or industry jargon)
- Filtering inappropriate content
- Handling noisy audio
Cloud translation API. Basically, you can use this API to employ Google Translate in your products. This one includes over a hundred languages and automatic language detection.
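A minimal sketch with the “basic” v2 edition of the google-cloud-translate client; when no source language is given, the API detects it automatically:

```python
from google.cloud import translate_v2

client = translate_v2.Client()
result = client.translate("Where is the nearest train station?", target_language="de")

print(result["translatedText"])          # the German translation
print(result["detectedSourceLanguage"])  # "en", detected automatically
```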
AutoML Natural Language API. AutoML is currently in development, so most of the products that come with AutoML are yet to be released. The natural language processor is also in beta. Basically, it allows you to upload training data through the AutoML UI and train custom models. Available features are:
- Defining content in English
- Defining entities in text
- Analyzing syntax structure
AutoML translation API. The translation API is now in beta and currently contains information only about its custom modeling capabilities. While we mention it, keep in mind that it will be updated in the future.
Speech and text processing APIs: IBM Watson
IBM also competes in the API market. Let’s have a look at their set of interfaces.
Speech to Text. Currently, IBM offers speech recognition for 9 languages, including Japanese, Portuguese, and Arabic. The API can recognize multiple speakers, spot keywords, and handle lossy audio. An interesting feature is capturing word alternatives and reporting them. For instance, if the system spots the word “Boston,” it can assume that there may be an “Austin” alternative. Upon analyzing its hypothesis, the API assigns a confidence score to each alternative.
Text to Speech. Interestingly, the nine text-to-speech languages only partly match those in the Speech to Text API. While both products support Western European languages, Text to Speech lacks Korean and Chinese. English, German, and Spanish allow you to choose between male and female voices; the rest of the languages come with female voices only. This is in line with the trend toward making voice assistants sound mostly female.
Language translator. The API supports 21 languages for translation from and to English. Additionally, you can add custom models and expand the language coverage.
Natural language classifier. Unlike most of the APIs mentioned, the classifier by IBM can’t be used without your own dataset. Basically, the tool allows you to train models using your own business data and then classify incoming records. Common use cases are tagging products in eCommerce, fraud detection, categorizing messages, social media feeds, etc.
Natural language understanding. The language understanding feature set at IBM is extensive. Besides standard information extraction like keyword and entity extraction with syntax analysis, the API offers a number of interesting capabilities that aren’t available from other providers. These include metadata analysis and finding relations between entities. Additionally, IBM provides a separate environment, Knowledge Studio, to train your own models for text analysis.
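Here’s a minimal sketch with the ibm-watson Python SDK; the API key, service URL, and version date are placeholders taken from your IBM Cloud service credentials:

```python
from ibm_watson import NaturalLanguageUnderstandingV1
from ibm_watson.natural_language_understanding_v1 import (
    Features, EntitiesOptions, KeywordsOptions,
)
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

nlu = NaturalLanguageUnderstandingV1(
    version="2019-07-12",                           # API version date (placeholder)
    authenticator=IAMAuthenticator("YOUR_APIKEY"),  # placeholder API key
)
nlu.set_service_url(
    "https://api.us-south.natural-language-understanding.watson.cloud.ibm.com"
)

result = nlu.analyze(
    text="IBM acquired Red Hat to expand its hybrid cloud business.",
    features=Features(entities=EntitiesOptions(), keywords=KeywordsOptions()),
).get_result()
print(result)  # entities (e.g. IBM, Red Hat) plus ranked keywords
```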
Personality insights. A relatively unusual API allows for analyzing texts and extracting clues about how the writer engages with the world. This basically means that the system will return:
- personality characteristics (e.g. agreeableness, conscientiousness, extraversion, emotional range, and openness)
- needs (e.g. curiosity, excitement, challenge)
- values (e.g. helping others, achieving success, hedonism).
Based on this data, the API can infer consumption preferences (e.g. music, learning, movies). The most common use case for such a system is user-generated content analysis for precise product marketing.
Tone analyzer. Tone analyzer is a separate API that focuses on sentiment analysis and is aimed at social media research and various customer engagement analytics. Don’t be confused by its somewhat ambiguous name: the analyzer covers only written text and doesn’t extract insights from spoken language.
Besides text and speech, Amazon, Microsoft, Google, and IBM provide rather versatile APIs for image and video analysis.
While image analysis closely intersects with video APIs, many tools for video analysis are still in development or in beta. For instance, Google offers rich support for various image processing tasks but definitely lacks the video analysis features already available from Microsoft and Amazon.
Image and video processing APIs: Amazon Rekognition
No, we didn’t misspell the word. The Rekognition API is used for image and, recently, video recognition tasks. They include:
- Object detection and classification (finding different objects in images and defining what they are)
- In videos, it can detect activities like “dancing” or complex actions like “extinguishing fire”
- Face recognition (for detecting faces and finding matching ones) and facial analysis (this one is pretty interesting as it detects smiles, analyzes eyes, and even defines emotional sentiment in videos)
- Detecting inappropriate videos
- Recognizing celebrities in images and videos (for whatever goals that might be)
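For example, a minimal boto3 sketch of label detection on an image stored in S3; the bucket and key are placeholders:

```python
import boto3

rekognition = boto3.client("rekognition", region_name="us-east-1")

response = rekognition.detect_labels(
    Image={"S3Object": {"Bucket": "your-bucket", "Name": "photos/street.jpg"}},
    MaxLabels=10,
    MinConfidence=80,  # skip low-confidence guesses
)

for label in response["Labels"]:
    print(label["Name"], round(label["Confidence"], 1))  # e.g. "Car 97.4"
```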
Image and video processing APIs: Microsoft Azure Cognitive Services
The Vision package from Microsoft combines six APIs that focus on different types of image, video, and text analysis.
- Computer vision that recognizes objects, actions (e.g. walking), written and typed texts, and defines dominant colors in images
- Content moderator detects inappropriate content in images, texts, and videos
- Face API detects faces, groups them, defines age, emotions, genders, poses, smiles, and facial hair
- Emotion API is another face recognition tool that describes facial expressions
- Custom Vision Service supports building custom image recognition models using your own data
- Video indexer is a tool to find people in videos, define sentiment of speech, and mark keywords
Image and video processing APIs: Google Cloud Services/ Cloud AutoML
Cloud vision API. The tool is built for image recognition tasks and is quite powerful for finding specific image attributes:
- Labeling objects
- Detecting faces and analyzing expressions (no specific face recognition or identification)
- Finding landmarks and describing the scene (e.g. vacation, wedding, etc.)
- Finding texts in images and identifying languages
- Detecting dominant colors
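A minimal sketch with the google-cloud-vision client (the file path is a placeholder; older SDK versions expose the same types under vision.types):

```python
from google.cloud import vision

client = vision.ImageAnnotatorClient()
with open("photo.jpg", "rb") as f:
    image = vision.Image(content=f.read())

labels = client.label_detection(image=image)
for label in labels.label_annotations:
    print(label.description, label.score)  # e.g. "Beach 0.97"

faces = client.face_detection(image=image)
print(len(faces.face_annotations), "face(s) detected")
```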
Cloud Video Intelligence. The video recognition API from Google is early in development so it lacks many features available with Amazon Rekognition and Microsoft Cognitive Services. Currently, the API provides the following toolset:
- Labeling objects and defining actions
- Identifying explicit content
- Transcribing speech
AutoML Vision API. AutoML has also come up with several products for training models, with AutoML Vision the first one to be announced. Since all AutoML APIs are now in beta, the product currently offers:
- Labeling objects and engaging a human labeling service
- Registering trained models in AutoML
AutoML Video Intelligence Classification API. This is a pre-release API for video processing, which will be able to classify specific shots from your video using your own data labels.
While on the feature-list level Google AI services may be lacking some abilities, the power of Google APIs is in the vast datasets that Google has access to.
Image and (no) video processing APIs: IBM Visual Recognition
The Visual Recognition API by IBM currently doesn’t support video analysis, which is already available from other providers (hence the headline of this section). And the image recognition engine offers a basic set of features, somewhat limited compared to what other vendors provide:
- Object recognition
- Face recognition (the API returns age and gender)
- Food recognition (for some reason, IBM engineered a dedicated model for foods)
- Inappropriate content detection
- Text recognition (this part of the API is in private beta, so you must request separate access)
Specific APIs and tools
Here, we’ll discuss specific API offerings and tools that come from Microsoft and Google. We didn’t include Amazon here, as its sets of APIs mostly match the above-mentioned categories of text analysis and image and video analysis. However, some of the capacities of these specific APIs are also present in Amazon products.
Azure Bot Service framework. Microsoft has put a lot of effort into providing its users with a flexible bot development toolset. Basically, the service contains a full-blown environment for building, testing, and deploying bots using different programming languages.
Interestingly, the Bot Service doesn’t necessarily require machine learning approaches. Of the five bot templates Microsoft provides (basic, form, language understanding, proactive, and Q&A), only the language understanding type requires advanced AI techniques.
Currently, you can use .NET and Node.js technologies to build bots with Azure and deploy them on the following platforms and services:
- Bing
- Cortana
- Skype
- Web Chat
- Office 365 email
- GroupMe
- Facebook Messenger
- Slack
- Kik
- Telegram
- Twilio
AWS ML hardware. Amazon’s recently introduced physical products are packed with dedicated APIs for programming the hardware with deep learning and machine learning models. The lineup of Amazon’s ML-based hardware products consists of three units:
AWS DeepLens is a programmable camera that brings ML to actual hardware: using Amazon ML services, it can be applied to visual data recognition and to training ML models on the footage it captures.
AWS DeepRacer is another hardware piece of the ML pack: basically, a 1/18-scale radio-controlled car that utilizes reinforcement learning.
AWS Inferentia is a chip tailored for deep learning processing and can be used to reduce computing costs. It supports TensorFlow, PyTorch, and Apache MXNet.
Bing Search from Microsoft. Microsoft offers seven APIs that connect with the core Bing search features, including autosuggest, news, image, and video search.
Knowledge from Microsoft. This API group combines text analysis with a broad spectrum of unique tasks:
- Recommendations API allows for building recommender systems for purchase personalization
- Knowledge Exploration Service allows you to type in natural queries to retrieve data from databases, visualize data, and autocomplete queries
- Entity Linking Intelligence API is designed to highlight names and phrases that denote proper entities (e.g. Age of Exploration) and ensure disambiguation
- Academic Knowledge API does word autocompletion, finds similarities in documents both in words and concepts, and searches for graph patterns in documents
- QnA Maker API can be used to match variations of questions with answers to build customer care chatbots and applications
- Custom Decision Service is a reinforcement learning tool to personalize and rank different types of content (e.g. links, ads, etc.) depending on a user’s preferences
Google Cloud Job Discovery. The API is still in early development, but it may soon redefine the job search capabilities we have today. Unlike conventional job search engines that rely on precise keyword matches, Google employs machine learning to find relevant connections between highly variable job descriptions and to avoid ambiguity. For instance, it strives to reduce irrelevant or overly broad results, like returning all jobs with the keyword “assistant” for the query “sales assistant.” What are the main features of the API?
- Fixing spelling errors in job search queries
- Matching the desired seniority level
- Finding relevant jobs that may involve variable phrasing and industry jargon (e.g. returning “barista” for the “server” query instead of “network specialist”; or “engagement specialist” for the “biz dev” query)
- Dealing with acronyms (e.g. returning “human resources assistant” for the “HR” query)
- Matching variable location descriptions
Watson Assistant. The Watson chatbot platform (formerly Conversation) is rather famous among AI engineers who specialize in conversational interfaces. IBM provides a full-fledged infrastructure for building and deploying bots capable of live conversation, leveraging entity and user intent analysis in messages.
Engineers can either use built-in support for Facebook Messenger and Slack deployment or create a client application to run the bot there.
All four platforms described above provide fairly exhaustive documentation to jump-start machine learning experiments and deploy trained models in a corporate infrastructure. There are also a number of other ML-as-a-service solutions that come from startups and are respected by data scientists, like PredicSis and BigML.
Data storage
Finding the right storage for collecting data and further processing it with machine learning is no longer a great challenge, assuming that your data scientists have enough knowledge to operate popular storage solutions.
In most cases, machine learning requires both SQL and NoSQL database schemas, which are supported by many established and trusted solutions like the Hadoop Distributed File System (HDFS), Cassandra, Amazon S3, and Redshift. For organizations that have used powerful storage systems before embarking on machine learning, this won’t be a barrier. If you plan to work with an ML-as-a-service system, the most straightforward way is to choose the same provider for both storage and machine learning, as this will reduce the time spent on configuring a data source.
However, some of these platforms can be easily integrated with other storage solutions. Azure ML, for instance, mainly integrates with other Microsoft products (Azure SQL, Azure Table, Azure Blob) but also supports Hadoop and a handful of other data source options, including direct data upload from a desktop or an on-premises server. Challenges may arise if your machine learning workflow is diversified and data comes from multiple sources.
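As an illustration of how thin that configuration layer can be when storage and ML live with the same provider: with the s3fs package installed, pandas can read training data straight from S3 (the bucket and key are placeholders):

```python
import pandas as pd

# Reads directly from S3; credentials come from the standard AWS config chain.
df = pd.read_csv("s3://your-bucket/training/customers.csv")
print(df.shape)
```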
Modeling and computing
We’ve discussed ML-as-a-service solutions that mainly provide computing capacities. But if the learning workflow is performed internally, the computing challenge will strike sooner or later. In most cases, machine learning requires a lot of computing power. Data sampling (making a curated subset) is still a relevant practice, even in the era of big data. While model prototyping can be done on a laptop, training a complex model on a large dataset requires investment in more powerful hardware. The same applies to data preprocessing, which can take days on regular office machines. In a deadline-sensitive environment – where models sometimes must be altered and retrained weekly or daily – this simply isn’t an option. There are three viable approaches to handling processing while maintaining high performance:
- Accelerate hardware. If you do relatively simple tasks and don’t apply your models to big data, use solid-state drives (SSDs) for tasks like data preparation or running analytics software. Computationally intensive operations can be addressed with one or several graphics processing units (GPUs). A number of libraries are available to let GPUs process models written even in high-level languages like Python.
- Consider distributed computing. Distributed computing implies having multiple machines with tasks split across them. However, this approach isn’t going to work for all machine learning techniques.
- Use cloud computing for scalability. If your models process customer-related data that has intensive peak-moments, cloud computing services will allow for rapid scalability. For the companies that are required to have their data on-premise only, it’s worth considering private cloud infrastructure.
The next move
It’s easy to get lost in the variety of solutions available. They differ in algorithms, they differ in required skillsets, and eventually they differ in tasks. This situation is quite common in this young market: even the four leading solutions we’ve talked about aren’t fully competitive with each other. And more than that, the velocity of change is impressive. There’s a high likelihood that you’ll stick with one vendor, and suddenly another will roll out something unexpected that matches your business needs.
The right move is to articulate what you plan to achieve with machine learning as early as possible. It’s not easy. Creating a bridge between data science and business value is tricky if you lack either data science or domain expertise. We at AltexSoft encounter this problem often when discussing machine learning applications with our clients. It’s usually a matter of simplifying the general problem to a single attribute. Whether it’s the price forecast or another numeric value, the class of an object or segregation of objects into multiple groups, once you find this attribute, deciding the vendor and choosing what’s proposed will be simpler.
Bradford Cross, founding partner at DCVC, argues that ML-as-a-service isn’t a viable business model. According to him, it falls into the gap between data scientists, who are going to use open source products, and executives, who are going to buy tools that solve tasks at higher levels. However, it seems that the industry is currently overcoming its teething problems, and eventually we’ll see far more companies turning to ML-as-a-service to avoid expensive talent acquisition while still possessing versatile data tools.
This article was originally published on AltexSoft and re-published to TOPBOTS with permission from the author.