What is machine learning

Fernando Pavón

CEO of Gamco

In recent years, all topics related to Artificial Intelligence (AI) have aroused enormous interest. Perhaps this is because the core business of the big technology companies is driven by algorithms that learn from user data and optimize the service provided to those users; or because a multitude of tools and technologies have been made available to practitioners to help them implement software based on machine learning algorithms, deep learning, natural language processing or image processing.

We should not get carried away by the current fashion: everyone talks about and even knows about Artificial Intelligence. But far from being something very new or invented by American geniuses three days ago, there are professionals in Spain who have been dedicated to AI for many years, some since the 1990s. This is not an exaggeration: AI is a science that is implemented using technological tools. It is not just computer science, it is not just statistics, it is not just mathematics, or psychology or neurology.

It is a "new science" that began in the 1940s with the modeling of the neuron by Warren McCulloch and Walter Pitts. Its official beginning is usually dated to the summer of 1956, during the workshop held at Dartmouth College; but in 1950 Turing had already written his very famous article "Computing Machinery and Intelligence", with the formulation of the Turing test to determine whether a machine is intelligent or not. After the so-called "Artificial Intelligence Winter", it resurfaced in the second half of the 1980s with new paradigms and training algorithms for artificial neural networks, until the new golden age of our days.

This article will review one of the branches of Artificial Intelligence that is perhaps generating a greater number of practical applications: Machine Learning (ML). We will see what it consists of, training methods and practical examples; less sensationalist examples than those that appear from time to time in the newspapers, such as the Pentagon being able to predict what is going to happen in the next few days.

What is Machine Learning?

Machine Learning deals with how to build computer systems that automatically learn from experience and improve as new data becomes known. It attempts to model the learning processes observed in the nervous systems of living beings and especially in the human brain. One of the most widely used machine learning techniques today is Artificial Neural Networks, which are inspired by the cells that make up the human nervous system: neurons.

Sometimes it is useful to understand a concept by comparing it with what it is not. Machine Learning is often presented as "advanced statistics" or as a part of computing or programming; let us see why it is neither:

Statistics typically works with samples drawn from a larger data set, describing certain variables and usually starting a priori from a model that it attempts to fit to the observed sample. AI uses the entire available data set and all known variables, and builds a model automatically from the data. Artificial Intelligence (and Machine Learning as part of it) is an empirical science. Statistics is used to describe the results of the predictions and actions performed by AI.
In addition, machine learning studies certain issues not present in statistics, such as the architecture of the systems and algorithms needed to capture, store, index, retrieve and merge data effectively.
Computer science primarily addresses the question of how to program computer systems manually, whereas Machine Learning focuses on the question of how to get computers to "program" themselves, starting from an initial structure and using the experience gathered from data through training algorithms.

In addition to what we have seen in the previous two points, Machine Learning is closely related to the study of human and animal learning in psychology, neurology and related fields. The questions of how computers can learn and how animals learn will probably have closely related answers.

Evolution of Machine Learning

The origin of the current science of Machine Learning can be traced back to the work of our Nobel Prize winner Ramón y Cajal, who first described the nervous systems of living beings, including the neuron and synaptic processes. These studies were the basis for AI pioneers to model artificial neurons, giving rise to Artificial Neural Networks (ANN).

Based on the historical division offered by Russell and Norvig, the following historical stages can be distinguished:

When did Artificial Intelligence begin?

The first work that is generally recognized as belonging to Artificial Intelligence was done by Warren McCulloch and Walter Pitts. Both proposed an artificial neuron model in which each neuron was characterized by an "on-off" state.

McCulloch and Pitts also suggested that artificial neural networks could learn.

Donald Hebb developed a simple rule to modify the weight of connections between neurons. His rule, called "Hebbian Learning", is still a useful model today.

In 1950, Marvin Minsky and Dean Edmonds built the first neural computer: the SNARC, which simulated a network of 40 neurons.

This brief review of the principles of Artificial Intelligence cannot end without mentioning Alan Turing's influential work "Computing Machinery and Intelligence", where the famous Turing test was introduced, as well as the concepts of Machine Learning, Genetic Algorithms and Reinforcement Learning.

How did Artificial Intelligence start?

We can place the "official birth" in the summer of 1956 at Dartmouth College.

The father was John McCarthy, who convinced Minsky, Claude Shannon, and Nathaniel Rochester to bring together the most eminent researchers in the fields of automata theory, neural networks, and the study of intelligence to organize a two-month workshop in the summer of 1956.

At Dartmouth, the case was made for why a new discipline was needed instead of grouping AI studies within one of the existing disciplines.

Main reasons why AI should be considered a new discipline:

AI aims to duplicate human faculties such as creativity, self-learning or the use of language.
The methodology used comes from computer science, and Artificial Intelligence is the only specialty that tries to build machines that can function autonomously in complex and dynamic environments.

Great Expectations (1952-1969)

These were years of great enthusiasm because some very promising work appeared:

Arthur Samuel created in 1952 a program to play checkers that was able to "learn to play". In fact, the program ended up playing better than its creator. The program was shown on television in 1956.
In 1958, McCarthy created the Lisp language, which became the dominant language for AI for the next 30 years.
The neural networks introduced by McCulloch and Pitts also underwent important developments.
The Perceptron convergence theorem was developed (nowadays many artificial neural networks are of the "Multilayer Perceptron" type), which ensured that the learning algorithm could adjust the connection weights of a perceptron to match any input data, provided that such a match exists.

A Dose of Reality (1966-1973)

Many researchers in the new field of AI made bold predictions that never came true. For example, the Nobel laureate Herbert Simon predicted in 1957 that machines would be able to think, learn and create, so that they would soon surpass the human mind itself. This has evidently proved to be false.

One of the main difficulties of AI centered on fundamental limitations of the basic structures used to generate intelligent behavior. For example, in 1969 Minsky and Papert proved that the perceptron could actually learn very little.
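Both the convergence result and the limitation shown by Minsky and Papert can be reproduced in a few lines. The following is a minimal sketch in Python, not code from the original works: the names `train_perceptron` and `accuracy`, the learning rate and the number of epochs are assumptions made for illustration. A single perceptron learns the linearly separable AND function, but it can never fit XOR.

```python
def train_perceptron(samples, epochs=100, lr=1.0):
    """Classic perceptron rule: adjust weights only when the output is wrong."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            output = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            error = target - output          # -1, 0 or +1
            w[0] += lr * error * x1
            w[1] += lr * error * x2
            b += lr * error
    return w, b


def accuracy(samples, w, b):
    """Fraction of samples the trained perceptron classifies correctly."""
    hits = 0
    for (x1, x2), target in samples:
        output = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
        hits += output == target
    return hits / len(samples)


AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_model = train_perceptron(AND)
xor_model = train_perceptron(XOR)
print("AND accuracy:", accuracy(AND, *and_model))
print("XOR accuracy:", accuracy(XOR, *xor_model))
```

No straight line separates the two XOR classes, so the second accuracy stays below 1.0 however long the training runs. This is the kind of limitation that multilayer networks later overcame.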

The Return of Artificial Neural Networks and General Artificial Intelligence (1986-present)

In the mid-1980s, several research groups advanced the back-propagation learning algorithm for neural networks, specifically for the Multilayer Perceptron; the algorithm had originally been developed in 1974.

This algorithm was applied to many learning problems, and the dissemination of the results in the collection Parallel Distributed Processing caused a great deal of excitement.

In recent years, it has become more common to build on existing theories than to develop new ones. In this way, these theories are being endowed with the mathematical rigor they require, which is making it possible to demonstrate their effectiveness on real problems rather than in simulations or simple laboratory examples.

Some leading researchers in Artificial Intelligence and Machine Learning have expressed their dissatisfaction with the progress of Artificial Intelligence, thinking that rather than continuing to improve performance in certain areas or specific examples, AI should return to the principle expressed by Simon: "machines that think, learn and create". This trend has given rise to new lines of work such as Artificial General Intelligence (AGI).

The emergence of Big Data and AI development ecosystems (2011-present)

Finally, it is worth highlighting the quantitative and qualitative leap that Big Data technologies have made in the processing of large amounts of data. This, together with the development of environments and libraries by the major technology companies (Google, AWS, Facebook or Microsoft), allows systems based on Machine Learning algorithms to be implemented with relative ease. In recent years, even pre-trained models have become available, for example for image processing or natural language processing.

Perhaps the current state of the art lies not only in learning from data but also in producing original content. One example occurs in the field of natural language: the generation of text summaries and even the writing of original texts. Some of these developments are already being used for news writing, contract drafting (virtual lawyers), etc.

Why is Machine Learning important and what are its benefits?

Machine Learning is important because it makes it possible to successfully address problems that would otherwise be very difficult to solve. These are problems that meet mainly two characteristics:

The application is too complex for software developers to design and implement the algorithm manually. For example, anyone can recognize a familiar face in a photograph, but it is not so easy to write an algorithm to perform this task. Here, Machine Learning may be the software development technique of choice, because it is relatively easy to obtain a training set with labeled data (photographs that do or do not contain the known face), while it is very inefficient to hand-write an algorithm that recognizes the face in question in different photographs.
The software should automatically adjust to its operating context once it is "deployed". For example, speech recognizers must adjust themselves to the user who has acquired the software (their way of speaking, accent, syntactic construction of sentences, ...). Machine Learning provides the mechanisms for this software self-adaptation. Software applications that must adapt to users are growing very fast: online stores that adapt to user preferences, or email software that adapts to what each user considers "spam".

In short, if the "fundamental laws" that govern a system (industrial, economic or social) are not known or cannot be programmed, perhaps the only way to solve issues or problems of these systems is by using machine learning: from known historical data, AI can infer the behavior of these systems and therefore create models that explain it, and can predict its future evolution.

To list the benefits of Machine Learning we can focus on Artificial Neural Networks, a fundamental technique that underpins machine learning and whose main advantages can be listed as follows:

Adaptive learning: The ability to learn to perform tasks based on training or initial experience. This initial learning can be adapted as new data is learned, so the network's knowledge stays up to date and follows the reality reflected in the data.
Generalisation: From concrete data, neural networks can infer knowledge that allows them to respond adequately to new data. This is no more than a corollary of the first point: good learning from experience means that the acquired knowledge generalizes to the whole problem for which the neural network was trained. It is not a question of memorizing specific data, but of really learning.
Self-organization: A neural network can create its own organization or representation of the information it receives through a learning stage. This characteristic allows it to create the structure of its knowledge by itself and even to generate new knowledge. This is an attempt to approach the capacity of human beings to create from their knowledge and experiences, spurred by different needs.
Fault tolerance: On the one hand, the partial destruction of a network leads to a degradation of its structure; however, some network capabilities can be retained even if it suffers a great deal of damage. On the other hand, it is possible to learn from low-quality data or data with loss of information. Think, for example, of photographs in which certain pixels are saturated: image processing can "intuit" the possible values of those pixels. Or think of sensor signals in factories that carry electromagnetic noise: control algorithms based on predictive neural models are already working in these environments.
Real-time operation: Neural computations can be performed in parallel, and machines with special hardware are designed and manufactured to provide this capability. A neural network is nothing more than the interconnection of simple elements, neurons, organized in layers. These neurons are "independent" and can be executed in parallel, at least within the same layer.
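The independence of the neurons within one layer can be seen in a small sketch. The code below is illustrative only: the function names, the weights and the choice of a ReLU activation are assumptions, not something taken from a specific library. Each output depends only on the layer's inputs and on that neuron's own weights, which is why the neurons of a layer could be evaluated in any order, or at the same time on parallel hardware.

```python
def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs followed by a ReLU activation (assumed here)."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return max(0.0, total)


def layer_forward(inputs, layer_weights, layer_biases):
    """Evaluate every neuron of a layer; no neuron reads another neuron's output."""
    return [
        neuron_output(inputs, weights, bias)
        for weights, bias in zip(layer_weights, layer_biases)
    ]


inputs = [1.0, 2.0]
layer_weights = [[0.5, -1.0], [1.0, 1.0]]   # one row of weights per neuron
layer_biases = [0.0, -1.0]

outputs = layer_forward(inputs, layer_weights, layer_biases)
print(outputs)  # [0.0, 2.0]
```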

Machine Learning Techniques

To see the main Machine Learning techniques, we can start from the three main problems that can be faced with Artificial Intelligence: segmentation, classification and prediction.

Each type of problem and the most appropriate type of machine learning will be discussed:

Segmentation. The available data are composed of discrete variables or characteristics that are grouped into patterns, without a strong or dominant temporal component. There are no output variables. Segmentation is used to make an exploratory analysis of the available data, trying to find unknown patterns of behavior or to group these patterns into segments or groups that can be defined.

In order to segment a data set, so-called unsupervised learning is used. Starting from vectors of characteristics or variables, this kind of learning tries to find similarities and differences between them, forming groups of vectors or patterns that present similar characteristics.
By studying the characteristics of the patterns that make up a group, it is possible to see which are those that define that group and which are the most different from the rest of the groups.

If unsupervised learning techniques are used, such as the so-called self-organizing maps, the patterns of variables are "projected" on a map, usually in two dimensions, where those that have common or more similar characteristics are placed in the same segment and in neighboring segments. It is possible, as in any map, to see what is far away or near and therefore to know which behavior patterns are more similar and which are less similar. It is also possible to analyze "paths" on the map to study the difficulty of moving from one segment to another.
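As a minimal sketch of unsupervised grouping, the code below uses k-means rather than a self-organizing map, simply because it fits in a few lines; both techniques group patterns by similarity without using any output variable. The data, the function names and the deterministic initialization (the first `k` points become the initial centers) are assumptions made for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Index of the closest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
        for p in points
    ]


def kmeans(points, k, iterations=20):
    # Assumption: the first k points are used as initial centroids so that
    # the result is reproducible; real implementations choose them more carefully.
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        labels = assign(points, centroids)
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign(points, centroids), centroids


# Hypothetical patterns with two characteristics each.
patterns = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (0.9, 1.1), (8.8, 9.2), (9.1, 8.9)]
labels, centroids = kmeans(patterns, k=2)
print(labels)     # segment assigned to each pattern
print(centroids)  # "typical" pattern of each segment
```

Inspecting each centroid is a simple way of seeing which characteristics define a group and how far it is from the others.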

Classification. As in segmentation, the variables or features are discrete (they take values from a finite and usually small set of possibilities), but in this case there is a clear output for the system. To classify each feature pattern into one of a finite number of classes, supervised learning is used. The output defines the class to which the input pattern belongs. During training, the correct class of each pattern is known, so the neural network parameters can be adjusted, for example iteratively, until the correct output class is matched most of the time.
Prediction. In this case the available data are mainly in the form of a time series, or time is a very relevant characteristic of the problem. In such a problem it is important to define how far in advance the next value of the time series needs to be predicted (this is called the prediction horizon).

In this case, supervised learning is used. For each input vector at a given time, the output, which is usually a continuous variable, is predicted at a future time. During training, the value of that output variable in the future is known, and the error made (predicted value - actual future value) is used to readjust the neural network parameters.
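The training loop just described can be sketched with a single linear unit instead of a full neural network, which keeps the example short while preserving the idea: predict a value `horizon` steps ahead, measure the error against the value that actually occurred, and use that error to adjust the parameters. The synthetic series, the function names and the learning rate below are assumptions for illustration.

```python
def make_pairs(series, horizon):
    """Pair each value with the value observed `horizon` steps later."""
    return [(series[t], series[t + horizon]) for t in range(len(series) - horizon)]


def train_linear_unit(pairs, epochs=2000, lr=0.1):
    """Fit prediction = w * x + b by gradient descent on the squared error."""
    w, b = 0.0, 0.0
    n = len(pairs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, actual_future in pairs:
            error = (w * x + b) - actual_future   # predicted - actual future value
            grad_w += error * x
            grad_b += error
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


# Synthetic series that grows by 0.1 per step (an assumption for the example).
series = [0.1 * t for t in range(20)]
pairs = make_pairs(series, horizon=3)
w, b = train_linear_unit(pairs)
print(round(w, 3), round(b, 3))
```

For this series the value three steps ahead is the current value plus 0.3, so training should drive `w` close to 1 and `b` close to 0.3.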

In the previous problems we have seen supervised and unsupervised learning. If the problem we are facing is one of decision making in changing environments, reinforcement learning could be the technique to use. In these cases, the machine learning-based system must learn as it "plays" in that environment.

Reinforcement learning is sometimes loosely described as a semi-supervised learning model, because the agent is never told the correct action and only receives a reward signal. In this technique, an agent is allowed to interact with an environment in order to maximize rewards. The problem is usually formalized as a Markov decision process.
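To make the agent, environment and reward concrete, here is a minimal sketch of tabular Q-learning, one well-known reinforcement learning algorithm, on a toy Markov decision process. Everything about the environment is an assumption made for illustration: five states in a row, a reward of 1 for reaching the last state, and an agent that explores by choosing actions at random while it learns.

```python
import random

N_STATES = 5            # states 0..4; state 4 is the goal and ends the episode
ACTIONS = (-1, +1)      # move left or move right
GAMMA = 0.9             # discount factor for future rewards
ALPHA = 0.5             # learning rate


def step(state, action):
    """Environment dynamics: move within bounds, reward 1 on reaching the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def q_learning(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action index]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            a = rng.randrange(len(ACTIONS))      # purely random exploration
            next_state, reward = step(state, ACTIONS[a])
            target = reward + GAMMA * max(q[next_state])
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q


q = q_learning()
# Greedy policy: in each non-terminal state, pick the action with the highest value.
policy = [ACTIONS[row.index(max(row))] for row in q[:-1]]
print(policy)
```

After training, the greedy policy should choose to move right in every state, since that is the shortest path to the reward.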

Real-life examples

The following are some real-life examples pertaining to the three or four types of machine learning problems and techniques discussed in the previous section. The number of examples will necessarily be very small, not even a brief sample of the large number of current applications of machine learning.

1. Prediction of customer credit defaults.

It is very important for financial institutions to anticipate possible payment problems of their credit customers.

It is a classification problem, where a client's pattern of characteristics can belong to one of two possible classes: will repay the loans or will not repay the loans.

Patterns of characteristics will be formed from historical clients (their credits, account movements, ...) and each pattern will be assigned the class to which it belongs: the client repaid the credits or did not repay them. Through supervised learning, models will be trained so that, when faced with a new pattern of client characteristics, they evaluate to which class it belongs: clients who will repay their loans or clients who will not.
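A minimal sketch of this two-class setup is shown below, using logistic regression in place of a neural network to keep it short. The two features (a debt ratio and a count of missed payments), the tiny data set and all function names are hypothetical; a real model would use many more clients and variables.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(samples, epochs=5000, lr=0.5):
    """Batch gradient descent on the logistic (cross-entropy) loss."""
    n_features = len(samples[0][0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, label in samples:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            for i, xi in enumerate(x):
                grad_w[i] += (p - label) * xi
            grad_b += p - label
        w = [wi - lr * g / len(samples) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(samples)
    return w, b


def predict_default(x, w, b):
    """True if the model puts the client in the 'will not repay' class."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5


# Hypothetical history: (debt_ratio, missed_payments) -> 1 if the client defaulted.
history = [
    ((0.10, 0), 0), ((0.20, 0), 0), ((0.30, 1), 0), ((0.25, 0), 0),
    ((0.80, 3), 1), ((0.90, 2), 1), ((0.70, 4), 1), ((0.60, 3), 1),
]
w, b = train_logistic(history)
print(predict_default((0.15, 0), w, b))   # client resembling past payers
print(predict_default((0.75, 3), w, b))   # client resembling past defaulters
```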

2. Short-term electricity consumption demand forecasting.

One of the great problems of electrical energy is its storage, so it is important to know what its consumption will be in the short term. It is a prediction problem where given the historical consumption every day and every hour, at a given time you want to predict the consumption at a future time. For example, at a given time you want to predict the consumption 48 hours later.

This prediction problem will use previous consumption variables (for example: one hour ago, two hours ago, 24 hours ago, ...), plus some variables external to the time series of hourly consumption (for example: current and predicted temperature, whether 48 hours from now is a holiday, ...) to predict the consumption 48 hours later. Through supervised learning, predictive models will be trained to predict future consumption, calculating the error made with respect to the actual consumption and using this error to readjust the parameters of the predictive model.
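The construction of one training example for this problem can be sketched as follows. The function name `build_example`, the particular lags and the single external variable (a holiday flag) are assumptions chosen to mirror the description above; temperature or other external variables would be appended in the same way.

```python
def build_example(consumption, is_holiday, t, lags=(1, 2, 24), horizon=48):
    """Return (input vector, target) for a forecast made at hour t.

    consumption: hourly consumption values, indexed by hour
    is_holiday:  one boolean per hour, known in advance from the calendar
    """
    features = [consumption[t - lag] for lag in lags]         # past consumption
    features.append(1.0 if is_holiday[t + horizon] else 0.0)  # external variable
    target = consumption[t + horizon]                         # value to predict
    return features, target


# Hypothetical data: 200 hours where consumption simply equals the hour index.
consumption = [float(hour) for hour in range(200)]
is_holiday = [False] * 200
is_holiday[148] = True

features, target = build_example(consumption, is_holiday, t=100)
print(features, target)  # [99.0, 98.0, 76.0, 1.0] 148.0
```

A list of such pairs, one per historical hour, is what a supervised model would be trained on.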

3. To know how many types of customers buy a company's products and what are their most relevant characteristics.

The vast majority of companies want to know their customers. Often they do not know exactly what they are looking for: they have collected a large number of variables about their customers and want to exploit them "to see what comes out", that is, to find which characteristics that matter to the business define their customers. These variables are mainly socio-economic variables, commercial transactions with the company and characteristics of the products or services purchased.

This is a segmentation problem, where a pattern of characteristics is created from the known variables and, by means of unsupervised learning, segments of similar customer patterns are created automatically. These segments make it easier to analyze the characteristics of the customer patterns that have fallen into each one. Often the analysis is made easier by representing some concrete characteristic of the patterns: for example, the average or maximum sales of the customers in a segment, which two products have been purchased most, or through which channel they make their purchases (physical or online stores).
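Once each customer has been assigned to a segment, the per-segment analysis mentioned above reduces to simple aggregation. The sketch below assumes the segment labels already exist (for example, produced by an unsupervised model); the record fields `segment`, `sales` and `channel` and the sample customers are hypothetical.

```python
from collections import defaultdict


def summarize_segments(customers):
    """Average sales, maximum sales and share of online buyers per segment."""
    by_segment = defaultdict(list)
    for customer in customers:
        by_segment[customer["segment"]].append(customer)

    summary = {}
    for segment, members in by_segment.items():
        sales = [m["sales"] for m in members]
        online = sum(1 for m in members if m["channel"] == "online")
        summary[segment] = {
            "customers": len(members),
            "avg_sales": sum(sales) / len(sales),
            "max_sales": max(sales),
            "online_share": online / len(members),
        }
    return summary


# Hypothetical customers already labeled with a segment.
customers = [
    {"segment": 0, "sales": 100.0, "channel": "online"},
    {"segment": 0, "sales": 300.0, "channel": "physical"},
    {"segment": 1, "sales": 50.0, "channel": "online"},
]
summary = summarize_segments(customers)
for segment, stats in sorted(summary.items()):
    print(segment, stats)
```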

4. Resource allocation in a computational cluster

The allocation of limited resources to different tasks usually involves the development of algorithms that implement rules defined by human experts. Reinforcement learning can be used to allocate and schedule computing resources to queued jobs, with the goal of minimizing the time it takes for jobs to run.

The agent (the software component of reinforcement learning) knows the current state of resource allocation and a profile of the queued jobs, and it can choose among several actions each time it has to make a decision. The reward can be defined as the inverse of the time it takes to execute the jobs (the longer the time, the lower the reward, and the shorter the time, the higher the reward). Reinforcement learning will iteratively define the policy that maximizes the total reward (and therefore minimizes the execution time).
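The reward definition can be illustrated with a very small simulation. This sketch makes several simplifying assumptions that are not in the text: a single machine runs one job at a time, "time" is measured as the mean completion time of the jobs, and the two schedules compared are fixed orderings rather than a learned policy. It only shows how a reward of this form ranks alternative decisions.

```python
def mean_completion_time(durations):
    """Average time at which each job finishes when run in the given order."""
    elapsed = 0.0
    total = 0.0
    for duration in durations:
        elapsed += duration
        total += elapsed
    return total / len(durations)


def reward(durations):
    """Inverse of the time: shorter completion times give higher rewards."""
    return 1.0 / mean_completion_time(durations)


# Hypothetical queue of jobs with known durations (in hours).
jobs = [3.0, 1.0, 2.0]

arrival_order_reward = reward(jobs)             # run jobs as they arrived
shortest_first_reward = reward(sorted(jobs))    # run the shortest job first
print(arrival_order_reward, shortest_first_reward)
```

An agent trained against this reward would be pushed toward schedules like the second one, which yields the higher value.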

5. Define which products or services should be offered to which customers because they have a higher probability of being purchased by those customers.

Practically all companies need to know which of their products or services may be required by a particular customer at a given time, in order not to lose that business opportunity. This is a classification problem. However, contrary to what we have seen in example 1, the classes are not binary: there are many of them, for example one for each product in the portfolio. The supervised learning algorithm would use patterns of customer characteristics labeled with the product each customer bought at a given time, learning how to assign the right product to a customer at each moment.
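A multi-class version of the classification idea can be sketched with a nearest-centroid classifier, a much simpler supervised technique than a neural network, used here only to show one class per product. The features (age and a yearly income figure), the product names and the purchase history are all hypothetical.

```python
def fit_centroids(history):
    """Average feature vector of the customers who bought each product."""
    sums, counts = {}, {}
    for features, product in history:
        acc = sums.setdefault(product, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[product] = counts.get(product, 0) + 1
    return {p: [v / counts[p] for v in acc] for p, acc in sums.items()}


def recommend(features, centroids):
    """Product whose typical buyer is closest to this customer."""
    def sq_dist(centroid):
        return sum((f - c) ** 2 for f, c in zip(features, centroid))
    return min(centroids, key=lambda product: sq_dist(centroids[product]))


# Hypothetical history: (age, income in tens of thousands) -> product bought.
history = [
    ((25, 1.0), "student_account"), ((23, 0.8), "student_account"),
    ((45, 6.0), "mortgage"), ((50, 7.0), "mortgage"),
    ((68, 3.0), "pension_plan"), ((72, 2.5), "pension_plan"),
]
centroids = fit_centroids(history)
print(recommend((27, 1.2), centroids))
print(recommend((48, 5.0), centroids))
print(recommend((66, 3.0), centroids))
```

In practice the features would need scaling so that no single variable dominates the distance, and a model that outputs a probability per product would allow several offers to be ranked.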

Conclusion

In this article we have tried to show that Machine Learning is a science with practical applications, which solves complex and current business problems. This science already has a long enough history to have enough knowledge, experience and technologies to build robust systems based on machine learning.

Due to the limited length of this article, it has not been possible to include other experiences and applications that we use practically every day: Internet search engines, face and object recognition in photographs taken with cell phones, photo search in those same cell phones, driving assistants in cars, air traffic control assistants, defense and security applications, and many more.

We have seen why Machine Learning is important and what characteristics a problem must have for it to make sense to apply Artificial Intelligence techniques, and how, based on the type of problem to be solved, it is possible to know which machine learning technique may be more appropriate.

In conclusion, and even at the risk of seeming exaggerated, I would like to emphasize that Artificial Intelligence will be for companies the science that will make the difference between two companies competing in the same sector, that will provide the most important competitive advantages and that, in short, will determine which company will survive and grow, and which will simply maintain its inertia with difficulty and even disappear.

The above approach is justified by two characteristics of a good implementation of AI-based solutions in enterprises:

It automates processes, making them much more efficient.
It customizes services and products on a customer-by-customer basis, improving the company's service to the customer while maintaining or improving operating costs.

If we think about the two previous points, the company continues to do the same as before but better, much more efficiently while improving the service given to its customers.

This is important, but the best thing is that it frees talent from repetitive tasks that add little value. Talent is the most valuable asset that companies have, and thanks to Machine Learning it can be dedicated to tasks that bring a lot of value to customers and at the same time are stimulating for "natural intelligence".
