May 1, 2022

Backpropagation Basics

Artificial Neural Networks (ANNs), a type of artificial-intelligence learning algorithm, are designed to resemble the brain's structure and its power to execute processes efficiently. Inspired by the brain's ability to modify synaptic connections between neurons during learning, researchers have long tried to imitate this process with artificial intelligence.

However, how individual synaptic modifications produce behavioral consequences remains a mystery to scientists.(1) Artificial neural networks use synaptic connections to achieve machine learning without biological restrictions, and deep neural networks stack multiple layers of neurons to complete tasks with high efficiency.

To study how synaptic updates can improve performance within these networks, researchers first examine the architecture of the network. They then use "error functions" to measure how well or poorly the algorithm achieves its goal, and the results guide the search for algorithms that compute the changes needed to minimize that error. Backpropagation of error, or "backprop," is the most frequently used algorithm for successfully training deep neural networks (e.g., for speech and image recognition or language translation). Combined with reinforcement learning, backprop has also solved several control problems.(1)
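
As a loose illustration of the idea above (not taken from the cited work), the sketch below minimizes a simple mean-squared "error function" for a single connection weight by gradient descent; all numbers are made up.

    # Illustrative only: an "error function" scores predictions against a goal,
    # and learning searches for the weight change that reduces that error.
    inputs  = [1.0, 2.0, 3.0]
    targets = [2.0, 4.0, 6.0]   # the goal: output = 2 * input

    w = 0.1                     # initial connection strength (weight)
    learning_rate = 0.05

    for step in range(100):
        # gradient of the mean-squared error with respect to w
        grad = sum(2 * (w * x - t) * x for x, t in zip(inputs, targets)) / len(inputs)
        w -= learning_rate * grad          # move w in the direction that lowers the error

    final_error = sum((w * x - t) ** 2 for x, t in zip(inputs, targets)) / len(inputs)
    print(round(w, 3), round(final_error, 6))   # w approaches 2.0 as the error shrinks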

Backprop usually requires explicit output targets, which serve as references for creating error signals through feedback connections. This allows the appropriate adjustments to be made to specific synapses so that the network better fits the established goals. Using the chain rule of calculus, backprop calculates how small changes in the strength of each connection influence the total network error, taking into account the effect of those changes on all downstream neurons. Error calculation starts in the final layer and propagates backward: error signals are calculated for every neuron in the network, and the final output error is then reduced by changing the postsynaptic activity of each neuron in the direction specified by its error signal.(1,2)
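
The following sketch, with made-up data and layer sizes, shows the mechanism just described for a tiny two-layer network in NumPy: the error is computed at the output layer, propagated backward through the (transposed) forward weights via the chain rule, and turned into a synapse-specific update for every connection.

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=(4, 3))               # 4 example inputs, 3 features each
    y = rng.normal(size=(4, 2))               # explicit output targets

    W1 = rng.normal(scale=0.5, size=(3, 5))   # input -> hidden synapses
    W2 = rng.normal(scale=0.5, size=(5, 2))   # hidden -> output synapses
    lr = 0.05

    for step in range(500):
        # forward pass
        h = np.tanh(x @ W1)                   # hidden-layer activity
        y_hat = h @ W2                        # network output

        # error signal at the final layer (squared-error loss)
        output_error = y_hat - y

        # chain rule: propagate the error backward to the hidden layer
        hidden_error = (output_error @ W2.T) * (1 - h ** 2)   # tanh derivative

        # synapse-specific updates in the direction that reduces the error
        W2 -= lr * h.T @ output_error
        W1 -= lr * x.T @ hidden_error

    print(np.mean((np.tanh(x @ W1) @ W2 - y) ** 2))   # total error shrinks over training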

This process has been postulated to work similarly in the human brain. For example, consider a child learning the sounds of letters: the input is the shape of a letter, and the predicted output is the sound the child produces on the first trial. If the parent corrects the pronunciation (providing a target output), the child will modify the sound it produces for that letter on the next trial.(3)

One of the best qualities of backprop is how quickly it finds useful "internal representations" of inputs. Internal representations are the hidden activities of the network that encode the input data; they are not specified in advance and must be discovered during learning. For example, a network trained on handwritten digits may learn to detect oriented edges, a generalization extracted from the raw input. Backprop also has two characteristics that are crucial to its functioning: first, it prescribes synapse-specific changes; second, it needs feedback connections to send error information back to the hidden layers of neurons and compute the synaptic adjustments.(1)
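
As a hypothetical illustration, the snippet below treats the hidden-layer activity of such a network as its internal representation of an input; the weights here are random stand-ins for trained ones, and the only point is that nearby inputs map to similar hidden codes, a representation no one specified by hand.

    import numpy as np

    rng = np.random.default_rng(1)
    W1 = rng.normal(scale=0.5, size=(3, 5))   # stand-in for trained input->hidden weights

    def internal_representation(x):
        """Return the hidden-layer activity that represents the input x."""
        return np.tanh(x @ W1)

    x_a = rng.normal(size=(1, 3))
    x_b = x_a + 0.05 * rng.normal(size=(1, 3))          # a slightly perturbed input
    h_a, h_b = internal_representation(x_a), internal_representation(x_b)
    print(np.max(np.abs(h_a - h_b)))                    # small: similar inputs, similar codes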

It was initially thought that the brain might learn using backpropagation; however, several aspects of machine-learning backprop are biologically implausible, which suggests that the brain relies on even more specific and sophisticated mechanisms. Those implausible aspects include the lack of a local error representation, the required symmetry between forward and backward weights, and the use of unrealistic neuron models with continuous outputs rather than spikes, among others.(3)

Even though backprop did not turn out to be the neuroscience breakthrough it was once thought to be, it offers a broad framework for understanding how the cortex might learn, and it allows researchers to train multilayer neural networks capable of competing with human abilities.(2)
