Scientists have developed a method for reducing the power consumption of neural networks on mobile platforms





The breakthroughs of recent years in artificial intelligence systems for autonomous driving, speech recognition, machine vision, and automatic translation became possible thanks to the development of artificial neural networks. But training and running these networks requires a great deal of memory and energy. As a result, AI components often run on servers in the cloud and exchange data with desktop or mobile devices.



Neural networks consist of thousands of simple but densely interconnected information-processing nodes, usually organized into layers. Networks differ in the number of layers, the number of nodes in each layer, and the connections between nodes.
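The layered structure described above can be sketched in a few lines of Python; the layer sizes and the ReLU nonlinearity here are illustrative choices, not details from the article:

```python
import numpy as np

rng = np.random.default_rng(0)
layer_sizes = [4, 8, 3]  # input, hidden, output node counts (arbitrary)

# One weight matrix per pair of adjacent layers: each entry is the
# weight of a connection between a node in one layer and the next.
weights = [rng.standard_normal((m, n))
           for m, n in zip(layer_sizes, layer_sizes[1:])]

def forward(x, weights):
    """Propagate an input through the layers, one matrix product each."""
    for w in weights:
        x = np.maximum(x @ w, 0.0)  # ReLU nonlinearity
    return x

out = forward(rng.standard_normal(4), weights)
print(out.shape)
```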



Each connection between nodes carries a weight, which determines how much that node's output contributes to the computation performed by the next node. During training, when the network is presented with examples of the computation it must learn to perform, these weights are repeatedly adjusted until the output of the network's last layer matches the desired result.
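A minimal illustration of this weight adjustment, using a single node and plain gradient descent on a squared error; the inputs, learning rate, and target are made up for the example:

```python
import numpy as np

# One node: its output is the weighted sum of its inputs.
inputs = np.array([0.5, -1.0, 2.0])
w = np.array([0.1, 0.4, -0.2])
target = 1.0  # the result the node should learn to produce

for _ in range(100):
    y = inputs @ w                 # node output for this example
    grad = (y - target) * inputs   # gradient of squared error w.r.t. weights
    w -= 0.1 * grad                # adjust weights toward the target

print(round(float(inputs @ w), 3))  # → 1.0
```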



Which network will be more energy efficient: a small network with larger weights, or a deeper network with smaller weights? Many researchers have tried to answer such questions. Recently, much of the deep learning community's effort has gone into developing efficient neural network architectures for platforms with limited computing power. However, most of this research has focused on reducing either model size or computation, while for smartphones and many other devices energy consumption is of paramount importance because of battery life and thermal limits.



Researchers at the Massachusetts Institute of Technology (MIT), led by Vivienne Sze, Associate Professor of Electrical Engineering and Computer Science, have developed a new approach to optimizing convolutional neural networks that focuses on minimizing energy consumption, using a new tool for estimating it.



In 2016, Vivienne Sze and her colleagues presented an energy-efficient computer chip optimized for neural networks. The chip allows powerful artificial intelligence systems to run locally on mobile devices. Now the researchers have approached the problem from the other side and created several techniques for designing more energy-efficient neural networks.



First, the team developed an analytical method that can determine how much energy a neural network consumes when running on a particular type of hardware. The scientists then used the method to evaluate new techniques for optimizing neural networks so that they run more efficiently on handheld devices.



The researchers will present their work at the Conference on Computer Vision and Pattern Recognition. In the paper, they describe methods that, by their account, reduce energy consumption by 73% compared with a standard neural network implementation and outperform existing methods for optimizing neural networks for mobile platforms by 43%.



The first thing Sze's team did was develop an energy-modeling tool that accounts for computations, data movement, and data flow. Given a network architecture and the values of its weights, the tool reports how much energy that network will use. The technique shows where the energy is spent, so algorithm designers can better understand their networks and use this information as a form of feedback.
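The article does not describe the tool's internals, but a toy version of such an estimate might charge a small cost per arithmetic operation and a much larger cost per value moved from off-chip memory. The constants and layer sizes below are hypothetical, not figures from the paper:

```python
# Hypothetical per-operation energy costs (picojoules); real values are
# hardware specific and are not given in the article.
E_MAC = 1.0       # one multiply-accumulate
E_DRAM = 200.0    # moving one value from off-chip memory

def layer_energy(in_features, out_features):
    """Crude energy estimate for one fully connected layer."""
    macs = in_features * out_features
    # Data moved: all weights plus input and output activations.
    moved = in_features * out_features + in_features + out_features
    return macs * E_MAC + moved * E_DRAM

# Energy estimate for a small two-layer network.
layers = [(784, 300), (300, 10)]
total = sum(layer_energy(i, o) for i, o in layers)
print(f"{total / 1e6:.2f} microjoules")
```

Note that with costs like these, data movement dominates arithmetic, which is why an energy model must track more than the operation count.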



Once the researchers knew which operations the energy was being spent on, they used the model to guide the design of energy-efficient neural networks. Sze explains that other scientists trying to reduce the power consumption of neural networks previously relied on pruning. Low-weight connections between nodes have very little effect on a network's final output, so many of them can be safely eliminated.
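Classic magnitude-based pruning, as described here, can be sketched like this; the 50% pruning fraction and matrix size are arbitrary examples:

```python
import numpy as np

rng = np.random.default_rng(1)
w = rng.standard_normal((6, 6))  # toy weight matrix

def magnitude_prune(weights, fraction=0.5):
    """Zero out the smallest-magnitude fraction of connections."""
    threshold = np.quantile(np.abs(weights), fraction)
    return np.where(np.abs(weights) < threshold, 0.0, weights)

pruned = magnitude_prune(w, 0.5)
print(np.count_nonzero(pruned), "of", w.size, "connections kept")
```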



With the help of the new model, Sze and her colleagues refined this approach. Although cutting a large number of low-weight connections individually has little effect on a network's output, cutting all of them at once would likely degrade its performance more seriously, so a mechanism was needed to decide when to stop. The MIT researchers therefore prune first in the layers of the network that consume the most energy, which yields the greatest possible savings. They call this method energy-aware pruning.
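A toy sketch of the energy-aware idea: always prune in whichever layer the energy model currently ranks as most expensive. The energy model below, a simple count of nonzero connections, is a deliberately crude stand-in for the paper's hardware model, and the layer sizes and pruning schedule are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
layers = [rng.standard_normal(s) for s in [(8, 16), (16, 16), (16, 4)]]

def estimate_energy(w):
    # Toy stand-in for the paper's energy model: proportional to the
    # number of remaining (nonzero) connections.
    return np.count_nonzero(w)

def prune_step(w, fraction=0.2):
    # Remove the smallest-magnitude remaining connections in one layer.
    nz = np.abs(w[w != 0])
    if nz.size == 0:
        return w
    thr = np.quantile(nz, fraction)
    return np.where(np.abs(w) < thr, 0.0, w)

def energy_aware_prune(layers, rounds=5):
    layers = [w.copy() for w in layers]
    for _ in range(rounds):
        # Attack the layer that currently consumes the most energy,
        # so each pruning step buys the largest saving.
        i = max(range(len(layers)), key=lambda j: estimate_energy(layers[j]))
        layers[i] = prune_step(layers[i])
    return layers

before = sum(estimate_energy(w) for w in layers)
after = sum(estimate_energy(w) for w in energy_aware_prune(layers))
print(before, "->", after)
```

A real implementation would also check validation accuracy after each round to decide when to stop, which is the mechanism the paragraph above alludes to.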



Weights in a neural network can be positive or negative, so the researchers' method also handles cases where connections with weights of opposite sign tend to cancel each other out. A node's inputs are the outputs of the nodes in the layer below, multiplied by the weights of their connections. So the method considers not only the weights themselves but also how the connected nodes process data during training.



If groups of connections with positive and negative weights consistently cancel one another, they can be safely cut together. According to the researchers, this yields more efficient networks, with fewer connections, than earlier pruning methods produced.
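The cancellation effect can be shown numerically: two connections whose weighted contributions are nearly equal and opposite barely change a node's output when removed together, even though each matters individually. The numbers here are contrived for illustration:

```python
import numpy as np

# Inputs to one node and the weights of its incoming connections.
# w[2]*x[2] and w[3]*x[3] are equal and opposite, so they cancel.
x = np.array([1.0, 2.0, 0.5, 0.8])
w = np.array([0.9, 0.05, 1.6, -1.0])

full = x @ w                                  # all connections kept
drop_one = x @ (w * np.array([1, 1, 0, 1]))   # cut only one of the pair
drop_pair = x @ (w * np.array([1, 1, 0, 0]))  # cut the canceling pair

print(round(full, 3), round(drop_one, 3), round(drop_pair, 3))
# → 1.0 0.2 1.0
```

Cutting one connection of the pair shifts the output by 0.8, while cutting both leaves it unchanged, which is why the method looks at signed groups of connections rather than individual weights.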


