In recent years we have seen a growing number of edge devices adopted by consumers, in their homes (e.g., smart cameras and doorbells), in their cars (e.g., driver-assistance systems), and even on their persons (e.g., smart watches and rings). Similar growth is reported in industries including aerospace, agriculture, healthcare, transport, and manufacturing. At the same time that devices are getting smaller, the Deep Neural Networks (DNN) that power most forms of artificial intelligence are getting larger, requiring more compute power, memory, and bandwidth. This creates a growing disconnect between advances in artificial intelligence and the ability to develop smart devices at the edge. In this paper, we present a novel approach to running state-of-the-art AI algorithms at the edge. We propose two efficient approximations to standard convolutional neural networks: Binary-Weight-Networks (BWN) and XNOR-Networks. In BWN, the filters are approximated with binary values, resulting in a 32x memory saving. In XNOR-Networks, both the filters and the input to convolutional layers are binary. XNOR-Networks approximate convolutions using primarily binary operations. This results in 58x faster convolutional operations (in terms of the number of high-precision operations) and 32x memory savings. XNOR-Nets offer the possibility of running state-of-the-art networks on CPUs (rather than GPUs) in real time. Our binary networks are simple, accurate, efficient, and work on challenging visual tasks. We evaluate our approach on the ImageNet classification task. The classification accuracy with a BWN version of AlexNet is the same as that of the full-precision AlexNet. Our code is available at: http://allenai.org/plato/xnornet.
In recent years, the approach of using Deep Neural Networks (DNN) to create artificial intelligence has been highly successful in teaching computers to recognize8,11,17,18 and detect4,5,16 objects, read text, and understand speech.7 Such capabilities could have significant impacts on industries such as healthcare, agriculture, aerospace, transport, and manufacturing, yet to date there are few real-world applications of DNN and Deep Convolutional Neural Networks (DCNN). While some progress has been made with virtual reality (VR by Oculus),13 augmented reality (AR by HoloLens),6 and smart wearable devices, the majority of applications rely on edge devices that have limited or no bandwidth, are low-powered, and require the data to be stored locally for privacy and security reasons. These constraints are at odds with the current state-of-the-art CNNs and DCNNs, which require large amounts of compute power and are therefore currently limited to the cloud.
CNN-based recognition systems need large amounts of memory and computational power. While they perform well on expensive, GPU-based machines, they are often unsuitable for smaller devices like cell phones and embedded electronics. For example, AlexNet,11 one of the most well-known DNN architectures for image classification, has 61M parameters (249 MB of memory) and performs 1.5B high-precision operations to classify one image. These numbers are even higher for deeper CNNs, for example, VGG17 (see Section 3.1). These models quickly overtax the limited storage, battery power, and compute capabilities of smaller devices like cell phones.
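To make the memory figures concrete, the following is a minimal back-of-the-envelope sketch, not part of our released code. It assumes each full-precision weight is stored as a 32-bit float and each binary weight as a single bit, and it uses the rounded count of 61M parameters, so the full-precision estimate (244 MB) comes out slightly below the 249 MB quoted above. The helper `model_size_mb` and the constant `ALEXNET_PARAMS` are hypothetical names introduced only for this illustration, and the estimate ignores any values that may be kept in full precision alongside the binary weights.

```python
def model_size_mb(num_params: int, bits_per_weight: int) -> float:
    """Storage needed for the weights alone, in megabytes (1 MB = 10**6 bytes)."""
    return num_params * bits_per_weight / 8 / 1e6


# Rounded parameter count quoted in the text (assumption: exactly 61M).
ALEXNET_PARAMS = 61_000_000

# Assumption: 32-bit floats for full precision, 1 bit per binary weight.
full_precision_mb = model_size_mb(ALEXNET_PARAMS, bits_per_weight=32)
binary_mb = model_size_mb(ALEXNET_PARAMS, bits_per_weight=1)
saving = full_precision_mb / binary_mb

print(f"full precision: {full_precision_mb:.1f} MB")
print(f"binary weights: {binary_mb:.2f} MB")
print(f"memory saving:  {saving:.0f}x")
```

Under these assumptions the ratio depends only on the bit widths (32 bits versus 1 bit), which is where the 32x memory saving comes from; the parameter count changes the absolute sizes but not the ratio.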