The convergence of AI and IoT, are we there yet?
Well, yes and no.
AI is converging with IoT, at least in the context of big data. Machine learning is applied to the massive amount of data collected by IoT devices to identify patterns and detect anomalies. It could be argued that many IoT applications will be worthless without AI to derive value from the data collected.
But when it comes to deploying AI algorithms on the IoT devices themselves to perform inference locally, in other words, making decisions directly based on the sensor input, there have not been many commercial deployments, apart from a few use cases and trials in drones, surveillance cameras and self-driving cars.
AI deployment on edge devices can be especially valuable for mission-critical applications that require low-latency decisions, or for situations where reliable connectivity to the cloud is not a given. According to IDC, at least 40 percent of IoT-created data will be stored, processed, analyzed, and acted upon close to or at the edge of the network by 2019. So how do we get there? To answer that, let's first go back to the basics of AI and get a good understanding of where we are today.

Figure 1: A Venn diagram showing how deep learning stands in the bigger picture of AI (reproduced from the book Deep Learning by Ian Goodfellow)
Basics of deep learning
It would be quite out of fashion to talk about AI today without mentioning deep learning. Deep learning is but one of many machine learning techniques within the huge AI universe, but recent advancements in hardware and algorithms have unlocked its powerful capabilities, manifested through milestones like AlphaGo and the ImageNet competition. You can find out more about deep learning in this excellent book from MIT, but below I will just highlight a few basics to lay the ground for our discussion.
As a concept, deep learning's origins date back to the 1940s. It is a particular kind of machine learning that learns to represent the world as a nested hierarchy of concepts, with each concept defined in relation to simpler ones. Over the years of development, however, it has become almost synonymous with the artificial neural network, the actual mathematical model that implements deep learning.
The artificial neural network is inspired by our brain, where billions of neurons connect to each other and somehow create intelligence. While we still don't understand that "somehow" part, an artificial neural network mimics the brain's structure in that it consists of layers upon layers of connected "artificial neurons". Despite the name, these artificial neurons have no biological function; they are abstract nodes that apply a mathematical formula to perform a simple calculation. Each neuron can have multiple inputs but only one output, and each input is weighted differently in the calculation.
As layers of artificial neurons form a neural network, the artificial neurons in the first layer take input from the data source, like the color of a pixel of an image or the readings from sensors. After a simple calculation, their output is forwarded as input to the artificial neurons in the next layer, and so on, until a final output is produced by the last, or output, layer. The more layers between the data input and the final output, the deeper the network. The neural network is essentially defined by the weights applied to the inputs of each artificial neuron and the parameters used in its formula.
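The flow described above can be sketched in a few lines of plain Python. The layer sizes, weight values and sigmoid activation below are illustrative choices, not taken from any real model:

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: a weighted sum of its inputs plus a bias,
    squashed through a sigmoid to produce a single output."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weight_rows, biases):
    """A layer is just a list of neurons sharing the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

# A tiny 2-input -> 2-hidden -> 1-output network with made-up weights.
x = [0.5, 0.8]                                        # e.g. two sensor readings
hidden = layer(x, [[0.1, 0.4], [-0.3, 0.2]], [0.0, 0.1])
output = layer(hidden, [[0.7, -0.5]], [0.2])          # final output layer
print(output)  # a single value between 0 and 1
```

Stacking more `layer` calls between input and output is exactly what makes the network "deeper".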
A functioning neural network needs to go through two stages: a training stage and an inference stage.
In a typical training stage (for supervised learning), the neural network is fed a large amount of labelled data, and an optimization algorithm continuously adjusts the weight parameters to map the input examples to an output until the difference from the correct responses is minimized. Since a neural network can be extremely complex (Baidu's Deep Speech 2 has 300 million parameters, while Google's Neural Machine Translation model has an enormous 8.7 billion), such an iterative training process is extremely demanding on computation power. So far, the training for most commercial applications of deep learning can only be done in power-hungry dedicated server farms. This requirement for large processing bandwidth has led to the rise of GPUs, FPGAs and ASICs over conventional CPUs for AI training.
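As a toy illustration of that iterative adjustment, the sketch below trains a single artificial neuron with gradient descent on four hand-made labelled examples (here, the logical OR function); real training differs mainly in scale, not in principle:

```python
import math

# Toy labelled dataset: inputs and their correct outputs (OR function).
data = [([0.0, 0.0], 0), ([0.0, 1.0], 1), ([1.0, 0.0], 1), ([1.0, 1.0], 1)]

w, b = [0.0, 0.0], 0.0   # untrained weights and bias
lr = 0.5                 # learning rate: how big each adjustment step is

def predict(x):
    z = sum(xi * wi for xi, wi in zip(x, w)) + b
    return 1.0 / (1.0 + math.exp(-z))

# Repeatedly nudge the weights to shrink the gap between the
# prediction and the correct label (gradient descent).
for epoch in range(1000):
    for x, y in data:
        err = predict(x) - y
        for i in range(len(w)):
            w[i] -= lr * err * x[i]
        b -= lr * err

print([round(predict(x)) for x, _ in data])  # → [0, 1, 1, 1]
```

Scaling this loop from two weights to hundreds of millions is what makes training so computationally demanding.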
In the second stage, inference, the trained model (which essentially comprises a file describing the structure of the neural network and another file listing the values of all the weight parameters) is compiled as an application or inference engine running in the production environment, such as a datacentre. This engine then takes in the real-world data and feeds it forward through the network to perform inference. Since there is no need for the iterative optimization process, the computation resource requirement, while still significant, is orders of magnitude lower than that required in the training process.
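A minimal sketch of what such an inference pass looks like once the weights are frozen, assuming NumPy; the layer sizes and random weights here are stand-ins for what would be loaded from the model files:

```python
import numpy as np

# In production these arrays would be loaded from the weight file;
# random values stand in for a trained 4-input, 8-hidden, 2-output model.
rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((4, 8)), np.zeros(8)
W2, b2 = rng.standard_normal((8, 2)), np.zeros(2)

def infer(x):
    """A single forward pass: no optimization loop, just matrix math."""
    h = np.maximum(x @ W1 + b1, 0.0)   # hidden layer with ReLU activation
    return h @ W2 + b2                 # output scores

scores = infer(np.array([0.1, 0.2, 0.3, 0.4]))
print(scores.shape)  # (2,)
```

The absence of the training loop is exactly why inference needs orders of magnitude less compute.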
Advancements that enable edge AI
Nowadays, most commercial applications using neural-network-based AI inference engines are still deployed in the cloud. Typical examples are Siri, Alexa and Google Translate (although, strictly speaking, Siri uses a local acoustic model to detect the "Hey Siri" phrase that triggers its activation).
But things have been developing fast in the past couple of years, and we are starting to see applications where neural-network-based AI inference is moved from the cloud to the edge, such as automotive systems or even embedded devices like security cameras. Advancements in both software and hardware are feeding into this edge-based AI processing.
On the software/algorithm side, researchers from academia and industry are developing techniques to compress trained neural networks into a smaller footprint. A technique called weight quantization, for example, represents each weight parameter of a neural network as an 8-bit integer instead of a standard 32-bit floating-point number, greatly reducing the size of the model without noticeable degradation of inference accuracy.
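A minimal sketch of one such scheme, symmetric 8-bit weight quantization, assuming NumPy; the layer shape is arbitrary and this is only one of several quantization approaches in use:

```python
import numpy as np

# Pretend these are the trained float32 weights of one layer.
rng = np.random.default_rng(0)
weights = rng.standard_normal((256, 256)).astype(np.float32)

# Map the float range symmetrically onto the int8 range [-127, 127].
scale = float(np.abs(weights).max()) / 127.0
q = np.round(weights / scale).astype(np.int8)   # stored: 1 byte per weight
deq = q.astype(np.float32) * scale              # recovered at inference time

print(weights.nbytes, q.nbytes)  # 4x smaller on disk and in memory
print(float(np.abs(weights - deq).max()))  # worst-case rounding error
```

Each weight is off by at most half a quantization step, which is why accuracy typically degrades very little.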
On the hardware side, the advancement is equally impressive, with the recent launch of many chipsets customized for AI processing, such as NVIDIA's Jetson processors and Arm's Cortex-A75. Using these processors, many voice, image processing, and AR applications can be performed with device-based inference instead of relying on the cloud. Intel's latest Movidius, packed into a USB stick, consumes only 1 watt of power when running an embedded pre-trained neural network to identify objects in live video feeds.
Even though 1 watt is still not quite the ultra-low power consumption level that IoT and sensor devices strive for, keep in mind that deep learning is one of the most powerful and complex machine learning algorithms. It can be overkill for many IoT sensor use cases that only require far simpler AI functions. Realizing this opportunity, researchers from big companies like Microsoft as well as start-ups like Imagimob are focusing on building libraries of simpler algorithms, each tuned to perform optimally for a niche set of scenarios. To name a few, motion intelligence such as fall detection in wearables and predictive maintenance for robotics are being deployed with tiny footprints, in the order of kilobytes, on small microcontrollers with ultra-low power consumption. Despite the challenge of this bottom-up approach (mainly the difficulty of identifying a niche but meaningful problem to solve), we can expect more and more such products to emerge in the coming years.
Implications of edge AI
Device-based AI inference brings a couple of obvious advantages over the cloud-based alternative: first and foremost, low latency for local decision making; secondly, reduced reliance on internet connectivity; and thirdly, protection of data privacy. All these benefits derive from the fact that, with edge AI, raw data can be processed locally and only a small amount, if any, of aggregated and/or anonymized data needs to be sent to the cloud.
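That data flow can be sketched as follows; the threshold, field names and summary format are illustrative assumptions, not a real protocol:

```python
# Raw sensor samples are processed on the device, and only a small
# aggregated summary would be sent upstream to the cloud.

def summarize_on_device(samples, threshold=30.0):
    """Reduce a batch of raw readings to a few summary fields;
    the raw readings themselves never leave the device."""
    anomalies = [s for s in samples if s > threshold]
    return {
        "count": len(samples),
        "mean": sum(samples) / len(samples),
        "anomalies": len(anomalies),
    }

readings = [21.5, 22.0, 21.8, 35.2, 22.1]   # e.g. local temperature samples
payload = summarize_on_device(readings)
print(payload)  # a few bytes to transmit instead of the full raw stream
```

The privacy and bandwidth benefits both come from the same property: the cloud only ever sees the summary.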
No doubt edge AI will lead to another gold rush for chip manufacturers, but it also opens up more IoT deployment strategies than just connecting cheap, dumb devices to a super brain in the cloud. In some situations, smarter devices plus cheap connectivity can be an equally effective or even better solution, while in other cases the applications might not only require local intelligence but will still be assisted by tightly linked cloud-based applications in a hybrid environment. It is therefore high time for both IoT connectivity providers and enterprises considering IoT deployments to start investigating and understanding the impact of such edge-AI-powered solutions.
The future is even more exciting if we broaden our scope to include developments in distributed ledger technologies (e.g., blockchain for decentralized security and authentication) and self-organizing, self-healing mesh technologies for distributed networking. The combination of these technologies could completely change the deployment and operation of IoT services and lead to the rise of truly distributed and autonomous networks of things. So, let me reformulate the original answer: the convergence of AI and IoT is not there yet, but it will happen soon, and it will shape our future.
Lei is a Manager at Northstream