The strategic potential of IoT has driven engineers to deploy an ever-growing array of Edge devices, which collect, process, and run inference on data without a consistent connection to the Internet. Historically, organisations have sent data collected at the Edge to the Cloud for processing by machine learning models, as the devices lacked the computational power to run them. With the development of more powerful processors and model compression software, that reliance on Cloud-based computing has decreased: Edge devices can now perform locally the intensive AI calculations that were previously done in the Cloud. With the number of Internet-connected devices expected to reach 29 billion by 2030, the need for Edge AI is growing rapidly, and Edge AI is forecast to be integrated into 65% of Edge devices by 2027.
Enabling technologies leading to the evolution of Edge AI
The Edge AI market is expected to grow from $15.6 billion in 2022 to $107.4 billion by 2029. Although Edge AI is not a new concept, the latest technological advancements have made it simpler and more economical to implement. The four main advancements driving Edge AI today are:
- Microcontrollers (MCUs) and Digital Signal Processors (DSPs) – Their vector processors have become more powerful and are being customised by chip vendors to suit the needs of AI processing. These processors are currently the dominant Edge AI hardware.
- GPUs (Graphics Processing Units) – Originally used for graphics-intensive applications, such as gaming and video editing, GPUs are now also used to train and run inference on AI models.
- AI accelerator ASICs – While GPUs outperform CPUs on AI-related tasks, application-specific integrated circuits (ASICs) tailor-made for AI workloads can offer even greater speed and efficiency. Neural Processing Units (NPUs), a form of ASIC, are designed specifically to run AI models, making them better suited to the task than a general-purpose CPU.
- Model compression techniques – As Edge devices are typically constrained in memory and processing power, it’s vital to compress models while maintaining similar levels of accuracy and performance. The most popular AI compression techniques today are listed below; a brief code sketch of the first two follows the list:
- Pruning – Removes unnecessary or less essential parameters to improve the AI model’s efficiency, speed, and memory requirements while minimising performance deterioration.
- Quantisation – Reduces the precision of numerical values in a model to lower the memory load and improve the model’s inference speed and energy efficiency.
- Knowledge distillation – Transfers the knowledge of a complex model to a more compact one that can mimic the behaviour and performance of the original model.
- Low-rank factorisation – Compresses high-dimensional data by factoring it into lower-dimensional representations to simplify complex neural network models while preserving their identifying characteristics.
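To make the first two techniques concrete, here is a minimal MATLAB sketch of magnitude pruning followed by symmetric 8-bit quantisation on a small weight matrix. It is illustrative only: production toolchains prune structurally, calibrate quantisation scales per layer or per channel, and retrain to recover accuracy, and every value here is an assumption.

```matlab
% Minimal sketch: magnitude pruning + symmetric int8 quantisation.
% Illustrative only; real toolchains work per layer/channel and retrain.
W = randn(4, 4);                        % example layer weights (assumed)

% Pruning: zero out the smallest 50% of weights by magnitude
p = 0.5;
mags = sort(abs(W(:)));
thresh = mags(ceil(p * numel(mags)));   % magnitude cut-off
W = W .* (abs(W) >= thresh);            % sparse weights, same shape

% Quantisation: map the remaining float weights to 8-bit integers
scale = max(abs(W(:))) / 127;           % largest magnitude maps to int8 limit
Q = int8(round(W / scale));             % stored integer weights
W_hat = double(Q) * scale;              % dequantised values used at inference
max(abs(W(:) - W_hat(:)))               % worst-case rounding error introduced
```

Even this toy example shows the trade-off both techniques manage: memory shrinks (zeros compress well, and int8 is a quarter the size of single precision) at the cost of a small, bounded approximation error.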
Using Edge AI to reduce reliance on Cloud-based computing
While Edge AI may not eliminate Cloud-based computing, the need to handle increasingly massive amounts of data makes one thing clear: engineers cannot afford to overlook the game-changing benefits of Edge AI today. The main advantages of Edge AI are real-time processing and decision-making, which lower latency and reduce the costs associated with power usage and Cloud processing. Running inference locally means less raw data is sent to a public, private, or hybrid Cloud for processing. Cloud services remain vital for specific applications and are complemented, rather than replaced, by inference at the Edge, as the sketch below illustrates.
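Here is a minimal sketch of that split, with all values assumed: inference (a simple statistical anomaly test standing in for a trained model) runs on the device, and only the handful of flagged samples would be uploaded to the Cloud.

```matlab
% Hypothetical edge/Cloud split: infer locally, upload only flagged data.
rng(1);
readings = 20 + 2*randn(1, 1000);        % simulated sensor stream (assumed)
readings(500) = 45;                      % injected anomaly
isAnomaly = abs(readings - mean(readings)) > 4*std(readings);
cloudQueue = readings(isAnomaly);        % the only raw data sent to the Cloud
fprintf('Uploading %d of %d samples (%.1f%%)\n', ...
    numel(cloudQueue), numel(readings), 100*numel(cloudQueue)/numel(readings));
```

The fraction of data leaving the device drops from 100% to a fraction of a percent, which is where the latency, bandwidth, and Cloud-cost savings come from.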
The reduced reliance on a persistent Internet connection enables engineers to implement Edge AI models more efficiently across many industries. There are over 400 use cases for Edge computing spanning 19 industries and six technology domains. In some applications, such as automotive and medicine, AI models integrated into Edge devices can have potentially life-saving implications.
Automotive safety-critical systems
An automobile is one example of an Edge device that collects and processes data locally, reducing the amount of data that must be sent to the Cloud. Due to the self-contained nature of a car’s Electronic Control Unit (ECU), data processing must be performed locally, and safety-critical decisions must be made in real time. Machine learning models running on an Edge device such as a car ECU help ensure passenger safety by using real-time data to adapt to road conditions and reduce collisions.
One motor vehicle manufacturer trained a machine learning model to detect oversteering, which occurs when a vehicle’s rear tyres lose grip with the road during a turn. The manufacturer captured thousands of data points on vehicle acceleration, steering, and yaw rate (angular velocity). After loading the data into MATLAB, engineers trained a machine learning model to recognise oversteering using the Statistics and Machine Learning Toolbox. The model was then deployed and integrated into the ECU using MATLAB Coder.
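The sketch below shows what that train-then-deploy workflow can look like; the file name, predictor set, and choice of a decision tree are illustrative assumptions, not the manufacturer’s actual implementation.

```matlab
% Hypothetical training script (file name, predictors, and model type assumed).
T = readtable('vehicle_logs.csv');            % acceleration, steering, yaw rate, label
Mdl = fitctree(T, 'Oversteer');               % Statistics and Machine Learning Toolbox
saveLearnerForCoder(Mdl, 'oversteerModel');   % save in a codegen-compatible form
```

An entry-point function can then be compiled to C for the ECU with MATLAB Coder:

```matlab
function label = detectOversteer(x) %#codegen
% Compile with: codegen detectOversteer -args {zeros(1,3)}
Mdl = loadLearnerForCoder('oversteerModel');  % reload the saved classifier
label = predict(Mdl, x);                      % x = [accel, steering, yawRate]
end
```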
Real-time decision-making in medical devices
An advantage of Edge AI in the medical device field is its ability to enable fast decision-making. Real-time data analysis and anomaly detection enable timely intervention and reduce the risks associated with life-threatening and long-term health conditions. Medical Edge devices can also communicate with applications in the Cloud for data logging purposes, which is not a time-sensitive task. In this way, Cloud-based computing does not take away from but rather complements data inference on the Edge to create an even more powerful network of devices.
For example, a technology institute research group developed predictive algorithms for Artificial Pancreas (AP) systems that detect impending hypoglycaemia and hyperglycaemia. The group created virtual patients and used MATLAB to simulate physiological signals such as heart rate and energy expenditure. The completed algorithm was deployed on a mobile device that communicates with an insulin pump, a glucose monitor, and a wearable wristband to enable effective control of glucose concentration. Ultimately, the research group created a network of Edge devices working in tandem as an integrated health monitoring system.
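As a toy illustration of what detecting an “impending” low involves, the sketch below extrapolates a simulated glucose trace a few minutes ahead and flags a predicted drop below a common clinical threshold; the signals, thresholds, and linear extrapolation are assumptions, far simpler than the research group’s actual algorithm.

```matlab
% Toy look-ahead hypoglycaemia check (all values assumed; not the
% research group's algorithm).
rng(0);
g = 110 + cumsum(-0.6 + 0.5*randn(1, 60));   % simulated glucose, mg/dL, 1 sample/min
horizon = 15;                                % prediction horizon, minutes
slope = (g(end) - g(end-5)) / 5;             % recent rate of change, mg/dL per min
gPred = g(end) + slope * horizon;            % linear extrapolation
if gPred < 70                                % common hypoglycaemia threshold
    disp('Predicted hypoglycaemia: alert the patient and adjust insulin delivery');
end
```

Because the check runs on the device itself, the intervention does not depend on network availability; only the logged readings need to reach the Cloud, and they can do so whenever a connection is convenient.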
Conclusion
The amount of data engineers must manage grows daily, and implementing Edge AI can help maintain operational and cost efficiency while reducing reliance on Cloud processing. As engineers build on Cloud-based inference and AI-enabling technologies continue to evolve, integrating AI into Edge devices is quickly becoming necessary for companies to differentiate their products. Most importantly, Edge AI should be considered an additional tool alongside Cloud-based inference rather than a replacement for, or complete overhaul of, current AI-based systems. By implementing AI on the Edge and reserving the Cloud for applications where latency is not a concern, engineers can expand their AI toolbox irrespective of their industry.

Jack Ferrari is Product Manager of Edge AI at MathWorks.