This article originally appeared in the February 2025 issue of IoT Insider.
The rapid advancements in AI over the last few years are having a profound impact on IoT. The growth of Edge AI and the increasing application of small language models will be among the key trends driving this transformation. The evolution of these technologies is expected to enhance performance, cost-efficiency, privacy and security, while enabling more intelligent and localised decision-making to take place.
AI at the Edge will grow in prominence
In 2024, we saw an increasing number of AI workloads running at the Edge – on device – rather than being processed in large data centres. This shift brings power and cost savings, as well as privacy and security benefits, for consumers and businesses alike.
2025 will likely see the emergence of sophisticated hybrid AI architectures that split tasks between Edge devices and the Cloud. These systems will use AI algorithms on Edge devices to detect events of interest before calling on Cloud models for deeper analysis. Where an AI workload runs, locally or in the Cloud, will depend on factors such as available power, latency requirements, privacy concerns, cost and computational complexity.
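The routing decision described above can be made concrete with a small sketch. The task attributes, thresholds and function names below are illustrative assumptions, not a real framework API; they simply show how power, latency, privacy and complexity might feed into an edge-versus-cloud choice.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    latency_budget_ms: float   # how quickly a result is needed
    privacy_sensitive: bool    # must the data stay on device?
    complexity: float          # rough compute demand, 0.0-1.0

def route(task: Task, battery_level: float) -> str:
    """Decide where an AI task should run: 'edge' or 'cloud'.

    Thresholds here are placeholders for illustration only.
    """
    # Privacy-sensitive data never leaves the device.
    if task.privacy_sensitive:
        return "edge"
    # A tight latency budget rules out a network round trip.
    if task.latency_budget_ms < 50:
        return "edge"
    # Heavy workloads, or a low battery, favour the Cloud.
    if task.complexity > 0.7 or battery_level < 0.2:
        return "cloud"
    return "edge"

# Example: an on-device wake-word detector triggers a Cloud summary.
print(route(Task("wake-word", 20, True, 0.1), battery_level=0.9))    # edge
print(route(Task("summarise", 2000, False, 0.9), battery_level=0.9)) # cloud
```

In a real system the thresholds would be tuned per device class, but the structure – cheap local detection first, selective escalation to the Cloud – matches the hybrid pattern the article describes.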
Edge AI workloads represent a shift towards decentralised AI, enabling smarter, faster and more secure processing on the devices closest to the data source. This will be particularly beneficial in markets that demand high performance and localised decision-making, such as industrial automation and smart cities.
The acceleration of small language models (SLMs)
Smaller, compact models, achieved through greater compression, quantisation and reduced parameter counts, are evolving at a rapid rate. Examples include Llama, Gemma and Phi-3, which are more cost-effective, more efficient, and easier to deploy on devices with limited computational resources; we expect their number to grow in 2025. These models can run directly on Edge devices, bringing enhanced performance and privacy. We expect to see a rise in SLMs used for on-device language and device-interaction tasks, as well as vision-based tasks such as interpreting and scanning for events. In future, learnings will be distilled from larger models to develop local expert systems.
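The case for quantisation can be seen with back-of-the-envelope arithmetic: weight storage is simply parameter count times bits per weight. The parameter figure below (~3.8 billion, roughly Phi-3-mini scale) is an approximate public number used for illustration.

```python
def model_size_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate weight-storage size in gigabytes (weights only;
    activations and KV cache are extra)."""
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

# A ~3.8B-parameter model at different precisions:
print(model_size_gb(3.8, 16))  # fp16: 7.6 GB - beyond many Edge devices
print(model_size_gb(3.8, 4))   # int4: 1.9 GB - plausible on-device
```

A 4x reduction in weight storage is what moves a model of this class from data-centre territory into the memory budget of a capable Edge device, which is why quantisation sits at the centre of the SLM trend.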
A consistent tool chain and software ecosystem will be crucial for Edge AI development
The broad range of AI applications, particularly in IoT, will need different computational engines for different AI demands. To maximise the reach of AI workloads, CPUs will remain the key target for deployment on existing devices, while new IoT devices will offer enhanced AI performance through larger memories and higher-performance Cortex-A CPUs.
Edge AI accelerators, such as the latest Ethos-U NPUs, will accelerate low-power machine learning tasks and bring power-efficient Edge inference to a broader range of use cases. Even so, a consistent tool chain and software ecosystem will be the most crucial factor in the rise of Edge AI.
2025 will see IoT transition from a network of connected devices to a network of intelligent systems, with devices becoming smarter and more capable of performing complex, high-performance tasks with limited resources. Sectors such as industrial automation, manufacturing, retail and healthcare, where real-time data processing is critical for precise monitoring, will particularly benefit from this shift. To fully realise this potential, success will depend on partnerships and industry collaboration to drive innovation and unlock new opportunities.