Small language models (SLMs) in IIoT edge computing
This past November 25th to 29th, ALTEN hosted the second edition of its International Tech Week, featuring a series of insightful webinars on the theme Automation Across Borders. One standout session, SLMs in IIoT edge computing, was delivered by Salman Sattar, Solution Engineer at ALTEN Sweden.
With an extensive background in enterprise IoT and machine learning, Salman showcased just how SLMs can be put to use. In this article, he explains how they apply across physical environments, edge devices, software and processing units, and cloud gateways.
Edge devices in IIoT
Edge devices sit at the boundary of the IoT network. Equipped with sensors and communication technologies, they collect and analyse data in real time at the edge of the network.
Decisions can then be made locally at the edge instead of in centralised cloud systems.
Edge devices are used in predictive maintenance and fleet management, as well as in the safety, communication, and energy domains.
What is an SLM?
Small language models (SLMs) are compact AI models designed to process language and generate useful results with modest computation. We call the models “small” because of their relatively small number of parameters compared with large language models (LLMs) such as GPT-3. The smaller parameter count makes them lighter, more efficient, and easier to deploy.
SLMs are efficient in mobile applications because they require little computing power and memory. They also run well on simpler hardware, so they can be used in many different settings.
The Working Models of SLMs
SLMs are typically produced through knowledge distillation, which transfers knowledge from a pre-trained LLM to a smaller model, and through pruning, which removes the less useful parts of a model.
Additionally, SLMs can be built with quantisation, which reduces a model’s size and computational requirements without significantly compromising performance.
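As a concrete illustration of quantisation, the minimal sketch below maps float32 weights to int8 with a single scale factor (simple symmetric post-training quantisation); the function names and matrix sizes are illustrative, not from the talk.

```python
import numpy as np

def quantise_int8(weights: np.ndarray):
    """Map float32 weights to int8 plus one scale factor (symmetric quantisation)."""
    scale = float(np.abs(weights).max()) / 127.0  # largest magnitude maps to 127
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantise(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float32 weights from the int8 representation."""
    return q.astype(np.float32) * scale

# A float32 weight matrix shrinks to a quarter of its size as int8.
w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantise_int8(w)
print(w.nbytes, "->", q.nbytes)  # 4 bytes/weight -> 1 byte/weight
```

The reconstruction error per weight is bounded by half a quantisation step (`scale / 2`), which is why quantisation shrinks models without significantly hurting accuracy.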
The 4 layers of SLM Architecture
- Physical:
This layer represents the real-world environment where the edge device is deployed. It can be a factory floor, a ship at sea, or any other location relevant to the application requirements.
- Edge device:
This is the hardware at the boundary: sensors, a processing unit, and connectivity for communication.
- Software and processing:
This layer handles communication with external systems, resolves communication protocols, and processes and stores data locally.
- Cloud gateway:
Here the SLM-equipped devices connect to the wider IoT (Internet of Things) backend. The gateway enables remote management of onsite devices via OTA (over-the-air) updates. Security is implemented at this layer to protect stored and incoming data.
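The four layers above can be sketched as a single edge pipeline. This is a toy illustration, not production code: the sensor reading, the rule standing in for a local SLM call, and the JSON upload are all hypothetical placeholders.

```python
import json
import random
import time

local_store: list = []  # software & processing layer: local data storage

def read_sensor() -> dict:
    """Physical layer: stand-in for a real temperature sensor on a factory floor."""
    return {"temp_c": 20 + random.random() * 10, "ts": time.time()}

def slm_infer(reading: dict) -> str:
    """Edge device layer: placeholder for a local SLM inference call (here a rule)."""
    return "alert: overheating" if reading["temp_c"] > 28 else "ok"

def process(reading: dict, verdict: str) -> dict:
    """Software & processing layer: normalise the record and store it locally."""
    record = {"reading": reading, "verdict": verdict}
    local_store.append(record)
    return record

def to_cloud_gateway(record: dict) -> str:
    """Cloud gateway layer: serialise for upload; real code would use TLS/MQTT."""
    return json.dumps(record)

reading = read_sensor()
record = process(reading, slm_infer(reading))
payload = to_cloud_gateway(record)
```

The key point the architecture makes is that the decision (`slm_infer`) happens on the device; only the result is forwarded to the cloud gateway.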
Parameters in AI
Modern AI models contain millions or billions of parameters, the variables learned during training. By tuning how parameters are used, one can fine-tune the results. The impact of parameters shows up in:
- Accuracy:
  - Solving a bigger problem with a smaller knowledge set.
- Training computation:
  - Models are trained using computation at different levels.
- Smart models:
  - Parameter sharing: the same parameter is reused across different layers.
  - Factorisation: breaking larger structures into smaller ones helps reduce parameters.
- Efficiency:
  - With fewer parameters and these smart-model techniques, efficiency is achieved.
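The parameter savings from factorisation and sharing are easy to quantify. The sketch below uses illustrative layer sizes (not figures from the talk) to show how factorising a weight matrix into two low-rank factors, or sharing one matrix across layers, cuts the parameter count.

```python
# Factorisation: replace an m x n weight matrix with an m x r and an r x n factor.
m, n, r = 1024, 1024, 64          # illustrative sizes; r is the chosen rank

full_params = m * n                # parameters in the original matrix
factored_params = m * r + r * n    # parameters after low-rank factorisation
print(full_params, "->", factored_params)  # 1048576 -> 131072, an 8x reduction

# Parameter sharing: one matrix reused across several layers is counted once.
layers = 6
unshared_params = layers * full_params
shared_params = full_params        # a single shared matrix serves all six layers
print(unshared_params // shared_params)  # 6x fewer parameters
```

Both techniques shrink the model without changing the layer dimensions the rest of the network sees, which is why they suit resource-constrained edge devices.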
Reference architecture
Pre-trained data and synthetic data from real systems are converted and combined to produce the finalised SLM, which is then refined to be intelligent and effective enough to run on edge devices.
See the illustration below for the reference architecture on a conceptual level.

Summary
SLMs are designed specifically for edge devices with limited processing power. Their low computational footprint lets them handle complex AI tasks while using minimal resources.
However, many legacy edge devices, such as brownfield hubs, cannot host SLMs because their architecture is unsupported. To maintain a unified architecture and solution across both modern and legacy devices, a viable mitigation is to run the SLMs in the cloud, accessed over secure tunneling as though they were operating at the edge.