By Chandrakant Deshmukh, Senior Vice President – Architecture & IP Governance, Mastek
Walk into any enterprise IoT engagement today and you’ll find the same paradox: the devices are deployed, the telemetry is flowing, and yet the operational intelligence promised three years ago remains stubbornly out of reach.
The problem is rarely the hardware, and almost never the connectivity. It is the architecture – the decisions about where data is processed, how it moves, and what it is allowed to become on the way.
Having spent the better part of two decades designing data platforms for manufacturing, supply chain, and infrastructure clients, I have come to a blunt conclusion: the gap between data generated at the Edge and data acted upon in real time is where most IoT ROI quietly disappears.
Enterprises are still defaulting to a “lift and stream” pattern — ship everything to the Cloud, figure out the analytics later. That pattern made economic sense in 2015. At today’s device densities and data volumes, it breaks the business case.
Why Cloud-first IoT architectures are breaking down
Three forces are converging to make cloud-centric IoT architectures structurally inefficient.
The first is latency. A Cloud round-trip on a good day costs 80–200 milliseconds. For a PLC managing a robotic arm, a turbine governor, or a grid-balancing inverter, that is an eternity. Real-time control loops cannot be built on top of such round-trips; they need sub-10-millisecond decisions that only local inference can deliver.
The second is bandwidth economics. A single modern CNC machine can emit several gigabytes of vibration and process telemetry per day. Multiply across a plant of three hundred machines, then across a fleet of forty plants, and the egress bill alone will consume the modernisation budget.
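A back-of-envelope calculation makes the point. The per-machine volume and egress rate below are illustrative assumptions, not quoted prices:

```python
# Back-of-envelope egress estimate; every figure here is an illustrative assumption.
GB_PER_MACHINE_PER_DAY = 5          # assumed vibration and process telemetry per CNC machine
MACHINES_PER_PLANT = 300
PLANTS = 40
EGRESS_COST_PER_GB = 0.09           # assumed Cloud egress rate in USD

daily_gb = GB_PER_MACHINE_PER_DAY * MACHINES_PER_PLANT * PLANTS
print(f"Fleet-wide telemetry: {daily_gb / 1000:.0f} TB per day")
print(f"Annual egress at ${EGRESS_COST_PER_GB}/GB: "
      f"${daily_gb * 365 * EGRESS_COST_PER_GB:,.0f}")
```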
The third is data sovereignty and compliance. For regulated industries — healthcare, defence, critical infrastructure — certain classes of data legally cannot leave the facility, the region, or the country of origin. The architecture must honour that constraint at the point of data generation, not at the point of analytics.
What I see most often is a variation on the same anti-pattern: a Cloud data lake used as a transit hub, batching everything overnight, with “real-time” dashboards that are really four-hour-old aggregates in a hurry. That is not real-time intelligence — it is a reporting delay dressed up in a different vocabulary.
The Edge–Cloud continuum: a better mental model
The architectural reset most organisations need is to stop thinking of Edge and Cloud as competing locations, and start thinking of them as tiers on a continuum.
At the device layer, you deal in raw signals and immediate control. At the Edge compute layer — a rugged server or gateway — you filter, aggregate, run inference, and make local decisions. At the regional Cloud, you persist, correlate across sites, and serve near-real-time analytics. At the global Cloud, you train models, run long-horizon analytics, and feed improvements back down to the lower tiers.
The governing principle is data gravity: process data closest to where it is generated and move only what genuinely needs to move.
A vibration waveform sampled at 25 kHz does not need to travel anywhere; a one-second FFT summary and an anomaly flag do. A surveillance feed does not need to stream continuously; the event clips around a detected intrusion do.
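As a minimal sketch of that principle, assuming a numpy runtime on the gateway and an illustrative anomaly threshold, the Edge node can reduce each one-second, 25 kHz window to a handful of spectral features and a flag:

```python
import numpy as np

SAMPLE_RATE_HZ = 25_000                 # matches the 25 kHz waveform above
RMS_THRESHOLD = 0.8                     # illustrative anomaly threshold

def summarise_window(samples: np.ndarray) -> dict:
    """Reduce one second of raw vibration data to a compact summary."""
    spectrum = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(samples.size, d=1.0 / SAMPLE_RATE_HZ)
    rms = float(np.sqrt(np.mean(samples ** 2)))
    return {
        "dominant_freq_hz": float(freqs[np.argmax(spectrum[1:]) + 1]),  # skip the DC bin
        "rms": rms,
        "anomaly": rms > RMS_THRESHOLD,
    }

# One second of raw data is 25,000 samples; the summary that travels is three fields.
window = np.random.default_rng(0).normal(scale=0.2, size=SAMPLE_RATE_HZ)
print(summarise_window(window))
```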
The inflection point where tiered architecture pays back is usually one of three: when the latency budget falls below fifty milliseconds, when per-site telemetry volume crosses roughly one terabyte per day, or when regulated data exceeds ten percent of the total pipeline. Hit any one of those and the business case for Edge compute is already made.
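Expressed as a simple screening rule, with the thresholds above and any single condition sufficing:

```python
def edge_tier_justified(latency_budget_ms: float,
                        site_telemetry_tb_per_day: float,
                        regulated_fraction: float) -> bool:
    """Return True if any one inflection point for Edge compute is crossed."""
    return (latency_budget_ms < 50
            or site_telemetry_tb_per_day > 1.0
            or regulated_fraction > 0.10)

# Latency alone decides it here, regardless of volume or regulation.
print(edge_tier_justified(latency_budget_ms=20,
                          site_telemetry_tb_per_day=0.3,
                          regulated_fraction=0.05))   # True
```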
Modernising the pipeline: what it looks like in practice
A modern IoT data pipeline has three non-negotiable components.
First, an event-driven transport spine. MQTT 5.0 at the Edge for lightweight device telemetry, Kafka at the regional tier for durable, replayable streams. The pairing matters — MQTT handles the constrained-device side; Kafka handles the enterprise-integration side, and a broker bridges them.
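A minimal sketch of that bridge, assuming the paho-mqtt (1.x API) and kafka-python client libraries; the broker addresses and topic names are illustrative assumptions:

```python
# Bridge MQTT device telemetry into a Kafka topic at the regional tier.
import paho.mqtt.client as mqtt
from kafka import KafkaProducer

producer = KafkaProducer(bootstrap_servers="regional-kafka:9092")

def on_message(client, userdata, msg):
    # Re-key by MQTT topic so per-device ordering is preserved downstream.
    producer.send("plant.telemetry.raw", key=msg.topic.encode(), value=msg.payload)

# paho-mqtt 1.x shown; the 2.x client additionally requires a CallbackAPIVersion argument.
client = mqtt.Client(protocol=mqtt.MQTTv5)
client.on_message = on_message
client.connect("edge-broker.local", 1883)
client.subscribe("factory/+/telemetry", qos=1)
client.loop_forever()
```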
Second, a stream-processing layer close to the data. Apache Flink, Spark Structured Streaming, or the native stream analytics services behind Azure IoT Hub and AWS IoT Core are all viable; the choice matters less than the commitment to processing streams rather than batches at rest.
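To make the stream-versus-batch distinction concrete, here is a hedged sketch using Spark Structured Streaming; the topic name, fields, and window sizes are assumptions, not a prescription:

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("telemetry-rollup").getOrCreate()

schema = (StructType()
          .add("machine_id", StringType())
          .add("rms", DoubleType())
          .add("event_time", TimestampType()))

raw = (spark.readStream.format("kafka")
       .option("kafka.bootstrap.servers", "regional-kafka:9092")
       .option("subscribe", "plant.telemetry.raw")
       .load())

rollup = (raw.select(F.from_json(F.col("value").cast("string"), schema).alias("m"))
          .select("m.*")
          .withWatermark("event_time", "2 minutes")
          .groupBy(F.window("event_time", "30 seconds"), "machine_id")
          .agg(F.avg("rms").alias("avg_rms"), F.max("rms").alias("peak_rms")))

# Continuous output; in production this would feed a serving store or an alerting topic.
rollup.writeStream.outputMode("update").format("console").start().awaitTermination()
```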
Third, a digital twin as a synchronisation contract. A well-modelled twin gives you a canonical representation of each physical asset’s state, bridging edge observations and cloud analytics without either side having to reason about the other’s schema.
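One way to picture that contract is a twin that separates reported state, flowing up from the Edge, from desired state, flowing down from the Cloud. The sketch below is illustrative; the field names are assumptions rather than any particular platform's twin API:

```python
from dataclasses import dataclass, field

@dataclass
class MachineTwin:
    """Canonical representation of one physical asset's state."""
    asset_id: str
    reported: dict = field(default_factory=dict)   # last state observed at the Edge
    desired: dict = field(default_factory=dict)    # target state set by Cloud analytics

    def apply_edge_observation(self, observation: dict) -> None:
        self.reported.update(observation)

    def pending_changes(self) -> dict:
        """Properties the Edge still has to reconcile towards the desired state."""
        return {k: v for k, v in self.desired.items() if self.reported.get(k) != v}

twin = MachineTwin("cnc-017")
twin.apply_edge_observation({"spindle_rpm": 11800, "mode": "auto"})
twin.desired["spindle_rpm"] = 12000
print(twin.pending_changes())   # {'spindle_rpm': 12000}
```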
The modelling layer is where most pipelines either become maintainable or collapse under their own weight.
Teams that get this right treat schema as a first-class artefact. A schema registry, enforced at ingestion, is not administrative overhead — it is the difference between a pipeline that evolves gracefully and one that silently poisons its own downstream analytics.
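A minimal sketch of enforcement at the ingestion boundary, assuming JSON payloads and the jsonschema library; a production registry would more likely enforce Avro or Protobuf schemas:

```python
from jsonschema import validate, ValidationError

TELEMETRY_SCHEMA_V1 = {
    "type": "object",
    "required": ["asset_id", "event_time", "rms"],
    "properties": {
        "asset_id": {"type": "string"},
        "event_time": {"type": "string", "format": "date-time"},
        "rms": {"type": "number", "minimum": 0},
    },
    "additionalProperties": False,   # unknown fields are rejected, not silently passed through
}

def ingest(record: dict) -> dict:
    """Admit a record only if it conforms to the registered schema version."""
    try:
        validate(instance=record, schema=TELEMETRY_SCHEMA_V1)
    except ValidationError as err:
        raise ValueError(f"Rejected at ingest: {err.message}") from err
    return record

ingest({"asset_id": "cnc-017", "event_time": "2024-03-01T06:00:00Z", "rms": 0.42})
```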
Governance, IP, and the question most architects skip
This is the section most IoT architecture conversations avoid, and it is the one I find most consequential. In a multi-vendor industrial IoT deployment — where sensor OEMs, integrators, cloud providers, and the enterprise all touch the data — who owns what the devices generate?
The answer is almost never in the contract. It needs to be. Data ownership, derived data rights, training data rights for models built on that telemetry, and the right to extract data on contract termination are all negotiable terms. Most enterprises discover they have given away the valuable ones only when they try to switch vendors. An IP governance framework for IoT starts with explicit data classification at ingest, lineage tracking from device to decision, and contractual clarity on who can train what on which dataset.
On the technical side, that translates to three capabilities: end-to-end data lineage so any downstream insight can be traced back to the exact device, firmware, and calibration state that produced it; zero-trust identity at every edge node, because a compromised gateway is a compromised pipeline; and observability that treats data quality as a first-class signal alongside latency and throughput.
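One hedged way to make the lineage capability concrete is to wrap every reading in a provenance envelope at the point of data generation; the field names and classification labels below are illustrative assumptions:

```python
from dataclasses import dataclass, asdict
import hashlib, json

@dataclass(frozen=True)
class LineageEnvelope:
    """Provenance attached to every record at the point of data generation."""
    device_id: str
    firmware_version: str
    calibration_ref: str        # calibration certificate or state identifier
    classification: str         # e.g. "public", "internal", "regulated-no-export"

def wrap(reading: dict, lineage: LineageEnvelope) -> dict:
    record = {"payload": reading, "lineage": asdict(lineage)}
    # A content hash lets any downstream insight be traced back to this exact record.
    record["lineage"]["content_hash"] = hashlib.sha256(
        json.dumps(reading, sort_keys=True).encode()).hexdigest()
    return record

envelope = LineageEnvelope("cnc-017", "fw-4.2.1", "cal-2024-02-11", "regulated-no-export")
print(wrap({"rms": 0.42}, envelope)["lineage"]["classification"])
```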
The payoff, and what to do Monday morning
When this architecture lands, the results are tangible: predictive maintenance that triggers on genuine anomaly detection rather than calendar schedules; supply chain adjustments that happen in minutes rather than overnight; energy optimisation driven by live consumption curves rather than monthly bills. These are not hypothetical wins — they are the steady state for organisations that committed to the architectural shift three to five years ago.
For leaders evaluating their own position, the diagnostic is straightforward. Are you processing data at the right tier, or sending raw telemetry to the cloud out of habit? Are your pipelines built on streams, or are your “real-time” dashboards really disguised batch jobs? Do you know, contractually and technically, who owns the data your devices generate — and can you prove lineage from device to decision?
Edge and Cloud data modernisation is not a technology project. It is an architectural commitment — and the enterprises that will lead the next decade of industrial intelligence are the ones making that commitment today, not retrofitting it later.
Author biography:
Chandrakant Deshmukh is Senior Vice President – Architecture & IP Governance at Mastek, a global IT services company focused on digital and Cloud transformation, headquartered in India.
