Cloud Delegation for MCU Agents — When to Call Home
Cloud delegation is the pattern by which an MCU agent, unable to reach a confident local decision, packages its sensor context and ships it to an edge server or cloud endpoint for processing — then waits for a decision to arrive on its command topic before acting.
This is not “calling home for every request.” That is a remote sensor, not an agent. Delegation happens selectively: when local confidence is low, when the decision requires external data (historical trends, fleet-wide baselines, regulatory lookup tables), or when the action is irreversible and warrants a higher-authority sign-off.
When should the MCU delegate?
| Trigger | Rationale |
|---|---|
| Inference confidence below threshold | Local model is uncertain; upstream context may resolve the ambiguity |
| Event outside the training distribution | Unknown input type; local model should not be trusted |
| Irreversible actuator action | Require explicit external confirmation before acting |
| Policy update needed | Thresholds or model need revision; local logic is stale |
| Regulatory or audit requirement | Decision must be logged by an authorized system |
| Fleet-wide correlation needed | “Is this anomaly happening on other devices too?” |
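The trigger table can be read as a single predicate that the agent evaluates before acting. The sketch below is one possible encoding in portable C. The struct fields, enum names, and the order of the checks are assumptions made for illustration; none of them come from a real SDK.

```c
#include <stdbool.h>

/* Hypothetical snapshot of what the agent knows at decision time.
   Field names are illustrative only. */
typedef struct {
    float confidence;          /* local model confidence, 0.0 .. 1.0 */
    bool  out_of_distribution; /* input flagged as unlike the training data */
    bool  irreversible_action; /* proposed actuation cannot be undone */
    bool  policy_stale;        /* thresholds or model are due for revision */
    bool  audit_required;      /* decision must be logged by an authorized system */
    bool  needs_fleet_context; /* answer depends on data from other devices */
} decision_ctx_t;

typedef enum {
    DELEGATE_NONE = 0,
    DELEGATE_LOW_CONFIDENCE,
    DELEGATE_OUT_OF_DISTRIBUTION,
    DELEGATE_IRREVERSIBLE,
    DELEGATE_POLICY_UPDATE,
    DELEGATE_AUDIT,
    DELEGATE_FLEET_CORRELATION
} delegate_reason_t;

/* Returns the first trigger that applies, or DELEGATE_NONE to act locally.
   The check order is an assumption: safety-related triggers come first
   so that they are the reason reported upstream. */
delegate_reason_t delegation_reason(const decision_ctx_t *ctx,
                                    float min_confidence) {
    if (ctx->irreversible_action)         return DELEGATE_IRREVERSIBLE;
    if (ctx->out_of_distribution)         return DELEGATE_OUT_OF_DISTRIBUTION;
    if (ctx->confidence < min_confidence) return DELEGATE_LOW_CONFIDENCE;
    if (ctx->audit_required)              return DELEGATE_AUDIT;
    if (ctx->needs_fleet_context)         return DELEGATE_FLEET_CORRELATION;
    if (ctx->policy_stale)                return DELEGATE_POLICY_UPDATE;
    return DELEGATE_NONE;
}
```

A caller would act locally when the result is `DELEGATE_NONE` and otherwise attach the reason to the delegation request, so the endpoint knows why it was asked. The confidence threshold is passed in rather than hard-coded because it is one of the parameters a policy update is likely to change.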
Delegation patterns
1. HTTP POST to an edge or cloud endpoint
The simplest delegation path: the MCU sends an HTTP POST with its sensor context, the endpoint responds with a decision JSON in the response body.
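/* Simplified HTTP delegation: synchronous, blocks the calling task */
#include <string.h>
#include "esp_http_client.h"

/* root_ca_pem_start is assumed to be a PEM root CA embedded in the firmware image. */
int delegate_to_cloud(const char *feature_json,
                      char *response_buf, size_t resp_len) {
    if (resp_len == 0) return -1;
    esp_http_client_config_t cfg = {
        .url = "https://edge.example.com/v1/delegate",
        .method = HTTP_METHOD_POST,
        .cert_pem = (const char *)root_ca_pem_start,
        .timeout_ms = 5000,  /* bound how long the call can block */
    };
    esp_http_client_handle_t client = esp_http_client_init(&cfg);
    if (client == NULL) return -1;
    esp_http_client_set_header(client, "Content-Type", "application/json");

    /* esp_http_client_perform() hands the body to the event handler, so the
       open/write/fetch_headers sequence is used here to read it directly. */
    int body_len = (int)strlen(feature_json);
    int n = -1;
    if (esp_http_client_open(client, body_len) == ESP_OK &&
        esp_http_client_write(client, feature_json, body_len) == body_len &&
        esp_http_client_fetch_headers(client) >= 0 &&
        esp_http_client_get_status_code(client) == 200) {
        n = esp_http_client_read_response(client, response_buf,
                                          (int)resp_len - 1);
        if (n >= 0) response_buf[n] = '\0';  /* make the JSON a C string */
    }
    esp_http_client_close(client);
    esp_http_client_cleanup(client);
    return (n >= 0) ? 0 : -1;
}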
Latency expectation: LAN edge server: 5–30 ms. Cloud endpoint (same region): 80–300 ms. Cross-region cloud: 200–800 ms.
Limitation: Synchronous HTTP blocks the calling task. Use a dedicated delegation task with a timeout so the agent falls back to a safe default if no response arrives within the deadline.
2. MQTT request-response
Publish a delegation request, subscribe to the response topic, wait. This is asynchronous and broker-mediated — the MCU does not need a direct connection to the decision endpoint.
MCU                     Broker                    Edge/Cloud
 │                         │                           │
 ├──[PUB QoS1]────────────>│  agents/dev01/req         │
 │                         ├──[forward]───────────────>│
 │                         │                 [process] │
 │<───────────────────────[PUB]────────────────────────┤
 │  agents/dev01/cmd       │                           │
 │  [apply decision]       │                           │
The MCU enters a DELEGATE_PENDING state, runs a watchdog timer, and applies the response when it arrives. If the timer expires with no response, it applies a conservative fallback (safe state, no actuation).
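/* MQTT delegation: request side */
void agent_delegate(const char *feature_json) {
    agent_state = STATE_DELEGATE_PENDING;
    xTimerStart(xDelegateTimeout, 0);  /* 5-second watchdog */
    /* len = 0 lets the client compute strlen(); QoS 1, not retained */
    esp_mqtt_client_publish(s_mqtt,
                            "agents/" DEVICE_ID "/delegate/req",
                            feature_json, 0, 1, 0);
    /* Response arrives on agents/<id>/delegate/resp via the MQTT event handler */
}

/* MQTT event handler: response side */
void mqtt_event_handler(esp_mqtt_event_handle_t event) {
    static const char resp_topic[] = "agents/" DEVICE_ID "/delegate/resp";
    if (event->event_id != MQTT_EVENT_DATA) return;
    /* event->topic is not NUL-terminated. Compare the length first:
       strncmp alone would accept any topic that is a prefix of resp_topic. */
    if (event->topic_len == (int)(sizeof(resp_topic) - 1) &&
        strncmp(event->topic, resp_topic, event->topic_len) == 0 &&
        agent_state == STATE_DELEGATE_PENDING) {  /* drop late responses */
        xTimerStop(xDelegateTimeout, 0);
        agent_apply_decision(event->data, event->data_len);
        agent_state = STATE_IDLE;
    }
}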
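The timeout path is the part most easily left out. The following is a minimal, platform-independent sketch of the pending/timeout/fallback logic described above, written so it can be exercised off-target. `STATE_SAFE_FALLBACK`, the `delegate_fsm_t` type, and the `fsm_*` function names are hypothetical. On a real device, the timer callback and the MQTT event handler would call into something like it with the current tick count in milliseconds.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    STATE_IDLE,
    STATE_DELEGATE_PENDING,
    STATE_SAFE_FALLBACK   /* hypothetical: safe state, no actuation */
} agent_state_t;

typedef struct {
    agent_state_t state;
    uint32_t      deadline_ms;  /* absolute time at which the request expires */
} delegate_fsm_t;

/* Call when the delegation request has been published. */
void fsm_request_sent(delegate_fsm_t *f, uint32_t now_ms, uint32_t timeout_ms) {
    f->state = STATE_DELEGATE_PENDING;
    f->deadline_ms = now_ms + timeout_ms;
}

/* Call periodically or from the timer callback.
   Returns true if the timeout fired on this call. */
bool fsm_tick(delegate_fsm_t *f, uint32_t now_ms) {
    /* The signed difference keeps the comparison valid across counter wraparound. */
    if (f->state == STATE_DELEGATE_PENDING &&
        (int32_t)(now_ms - f->deadline_ms) >= 0) {
        f->state = STATE_SAFE_FALLBACK;
        return true;
    }
    return false;
}

/* Call when a response arrives. Returns true if the decision should be applied. */
bool fsm_response(delegate_fsm_t *f, uint32_t now_ms) {
    if (f->state != STATE_DELEGATE_PENDING) {
        return false;                        /* late or unsolicited response */
    }
    if ((int32_t)(now_ms - f->deadline_ms) >= 0) {
        f->state = STATE_SAFE_FALLBACK;      /* arrived past the deadline */
        return false;
    }
    f->state = STATE_IDLE;
    return true;
}
```

Passing the time in as an argument, rather than reading a clock inside the functions, is a deliberate choice in this sketch: it keeps the logic testable on a host machine and makes the "response arrives after the timeout" case easy to reproduce.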
3. Batch-summarize-upload
For non-time-critical decisions, the agent accumulates sensor data locally, computes local summary statistics (mean, variance, peak, histogram), and uploads the summary batch on a schedule or when connectivity is available.
This is appropriate for:
- Daily anomaly digests sent to a predictive maintenance platform.
- Energy consumption summaries for billing.
- Model drift monitoring: the summary lets the cloud detect when sensor statistics diverge from the training distribution.
The batch payload is orders of magnitude smaller than raw data. A 24-hour recording of 100 Hz vibration data is about 8.6 million raw samples (roughly 34.6 MB at float32), whereas a summary holding mean/std/peak per 10-minute window is 144 windows × 3 values × 4 bytes, or about 1.7 KB.
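Per-window statistics can be accumulated one sample at a time, so the raw window never has to be held in SRAM. The sketch below uses Welford's online algorithm for mean and variance and tracks the peak absolute value. It stores variance rather than standard deviation to avoid a square root on the device; the receiver can take the root. The type and function names are illustrative assumptions.

```c
#include <stddef.h>

/* One uploaded record per window: 3 floats = 12 bytes. */
typedef struct { float mean, variance, peak; } window_summary_t;

/* Running accumulator for the current window (Welford's algorithm). */
typedef struct {
    size_t n;      /* samples seen so far */
    double mean;   /* running mean */
    double m2;     /* sum of squared deviations from the running mean */
    float  peak;   /* largest absolute value seen */
} window_acc_t;

void acc_reset(window_acc_t *a) {
    a->n = 0; a->mean = 0.0; a->m2 = 0.0; a->peak = 0.0f;
}

void acc_add(window_acc_t *a, float x) {
    a->n++;
    double d = (double)x - a->mean;
    a->mean += d / (double)a->n;
    a->m2 += d * ((double)x - a->mean);
    float ax = (x < 0.0f) ? -x : x;
    if (ax > a->peak) a->peak = ax;
}

/* Population variance; an empty window yields all zeros. */
window_summary_t acc_finish(const window_acc_t *a) {
    window_summary_t s = { 0.0f, 0.0f, a->peak };
    if (a->n > 0) {
        s.mean = (float)a->mean;
        s.variance = (float)(a->m2 / (double)a->n);
    }
    return s;
}

/* Bytes needed for one day of summaries at the given window length. */
size_t daily_summary_bytes(unsigned window_minutes) {
    if (window_minutes == 0) return 0;
    return (size_t)(24u * 60u / window_minutes) * sizeof(window_summary_t);
}
```

With 10-minute windows, `daily_summary_bytes(10)` is 144 records of 12 bytes each, 1,728 bytes in total, before any framing or JSON encoding overhead.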
4. Offline queue with sync
The agent may lose connectivity. During offline periods, it must continue to act on local logic (Pattern 4 from Sensor Agent Patterns) and queue events for later delivery.
/* Offline event queue: circular buffer in SRAM.
   Not thread-safe: guard with a mutex if events are queued and flushed
   from different tasks. */
#include <string.h>

#define QUEUE_DEPTH 64   /* one slot stays empty, so 63 events are retained */
#define MSG_MAX_LEN 256

static char event_queue[QUEUE_DEPTH][MSG_MAX_LEN];
static int queue_head = 0, queue_tail = 0;

void queue_event(const char *msg) {
    if ((queue_tail + 1) % QUEUE_DEPTH == queue_head)
        queue_head = (queue_head + 1) % QUEUE_DEPTH;  /* full: overwrite oldest */
    strncpy(event_queue[queue_tail], msg, MSG_MAX_LEN - 1);
    event_queue[queue_tail][MSG_MAX_LEN - 1] = '\0';  /* terminate on truncation */
    queue_tail = (queue_tail + 1) % QUEUE_DEPTH;
}

void flush_queue_when_online(void) {
    while (queue_head != queue_tail) {
        if (mqtt_publish_blocking(TOPIC_EVENT,
                                  event_queue[queue_head]) == 0) {
            queue_head = (queue_head + 1) % QUEUE_DEPTH;
        } else {
            break;  /* stop on publish failure; retry on the next flush */
        }
    }
}
SRAM consideration: 64 events × 256 bytes = 16 KB. Adjust depth to your available SRAM and acceptable data loss window.
Security implications of delegation
| Risk | Mitigation |
|---|---|
| Intercepted delegation request | TLS on MQTT (port 8883) or HTTPS; never send raw credentials |
| Forged command on cmd topic | Verify message signature (HMAC-SHA256) using a device-specific key |
| Command replay attack | Include timestamp and sequence number in command payload; reject stale commands |
| Credential exfiltration | Store private key in secure element (ATECC608, DS28S60); never in flash plaintext |
| Delegation endpoint as attack surface | Authenticate the endpoint by verifying TLS server certificate against pinned CA |
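The replay mitigation in the table can be sketched as a small freshness check that runs after the signature has been verified. Everything named here is an assumption for illustration: the `replay_guard_t` type, the 30-second age limit, and the premise that the device has a clock roughly synchronized with the command issuer (via SNTP, for example). Signature verification itself is not shown.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed policy: commands older (or newer) than this are rejected. */
#define MAX_COMMAND_AGE_S 30u

typedef struct {
    uint32_t last_seq;  /* highest sequence number accepted so far */
} replay_guard_t;

/* Returns true if the command is fresh and has not been seen before.
   Call only after the HMAC over the payload (including seq and timestamp)
   has been verified; otherwise an attacker can simply edit both fields. */
bool replay_guard_accept(replay_guard_t *g, uint32_t seq,
                         uint32_t cmd_ts_s, uint32_t now_ts_s) {
    uint32_t age = (now_ts_s >= cmd_ts_s) ? (now_ts_s - cmd_ts_s)
                                          : (cmd_ts_s - now_ts_s);
    if (age > MAX_COMMAND_AGE_S) return false;  /* stale, or clock skew too large */
    if (seq <= g->last_seq)      return false;  /* replayed or reordered */
    g->last_seq = seq;
    return true;
}
```

In this sketch `last_seq` lives in RAM, so a reboot resets it and a captured command could be replayed within the age window. Persisting the counter to non-volatile storage, or having the issuer derive sequence numbers from time, would close that gap.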
Latency budgets by delegation pattern
| Pattern | Typical latency | Suitable for |
|---|---|---|
| HTTP to local edge server (LAN) | 5–50 ms | Near-real-time decisions on LAN |
| MQTT via cloud broker → LLM endpoint | 200–1500 ms | Non-time-critical confirmations |
| Batch upload + async response | Minutes to hours | Policy updates, model refresh |
| Offline queue + sync on reconnect | Seconds to days | Resilient offline operation |
Platform example: ForestHub.ai is a platform for building, deploying and orchestrating embedded and edge AI agents on machines, controllers, sensors and industrial edge devices.
FAQ
Q: Should every event be delegated? No. Delegation adds latency, network dependency, and cost. High-confidence local decisions should never be delegated. Delegate only when the local model is uncertain or the decision has significant consequences requiring external validation.
Q: What format should delegation request payloads use? Pre-extracted features, not raw sensor data. Send the 64-float feature vector (256 bytes), not 10 seconds of raw 1 kHz ADC samples (40 KB). The pre-processing is done on-device because the MCU has the raw data locally and because you want the cloud to reason about features, not raw signals.
Q: Can the MCU use a local LLM endpoint on a gateway? Yes. A Raspberry Pi 5 or NVIDIA Jetson Orin Nano running a quantized local LLM (Llama-3 8B in 4-bit, for example) can serve as a local delegation endpoint. Treat the ~200–500 ms response time as a best case for short prompts and very short outputs; latency grows with prompt length and the number of generated tokens, so measure it on your own gateway. The MCU is unaware of whether the endpoint is a gateway LLM or a cloud LLM; it only sees the MQTT or HTTP interface.
Q: What happens if delegation always times out? The agent falls back to its local policy — the most recent threshold parameters and model weights. Good agents are designed to degrade gracefully: delegation improves decision quality but is not required for basic operation.