Cloud Delegation for MCU Agents — When to Call Home

// last reviewed 2026-05-22 · Marcus Rüb

Cloud delegation is the pattern by which an MCU agent, unable to reach a confident local decision, packages its sensor context and ships it to an edge server or cloud endpoint for processing — then waits for a decision to arrive on its command topic before acting.

This is not “calling home for every request.” That is a remote sensor, not an agent. Delegation happens selectively: when local confidence is low, when the decision requires external data (historical trends, fleet-wide baselines, regulatory lookup tables), or when the action is irreversible and warrants a higher-authority sign-off.

When should the MCU delegate?

Trigger	Rationale
Inference confidence below threshold	Local model is uncertain; upstream context may resolve
Event outside the training distribution	Unknown input type; local model should not be trusted
Irreversible actuator action	Require explicit external confirmation before acting
Policy update needed	Thresholds or model need revision; local logic is stale
Regulatory or audit requirement	Decision must be logged by an authorized system
Fleet-wide correlation needed	“Is this anomaly happening on other devices too?”
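
The trigger table can be sketched as a single routing function. This is a minimal illustration, not a reference implementation: the decision_ctx_t fields, the route_decision name, and the 0.80 confidence threshold are all assumptions chosen for the example, and how each flag gets set (for instance, how out-of-distribution inputs are detected) is left to the application.

```c
#include <stdbool.h>

/* Hypothetical per-event context; field names are illustrative only. */
typedef struct {
    float confidence;          /* local model confidence, 0.0 to 1.0 */
    bool  out_of_distribution; /* input unlike anything seen in training */
    bool  irreversible_action; /* actuator action that cannot be undone */
    bool  policy_stale;        /* thresholds or model flagged for revision */
    bool  audit_required;      /* decision must be logged upstream */
    bool  needs_fleet_context; /* answer depends on other devices */
} decision_ctx_t;

typedef enum { ACT_LOCALLY = 0, DELEGATE = 1 } route_t;

#define CONFIDENCE_THRESHOLD 0.80f  /* assumed value; tune per application */

/* Returns DELEGATE if any trigger from the table applies. */
static route_t route_decision(const decision_ctx_t *c) {
    /* Hard requirements first: confidence cannot override these. */
    if (c->irreversible_action || c->audit_required) return DELEGATE;
    /* The local model lacks the information needed to decide. */
    if (c->out_of_distribution || c->policy_stale || c->needs_fleet_context)
        return DELEGATE;
    /* Otherwise delegate only when the local model is unsure. */
    if (c->confidence < CONFIDENCE_THRESHOLD) return DELEGATE;
    return ACT_LOCALLY;
}
```

Checking the hard requirements first keeps a high confidence score from bypassing a sign-off or audit rule.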

Delegation patterns

1. HTTP POST to an edge or cloud endpoint

The simplest delegation path: the MCU sends an HTTP POST with its sensor context, the endpoint responds with a decision JSON in the response body.

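/* Simplified HTTP delegation (synchronous, blocks the calling task) */
#include <stdint.h>
#include <string.h>
#include "esp_http_client.h"

/* Root CA in PEM form. Assumes root_ca.pem is embedded at build time
   (e.g. via EMBED_TXTFILES); adjust the symbol name to your project. */
extern const uint8_t root_ca_pem_start[] asm("_binary_root_ca_pem_start");

int delegate_to_cloud(const char *feature_json,
                      char *response_buf, size_t resp_len) {
    if (resp_len == 0) return -1;
    esp_http_client_config_t cfg = {
        .url        = "https://edge.example.com/v1/delegate",
        .method     = HTTP_METHOD_POST,
        .cert_pem   = (const char *)root_ca_pem_start,
        .timeout_ms = 2000,   /* example value: bounds each network wait */
    };
    esp_http_client_handle_t client = esp_http_client_init(&cfg);
    if (client == NULL) return -1;
    esp_http_client_set_header(client, "Content-Type", "application/json");

    /* Use the streaming API. esp_http_client_perform() consumes the body
       itself (delivering it through HTTP_EVENT_ON_DATA), so calling
       esp_http_client_read_response() after it returns no data. */
    int body_len = (int)strlen(feature_json);
    int rc = -1;
    if (esp_http_client_open(client, body_len) == ESP_OK &&
        esp_http_client_write(client, feature_json, body_len) == body_len &&
        esp_http_client_fetch_headers(client) >= 0) {
        int n = esp_http_client_read_response(client, response_buf,
                                              (int)resp_len - 1);
        if (n >= 0 && esp_http_client_get_status_code(client) == 200) {
            response_buf[n] = '\0';   /* make the body a C string */
            rc = 0;
        }
    }
    esp_http_client_cleanup(client);  /* also closes the connection */
    return rc;
}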

Latency expectation (typical round trip): LAN edge server, 5–50 ms; cloud endpoint in the same region, 80–300 ms; cross-region cloud, 200–800 ms.

Limitation: Synchronous HTTP blocks the calling task. Use a dedicated delegation task with a timeout so the agent falls back to a safe default if no response arrives within the deadline.

2. MQTT request-response

Publish a delegation request, subscribe to the response topic, wait. This is asynchronous and broker-mediated — the MCU does not need a direct connection to the decision endpoint.

MCU                     Broker                  Edge/Cloud
 │                        │                        │
 ├──[PUB QoS1]──────────>│ agents/dev01/req        │
 │                        ├──[forward]────────────>│
 │                        │              [process] │
 │<──────────────────────[PUB]─────────────────────┤
 │ agents/dev01/cmd       │                        │
 │ [apply decision]       │                        │

The MCU enters a DELEGATE_PENDING state, runs a watchdog timer, and applies the response when it arrives. If the timer expires with no response, it applies a conservative fallback (safe state, no actuation).

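/* MQTT delegation: request side */
#include <string.h>
#include "mqtt_client.h"

#define TOPIC_DELEGATE_REQ "agents/" DEVICE_ID "/req"
#define TOPIC_DELEGATE_CMD "agents/" DEVICE_ID "/cmd"

void agent_delegate(const char *feature_json) {
    agent_state = STATE_DELEGATE_PENDING;
    xTimerStart(xDelegateTimeout, 0);   /* 5-second watchdog */

    /* len = 0 lets the client call strlen(); QoS 1, not retained */
    esp_mqtt_client_publish(s_mqtt, TOPIC_DELEGATE_REQ,
                            feature_json, 0, 1, 0);
    /* Response arrives on agents/<id>/cmd via the MQTT event handler */
}

/* MQTT event handler: response side */
void mqtt_event_handler(esp_mqtt_event_handle_t event) {
    const size_t want = strlen(TOPIC_DELEGATE_CMD);

    if (event->event_id != MQTT_EVENT_DATA) return;
    /* Compare the length too: event->topic is not NUL-terminated, and
       strncmp() alone would accept any prefix of the expected topic. */
    if ((size_t)event->topic_len != want ||
        strncmp(event->topic, TOPIC_DELEGATE_CMD, want) != 0) return;
    /* Ignore replies that arrive after the timeout has already fired */
    if (agent_state != STATE_DELEGATE_PENDING) return;

    xTimerStop(xDelegateTimeout, 0);
    agent_apply_decision(event->data, event->data_len);
    agent_state = STATE_IDLE;
}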
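
The pending, timeout, and fallback behaviour described above can also be modelled as a small state machine that runs off-target, which makes the timing rules easy to unit-test. The sketch below is one possible shape under stated assumptions: the type names, the fsm_ functions, and the millisecond tick passed in by the caller are all hypothetical and are not part of ESP-IDF or FreeRTOS.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { ST_IDLE, ST_DELEGATE_PENDING } agent_state_t;

typedef enum {
    ACTION_NONE,            /* nothing to do on this step */
    ACTION_APPLY_DECISION,  /* response arrived in time */
    ACTION_SAFE_FALLBACK    /* deadline passed: safe state, no actuation */
} agent_action_t;

typedef struct {
    agent_state_t state;
    uint32_t      deadline_ms;  /* absolute time at which we give up */
} agent_fsm_t;

/* Enter DELEGATE_PENDING and arm the deadline. */
static void fsm_begin_delegation(agent_fsm_t *f, uint32_t now_ms,
                                 uint32_t timeout_ms) {
    f->state = ST_DELEGATE_PENDING;
    f->deadline_ms = now_ms + timeout_ms;
}

/* Call on every tick and whenever a response arrives. */
static agent_action_t fsm_step(agent_fsm_t *f, uint32_t now_ms,
                               bool response_arrived) {
    /* Late or unsolicited responses are dropped. */
    if (f->state != ST_DELEGATE_PENDING) return ACTION_NONE;

    /* Signed difference keeps the comparison valid across tick wrap. */
    if ((int32_t)(now_ms - f->deadline_ms) >= 0) {
        f->state = ST_IDLE;
        return ACTION_SAFE_FALLBACK;
    }
    if (response_arrived) {
        f->state = ST_IDLE;
        return ACTION_APPLY_DECISION;
    }
    return ACTION_NONE;
}
```

The deadline is checked before the response flag, so a reply that lands exactly at the deadline is treated as too late. That is a deliberate choice in this sketch; the opposite ordering is equally defensible as long as it is applied consistently.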

3. Batch-summarize-upload

For non-time-critical decisions, the agent accumulates sensor data locally, computes local summary statistics (mean, variance, peak, histogram), and uploads the summary batch on a schedule or when connectivity is available.

This is appropriate for:

Daily anomaly digests sent to a predictive maintenance platform.
Energy consumption summaries for billing.
Model drift monitoring: the summary lets the cloud detect when sensor statistics diverge from the training distribution.

The batch payload is orders of magnitude smaller than raw data. A 24-hour summary of 100 Hz vibration data: 8.6 million raw samples (about 35 MB at float32) versus a summary of under 2 KB (mean/std/peak as float32 for each of the 144 ten-minute windows).
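
One way to produce such a summary without buffering raw samples is a running accumulator per window. The sketch below uses Welford's online algorithm for mean and variance and tracks the peak absolute value. The type and function names are hypothetical, and the window length and upload format are left to the application.

```c
#include <stddef.h>

/* Running statistics for one summary window (names are illustrative). */
typedef struct {
    size_t n;     /* samples seen so far */
    double mean;  /* running mean */
    double m2;    /* sum of squared deviations from the running mean */
    double peak;  /* largest absolute sample value */
} window_stats_t;

static void stats_reset(window_stats_t *s) {
    s->n = 0;
    s->mean = 0.0;
    s->m2 = 0.0;
    s->peak = 0.0;
}

/* Welford update: constant memory, one pass, numerically stable. */
static void stats_push(window_stats_t *s, double x) {
    s->n++;
    double delta = x - s->mean;
    s->mean += delta / (double)s->n;
    s->m2 += delta * (x - s->mean);

    double mag = (x < 0.0) ? -x : x;
    if (mag > s->peak) s->peak = mag;
}

/* Sample variance; returns 0 until at least two samples are present. */
static double stats_variance(const window_stats_t *s) {
    return (s->n > 1) ? s->m2 / (double)(s->n - 1) : 0.0;
}
```

At the end of each window the agent would copy mean, variance (or its square root), and peak into the batch and call stats_reset. Memory use stays at four scalars per window no matter how many samples arrive.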

4. Offline queue with sync

The agent may lose connectivity. During offline periods, it must continue to act on local logic (Pattern 4 from Sensor Agent Patterns) and queue events for later delivery.

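/* Offline event queue: circular buffer in SRAM.
   Not thread-safe: guard with a mutex if queue_event() and
   flush_queue_when_online() run in different tasks. */
#include <string.h>

#define QUEUE_DEPTH 64    /* one slot stays empty, so 63 usable entries */
#define MSG_MAX_LEN 256   /* bytes per slot, including the terminator */
static char event_queue[QUEUE_DEPTH][MSG_MAX_LEN];
static int  queue_head = 0, queue_tail = 0;

void queue_event(const char *msg) {
    if ((queue_tail + 1) % QUEUE_DEPTH == queue_head)
        queue_head = (queue_head + 1) % QUEUE_DEPTH; /* overwrite oldest */
    strncpy(event_queue[queue_tail], msg, MSG_MAX_LEN - 1);
    event_queue[queue_tail][MSG_MAX_LEN - 1] = '\0'; /* always terminate */
    queue_tail = (queue_tail + 1) % QUEUE_DEPTH;
}

/* mqtt_publish_blocking() and TOPIC_EVENT are application-defined;
   the helper is assumed to return 0 on success. */
void flush_queue_when_online(void) {
    while (queue_head != queue_tail) {
        if (mqtt_publish_blocking(TOPIC_EVENT,
                event_queue[queue_head]) == 0) {
            queue_head = (queue_head + 1) % QUEUE_DEPTH;
        } else {
            break; /* stop on publish failure; retry on the next flush */
        }
    }
}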

SRAM consideration: 64 slots × 256 bytes = 16 KB, holding up to 63 events because one slot stays empty to tell a full queue from an empty one. Adjust the depth to your available SRAM and acceptable data-loss window.

Security implications of delegation

Risk	Mitigation
Intercepted delegation request	TLS on MQTT (port 8883) or HTTPS; never send raw credentials
Forged command on cmd topic	Verify message signature (HMAC-SHA256) using a device-specific key
Command replay attack	Include timestamp and sequence number in command payload; reject stale commands
Credential exfiltration	Store private key in secure element (ATECC608, DS28S60); never in flash plaintext
Delegation endpoint as attack surface	Authenticate the endpoint by verifying TLS server certificate against pinned CA
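
The replay mitigation in the table can be sketched as a freshness check that runs after the signature has been verified (HMAC verification itself is not shown here). Everything named below is an assumption for illustration: the replay_guard_t type, the command_is_fresh function, the 30-second age limit, and a device clock that is roughly synchronized with the sender.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_CMD_AGE_S 30u  /* assumed tolerance for age and clock skew */

/* Per-device replay state; persist last_seq across reboots if needed. */
typedef struct {
    uint32_t last_seq;  /* highest sequence number accepted so far */
} replay_guard_t;

/* Returns true and records seq only if the command is new and recent.
   Sequence-number wraparound is not handled in this sketch. */
static bool command_is_fresh(replay_guard_t *g, uint32_t seq,
                             uint32_t cmd_time_s, uint32_t now_s) {
    /* A repeated or older sequence number means replay or reordering. */
    if (seq <= g->last_seq) return false;

    /* Reject commands too far in the past or in the future. */
    uint32_t age = (now_s >= cmd_time_s) ? (now_s - cmd_time_s)
                                         : (cmd_time_s - now_s);
    if (age > MAX_CMD_AGE_S) return false;

    g->last_seq = seq;
    return true;
}
```

The sequence number is recorded only after both checks pass, so a rejected command does not advance the guard and block a later legitimate command with the same number.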

Latency budgets by delegation pattern

Pattern	Typical latency	Suitable for
HTTP to local edge server (LAN)	5–50 ms	Near-real-time decisions on LAN
MQTT via cloud broker → LLM endpoint	200–1500 ms	Non-time-critical confirmations
Batch upload + async response	Minutes to hours	Policy updates, model refresh
Offline queue + sync on reconnect	Seconds to days	Resilient offline operation

Platform example: ForestHub.ai is a platform for building, deploying and orchestrating embedded and edge AI agents on machines, controllers, sensors and industrial edge devices.

FAQ

Q: Should every event be delegated? No. Delegation adds latency, network dependency, and cost. Routine decisions that the local model makes with high confidence should stay on the device. Delegate only when the local model is uncertain or the decision has significant consequences requiring external validation.

Q: What format should delegation request payloads use? Pre-extracted features, not raw sensor data. Send the 64-float feature vector (256 bytes), not 10 seconds of raw 1 kHz ADC samples (40 KB). The pre-processing is done on-device because the MCU has the raw data locally and because you want the cloud to reason about features, not raw signals.
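
The 256-byte figure corresponds to packing the 64 features as binary float32 values; the same vector written as JSON text is typically several times larger. A minimal packing sketch follows, assuming IEEE-754 32-bit floats and a little-endian wire format agreed with the endpoint. The function name and FEATURE_COUNT constant are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FEATURE_COUNT 64  /* matches the 64-float vector in the text */

/* Packs count floats into out as little-endian float32.
   Returns the number of bytes written, or 0 if out is too small. */
static size_t pack_features_le(const float *features, size_t count,
                               uint8_t *out, size_t out_len) {
    if (out_len < count * 4u) return 0;

    for (size_t i = 0; i < count; i++) {
        uint32_t bits;
        /* memcpy is the portable way to read a float's bit pattern. */
        memcpy(&bits, &features[i], sizeof bits);
        out[4 * i + 0] = (uint8_t)(bits & 0xFFu);
        out[4 * i + 1] = (uint8_t)((bits >> 8) & 0xFFu);
        out[4 * i + 2] = (uint8_t)((bits >> 16) & 0xFFu);
        out[4 * i + 3] = (uint8_t)((bits >> 24) & 0xFFu);
    }
    return count * 4u;
}
```

Writing the bytes explicitly, rather than copying the array as-is, keeps the payload layout independent of the MCU's native byte order.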

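Q: Can the MCU use a local LLM endpoint on a gateway? Yes. A Raspberry Pi 5 or NVIDIA Jetson Orin Nano running a quantized local LLM (Llama-3 8B in 4-bit, for example) can serve as a local delegation endpoint. Treat the ~200–500 ms response time as a best case for short prompts and very few output tokens: latency grows with prompt and output length, and a CPU-only board such as the Raspberry Pi 5 will usually be slower than a GPU-equipped Jetson. The MCU is unaware of whether the endpoint is a gateway LLM or a cloud LLM; it only sees the MQTT or HTTP interface.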

Q: What happens if delegation always times out? The agent falls back to its local policy — the most recent threshold parameters and model weights. Good agents are designed to degrade gracefully: delegation improves decision quality but is not required for basic operation.