Default budgets
When a plugin’s manifest does not declare resources, it gets:

| Resource | Default cap |
|---|---|
| RAM | 64 MB |
| CPU | 10% of one core |
| PIDs | 4 |
| Disk | 100 MB |
| Inbound topic outbox | 256 messages per topic |
Spotting cgroup throttling
max_cpu_percent is enforced by the cgroup cpu.max controller.
A plugin that breaches it does not crash; it gets throttled. The
symptom is “everything got slower for no reason.”
Read the throttle stats in the plugin's cgroup cpu.stat file and
watch nr_throttled. If it climbs every
second, your plugin is over budget. Either optimize the code or
declare a higher max_cpu_percent.
ados plugin profile <id> (see below) surfaces this in a
human-readable table.
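The throttle check described above can be sketched in a few lines of Python. This is an illustration only: the field names follow the cgroup v2 cpu.stat format, but the sample values are made up, and the cgroup directory passed to read_cpu_stat is a hypothetical argument because this page does not say where a plugin's cgroup lives.

```python
from pathlib import Path


def parse_cpu_stat(text: str) -> dict[str, int]:
    """Parse the 'key value' lines of a cgroup v2 cpu.stat file."""
    stats: dict[str, int] = {}
    for line in text.splitlines():
        key, _, value = line.partition(" ")
        if value.strip().isdigit():
            stats[key] = int(value)
    return stats


def throttled_delta(before: dict[str, int], after: dict[str, int]) -> int:
    """Number of enforcement periods that were throttled between two reads."""
    return after.get("nr_throttled", 0) - before.get("nr_throttled", 0)


def read_cpu_stat(cgroup_dir: str) -> dict[str, int]:
    # cgroup_dir is an assumption: pass the plugin's real cgroup directory.
    return parse_cpu_stat(Path(cgroup_dir, "cpu.stat").read_text())


# Made-up sample contents, taken one second apart.
SAMPLE_T0 = """usage_usec 1200000
user_usec 900000
system_usec 300000
nr_periods 100
nr_throttled 12
throttled_usec 340000
"""
SAMPLE_T1 = SAMPLE_T0.replace("nr_throttled 12", "nr_throttled 19")

if __name__ == "__main__":
    delta = throttled_delta(parse_cpu_stat(SAMPLE_T0), parse_cpu_stat(SAMPLE_T1))
    if delta > 0:
        print(f"throttled in {delta} periods since the last read")
```

Comparing two reads matters more than the absolute value: nr_throttled is a cumulative counter, so only its growth says the plugin is over budget right now.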
Memory pressure
max_ram_mb is enforced by memory.max. A breach OOM-kills the
process. The supervisor marks it crashed, and the crash counts
toward the circuit breaker's three-strikes-in-five-minutes rule.
Watch for the warning band: the cgroup's
memory.events file shows low, high, max, oom, and oom_kill
counters. Any non-zero high or max means the kernel started
applying pressure (reclaiming page cache, swapping if allowed).
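A minimal sketch of that check, assuming the memory.events text has already been read from the plugin's cgroup directory (its location is not given here). The counter names are the ones listed above; the sample values are invented for illustration.

```python
def parse_memory_events(text: str) -> dict[str, int]:
    """Parse the 'key value' lines of a cgroup v2 memory.events file."""
    events: dict[str, int] = {}
    for line in text.splitlines():
        key, _, value = line.partition(" ")
        if value.strip().isdigit():
            events[key] = int(value)
    return events


def in_warning_band(events: dict[str, int]) -> bool:
    """True once the kernel has started applying memory pressure."""
    return events.get("high", 0) > 0 or events.get("max", 0) > 0


def was_oom_killed(events: dict[str, int]) -> bool:
    """True if at least one process in the cgroup was OOM-killed."""
    return events.get("oom_kill", 0) > 0


# Invented sample: pressure has started, but nothing has been killed yet.
SAMPLE_EVENTS = """low 0
high 3
max 0
oom 0
oom_kill 0
"""

if __name__ == "__main__":
    events = parse_memory_events(SAMPLE_EVENTS)
    print("warning band:", in_warning_band(events))
    print("oom-killed:", was_oom_killed(events))
```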
To stay under the cap:
- Process events on a bounded queue, not an unbounded list.
- For ML models, prefer quantized weights (int8 over fp32).
- Stream files; do not load whole logs into RAM.
- Use array.array or numpy over a Python list of floats.
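Two of the tips above are easy to show with the standard library alone. The sketch below uses collections.deque with maxlen as one possible bounded queue (it silently drops the oldest item when full, which is an assumption about what the plugin can tolerate) and compares the footprint of a list of floats with array.array. The helper list_bytes is a name made up for this example.

```python
import sys
from array import array
from collections import deque

# Bounded queue: once maxlen is reached, the oldest event is discarded.
events: deque[int] = deque(maxlen=3)
for event_id in range(10):
    events.append(event_id)


def list_bytes(values: list[float]) -> int:
    """Approximate footprint of a list: the list itself plus each float object."""
    return sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)


samples = [float(i) for i in range(10_000)]
packed = array("d", samples)  # one contiguous buffer of 8-byte doubles

if __name__ == "__main__":
    print("events kept:", list(events))
    print("list bytes: ", list_bytes(samples))
    print("array bytes:", sys.getsizeof(packed))
```

If dropping events is not acceptable, asyncio.Queue(maxsize=...) gives backpressure instead: producers wait in put() until a consumer makes room.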
The ados plugin profile subcommand
ados plugin profile <id> runs for 60 seconds and prints a
summary.
Profiling Python plugins
The SDK does not bundle a profiler; use cProfile or
pyinstrument. The simplest path:
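A minimal sketch of one such path, using only the standard-library cProfile and pstats modules. The handle_sample function is a hypothetical stand-in for a plugin handler, and profile_call is a helper invented for this example, not part of the SDK.

```python
import cProfile
import io
import pstats


def handle_sample(n: int) -> int:
    """Hypothetical handler: stands in for whatever the plugin does per event."""
    return sum(i * i for i in range(n))


def profile_call(func, *args):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args)
    finally:
        profiler.disable()
    buffer = io.StringIO()
    # Sort by cumulative time and keep only the top five entries.
    pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


if __name__ == "__main__":
    _, report = profile_call(handle_sample, 100_000)
    print(report)
```

Printing the report to stdout keeps the example independent of any logging setup; a plugin could just as well send the text through its own logger.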
To profile an async handler, run it through pyinstrument and read
the plugin's output with
journalctl -u ados-plugin@com.example.battery.service.
Profiling TypeScript plugins
Open the GCS in developer mode (?dev=1), pick the plugin’s
iframe in the browser devtools (Sources tab), and use the
Performance recorder. The iframe is a real browsing context with
the standard devtools surface.
Common GCS hot paths:
- React re-renders triggered by every telemetry event. Coalesce with requestAnimationFrame and a local ref.
- Heavy SVG redraws. Switch to canvas if the path count goes above ~200.
- Synchronous JSON parse in the message handler. Move it to a Web Worker or off the main thread.
Event-loop discipline
The biggest cause of agent-plugin crashes is a blocked event loop. Symptoms: a missed watchdog ping, a supervisor SIGKILL, the
crashed lifecycle event, and a journal entry showing
Watchdog timeout (limit 30s).
Don’t:
- time.sleep(...) in async code.
- Heavy CPU work directly in on_start.
- Synchronous I/O on a slow disk.
Do:
- await asyncio.sleep(...).
- await asyncio.to_thread(heavy_work).
- aiofiles or chunked reads for slow I/O.
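The effect of moving blocking work off the loop can be sketched as follows. The heartbeat coroutine is a hypothetical stand-in for the watchdog ping, and heavy_work uses time.sleep to simulate a blocking call; neither is SDK code.

```python
import asyncio
import time


def heavy_work(n: int) -> int:
    """Blocking function: time.sleep stands in for slow synchronous work."""
    time.sleep(0.2)
    return n * 2


async def heartbeat(ticks: list[float], stop: asyncio.Event) -> None:
    """Stand-in for a watchdog ping: it only runs while the loop is free."""
    while not stop.is_set():
        ticks.append(time.monotonic())
        await asyncio.sleep(0.02)


async def main() -> tuple[int, int]:
    ticks: list[float] = []
    stop = asyncio.Event()
    beat = asyncio.create_task(heartbeat(ticks, stop))
    # The blocking call runs in a worker thread, so heartbeat keeps ticking.
    result = await asyncio.to_thread(heavy_work, 21)
    stop.set()
    await beat
    return result, len(ticks)


if __name__ == "__main__":
    result, tick_count = asyncio.run(main())
    print(f"result={result}, heartbeats during work={tick_count}")
```

Calling heavy_work(21) directly inside main would leave the heartbeat silent for the full 0.2 seconds. Note that asyncio.to_thread helps most with blocking I/O; pure-Python CPU work in a thread still competes for the interpreter lock.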
Disk discipline
Disk I/O is the second-most-common cause of throttling. A plugin writing 1 MB of samples per second to its data dir on an SD card is already at the SBC’s sustained-write ceiling for the cheap end of the SD card market. To stay under the cap:
- Batch writes. One fsync per second is plenty.
- Compress logs (zstd is fast and saves a lot on JSONL).
- Rotate. The SDK helper ctx.logging.rotating_file_handler does what it says.
- Delete on uninstall. The host wipes your data dir, but in-life cleanup is your job.
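The batching advice above can be sketched with the standard library. BatchedWriter is a hypothetical class written for this example, not an SDK helper: it appends JSONL records through Python's file buffer and calls fsync at most once per interval.

```python
import json
import os
import time


class BatchedWriter:
    """Append JSONL records, forcing them to disk at most once per interval."""

    def __init__(self, path: str, flush_interval: float = 1.0) -> None:
        self._file = open(path, "a", encoding="utf-8")
        self._interval = flush_interval
        self._last_sync = time.monotonic()
        self.sync_count = 0  # exposed so callers can observe fsync frequency

    def write(self, record: dict) -> None:
        self._file.write(json.dumps(record) + "\n")
        if time.monotonic() - self._last_sync >= self._interval:
            self.sync()

    def sync(self) -> None:
        self._file.flush()
        os.fsync(self._file.fileno())
        self._last_sync = time.monotonic()
        self.sync_count += 1

    def close(self) -> None:
        self.sync()  # make sure the final batch reaches the disk
        self._file.close()


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as data_dir:
        writer = BatchedWriter(os.path.join(data_dir, "samples.jsonl"))
        for i in range(1000):
            writer.write({"seq": i, "voltage": 11.1})
        writer.close()
        print("fsync calls:", writer.sync_count)
```

The trade-off is durability: a crash can lose up to one interval of records, which is usually acceptable for telemetry samples.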
Network discipline
Plugins with network.outbound are under no host-imposed
bandwidth cap, but the drone’s link is shared with telemetry and
video. A plugin pulling weights from the cloud during flight is
a UX bug. Pull during ground time and cache.