# Xen Orchestra 6.5's Full REST API — 245 Tools from One DADL

> Xen Orchestra 6.4 stabilized its REST API and 6.5 added PATCH updates and VM controllers. Vates ships a first-party MCP server — 7 tools, read-only by design. We took a different cut: the entire 245-endpoint surface, read and write, as one DADL. Both belong in the same registry. Here is why.

Canonical: https://www.toolmesh.io/en/blog/xen-orchestra-full-rest-api-in-one-dadl/

Xen Orchestra has spent the last two releases turning its REST API into something you can build on. [XO 6.4](https://xen-orchestra.com/blog/) (April 2026) marked `/rest/v0/` as stable and added fine-grained RBAC v2; 6.5 added `PATCH /vms/{id}` for partial VM updates and a separate `/vm-controllers` collection. In February, Vates also shipped a [first-party MCP server](https://xen-orchestra.com/blog/mcp-meets-xen-orchestra/) bundled with XO 6.2 — seven tools, deliberately read-only, so teams can enable it on production from day one.

We took a different cut of the same API. The DADL for Xen Orchestra covers the **entire 245-endpoint REST surface** — read and write — as one YAML file. This post is the build log, and an honest answer to the obvious question: why two MCP surfaces for the same product?

## What the REST API now covers

The XO REST API is no longer a thin read layer. As of 6.5 it exposes the full object graph: VMs (including `PATCH` partial updates and snapshot revert), VM controllers, VM templates, VM/VDI snapshots, hosts, pools, storage (SR/VDI/VBD/PBD/SM), networking (VIF/PIF), and hardware passthrough (PCI/PGPU). It also exposes every lifecycle action: power, snapshot, clone, migrate, export/import, hotplug. On top of the objects sit the operational surfaces: tasks, backups (jobs, logs, repositories, restore, schedules), real-time change events over Server-Sent Events, RBAC v2 (users, groups, acl-roles, acl-privileges), and host/pool maintenance (rolling reboot, rolling update, emergency shutdown).

That is a large API, and almost all of it mutates infrastructure. None of the write side is reachable through the first-party MCP server — by design, not by omission.

## The numbers

| | DADL (`xen-orchestra.dadl`) | `@xen-orchestra/mcp` (first-party) |
|---|---|---|
| Tools | **245** | 7 |
| Scope | full read **and** write | read-only by design |
| Coverage | every REST object + lifecycle action, backups, RBAC v2, SSE events, SDN traffic rules, dashboards, auth tokens | infrastructure summary, list/inspect pools / hosts / VMs, pool dashboard, documentation search |
| Create / start / migrate / delete? | yes | no — intentionally |
| Distribution | one 245-tool YAML file | Node server bundled with XO 6.2+ |
| Tracking a new API release | ~30 lines of YAML per endpoint | ships with the XO release |
| Governance layer | credentials, authorization, audit (via ToolMesh) | none — local, in-process |

The local validation, run from the registry checkout:

```
> cd dadl-registry && npm run validate
✅ xen-orchestra.dadl — 245 tools
```

Source on GitHub: [`dadl-registry/xen-orchestra.dadl`](https://github.com/DunkelCloud/dadl-registry/blob/main/xen-orchestra.dadl). Live in the registry at [dadl.ai/d/xen-orchestra](https://dadl.ai/d/xen-orchestra). The DADL spec itself: [dadl.ai/spec](https://dadl.ai/spec/dadl-spec-v0.1.md).

## A worked example: a VM change, made undoable

The point of the write side is that it actually changes infrastructure, safely and with a trail. Here is a three-call round trip: find the VM by name, snapshot it so the change is reversible, then patch it.

```js
// 1. Find the VM by name (XO filter syntax, compact fields)
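const [vm] = await toolmesh.xen_orchestra_list_vms({
  filter: "name_label:web-fra-01",
  fields: "uuid,name_label,tags"
});
// Stop here if nothing matched; vm would be undefined in the calls below
if (!vm) throw new Error("no VM named web-fra-01");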

// 2. Snapshot first, so the change is undoable
await toolmesh.xen_orchestra_snapshot_vm({
  id: vm.uuid,
  name_label: `pre-change ${new Date().toISOString().slice(0, 10)}`
});

// 3. PATCH the live VM — tags is a hot field, applied while running (XO 6.5)
await toolmesh.xen_orchestra_update_vm({
  id: vm.uuid,
  tags: [...vm.tags, "managed:toolmesh"]
});
```

Three calls. No glue code. Steps 2 and 3 are exactly what a read-only MCP server cannot do — and the DADL transparently handles:

- the unusual cookie-based auth (the token is injected server-side as `Cookie: authenticationToken=…` — the model never sees it);
- the async-by-default action model — most writes return a task reference; you add `sync` only for short operations and poll `get_task` for long ones (migrate, export) to avoid HTTP client timeouts;
- the cold-vs-hot field distinction on `update_vm` — `memoryStaticMax`, `cpusStaticMax`, `secureBoot` need a halted VM, while `tags`, `nameLabel`, and `cpus` apply live (the example uses `tags`, so it runs without downtime);
- retries on transient failures and terminal classification for `404` ("no such VM") and RBAC `403`.
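
Two of those behaviours are easy to picture from the caller's side. The sketch below is a minimal illustration, not the DADL's or ToolMesh's implementation: `splitVmUpdate` sorts an `update_vm` payload using only the field names listed above, and `waitForTask` polls a task until it stops being pending. The helper names, the injected `getTask` callback, the `xen_orchestra_get_task` call shape shown in the comment, and the `"pending"` status value are assumptions made for illustration.

```js
// Field lists taken from the cold-vs-hot bullet above; any field not listed
// there is reported as "unknown" rather than guessed.
const COLD_FIELDS = new Set(["memoryStaticMax", "cpusStaticMax", "secureBoot"]);
const HOT_FIELDS = new Set(["tags", "nameLabel", "cpus"]);

// Split an update_vm payload into what applies live and what needs a halted VM.
function splitVmUpdate(changes) {
  const result = { hot: {}, cold: {}, unknown: {} };
  for (const [field, value] of Object.entries(changes)) {
    const bucket = HOT_FIELDS.has(field)
      ? "hot"
      : COLD_FIELDS.has(field)
        ? "cold"
        : "unknown";
    result[bucket][field] = value;
  }
  return result;
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Poll a task until it is no longer pending, or give up at the deadline.
// `getTask` is injected, e.g. (id) => toolmesh.xen_orchestra_get_task({ id })
// (assumed call shape). The "pending" status value is also an assumption.
async function waitForTask(getTask, taskId, { intervalMs = 2000, timeoutMs = 600000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const task = await getTask(taskId);
    if (task.status !== "pending") return task;
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`task ${taskId} still pending after ${timeoutMs} ms`);
    }
    await sleep(intervalMs);
  }
}
```

Keeping `getTask` injectable means the same loop can be exercised against a stub before it is pointed at a real Xen Orchestra instance.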

Same calls from Claude or any other agent. No Xen-Orchestra-specific code on the client side.

## Read-only is a feature, not a gap

The natural framing — "245 beats 7" — is the wrong one. The two surfaces answer different questions.

The first-party MCP server is read-only *on purpose*. It runs in-process, ships with XO, needs no external infrastructure, and cannot break anything. If the question is *"let an assistant look at my infrastructure and answer questions about it,"* that is the right shape, and you should use it. We are not trying to replace it — the DADL even includes a `get_mcp_status` tool that reports whether XO's own built-in MCP server is enabled.

The DADL answers a different question: *"let an agent operate my infrastructure — create, patch, snapshot, migrate, back up, provision RBAC — without handing it raw admin credentials."* That only makes sense with a control layer underneath. Through ToolMesh, every one of the 245 tools runs behind centralized credentials (the XO token never reaches the model), authorization (an agent can be granted `list_vms` and `snapshot_vm` but not `delete_sr`), and an audit log (every call recorded with what, when, and why). The write side is the whole point, and the governance layer is what makes the write side safe to expose.
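
As a rough picture of what that authorization and audit step involves, here is a minimal default-deny sketch. It is not ToolMesh's code: the grant table, the `ops-agent` name, the `authorize` function, and the audit record fields are all hypothetical, chosen to mirror the "what, when, and why" described above.

```js
// Hypothetical grant table: agent name -> set of tool names it may call.
const grants = new Map([
  ["ops-agent", new Set(["list_vms", "snapshot_vm"])],
]);

// In-memory stand-in for an audit log; a real one would be durable.
const auditLog = [];

// Decide whether `agent` may call `tool`, and record the decision either way.
// Anything not explicitly granted is denied.
function authorize(agent, tool, reason, now = new Date()) {
  const allowed = grants.get(agent)?.has(tool) ?? false;
  auditLog.push({ when: now.toISOString(), agent, tool, reason, allowed });
  return allowed;
}
```

Denied calls are logged as well as allowed ones, so the trail shows what an agent attempted and not only what it was permitted to do.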

Both definitions live in the same registry. They are complements: read-only discovery in-process, full lifecycle management through a governed gateway.

## Tracking the API is a YAML diff

XO 6.4 added RBAC v2; 6.5 added `PATCH /vms/{id}`, `/vm-controllers`, and a REST snapshot-revert endpoint. Reflecting each of those in the DADL is the same small unit of work: open `xen-orchestra.dadl`, add the tool block (~30 lines of YAML per endpoint), run `npm run validate`, open a PR, review the diff, merge. ToolMesh picks the new tools up on next reload. No SDK changes, no release to cut, no npm publish, no "please upgrade" to send users.

Two concrete examples from the current release:

- 6.5 shipped `POST /vms/{id}/actions/revert_snapshot` for automatable snapshot revert. The DADL already carries it as `revert_vm_to_snapshot`, annotated *"requires XO 6.5+ (route returns 404 on older builds)"* — so against an older host the agent gets a clean terminal error instead of a silent failure.
- 6.5 also lays groundwork for a dedicated *VM administrator* role. You do not have to wait for it: the DADL exposes the full RBAC v2 surface (`create_acl_role`, `create_acl_privilege`, `attach_acl_role_to_user`), so you can compose a least-privilege operator role today and grant an agent exactly the privileges it should hold — read on `vm`, `host`, `pool`, and nothing more.
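
The "clean terminal error" in the first example comes down to deciding which HTTP statuses are worth retrying. A minimal sketch of that decision follows. The `classifyStatus` helper is hypothetical, and the set of statuses treated as transient is an assumption, since this post only states that `404` and RBAC `403` are terminal.

```js
// 404 (missing route or object) and 403 (RBAC denial) are terminal per this post.
const TERMINAL_STATUSES = new Set([403, 404]);
// Assumption: these statuses are treated as transient and retried.
const TRANSIENT_STATUSES = new Set([429, 502, 503, 504]);

// Map an HTTP status to "ok", "retry", or "terminal".
function classifyStatus(status) {
  if (status >= 200 && status < 300) return "ok";
  if (TERMINAL_STATUSES.has(status)) return "terminal";
  if (TRANSIENT_STATUSES.has(status)) return "retry";
  // Default: do not retry anything that is not known to be transient.
  return "terminal";
}
```

Under this scheme, calling `revert_vm_to_snapshot` against a pre-6.5 host would surface the `404` immediately instead of looping on retries.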

We made the general version of this argument [in an earlier post](/en/blog/dadl-the-end-of-mcp-server-boilerplate); XO is the concrete case study for a write-heavy API.

## The context-window argument

There is a second reason the tool count matters less than it looks: **Code Mode**. Instead of injecting all 245 XO tool descriptions into the model's context window, ToolMesh exposes two meta-tools (`list_tools` and `execute_code`). The model discovers what is available on demand and calls flat tool names from inside a sandboxed JavaScript runtime — `toolmesh.xen_orchestra_list_vms`, `toolmesh.xen_orchestra_snapshot_vm`, not a nested namespace.

The token cost is roughly constant: ~1k tokens of meta-tool description, whether ToolMesh fronts one backend with 7 tools or every backend in the registry. A 245-tool backend adds zero tokens to every Claude conversation that does not actively touch Xen Orchestra. That is also why the JavaScript in the worked example above is not an illustration: it is literally what the model writes to call the tools.
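
To make the flat naming concrete, here is a small sketch of how a `toolmesh.<backend>_<tool>` namespace could be assembled from a registry listing. This is an illustration under assumptions, not ToolMesh's runtime: `buildFlatNamespace`, the `registry` object shape, and the `invoke` callback are hypothetical.

```js
// Build a flat namespace of callables named `<backend>_<tool>`.
// `registry` maps a backend name to its list of tool names (assumed shape);
// `invoke` is whatever actually dispatches the call (assumed callback).
function buildFlatNamespace(registry, invoke) {
  const toolmesh = {};
  for (const [backend, tools] of Object.entries(registry)) {
    for (const tool of tools) {
      toolmesh[`${backend}_${tool}`] = (args = {}) => invoke(backend, tool, args);
    }
  }
  return toolmesh;
}
```

The namespace lives inside the sandbox, so its size affects what the code can call but not what is injected into the model's context.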

## Honest caveats

The DADL is not a universal win. Real limits:

- **The SSE event stream is half-declarative.** `open_events_stream` returns a subscription ID and the DADL describes the subscribe handshake, but the Server-Sent Events stream itself is a streaming protocol — DADL describes REST, not long-lived streams. For polling task state, `get_task?wait=true` long-polls cleanly; for a true real-time feed you are at the edge of what a declarative REST definition models.
- **VDIs cannot be renamed via REST.** There is no `PATCH /vdis/{id}` in the API — so there is no `update_vdi` tool. Rename via tags or drop to the JSON-RPC API. We expose what exists; we cannot expose what does not.
- **SDN traffic rules need a Premium plugin.** The four traffic-rule tools (`add_network_traffic_rule`, `delete_network_traffic_rule`, `add_vif_traffic_rule`, `delete_vif_traffic_rule`) drive XO's SDN Controller, a plugin bundled with XOA Premium that is not part of the documented core REST API. Without it loaded the routes return `404`, so the agent gets a clean terminal error rather than a silent no-op. The other 241 tools work against a stock REST API.
- **The write side needs privileges.** Full coverage assumes an admin user or an RBAC v2 user with the right `acl-privileges`. If you only need to *look*, the read-only first-party server is less setup and a smaller blast radius — use it.
- **Every call hops through ToolMesh.** For interactive operation this is invisible; for high-throughput batch jobs against XO, calling the REST API directly is faster.

DADL is good at one specific thing — making a REST API surface available to agents quickly, with a real control layer underneath. For Xen Orchestra, that surface happens to be 245 endpoints of mostly-mutating operations, which is exactly the case where a governance layer earns its keep.

## What is actually different

The substantive question is not "DADL vs the first-party MCP server" — ToolMesh runs both, and they coexist in the same registry. The question is: **what does it cost to give an agent the *write* side of an API, safely?**

For a read-only server, the answer is "you don't" — that is the deliberate trade. For a hand-coded read-write server, it costs an engineering cycle per endpoint: SDK shapes, handlers, tests, a release, an npm publish, user upgrades, plus the authorization and audit you have to build yourself. For a DADL behind ToolMesh, it costs one PR with a YAML diff, and the credentials, authorization, and audit come from the runtime.

That is the asymmetry. Not "everyone should ditch the first-party server" — keep it for read-only discovery. Just: when you need an agent to actually operate Xen Orchestra, the unit of work is a YAML diff, and the safety is in the layer underneath rather than in withholding the write tools.
