MCP server - Runflow

The Runflow MCP server is a thin, stateless proxy that exposes the Runflow API to any Model Context Protocol client. Bring your own rf_live_* key, point your agent at mcp.runflow.io, and you have:

A generate tool that runs any Runflow model or Solution and waits for the result.
A list_models tool that filters the live catalog.
One /runflow:<slug> slash command per active Solution, auto-generated from the catalog.
The Runflow agent skill returned as MCP initialize instructions, so the Solutions-first decision rule is in context for the whole session.

The MCP server is the fastest path for agents that already speak MCP. If your runtime does not speak MCP, integrate via the regular REST API — the MCP server is a convenience layer, not a different product.

Connect in 30 seconds

Setting	Value
Server URL	`https://mcp.runflow.io/mcp`
Transport	Streamable HTTP (JSON-RPC)
Protocol version	`2025-06-18`
Auth	`Authorization: Bearer rf_live_…`
Health check	`GET /healthz`

Get an rf_live_* key from the dashboard under Settings → API keys. Service keys (rf_svc_*) are not accepted — only rf_live_* can dispatch runs.

Claude Code, Cursor, and VS Code do not currently substitute ${VAR} (or ${env:VAR}) inside MCP HTTP-transport headers: the literal string is sent and the server returns 401. Until the clients ship support (claude-code#6204, #51581, Cursor forum 79296), there are two working paths: paste your rf_live_… key directly into the config file and keep that file out of version control, or use a stdio wrapper such as mcp-remote, which does honour env vars.

Claude Code

The easiest reliable path is the stdio wrapper. Add to ~/.claude/config.json:

{
  "mcpServers": {
    "runflow": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "https://mcp.runflow.io/mcp", "--header", "Authorization:Bearer ${RUNFLOW_API_KEY}"],
      "env": { "RUNFLOW_API_KEY": "rf_live_…" }
    }
  }
}

mcp-remote substitutes ${RUNFLOW_API_KEY} from the env block before opening the HTTPS connection. If you prefer the direct HTTP transport, paste the literal key:

{
  "mcpServers": {
    "runflow": {
      "type": "streamable-http",
      "url": "https://mcp.runflow.io/mcp",
      "headers": { "Authorization": "Bearer rf_live_…" }
    }
  }
}

Add the config file to .gitignore if it lives inside a project. Never commit a literal key.
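If you would rather not hand-edit the literal key into the file, the direct-HTTP entry can be rendered from an environment variable at setup time. The following is a minimal sketch, not an official tool: `build_http_config` is a hypothetical helper name, and the client-side prefix check simply mirrors the rule that only rf_live_* keys are accepted.

```python
import json
import os

MCP_URL = "https://mcp.runflow.io/mcp"


def build_http_config(api_key: str) -> dict:
    """Return a direct-HTTP mcpServers entry with the literal key embedded.

    Hypothetical helper: it reproduces the JSON shape shown above and
    refuses keys that do not start with rf_live_.
    """
    if not api_key.startswith("rf_live_"):
        raise ValueError("expected an rf_live_* key; rf_svc_* keys are not accepted")
    return {
        "mcpServers": {
            "runflow": {
                "type": "streamable-http",
                "url": MCP_URL,
                "headers": {"Authorization": f"Bearer {api_key}"},
            }
        }
    }


if __name__ == "__main__":
    key = os.environ.get("RUNFLOW_API_KEY")
    if key is None:
        print("Set RUNFLOW_API_KEY first")
    else:
        # The output contains the literal key: redirect it into the config
        # file and keep that file out of version control.
        print(json.dumps(build_http_config(key), indent=2))
```

The rendered file still contains the literal key, so the .gitignore advice applies unchanged.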

Cursor

Same constraint as Claude Code. Add to ~/.cursor/mcp.json:

{
  "mcpServers": {
    "runflow": {
      "url": "https://mcp.runflow.io/mcp",
      "headers": { "Authorization": "Bearer rf_live_…" }
    }
  }
}

~/.cursor/mcp.json is user-scoped (not committed). If you want a project-scoped config, prefer the mcp-remote stdio pattern shown above so the key can live in a process env var instead of a JSON file.

VS Code MCP

Use the VS Code MCP extension’s HTTP transport with the URL https://mcp.runflow.io/mcp and a single Authorization header containing the literal key, set in user settings. The same env-var substitution limitation applies as in Claude Code and Cursor.

Claude.ai connectors

Add a custom connector with the URL https://mcp.runflow.io/mcp and paste your bearer key when prompted. The connector automatically receives the SKILL.md delivered as initialize instructions. The key is stored encrypted in your account, not in a local file.

Tools

`generate`

Dispatch a run and (by default) wait for the result.

generate(model, input, wait?, client_ref?)

Argument	Type	Notes
`model`	string	`provider/slug` as it appears in `list_models`. Nested slugs are allowed (`alibaba/wan/v2.7/text-to-video`, `runflow/object-removal/prompt`).
`input`	object	Model-specific input. See the per-model schema at `https://docs.runflow.io/models/<provider>/<slug>` or the live `llms.txt` at `https://www.runflow.io/models/<provider>/<slug>/llms.txt`.
`wait`	boolean (default `true`)	When `true`, blocks until terminal status (`succeeded` / `failed` / `cancelled`), emitting `notifications/progress` between polls. When `false`, returns immediately with a `run_id`.
`client_ref`	string (optional)	Idempotency key. Strongly recommended for runs you expect to take more than 60 seconds — see resume semantics below.

Polling backs off adaptively: 2s for the first 30s, 5s through 2 minutes, then 10s. The server enforces MCP_POLL_TIMEOUT_MS (10 minutes by default). If the poll exceeds that ceiling, the tool returns a polling_timeout envelope with the latest run state. The upstream run keeps running.
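The backoff schedule described above can be written as a small step function, which is handy if you call generate with wait=false and poll the REST API yourself. This is a sketch under stated assumptions: the function names are hypothetical, and whether the 30 s and 2 min boundaries are inclusive is not specified, so the code treats each boundary as the start of the next tier.

```python
def poll_interval(elapsed_s: float) -> int:
    """Seconds to wait before the next poll, given time elapsed so far.

    Mirrors the documented tiers: 2s for the first 30s, 5s through
    2 minutes, then 10s. Boundary handling is an assumption.
    """
    if elapsed_s < 30:
        return 2
    if elapsed_s < 120:
        return 5
    return 10


def poll_schedule(timeout_s: float = 600) -> list[int]:
    """List every wait until the poll ceiling (10 minutes by default)."""
    waits = []
    elapsed = 0
    while elapsed < timeout_s:
        wait = poll_interval(elapsed)
        waits.append(wait)
        elapsed += wait
    return waits


schedule = poll_schedule()
print(len(schedule), "polls over", sum(schedule), "seconds")
```

Under these assumptions a run that reaches the default ceiling is polled 81 times: 15 polls at 2 s, 18 at 5 s, and 48 at 10 s.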

Resuming after `polling_timeout`

If you supplied client_ref on the original call, re-invoke generate with the same model and the same client_ref. Runflow honours the idempotency key and returns the existing run. Without client_ref, the run still completes upstream but cannot be resumed via MCP — the run_id is returned in the error envelope so you can poll the REST API directly.
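A resume is simply a second tools/call with the same model and client_ref. The sketch below builds both requests without sending them. `build_generate_call` is a hypothetical helper, the UUID-based client_ref format is an assumption (the source only says it is a string), and the input is resent unchanged because the source does not say whether it may differ on resume.

```python
import uuid


def build_generate_call(request_id, model, model_input, client_ref, wait=True):
    """Build the JSON-RPC body for a generate call (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "generate",
            "arguments": {
                "model": model,
                "input": model_input,
                "wait": wait,
                "client_ref": client_ref,
            },
        },
    }


# Choose the idempotency key once, before the first call, and store it.
client_ref = f"object-removal-{uuid.uuid4()}"
model = "runflow/object-removal/prompt"
model_input = {
    "image_url": "https://example.com/photo.jpg",
    "prompt": "remove the watermark",
}

first_call = build_generate_call(1, model, model_input, client_ref)
# After a polling_timeout envelope: same model, same client_ref, new request id.
resume_call = build_generate_call(2, model, model_input, client_ref)
```

Persisting client_ref alongside your own job record before dispatch means a crashed or restarted agent can still resume the run.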

Example

{
  "method": "tools/call",
  "params": {
    "name": "generate",
    "arguments": {
      "model": "runflow/object-removal/prompt",
      "input": {
        "image_url": "https://example.com/photo.jpg",
        "prompt": "remove the watermark"
      }
    }
  }
}

`list_models`

Browse the unified catalog. Solutions (provider_slug=runflow) appear alongside raw provider models — no special casing.

list_models(category?, provider?, query?, limit?)

Argument	Type	Notes
`category`	enum	`text-to-image`, `image-to-image`, `text-to-video`, `image-to-video`, `video-to-video`, `text-to-audio`, `solution`
`provider`	string	Filter by provider slug (`runflow`, `openai`, `google`, …).
`query`	string	Substring match on `model_name` or `model_slug`.
`limit`	integer 1-200	Default 50.

Each entry includes a runs_endpoint you can pass straight to generate (strip the /v1/models/ prefix and /runs suffix to get the model argument).
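The prefix and suffix stripping can be sketched as follows. `model_from_runs_endpoint` is a hypothetical helper name, and accepting either a bare path or a full URL is an assumption, since the source does not say which form runs_endpoint takes.

```python
from urllib.parse import urlparse

PREFIX = "/v1/models/"
SUFFIX = "/runs"


def model_from_runs_endpoint(runs_endpoint: str) -> str:
    """Turn a runs_endpoint into the model argument for generate."""
    # urlparse leaves a bare path untouched and drops scheme/host from a URL.
    path = urlparse(runs_endpoint).path
    if not (path.startswith(PREFIX) and path.endswith(SUFFIX)):
        raise ValueError(f"unexpected runs_endpoint: {runs_endpoint!r}")
    model = path[len(PREFIX):-len(SUFFIX)]
    if not model:
        raise ValueError(f"no model slug in runs_endpoint: {runs_endpoint!r}")
    return model


print(model_from_runs_endpoint("/v1/models/openai/gpt-image-2/runs"))
print(model_from_runs_endpoint("/v1/models/alibaba/wan/v2.7/text-to-video/runs"))
```

Nested slugs survive intact because only the fixed prefix and suffix are removed.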

Prompts

The server auto-builds one slash command per active Solution. Solutions are catalog entries with provider_slug=runflow — they encode a complete workflow behind a single endpoint. You will see prompts like:

/runflow:headshots
/runflow:logo-inpaint
/runflow:object-removal-prompt
/runflow:smart-segmentation
/runflow:upscale

Selecting a prompt fills your chat with a templated message containing the Solution’s required inputs. Edit the inputs, send, and the server dispatches the run through generate for you. Prompts are memoised per catalog snapshot, so prompts/list is cheap and initialize does not rebuild on every request.

Auth

The server accepts any bearer matching MCP_ACCEPTED_KEY_PREFIXES (default rf_live). Send the key verbatim:

Authorization: Bearer rf_live_…

401 responses include an RFC 9728 WWW-Authenticate header:

WWW-Authenticate: Bearer realm="runflow-mcp", error="invalid_token",
  resource_metadata="https://mcp.runflow.io/.well-known/oauth-protected-resource"

The /.well-known/oauth-protected-resource endpoint returns a stub resource block in v1.0; a future minor release will add authorization-server pointers for the in-browser MCP OAuth flow.
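A client that wants to follow the resource_metadata pointer first has to pull the parameters out of the challenge. The sketch below is a deliberately simple parser with a hypothetical function name: it handles only quoted parameters without escaped quotes, which is enough for the header shown above but is not a general HTTP authentication-challenge parser.

```python
import re

# Matches key="value" pairs; escaped quotes inside values are not handled.
_PARAM = re.compile(r'(\w+)="([^"]*)"')


def parse_bearer_challenge(header_value: str) -> dict:
    """Return the auth-params of a Bearer WWW-Authenticate header as a dict."""
    scheme, _, rest = header_value.strip().partition(" ")
    if scheme.lower() != "bearer":
        raise ValueError(f"not a Bearer challenge: {scheme!r}")
    return dict(_PARAM.findall(rest))


challenge = (
    'Bearer realm="runflow-mcp", error="invalid_token", '
    'resource_metadata="https://mcp.runflow.io/.well-known/oauth-protected-resource"'
)
params = parse_bearer_challenge(challenge)
print(params["error"], params["resource_metadata"])
```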

Errors

Two distinct envelope shapes, depending on where the error originates.

Transport-level (JSON-RPC)

Returned when a request is rejected before the tool handler runs (host check, body cap, auth, rate limit). These errors are returned with an HTTP 4xx/5xx status.

`error.data.code`	When	HTTP
`host_rejected`	`Host` not in the allowlist.	403
`origin_rejected`	`Origin` not in the allowlist.	403
`body_too_large`	POST body > `MCP_MAX_BODY_BYTES` (256 KB default).	413
`missing_authorization`	No `Authorization` header.	401
`malformed_authorization`	Bearer prefix not in `MCP_ACCEPTED_KEY_PREFIXES`.	401
`rate_limited`	Sliding-window cap exceeded. `retry_after_seconds` included.	429

Default rate-limit window is 60s with caps of 30 req/IP pre-auth, 120 req/IP post-auth, and 600 req per distinct bearer fingerprint.
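To make the sliding-window behaviour concrete, here is a minimal sliding-window-log limiter. It is an illustration only, not the server's implementation, which the source does not describe. The class name, the per-key bookkeeping, and the way the retry delay is computed are all assumptions.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `cap` requests per key in any `window_s`-second span."""

    def __init__(self, cap: int, window_s: float = 60.0):
        self.cap = cap
        self.window_s = window_s
        self._hits = {}  # key -> deque of accepted request timestamps

    def allow(self, key: str, now: float):
        """Return (allowed, retry_after_seconds) for a request at time `now`."""
        hits = self._hits.setdefault(key, deque())
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.cap:
            # The oldest hit leaves the window at hits[0] + window_s.
            return False, hits[0] + self.window_s - now
        hits.append(now)
        return True, 0.0


# Example with the documented pre-auth cap of 30 requests per IP per 60s.
limiter = SlidingWindowLimiter(cap=30)
for second in range(30):
    limiter.allow("203.0.113.7", float(second))
blocked = limiter.allow("203.0.113.7", 30.0)
```

Unlike a fixed window, capacity here returns gradually: each accepted request frees its slot exactly one window length after it was made.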

Tool-result envelope

Returned at HTTP 200 with isError: true when a call passes the transport layer but fails inside generate or list_models. Clients reading structuredContent.error get a stable shape:

{
  "error": {
    "code": "insufficient_credits",
    "message": "Top up to continue.",
    "upstream_status": 402,
    "next_action": "Top up the Runflow account balance, then retry.",
    "details": { "errors": [{ "type": "insufficient_credits" }], "path": "/v1/models/openai/gpt-image-2/runs" }
  }
}

Clients that only render text get a stable [code] message lead line.

Code	Tool	When
`polling_timeout`	`generate`	Run exceeded `MCP_POLL_TIMEOUT_MS`. `details.run` carries the latest state. Resume by re-calling with the same `client_ref`.
`generate_failed`	`generate`	Unclassified internal failure during dispatch or poll.
`insufficient_credits`	`generate`	Account balance is zero or negative. Top up at `app.runflow.io`.
`unauthorized`	`generate`	Bearer rejected by `api.runflow.io`.
`model_not_found`	`generate`	The `provider/slug` does not match an active catalog entry.
`catalog_fetch_failed`	`list_models`	Marketing-site catalog unreachable.
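A client can read the envelope from structuredContent.error first and fall back to the text lead line. The sketch below assumes the tools/call result is already parsed into a dict. `extract_tool_error` is a hypothetical helper, and the exact lead-line format (`[code] message` on the first line of the first text block) is inferred from the description above rather than confirmed.

```python
import re

# Assumed lead-line shape: "[code] message".
_LEAD = re.compile(r"^\[([a-z_]+)\]\s*(.*)$")


def extract_tool_error(result: dict):
    """Return {'code', 'message', 'next_action'} for a failed tool call, else None."""
    if not result.get("isError"):
        return None
    structured = (result.get("structuredContent") or {}).get("error")
    if structured:
        return {
            "code": structured.get("code"),
            "message": structured.get("message"),
            "next_action": structured.get("next_action"),
        }
    # Text-only fallback: parse the lead line of the first text block.
    for block in result.get("content", []):
        text = block.get("text") or ""
        if block.get("type") == "text" and text:
            match = _LEAD.match(text.splitlines()[0])
            if match:
                return {"code": match.group(1), "message": match.group(2), "next_action": None}
    return {"code": None, "message": None, "next_action": None}


sample = {
    "isError": True,
    "content": [{"type": "text", "text": "[insufficient_credits] Top up to continue."}],
    "structuredContent": {
        "error": {
            "code": "insufficient_credits",
            "message": "Top up to continue.",
            "upstream_status": 402,
            "next_action": "Top up the Runflow account balance, then retry.",
        }
    },
}
print(extract_tool_error(sample)["code"])
```

Branching on the code rather than the message keeps client logic stable if the human-readable wording changes.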

Troubleshooting

Symptom	Likely cause	Fix
Tools don’t appear after restart	Config file path wrong, or the agent doesn’t pick up MCP changes without a full quit	Verify the path matches your agent (`~/.claude/config.json`, `~/.cursor/mcp.json`); fully quit the agent (not just close the window); re-open.
401 on first call	Bearer is missing, malformed, or has the wrong prefix	Run `echo $RUNFLOW_API_KEY` in the shell that launched the agent to confirm the key is set; it must start with `rf_live_`. Service keys (`rf_svc_`) are rejected by design.
`host_rejected` 403	Running a local MCP server whose `MCP_ALLOWED_HOSTS` doesn’t include `localhost`	Set `MCP_ALLOWED_HOSTS=localhost:8080,127.0.0.1:8080` in the MCP server’s `.env` (server-side; the shell env on the client side has no effect). Don’t drop the allowlist back to `*` — it’s a DNS-rebinding defense. Production `mcp.runflow.io` is unaffected.
`polling_timeout` repeatedly on long runs	Run exceeded the 10-minute MCP poll ceiling	Supply `client_ref` on the original call, then re-invoke with the same `client_ref` to resume. For runs you know will exceed 10 min (long video), prefer `wait=false` and poll the REST API directly.
`/runflow:*` prompts list is empty	Catalog fetch failed at `initialize` time	Hit `https://mcp.runflow.io/healthz` to confirm the server is up; check your network for outbound restrictions to `mcp.runflow.io` and `runflow.io`.
`insufficient_credits`	Account balance is zero or negative	Top up at `app.runflow.io`. The error envelope’s `details.errors[].message` carries the exact remaining balance when available.

If none of the above match, run the smoke test below — the raw JSON-RPC response usually identifies the cause.

Smoke test

Two curl steps. The first proves your bearer is accepted and the skill is returned as instructions; the second proves a real tool call round-trips end-to-end.

# 1) initialize — verify auth + skill delivery
curl -X POST https://mcp.runflow.io/mcp \
  -H "Authorization: Bearer $RUNFLOW_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-06-18","capabilities":{},"clientInfo":{"name":"curl","version":"0"}}}'

You should see result.serverInfo.name = "Runflow" and result.instructions containing the agent skill.

# 2) tools/call list_models — verify a real catalog browse round-trips
curl -X POST https://mcp.runflow.io/mcp \
  -H "Authorization: Bearer $RUNFLOW_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"list_models","arguments":{"category":"solution","limit":5}}}'

You should see active Solutions (provider_slug=runflow) in result.content[0].text along with result.structuredContent. The exact count varies as Solutions ship and retire — check the live catalog for the current set.

When to use the MCP server vs. the REST API

Situation	Use
Your agent natively speaks MCP (Claude Code, Cursor, Claude.ai connector).	MCP server — one connection, prompts auto-update with the catalog.
You are building a server-side workflow, batch job, or webhook handler.	REST API — direct, no proxy.
You need cancellation, batches, webhooks, or admin endpoints.	REST API — MCP exposes `generate` + `list_models` only.
You need to share one set of catalog prompts across multiple developers.	MCP server — every connected client sees the same `/runflow:*` set.

Related

Agent skill — the SKILL.md the MCP server returns as instructions.
Quickstart (REST) — direct HTTP integration.
Models — the catalog generate dispatches into.
Authentication — bearer lifecycle, scopes, rotation.
Errors — REST error vocabulary. MCP envelopes use lowercase snake_case codes (e.g. insufficient_credits); the REST API returns the corresponding SCREAMING_SNAKE enum (e.g. INSUFFICIENT_CREDITS). The lowercase MCP form is a direct lowercasing of the REST errors[].type; treat them as the same code under different casing conventions.
