The API enforces a default request-rate limit, sized for typical production load. Exceed it and the call returns 429 Too Many Requests with the standard error envelope:
{ "code": "RATE_LIMITED", "message": "Rate limit exceeded: ..." }
429 responses do not currently carry rate-limit headers, so do not branch on Retry-After or X-RateLimit-*. Treat a 429 as the signal to slow down, and retry with backoff.
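
As a minimal sketch of the guidance above, the helper below decides on the status code alone and reads the envelope only for its message. The function name `rate_limit_message` and the `(status, body)` inputs are hypothetical, chosen for illustration; they are not part of the API.

```python
import json
from typing import Optional


def rate_limit_message(status: int, body: str) -> Optional[str]:
    """Return the envelope's message for a 429 response, or None otherwise.

    Hypothetical helper: the decision rests on the status code only,
    never on Retry-After or X-RateLimit-* headers.
    """
    if status != 429:
        return None
    try:
        envelope = json.loads(body)
    except ValueError:
        # A 429 with an unparseable body is still a rate-limit signal.
        return ""
    if isinstance(envelope, dict):
        return str(envelope.get("message", ""))
    return ""
```

A caller would treat any non-None result as the cue to back off, and log the message for diagnostics.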

Backoff

Retry on 429 with exponential backoff plus jitter. Pseudocode:
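import random
import time

MAX_ATTEMPTS = 5
MAX_DELAY = 32  # seconds; cap on the exponential term

delay = 1
for attempt in range(MAX_ATTEMPTS):
    response = call_runflow()  # placeholder for your own API call
    # Stop on any non-429 response, or once the retries are used up.
    if response.status != 429 or attempt == MAX_ATTEMPTS - 1:
        break
    # Wait for the current delay plus up to 1 s of random jitter.
    time.sleep(delay + random.uniform(0, 1))
    delay = min(delay * 2, MAX_DELAY)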
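
The same loop can be packaged as a reusable helper, which makes the schedule easier to test. This is a sketch under assumptions: `call_with_backoff` and `backoff_delays` are hypothetical names, and `call` stands for any function of yours that performs the request and returns a `(status, body)` pair.

```python
import random
import time
from typing import Callable, Iterator, Tuple


def backoff_delays(attempts: int = 5, base: float = 1.0,
                   cap: float = 32.0) -> Iterator[float]:
    """Yield one wait per retry: capped exponential delay plus 0-1 s jitter."""
    delay = base
    # There is one wait between each pair of attempts, so attempts - 1 waits.
    for _ in range(attempts - 1):
        yield min(delay, cap) + random.uniform(0, 1)
        delay = min(delay * 2, cap)


def call_with_backoff(call: Callable[[], Tuple[int, str]],
                      attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> Tuple[int, str]:
    """Invoke `call` until it returns a non-429 status or attempts run out.

    `call` is a hypothetical stand-in for your request function.
    The last response is returned either way, so the caller can
    still inspect a final 429.
    """
    delays = backoff_delays(attempts)
    while True:
        status, body = call()
        if status != 429:
            return status, body
        wait = next(delays, None)
        if wait is None:
            # Retries exhausted: hand the 429 back to the caller.
            return status, body
        sleep(wait)
```

Passing `sleep` as a parameter lets tests substitute a recorder for `time.sleep`, so the retry schedule can be checked without real waiting.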

Run capacity

Runs are dispatched asynchronously onto a shared worker fleet. When capacity is busy, a run waits in the queued status until a worker is free rather than being rejected, so a sustained backlog shows up as runs sitting in queued (see Runs for the lifecycle). If queue times grow beyond what your workload can tolerate, get in touch about higher throughput.
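
To judge whether queue times are growing, one option is to measure how long a run stays in queued. The sketch below assumes a hypothetical `get_status` callable that returns the run's current status string. How you fetch that status is not covered here, and `queued` is the only status name this sketch relies on.

```python
import time
from typing import Callable, Tuple


def wait_while_queued(get_status: Callable[[], str],
                      poll_interval: float = 2.0,
                      max_wait: float = 300.0,
                      sleep: Callable[[float], None] = time.sleep,
                      clock: Callable[[], float] = time.monotonic) -> Tuple[str, float]:
    """Poll until the run leaves "queued" or max_wait seconds have passed.

    Returns (last_status, seconds_waited). `get_status` is a hypothetical
    stand-in for however you look up a run's status.
    """
    start = clock()
    while True:
        status = get_status()
        waited = clock() - start
        if status != "queued" or waited >= max_wait:
            return status, waited
        sleep(poll_interval)
```

Recording `seconds_waited` per run gives a concrete queue-time figure to include when asking about higher throughput.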

Need a higher limit?

Talk to us. Include peak QPS, average payload size, and target concurrency.

Errors

All status codes, including 429.

Pricing

Per-request and per-second pricing.