Rate limits apply per organization. Defaults are sized for typical production load. If you exceed a limit, the API returns 429 Too Many Requests with a Retry-After header giving the wait in seconds.

Headers

Every response carries:
| Header | Meaning |
| --- | --- |
| X-RateLimit-Limit | Calls allowed in the current window. |
| X-RateLimit-Remaining | Calls left in the window. |
| X-RateLimit-Reset | Unix timestamp when the window resets. |
| Retry-After | Seconds to wait before the next call. Sent only on 429. |
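
A minimal sketch of reading the headers listed above to decide whether to pause before the next call. It assumes the values arrive as integer strings, that X-RateLimit-Reset is expressed in seconds, and that `headers` is a plain dict; the function name `rate_limit_status` and the sample values are illustrative, not part of the API.

```python
import time


def rate_limit_status(headers, now=None):
    """Summarize the rate-limit headers from one response (assumed integer strings)."""
    now = time.time() if now is None else now
    limit = int(headers["X-RateLimit-Limit"])
    remaining = int(headers["X-RateLimit-Remaining"])
    reset_at = int(headers["X-RateLimit-Reset"])  # Unix timestamp; seconds assumed
    return {
        "limit": limit,
        "remaining": remaining,
        # Never report a negative wait if the window has already reset.
        "seconds_until_reset": max(0.0, reset_at - now),
        "exhausted": remaining <= 0,
    }


# Made-up sample values, for illustration only.
example = {
    "X-RateLimit-Limit": "600",
    "X-RateLimit-Remaining": "0",
    "X-RateLimit-Reset": "1700000060",
}
status = rate_limit_status(example, now=1700000000)
```

Most HTTP clients expose headers through a case-insensitive mapping, so the exact-case lookups here may need adjusting for your client.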

Backoff

Honor Retry-After when present; otherwise fall back to exponential backoff with jitter. Pseudocode:
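import random
import time

MAX_ATTEMPTS = 5
MAX_DELAY = 32  # seconds

delay = 1
for attempt in range(MAX_ATTEMPTS):
    response = call_runflow()  # placeholder for your API call
    if response.status != 429:
        break
    retry_after = response.headers.get("Retry-After")
    try:
        wait = float(retry_after) if retry_after else delay
    except ValueError:
        wait = delay  # ignore HTTP-date form, use exponential
    # Cap the wait and add up to 1 s of jitter so clients do not retry in lockstep.
    time.sleep(min(wait, MAX_DELAY) + random.uniform(0, 1))
    delay = min(delay * 2, MAX_DELAY)
else:
    # The loop never hit `break`: every attempt returned 429.
    raise RuntimeError("still rate limited after 5 attempts")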
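
The loop above skips the HTTP-date form of Retry-After. If you would rather handle both forms, the following sketch uses only the Python standard library. The helper name `parse_retry_after` is hypothetical, and this page does not say whether the API ever sends the date form, so treat it as defensive handling rather than documented behavior.

```python
import math
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value, now=None):
    """Return seconds to wait, or None if the value is missing or unparseable."""
    if not value:
        return None
    # Form 1: delay in seconds, e.g. "12".
    try:
        seconds = float(value)
    except ValueError:
        seconds = None
    if seconds is not None:
        return max(0.0, seconds) if math.isfinite(seconds) else None
    # Form 2: HTTP-date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        # Treat a date without zone information as UTC.
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

A `None` result means "no usable hint", which is the case where the exponential delay applies.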

Concurrent runs

Each model has a per-organization concurrency cap. When you exceed it, additional runs are queued. Monitor how long runs stay in the queued status, and request a higher cap if that duration grows.
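
One way to watch queued duration is to time how long a run reports the queued status. The sketch below is client-agnostic: `get_status` is a hypothetical callable you supply that returns the run's current status, and the literal string `"queued"` is an assumption about how that status is spelled.

```python
import time


def wait_while_queued(get_status, poll_interval=1.0,
                      clock=time.monotonic, sleep=time.sleep):
    """Poll until the run leaves the queued state.

    Returns (seconds_spent_queued, first_non_queued_status).
    """
    started = clock()
    status = get_status()
    while status == "queued":  # assumed status string
        sleep(poll_interval)
        status = get_status()
    return clock() - started, status
```

Record the returned duration per run in whatever metrics system you use. A rising trend suggests the cap is too low for your load. The `clock` and `sleep` parameters exist only so the function can be exercised without real waiting.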

Need a higher limit?

Talk to us. Include peak QPS, average payload size, and target concurrency.
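
If you do not already track these figures, a rough sketch like the one below can derive peak QPS and average payload size from your own request log. The input shape (tuples of Unix timestamp and payload size in bytes) and the function name `traffic_summary` are assumptions for illustration; peak QPS is taken here as the busiest one-second bucket.

```python
from collections import Counter


def traffic_summary(requests):
    """requests: iterable of (unix_timestamp_seconds, payload_bytes) tuples."""
    requests = list(requests)
    if not requests:
        return {"peak_qps": 0, "avg_payload_bytes": 0.0}
    # Count requests per whole second, then take the busiest second.
    per_second = Counter(int(ts) for ts, _ in requests)
    total_bytes = sum(size for _, size in requests)
    return {
        "peak_qps": max(per_second.values()),
        "avg_payload_bytes": total_bytes / len(requests),
    }


# Made-up sample log: three requests in second 100, one in second 101.
sample = [(100.1, 2000), (100.7, 4000), (100.9, 3000), (101.2, 1000)]
summary = traffic_summary(sample)
```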

Errors

All status codes including 429.

Pricing

Per-request and per-second pricing.