Run latency varies by model category and by the size and complexity of the input. Use the ballparks below to size client-side timeouts, polling intervals, and progress UI; for actual measurements per model, see Per-model performance.

Ballparks by category

| Category | Typical (p50) | Tail (p95) | What drives it |
|---|---|---|---|
| text-to-image | 4–15 s | 30 s | Image model + resolution + step count. 1K vs 4K is the biggest factor. |
| image-to-image | 8–30 s | 60 s | Edit-style models (Nano Banana Pro Edit, GPT Image 2 Edit) sit at the lower end. Workflow Solutions that chain models (object-removal, reference-inpaint, background-replace) sit higher because they run multiple steps + an evaluator. |
| text-to-video | 60–300 s | 600 s | Duration × resolution × model. A 5 s 1080p Wan run is ~90 s; a 15 s 4K Veo or HeyGen run can push 5 min. |
| image-to-video | 60–300 s | 600 s | Same drivers as text-to-video plus reference processing time. |
| video-to-video | 90–600 s | 900 s | Wan video edit, Happy Horse video edit. Heaviest in the catalog. |
| text-to-audio | 3–10 s | 20 s | ElevenLabs v3 TTS, Gemini TTS. Sub-10 s for short utterances. |

These are full lifecycle measurements: queued → dispatching → running → succeeded, as observed via GET /v1/runs/{id}. Network round trips are not included.
Use the p95 column to size client-side timeouts, not p50. A timeout below p95 will produce false-positive failures on long-tail runs that would have succeeded.
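
As a rough illustration of sizing against the p95 column, the sketch below copies the category ballparks into a lookup table and checks whether a proposed client-side timeout clears the tail. The names `BALLPARKS`, `p95_seconds`, and `timeout_is_safe` are hypothetical helpers for this example, not part of any Runflow SDK.

```python
# Category ballparks in seconds, copied from the table above:
# (p50 low, p50 high, p95).
BALLPARKS = {
    "text-to-image": (4, 15, 30),
    "image-to-image": (8, 30, 60),
    "text-to-video": (60, 300, 600),
    "image-to-video": (60, 300, 600),
    "video-to-video": (90, 600, 900),
    "text-to-audio": (3, 10, 20),
}


def p95_seconds(category: str) -> int:
    """Return the tail (p95) ballpark for a category."""
    return BALLPARKS[category][2]


def timeout_is_safe(category: str, timeout_s: float) -> bool:
    """True when a proposed client-side timeout is at or above the category p95."""
    return timeout_s >= p95_seconds(category)
```

For example, a 15 s timeout covers the typical text-to-image run but sits below the 30 s tail, so `timeout_is_safe("text-to-image", 15)` is false.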

What this means for your client

| Decision | Recommendation |
|---|---|
| Polling interval | 2 s for image categories, 10 s for video. Polling faster wastes API quota without changing your latency. |
| Skeleton / “generating” UI | Show a progress affordance based on the category’s p50, not a hardcoded value. A user who waits 4 s for a text-to-image run gets a snappier feel than the same user waiting 4 s for a text-to-video. |
| HTTP request timeout | At least 2× the p95 for the category, or use a callback URL (Callbacks) to avoid client timeouts entirely. |
| Retry strategy | Don’t retry the same run on timeout. That creates a duplicate. Use client_ref for idempotency, or just poll longer. See Errors for the retry table. |
| User-facing copy | Rotate between phrases at intervals matched to category p50 ("Generating..." → "Touching up..." → "Almost there..."). Static "Loading..." for a 4-minute video run is bad UX. |
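
The polling, timeout, and retry recommendations above can be combined into one loop. This is a minimal sketch under stated assumptions: `fetch_status` is a hypothetical callable that wraps GET /v1/runs/{id} and returns the status string, `"failed"` is an assumed terminal status (only `succeeded` is named on this page), and text-to-audio is polled at the image interval because the table gives no audio-specific value.

```python
import time
from typing import Callable

# p95 ballparks in seconds, copied from the category table.
P95_SECONDS = {
    "text-to-image": 30,
    "image-to-image": 60,
    "text-to-video": 600,
    "image-to-video": 600,
    "video-to-video": 900,
    "text-to-audio": 20,
}

# "succeeded" comes from the documented lifecycle; "failed" is an assumption.
TERMINAL_STATUSES = {"succeeded", "failed"}


def poll_interval_seconds(category: str) -> float:
    """10 s for video categories, 2 s otherwise."""
    if category.endswith("-to-video"):
        return 10.0
    # Assumption: audio is not covered by the guidance, so it uses the image value.
    return 2.0


def wait_for_run(
    run_id: str,
    category: str,
    fetch_status: Callable[[str], str],
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll one run until it reaches a terminal status or 2x the category p95 elapses."""
    deadline = clock() + 2 * P95_SECONDS[category]
    interval = poll_interval_seconds(category)
    while True:
        status = fetch_status(run_id)
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            # Give up waiting, but do not submit the run again:
            # resubmitting would create a duplicate run.
            raise TimeoutError(f"run {run_id} still {status!r} after the deadline")
        sleep(interval)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised with fakes in tests; in production the defaults apply. On `TimeoutError` the caller keeps the same run id and either polls again later or falls back to a callback.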

Why we don’t publish per-model latency hints (yet)

A p50_seconds / p95_seconds field on GET /v1/public/models is on the public catalog discoverability plan (Phase D). Until that ships, use the category ballparks above. If you need per-model precision, the authenticated /v1/models/{id}/run-performance-stats endpoint returns aggregated stats from your org’s recent runs.

Solutions that include output evaluation

Some Solutions run a quality evaluation step on the output before returning. That step adds 2–4 minutes to the total wall-clock time on top of the underlying model run. Check each Solution’s page at www.runflow.io/api to see whether it evaluates output before returning, and use (category p95) + 4 min to size timeouts for Solutions that do.
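
The (category p95) + 4 min rule works out as follows. This is a small sketch; `solution_timeout_seconds` and its `evaluates_output` flag are hypothetical names, and whether a given Solution evaluates its output still has to be read from that Solution's page.

```python
# Upper end of the 2-4 minute evaluation step, in seconds.
EVALUATION_OVERHEAD_SECONDS = 4 * 60


def solution_timeout_seconds(category_p95_seconds: int, evaluates_output: bool) -> int:
    """Timeout floor for a Solution: category p95, plus 4 min when it evaluates output."""
    if evaluates_output:
        return category_p95_seconds + EVALUATION_OVERHEAD_SECONDS
    return category_p95_seconds
```

For an image-to-image Solution with evaluation, that is 60 s + 240 s = 300 s, five times the bare category p95.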

Runs

Lifecycle, statuses, output shape.

Callbacks

Skip polling entirely for long runs.

Rate limits

Quota and back-off rules.

Errors

Retry table per status code.