Use Case

Provider fallback without retry logic in your agents

Provider outages and rate limits happen. When a configured trigger fires (an HTTP 429, a 5xx response, a latency spike, or a cost cap), OrchVynt fails over automatically, progressing through your configured fallback tiers without any failover code in your agents.

One chain definition, applied everywhere

Instead of every agent implementing its own retry logic — with inconsistent trigger conditions and varying tier structures — define the fallback chain once in OrchVynt and have it applied consistently across every invocation.

Provider-level: OpenAI fails → route to Anthropic → route to local Ollama model

Model-level: GPT-4o hits cost cap → downgrade to GPT-4o-mini for the session

Latency-triggered: p99 latency exceeds your configured threshold (9s in the sample config below) → switch to a faster model automatically

Telemetry: each tier activation is recorded with timestamp, trigger, and duration

fallback.yaml

fallback:
  chain:
    - tier: primary
      provider: openai
      model: gpt-4o
    - tier: secondary
      provider: anthropic
      model: claude-3-5-sonnet
    - tier: emergency
      provider: ollama
      endpoint: http://local-gpu:11434
      model: mistral-7b
  triggers:
    on_status_code: [429, 500, 503]
    on_latency_p99_ms: 9000
  emit_telemetry: true
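To make the trigger section concrete, here is a minimal Python sketch of how the two triggers in the sample config could be evaluated. It is an illustration only, not OrchVynt's implementation: the `Triggers` class, the `p99` helper, and `fired_trigger` are hypothetical names, and the nearest-rank percentile method is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Triggers:
    # Defaults mirror the sample config; the class itself is hypothetical.
    on_status_code: Tuple[int, ...] = (429, 500, 503)
    on_latency_p99_ms: int = 9000


def p99(samples_ms: Sequence[float]) -> float:
    """Nearest-rank 99th percentile of a non-empty list of latencies."""
    ordered = sorted(samples_ms)
    # Integer form of ceil(0.99 * n), which avoids float rounding.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def fired_trigger(
    triggers: Triggers,
    status_code: int,
    recent_latencies_ms: Sequence[float],
) -> Optional[str]:
    """Return a label for the trigger that fired, or None if none did."""
    if status_code in triggers.on_status_code:
        return f"status_code:{status_code}"
    if recent_latencies_ms and p99(recent_latencies_ms) > triggers.on_latency_p99_ms:
        return "latency_p99"
    return None
```

In this sketch the status-code check runs first, so a 429 is reported as a status trigger even when latency is also high. Whether OrchVynt orders its triggers this way is not stated here.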

What OrchVynt handles that you'd otherwise build yourself

Automatic tier progression

When the primary tier fails, OrchVynt retries on the secondary without any agent-side logic. If secondary also fails, it continues to the next tier. No manual retry loops needed.
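For comparison, this is roughly the loop each agent would otherwise carry itself. It is a sketch under assumptions, not OrchVynt code: `ProviderError`, `call_with_fallback`, and the stub providers are hypothetical names standing in for real provider clients.

```python
from typing import Callable, List, Sequence, Tuple


class ProviderError(Exception):
    """Hypothetical error raised by a tier when a fallback trigger fires."""


def call_with_fallback(
    tiers: Sequence[Tuple[str, Callable[[str], str]]],
    prompt: str,
) -> Tuple[str, str]:
    """Try each tier in order and return (tier_name, response) from the first success."""
    failures: List[str] = []
    for name, call in tiers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            # Remember why this tier failed, then move on to the next one.
            failures.append(f"{name}: {exc}")
    raise RuntimeError("all fallback tiers failed: " + "; ".join(failures))


# Stub providers for illustration only.
def rate_limited(prompt: str) -> str:
    raise ProviderError("HTTP 429")


def echo(prompt: str) -> str:
    return "echo: " + prompt
```

Calling `call_with_fallback([("primary", rate_limited), ("secondary", echo)], "hi")` skips the failing primary and returns the secondary's response along with the name of the tier that served it.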

Fallback telemetry

Every fallback activation is a structured event in your telemetry — which tier, why it activated, how long the activation lasted, and when it resolved back to primary.
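As an illustration of what such an event might carry, here is a small Python sketch. The field names and JSON layout are assumptions made for the example, not OrchVynt's actual telemetry schema.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class FallbackEvent:
    """Hypothetical shape of a single fallback activation record."""

    tier: str                               # which tier was activated
    trigger: str                            # why it was activated
    activated_at: datetime                  # when the activation started
    resolved_at: Optional[datetime] = None  # when traffic returned to primary

    def duration_s(self) -> Optional[float]:
        """Seconds the activation lasted, or None while still active."""
        if self.resolved_at is None:
            return None
        return (self.resolved_at - self.activated_at).total_seconds()

    def to_json(self) -> str:
        """Serialize the event with ISO 8601 timestamps."""
        return json.dumps({
            "tier": self.tier,
            "trigger": self.trigger,
            "activated_at": self.activated_at.isoformat(),
            "resolved_at": self.resolved_at.isoformat() if self.resolved_at else None,
            "duration_s": self.duration_s(),
        })
```

Keeping `resolved_at` optional lets the same record represent an activation that is still in progress.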

Alert on persistent degradation

Configure alerts when fallback activation rate exceeds a threshold — so your team knows when a provider is behaving abnormally before it becomes an incident.
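One way to think about such a threshold is as a rate over a sliding window of recent invocations. The sketch below is a hypothetical illustration: `FallbackRateMonitor`, the window size, and the threshold value are assumptions, not OrchVynt settings.

```python
from collections import deque


class FallbackRateMonitor:
    """Tracks how often recent invocations needed a fallback tier."""

    def __init__(self, window: int = 100, threshold: float = 0.2) -> None:
        # Only the most recent `window` outcomes are kept.
        self.outcomes = deque(maxlen=window)
        self.threshold = threshold

    def rate(self) -> float:
        """Fraction of invocations in the window that used a fallback."""
        if not self.outcomes:
            return 0.0
        return sum(self.outcomes) / len(self.outcomes)

    def should_alert(self) -> bool:
        """Alert only once the window is full and the rate exceeds the threshold."""
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.rate() > self.threshold

    def record(self, used_fallback: bool) -> bool:
        """Record one invocation and report whether an alert should fire."""
        self.outcomes.append(used_fallback)
        return self.should_alert()
```

Waiting for a full window before alerting avoids paging on the first few requests after startup, at the cost of a short blind spot. That trade-off is a design choice of this sketch.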

Build your fallback chain today
