Use Case

Policy-driven routing across multiple agents and models

Different tasks have different cost and quality profiles. OrchVynt routes each invocation to the right model based on workload type, user tier, latency requirements, or any combination of these signals. Routing is defined declaratively, not hardcoded in agent code.

The problem with hardcoded model selection

When model selection is hardcoded into agent logic, every change requires a code deploy. When a new model comes out, you update 12 files. When you want to A/B test two providers, you add branching logic to agent code that should only contain business logic.

OrchVynt separates routing policy from agent logic entirely. The agent calls orchvynt.route(). The policy lives in YAML. You change the routing strategy without touching your agents.

agent-code.py — before OrchVynt
# ❌ Routing logic inside agent
if task_type == "classification":
    model = "gpt-4o-mini"
elif task_type == "synthesis":
    model = "gpt-4o"
elif user_tier == "premium":
    model = "claude-3-5-sonnet"
else:
    model = "gpt-4o-mini"  # fallback

# ✅ With OrchVynt
result = orchvynt.route(
    workload="synthesis",
    context={"user_tier": user_tier}
)

What you can express in a routing policy

Workload-based routing

Route classification tasks to cheaper models, synthesis tasks to higher-quality models. Tag each invocation with a workload type and define the mapping in YAML.
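As a rough illustration, the mapping behaves like a lookup table from workload tag to model. The sketch below is plain Python, not OrchVynt's policy schema or API: `WORKLOAD_POLICY`, `DEFAULT_MODEL`, and `select_model` are hypothetical names, and the dict stands in for what would live in YAML.

```python
# Hypothetical sketch: a workload-to-model table standing in for a YAML policy.
# None of these names come from OrchVynt itself.
WORKLOAD_POLICY = {
    "classification": "gpt-4o-mini",  # cheaper model for simple tasks
    "synthesis": "gpt-4o",            # higher-quality model for harder tasks
}
DEFAULT_MODEL = "gpt-4o-mini"  # assumed fallback for untagged workloads


def select_model(workload: str) -> str:
    """Return the model mapped to a workload tag, or the default."""
    return WORKLOAD_POLICY.get(workload, DEFAULT_MODEL)


print(select_model("synthesis"))
```

Because the table is data rather than branching logic, adding a workload or swapping a model is a one-line change that leaves the calling code untouched.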

Traffic splits & A/B routing

Split traffic across models by weight. Evaluate GPT-4o against Claude 3.5 Sonnet on a 30/70 split for a workflow, observe quality metrics in your telemetry backend, then shift the split based on evidence.
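One common way to implement a weighted split is random selection proportional to weight. The sketch below assumes that approach; it is not a description of how OrchVynt assigns traffic, and `SPLIT` and `pick_model` are hypothetical names. The weights mirror the 30/70 example.

```python
import random
from collections import Counter

# Hypothetical split; weights are relative, so 30/70 and 3/7 are equivalent.
SPLIT = {"gpt-4o": 30, "claude-3-5-sonnet": 70}


def pick_model(split: dict[str, int], rng: random.Random) -> str:
    """Pick one model at random, proportionally to its weight."""
    models = list(split)
    weights = [split[m] for m in models]
    return rng.choices(models, weights=weights, k=1)[0]


rng = random.Random(42)  # seeded only so the sketch is repeatable
counts = Counter(pick_model(SPLIT, rng) for _ in range(10_000))
print(counts)
```

Over many invocations the observed share per model converges on the configured weights. A production router would likely also want sticky assignment per user or session, which this sketch does not attempt.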

Latency-aware routing

Set per-workload latency targets. OrchVynt monitors p99 latency per model and reroutes automatically when a model exceeds the threshold — without waiting for a human to notice.
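To make the mechanism concrete, here is a minimal sketch of threshold-based rerouting, assuming a sliding window of recent samples and a nearest-rank p99. The window size, the fallback choice, and the names `LatencyTracker` and `choose` are all assumptions for illustration, not OrchVynt internals.

```python
from collections import deque


class LatencyTracker:
    """Keeps the most recent latency samples for one model."""

    def __init__(self, window: int = 1000) -> None:
        self.samples = deque(maxlen=window)  # oldest samples drop off first

    def record(self, latency_ms: float) -> None:
        self.samples.append(latency_ms)

    def p99(self) -> float:
        """Nearest-rank 99th percentile; 0.0 when there are no samples."""
        if not self.samples:
            return 0.0
        ordered = sorted(self.samples)
        rank = (99 * len(ordered) + 99) // 100  # ceil(0.99 * n) in integers
        return ordered[rank - 1]


def choose(primary: str, fallback: str, trackers: dict, target_ms: float) -> str:
    """Use the primary model unless its p99 exceeds the latency target."""
    if trackers[primary].p99() > target_ms:
        return fallback
    return primary


trackers = {"gpt-4o": LatencyTracker(), "gpt-4o-mini": LatencyTracker()}
for _ in range(98):
    trackers["gpt-4o"].record(100.0)
for _ in range(2):
    trackers["gpt-4o"].record(2000.0)  # two slow outliers push p99 up

print(choose("gpt-4o", "gpt-4o-mini", trackers, target_ms=500.0))
```

With 98 fast samples and 2 slow ones, the p99 lands on a slow sample and the call is rerouted to the fallback, even though the median latency looks healthy. That is why a tail percentile works better than an average as a rerouting trigger.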

Context-aware selection

Pass structured context with each invocation — user tier, content sensitivity flag, geographic region. Routing rules can express conditions like "use GPT-4o for enterprise tier users."
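A condition such as the enterprise-tier rule can be modeled as an ordered list of rules where the first match wins. This is a sketch under that assumption; `Rule`, `RULES`, and `select_by_context` are hypothetical names, and the actual rule syntax and evaluation order in OrchVynt may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    model: str
    when: dict = field(default_factory=dict)  # every pair must match the context

    def matches(self, context: dict) -> bool:
        return all(context.get(key) == value for key, value in self.when.items())


# Hypothetical rules, evaluated top to bottom; the first match wins.
RULES = [
    Rule(model="gpt-4o", when={"user_tier": "enterprise"}),
    Rule(model="gpt-4o-mini"),  # empty condition acts as the catch-all default
]


def select_by_context(context: dict, rules: list) -> str:
    """Return the model of the first rule whose conditions all hold."""
    for rule in rules:
        if rule.matches(context):
            return rule.model
    raise LookupError("no routing rule matched the context")


print(select_by_context({"user_tier": "enterprise", "region": "eu"}, RULES))
```

Extra context keys that no rule mentions, such as `region` here, are simply ignored, so callers can pass rich context without every rule having to account for it.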

See routing in action in the docs