OrchVynt Documentation
OrchVynt is a production control plane for multi-agent AI workflows. It sits between your agents and the model APIs and manages routing, fallback, token budgets, and human-in-the-loop gates declaratively.
Where to start
Platform concepts
OrchVynt exposes four orchestration primitives. Each is configured declaratively in YAML and applied to every invocation that passes through the control plane:
- Routing: Per-invocation model selection based on workload type, user tier, latency, and A/B split configuration.
- Fallback: Automatic failover across providers and models on error codes, latency thresholds, or cost caps.
- Token budgets: Hard limits on tokens or cost per workflow, per session, or per user tier. Invocations that would breach the budget are rejected before reaching the model.
- Human-in-the-loop gates: Compliance-driven pause-and-review checkpoints that notify reviewers via Slack or webhook and record a tamper-evident audit trail.
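To make two of these behaviors concrete, here is a minimal conceptual sketch in Python of a pre-flight budget check followed by ordered failover. It illustrates the idea only and is not OrchVynt's implementation, SDK, or YAML schema. Every name in it (`TokenBudget`, `BudgetExceeded`, `invoke_with_fallback`, and the two stand-in providers) is hypothetical.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Raised when an invocation would push usage past the hard limit."""


@dataclass
class TokenBudget:
    # Hypothetical model of a budget scope (workflow, session, or user tier).
    limit: int     # hard cap on tokens for the scope
    used: int = 0  # tokens already reserved in the scope

    def reserve(self, estimated_tokens: int) -> None:
        # Reject up front, before any model call is attempted.
        if self.used + estimated_tokens > self.limit:
            remaining = self.limit - self.used
            raise BudgetExceeded(
                f"need {estimated_tokens} tokens, only {remaining} remaining"
            )
        self.used += estimated_tokens


def invoke_with_fallback(prompt, estimated_tokens, budget, providers):
    """Check the budget, then try each provider in order.

    Returns (provider_name, result) from the first provider that succeeds.
    """
    budget.reserve(estimated_tokens)  # pre-flight budget check
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real policy would match specific error codes
            errors.append((name, exc))
    raise RuntimeError(f"all providers failed: {errors}")


# Stand-in providers for the example: the first always fails, the second echoes.
def flaky_primary(prompt):
    raise TimeoutError("primary timed out")


def stable_secondary(prompt):
    return f"echo: {prompt}"


budget = TokenBudget(limit=1000)
providers = [("primary", flaky_primary), ("secondary", stable_secondary)]
served_by, result = invoke_with_fallback("hello", 400, budget, providers)
```

In this sketch the budget check runs before the provider loop, so a request that would breach the limit never reaches any model, and a failure on the first provider falls through to the next one in the list.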
Deployment options
OrchVynt supports two deployment modes:
- Cloud-hosted (Starter and Production tiers) — No infrastructure to manage. Connect by pointing your agents at the OrchVynt endpoint and configuring API keys.
- Self-hosted (Enterprise tier) — Docker image and Helm chart for Kubernetes. Your traffic never leaves your VPC. Zero phone-home in self-hosted mode.
Language support
The OrchVynt routing API is a REST/HTTP interface. First-class SDKs are available for Python and TypeScript; any other language with an HTTP client can integrate through the REST API directly.
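As a rough illustration of calling an HTTP API without an SDK, the sketch below builds a JSON POST request using only the Python standard library. The base URL, the `/v1/invocations` path, the bearer-token header, and the `workflow` and `input` fields are all placeholders assumed for this example, not documented OrchVynt values; take the real ones from the API reference.

```python
import json
import urllib.request

# Hypothetical placeholders: substitute the endpoint and key for your deployment.
BASE_URL = "https://orchvynt.example.com"
API_KEY = "YOUR_API_KEY"


def build_invocation_request(workflow: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for one invocation."""
    # The payload shape is an assumption made for illustration only.
    body = json.dumps({"workflow": workflow, "input": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/invocations",  # assumed path
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {API_KEY}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


request = build_invocation_request("support-triage", "Summarize this ticket.")
# To send it, pass the request to urllib.request.urlopen(request).
```

The request is constructed separately from sending so the same shape can be reproduced in any language with an HTTP client.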