OrchVynt Documentation

OrchVynt is a production control plane for multi-agent AI workflows. It sits between your agents and the model APIs and manages routing, fallback, token budgets, and human-in-the-loop gates declaratively.

Where to start

Quickstart

Get OrchVynt running in under 10 minutes with Docker and a minimal YAML config.

API Reference

Full reference for the OrchVynt routing API, configuration schema, and telemetry events.

Platform concepts

OrchVynt exposes four orchestration primitives. Each is configured declaratively in YAML and applied to every invocation that passes through the control plane:

Routing Engine

Per-invocation model selection based on workload type, user tier, latency, and A/B split configuration.
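As a rough illustration of per-invocation selection, the sketch below picks a model from a routing table keyed by workload type and user tier, then applies a weighted A/B split. It is not OrchVynt's implementation or configuration schema: the `Invocation` fields, the `ROUTES` table, the model names, and the hash-based bucketing are all assumptions made for the example, and latency-based selection is left out for brevity.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Invocation:
    """Hypothetical description of one request passing through a router."""
    workload: str   # e.g. "summarize"
    user_tier: str  # e.g. "free" or "pro"
    user_id: str


# Hypothetical routing table: (workload, tier) -> [(model, weight), ...].
# Weights in each list sum to 100 and define the A/B split.
ROUTES = {
    ("summarize", "free"): [("small-model", 100)],
    ("summarize", "pro"): [("large-model", 90), ("candidate-model", 10)],
}
DEFAULT_ROUTE = [("small-model", 100)]


def ab_bucket(user_id: str) -> int:
    """Map a user ID to a stable bucket in [0, 100)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def select_model(inv: Invocation) -> str:
    """Pick a model for this invocation from the routing table."""
    split = ROUTES.get((inv.workload, inv.user_tier), DEFAULT_ROUTE)
    bucket = ab_bucket(inv.user_id)
    cumulative = 0
    for model, weight in split:
        cumulative += weight
        if bucket < cumulative:
            return model
    return split[-1][0]  # only reached if weights sum to less than 100
```

Hashing the user ID rather than drawing a random number keeps each user in the same arm of the split across invocations, which is usually what an A/B comparison needs.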

Fallback Chains

Automatic failover across providers and models on error codes, latency thresholds, or cost caps.
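The failover behavior can be pictured as an ordered list of targets that is walked until one succeeds. The sketch below is a minimal illustration under stated assumptions, not OrchVynt's code: `Target`, `ProviderError`, the `RETRYABLE` status set, and `invoke_with_fallback` are hypothetical names. Cost caps are omitted, and the latency check is simplified to measuring after the call returns, where a real system would more likely enforce a timeout.

```python
import time
from dataclasses import dataclass
from typing import Callable, Sequence


class ProviderError(Exception):
    """Hypothetical error carrying the HTTP status a provider returned."""

    def __init__(self, status: int):
        super().__init__(f"provider returned {status}")
        self.status = status


@dataclass
class Target:
    name: str
    call: Callable[[str], str]    # sends the prompt, returns the completion
    max_latency_s: float = 10.0   # latency threshold for this target


# Assumed set of status codes that trigger failover.
RETRYABLE = {429, 500, 502, 503, 504}


def invoke_with_fallback(targets: Sequence[Target], prompt: str) -> tuple[str, str]:
    """Try each target in order; return (target name, result) for the first success."""
    failures = []
    for target in targets:
        start = time.monotonic()
        try:
            result = target.call(prompt)
        except ProviderError as exc:
            if exc.status not in RETRYABLE:
                raise  # e.g. a 400 would fail on every target, so do not fail over
            failures.append(f"{target.name}: HTTP {exc.status}")
            continue
        elapsed = time.monotonic() - start
        if elapsed > target.max_latency_s:
            failures.append(f"{target.name}: too slow ({elapsed:.2f}s)")
            continue
        return target.name, result
    raise RuntimeError("all targets failed: " + "; ".join(failures))
```

Non-retryable errors are re-raised immediately because a malformed request would fail the same way on every target in the chain.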

Token Budget Enforcement

Hard limits on tokens or cost per workflow, per session, or per user tier. Invocations that would breach the budget are rejected before reaching the model.
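The key property described above is that the check happens before the model call, so a rejected invocation costs nothing. A minimal sketch of that pre-flight check follows. The `TokenBudget` class, the `BudgetExceeded` exception, and the choice to reserve the worst case (prompt tokens plus the maximum output tokens) are assumptions for illustration, not OrchVynt's documented behavior.

```python
from dataclasses import dataclass
from typing import Callable


class BudgetExceeded(Exception):
    """Raised when an invocation would push usage past the limit."""


@dataclass
class TokenBudget:
    """Hypothetical hard token limit for one scope (workflow, session, or tier)."""
    limit: int
    used: int = 0

    def reserve(self, estimated_tokens: int) -> None:
        """Reserve tokens up front, or raise without changing usage."""
        if estimated_tokens < 0:
            raise ValueError("estimated_tokens must be non-negative")
        if self.used + estimated_tokens > self.limit:
            raise BudgetExceeded(
                f"need {estimated_tokens}, only {self.limit - self.used} left"
            )
        self.used += estimated_tokens


def guarded_invoke(
    budget: TokenBudget,
    prompt_tokens: int,
    max_output_tokens: int,
    call_model: Callable[[], str],
) -> str:
    """Reject before calling the model if the worst case would breach the budget."""
    budget.reserve(prompt_tokens + max_output_tokens)
    return call_model()
```

Reserving the worst case is conservative: it can reject a call that would have fit, but it never lets an invocation overshoot the limit. A fuller version might refund unused output tokens once the actual usage is known.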

Human-in-the-Loop Gates

Compliance-driven pause-and-review checkpoints that notify reviewers via Slack or webhook and record a tamper-evident audit trail.
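One common way to make an audit trail tamper-evident is hash chaining, where each entry's hash covers the previous entry's hash, so editing any past record invalidates every later one. The source does not say which mechanism OrchVynt uses. The sketch below only illustrates the general technique, and the entry layout and function names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with a canonical event encoding."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log: list, event: dict) -> dict:
    """Append a review event (e.g. an approval) linked to the prior entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {
        "event": event,
        "prev_hash": prev_hash,
        "hash": _entry_hash(prev_hash, event),
    }
    log.append(entry)
    return entry


def verify(log: list) -> bool:
    """Recompute the chain; any edited, reordered, or removed entry breaks it."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

A chain like this detects modification but does not prevent it, and truncating the tail goes unnoticed unless the latest hash is also stored somewhere the writer cannot alter.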

Deployment options

OrchVynt supports two deployment modes:

Cloud-hosted (Starter and Production tiers): No infrastructure to manage. Connect by pointing your agents at the OrchVynt endpoint and configuring API keys.
Self-hosted (Enterprise tier): Deploy with the Docker image or the Helm chart for Kubernetes. Your traffic never leaves your VPC, and self-hosted deployments do not phone home.

Language support

The OrchVynt routing API is a REST interface over HTTP. First-class SDKs are available for Python and TypeScript; any other language with an HTTP client can call the REST API directly.

OrchVynt is in early access. The API surface is stable for the core primitives. We're actively expanding SDK coverage and documentation. If you hit gaps, contact us directly.