Voice Agents
Voice agents are the realtime orchestration entrypoints for live conversations. They wrap prompt versions, add voice configuration, define handoff targets, and start inspectable realtime sessions.
What voice agents are
A voice agent currently includes:
- name and stable agent key
- status
- versions
- published version state
- a root prompt version for each agent version
- optional voice configuration
- optional handoff description
Current product behavior
The app presents voice agents as a resource distinct from text prompts and multi-step agents. The responsibilities divide as follows:
- prompts define the core behavior
- tools provide external capabilities
- voice agents define realtime entrypoints and handoffs
- realtime sessions record live execution
The backend currently exposes the voice agent routes under /api/realtime-agents, with /api/voice-agents available as an alias prefix.
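Since the same routes are reachable under two prefixes, a client script can keep the prefix configurable. The following is a minimal sketch, assuming the alias prefix mirrors /api/realtime-agents path for path; `agent_url` and `VOICE_AGENT_PREFIX` are hypothetical names introduced for illustration, not part of the product.

```shell
# Hypothetical helper: build a voice agent route URL under either prefix.
# Assumption: /api/voice-agents mirrors /api/realtime-agents path for path.
VOICE_AGENT_PREFIX="${VOICE_AGENT_PREFIX:-realtime-agents}"

agent_url() {
  # $1 = API base URL, $2 = agent id, $3 = sub-path (e.g. "session" or "versions")
  base="${1%/}"   # drop one trailing slash so the result has no "//"
  printf '%s/api/%s/%s/%s\n' "$base" "$VOICE_AGENT_PREFIX" "$2" "$3"
}
```

For example, `agent_url "$API_BASE" agent_123 session` produces the session route under whichever prefix is selected.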
Session flow
Creating a realtime session returns:
- a run id
- a provider/model pair
- an external realtime session id
- a client secret for the realtime provider
- the resolved agent graph snapshot used for the session
That graph includes the prompt composition and the tools available to each participating agent version.
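A client typically needs to pull individual values out of this response, such as the run id for later inspection and the client secret for connecting to the realtime provider. The sketch below shows one way to do that with jq. Every field name in the sample body (`runId`, `provider`, `model`, `externalSessionId`, `clientSecret`, `graph`) is an assumption chosen to mirror the list above, and `session_field` is a hypothetical helper; check the actual response shape before relying on any of them.

```shell
# Hypothetical response body. All field names are assumptions that mirror
# the documented list; the real API may name or nest them differently.
SAMPLE_SESSION_RESPONSE='{
  "runId": "run_123",
  "provider": "example-provider",
  "model": "example-model",
  "externalSessionId": "sess_123",
  "clientSecret": "secret_abc",
  "graph": {"agents": []}
}'

session_field() {
  # $1 = response JSON, $2 = top-level field name.
  # -r prints strings without quotes; with -e and "// empty", jq prints
  # nothing and exits non-zero when the field is missing or null.
  printf '%s' "$1" | jq -er --arg key "$2" '.[$key] // empty'
}
```

A call such as `run_id=$(session_field "$response" runId)` then fails visibly if the field is absent. The client secret is a credential, so it is best kept out of logs and shell history.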
API workflow
Create a voice agent version
curl -X POST "$API_BASE/api/realtime-agents/agent_123/versions" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "organizationId": "org_123",
    "promptVersionId": "prompt_version_123",
    "voice": "alloy",
    "handoffDescription": "Transfer billing questions to the billing specialist"
  }'
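When the request values come from variables rather than literals, building the body with jq avoids manual quoting mistakes. This is a rough sketch, assuming the API accepts a body that omits `voice` and `handoffDescription` (both are described above as optional, but the request contract is not confirmed here); `build_version_payload` is a hypothetical helper name.

```shell
# Hypothetical helper: build the JSON body for a new voice agent version.
# Assumption: the optional fields may simply be left out of the request.
build_version_payload() {
  # $1 = organization id, $2 = prompt version id,
  # $3 = optional voice, $4 = optional handoff description
  jq -cn \
    --arg org "$1" \
    --arg pv "$2" \
    --arg voice "${3:-}" \
    --arg handoff "${4:-}" '
    {organizationId: $org, promptVersionId: $pv}
    + (if $voice != "" then {voice: $voice} else {} end)
    + (if $handoff != "" then {handoffDescription: $handoff} else {} end)'
}
```

The result can be passed straight to curl, for example `-d "$(build_version_payload org_123 prompt_version_123 alloy)"`.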
Create a realtime session
curl -X POST "$API_BASE/api/realtime-agents/agent_123/session" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "organizationId": "org_123"
  }'