Evaluations

Evaluations compare outputs across stored cases for a skill.

Public beta API

  • GET /api/skills/:id/evaluation-cases
  • POST /api/skills/:id/evaluation-cases
  • POST /api/skills/:id/evaluations/run
  • GET /api/skills/:id/evaluations/:evaluationRunId
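
As a minimal sketch, the four routes above can be assembled from a skill id and, for the last one, an evaluation run id. The helper name evaluation_routes and the dictionary keys are hypothetical labels chosen for illustration. Only the methods and path templates come from this page.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import quote

API_PREFIX = "/api/skills"


def evaluation_routes(
    skill_id: str, evaluation_run_id: Optional[str] = None
) -> Dict[str, Tuple[str, str]]:
    """Return (method, path) pairs for the evaluation routes listed on this page.

    The dictionary keys are illustrative labels, not names used by the API.
    """
    # Percent-encode ids so characters such as "/" cannot change the path shape.
    sid = quote(skill_id, safe="")
    routes = {
        "get_cases": ("GET", f"{API_PREFIX}/{sid}/evaluation-cases"),
        "post_cases": ("POST", f"{API_PREFIX}/{sid}/evaluation-cases"),
        "post_run": ("POST", f"{API_PREFIX}/{sid}/evaluations/run"),
    }
    # The run-detail route needs an evaluation run id, so it is optional here.
    if evaluation_run_id is not None:
        rid = quote(evaluation_run_id, safe="")
        routes["get_run"] = ("GET", f"{API_PREFIX}/{sid}/evaluations/{rid}")
    return routes
```

Percent-encoding the ids is a defensive choice in this sketch, not a documented requirement of the API.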

POST /api/skills/:id/evaluations/run uses the api_key_or_access_token auth mode: the request must be authenticated with either an API key or an access token.
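
The sketch below builds, without sending, a request for the run route. Several details are assumptions because this page does not specify them: that the credential travels in an Authorization: Bearer header, that the body is JSON, and that an empty JSON object is an acceptable payload. The function name build_run_request and the base URL passed to it are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict, Optional


def build_run_request(
    base_url: str,
    skill_id: str,
    credential: str,
    payload: Optional[Dict[str, Any]] = None,
) -> urllib.request.Request:
    """Build a POST request for /api/skills/:id/evaluations/run.

    Assumptions (not confirmed by this page): Bearer-style Authorization
    header, JSON request body, empty object as the default payload.
    """
    url = f"{base_url.rstrip('/')}/api/skills/{skill_id}/evaluations/run"
    body = json.dumps(payload or {}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            # credential may be an API key or an access token.
            "Authorization": f"Bearer {credential}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to urllib.request.urlopen would send it. Check the API's authentication reference for the actual header scheme before relying on this.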

Boundary note

Evaluation results may reference underlying run IDs, but to inspect those runs you still use the family-specific public run-detail routes for agent and skill flows.
