Evaluations (Public Beta)

Evaluations provide structured comparison runs for skills.

Public beta scope:

  • list evaluation cases
  • create an evaluation case
  • run an evaluation
  • get an evaluation run

SDK methods

Each method maps to one HTTP route. In the routes, :id is the skillId argument; in the signatures, a trailing ? marks an optional field.

  • listCases(skillId, { organizationId }) -> GET /api/skills/:id/evaluation-cases
  • createCase(skillId, { organizationId, ... }) -> POST /api/skills/:id/evaluation-cases
  • run(skillId, { organizationId, draftVersionId? }) -> POST /api/skills/:id/evaluations/run
  • get(skillId, evaluationRunId, { organizationId }) -> GET /api/skills/:id/evaluations/:evaluationRunId
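
The method-to-route mapping above can be sketched as a small lookup table. This is an illustrative sketch only, not the SDK's implementation: the names evaluationRoutes, Route, and seg are hypothetical, and the example IDs are made up. This page does not say how organizationId or the other option fields are transmitted (query string, header, or body), so the sketch builds only the HTTP method and path.

```typescript
// Hypothetical helper: maps each beta SDK method to the HTTP route listed above.
// It does not send requests and does not handle organizationId or other options,
// because their transport is not specified on this page.
type HttpMethod = "GET" | "POST";

interface Route {
  method: HttpMethod;
  path: string;
}

// URL-encode each ID so reserved characters stay inside a single path segment.
const seg = (value: string): string => encodeURIComponent(value);

const evaluationRoutes = {
  listCases: (skillId: string): Route => ({
    method: "GET",
    path: `/api/skills/${seg(skillId)}/evaluation-cases`,
  }),
  createCase: (skillId: string): Route => ({
    method: "POST",
    path: `/api/skills/${seg(skillId)}/evaluation-cases`,
  }),
  run: (skillId: string): Route => ({
    method: "POST",
    path: `/api/skills/${seg(skillId)}/evaluations/run`,
  }),
  get: (skillId: string, evaluationRunId: string): Route => ({
    method: "GET",
    path: `/api/skills/${seg(skillId)}/evaluations/${seg(evaluationRunId)}`,
  }),
};

// Example with made-up IDs.
const example: Route = evaluationRoutes.get("skill_123", "run_456");
console.log(`${example.method} ${example.path}`);
// prints "GET /api/skills/skill_123/evaluations/run_456"
```

Note that listCases and createCase share one path and differ only in HTTP method, while run and get use separate paths under /evaluations.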

Beta notes

  • Evaluation surfaces are in public beta.
  • Only the operations listed above are supported in this release.
