Competency framework evaluation
A score only counts when it's anchored in the competency framework.
Every roleplay template declares which competencies each scenario tests. The AI scores exactly those criteria: no keyword guessing, and no one-size-fits-all catalog imposed on you.
Session report
Medical Visit, Skeptical Cardiologist
Trainee: Marcela R. · Channel: Voice · 12 min
Overall score: 87 · Passed
Competencies locked when the session started
PROD-001 · Product mastery · 92
OBJ-003 · Objection handling · 78
COMP-014 · Label compliance · 95
Evaluated criteria
AI insights · Strengths
Anchored the pitch in the HCP's hypertensive patient profile by 1:15. Cited a phase-3 study when challenged on efficacy.
Areas to improve
At 4:32 the HCP asked about interaction with beta-blockers and the response was vague ("I'll check and get back to you"). Recommendation: targeted training on drug interactions.
Your company's competency framework
Every company has its own catalog of competencies and criteria. Yours starts from ready-made industry templates at onboarding and is fully editable from there: you can add competencies specific to your business that don't exist in any catalog.
The AI scores. Code decides.
The AI owns scoring. The pass-or-fail rule lives in auditable code, including "compliance blockers" that fail the session regardless of the score (for example, a label violation fails the session even at 95 overall).
Everything locked for audit
Criteria are locked when the session starts. AI instructions are pinned to a specific version. Transcript, audio and report are stored with configurable retention. The audit trail is ready before anyone asks for it.
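To make the "AI scores, code decides" split concrete, here is a minimal Python sketch of what a pass-or-fail rule with a compliance blocker could look like. It is an illustration under stated assumptions, not the product's actual code: the `CriterionScore` type, the `violated` flag, the `decide` function and the pass mark of 70 are all hypothetical. Only the criterion ID `COMP-014` comes from the sample report above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CriterionScore:
    criterion_id: str       # e.g. "COMP-014" from the sample report
    score: int              # 0-100, as returned by the AI
    violated: bool = False  # hypothetical flag: the AI reported a hard violation


def decide(
    overall: int,
    scores: list[CriterionScore],
    pass_mark: int = 70,  # assumed threshold, not stated on this page
    blockers: frozenset[str] = frozenset({"COMP-014"}),
) -> tuple[bool, str]:
    """Plain, reviewable rule: the AI supplies numbers, this code decides."""
    # A violated blocker fails the session no matter how high the score is.
    for s in scores:
        if s.criterion_id in blockers and s.violated:
            return False, f"compliance blocker: {s.criterion_id}"
    if overall < pass_mark:
        return False, f"overall {overall} below pass mark {pass_mark}"
    return True, "passed"


# A label violation fails the session even with 95 overall.
print(decide(95, [CriterionScore("COMP-014", 95, violated=True)]))
# (False, 'compliance blocker: COMP-014')
print(decide(87, [CriterionScore("COMP-014", 95)]))
# (True, 'passed')
```

Because the rule is ordinary code, every failed session carries a reason string that an auditor can trace to a specific line.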
From the framework to the report.
The entire chain is predictable and auditable.
01
Framework curation
The company's admin edits competencies, criteria and scenario contexts. Add, edit, deactivate, all versioned.
02
The template declares
In the step-by-step template setup, the author picks which competencies each scenario tests. Each criterion's weight is configurable.
03
Roleplay locks the snapshot
When the session starts, the criteria are snapshotted on the roleplay. Even if the template is edited later, the session runs against that original version.
04
AI scores, code decides
The AI receives the transcript and instructions, returns a structured score per criterion, and the pass-or-fail rules are applied. The full result is persisted for audit.
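The four steps above can be sketched in a few lines of Python. This is a simplified illustration, not the platform's implementation: the function names, the integer weights (3, 4, 3) and the `NEW-001` criterion are assumptions made for the example. The criterion IDs and scores are the ones shown in the sample session report.

```python
from types import MappingProxyType

# Step 02: the template declares criteria and weights (weights are assumed).
template = {"PROD-001": 3, "OBJ-003": 4, "COMP-014": 3}


def start_session(template_weights: dict[str, int]) -> MappingProxyType:
    """Step 03: copy the criteria and expose them read-only for the session."""
    return MappingProxyType(dict(template_weights))


def overall_score(snapshot, ai_scores: dict[str, int]) -> int:
    """Step 04: weighted mean over exactly the locked criteria."""
    if set(ai_scores) != set(snapshot):
        raise ValueError("AI output must cover exactly the locked criteria")
    total_weight = sum(snapshot.values())
    weighted = sum(ai_scores[cid] * w for cid, w in snapshot.items())
    return round(weighted / total_weight)


snapshot = start_session(template)

# The template is edited after the session started; the snapshot is unaffected.
template["OBJ-003"] = 1
template["NEW-001"] = 5  # hypothetical competency added later

# Stand-in for the model's structured output, one score per locked criterion.
ai_scores = {"PROD-001": 92, "OBJ-003": 78, "COMP-014": 95}
print(overall_score(snapshot, ai_scores))  # 87
```

With these assumed weights the sample scores come to 87.3, which rounds to 87. The page does not state the real weights, so treat the match with the sample report as illustrative only.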
Why not run several AIs in parallel
Several AIs together don't add up; they diverge.
We tried running four models in parallel and averaging the result. The problem: each model has a different systematic bias, and the average dilutes the signal from whichever one got it right.
Instead we use a single model picked for each function, with versioned instructions measured against the rubric. Predictable, investigable, and comparable across sessions.
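A small arithmetic sketch shows why averaging can hurt. The numbers below are invented purely for illustration and are not measurements from the experiment described above: assume a rubric-calibrated score of 78 and four hypothetical models, each with its own systematic offset.

```python
from statistics import mean

# Illustrative values only (assumptions, not measured data).
reference = 78
biases = {"model_a": 0, "model_b": 9, "model_c": -4, "model_d": 7}
scores = {name: reference + bias for name, bias in biases.items()}

ensemble = mean(scores.values())
ensemble_error = abs(ensemble - reference)
best_single_error = min(abs(s - reference) for s in scores.values())

print(ensemble, ensemble_error, best_single_error)  # 81 3 0
```

In this toy case the average lands 3 points off, while one model was already exact. The biases do not cancel because nothing guarantees they are symmetric around the right answer.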
Running several AIs in parallel
- ✗ 4x the cost without 4x the confidence
- ✗ Averaging divergent biases dilutes the correct signal
- ✗ Hard to investigate a single score
- ✗ Inconsistent comparison across sessions
One model per function
- ✓ Cost controlled per call
- ✓ Versioned and auditable instructions
- ✓ Reproducible result
- ✓ Consistent comparison across sessions
Pairs perfectly with
Adaptive Track
Framework gap, automatic roleplay
The framework on this page is what the Adaptive Track uses to map gaps onto competencies.
Learn more →
Dashboards
Evolution per competency
Watch every team member rise (or drop) in each framework criterion over time.
Learn more →
Compliance
Audit trail of every call
Instructions, model, tokens, cost and latency, all on record for regulatory audit.
Learn more →
Ready to transform how your team trains?
For organizations with 50+ employees. Book 45 minutes and we'll think the setup through with you.