The Over-Rating Risk: Why AI Needs Human Calibration in Behavioural Event Interviews

Behavioural Event Interviews (BEIs) have long been a cornerstone of leadership assessment, offering deep insights into how individuals think, act, lead and make decisions in real-world scenarios. Traditionally, these interviews rely on skilled assessors to probe for evidence and evaluate behaviours against a structured framework; at GFB, this is our Schroder High Performing Management Competency model.


With the rise of AI, many organisations are asking: Can technology enhance this process without losing its human touch? In fact, a growing number of our clients come to us specifically asking how we can use AI to make interviews more efficient without compromising quality. This question is shaping the future of leadership evaluation.

Where AI Adds Value

AI brings undeniable advantages to BEIs:

Transcription Accuracy: Tools can transcribe interviews in real time, saving hours of manual effort and allowing interviewers to focus on probing deeper, rather than writing notes.

Behaviour Tagging: AI can spot patterns linked to leadership attributes, saving substantial analysis time.


In our recent trial, AI demonstrated strong capability in identifying and flagging behaviours aligned with the Schroder framework’s 11 competencies across four behavioural clusters, a task that would take a human assessor far longer.


The Over-Zealous Rating Problem

However, our research revealed a critical flaw: AI was generous, too generous, with its ratings. While it correctly identified behaviours, it often assigned scores far higher than our Schroder behavioural framework would recommend. This highlights a key risk: AI lacks the nuanced judgment that experienced assessors bring. Competency frameworks are designed to calibrate behaviours against clear standards, and overinflated ratings can distort development priorities or even lead to hiring someone whose capability falls below the required standard.


Why This Happens

AI models typically learn from large datasets and optimise for pattern recognition, not for subtlety. They lack Situational Awareness Calibration (SAC) and cannot assess consistency, impact, or how behaviours manifest across contexts. As a result, AI may interpret any positive behaviour as “high performance,” missing the contextual depth that human assessors naturally apply after years of structured training and experience.


The Future: AI as an Assistant, Not a Judge

The takeaway? AI should support, not replace, human expertise in BEIs.

  • Use AI for transcription, tagging, and summarising.

  • Rely on consultants for calibration, interpreting nuances, and coaching.


Why Consultants Still Matter

Here’s the bigger picture: consultants do far more than score interviews. They bring strategic, developmental and ethical value that makes BEIs meaningful and actionable:


  • Strategic Alignment: Consultants ensure assessments link directly to organisational priorities, not just generic leadership traits.

  • Depth of Insight: They probe for context, consistency and impact, turning raw behavioural data into a clear picture of capability.

  • Development Translation: Beyond identifying strengths and gaps, consultants craft tailored development plans that drive real growth.

  • Risk Management: They safeguard against bias and misinterpretation, ensuring decisions are fair and defensible.

  • Credibility and Trust: Consultants provide transparency and human judgment, building confidence among candidates and stakeholders in high-stakes decisions.


In short, consultants make BEIs more than a diagnostic assessment: they turn them into a strategic tool for leadership.


Ethical and Practical Considerations

We also have to consider the ethical implications of using AI:


  • Bias: AI systems can amplify unfair biases if training data isn’t representative or is skewed.

  • Privacy: Interview recordings must comply with GDPR and organisational policies.

  • Transparency: Candidates should know when AI is involved in assessment.


Final Thought

AI is exciting, but it’s not a silver bullet. It can accelerate the mechanics of BEIs, yet it cannot replace the human judgment that ensures fairness, strategic alignment and developmental impact. The strength of a robust framework lies in combining behavioural science, clear developmental levels and organisational relevance. AI should streamline consultants’ work, not substitute for it, making space for experts to safeguard outcomes and uphold values. In leadership assessment, nuance matters, and that’s something no algorithm can fully replicate.


If you are interested in discussing how GFB can support your leadership recruitment, or want to know more about Behavioural Event Interviews for your organisation, please get in touch.
