What Is the Oral Defense?
The oral defense is the final component of Level 3 certification (EAS/CAEE) at eval.qa. Unlike the written exam—which tests breadth of knowledge across eval domains—the oral defense assesses your ability to think on your feet, defend technical decisions, and communicate eval concepts to critical audiences.
It is a 30-45 minute panel examination conducted by 2-3 master-level eval practitioners. They ask probing questions about your portfolio of eval work, your methodology, and your reasoning, and they test how you handle edge cases and criticism. The goal is not to stump you, but to ensure you have genuine expertise, not just memorized answers.
Why does it exist? Because eval expertise is as much about judgment and reasoning as it is about knowledge. Written exams can be passed by smart people who haven't done real work. Oral defenses cannot.
Format Overview: 30-45 Minutes, 2-3 Evaluators
Timeline
- Opening (5-10 min): You present a prepared case study from your eval portfolio.
- Questions (20-30 min): Evaluators ask questions covering technical understanding, methodology justification, stakeholder communication, ethical reasoning, and practical application.
- Closing (2-3 min): You summarize your key insights or ask the evaluators clarifying questions.
Panel Composition
Typically 2-3 experienced evaluators from different backgrounds: a practitioner with deep production eval experience, an academic or researcher with a background in methodological rigor, and possibly a domain specialist (healthcare, finance, AI safety, etc.). This diversity ensures your defense is tested from multiple angles.
Remote or In-Person?
Defenses are typically conducted via video conference (Zoom, Google Meet) to accommodate geography. Some in-person defenses are available in major cities. Technical requirements: stable internet, quiet room, clear audio/video, ability to share your screen.
The oral defense is not designed to be adversarial or to trick you. Evaluators want to understand your thinking, test your ability to handle critique, and ensure your expertise is genuine. If you've done real eval work and can articulate your reasoning, you will pass.
The Five Examination Domains
1. Technical Eval Knowledge
What's being tested: Do you understand eval fundamentals deeply? Can you explain metrics, benchmark design, statistical validity, and technical tradeoffs?
Example questions:
- "Walk us through how you designed the eval for your case study. Why those metrics?"
- "What is the difference between your evaluation methodology and a simpler baseline approach? Why is that difference important?"
- "Explain the statistical validity of your results. How confident are you in your findings?"
- "What eval techniques did you explicitly avoid, and why?"
2. Methodology Justification
What's being tested: Can you defend your choices? Do you understand the why behind your eval design, not just the what?
Example questions:
- "You chose to do [specific eval approach]. Another evaluator might have chosen [alternative approach]. Why is yours better?"
- "What were you trying to optimize for? What were you willing to trade off?"
- "How would you respond to criticism that your methodology is too [narrow/strict/lenient]?"
3. Stakeholder Communication
What's being tested: Can you communicate eval findings to non-technical audiences? Do you understand how to present results to build trust and drive decisions?
Example questions:
- "In your case study, you found [result]. How did you communicate this to your CEO/product team/compliance? What did they need to know?"
- "How would you explain your eval methodology to someone with no background in evaluation?"
- "Describe a time when your eval results contradicted what stakeholders wanted to hear. How did you handle that?"
4. Ethical Reasoning
What's being tested: Do you think about the ethical implications of your eval? Do you understand potential harms (false positives, bias, unfair comparison)?
Example questions:
- "Who could be harmed by your eval design or results? What did you do about it?"
- "Did you consider bias in your evaluation methodology? How did you mitigate it?"
- "Your eval might be used to make hiring/financial/medical decisions. How does that change your responsibility?"
- "Describe an ethical tension in your case study and how you resolved it."
5. Practical Application
What's being tested: Can you apply eval knowledge to real problems? Can you iterate and improve based on feedback?
Example questions:
- "Your eval methodology worked well in 2024. How would you adapt it for new models / new domains / new use cases?"
- "If you were to redo this evaluation, what would you do differently?"
- "How would your eval change if constraints changed (smaller budget, faster timeline, different stakeholders)?"
- "Tell us about a failure in your eval work and what you learned from it."
Preparing Your Eval Portfolio for the Defense
What to Bring
- Primary case study (physical copy + digital): One well-documented eval project you've led or significantly contributed to. 15-20 pages max. Include: context, problem statement, methodology, results, implications, reflections on what worked/what didn't.
- Evidence portfolio: Slide deck (10-15 slides) with highlights from 2-3 additional projects. You won't present these in detail, but evaluators can ask about them. Include: project title, metrics, key findings, any visualizations or dashboards.
- Data/notebooks (optional): If comfortable, have notebooks or datasets ready to share. Shows rigor, but not required.
- Notes on methodology and reasoning: A one-page document articulating your eval philosophy, key decisions, and how you think about tradeoffs.
Constructing Your Primary Case Study
Choose a project where:
- You made real decisions (not just executed prescribed methods)
- You had to navigate tradeoffs or constraints
- Results were surprising or counterintuitive (if possible)
- You can articulate what worked and what you'd do differently
- You understood the broader context and stakeholders
Structure:
- Context (2-3 pages): What was the business/research problem? Who were the stakeholders? What was the timeline/budget/constraints?
- Methodology (4-5 pages): How did you design the evaluation? What metrics, baselines, datasets? Why these choices?
- Results (3-4 pages): What did you find? Visualizations, tables, confidence intervals. Be precise.
- Implications (2-3 pages): What do your findings mean? What decisions did they enable? What are the limitations?
- Reflection (2-3 pages): What surprised you? What would you do differently? What did you learn?
Tone: Honest, not polished marketing. Evaluators respect clarity, humility, and willingness to acknowledge limitations more than perfection.
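The Results section above asks for confidence intervals and precision. As one possible illustration, the sketch below computes a percentile bootstrap confidence interval for a mean score using only the Python standard library. The function name `bootstrap_mean_ci`, its parameters, and the `scores` list are all hypothetical and invented for this example; they are not real eval data or a method prescribed by eval.qa, and your own case study may call for a different interval method.

```python
import random
import statistics


def bootstrap_mean_ci(values, n_resamples=5000, confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    alpha = 1.0 - confidence
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return statistics.fmean(values), lower, upper


# Hypothetical per-item scores (1 = correct, 0 = incorrect); not real data.
scores = [1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1]
mean, lower, upper = bootstrap_mean_ci(scores)
print(f"accuracy = {mean:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

Reporting the interval next to the point estimate makes it easier to answer "How confident are you in your findings?" with a number rather than an impression, especially when the sample is as small as this one.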
Common pitfalls to avoid: presenting a project where you followed a template without making real decisions; overselling results or hiding limitations; choosing a project you can't defend in depth; submitting something from five years ago whose details you've forgotten. Keep your portfolio recent and genuine.
Opening Presentation Format: 5-10 Minutes, Tight Structure
Your opening presentation is your chance to frame the conversation. Make it punchy and strategic.
Recommended Structure
- "The Question" (1 min): Start with the business/research question that motivated your eval. Not background, not context—the core question. Example: "We built a new medical coding AI. Before deploying to hospitals, we needed to answer: does it reduce clinician burden without increasing error rates?"
- "The Approach" (2 min): High-level description of how you designed the evaluation. Key decisions. Example: "We ran a prospective study with 50 clinicians, comparing the AI against current workflow on three dimensions: speed, accuracy, and human satisfaction."
- "Key Finding" (1 min): Your most important result. Be specific. Example: "The AI reduced documentation time by 35% (p<0.01) with no significant difference in error rates, but clinician satisfaction dropped 20%."
- "So What?" (1 min): What decision did your eval enable? Example: "We decided to deploy with mandatory user training and weekly feedback loops, based on the satisfaction finding."
- "The Tradeoff" (1-2 min): Articulate one key limitation or tension in your eval. Show self-awareness. Example: "Our study was small and geographically limited. We're treating this as a pilot, knowing we'll need follow-up evaluation with larger and more diverse cohorts."
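As a quick sanity check on pacing, the segment lengths in the structure above can be added up and compared with the 5-10 minute opening window. The sketch below is only an illustration; the names `OPENING_SEGMENTS`, `OPENING_WINDOW`, `total_range`, and `fits_window` are made up for this example.

```python
# Segment lengths in minutes as (shortest, longest), taken from the structure above.
OPENING_SEGMENTS = {
    "The Question": (1, 1),
    "The Approach": (2, 2),
    "Key Finding": (1, 1),
    "So What?": (1, 1),
    "The Tradeoff": (1, 2),
}
OPENING_WINDOW = (5, 10)  # allowed length of the opening, in minutes


def total_range(segments):
    """Return the (shortest, longest) total duration across all segments."""
    shortest = sum(lo for lo, _ in segments.values())
    longest = sum(hi for _, hi in segments.values())
    return shortest, longest


def fits_window(segments, window):
    """True if every possible total falls inside the allowed window."""
    shortest, longest = total_range(segments)
    return window[0] <= shortest and longest <= window[1]


print(total_range(OPENING_SEGMENTS))                   # (6, 7)
print(fits_window(OPENING_SEGMENTS, OPENING_WINDOW))   # True
```

The recommended structure totals 6-7 minutes, which leaves a few minutes of slack inside the window for pauses or a slower delivery.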
What NOT to Do
- Avoid lengthy background/context. Evaluators read your portfolio. Use the opening to highlight and connect, not to repeat.
- Don't oversell results. Evaluators are trained to detect spin.
- Don't pack the opening with technical jargon unless you know your audience.
- Don't read from slides. Make eye contact. Speak naturally.
Anticipated Question Categories: 20+ Sample Questions
Methodological Questions
- Walk us through your metric selection. How did you decide what to measure?
- Why did you choose [specific evaluation approach] over [alternative]?
- How did you handle [identified bias/confound/limitation]?
- What would change if you had 10x the budget? 1/10th the budget?
- Explain the statistical analysis. Why that test rather than another?
Scenario-Based Questions
- A stakeholder disagrees with your findings. How do you respond?
- Six months after your evaluation, new data suggests the opposite result. How do you investigate?
- You're evaluating a system for hiring/medical/financial decisions. How does that change your methodology?
- What evaluation would you run if the system were to be deployed in a completely different context?
Reflection & Growth Questions
- Tell us about an eval project that didn't go as planned. What happened?
- What's the most important thing you learned from this case study?
- If you were to redo this evaluation from scratch, what's one thing you'd do completely differently?
- How has your eval philosophy evolved over your career?
Ethical & Professional Questions
- Who are the stakeholders who could be harmed by your evaluation or findings?
- Did you consider bias in your evaluation methodology? How?
- How would you respond if your findings contradicted your organization's business goals?
- What does ethical evaluation mean to you?
Communication Questions
- Explain your methodology to someone with no background in evaluation.
- How did you communicate uncertainty to stakeholders?
- Describe a time you had to deliver bad news to leadership based on your eval. How did you frame it?
Common Mistakes Candidates Make
Over-Scripting
The mistake: Memorizing answers word for word. When a question is phrased slightly differently, you get thrown off.
The fix: Prepare themes and key points, not scripts. Practice articulating the same idea in different ways. Evaluators can tell when you're reciting versus thinking.
Defensive Posture
The mistake: Taking critical questions as attacks. Answering with justifications rather than exploration. "That's not a fair question because..."
The fix: Treat tough questions as invitations to demonstrate your thinking. "That's a great point. Here's how I thought about that, and here are the tradeoffs..." Evaluators are impressed by intellectual humility, not defensiveness.
Inability to Admit Uncertainty
The mistake: Overconfident answers. Pretending you knew the answer when you didn't or claiming certainty where it doesn't exist.
The fix: "I don't know" is a strong answer if followed by "Here's how I'd approach finding out..." or "Here's what I should have done..." Show your reasoning process, not just your knowledge.
Losing the Thread
The mistake: Getting lost in technical details during the opening presentation. Forgetting to anchor back to the original question or stakeholder need.
The fix: Before each answer, pause and ask yourself: "How does this connect to the core eval question?" Explicitly articulate that connection.
Ignoring Non-Technical Dimensions
The mistake: Focusing entirely on methodology and metrics, ignoring stakeholder communication, ethical dimensions, or practical constraints.
The fix: In your opening and throughout, weave in evidence that you understand eval as a human/organizational endeavor, not just a technical one. "Our stakeholders needed to understand this result in 2 weeks, which is why we chose..."
Signs you're ready: You can articulate the business/research question clearly. You can defend every methodological choice. You acknowledge limitations without apologizing for them. You can pivot between technical and non-technical language. You show curiosity about evaluator feedback and questions. You're comfortable saying "I don't know, but here's how I'd find out."
How Evaluators Score the Defense: The Rubric
Passing threshold: To pass, you must be rated at least "Proficient" on four of the five dimensions, with no more than one "Developing" rating. Any rating below "Developing" is an automatic fail.
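The passing threshold can be read as a small decision rule. The sketch below is one possible encoding, under several assumptions that the text does not confirm: that the five dimensions are the five examination domains, that the panel produces one consolidated rating per dimension, and that the scale is ordered. Only "Developing" and "Proficient" are named in this guide, so `BELOW_DEVELOPING` and `ABOVE_PROFICIENT` are hypothetical placeholders, as are the `Rating` and `passes_defense` names and the dictionary keys.

```python
from enum import IntEnum


class Rating(IntEnum):
    # Only "Developing" and "Proficient" are named in the guide.
    # BELOW_DEVELOPING and ABOVE_PROFICIENT are hypothetical placeholders.
    BELOW_DEVELOPING = 0
    DEVELOPING = 1
    PROFICIENT = 2
    ABOVE_PROFICIENT = 3


def passes_defense(ratings: dict[str, Rating]) -> bool:
    """Apply the stated threshold to one rating per dimension."""
    if len(ratings) != 5:
        raise ValueError("expected one rating for each of the five dimensions")
    values = list(ratings.values())
    # Any rating below "Developing" is an automatic fail.
    if any(r < Rating.DEVELOPING for r in values):
        return False
    proficient_or_better = sum(r >= Rating.PROFICIENT for r in values)
    developing = sum(r == Rating.DEVELOPING for r in values)
    # At least four dimensions at "Proficient", at most one "Developing".
    return proficient_or_better >= 4 and developing <= 1


example = {
    "technical": Rating.PROFICIENT,
    "methodology": Rating.PROFICIENT,
    "communication": Rating.PROFICIENT,
    "ethics": Rating.DEVELOPING,
    "application": Rating.PROFICIENT,
}
print(passes_defense(example))  # True
```

Under this reading, a single "Developing" rating is survivable, but a second one, or any lower rating, is not.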
Practice Session Guide: How to Prepare Effectively
Phase 1: Solo Preparation (2-3 weeks before defense)
Week 1: Finalize your case study. Write it out. Read it aloud. Time yourself on the opening (aim for 7-8 minutes).
Week 2: Generate anticipated questions (use the list above). For each, write a 2-3 sentence core answer. Practice pivoting to different angles.
Week 3: Record yourself presenting the opening. Listen back. Refine clarity, pacing, eye contact cues (even though you're recording).
Phase 2: Peer Mock Defenses (1-2 weeks before)
Find 2-3 peer evaluators: Ideally other eval professionals or people who've passed the Level 3 defense. If not available, anyone with critical thinking skills works.
Structure each mock (45 min total):
- 5-7 min: Your opening presentation
- 25-30 min: Evaluators ask questions (use the anticipated list)
- 10 min: Feedback. What worked? Where were you unclear? Where could you be stronger?
Do 2-3 mocks minimum. First one will be rough. Second and third will reveal patterns in your weaker areas.
Phase 3: Refinement (1 week before)
Focus on the weaknesses revealed in mocks. If you struggled with ethical questions, prepare deeper answers on ethics. If your communication was unclear, practice simplifying your explanations. If you were defensive, practice responding to tough questions with curiosity.
Do one final mock with critical feedback.
What Feedback to Prioritize
- Clarity: "I didn't follow your explanation." Address immediately.
- Coherence: "I don't see how X connects to Y." Explicitly articulate connections.
- Depth: "You said that but didn't justify it." Add reasoning.
- Confidence: "You seemed unsure of that answer." Practice the answer until it's solid.
Day-of Protocol: Logistics and Nerves
Technical Setup (30 min before)
- Test your internet connection, camera, microphone, and speakers.
- Open your slides or materials in a separate window.
- Have your case study document easily accessible (but don't read from it).
- Quiet room, no distractions, door closed.
- Wear business casual (you're being recorded/observed).
Mental Preparation (15 min before)
- Review your opening one time. Don't memorize—just familiarize.
- Remember: evaluators want you to pass. They're rooting for you.
- Nerves are normal and expected. Evaluators know you'll be nervous.
- Focus on authenticity over perfection. A genuine answer with some stumbling is better than a polished non-answer.
During the Defense
- Opening: Make eye contact with the camera (imagine evaluators are there). Speak slowly and clearly.
- Questions: Pause before answering. Take your time. If you need a moment to think, say so.
- If you don't know: "That's a great question. I didn't encounter that in my case study, but here's how I'd approach it..." or "I'm not sure, but here's what I would need to research."
- If you misspeak: "Let me rephrase that..." and continue. Don't apologize profusely.
- Take notes: Jotting down evaluator questions shows you're engaged and helps you remember to follow up if time permits.
Managing Nerves
- Deep breathing before the call and between segments.
- Remember: you've prepared. You know this material. Evaluators know you're nervous.
- Nerves show you care. That's a good sign.
- If your voice shakes, keep going. It's normal.
Preparation Timeline: 6 Weeks to Defense
- Weeks 1-2: Finalize case study. Write it up. Get feedback from mentor or peer.
- Weeks 3-4: Develop anticipated questions list. Write core answers. Practice opening presentation 5+ times.
- Weeks 5-6: Conduct 2-3 mock defenses with peers. Get critical feedback. Refine weak areas. Do final preparation.
- Day before: Light review. Get good sleep. Avoid cramming.
- Day of: Tech check 30 min before. Breathe. You're ready.
Prepare for Your Level 3 Oral Defense
Get access to mock defense prompts, rubric breakdowns, and mentor pairing through eval.qa Certification Labs.