AI agrees with you too much. Ask it to review your paper and it'll tell you it's great. Ask about a clinical decision and it'll find reasons you're right. This is called sycophancy, and it makes AI nearly useless for the one thing it could be best at: telling you where you're wrong.
The Challenge Circle is a practice. You deliberately set up AI to go hard on you -- to challenge your reasoning, surface hidden assumptions, and argue the other side. Not because AI is always right, but because the process of defending your thinking against pushback makes you sharper.
How It Works
You read a paper. You think you understand it. Now you tell the AI: tear it apart. What assumptions did the authors make that I'm not seeing? What's the weakest link in their methodology? If I cited this paper to support my argument, how would a skeptic dismantle it?
Or you're making a clinical decision. You feel good about the plan. So you ask: argue against this. What am I underweighting? What's the most likely way this goes wrong? What would a colleague who disagrees with me say, and why would they be right?
The point isn't that AI will always find real problems. Sometimes your reasoning is solid and the AI's pushback is weak. That's fine -- now you know your argument holds up. But more often than you'd expect, it surfaces something you genuinely missed. An assumption you didn't realize you were making. A confounder you overlooked. A counterargument you hadn't considered.
The Rule
If you're confident you're right, that's exactly when you need the Challenge Circle. Confidence is where blind spots hide.
Prompts That Actually Work
The key is being explicit. If you just say "review this," you'll get polite suggestions. You have to tell the AI what you want.
I just read [paper title/description]. I'm going to share the key findings and my interpretation. Your job is to challenge everything:
- What assumptions are the authors making that aren't stated?
- What are the weakest parts of their methodology?
- What alternative explanations for their results am I ignoring?
- If I'm wrong about my interpretation, why?
Don't hedge. Don't validate. Go hard.
For clinical reasoning:
I'm planning [treatment/decision] for [situation]. Argue against this choice as strongly as you can.
What risks am I underestimating? What would the root cause analysis say if this goes wrong? What's the strongest case for a completely different approach?
For research design, it's the same idea -- ask the AI to be the hostile reviewer before you submit and meet one for real.
Multiple Models, Multiple Angles
Andrej Karpathy talks about a "Council of AIs" -- asking multiple models the same question. This pairs well with the Challenge Circle. Run the same critique through ChatGPT, Claude, and Gemini. They have different training, different tendencies, and they'll find different problems.
The disagreements between models are often where the real insights are. If all three flag the same issue, listen. If they disagree, that's where the interesting uncertainty lives.
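The fan-out itself is simple to mechanize. Here's a minimal sketch, assuming you wrap each real API client (OpenAI, Anthropic, Google, or whatever you use) in a plain callable; the model labels and the stub lambdas below are placeholders for illustration, not real API calls.

```python
# Sketch of the "Council of AIs" fan-out: send one critique prompt to
# several models and collect their answers side by side for comparison.

def council_critique(prompt, askers):
    """Send the same prompt to every model in `askers`.

    askers: dict mapping a model label to a callable that takes the
    prompt string and returns that model's response string.
    Returns {model label: response} so critiques can be compared.
    """
    return {name: ask(prompt) for name, ask in askers.items()}


prompt = (
    "I'm sharing a paper's key findings and my interpretation. "
    "Challenge everything: unstated assumptions, weak methodology, "
    "alternative explanations. Don't hedge. Don't validate."
)

# Stub callables stand in for real API clients here.
critiques = council_critique(prompt, {
    "chatgpt": lambda p: "stub critique A",
    "claude": lambda p: "stub critique B",
    "gemini": lambda p: "stub critique C",
})

for model, critique in critiques.items():
    print(f"--- {model} ---\n{critique}\n")
```

Reading the responses side by side is what makes the shared flags and the disagreements jump out.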
Making It Stick
A few things I've learned from doing this:
Prime for criticism, not helpfulness. Open with "your job is to critique" or "argue against." This overrides the default agreeable mode.
Specify what harsh means. Tell it not to hedge, not to sandwich criticism with praise. "Don't be polite" is a useful instruction.
Give it a role. "You're a skeptical IRB reviewer" or "you're a senior attending who thinks I'm wrong." Role framing changes the output significantly.
Iterate. If the first response is still too soft, push back. "That's not hard enough. What are you holding back?"
Separate critique from revision. Get the full critique first. Fix things later, separately. Mixing them dilutes the criticism.
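The last tip is easy to enforce mechanically: make the critique and the revision two separate calls. A minimal sketch, where `ask` is a placeholder for whatever model call you use and the prompt wording is illustrative, not a fixed recipe:

```python
# Sketch of "separate critique from revision" as a two-pass workflow:
# pass 1 asks for criticism only, pass 2 revises against that critique.

CRITIQUE_PROMPT = (
    "Your job is to critique, not to help revise. List every weakness "
    "in the draft below. Don't hedge, don't sandwich criticism with "
    "praise, and don't suggest fixes yet.\n\nDRAFT:\n"
)

REVISE_PROMPT = (
    "Now revise the draft to address each point of the critique.\n\n"
    "DRAFT:\n{draft}\n\nCRITIQUE:\n{critique}\n"
)

def two_pass_review(draft, ask):
    """Pass 1: critique only. Pass 2: revise against that critique.

    ask: callable taking a prompt string, returning the model's reply.
    Returns (critique, revision).
    """
    critique = ask(CRITIQUE_PROMPT + draft)
    revision = ask(REVISE_PROMPT.format(draft=draft, critique=critique))
    return critique, revision
```

Because the first pass never mentions fixing anything, the model has no incentive to soften the critique to set up its own revision.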
When to Use This
Not for everything. You don't need a Challenge Circle to draft an email. But you need it when the stakes are real: a manuscript before submission, a study design before IRB, a clinical decision with genuine uncertainty, a talk before you give it.
And especially when you've already made up your mind. That's when your thinking is most vulnerable to blind spots, and that's when the Challenge Circle earns its keep.
