Not a chatbot. Not a side-by-side comparison. A deliberation engine where two AI models argue opposing sides, reject weak proposals, and only reach consensus when they've actually earned it — supervised by an independent judge with no allegiance to either side.
The engine doesn't follow a script. It reacts to what the models actually say — escalating when they disagree, resolving when they converge, and guaranteeing you always get a final answer. No two debates take the same path.
Inspired by "AI Safety via Debate" (Irving, Christiano & Amodei, 2018), which proposed having two AI agents argue opposing sides before a judge, on the premise that a false claim is harder to defend than to refute.
The engine is designed for genuine disagreement. Each backstop catches a specific failure and, if it cannot resolve the debate, hands off to the next. They fire in order, so each one covers the case where the one before it was not enough.
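The ordered backstop chain can be pictured roughly as follows. This is only a sketch under stated assumptions: the backstop names (`try_merge`, `try_vote`, `judge_ruling`), the `DebateState` fields, and the answer strings are hypothetical and are not taken from the engine itself. The one property it illustrates is that backstops run in a fixed order and the last one always answers.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class DebateState:
    # Hypothetical fields standing in for whatever the real engine tracks.
    positions_merged: bool = False
    votes_agree: bool = False
    judge_pick: str = "A"


# A backstop returns an answer if it resolved the debate, otherwise None.
Backstop = Callable[[DebateState], Optional[str]]


def try_merge(state: DebateState) -> Optional[str]:
    return "merged answer" if state.positions_merged else None


def try_vote(state: DebateState) -> Optional[str]:
    return "voted answer" if state.votes_agree else None


def judge_ruling(state: DebateState) -> Optional[str]:
    # Final backstop (assumed): never returns None, so a debate always ends.
    return f"judge ruled for {state.judge_pick}"


# Order matters: each entry only runs if every earlier one returned None.
BACKSTOPS: List[Tuple[str, Backstop]] = [
    ("merge", try_merge),
    ("vote", try_vote),
    ("judge", judge_ruling),
]


def resolve(state: DebateState) -> Tuple[str, str]:
    """Run backstops in order and return (backstop name, final answer)."""
    for name, backstop in BACKSTOPS:
        answer = backstop(state)
        if answer is not None:
            return name, answer
    raise RuntimeError("the last backstop must always produce an answer")
```

For example, `resolve(DebateState(votes_agree=True))` stops at the vote, while a state where nothing converged falls through to the judge.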
The debate engine is a numbered-step state machine. Each turn, models respond in parallel via server-sent events with real-time streaming to your browser. The engine tracks rejection counts, convergence scores, budget consumption, vote state, merge rounds, and revision history — reacting dynamically to what the models actually produce.
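As a rough illustration of a state machine that reacts to tracked counters, the sketch below updates a rejection count, a convergence score, and budget spend after each turn, then picks the next phase. The phase names, thresholds, and the `step` function are assumptions made for this example, not the engine's actual design. Vote state, merge rounds, and revision history are left out to keep it short.

```python
from dataclasses import dataclass
from enum import Enum


class Phase(Enum):
    # Hypothetical phase names.
    DEBATE = "debate"
    ESCALATE = "escalate"
    CONSENSUS = "consensus"
    JUDGE = "judge"


# Hypothetical thresholds, chosen only for illustration.
MAX_REJECTIONS = 3
CONVERGED_AT = 0.9
BUDGET = 1.00


@dataclass
class EngineState:
    phase: Phase = Phase.DEBATE
    rejections: int = 0
    convergence: float = 0.0
    spent: float = 0.0


def step(state: EngineState, rejected: bool, convergence: float, cost: float) -> Phase:
    """Record one turn's outcome and choose the next phase."""
    if rejected:
        state.rejections += 1
    state.convergence = convergence
    state.spent += cost

    if state.spent >= BUDGET:
        # Budget exhausted: hand the decision to the judge.
        state.phase = Phase.JUDGE
    elif state.convergence >= CONVERGED_AT:
        state.phase = Phase.CONSENSUS
    elif state.rejections >= MAX_REJECTIONS:
        state.phase = Phase.ESCALATE
    else:
        state.phase = Phase.DEBATE
    return state.phase
```

Checking the budget first is a deliberate choice in this sketch: it means a debate cannot keep running past its spending limit even when the models are still far apart.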
Provider-agnostic by design. Each debater can be any combination of OpenAI GPT, xAI Grok, Google Gemini, or Anthropic Claude. The engine automatically picks the best model pair based on question complexity — routing simple questions to faster models and complex ones to heavier reasoning tiers. The judge is always from a different provider than either debater. Real-time cost tracking keeps every debate within budget, and models have live web search and code execution so they argue with current data and verifiable calculations.
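The two selection rules described above, routing by complexity and keeping the judge on a different provider from both debaters, could look something like this. It is a minimal sketch: the provider keys, the 0 to 1 complexity score, the 0.5 cutoff, and the function names are all assumptions for illustration.

```python
from typing import List

# Provider keys are illustrative labels, not real API identifiers.
PROVIDERS: List[str] = ["openai", "xai", "google", "anthropic"]


def pick_tier(complexity: float) -> str:
    """Route by a hypothetical 0..1 complexity score (cutoff is assumed)."""
    return "reasoning" if complexity >= 0.5 else "fast"


def pick_judge(debater_a: str, debater_b: str) -> str:
    """Return the first provider that is not used by either debater."""
    for provider in PROVIDERS:
        if provider not in (debater_a, debater_b):
            return provider
    raise ValueError("no independent provider left to act as judge")
```

With four providers and at most two taken by debaters, an independent judge is always available. For instance, `pick_judge("openai", "anthropic")` returns `"xai"` in this sketch.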
Not every question needs a debate. Sometimes you just want to see what four different AI providers think. Use the right tool for the question.
A real debate about consciousness. Two AI models with genuinely irreconcilable philosophical positions, fighting through backstops, judge challenges, and status checks until the judge steps in.
No subscriptions. Flat pricing per debate. The engine automatically picks the right models based on your question's complexity — you just ask and pay the same price every time.
AI models are confident. They're articulate. They're often wrong. The only reliable way to find the truth is the same way humans have always done it — put two smart minds in a room and let them argue until what's left is what actually holds up.
Asking one model to double-check itself searches the same training data twice. Different providers mean different training, and with it different blind spots and different gaps. What one misses, another was trained on. The debate is what filters the signal from the noise.
Leading the development of cross-model deliberation systems: orchestrating structured debate between AI models from competing providers (OpenAI, Google, Anthropic, xAI) through a single, unified interface.