DevDebug Team
Team led by an Insight Centre emerging technologies researcher with an MSc from DCU, skilled in Digital Twins, IoT integration, GIS, and data analytics.
Project Description
DevDebug is an AI debugging assistant that argues with itself before giving you a fix. Developers routinely lose hours to an error message that turns out to need a one-line fix. The deeper problem isn’t a lack of AI debuggers; it’s that single-LLM debuggers hallucinate confidently. They propose a fix, you trust it, and it makes things worse.
DevDebug addresses this by applying Lyzr’s ACE-V (Proposer–Challenger–Judge) tri-agent governance protocol to code debugging. A developer pastes an error message and, optionally, a config file. Four agents then execute in strict sequence, with an Intake agent feeding the three ACE-V roles:
- Intake parses the error and config into structured data (language, runtime, packages, error class).
- Proposer (Perplexity sonar-pro) generates a root-cause hypothesis and concrete fix, verified against the npm registry and live web search.
- Challenger (Claude Sonnet, different model family by design) actively tries to falsify the Proposer’s fix — checking for version conflicts, deprecated APIs, runtime mismatches, and transitive dependency issues.
- Judge (GPT-5.1, a third independent model) issues a final verdict with a confidence score, changing the Proposer’s recommendation only when the Challenger produces new evidence — never on rhetoric.
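The four-stage sequence and the Judge’s evidence-only rule can be sketched as follows. This is a minimal illustration under stated assumptions, not DevDebug’s actual code: the dataclass fields, the `counter_fix` field, the fixed confidence values, the `example-lib` package, and the stub agents are all hypothetical, and the real agents are LLM calls rather than deterministic functions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Intake:
    language: str
    runtime: str
    packages: Dict[str, str]
    error_class: str


@dataclass
class Proposal:
    root_cause: str
    fix: str
    evidence: List[str] = field(default_factory=list)


@dataclass
class Challenge:
    objections: List[str] = field(default_factory=list)  # arguments only
    evidence: List[str] = field(default_factory=list)    # checkable facts
    counter_fix: Optional[str] = None                    # hypothetical field


@dataclass
class Verdict:
    fix: str
    confidence: float
    changed: bool


def evidence_only_judge(proposal: Proposal, challenge: Challenge) -> Verdict:
    """Change the recommendation only if the Challenger adds new evidence."""
    new_evidence = [e for e in challenge.evidence if e not in proposal.evidence]
    if new_evidence and challenge.counter_fix:
        # Placeholder confidence values; a real Judge would score these itself.
        return Verdict(challenge.counter_fix, confidence=0.6, changed=True)
    return Verdict(proposal.fix, confidence=0.9, changed=False)


def run_pipeline(error_text: str, config_text: Optional[str],
                 intake: Callable, proposer: Callable, challenger: Callable,
                 judge: Callable = evidence_only_judge) -> Verdict:
    """Run Intake -> Proposer -> Challenger -> Judge in strict sequence."""
    parsed = intake(error_text, config_text)
    proposal = proposer(parsed)
    challenge = challenger(parsed, proposal)
    return judge(proposal, challenge)


# Stub agents standing in for the model calls (all values are made up).
def stub_intake(error_text, config_text):
    return Intake("javascript", "node", {"example-lib": "1.0.0"}, "ImportError")


def stub_proposer(parsed):
    return Proposal("example-lib is outdated", "upgrade example-lib to 2.x",
                    ["registry lists example-lib 2.0.0"])


def rhetorical_challenger(parsed, proposal):
    return Challenge(objections=["upgrading feels risky"])


def evidenced_challenger(parsed, proposal):
    return Challenge(objections=["upgrade may break the runtime"],
                     evidence=["example-lib 2.x requires a newer runtime"],
                     counter_fix="pin example-lib to 1.x and patch the import")


upheld = run_pipeline("err", None, stub_intake, stub_proposer, rhetorical_challenger)
changed = run_pipeline("err", None, stub_intake, stub_proposer, evidenced_challenger)
print(upheld.changed, changed.changed)  # False True
```

In this sketch the first run keeps the Proposer’s fix because the Challenger offers only an objection, while the second run changes it because the Challenger supplies a fact the Proposer did not cite.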
The user watches the deliberation live: agent outputs stream into a transcript, evidence cards expand inline, and a final Verdict card auto-scrolls into view with a copyable fix. Because the Proposer, Challenger, and Judge each run on a different model provider, the disagreement is genuine, not theater.
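One way to feed a live transcript like this is to emit an event as each agent finishes, rather than returning everything at the end. The sketch below assumes a simple generator-based design; the event shape, the stage names, and the lambda agents are hypothetical, and the description does not say which transport (for example server-sent events or WebSockets) DevDebug uses.

```python
from typing import Any, Callable, Dict, Iterator, List, Tuple

Stage = Tuple[str, Callable[[Any], Any]]


def stream_deliberation(stages: List[Stage], payload: Any) -> Iterator[Dict[str, Any]]:
    """Yield one event per agent as soon as it finishes, then the verdict."""
    for name, agent in stages:
        payload = agent(payload)
        yield {"type": "agent_output", "agent": name, "output": payload}
    # The last stage's output is treated as the final verdict.
    yield {"type": "verdict", "output": payload}


# Placeholder agents; real ones would call the respective model providers.
stages: List[Stage] = [
    ("intake", lambda text: {"error": text}),
    ("proposer", lambda d: {**d, "fix": "hypothetical fix"}),
    ("challenger", lambda d: {**d, "objections": []}),
    ("judge", lambda d: {"fix": d["fix"], "confidence": 0.9}),
]

events = list(stream_deliberation(stages, "TypeError: x is not a function"))
for event in events:
    print(event["type"], event.get("agent", ""))
```

A front end consuming these events could append each `agent_output` to the transcript as it arrives and render the Verdict card when the `verdict` event is received.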
Why this matters: Under the ACE-V pattern, a fix reaches the user only after an independent model has tried to falsify it and a third has ruled on the evidence, so the output can be trusted in a way a single-LLM debugger’s output cannot. The architecture is general: any high-stakes agent decision benefits from an adversarial challenger and an evidence-only judge. DevDebug is a deployable instance of a pattern that scales to any domain where hallucinated confidence is dangerous.