ClarityAI
Team consisting of an AI Software Engineer from Coso.ai (RAG, LLMs, Python/TypeScript) and a GTM Account Executive from Stripe.
Project Description
ClarityAI is like combining Grammarly with the Apple AirPods Pro 3 Live Translation feature — but instead of translating languages or correcting writing, it “translates” accented or unclear English speech into clear, natural-sounding English in real time during Zoom calls.
Non-native speakers talk normally in their own accent; ClarityAI processes their voice live, improving pronunciation, clarity, and pacing and removing filler words, then sends a clean, native-level English version to the listener while preserving the speaker’s original voice identity.
Just as Grammarly fixes your writing as you type, ClarityAI fixes your speech as you talk. This gives non-native speakers an advantage in meetings, interviews, sales calls, and presentations without needing coaching or practice.
Technically, it orchestrates multiple modalities (voice input, browser UI, cloud LLM reasoning, and custom rule-based analysis) into one cohesive agent loop. This brings Grammarly-style intelligence to live conversations and helps people communicate more clearly and confidently, from sales calls and interviews to everyday team communication.
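As a rough illustration of what such an agent loop might look like, here is a minimal TypeScript sketch. It assumes the speech recognizer has already produced a text transcript for each utterance, and it treats the rewriting and voice-synthesis steps as injected functions. Every name in it (`Stages`, `processUtterance`, `runLoop`, `stubStages`) is hypothetical and not taken from the project; the stub stages stand in for the real cloud calls.

```typescript
// All names here are hypothetical; the project's real interfaces are not documented.

// Each stage is injected so the loop does not depend on any particular vendor SDK.
interface Stages {
  // Rewrites a raw transcript into clear English (for example, an LLM call).
  refine: (rawText: string) => Promise<string>;
  // Turns refined text into audio in the speaker's voice (for example, a TTS call).
  synthesize: (cleanText: string) => Promise<Uint8Array>;
}

interface LoopResult {
  raw: string;
  refined: string;
  audio: Uint8Array;
}

// Processes one finalized utterance: transcript in, refined text and audio out.
async function processUtterance(
  raw: string,
  stages: Stages
): Promise<LoopResult | null> {
  const trimmed = raw.trim();
  if (trimmed.length === 0) return null; // nothing was said, so skip the cloud calls
  const refined = await stages.refine(trimmed);
  const audio = await stages.synthesize(refined);
  return { raw: trimmed, refined, audio };
}

// Runs utterances strictly in order so the listener never hears them out of sequence.
async function runLoop(
  utterances: string[],
  stages: Stages
): Promise<LoopResult[]> {
  const results: LoopResult[] = [];
  for (const utterance of utterances) {
    const result = await processUtterance(utterance, stages);
    if (result !== null) results.push(result);
  }
  return results;
}

// Stub stages standing in for the real services, useful for local testing.
const stubStages: Stages = {
  // Strips "um"/"uh" plus any trailing comma or whitespace.
  refine: async (text) => text.replace(/\b(?:um+|uh+)\b[,\s]*/gi, "").trim(),
  // Returns one placeholder byte per character instead of real audio.
  synthesize: async (text) => new Uint8Array(text.length),
};
```

Injecting the stages keeps the loop testable offline: swapping `stubStages` for functions that call real rewriting and voice services would not change `runLoop` itself. Processing utterances sequentially is one possible design choice; it trades some latency for guaranteed ordering.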
Theme Alignment & Technologies Used:
ClarityAI aligns with the theme by merging browsers, voices, clouds, and tools into a single agent:
- Browsers: Floating UI overlay for live transcript and suggestions
- Voices: Real-time mic capture with WebRTC
- Clouds: OpenAI Realtime API for transcription, rewriting, and summaries
- Tools: Rule-based detectors, pacing counters, overlay controls, and WebSockets
Technologies:
- 🎤 Real-time Speech Recognition - Uses the browser’s native Web Speech API
- ✨ Text Refinement - OpenAI GPT-4o-mini for accent normalization
- 🎭 Voice Cloning - ElevenLabs for maintaining speaker identity
- 🎨 Clean UI - ShadCN-inspired black and white design
- ⚡ Fast Setup - Uses uv for Python package management
- ElevenLabs API
- GPT-4.1/GPT-5 for rewriting & summaries
- WebRTC for audio capture
- JavaScript/TypeScript
- HTML/CSS browser overlay
- Node.js + WebSockets
- Basic regex logic for filler-word & pace detection
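
The regex-based filler-word and pace detection could look roughly like the sketch below. The filler list, the words-per-minute thresholds, and the names `FILLER_PATTERN`, `SpeechStats`, and `analyzeSegment` are all assumptions made for illustration; the project's actual rules are not specified here.

```typescript
// Hypothetical filler list and pace thresholds; the project's real values are unknown.
const FILLER_PATTERN = /\b(?:um+|uh+|er+|you know|i mean|sort of|kind of)\b/gi;
const SLOW_WPM = 110;
const FAST_WPM = 170;

type Pace = "slow" | "ok" | "fast";

interface SpeechStats {
  wordCount: number;
  fillerCount: number;
  wordsPerMinute: number;
  pace: Pace;
}

// Analyzes one transcript segment given how long it took to speak.
function analyzeSegment(transcript: string, durationSeconds: number): SpeechStats {
  // Split on whitespace; fillers are still counted as spoken words.
  const words = transcript.trim().split(/\s+/).filter((w) => w.length > 0);

  // Count every non-overlapping filler match, case-insensitively.
  const fillerCount = (transcript.match(FILLER_PATTERN) ?? []).length;

  // Guard against division by zero for empty or instantaneous segments.
  const wordsPerMinute =
    durationSeconds > 0 ? Math.round((words.length / durationSeconds) * 60) : 0;

  let pace: Pace = "ok";
  if (wordsPerMinute < SLOW_WPM) pace = "slow";
  else if (wordsPerMinute > FAST_WPM) pace = "fast";

  return { wordCount: words.length, fillerCount, wordsPerMinute, pace };
}
```

For example, `analyzeSegment("Um, I think we should, uh, you know, ship it", 4)` counts 10 words and 3 fillers at 150 words per minute, which falls in the "ok" band under these assumed thresholds. A plain regex like this cannot tell a filler "kind of" from a meaningful one, so a real detector would likely need context-aware rules on top.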