AI · 2025 · Shipped · Client work

AI Voice Charting for a Dental SaaS

Hands-free periodontal charting — the clinician speaks, and the chart fills itself in real time.

Role: Feature Engineer (contract)

A voice-driven periodontal charting agent I built for a dental-practice AI product (client work, shown anonymized). The clinician charts entirely by voice ("tooth 14 buccal three four five", "mobility two on thirty", "furcation class two on 19", "scratch that") and the chart updates instantly, while small talk is recognized and ignored. It runs on a single WebRTC connection to the OpenAI Realtime API, which collapses speech-to-text, intent parsing, function calling and spoken confirmation into one loop at roughly 300–500 ms voice-to-voice latency. A function-calling tool layer drives an in-memory store and an instantly updating UI, and I shipped the feature to production on AWS with a full CI/CD pipeline.
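To make the tool layer concrete, here is a minimal sketch of how a function call from the model might be applied to an in-memory chart, with the returned string doubling as the terse spoken echo. It is not the production code: the tool names (`set_triplet`, `set_mobility`, `undo`), the `ChartStore` class, the snapshot-based undo and the 0–3 mobility range are all assumptions for illustration, and the sketch assumes the call arguments have already been parsed from JSON.

```typescript
// Illustrative sketch only: every identifier here is hypothetical.
type Surface = "buccal" | "lingual";
type Depths = [number, number, number];

type ToolCall =
  | { name: "set_triplet"; args: { tooth: number; surface: Surface; depths: Depths } }
  | { name: "set_mobility"; args: { tooth: number; grade: number } }
  | { name: "undo"; args: Record<string, never> };

interface ChartState {
  depths: Map<string, Depths>;   // key: "<tooth>:<surface>"
  mobility: Map<number, number>; // key: Universal tooth number (1-32)
}

const isTooth = (n: number): boolean => Number.isInteger(n) && n >= 1 && n <= 32;

class ChartStore {
  private state: ChartState = { depths: new Map(), mobility: new Map() };
  private history: ChartState[] = [];

  // Copy both maps so that undo can restore the previous chart wholesale.
  private snapshot(): void {
    this.history.push({
      depths: new Map(this.state.depths),
      mobility: new Map(this.state.mobility),
    });
  }

  getDepths(tooth: number, surface: Surface): Depths | undefined {
    return this.state.depths.get(`${tooth}:${surface}`);
  }

  getMobility(tooth: number): number | undefined {
    return this.state.mobility.get(tooth);
  }

  // Apply one tool call and return the short text to hand back as the tool result.
  apply(call: ToolCall): string {
    switch (call.name) {
      case "set_triplet": {
        const { tooth, surface, depths } = call.args;
        if (!isTooth(tooth)) return `invalid tooth ${tooth}`;
        this.snapshot();
        this.state.depths.set(`${tooth}:${surface}`, depths);
        return `${tooth} ${surface} ${depths.join(" ")}`;
      }
      case "set_mobility": {
        const { tooth, grade } = call.args;
        // Assumed grading range of 0-3.
        if (!isTooth(tooth) || grade < 0 || grade > 3) return "invalid mobility";
        this.snapshot();
        this.state.mobility.set(tooth, grade);
        return `mobility ${grade} on ${tooth}`;
      }
      case "undo": {
        const previous = this.history.pop();
        if (previous === undefined) return "nothing to undo";
        this.state = previous;
        return "undone";
      }
    }
  }
}
```

Keeping every mutation behind a single `apply` entry point is one way to make "scratch that" cheap: each successful call pushes a snapshot, so undo is a pop rather than a per-tool inverse operation.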

Highlights

  • Architected a real-time voice agent on the OpenAI Realtime API over WebRTC: one connection handling speech-to-text, intent parsing, function calling and spoken confirmation at ~300–500 ms voice-to-voice latency.
  • Designed a rich charting grammar backed by 11 function-calling tools — single-site, triplet and bulk updates, BOP/mobility/furcation flags, and a 'sweep mode' that fills an entire quadrant from rapid-fire numbers.
  • Tuned server-side voice-activity detection so a natural mid-phrase pause ("tooth 14 buccal… 3") still reads as one command, with terse verbal echo and one-word 'undo' for eyes-free correction.
  • Modeled teeth on canonical Universal numbering (FDI handled purely in the UI) and added a keyboard fallback that parses the same grammar.
  • Shipped to production on AWS — Dockerized, with ECR images, GitHub Actions CI/CD, nginx and automated SSL across feature environments.
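As a rough illustration of the Universal-numbering model and the keyboard fallback, the sketch below converts a permanent-tooth Universal number to FDI for display and parses a typed triplet command using the same word order as the spoken grammar. The function names, the single-letter surface shortcuts and the 0–15 depth limit are assumptions rather than details of the shipped feature, and only the triplet form is covered.

```typescript
// Illustrative sketch only: names, shortcuts and limits are hypothetical.
type Surface = "buccal" | "lingual";

interface TripletCommand {
  tooth: number;     // Universal number, 1-32
  surface: Surface;
  depths: number[];  // exactly three probing depths in mm
}

// Universal (1-32, permanent teeth) to FDI two-digit notation, for display only.
function universalToFdi(n: number): number {
  if (!Number.isInteger(n) || n < 1 || n > 32) {
    throw new RangeError(`not a Universal tooth number: ${n}`);
  }
  if (n <= 8) return 10 + (9 - n);    // upper right: 1..8   -> 18..11
  if (n <= 16) return 20 + (n - 8);   // upper left:  9..16  -> 21..28
  if (n <= 24) return 30 + (25 - n);  // lower left:  17..24 -> 38..31
  return 40 + (n - 24);               // lower right: 25..32 -> 41..48
}

const NUMBER_WORDS = new Map<string, number>([
  ["zero", 0], ["one", 1], ["two", 2], ["three", 3], ["four", 4],
  ["five", 5], ["six", 6], ["seven", 7], ["eight", 8], ["nine", 9],
]);

const SURFACES = new Map<string, Surface>([
  ["buccal", "buccal"], ["b", "buccal"],
  ["lingual", "lingual"], ["l", "lingual"],
]);

// Accept digits ("14") or a small set of number words ("three").
function toNumber(token: string): number | null {
  if (/^\d+$/.test(token)) return Number(token);
  return NUMBER_WORDS.get(token) ?? null;
}

// Parse "tooth 14 buccal three four five" or "14 b 3 4 5"; return null for anything else.
function parseTriplet(input: string): TripletCommand | null {
  const tokens = input.toLowerCase().trim().split(/\s+/).filter((t) => t !== "tooth");
  if (tokens.length !== 5) return null;

  const tooth = toNumber(tokens[0] ?? "");
  const surface = SURFACES.get(tokens[1] ?? "");
  if (tooth === null || tooth < 1 || tooth > 32 || surface === undefined) return null;

  const depths: number[] = [];
  for (const token of tokens.slice(2)) {
    const depth = toNumber(token);
    if (depth === null || depth > 15) return null; // assumed upper bound
    depths.push(depth);
  }
  return { tooth, surface, depths };
}
```

For example, `parseTriplet("14 b 3 4 5")` yields a command for Universal tooth 14, which `universalToFdi` would label as 26 in an FDI view. Storing only Universal numbers and converting at render time keeps the store and the tool arguments in one numbering system.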

Tech

Next.js 14 · OpenAI Realtime API · WebRTC · Function Calling · Zustand · TypeScript · AWS · Docker · GitHub Actions