Om Sharma.

Engineer building real-time AI systems. Voice, LLM orchestration, the infrastructure that makes models feel alive inside two seconds.

FIG · TYPICAL CONVERSATIONAL PIPELINE: AUDIO → STT → LLM → TTS

“The seconds live in the gaps where you wait for certainty.”

§ 1 — Introduction

I'm an engineer at Jobtwine, where our team builds real-time voice AI for enterprise interview screening — the kind of system where latency, reliability, and conversation quality decide whether the product works. Earlier I worked on RAG pipelines and LLM fine-tuning at Darwix AI and VDOIT Technologies.

On my own time I build small tools that scratch real itches — JiraGenie, a natural-language CLI for Jira with a self-healing workflow engine, is the current one. I write about what I learn shipping models into production: the gaps where the model second-guesses itself, the network hops that quietly cost you a turn.

§ 2 — Experience

Where I've shipped.

Jobtwine

Jul 2025 — Present

Associate Software Engineer · Bangalore

→ Architected a real-time AI interviewer with a streaming STT → LLM → TTS pipeline at sub-2-second p50 latency, used by enterprise clients including Meesho, Brillio, and Deutsche Bank.
→ Implemented streaming orchestration over Twilio Programmable Voice (inbound and outbound calls) bridged to LiveKit (WebRTC / SIP) with token-level TTS streaming and endpoint prediction, reducing per-turn latency by ~3 seconds.
→ Built queue-based dispatch with fallback across multiple LLM providers, supporting peak loads of 300+ interviews per day with reliability under provider failures.
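
A minimal sketch of what queue-based dispatch with provider fallback can look like, not the production code. The provider names, the `(name, callable)` pairing, and the `drain` helper are all hypothetical; a real system would add timeouts, retries with backoff, and async workers.

```python
import queue
from typing import Callable, List, Sequence, Tuple

# Hypothetical shape: a provider is a (name, callable) pair where the
# callable takes a prompt and returns text, or raises on failure.
Provider = Tuple[str, Callable[[str], str]]


class AllProvidersFailed(Exception):
    """Raised when every provider in the chain has errored."""


def call_with_fallback(prompt: str, providers: Sequence[Provider]) -> Tuple[str, str]:
    """Try providers in priority order; return (provider_name, reply)."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # any provider error triggers the next fallback
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


def drain(jobs: "queue.Queue[str]", providers: Sequence[Provider]) -> List[Tuple[str, str]]:
    """Pull queued prompts one at a time and dispatch each with fallback."""
    results = []
    while True:
        try:
            prompt = jobs.get_nowait()
        except queue.Empty:
            return results
        results.append(call_with_fallback(prompt, providers))
```

The queue decouples call intake from model calls, so a slow or failing provider delays one job rather than blocking new interviews from being accepted.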

Darwix AI

May 2025 — Jul 2025

Software Engineer, AI Systems · Gurugram

→ Built a scalable document ingestion engine using Pinecone and PostgreSQL with automated document-type detection, improving chunking accuracy by 40%.
→ Developed a real-time sales call analysis platform with speaker diarization, live transcription, and performance scoring for coaching workflows.
→ Engineered a cross-platform Windows / Linux desktop client streaming dual-channel call audio over WebSocket pipelines.

VDOIT Technologies

Jan 2024 — Apr 2025

Software Engineer, AI / ML · Gurugram

→ Built backend systems supporting 100K+ concurrent users using Django and multithreaded architecture.
→ Designed retrieval-augmented generation (RAG) pipelines with vector databases for contextual document search.
→ Fine-tuned large language models using LoRA and QLoRA techniques for domain-specific use cases. Awarded STAR Performer.

§ 3 — Personal projects

What I've built.

Now · June 2026

Speculative decoding, in progress.

Building a toy implementation of the Leviathan et al. paper — draft model + target model + measured speedup. Companion blog post in July.
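
For readers new to the idea, here is a rough sketch of one speculative sampling step in the style of Leviathan et al., not the project's actual code. It assumes toy "models" that are plain Python callables returning a next-token probability list over a tiny vocabulary; `speculative_step`, `gamma`, and the type aliases are names chosen for illustration.

```python
import random
from typing import Callable, List

Dist = List[float]                    # probability per token id
Model = Callable[[List[int]], Dist]   # context -> next-token distribution


def sample(weights: Dist, rng: random.Random) -> int:
    """Draw a token id proportionally to (possibly unnormalized) weights."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def speculative_step(prefix: List[int], draft: Model, target: Model,
                     gamma: int, rng: random.Random) -> List[int]:
    """Return between 1 and gamma + 1 new tokens distributed as if sampled from target."""
    # 1. The cheap draft model proposes gamma tokens autoregressively.
    proposed, q_dists, ctx = [], [], list(prefix)
    for _ in range(gamma):
        q = draft(ctx)
        tok = sample(q, rng)
        proposed.append(tok)
        q_dists.append(q)
        ctx.append(tok)

    # 2. The target model scores every proposed prefix. A real implementation
    #    does this in one batched forward pass, which is where the speedup comes from.
    p_dists = [target(list(prefix) + proposed[:i]) for i in range(gamma + 1)]

    # 3. Accept each proposal with probability min(1, p(x) / q(x)).
    out: List[int] = []
    for i, tok in enumerate(proposed):
        p, q = p_dists[i], q_dists[i]
        if rng.random() < min(1.0, p[tok] / q[tok]):
            out.append(tok)
            continue
        # On the first rejection, resample from the residual max(0, p - q) and stop.
        residual = [max(0.0, pi - qi) for pi, qi in zip(p, q)]
        out.append(sample(residual, rng))
        return out

    # 4. Every proposal was accepted: take one bonus token from the target.
    out.append(sample(p_dists[gamma], rng))
    return out
```

When the draft agrees with the target, each step yields `gamma + 1` tokens for roughly one target pass; when it disagrees, the residual resampling keeps the output distribution equal to the target's.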

2026 · Open source

JiraGenie

Natural-language CLI for Jira. LLM intent parser. Self-healing workflow transition engine — walks workflow graphs with a loop guard, auto-repairs failed transitions.
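
One way a loop-guarded walk over a workflow graph can be sketched is a breadth-first search with a visited set. This is an assumption about the general technique, not JiraGenie's actual implementation; the `workflow` dictionary, its status names, and `find_transition_path` are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional


def find_transition_path(workflow: Dict[str, List[str]],
                         start: str, goal: str) -> Optional[List[str]]:
    """Return the shortest list of statuses to step through, or None if unreachable."""
    if start == goal:
        return []
    visited = {start}  # loop guard: never revisit a status, so cycles terminate
    frontier = deque([(start, [])])
    while frontier:
        status, path = frontier.popleft()
        for nxt in workflow.get(status, []):
            if nxt in visited:
                continue
            if nxt == goal:
                return path + [nxt]
            visited.add(nxt)
            frontier.append((nxt, path + [nxt]))
    return None


# Hypothetical workflow with back-edges that would loop without the guard.
workflow = {
    "To Do": ["In Progress"],
    "In Progress": ["In Review", "To Do"],
    "In Review": ["Done", "In Progress"],
}
path = find_transition_path(workflow, "To Do", "Done")
```

If a direct transition to the requested status fails, a tool could fall back to stepping through a path like this one hop at a time.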

2024 · PyTorch