A friend asked me to build him a meeting recorder. His exact words: "I just want to record meetings and ask questions about them later. Why does every tool have seventeen features I don't need?"
Fair point. Have you seen Otter.ai lately? It's like they're trying to solve world hunger through meeting transcription. Every update adds more buttons, more features, more reasons to pay more money.
The Problem
The meeting tool space has a bloat problem. What started as "record meetings, get transcripts" has evolved into enterprise platforms with AI assistants, team collaboration, project management integrations, and probably a built-in CRM.
The worst part? They force you to use their AI. Don't like how GPT-4 summarises things? Too bad. Prefer Claude's reasoning? Not an option. Want to know exactly what you're paying for AI processing? Good luck finding that information.
My friend wanted something simple: record a meeting, get a transcript, ask questions about it later. That's it. No team dashboards. No third AI assistant feature. Just meetings.
The Twist
Here's the interesting bit: BYOK. Bring Your Own Key.
Instead of paying a subscription that includes hidden AI processing fees, users plug in their own API keys. Want to use Claude? Add your Anthropic key. Prefer Gemini? Google key. OpenAI loyalist? That works too.
The benefits are surprisingly significant. Full cost transparency (you see exactly what you're paying the AI provider). Privacy control (you choose which company processes your data). No vendor lock-in (switch providers anytime). And frankly, it's usually cheaper than bundled subscriptions.
The downside: slightly higher friction to get started. But the people who care about these things are exactly the people willing to spend five minutes getting an API key.
The Stack
Frontend: React 19 with Vite. Went with Vite over Next.js because I didn't need server-side rendering and wanted faster builds. TailwindCSS and Shadcn UI for the interface. Minimal black-and-white theme because the world has enough gradient-heavy SaaS dashboards.
Backend: Firebase everything again. Firestore for meetings and user data, Firebase Auth for accounts, Cloud Storage for audio files, Cloud Functions for server-side processing. I've accepted that Firebase is my default backend now.
Transcription: Deepgram handles the speech-to-text. Supports WebM, WAV, MP3, MP4, Opus. Good speaker diarisation. Reasonably priced.
AI Analysis: Multi-provider architecture. Claude, Gemini, or OpenAI based on user preference. Unified interface so switching providers doesn't require code changes. The AI generates summaries, extracts action items, identifies key takeaways, and powers a chat interface for asking questions about meeting content.
Payments: Razorpay for subscriptions. Four tiers from free to enterprise. Usage-based limits on meeting hours and AI calls. Indian market focus (hence Razorpay over Stripe).
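For the multi-provider piece, here is a minimal sketch of what a unified interface could look like. Every name in it (AnalysisProvider, stubProvider, resolveProvider, UserSettings) is made up for illustration, and the adapters are stubs standing in for real vendor calls, so treat it as the shape of the idea and not the actual code.

```typescript
type ProviderId = "claude" | "gemini" | "openai";

interface MeetingAnalysis {
  summary: string;
  actionItems: string[];
}

// Every adapter implements the same shape, so callers never branch on vendor.
interface AnalysisProvider {
  readonly id: ProviderId;
  analyse(transcript: string, apiKey: string): Promise<MeetingAnalysis>;
}

// Stub adapter (assumption): a real one would call the vendor's API here.
function stubProvider(id: ProviderId): AnalysisProvider {
  return {
    id,
    async analyse(transcript: string, _apiKey: string): Promise<MeetingAnalysis> {
      const firstLine = transcript.split("\n")[0] ?? "";
      return { summary: `[${id}] ${firstLine}`, actionItems: [] };
    },
  };
}

const registry: Record<ProviderId, AnalysisProvider> = {
  claude: stubProvider("claude"),
  gemini: stubProvider("gemini"),
  openai: stubProvider("openai"),
};

// Hypothetical per-user settings: one preferred provider, any number of saved keys.
interface UserSettings {
  preferredProvider: ProviderId;
  apiKeys: Partial<Record<ProviderId, string>>;
}

// Picks the adapter and key for a user; fails loudly if the key is missing.
function resolveProvider(
  settings: UserSettings
): { provider: AnalysisProvider; apiKey: string } {
  const apiKey = settings.apiKeys[settings.preferredProvider];
  if (!apiKey) {
    throw new Error(`No API key saved for ${settings.preferredProvider}`);
  }
  return { provider: registry[settings.preferredProvider], apiKey };
}
```

With this shape, switching providers is a settings change: the caller runs resolveProvider(settings) and then provider.analyse(transcript, apiKey) without knowing which vendor is behind it.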
The Hard Parts
Demo Mode Friction: The BYOK model created a chicken-and-egg problem. Users couldn't try the product without setting up API keys first, but they wouldn't set up keys without seeing if the product was worth it.
Solution: demo mode with system-provided API keys. Users can try everything without configuration. Keys are handled server-side (never exposed to the frontend). Clear path from trial to paid with their own keys.
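Roughly, the key selection could look like the sketch below. It assumes the logic runs server-side only (in a Cloud Function, say), and the names Account, chooseKey, and toClientView are hypothetical. The point is that the browser learns which kind of key is in use, never the key itself.

```typescript
type Provider = "claude" | "gemini" | "openai";

// Hypothetical account shape: a demo flag plus whatever keys the user has saved.
interface Account {
  demoMode: boolean;
  ownKeys: Partial<Record<Provider, string>>;
}

interface KeyChoice {
  key: string;
  source: "user" | "system";
}

// Server-side only: prefer the user's own key, fall back to the system key
// while the account is in demo mode, otherwise ask for a key.
function chooseKey(
  account: Account,
  provider: Provider,
  systemKeys: Record<Provider, string>
): KeyChoice {
  const own = account.ownKeys[provider];
  if (own) return { key: own, source: "user" };
  if (account.demoMode) return { key: systemKeys[provider], source: "system" };
  throw new Error(`Add a ${provider} API key to continue`);
}

// What the client is allowed to see: the source, never the key.
function toClientView(choice: KeyChoice): { source: KeyChoice["source"] } {
  return { source: choice.source };
}
```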
UI Transparency Issues: The original UI had modals with transparent backgrounds. Looked nice in isolation, became unreadable over actual content. Every dialog, sheet, and tooltip needed fixing.
The fix was tedious but straightforward: solid backgrounds everywhere. Less visually interesting, but actually usable. Design is about function, not just aesthetics.
Transcript vs Analysis Timing: Initially, users had to wait for AI analysis before seeing the transcript. Terrible UX. Transcription finishes in a few seconds, but analysis can take 10-15 seconds depending on meeting length.
Separated the two processes. Transcript appears immediately after upload. AI analysis runs in the background with progress indicators. Users can review the raw transcript while waiting.
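One way to model that split is to give the transcript and the analysis their own status on the meeting record, so the UI can render one while the other is still running. This is a sketch under that assumption. MeetingRecord, MeetingEvent, and applyEvent are illustrative names, not the app's actual schema.

```typescript
type StageStatus = "pending" | "running" | "done" | "failed";

// Each stage carries its own status instead of one shared "processing" flag.
interface MeetingRecord {
  transcript: { status: StageStatus; text?: string };
  analysis: { status: StageStatus; summary?: string };
}

type MeetingEvent =
  | { type: "transcriptReady"; text: string }
  | { type: "analysisStarted" }
  | { type: "analysisReady"; summary: string }
  | { type: "analysisFailed" };

// State right after upload: transcription in flight, analysis not started.
const newMeeting = (): MeetingRecord => ({
  transcript: { status: "running" },
  analysis: { status: "pending" },
});

// Pure update step: each event touches only its own stage.
function applyEvent(record: MeetingRecord, event: MeetingEvent): MeetingRecord {
  switch (event.type) {
    case "transcriptReady":
      return { ...record, transcript: { status: "done", text: event.text } };
    case "analysisStarted":
      return { ...record, analysis: { status: "running" } };
    case "analysisReady":
      return { ...record, analysis: { status: "done", summary: event.summary } };
    case "analysisFailed":
      return { ...record, analysis: { status: "failed" } };
  }
}

// The UI only needs the transcript stage to be done before rendering it.
function canShowTranscript(record: MeetingRecord): boolean {
  return record.transcript.status === "done";
}
```

A failed analysis leaves the transcript untouched, which is the behaviour you want: the user keeps the raw text and can retry the AI step.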
Subscription Enforcement: Building a tiered system with real-time usage tracking is more complex than it sounds. Need to track meeting hours, count AI calls, handle monthly resets, show warnings when approaching limits, and block usage gracefully when exceeded.
Razorpay webhooks handle subscription updates. Firestore documents track usage in real-time. Frontend widgets show current status. The unglamorous plumbing that makes paid software actually work.
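The limit check itself can be a small pure function. The sketch below uses the free tier's numbers (2 hours, 5 AI calls) and the unlimited enterprise tier. The middle tiers are left out, and the 80% warning threshold, the UTC calendar-month reset, and all the names are assumptions made for illustration.

```typescript
interface TierLimits {
  meetingMinutes: number;
  aiCalls: number;
}

// Only the two tiers with known limits; the paid tiers in between are omitted.
const LIMITS: Record<"free" | "enterprise", TierLimits> = {
  free: { meetingMinutes: 120, aiCalls: 5 },
  enterprise: { meetingMinutes: Infinity, aiCalls: Infinity },
};

interface Usage {
  period: string; // "YYYY-MM" in UTC
  meetingMinutes: number;
  aiCalls: number;
}

const periodOf = (date: Date): string => date.toISOString().slice(0, 7);

// Monthly reset: start a fresh counter once the calendar month has rolled over.
function currentUsage(stored: Usage, now: Date): Usage {
  const period = periodOf(now);
  return stored.period === period
    ? stored
    : { period, meetingMinutes: 0, aiCalls: 0 };
}

type Verdict = "ok" | "warn" | "blocked";

const WARN_AT = 0.8; // assumed threshold for the "approaching limit" warning

// Decides whether one more AI call is allowed, and whether to warn about it.
function checkAiCall(tier: keyof typeof LIMITS, stored: Usage, now: Date): Verdict {
  const usage = currentUsage(stored, now);
  const limit = LIMITS[tier].aiCalls;
  if (usage.aiCalls >= limit) return "blocked";
  return (usage.aiCalls + 1) / limit >= WARN_AT ? "warn" : "ok";
}
```

In a real deployment the check and the counter increment would need to happen atomically on the server, otherwise two simultaneous requests could both slip under the limit.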
Current Status
The app is called Mint, and it's live and accepting users. The core workflow is solid: record in browser or upload existing audio, get a transcript with speaker identification, run AI analysis, chat with your meeting history.
Free tier gets 2 hours of meetings per month and 5 AI calls. Enough to try things out. Paid tiers scale up from there. Enterprise is unlimited everything.
GitHub Actions handles deployment automatically. Push to main, everything builds and deploys. Reduced deployment friction from "remember all the Firebase commands" to "commit and forget."
My friend uses it for all his meetings now. Mission accomplished, I suppose. Though he keeps asking for new features, which defeats the entire premise of building something minimal. Classic feature creep, but from the person who specifically asked for no features.
The irony is not lost on me.