How to earn money and build a business with voice AI
Executive overview
Most businesses still don't know voice agents exist — let alone that they can deploy one without a developer. ElevenLabs has built a full voice platform covering text-to-speech, speech-to-text, agent orchestration, and a voice marketplace where anyone can earn passive income by sharing their cloned voice.
The opportunity right now is the gap between what's technically possible and what local businesses have actually deployed.
Anyone can earn from their voice or sell voice agent deployments to small businesses — no coding required.
Voice agents in business: what's working now
- Voice agents now handle the full customer journey: support, inbound/outbound sales, onboarding, and appointment booking
- ElevenLabs uses its own agents to qualify leads and convert business-tier customers without human handoff
- Agents can switch languages mid-call and speak in the original creator's cloned voice
- Omnichannel follow-up — links, email checkout — handles the purchase step when agents can't close directly
- Deployment cost for a small business starts at a few hundred dollars per month
Setting up a voice agent on ElevenLabs
- The platform abstracts the technical stack: speech-to-text, LLM, text-to-speech, and latency handling are pre-built
- Bring your own business logic: knowledge base, question flows, and trigger conditions
- Pre-built workflows handle common actions like calendar booking or routing to a human
- Integrate with Twilio or any existing telephony system; port your current phone number
- Embed an agent directly on a website to guide visitors through a checkout flow
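
The business logic listed above (knowledge base, question flows, trigger conditions) can be pictured as plain data plus a routing rule. The sketch below is a hypothetical illustration, not the ElevenLabs API: `AgentConfig`, `Trigger`, `route`, and the action labels are all names made up for this example, and real trigger matching would be handled by the platform rather than by keyword search.

```python
from dataclasses import dataclass, field


@dataclass
class Trigger:
    """Hypothetical trigger: if any keyword appears in the caller's words, run the action."""
    keywords: tuple[str, ...]
    action: str  # made-up labels such as "book_calendar" or "handoff_human"


@dataclass
class AgentConfig:
    """Hypothetical container for the three pieces of business logic."""
    knowledge_base: dict[str, str]  # topic -> answer text
    question_flow: list[str]        # questions the agent asks, in order
    triggers: list[Trigger] = field(default_factory=list)


def route(config: AgentConfig, utterance: str) -> str:
    """Return the first matching trigger action, else keep the conversation going."""
    text = utterance.lower()
    for trigger in config.triggers:
        if any(keyword in text for keyword in trigger.keywords):
            return trigger.action
    return "continue_conversation"


# Example configuration for an imaginary dental office.
dental_office = AgentConfig(
    knowledge_base={"hours": "Open Monday to Friday, 9am to 5pm."},
    question_flow=["What is your name?", "What do you need help with?"],
    triggers=[
        Trigger(keywords=("appointment", "book"), action="book_calendar"),
        Trigger(keywords=("human", "receptionist"), action="handoff_human"),
    ],
)

print(route(dental_office, "I'd like to book a cleaning"))  # book_calendar
print(route(dental_office, "Can I talk to a human?"))       # handoff_human
print(route(dental_office, "What are your hours?"))         # continue_conversation
```

Keeping the configuration as data separates what the business knows and wants from how the speech stack runs, which is the split the platform description implies.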
The voice marketplace: earning passive income
- Record 30+ minutes of your voice to create a high-fidelity replica that speaks in 30–70 languages
- List your voice on the ElevenLabs marketplace and earn each time it is used
- ElevenLabs has paid out close to $10 million across roughly 10,000 listed voices
- Average earnings: a few hundred dollars per month with some community promotion effort
- Unique voices with distinctive prosody or accent perform significantly better than generic ones
- Promote your voice on Discord, Reddit, or relevant forums to break through early
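
As a rough sanity check on the payout figures above, the total can be divided by the number of listed voices. This uses only the two rounded numbers quoted in the list. The result is a cumulative mean per listed voice, not a monthly figure, and it says nothing about how earnings are distributed between voices.

```python
# Rounded figures quoted above ("close to $10 million", "roughly 10,000 listed voices").
total_paid_out_usd = 10_000_000
listed_voices = 10_000

# Cumulative mean payout per listed voice since the marketplace began.
mean_cumulative_payout = total_paid_out_usd / listed_voices
print(f"${mean_cumulative_payout:,.0f} per listed voice, cumulative")
```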
Voice cloning quality and current limitations
- A cloned voice is the average of the full recording sample — scene-level intonation shifts won't match exactly
- For patch inserts, use a short clip from the exact scene rather than the full video sample
- Background noise mixed into source audio degrades insert quality; keep source audio clean
- ElevenLabs is building pre-conditioning from surrounding video context to improve insert accuracy (not yet released)
The business opportunity: deploying voice agents for small businesses
- The gap between available voice agent technology and actual SMB deployment is large and largely untapped
- Local doctor's offices, dentists, and mechanics lose bookings because no one answers the phone
- Each deployment can be worth thousands of dollars per month per client
- A few clients put you in the range of a sustainable income; no coding required
- Start with English-speaking markets, then localise — most markets have had little outreach
- Large AI companies are focused on enterprise; the SMB segment is open
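
The income claim above comes down to simple arithmetic: clients multiplied by the monthly fee, minus what each deployment costs to run. In the sketch below the client count, fee, and cost are hypothetical placeholders. Treating the "few hundred dollars per month" deployment cost as an expense the deployer covers is also an assumption.

```python
def monthly_margin(clients: int, fee_per_client: float, cost_per_client: float) -> float:
    """Monthly income left after paying the running cost of each deployment."""
    return clients * (fee_per_client - cost_per_client)


# Hypothetical numbers: 3 clients paying $2,000 each, with $300 running cost per deployment.
print(monthly_margin(3, 2_000, 300))  # 5100
```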
Impersonation risks and the three-layer defence model
- Assume that any voice can be cloned — safeguard design must start from that premise
- Layer 1 — verify the device: encode the sending device so the receiver can confirm it is the authenticated hardware
- Layer 2 — watermark authenticated AI: embed provenance metadata at generation time so trusted AI content can be identified
- Layer 3 — default to AI: treat any content that passes neither layer as AI by default; require positive proof of human or permissioned AI origin
- ElevenLabs moderates account-level misuse and traces generated content internally
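
The three layers above can be read as an ordered decision procedure. The sketch below is an illustration under stated assumptions, not ElevenLabs' implementation: the names `AudioMessage`, `sign_on_device`, and `classify` are hypothetical, a shared-key HMAC stands in for whatever device attestation a real system would use (more likely asymmetric signatures backed by hardware), and the watermark is reduced to a simple issuer label.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Optional

# Hypothetical registries; a real system would provision and store these securely.
DEVICE_KEY = b"demo-device-key"
TRUSTED_AI_ISSUERS = {"trusted-voice-provider"}


@dataclass
class AudioMessage:
    payload: bytes
    device_signature: Optional[bytes] = None  # Layer 1 evidence, if present
    watermark_issuer: Optional[str] = None    # Layer 2 evidence, if present


def sign_on_device(payload: bytes, key: bytes = DEVICE_KEY) -> bytes:
    """Stand-in for device attestation: an HMAC-SHA256 tag over the audio payload."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def classify(msg: AudioMessage) -> str:
    # Layer 1: does the signature match the authenticated device's key?
    if msg.device_signature is not None:
        expected = sign_on_device(msg.payload)
        if hmac.compare_digest(expected, msg.device_signature):
            return "verified_device"
    # Layer 2: does the content carry provenance from a trusted AI issuer?
    if msg.watermark_issuer in TRUSTED_AI_ISSUERS:
        return "permissioned_ai"
    # Layer 3: nothing proved otherwise, so treat the content as AI.
    return "assume_ai"


audio = b"hello, this is your bank"
print(classify(AudioMessage(audio, device_signature=sign_on_device(audio))))  # verified_device
print(classify(AudioMessage(audio, watermark_issuer="trusted-voice-provider")))  # permissioned_ai
print(classify(AudioMessage(audio, device_signature=b"forged")))  # assume_ai
print(classify(AudioMessage(audio)))  # assume_ai
```

The ordering matters: the function only returns a trusted label on positive proof, so a missing or forged signal falls through to the "assume AI" default rather than being given the benefit of the doubt.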
The future of voice interfaces
- Personalised voice selection will let callers choose their preferred voice for any service
- Businesses will serve different voices by demographic: slower and calmer for older callers, faster for younger
- Authenticated personal voice agents will act on your behalf (booking restaurants, following up on appointments) with permissioned access to your data
- Language learning shifts from necessity to hobby as real-time translation removes the communication barrier
- The "default to AI" mindset will replace today's "maybe this is AI" assumption as AI-generated content becomes ubiquitous
Founder advice: how ElevenLabs was built
- Start with a problem you know personally — ElevenLabs began with the poor dubbing experience in Polish media
- Build a prototype while talking to customers, not after
- Validate willingness to pay before investing months of engineering; ElevenLabs pivoted from dubbing to voice repair based on user feedback
- The most urgent user problem is often a sub-component of the one you think you are solving
- Pick co-founders and early team members as carefully as you would make any other major decision; culture is set by the first few hires
Top AI tools recommended
- ElevenLabs — voice generation, agents, and the voice marketplace
- Black Forest Labs — image generation with strong realism
- Anthropic Claude — coding assistance, even for non-engineers
- Lovable / v0 / Replit — no-code prototyping for go-to-market and demos