How fal built a $1.5B AI inference platform for generative media
Executive overview
Most AI infrastructure startups tried to serve every model type. fal stayed narrow: image and video inference only, optimised from day one for diffusion models. That bet compounded into a $1.5B valuation and $90M ARR in under three years.
The co-founders identified two early signals — rapid user growth after foundation model releases, and a latency problem that was killing developer creativity — then built an inference engine around both. Their rule: go specific first, general later.
Niche markets that grow fast are the only niche markets worth picking.
Finding the right market
- Language models got all the hype; fal saw the same trajectory in image models
- After foundation model releases, user growth jumped from baseline to potentially 100x–1,000,000x
- Key insight: off-the-shelf models removed the training barrier, multiplying potential builders overnight
- Image and video inference was small but growing at extraordinary speed
- Staying specific let them understand customer problems in depth and compound their technical edge
Building the inference platform
- Core product: host generative media models (image, video, 3D, audio) as APIs for developers
- Built a proprietary inference engine optimised specifically for diffusion models, 2-3x faster than standard inference setups
- Latency kills creativity — reducing generation time is a direct product value, not just infrastructure
- Day-zero model releases: new models go live immediately, keeping developers on the platform
- Rigorous model vetting: every model is benchmarked against its advertised claims before deployment; cherry-picked demos are discarded
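The vetting step above can be pictured as a small benchmark harness. This is a minimal sketch, not fal's actual tooling: the function names `measure_latencies` and `meets_claim`, the 10% tolerance, and the use of latency as the benchmarked claim are all assumptions made for illustration. It times several runs of a model call and compares the median against the advertised figure, so a single unusually fast run cannot carry the result.

```python
import statistics
import time
from typing import Callable, List


def measure_latencies(generate: Callable[[], object], runs: int = 5, warmup: int = 1) -> List[float]:
    """Time repeated calls to `generate` (a hypothetical model call), skipping warm-up runs."""
    for _ in range(warmup):
        generate()  # warm-up calls are executed but not timed
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        generate()
        latencies.append(time.perf_counter() - start)
    return latencies


def meets_claim(latencies: List[float], advertised_seconds: float, tolerance: float = 0.10) -> bool:
    """Return True if the median measured latency is within `tolerance` of the advertised value."""
    # The median is used so one cherry-picked fast run cannot satisfy the check.
    return statistics.median(latencies) <= advertised_seconds * (1 + tolerance)


if __name__ == "__main__":
    def fake_model() -> None:
        # Stand-in for a real generation call (assumption for this sketch).
        time.sleep(0.01)

    samples = measure_latencies(fake_model, runs=3)
    print(meets_claim(samples, advertised_seconds=0.05))
```

A real vetting pipeline would presumably also check output quality against the published samples, which this sketch does not attempt.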
Staying focused when revenue stalled
- Revenue plateaued for a couple of months early on
- Temptation: expand into LLM inference (larger, established market)
- Decision: hold the line on image and video — harder to go from general back to specific than the reverse
- Outcome: technical advantage compounded; they were ahead when demand exploded
Speed as a core operating principle
- Take "move fast" and multiply it by 100: wrong decisions can be revisited, slow decisions can't
- Remained a team of ~6 for nearly two years before product-market fit
- Small teams make faster decisions and run tighter experiment loops
- Wrong bets are recoverable; moving slowly is not
Monetisation from day one
- AI buyers pay immediately — unlike previous internet eras that required building a user base first
- MVP must be good enough to generate revenue on day one
- Revenue is the clearest signal a product idea is working
- Customers include Adobe, Canva, Shopify, Perplexity
What's next: the video moment
- The "ChatGPT moment" for video has not yet arrived; models like Veo 3 are close but not there
- AI-generated video already accounts for roughly a third of social feeds, a share that is growing gradually rather than in a single breakout moment
- Next inflection: real-time editable video with interactive characters
- fal's goal: be the infrastructure layer for all generative media builders when that moment hits