Windsurf CEO: how a GPU startup pivoted twice to build an AI coding tool

Executive overview

Most AI developer tools iterated on GitHub Copilot. Windsurf's founders scrapped their GPU infrastructure business in a weekend, bet on agentic coding before the models could support it, and shipped a forked VS Code IDE in under three months.

The core insight is that every competitive advantage is a depreciating asset: the only durable moat is to keep generating new insights and executing on them, so that the gains compound.

Startups die slowly when they stop generating and testing new hypotheses, not when individual bets fail.

The pivot: from GPU virtualisation to Codeium

Company started in 2021 as Exafunction, building VMware-style GPU virtualisation for deep learning workloads
By mid-2022: managing 10,000+ GPUs, $2M+ revenue, 8 people, Series A raised
Saw that transformer models would commoditise custom ML pipelines — eliminating the rationale for their product
Decision made in a single weekend; entire team told Monday; coding on Codeium started the same day
Key conditions enabling the pivot: lean team, cash-flow positive, $28M raised, full team conviction on the new direction
Shipped a VS Code extension within two months and posted on Hacker News

Building a better autocomplete: early technical edges

First version used a free open-source model — materially worse than Copilot, but free
Retained inference speed advantage from prior GPU runtime infrastructure
Trained their own fill-in-the-middle model within months — outperformed Copilot on that specific capability by early 2023
Expanded to all major IDEs (JetBrains, Eclipse, Vim) early, driven by enterprise need — JP Morgan, Dell were early customers
Shared cross-IDE infrastructure meant low marginal cost to add each new editor
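
The fill-in-the-middle capability mentioned above can be illustrated with a minimal sketch. The summary does not describe the prompt format Codeium's model used, so the sentinel strings, the `FimExample` type, and the helper names below are all assumptions made for illustration. The prefix-suffix-middle layout is one common convention: the model sees the code on both sides of the cursor and then generates the missing span.

```python
from dataclasses import dataclass

# Hypothetical sentinel strings; real models define their own special tokens.
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"


@dataclass
class FimExample:
    prefix: str  # code before the cursor
    middle: str  # code the model is expected to produce
    suffix: str  # code after the cursor


def split_at_span(source: str, start: int, end: int) -> FimExample:
    """Cut source[start:end] out as the span to be filled in."""
    if not 0 <= start <= end <= len(source):
        raise ValueError("span must lie inside the source text")
    return FimExample(source[:start], source[start:end], source[end:])


def build_prompt(example: FimExample) -> str:
    """Prefix-suffix-middle layout: both sides first, then the slot to fill."""
    return f"{FIM_PREFIX}{example.prefix}{FIM_SUFFIX}{example.suffix}{FIM_MIDDLE}"


source = "def add(a, b):\n    return a + b\n"
start = source.index("return")
example = split_at_span(source, start, start + len("return a + b"))
prompt = build_prompt(example)
```

Plain left-to-right completion only sees `example.prefix`. The point of the fill-in-the-middle layout is that the suffix is also visible, which matters when a developer edits in the middle of an existing file.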

Why they built their own IDE (Windsurf)

By mid-2024 enterprise revenue was well over eight figures from the Codeium extension product
VS Code's ceiling limited the agentic UX they wanted to ship
Forked VS Code; shipped across all operating systems in under three months with an engineering team of fewer than 25
Core bet: agents, not chat plus autocomplete, were the right paradigm; by the company's account, Windsurf was the first agentic editor
Deliberately avoided over-configuration; invested instead in deep code-base understanding and fast in-place edits
Maintained a unified timeline of developer and agent actions so the AI always has current context
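
The unified timeline idea can be sketched as a single append-only log that both the developer and the agent write to. This is an illustrative data structure only; the `Event` and `Timeline` names, the fields, and the rendering format are assumptions, not a description of Windsurf's internals.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Event:
    seq: int     # position in the shared timeline
    actor: str   # "developer" or "agent"
    kind: str    # e.g. "edit", "run_command", "open_file"
    detail: str  # short human-readable description


@dataclass
class Timeline:
    events: List[Event] = field(default_factory=list)

    def record(self, actor: str, kind: str, detail: str) -> Event:
        """Append one action; developer and agent share the same sequence."""
        if actor not in ("developer", "agent"):
            raise ValueError(f"unknown actor: {actor}")
        event = Event(len(self.events), actor, kind, detail)
        self.events.append(event)
        return event

    def recent_context(self, limit: int) -> str:
        """Render the last `limit` events, oldest first, for use in a prompt."""
        if limit <= 0:
            return ""
        tail = self.events[-limit:]
        return "\n".join(f"[{e.seq}] {e.actor} {e.kind}: {e.detail}" for e in tail)


timeline = Timeline()
timeline.record("developer", "edit", "renamed fetch_user to load_user in api.py")
timeline.record("agent", "edit", "updated call sites in views.py")
timeline.record("developer", "run_command", "pytest tests/test_views.py")
context = timeline.recent_context(2)
```

Keeping one ordered log, rather than separate histories per actor, means the agent's next step can take into account what the developer did after its last edit.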

Context retrieval: going beyond RAG

Rejected pure vector-database RAG as insufficient for code — precision and recall need to be very high
Built a multi-signal retrieval system: keyword search, vector search, abstract syntax tree parsing, and GPU-powered real-time reranking
Motivation: a query like "upgrade all instances of this API" fails if embedding search misses even a few hits
Prior GPU infrastructure enabled running reranking across large code chunks in under one second
Design principle: aim for what works rather than for complexity; AST parsing was added only after evals proved it necessary
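
A toy version of multi-signal retrieval is sketched below, assuming Python source chunks. It is not the system described above: the bag-of-words cosine score stands in for a learned embedding, the weighted sum stands in for the GPU-powered reranker, and the function names and weights are invented for illustration. The AST signal uses Python's standard `ast` module to confirm that a chunk really calls the symbol in question.

```python
import ast
import math
import re
from collections import Counter
from typing import Dict, List, Set


def tokens(text: str) -> List[str]:
    return re.findall(r"[a-z0-9_]+", text.lower())


def keyword_score(query: str, chunk: str) -> float:
    """Fraction of query tokens that appear verbatim in the chunk."""
    q = set(tokens(query))
    return len(q & set(tokens(chunk))) / len(q) if q else 0.0


def vector_score(query: str, chunk: str) -> float:
    """Cosine similarity over token counts (toy stand-in for embeddings)."""
    a, b = Counter(tokens(query)), Counter(tokens(chunk))
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def called_names(chunk: str) -> Set[str]:
    """Names of functions or methods called in the chunk, via the syntax tree."""
    try:
        tree = ast.parse(chunk)
    except SyntaxError:
        return set()
    names: Set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            if isinstance(node.func, ast.Name):
                names.add(node.func.id)
            elif isinstance(node.func, ast.Attribute):
                names.add(node.func.attr)
    return names


def retrieve(query: str, symbol: str, chunks: Dict[str, str],
             weights=(1.0, 1.0, 2.0)) -> List[str]:
    """Score each chunk on all three signals and rank by the weighted sum."""
    scored = []
    for path, code in chunks.items():
        ast_hit = 1.0 if symbol in called_names(code) else 0.0
        score = (weights[0] * keyword_score(query, code)
                 + weights[1] * vector_score(query, code)
                 + weights[2] * ast_hit)
        if score > 0:
            scored.append((score, path))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [path for _, path in scored]


chunks = {
    "billing.py": "total = client.get_price(item)\n",
    "cart.py": "def refresh(c):\n    return c.get_price(c.items[0])\n",
    "docs.py": "# get_price is deprecated; see the pricing notes\n",
    "util.py": "def slugify(s):\n    return s.lower()\n",
}
results = retrieve("upgrade all calls to get_price", "get_price", chunks)
```

With these toy inputs, the two files whose syntax trees contain a real call to `get_price` rank ahead of the file that only mentions it in a comment, and the unrelated file is dropped. That is the behaviour an "upgrade every call site" query needs, and it is hard to guarantee from embedding similarity alone.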

Evaluation infrastructure

Used a property unique to code: it can be run
Core eval: take open-source commits with tests, delete implementation, ask the model to reproduce it; measure retrieval accuracy, intent accuracy, and test-pass rate
Also masked part of a change and asked the model to predict the rest, as a test of intent prediction (similar to Google's autocomplete task)
Evals drove investment decisions — complexity was added only when evals showed it helped
Combination of eval-driven and vibe-driven improvement: evals suited to opaque retrieval systems; user data and intuition suited to UI-level changes
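
The core eval loop described above (hide the implementation, ask a model to write it, run the tests) can be sketched in a few lines. Everything here is a simplified assumption: the `Task` type, the two toy tasks, and `toy_model`, which is a hard-coded stand-in for a real code model. The sketch measures only test-pass rate, not retrieval or intent accuracy.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Task:
    name: str
    stub: str   # source with the implementation removed (what the model sees)
    tests: str  # test code that must run cleanly against the candidate


def passes(candidate_source: str, tests: str) -> bool:
    """Run candidate plus tests in a fresh namespace; any exception is a failure.

    exec() on model output is unsafe outside a sandbox; a real harness would
    isolate this step (for example in a container or a subprocess).
    """
    namespace: dict = {}
    try:
        exec(candidate_source, namespace)
        exec(tests, namespace)
    except Exception:
        return False
    return True


def pass_rate(tasks: List[Task], model: Callable[[str], str]) -> float:
    """Fraction of tasks whose tests pass on the model's implementation."""
    if not tasks:
        return 0.0
    passed = sum(passes(model(task.stub), task.tests) for task in tasks)
    return passed / len(tasks)


tasks = [
    Task("add", "def add(a, b):\n    ...\n", "assert add(2, 3) == 5\n"),
    Task("is_even", "def is_even(n):\n    ...\n",
         "assert is_even(4)\nassert not is_even(7)\n"),
]


def toy_model(stub: str) -> str:
    # Hypothetical stand-in for a code model: gets add right, is_even wrong.
    if "def add" in stub:
        return "def add(a, b):\n    return a + b\n"
    return "def is_even(n):\n    return True\n"


rate = pass_rate(tasks, toy_model)
```

Because the score comes from executing tests rather than from human judgement, the same harness can be rerun after every change to the retrieval or prompting stack, which is what lets evals drive the decision to add or drop complexity.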

Using Windsurf in production

Internal teams commit code changes frequently as the primary safety net when using agents
Agents can change more than intended if intent is underspecified — surgical prompting and frequent commits reduce this
Boilerplate and repetitive tasks (e.g. server deployments) now handled entirely within Windsurf workflows
Non-technical employees build internal tools directly — removing the PM/engineer backlog bottleneck
Non-technical users represent a meaningful share of active users; many never open the code editor, living entirely in the Cascade agent panel

Hiring and interviews in the AI coding era

Still maintain a high technical bar — problem-solving skill remains the key proxy
Interviews include AI-assisted sessions to screen out people who resist the tools
Also include in-person sessions without AI: unassisted basic coding is treated as a signal of reasoning ability
Open-ended system design questions with trade-offs replace pure algorithmic questions; there is no single correct answer
Engineering headcount is growing, not shrinking — the problem ceiling is high and the 99% time-reduction mission requires many more capabilities (design, deploy, debug)

Opportunities for new AI coding startups

Large legacy migration market: COBOL-to-Java, JVM version upgrades, Rails upgrades — billions spent annually, few specialised tools
Automated alert and bug resolution: significant enterprise spend, no clearly dominant product yet
Both are niches deep enough to support large companies, not just one winner
General principle: pick one thing and do it exceptionally well rather than building another general coding assistant

Compounding advantage and the competition

Every insight depreciates; Nvidia-style compounding requires continuous new bets
Comfortable with a 50% failure rate on internal bets — 100% success signals insufficient ambition
Competitor landscape has shifted repeatedly (Copilot → Devin → Cursor); a long-term strategy combined with flexible execution matters more than reacting to rivals
Alpha over base models must grow proportionally as base models improve — the gap between foundation model output and 100% is the product opportunity
Treat pivots as a badge of honour; most founders fail because they prefer consistent failure over the discomfort of changing course
