Three AI Capabilities That Crossed the Threshold in Six Months
Executive overview
AI capabilities cross a usability threshold unpredictably — tasks that failed months ago may now work out of the box. Missing that moment means competitors gain a lead while you play catch-up. The fix is a systematic retesting process tied to model releases and quarterly reviews.
The competitive edge is not adopting AI early — it is knowing when a failed test is worth retrying.
Three capabilities that crossed the threshold
- Complex PDF extraction: a project that took five weeks of engineering and six models six months ago now works in one prompt with Gemini Flash.
- Image consistency: generating images with a consistent face, product, or logo across outputs was nearly impossible; current models handle it accurately.
- Handwriting extraction: accuracy on cursive and poor handwriting has jumped from ~60% to ~100% with models like Gemini Pro.
Why tests fail for the wrong reason
- Vague prompts and missing context make AI appear incapable when it is not.
- Before writing off a capability, confirm the prompt has: specific task context, clear expectations, and concrete success criteria.
- Only after a high-quality prompt still fails can you judge the model's actual limit.
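The checklist above can be sketched as a small pre-flight check. This is only an illustration: the function name `missing_elements` and the dictionary keys are hypothetical, chosen to mirror the three elements listed in this section.

```python
# Hypothetical names for the three elements a prompt should contain.
REQUIRED_ELEMENTS = ("task_context", "expectations", "success_criteria")


def missing_elements(prompt_parts):
    """Return the required elements that are absent or blank in prompt_parts."""
    return [
        name
        for name in REQUIRED_ELEMENTS
        if not prompt_parts.get(name, "").strip()
    ]


draft = {
    "task_context": "Extract line items from scanned supplier invoices.",
    "expectations": "",  # left blank, so it is reported as missing
}
print(missing_elements(draft))
```

An empty result means the prompt covers all three elements, so a failure is more likely to reflect the model's limit than the prompt.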
The AI wish list system
- Keep a running list of tasks AI cannot yet do: record the task, the date tested, and the result.
- Store it anywhere you will actually use it: a Google Doc, Apple Notes, or similar.
- This list is a retesting queue, not a graveyard.
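For readers who prefer a structured file over a note, the three recorded fields could look like the sketch below. The class name `WishListItem`, the helper `record_test`, and the sample task and dates are all hypothetical; a plain document works just as well.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class WishListItem:
    task: str          # what you asked the AI to do
    last_tested: date  # when you last tried it
    result: str        # what happened on that attempt


def record_test(wish_list, task, tested_on, result):
    """Add a new entry, or update the existing entry for the same task."""
    for item in wish_list:
        if item.task == task:
            item.last_tested = tested_on
            item.result = result
            return item
    item = WishListItem(task, tested_on, result)
    wish_list.append(item)
    return item


wish_list = []
# Illustrative entries only; the task and dates are made up.
record_test(wish_list, "Extract tables from scanned PDFs", date(2025, 1, 15), "failed: merged columns")
record_test(wish_list, "Extract tables from scanned PDFs", date(2025, 4, 15), "failed: fewer errors")
```

Updating the existing entry rather than appending a duplicate keeps one row per task, with the most recent test date on it.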
When to retest
- New model release from Anthropic, OpenAI, or Google — test relevant wish-list items immediately.
- New feature release from any of the big three — test if it addresses your use case.
- Quarterly calendar reminder — review all items even without a major announcement, as models improve silently in the background.
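The triggers above reduce to a simple rule: retest if something relevant shipped since the last test, or if a quarter has passed. A minimal sketch, assuming a quarter is approximated as 90 days and that you track release dates yourself (`should_retest` and `release_dates` are hypothetical names):

```python
from datetime import date, timedelta

QUARTER = timedelta(days=90)  # assumption: "quarterly" approximated as 90 days


def should_retest(last_tested, today, release_dates=()):
    """Return True if a release landed since the last test or a quarter has passed."""
    new_release = any(last_tested < released <= today for released in release_dates)
    quarter_elapsed = today - last_tested >= QUARTER
    return new_release or quarter_elapsed


# Illustrative dates only.
print(should_retest(date(2025, 1, 1), date(2025, 2, 1)))
print(should_retest(date(2025, 1, 1), date(2025, 2, 1), [date(2025, 1, 20)]))
```

The first call has no trigger; the second has a release after the last test date, so the item is due.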
Retesting protocol
- Run the wish-list task with a basic prompt against the new model or feature.
- If it works, remove it from the list.
- If it fails, improve the prompt: add context, sharpen expectations, check current best practices.
- Only if it still fails with a strong prompt does the item stay on the list, with an updated test date.
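The protocol above can be sketched as one function. Everything here is illustrative: `retest` is a hypothetical name, the item is a plain dictionary, and `run_prompt` stands in for however you actually run a prompt and judge the output (it is assumed to return True on success).

```python
from datetime import date


def retest(item, run_prompt, basic_prompt, improved_prompt, today):
    """Apply the retesting protocol to one wish-list item.

    Returns True if the task now works and the item can leave the list.
    """
    # Step 1: try the basic prompt against the new model or feature.
    if run_prompt(basic_prompt):
        return True
    # Step 2: the basic prompt failed, so try the improved prompt.
    if run_prompt(improved_prompt):
        return True
    # Step 3: still failing with a strong prompt, so keep the item and update it.
    item["last_tested"] = today
    item["result"] = "failed with improved prompt"
    return False


# Illustrative item and a stand-in runner that always fails.
item = {"task": "Consistent logo across images", "last_tested": date(2025, 1, 1), "result": "failed"}
still_failing = not retest(item, lambda prompt: False, "basic", "improved", date(2025, 4, 1))
```

An item is only re-dated after both attempts fail, which keeps the recorded date meaningful as "last serious attempt".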