Practical techniques to improve RAG retrieval without complexity

Executive overview

RAG systems are easy to set up but hard to master. Most teams jump to complex solutions — fine-tuning, agent routing, multi-doc agents — when simpler techniques would give them 80% of the results.

Focusing on beginner-mode fundamentals plus re-ranking delivers most of the retrieval gains. Nightmare-mode complexity is only worth pursuing once these foundations are solid.

The 80/20 of RAG: data quality, smart chunking, metadata filtering, and re-ranking outperform almost every advanced technique.

Cleaning data before ingestion

Remove junk: ads, redundant headers, cover pages, irrelevant boilerplate.
Fix or remove poorly formatted content — weird layouts confuse retrieval.
Strip broken text: garbled ASCII, non-language symbols, encoding artifacts.
Goal: only meaningful, clean text enters the vector database.
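
A minimal sketch of such a cleaning pass, assuming plain-text input. The boilerplate patterns and the function name clean_text are illustrative assumptions, not part of any particular library; tune them to your own corpus.

```python
import re
import unicodedata

# Hypothetical boilerplate patterns; replace with ones that match your documents.
BOILERPLATE_PATTERNS = [
    re.compile(r"^advertisement$", re.IGNORECASE),
    re.compile(r"^page \d+( of \d+)?$", re.IGNORECASE),
]


def clean_text(raw: str) -> str:
    """Strip encoding debris, boilerplate lines, and repeated lines."""
    # Normalise Unicode so visually identical characters compare equal.
    text = unicodedata.normalize("NFKC", raw)
    # Drop the replacement character and control/format characters,
    # keeping newlines and tabs.
    text = "".join(
        ch for ch in text
        if ch in "\n\t"
        or (ch != "\ufffd" and unicodedata.category(ch)[0] != "C")
    )
    seen = set()
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            # Keep a single blank line so paragraph boundaries survive.
            if kept and kept[-1] != "":
                kept.append("")
            continue
        if any(p.match(stripped) for p in BOILERPLATE_PATTERNS):
            continue
        # Heuristic: an exact repeat is treated as a redundant header or footer.
        if stripped in seen:
            continue
        seen.add(stripped)
        kept.append(stripped)
    return "\n".join(kept).strip()
```

The exact-repeat rule is a blunt heuristic: it suits repeated page headers but would also drop legitimately repeated lines, so check it against a sample of your documents before relying on it.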

Choosing and using embedders

Use recent, widely adopted embedding models; newer models generally retrieve better than older ones.
OpenAI's text-embedding-3-small is a strong default: it is cheap and performs close to the larger models.
For specialist domains (e.g. dermatology), test embedders on your specific vocabulary.
Good embedders capture semantic neighbours — "money back", "return", "refund" cluster together; unrelated words do not.
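
One way to test an embedder on your own vocabulary is a neighbour check: related terms should score closer to an anchor term than unrelated ones. The sketch below assumes you pass in your embedder as a function; toy_embed and its hand-written vectors are a stand-in for illustration only, not output from any real model.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def passes_neighbour_check(
    embed: Callable[[str], Vector],
    anchor: str,
    related: list[str],
    unrelated: list[str],
) -> bool:
    """True if every related term is closer to the anchor than every unrelated term."""
    anchor_vec = embed(anchor)
    related_scores = [cosine_similarity(anchor_vec, embed(t)) for t in related]
    unrelated_scores = [cosine_similarity(anchor_vec, embed(t)) for t in unrelated]
    return min(related_scores) > max(unrelated_scores)


# Hand-written toy vectors (assumption, for illustration only).
TOY_VECTORS = {
    "refund": [0.9, 0.1, 0.0],
    "return": [0.8, 0.2, 0.1],
    "money back": [0.85, 0.15, 0.05],
    "weather": [0.0, 0.1, 0.9],
}


def toy_embed(text: str) -> Vector:
    """Stand-in for a real embedding call."""
    return TOY_VECTORS[text]
```

To compare real embedders, swap toy_embed for a function that calls each model and run the same check over term lists drawn from your domain.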

Chunking strategy

Cut at meaning boundaries — sentence ends, paragraph ends, section ends — not at arbitrary character counts.
Overlap chunks by 20–30% so the LLM can see the connection between adjacent pieces.
Adjust chunk size to the content: a short paragraph warrants a smaller chunk than a long section; don't force a static size.
Never cut mid-sentence — it destroys the LLM's ability to interpret the chunk.
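
The rules above can be sketched as a sentence-aware chunker. This is a minimal illustration, not a production splitter: the regex sentence splitter is naive (it will break on abbreviations such as "e.g."), and the names chunk_sentences, max_chars, and overlap_ratio are assumptions.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def chunk_sentences(text: str, max_chars: int = 500, overlap_ratio: float = 0.25) -> list[str]:
    """Group whole sentences into chunks, carrying trailing sentences forward as overlap."""
    chunks: list[str] = []
    current: list[str] = []
    for sentence in split_sentences(text):
        candidate = " ".join(current + [sentence])
        if current and len(candidate) > max_chars:
            chunks.append(" ".join(current))
            # Carry over as many trailing sentences as fit in the overlap budget.
            budget = int(max_chars * overlap_ratio)
            carried: list[str] = []
            for prev in reversed(current):
                if len(" ".join([prev] + carried)) > budget:
                    break
                carried.insert(0, prev)
            current = carried
        # A sentence longer than max_chars is kept whole rather than cut.
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Because overlap is measured in whole sentences, the actual overlap lands at or below the requested ratio rather than exactly on it. To respect paragraph or section ends as well, run the function per paragraph or per section instead of over the whole document.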

Metadata and filtering

Attach structured tags to each chunk before vectorising: date, topic, entity names, location, document section.
Add a one-sentence LLM-generated summary as a metadata field on each chunk.
At query time, combine vector similarity with metadata filters — this dramatically narrows the result set.
Example: for a query about Tesla's Q3 earnings, extract topic: earnings, company: Tesla, period: Q3 and apply them as filters before querying the vector DB.
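
A minimal in-memory sketch of filter-then-rank retrieval. Real vector databases usually offer their own metadata filter syntax; the Chunk class and filtered_search function here are assumptions that stand in for that, and the vectors in any usage are toy values.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Chunk:
    text: str
    vector: list[float]
    metadata: dict[str, str] = field(default_factory=dict)


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def filtered_search(
    chunks: list[Chunk],
    query_vector: list[float],
    filters: dict[str, str],
    top_k: int = 5,
) -> list[Chunk]:
    """Keep chunks whose metadata matches every filter, then rank by similarity."""
    matching = [
        c for c in chunks
        if all(c.metadata.get(key) == value for key, value in filters.items())
    ]
    matching.sort(key=lambda c: cosine(query_vector, c.vector), reverse=True)
    return matching[:top_k]
```

With filters {"company": "Tesla", "topic": "earnings", "period": "Q3"}, only chunks tagged with all three values are scored at all, which is where the narrowing comes from.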

Source tracking

Prompt the LLM to return the source reference alongside every answer.
Format: inline citation (e.g. [Section 2.1 — Warranty Policy]) so the user can verify.
Prevents hallucinated answers from going unchecked.
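
A sketch of how this could look in practice: label each retrieved passage, instruct the model to cite by label, then check the citations in the answer against the labels you supplied. The prompt wording and both function names are assumptions.

```python
import re


def build_cited_prompt(question: str, passages: list[tuple[str, str]]) -> str:
    """Build a prompt from (source_label, text) pairs that asks for inline citations."""
    context = "\n\n".join(f"[{label}]\n{text}" for label, text in passages)
    return (
        "Answer the question using only the passages below.\n"
        "After each claim, cite the passage label in square brackets, "
        "for example [Section 2.1 - Warranty Policy].\n"
        "If the passages do not contain the answer, say so.\n\n"
        f"Passages:\n{context}\n\n"
        f"Question: {question}"
    )


def unknown_citations(answer: str, passages: list[tuple[str, str]]) -> list[str]:
    """Return cited labels that do not match any supplied passage."""
    known = {label for label, _ in passages}
    cited = re.findall(r"\[([^\[\]]+)\]", answer)
    return [label for label in cited if label not in known]
```

A non-empty result from unknown_citations is a cheap signal that the answer cites something that was never retrieved and deserves a second look.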

Query rewriting before retrieval

Insert an intermediate LLM that rewrites the user's raw query before it hits the vector database.
Vague queries ("tell me about Tesla earnings") become specific ones with added context, time period, and relevant keywords.
Improves retrieval precision without changing the user-facing interface.
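
A minimal sketch of the rewriting step. The llm argument is any function that takes a prompt and returns text, so no specific provider API is assumed; the template wording and the name rewrite_query are illustrative.

```python
from typing import Callable

# Illustrative prompt; adjust the instructions to your domain.
REWRITE_TEMPLATE = (
    "Rewrite the user query below so it is specific enough for document search.\n"
    "Add the likely time period, entities, and relevant keywords.\n"
    "Return only the rewritten query.\n\n"
    "User query: {query}"
)


def rewrite_query(raw_query: str, llm: Callable[[str], str]) -> str:
    """Ask the LLM for a more specific query; fall back to the raw one if it returns nothing."""
    rewritten = llm(REWRITE_TEMPLATE.format(query=raw_query)).strip()
    return rewritten or raw_query
```

The rewritten query is what gets embedded and sent to the vector database; the user still sees only their original question.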

Re-ranking results

A typical retrieval returns 10–50 chunks; not all are equally relevant.
Score each chunk using a combination of vector similarity and metadata filter matches.
Re-rank by combined score; pass only the top 5 (or similar cutoff) to the final LLM.
Avoids returning the first result blindly — surfaces the most contextually accurate chunks instead.
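
One simple way to combine the two signals is a weighted sum. The sketch below assumes the first-pass similarity is already scaled to the range 0 to 1; the 0.3 metadata weight and the names Candidate and rerank are assumptions to tune, not recommended values.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    similarity: float  # vector similarity from first-pass retrieval, assumed in [0, 1]
    metadata: dict[str, str]


def rerank(
    candidates: list[Candidate],
    filters: dict[str, str],
    top_k: int = 5,
    metadata_weight: float = 0.3,
) -> list[Candidate]:
    """Sort by a weighted mix of similarity and the fraction of filters matched."""

    def combined_score(c: Candidate) -> float:
        if filters:
            matches = sum(1 for key, value in filters.items() if c.metadata.get(key) == value)
            match_ratio = matches / len(filters)
        else:
            match_ratio = 0.0
        return (1 - metadata_weight) * c.similarity + metadata_weight * match_ratio

    return sorted(candidates, key=combined_score, reverse=True)[:top_k]
```

Unlike a hard filter, this keeps partially matching chunks in play: a chunk with slightly lower similarity but matching metadata can outrank a closer vector match with the wrong tags.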

Adventure and nightmare mode (brief reference)

Adventure mode techniques worth considering when fundamentals are solid:

Recursive retrieval: iteratively fetch more context based on each prior retrieval step.
Embedded tables: preserve relationships between cells, not just raw text.
Small-to-big: start with a small seed chunk, expand outward until context is sufficient.
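
As an illustration of small-to-big, the sketch below grows a window of neighbouring chunks around the seed. It assumes chunks are stored in document order and uses a character count as a stand-in for "sufficient context"; both the function name and that stopping rule are assumptions.

```python
def expand_small_to_big(chunks: list[str], seed_index: int, min_chars: int) -> str:
    """Expand outward from the seed chunk until the window holds at least min_chars."""
    lo = hi = seed_index
    while sum(len(c) for c in chunks[lo:hi + 1]) < min_chars:
        if lo == 0 and hi == len(chunks) - 1:
            break  # the whole document is already included
        # Take one more neighbour on each side where available.
        if lo > 0:
            lo -= 1
        if hi < len(chunks) - 1:
            hi += 1
    return " ".join(chunks[lo:hi + 1])
```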

Nightmare mode (high cost and complexity — avoid unless necessary):

LLM fine-tuning for domain-specific tasks.
Embedding fine-tuning for specialist terminology.
Agent routing across multiple vector databases.
Query planning: decompose complex queries into sub-queries, search each, aggregate.
Multi-doc agents: parallel agents search separate document sets and consolidate answers.
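
For reference, query planning reduces to a decompose, search, aggregate loop. In the sketch below, decompose would typically be an LLM call and search a vector lookup; both are passed in as plain functions, and all names are assumptions.

```python
from typing import Callable


def plan_and_search(
    query: str,
    decompose: Callable[[str], list[str]],
    search: Callable[[str], list[str]],
    top_k: int = 5,
) -> list[str]:
    """Split a query into sub-queries, search each, and merge hits without duplicates."""
    merged: list[str] = []
    seen: set[str] = set()
    for sub_query in decompose(query):
        for hit in search(sub_query):
            if hit not in seen:
                seen.add(hit)
                merged.append(hit)
    return merged[:top_k]
```

Each sub-query adds an LLM call and a search round trip, which is part of why the technique sits in nightmare mode.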
