M0
The Problem
Ranking audit
You can articulate why relevance is hard and what signals a ranker needs
IndexZero
Ten modules. One Python codebase that grows. You start with ranking judgment and end with a working search API. Along the way, you build every piece yourself: tokenizer, inverted index, BM25 scorer, eval harness, vector retrieval, and hybrid search.
Build arc
Each module changes the system state. The next module starts from the artifact you just made.
Modules
The free modules get the system started. The cohort continues through indexing, ranking, evaluation, query processing, and deployment.
M0
Ranking audit
You can articulate why relevance is hard and what signals a ranker needs
M1
Tokenizer + vocabulary
A working tokenizer that turns raw text into clean token streams
M2
Inverted index + lookup
A searchable inverted index that maps terms to documents
M3
BM25 scorer
A BM25 scorer that ranks documents by relevance to a query
M4
Eval harness
An evaluation harness that measures ranking quality with labeled queries
M5
Query processor
A query processor that handles multi-word and structured queries
M6
Vector retrieval
Vector embeddings and approximate nearest neighbor search over the same corpus
M7
Hybrid retrieval
A hybrid retriever that combines lexical and semantic signals
M8
Index pipeline
An index that handles document changes without full rebuilds
M9
Search API
A complete search API serving queries over your index
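To make the arc concrete, here is a minimal sketch of how the M1 to M3 artifacts could fit together: a tokenizer feeds an inverted index, and a BM25 scorer ranks documents from that index. This is an illustration only, not the course's implementation. The function names, the three-document sample corpus, the naive ASCII-only tokenizer, and the parameter defaults (k1=1.2, b=0.75) are all assumptions, and the idf formula is the non-negative variant used by Lucene rather than anything the course specifies.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and keep runs of ASCII letters/digits (deliberately naive)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Return (index, doc_len): index maps term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    doc_len = {}
    for doc_id, text in docs.items():
        tokens = tokenize(text)
        doc_len[doc_id] = len(tokens)
        for term, tf in Counter(tokens).items():
            index[term][doc_id] = tf
    return index, doc_len


def bm25_search(query, index, doc_len, k1=1.2, b=0.75):
    """Rank documents by BM25 score; returns [(doc_id, score)], best first."""
    n_docs = len(doc_len)
    if n_docs == 0:
        return []
    avg_len = sum(doc_len.values()) / n_docs
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue  # term not in the corpus, contributes nothing
        df = len(postings)
        # Non-negative idf variant (assumption: the course may use another form).
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        for doc_id, tf in postings.items():
            # Length normalisation: longer-than-average documents are penalised.
            norm = tf + k1 * (1 - b + b * doc_len[doc_id] / avg_len)
            scores[doc_id] += idf * tf * (k1 + 1) / norm
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample corpus, not the course's product dataset.
docs = {
    "d1": "wireless noise cancelling headphones",
    "d2": "wired headphones with microphone",
    "d3": "wireless charging pad for phones",
}
index, doc_len = build_index(docs)
results = bm25_search("wireless headphones", index, doc_len)
```

In this sketch, d1 ranks first because it is the only document that matches both query terms. The later modules (evaluation, query processing, vector and hybrid retrieval) would build on the same index and scorer rather than replace them.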
Start free
Start with M0 and M1. You define the ranking problem, build the tokenizer, and see how tokenization changes the corpus before indexing starts.
Join the cohort
Continue with M2 and beyond through the guided cohort. You get code reviews, deadline structure, discussion, and a workshop after the ranking modules.
Evidence
One codebase across 10 modules. M1: 44 tests, 7 assessments. M2: 26 tests, 6 assessments. M3: 27 tests, 5 assessments. Real product dataset. Each module feeds the next.
Who built this
IndexZero is built by Sumit Garg. Sumit spent years building search infrastructure at Microsoft, working on the systems behind Azure AI Search that handle billions of queries. He built this course because most developers use hosted search without understanding the retrieval mechanics underneath. IndexZero makes those mechanics visible in code you write yourself.