M2
The Index
Inverted index + lookup
A searchable inverted index that maps terms to documents
Available
Cohort
- Effort: 4-6 hours
- Prerequisite: M1
- Core concept: Postings lists and lookup cost
What you have
Token streams from M1
What you gain
Searchable postings lists on disk
What you build
This module moves from token streams to disk-backed postings lists that later ranking code can score.
- A build_index() function that writes an inverted index from M1 token streams
- A lookup() function that returns postings lists for any indexed term
- JSON files on disk that persist document statistics and postings
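As a rough illustration of the three deliverables above, here is a minimal sketch. It assumes M1 hands over a dict mapping document IDs to token lists, and the file names postings.json and doc_stats.json are hypothetical; the module's actual signatures and file layout may differ.

```python
import json
from collections import defaultdict
from pathlib import Path


def build_index(token_streams, index_dir):
    """Write postings and document statistics as JSON files.

    token_streams: dict of doc ID (str) -> list of tokens.
    This input shape is an assumption about what M1 produces.
    """
    postings = defaultdict(dict)  # term -> {doc_id: count of term in that doc}
    doc_stats = {}                # doc_id -> {"length": number of tokens}

    for doc_id, tokens in token_streams.items():
        doc_stats[doc_id] = {"length": len(tokens)}
        for token in tokens:
            postings[token][doc_id] = postings[token].get(doc_id, 0) + 1

    out = Path(index_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Sort terms and doc IDs so repeated builds produce identical files.
    serializable = {
        term: sorted(docs.items()) for term, docs in sorted(postings.items())
    }
    # File names are hypothetical, chosen only for this sketch.
    (out / "postings.json").write_text(json.dumps(serializable))
    (out / "doc_stats.json").write_text(json.dumps(doc_stats))


def lookup(term, index_dir):
    """Return the postings list for term as [(doc_id, count), ...], or [] if absent."""
    data = json.loads((Path(index_dir) / "postings.json").read_text())
    return [(doc_id, count) for doc_id, count in data.get(term, [])]
```

This version rereads the whole postings file on every lookup, which keeps the sketch short but is one of the persistence trade-offs the module asks you to think about.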
What you learn
- How an inverted index turns term lookup into a practical retrieval primitive
- What postings lists store and how lookup cost grows with the number of documents a term appears in
- Why persistence choices matter once the index leaves memory
- How to structure document IDs, term stats, and index metadata
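To make the link between postings lists and lookup cost concrete, the following sketch uses a small hypothetical in-memory index with integer document IDs. The names document_frequency and intersect are illustrative, not part of the module's required API.

```python
# Hypothetical index: term -> postings sorted by doc ID, each entry (doc_id, count).
index = {
    "the": [(0, 3), (1, 2), (2, 5), (3, 1)],
    "postings": [(1, 1), (3, 2)],
    "zebra": [(2, 1)],
}


def document_frequency(index, term):
    """Number of documents containing term, i.e. the length of its postings list."""
    return len(index.get(term, []))


def intersect(a, b):
    """Return doc IDs present in both doc-ID-sorted postings lists.

    Uses a two-pointer merge, so it takes at most len(a) + len(b) steps:
    common terms with long lists cost more than rare ones.
    """
    i, j, result = 0, 0, []
    while i < len(a) and j < len(b):
        if a[i][0] == b[j][0]:
            result.append(a[i][0])
            i += 1
            j += 1
        elif a[i][0] < b[j][0]:
            i += 1
        else:
            j += 1
    return result
```

Here intersect(index["the"], index["postings"]) returns [1, 3], and the work is bounded by the combined length of the two lists rather than by the size of the collection.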
Artifact and workload
Primary artifact: build_index() and lookup() functions with JSON persistence
Tests: 26
Assessments: 6
Estimated time: 4-6 hours
Access
This module is part of the cohort. Join the guided path for reviews, deadlines, and the workshop sequence after the ranking modules.