- Python
- TypeScript
- Rust
- k: Controls term frequency saturation (default: 1.2)
- b: Controls document length normalization (default: 0.75)
- avg_doc_length: Average document length in tokens (default: 256.0)
- token_max_length: Maximum token length (default: 40)
- stopwords: Optional list of stopwords to exclude
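
To make the roles of these parameters concrete, here is a minimal sketch of the standard BM25 term-frequency weighting in plain Python. It is not this library's implementation: the function names `tokenize` and `bm25_term_weights` are hypothetical, the whitespace tokenizer is a simplification, and dropping (rather than truncating) tokens longer than `token_max_length` is an assumption. The IDF component of BM25 is omitted to keep the focus on `k`, `b`, and `avg_doc_length`.

```python
from collections import Counter


def tokenize(text, token_max_length=40, stopwords=None):
    """Hypothetical tokenizer: lowercase, split on whitespace, filter tokens."""
    stop = set(stopwords or [])
    return [
        tok
        for tok in text.lower().split()
        # Assumption: over-long tokens are dropped rather than truncated.
        if len(tok) <= token_max_length and tok not in stop
    ]


def bm25_term_weights(tokens, k=1.2, b=0.75, avg_doc_length=256.0):
    """Return the BM25 term-frequency weight for each distinct token."""
    doc_len = len(tokens)
    # Length normalization: b=0 ignores document length, b=1 applies it fully.
    norm = 1.0 - b + b * (doc_len / avg_doc_length)
    weights = {}
    for term, tf in Counter(tokens).items():
        # Saturation: the weight rises with tf but stays below k + 1.
        weights[term] = tf * (k + 1.0) / (tf + k * norm)
    return weights


if __name__ == "__main__":
    doc = tokenize("the cat sat on the mat with another cat", stopwords=["the", "on"])
    for term, weight in bm25_term_weights(doc).items():
        print(f"{term}: {weight:.3f}")
```

Raising `k` lets repeated terms keep adding weight for longer before the curve flattens, while raising `b` penalizes documents that are longer than `avg_doc_length` more strongly.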
BM25 is a classic information retrieval algorithm that works well for keyword-based search. For semantic search, consider using dense embedding functions instead.