
Index Types Overview

Schema recognizes six value types, each with associated index types. If you do not provide a Schema, collections use these built-in defaults:
| Config Class | Value Type | Default Behavior | Use Case |
|---|---|---|---|
| StringInvertedIndexConfig | string | Enabled for all metadata | Filter on string values |
| FtsIndexConfig | string | Enabled for K.DOCUMENT only | Full-text search on documents |
| VectorIndexConfig | float_list | Enabled for K.EMBEDDING only | Similarity search on embeddings |
| SparseVectorIndexConfig | sparse_vector | Disabled (requires config) | Keyword-based search |
| IntInvertedIndexConfig | int_value | Enabled for all metadata | Filter on integer values |
| FloatInvertedIndexConfig | float_value | Enabled for all metadata | Filter on float values |
| BoolInvertedIndexConfig | boolean | Enabled for all metadata | Filter on boolean values |

Simple Index Configs

These index types have no configuration parameters.

FtsIndexConfig

Use Case: Full-text search and regular expression search on documents (e.g., where(K.DOCUMENT.contains("search term"))).

Limitations:
  • Cannot be deleted
  • Applies to K.DOCUMENT only

StringInvertedIndexConfig

Use Case: Exact and prefix string matching on metadata fields (e.g., where(K("category") == "science")).
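
To make the equality case concrete, here is a minimal sketch of how an inverted index can answer a filter such as K("category") == "science". It is illustrative only and does not describe Chroma's internals; the class name StringInvertedIndex and its methods are hypothetical.

```python
from collections import defaultdict


class StringInvertedIndex:
    """Toy inverted index (hypothetical): maps (key, value) pairs to record ids."""

    def __init__(self) -> None:
        self._postings: dict[tuple[str, str], set[str]] = defaultdict(set)

    def add(self, record_id: str, metadata: dict[str, str]) -> None:
        # Index every string metadata field of the record.
        for key, value in metadata.items():
            self._postings[(key, value)].add(record_id)

    def equals(self, key: str, value: str) -> set[str]:
        # An equality filter is a single dictionary lookup; return a copy.
        return set(self._postings.get((key, value), set()))


index = StringInvertedIndex()
index.add("doc1", {"category": "science"})
index.add("doc2", {"category": "history"})
index.add("doc3", {"category": "science"})
matches = index.equals("category", "science")
```

The lookup cost depends on the number of matching records rather than on the size of the collection, which is why this kind of index suits metadata filters.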

IntInvertedIndexConfig

Use Case: Range and equality queries on integer metadata (e.g., where(K("year") >= 2020)).

FloatInvertedIndexConfig

Use Case: Range and equality queries on float metadata (e.g., where(K("price") < 99.99)).
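
Range filters on integer and float metadata, such as K("year") >= 2020 or K("price") < 99.99, can be served by keeping values sorted and using binary search. The sketch below shows that idea with the standard bisect module. It is an assumption-laden illustration, not Chroma's implementation, and NumericIndex is a hypothetical name.

```python
import bisect


class NumericIndex:
    """Toy sorted index (hypothetical) over one numeric metadata field."""

    def __init__(self) -> None:
        # (value, record_id) pairs, kept in sorted order.
        self._entries: list[tuple[float, str]] = []

    def add(self, record_id: str, value: float) -> None:
        bisect.insort(self._entries, (value, record_id))

    def at_least(self, threshold: float) -> list[str]:
        # Binary search for the first entry whose value is >= threshold.
        start = bisect.bisect_left(self._entries, (threshold, ""))
        return [record_id for _, record_id in self._entries[start:]]

    def less_than(self, threshold: float) -> list[str]:
        # Everything before that position has a value strictly below threshold.
        end = bisect.bisect_left(self._entries, (threshold, ""))
        return [record_id for _, record_id in self._entries[:end]]


years = NumericIndex()
for record_id, year in [("a", 2018), ("b", 2020), ("c", 2023)]:
    years.add(record_id, year)
recent = years.at_least(2020)

prices = NumericIndex()
for record_id, price in [("x", 49.5), ("y", 99.99), ("z", 120.0)]:
    prices.add(record_id, price)
cheap = prices.less_than(99.99)
```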

BoolInvertedIndexConfig

Use Case: Filtering on boolean metadata (e.g., where(K("published") == True)).

VectorIndexConfig

Use Case: Semantic similarity search on dense embeddings for finding conceptually similar content.

Parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| space | string | No | Distance function: l2 (geometric), ip (inner product), or cosine (angle-based, most common for text). Default: l2 |
| embedding_function | EmbeddingFunction | No | Function to auto-generate embeddings from K.DOCUMENT. If not provided, supply embeddings manually |
| source_key | string | No | Reserved for future use. Currently always uses K.DOCUMENT |
| hnsw | HnswConfig | No | Advanced: HNSW algorithm tuning for single-node deployments |
| spann | SpannConfig | No | Advanced: SPANN algorithm tuning (clustering, probing) for Chroma Cloud |

Limitations:
  • Cannot be deleted
  • Applies to K.EMBEDDING only
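
The three space options differ only in how the distance between two embeddings is computed. The sketch below follows the conventions documented by hnswlib (squared L2, one minus the inner product, one minus the cosine similarity). Treat the exact formulas as an assumption about what the index computes; the function names are chosen here for illustration.

```python
import math


def l2(a: list[float], b: list[float]) -> float:
    # Squared Euclidean distance (no square root, as in hnswlib's "l2" space).
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ip(a: list[float], b: list[float]) -> float:
    # Inner-product distance: a larger dot product means a smaller distance.
    return 1.0 - sum(x * y for x, y in zip(a, b))


def cosine(a: list[float], b: list[float]) -> float:
    # Cosine distance: depends only on the angle, not on vector length.
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norms


same_direction = cosine([1.0, 0.0], [2.0, 0.0])
orthogonal = cosine([1.0, 0.0], [0.0, 1.0])
```

Because cosine ignores magnitude, two vectors that point the same way have distance zero even when their lengths differ, whereas l2 would report them as apart. That is one reason cosine is a common choice for text embeddings.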

SparseVectorIndexConfig

Use Case: Keyword-based search for exact term matching, domain-specific terminology, and technical terms. Ideal for hybrid search when combined with dense embeddings.

Parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| source_key | string | No | Field to generate sparse embeddings from. Typically K.DOCUMENT, but can be any text field |
| embedding_function | SparseEmbeddingFunction | No | Sparse embedding function (e.g., ChromaCloudSpladeEmbeddingFunction, HuggingFaceSparseEmbeddingFunction, Bm25EmbeddingFunction) |
| bm25 | boolean | No | Set to true when using Bm25EmbeddingFunction to enable inverse document frequency (IDF) scaling for queries. Not applicable for SPLADE |

Limitations:
  • Must specify a metadata key name (per-key configuration required)
  • Sparse vector indices must be declared at collection creation and cannot be added later
  • Cannot be deleted once created
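
To illustrate what IDF scaling of a query means, the sketch below scores sparse vectors with a dot product and reweights query terms by a BM25-style inverse document frequency. How Chroma applies IDF internally is not described here, so this is a conceptual example only. The helper names, the string term keys, and the corpus statistics are all hypothetical.

```python
import math


def idf(num_docs: int, docs_with_term: int) -> float:
    # BM25-style IDF (the always-positive variant): rare terms get larger weights.
    return math.log((num_docs - docs_with_term + 0.5) / (docs_with_term + 0.5) + 1.0)


def scale_query(query: dict[str, float], doc_freq: dict[str, int], num_docs: int) -> dict[str, float]:
    # Multiply each query weight by the IDF of its term.
    return {t: w * idf(num_docs, doc_freq.get(t, 0)) for t, w in query.items()}


def sparse_dot(query: dict[str, float], doc: dict[str, float]) -> float:
    # Only terms present in both sparse vectors contribute to the score.
    return sum(w * doc[t] for t, w in query.items() if t in doc)


# Hypothetical corpus statistics: "the" is common, "hnsw" is rare.
num_docs = 100
doc_freq = {"the": 95, "hnsw": 3}

query = {"the": 1.0, "hnsw": 1.0}
scaled = scale_query(query, doc_freq, num_docs)

doc_common = {"the": 1.0}   # matches only the common term
doc_rare = {"hnsw": 1.0}    # matches only the rare term

unscaled_scores = (sparse_dot(query, doc_common), sparse_dot(query, doc_rare))
scaled_scores = (sparse_dot(scaled, doc_common), sparse_dot(scaled, doc_rare))
```

Without scaling, both documents score the same. With scaling, the document that matches the rare term ranks higher, which is the behavior keyword search usually wants.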

Next Steps