Skip to main content
Search dictionaries define filtering, ranking, grouping, pagination, and field selection for Chroma queries. Each SDK provides a DSL, but they compile to the same JSON format that you can construct directly. For example, SDK code like this:
from chromadb import Search, K, Knn, GroupBy, MinK

search = Search(
    where=K("status") == "active",
    rank=Knn(query="machine learning research", limit=100),
    group_by=GroupBy(keys=K("category"), aggregate=MinK(keys=K.SCORE, k=2)),
    limit=10,
    select=[K.DOCUMENT, K.SCORE, "category"]
)
Gets compiled to this JSON:
{
  "where": {"status": {"$eq": "active"}},
  "rank": {"$knn": {"query": "machine learning research", "limit": 100}},
  "group_by": {
    "keys": ["category"],
    "aggregate": {"$min_k": {"keys": ["#score"], "k": 2}}
  },
  "limit": {"limit": 10, "offset": 0},
  "select": {"keys": ["#document", "#score", "category"]}
}
This reference describes the Search dictionary format and rules. For related dictionary references, see Where Filters.

JSON Format

Basic Structure

A Search dictionary is an object with optional keys:
{
  "where": { /* where filter dictionary */ },
  "rank": { /* rank expression dictionary */ },
  "group_by": { /* group by dictionary */ },
  "limit": {"limit": 10, "offset": 0},
  "select": {"keys": ["#document", "#score"]}
}
All keys are optional. Omitted keys use Search defaults.

Component Schemas

where

where uses the Where Filter dictionary schema.
{
  "where": ...
}
See Where Filters for full operator and rule definitions.

rank

rank must be a dictionary with exactly one top-level operator.
{
  "rank": RankExpr
}
{
  "RankExpr": {"$val": "number"}
}
{
  "RankExpr": {
    "$knn": {
      "query": "string | number[] | SparseVector",
      "key": "string (optional)",
      "limit": "positive integer (optional)",
      "default": "number (optional)",
      "return_rank": "boolean (optional)"
    }
  }
}
{
  "RankExpr": {
    "$op": ...
  }
}
OperatorFormat
$sum["RankExpr", "RankExpr", "... (min 2)"]
$mul["RankExpr", "RankExpr", "... (min 2)"]
$max["RankExpr", "RankExpr", "... (min 2)"]
$min["RankExpr", "RankExpr", "... (min 2)"]
$sub (l-r){ "left": "RankExpr", "right": "RankExpr" }
$div (l/r){ "left": "RankExpr", "right": "RankExpr" }
$abs"RankExpr"
$exp (ex)"RankExpr"
$log (Natural logarithm)"RankExpr"

group_by

group_by can be omitted or provided as a dictionary with both keys and aggregate.
{
  "group_by": {
    "keys": ["metadata_field", "... (min 1)"],
    "aggregate": {
      "$min_k": { // Or $max_k
        "keys": ["metadata_field_or_#score", "... (min 1)"],
        "k": "positive integer"
      }
    }
  }
}

limit

Controls pagination.
{
  "limit": {
    "limit": 10, (optional, default 0)
    "offset": 20 (optional)
  }
}

select

Controls returned fields. Use built-ins (#id, #document, #embedding, #metadata, #score) and/or metadata field names.
{
  "select": {
    "keys": ["#id", "#document", "#metadata", "#score", "author"]
  }
}