The Key/K Class
The Key class (aliased as K for brevity) provides a fluent interface for building filter expressions. Use K to reference document fields, IDs, and metadata properties.
Filterable Fields
| Field | Usage | Description |
|---|---|---|
| K.ID | K.ID.is_in(["id1", "id2"]) | Filter by document IDs |
| K.DOCUMENT | K.DOCUMENT.contains("text") | Filter by document content |
| K("field_name") | K("status") == "active" | Filter by any metadata field |
Comparison Operators
Supported operators:
- == - Equality (all types: string, numeric, boolean)
- != - Inequality (all types: string, numeric, boolean)
- > - Greater than (numeric only)
- >= - Greater than or equal (numeric only)
- < - Less than (numeric only)
- <= - Less than or equal (numeric only)
Chroma supports three data types for metadata: strings, numbers (int/float), and booleans. Order comparison operators (>, <, >=, <=) currently only work with numeric types.
Set and String Operators
Supported operators:
- is_in() - Value matches any in the list
- not_in() - Value doesn’t match any in the list
- contains() - On K.DOCUMENT: substring search (case-sensitive). On metadata fields: checks if an array contains a scalar value.
- not_contains() - On K.DOCUMENT: excludes by substring. On metadata fields: checks that an array does not contain a scalar value.
- regex() - String matches regex pattern (currently K.DOCUMENT only)
- not_regex() - String doesn’t match regex pattern (currently K.DOCUMENT only)
String operations like contains() and regex() on K.DOCUMENT are case-sensitive by default. When used on metadata fields, contains() checks array membership rather than substring matching. The is_in() operator is efficient even with large lists.
Array Metadata
Chroma supports storing arrays of values in metadata fields. You can use contains() / not_contains() (or $contains / $not_contains in dictionary syntax) to filter records based on whether an array includes a specific scalar value.
Storing Array Metadata
Arrays can contain strings, numbers, or booleans. All elements in an array must be the same type. Empty arrays are not allowed.
Filtering Arrays
Use contains() to check if a metadata array includes a value, and not_contains() to check that it does not.
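
In dictionary syntax, the array checks look like the filters below. This is a sketch only: the field name tags and the sample records are hypothetical, and the array_contains helper is a local stand-in that models the membership rule. It does not query Chroma.

```python
def array_contains(metadata: dict, field: str, value) -> bool:
    """Local model of $contains on an array metadata field (hypothetical helper)."""
    array = metadata.get(field)
    return isinstance(array, list) and value in array


# Dictionary-syntax filters using the operators named in this section.
has_tag = {"tags": {"$contains": "action"}}
lacks_tag = {"tags": {"$not_contains": "draft"}}

# Hypothetical records, used only to illustrate the membership check locally.
records = [
    {"id": "a", "tags": ["action", "comedy"]},
    {"id": "b", "tags": ["drama"]},
]
matching = [r["id"] for r in records if array_contains(r, "tags", "action")]
print(matching)  # ['a']
```
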
Supported Array Types
| Type | Python | TypeScript | Rust |
|---|---|---|---|
| String | ["a", "b"] | ["a", "b"] | MetadataValue::StringArray(...) |
| Integer | [1, 2, 3] | [1, 2, 3] | MetadataValue::IntArray(...) |
| Float | [1.5, 2.5] | [1.5, 2.5] | MetadataValue::FloatArray(...) |
| Boolean | [True, False] | [true, false] | MetadataValue::BoolArray(...) |
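
The array rules in this section (one element type per array, no empty arrays, no nested arrays) can be checked locally before records are written. The sketch below is a plain-Python pre-check, not part of Chroma's API. The helper name validate_metadata_array is hypothetical, and treating int and float as distinct element types is an assumption based on the separate rows in the table above.

```python
from typing import Any


def validate_metadata_array(values: Any) -> str:
    """Return the element type name of a metadata array, or raise ValueError.

    Hypothetical local pre-check mirroring the rules described in this
    section; it does not call Chroma.
    """
    if not isinstance(values, list):
        raise ValueError("expected a list")
    if not values:
        raise ValueError("empty arrays are not allowed")

    kinds = set()
    for item in values:
        # bool must be tested before int: in Python, bool is a subclass of int.
        if isinstance(item, bool):
            kinds.add("bool")
        elif isinstance(item, int):
            kinds.add("int")
        elif isinstance(item, float):
            kinds.add("float")
        elif isinstance(item, str):
            kinds.add("str")
        else:
            # Covers nested lists, dicts, None, and any other type.
            raise ValueError(f"unsupported element: {item!r}")

    if len(kinds) != 1:
        # Assumption: int and float are treated as different element types.
        raise ValueError(f"mixed element types: {sorted(kinds)}")
    return kinds.pop()


print(validate_metadata_array(["a", "b"]))     # str
print(validate_metadata_array([True, False]))  # bool
```
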
The $contains value must be a scalar that matches the array’s element type. All elements in an array must be the same type, and nested arrays are not supported.
Logical Operators
Supported operators:
- & - Logical AND (all conditions must match)
- | - Logical OR (any condition can match)
Always use parentheses around each condition when using logical operators. In Python, & and | bind more tightly than comparison operators such as == and >, so an unparenthesized expression is grouped differently than intended.
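
To see why the parentheses matter, Python's own parser can show how each form is grouped. The sketch below uses only the standard-library ast module and never evaluates the expressions, so K here is just a name inside a string. The field names status and score are illustrative.

```python
import ast

without_parens = 'K("status") == "active" & K("score") > 0.5'
with_parens = '(K("status") == "active") & (K("score") > 0.5)'


def top_level_node(expression: str) -> str:
    """Return the name of the outermost AST node for an expression."""
    return type(ast.parse(expression, mode="eval").body).__name__


# & binds tighter than == and >, so the first form becomes a chained
# comparison: K("status") == ("active" & K("score")) > 0.5
print(top_level_node(without_parens))  # Compare

# With parentheses, the outermost operation is the intended & of two comparisons.
print(top_level_node(with_parens))  # BinOp
```
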
Dictionary Syntax (MongoDB-style)
You can also use dictionary syntax instead of K expressions. This is useful when building filters programmatically. Supported dictionary operators:
- Direct value - Shorthand for equality
- $eq - Equality
- $ne - Not equal
- $gt - Greater than (numeric only)
- $gte - Greater than or equal (numeric only)
- $lt - Less than (numeric only)
- $lte - Less than or equal (numeric only)
- $in - Value in list
- $nin - Value not in list
- $contains - On #document: substring search. On metadata fields: array contains value.
- $not_contains - On #document: excludes by substring. On metadata fields: array does not contain value.
- $regex - Regex match
- $not_regex - Regex doesn’t match
- $and - Logical AND
- $or - Logical OR
Each dictionary can only contain one field or one logical operator ($and/$or). For field dictionaries, only one operator is allowed per field.
Common Filtering Patterns
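
One common pattern is assembling a dictionary filter from a list of conditions at runtime. The sketch below uses plain dictionaries and the dictionary operators listed above. The helper names field_filter and all_of and the field names are hypothetical. Each dictionary it produces holds exactly one field or one logical operator, in line with the rule stated above.

```python
from typing import Any

# Comparison and set operators from the dictionary-syntax list.
COMPARISON_OPS = {"$eq", "$ne", "$gt", "$gte", "$lt", "$lte", "$in", "$nin"}


def field_filter(field: str, op: str, value: Any) -> dict:
    """Build a single-field, single-operator filter dictionary."""
    if op not in COMPARISON_OPS:
        raise ValueError(f"unsupported operator: {op}")
    return {field: {op: value}}


def all_of(*filters: dict) -> dict:
    """Combine filters with $and; a lone filter is returned unchanged."""
    if not filters:
        raise ValueError("at least one filter is required")
    return filters[0] if len(filters) == 1 else {"$and": list(filters)}


where = all_of(
    field_filter("status", "$eq", "active"),
    field_filter("score", "$gte", 0.5),
    field_filter("category", "$in", ["news", "blog"]),
)
print(where)
```

Keeping every condition in its own dictionary means conditions can be appended or dropped without breaking the one-field-per-dictionary rule.
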
Edge Cases and Important Behavior
Missing Keys
When filtering on a metadata field that doesn’t exist for a document:
- Most operators (==, >, <, >=, <=, is_in()) evaluate to false - the document won’t match
- != evaluates to true - documents without the field are considered “not equal” to any value
- not_in() evaluates to true - documents without the field are not in any list
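
The missing-key rules above can be modeled in a few lines of plain Python. This is an illustrative local model, not Chroma's implementation. The function matches and the operator strings it accepts are hypothetical stand-ins for the K operators.

```python
from typing import Any

_MISSING = object()  # sentinel: the metadata field is absent


def matches(metadata: dict, field: str, op: str, value: Any) -> bool:
    """Model how one filter condition treats a possibly missing field."""
    actual = metadata.get(field, _MISSING)
    if actual is _MISSING:
        # Only the negated operators match documents that lack the field.
        return op in ("!=", "not_in")
    if op == "==":
        return actual == value
    if op == "!=":
        return actual != value
    if op == "is_in":
        return actual in value
    if op == "not_in":
        return actual not in value
    if op == ">":
        return actual > value
    if op == ">=":
        return actual >= value
    if op == "<":
        return actual < value
    if op == "<=":
        return actual <= value
    raise ValueError(f"unknown operator: {op}")


doc = {"status": "active"}  # hypothetical record with no "score" field
print(matches(doc, "score", ">", 0.5))              # False
print(matches(doc, "score", "!=", 0.5))             # True
print(matches(doc, "score", "not_in", [0.1, 0.2]))  # True
```

One consequence is that a != or not_in() filter also returns documents that never had the field. If that is not wanted, combine it with a positive condition on the same field.
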
Mixed Types
Avoid storing different data types under the same metadata key across documents. Query behavior is undefined when comparing values of different types.
String Pattern Matching Limitations
regex() and not_regex() only work on K.DOCUMENT. These operators do not yet support metadata fields.
contains() and not_contains() have different behavior depending on the field:
- On K.DOCUMENT: substring search (the pattern must have at least 3 literal characters)
- On metadata fields: array membership check (see Array Metadata above)

Patterns with fewer than 3 literal characters may return incorrect results. Substring and regex matching on metadata scalar fields is not currently supported. Full support is coming in a future release, which will allow users to opt in to additional indexes for string pattern matching on specific metadata fields.
Complete Example
Here’s a practical example combining different filter types:
Tips and Best Practices
- Use parentheses liberally when combining conditions with & and | to avoid precedence issues
- Filter before ranking when possible to reduce the number of vectors to score
- Be specific with ID filters - using K.ID.is_in() with a small list is very efficient
- String matching is case-sensitive - normalize your data if case-insensitive matching is needed
- Use the right operator - is_in() for multiple exact matches, contains() for substring search on K.DOCUMENT
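
Since matching is case-sensitive, one way to get case-insensitive equality on a metadata field is to store a lowercased copy next to the original and lowercase the query value as well. The sketch below is plain Python. The helper add_lowercase_fields and the _lower key suffix are hypothetical conventions, not Chroma features.

```python
def add_lowercase_fields(metadata: dict, fields: list[str]) -> dict:
    """Return a copy of metadata with '<field>_lower' companions added.

    Only string values are normalized; missing or non-string fields are skipped.
    """
    normalized = dict(metadata)
    for field in fields:
        value = metadata.get(field)
        if isinstance(value, str):
            normalized[f"{field}_lower"] = value.lower()
    return normalized


# Hypothetical record: normalize at write time.
record = add_lowercase_fields({"author": "Ada Lovelace", "year": 1843}, ["author"])
print(record)

# Lowercase the user's input the same way before building the filter.
where = {"author_lower": {"$eq": "ADA LOVELACE".lower()}}
print(where)
```
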
Next Steps
- Learn about ranking and scoring to order your filtered results
- See practical examples of filtering in real-world scenarios
- Explore batch operations for running multiple filtered searches