Installation

pip install chromadb

Clients

EphemeralClient

Creates an in-memory instance of Chroma. This is useful for testing and development, but not recommended for production use.
settings
Optional[Settings]
tenant
str
The tenant to use for this client. Defaults to the default tenant.
database
str
The database to use for this client. Defaults to the default database.

PersistentClient

Creates a persistent instance of Chroma that saves to disk. This is useful for testing and development, but not recommended for production use.
path
Union[str, Path]
The directory to save Chroma’s data to. Defaults to “./chroma”.
settings
Optional[Settings]
tenant
str
The tenant to use for this client. Defaults to the default tenant.
database
str
The database to use for this client. Defaults to the default database.

HttpClient

Creates a client that connects to a remote Chroma server. This supports many clients connecting to the same server, and is the recommended way to use Chroma in production.
host
str
The hostname of the Chroma server. Defaults to “localhost”.
port
int
The port of the Chroma server. Defaults to 8000.
ssl
bool
Whether to use SSL to connect to the Chroma server. Defaults to False.
headers
Optional[Dict[str, str]]
A dictionary of headers to send to the Chroma server. Defaults to {}.
settings
Optional[Settings]
A dictionary of settings to communicate with the Chroma server.
tenant
str
The tenant to use for this client. Defaults to the default tenant.
database
str
The database to use for this client. Defaults to the default database.
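
As a rough sketch of how these parameters fit together, the helper below assembles the keyword arguments for HttpClient from a mapping such as os.environ. The CHROMA_HOST, CHROMA_PORT, CHROMA_SSL, and CHROMA_TOKEN names are hypothetical, and so is the use of a bearer token in an Authorization header; only host, port, ssl, and headers come from the parameter list above.

```python
import os
from typing import Any, Dict, Mapping


def http_client_kwargs(env: Mapping[str, str]) -> Dict[str, Any]:
    """Build HttpClient keyword arguments from environment-style settings.

    The CHROMA_* variable names are hypothetical; adapt them to your setup.
    """
    kwargs: Dict[str, Any] = {
        "host": env.get("CHROMA_HOST", "localhost"),  # documented default
        "port": int(env.get("CHROMA_PORT", "8000")),  # documented default
        "ssl": env.get("CHROMA_SSL", "false").lower() in ("1", "true", "yes"),
    }
    token = env.get("CHROMA_TOKEN")
    if token:
        # Assumption: the server is configured to accept a bearer token here.
        kwargs["headers"] = {"Authorization": f"Bearer {token}"}
    return kwargs


def connect(env: Mapping[str, str] = os.environ):
    """Create the client. Requires chromadb and a reachable Chroma server."""
    import chromadb  # imported lazily so the helper above works without it

    return chromadb.HttpClient(**http_client_kwargs(env))
```

Keeping the argument assembly separate from the connection makes the configuration logic easy to check without a running server.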

AsyncHttpClient

Creates an async client that connects to a remote Chroma server. This supports many clients connecting to the same server, and is the recommended way to use Chroma in production.
host
str
The hostname of the Chroma server. Defaults to “localhost”.
port
int
The port of the Chroma server. Defaults to 8000.
ssl
bool
Whether to use SSL to connect to the Chroma server. Defaults to False.
headers
Optional[Dict[str, str]]
A dictionary of headers to send to the Chroma server. Defaults to {}.
settings
Optional[Settings]
A dictionary of settings to communicate with the Chroma server.
tenant
str
The tenant to use for this client. Defaults to the default tenant.
database
str
The database to use for this client. Defaults to the default database.
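
A minimal sketch of using the async client, assuming a Chroma server is listening on localhost:8000. The is_alive helper is hypothetical; it only wraps the heartbeat method described under Client Methods. Note that AsyncHttpClient is itself awaited to obtain the client.

```python
import asyncio


async def is_alive(client) -> bool:
    """Return True if heartbeat() answers with a positive timestamp."""
    try:
        nanos = await client.heartbeat()
    except Exception:
        # Any failure (connection refused, timeout, ...) counts as not alive.
        return False
    return nanos > 0


async def main() -> None:
    import chromadb  # lazy import: needs chromadb and a running server

    # The factory is a coroutine, so it must be awaited.
    client = await chromadb.AsyncHttpClient(host="localhost", port=8000)
    print("server alive:", await is_alive(client))
```

Run it with asyncio.run(main()) once a server is available.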

CloudClient

Creates a client to connect to a tenant and database on Chroma cloud.
tenant
Optional[str]
The tenant to use for this client. Optional. If not provided, it will be inferred from the API key if the key is scoped to a single tenant. If provided, it will be validated against the API key’s scope.
database
Optional[str]
The database to use for this client. Optional. If not provided, it will be inferred from the API key if the key is scoped to a single database. If provided, it will be validated against the API key’s scope.
api_key
Optional[str]
The API key to use for this client.
settings
Optional[Settings]
cloud_host
str
cloud_port
int
enable_ssl
bool

AdminClient

Creates an admin client that can be used to create tenants and databases.
settings
Settings

Client Methods

heartbeat

Get the current time in nanoseconds since epoch. Used to check if the server is alive.
Returns: The current time in nanoseconds since epoch.

list_collections

List all collections.
limit
Optional[int]
The maximum number of entries to return. Defaults to None.
offset
Optional[int]
The number of entries to skip before returning. Defaults to None.
Returns: A list of collections
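
The limit and offset parameters can be combined to page through a large number of collections. The generator below is a sketch of that pattern; iter_collections and page_size are hypothetical names, and client can be any object exposing list_collections(limit=..., offset=...).

```python
from typing import Any, Iterator


def iter_collections(client: Any, page_size: int = 100) -> Iterator[Any]:
    """Yield every collection by paging through list_collections()."""
    offset = 0
    while True:
        page = client.list_collections(limit=page_size, offset=offset)
        if not page:
            break  # empty page: nothing left
        yield from page
        if len(page) < page_size:
            break  # short page: this was the last one
        offset += len(page)
```

With a real client this would be used as: for collection in iter_collections(client): ...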

count_collections

Count the number of collections.
Returns: The number of collections.

create_collection

Create a new collection with the given name and metadata.
name
str
required
The name of the collection to create.
schema
Optional[Schema]
configuration
Optional[CreateCollectionConfiguration]
metadata
Optional[Dict[str, Any]]
Optional metadata to associate with the collection.
embedding_function
Optional[EmbeddingFunction[Optional[Embeddings]]]
Optional function to use to embed documents. Uses the default embedding function if not provided.
data_loader
Optional[DataLoader[Optional[Embeddings]]]
Optional function to use to load records (documents, images, etc.)
get_or_create
bool
If True, return the existing collection if it exists.
Returns: The newly created collection.
Raises:
  • ValueError: If the collection already exists and get_or_create is False.
  • ValueError: If the collection name is invalid.

get_collection

Get a collection with the given name.
name
str
required
The name of the collection to get
embedding_function
Optional[EmbeddingFunction[Optional[Embeddings]]]
Optional function to use to embed documents. Uses the default embedding function if not provided.
data_loader
Optional[DataLoader[Optional[Embeddings]]]
Optional function to use to load records (documents, images, etc.)
Returns: The collection.
Raises:
  • ValueError: If the collection does not exist.

get_or_create_collection

Get or create a collection with the given name and metadata.
Args:
  • name: The name of the collection to get or create.
  • metadata: Optional metadata to associate with the collection. If the collection already exists, the metadata provided is ignored. If the collection does not exist, the new collection will be created with the provided metadata.
  • embedding_function: Optional function to use to embed documents.
  • data_loader: Optional function to use to load records (documents, images, etc.).
Returns: The collection.
Examples:
client.get_or_create_collection("my_collection")
# collection(name="my_collection", metadata={})
name
str
required
schema
Optional[Schema]
configuration
Optional[CreateCollectionConfiguration]
metadata
Optional[Dict[str, Any]]
embedding_function
Optional[EmbeddingFunction[Optional[Embeddings]]]
data_loader
Optional[DataLoader[Optional[Embeddings]]]

delete_collection

Delete a collection with the given name.
name
str
required
The name of the collection to delete.
Raises:
  • ValueError: If the collection does not exist.

reset

Resets the database. This will delete all collections and entries.
Returns: True if the database was reset successfully.

get_version

Get the version of Chroma.
Returns: The version of Chroma.

get_settings

Get the settings used to initialize.
Returns: The settings used to initialize.

get_max_batch_size

Return the maximum number of records that can be created or mutated in a single call.
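
One way to respect this limit is to split a large write into chunks. The sketch below assumes client exposes get_max_batch_size() and collection exposes add(ids=..., embeddings=...); add_in_batches itself is a hypothetical helper, not part of Chroma.

```python
from typing import Any, Sequence


def add_in_batches(
    client: Any,
    collection: Any,
    ids: Sequence[str],
    embeddings: Sequence[Sequence[float]],
) -> int:
    """Add records in chunks no larger than client.get_max_batch_size().

    Returns the number of add() calls that were made.
    """
    if len(ids) != len(embeddings):
        raise ValueError("ids and embeddings must have the same length")
    batch_size = client.get_max_batch_size()
    calls = 0
    for start in range(0, len(ids), batch_size):
        end = start + batch_size
        collection.add(
            ids=list(ids[start:end]),
            embeddings=[list(e) for e in embeddings[start:end]],
        )
        calls += 1
    return calls
```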

Admin Client Methods

create_tenant

Create a new tenant. Raises an error if the tenant already exists.
name
str
required

get_tenant

Get a tenant. Raises an error if the tenant does not exist.
name
str
required

create_database

Create a new database. Raises an error if the database already exists.
name
str
required
tenant
str

get_database

Get a database. Raises an error if the database does not exist.
name
str
required
tenant
str
The tenant of the database to get.

delete_database

Delete a database. Raises an error if the database does not exist.
name
str
required
tenant
str
The tenant of the database to delete.

list_databases

List all databases for a tenant. Raises an error if the tenant does not exist.
limit
Optional[int]
offset
Optional[int]
tenant
str
The tenant to list databases for.

Collection Methods

count

The total number of embeddings added to the database.
Returns: The total number of embeddings added to the database.

add

Add embeddings to the data store.
ids
Union[str, IDs]
required
The ids of the embeddings you wish to add
embeddings
Optional[Embeddings]
The embeddings to add. If None, embeddings will be computed based on the documents or images using the embedding_function set for the Collection. Optional.
metadatas
Union[Metadata, List[Metadata], None]
The metadata to associate with the embeddings. When querying, you can filter on this metadata. Optional.
documents
Union[str, IDs, None]
The documents to associate with the embeddings. Optional.
images
Optional[Embeddings]
The images to associate with the embeddings. Optional.
uris
Union[str, IDs, None]
The uris of the images to associate with the embeddings. Optional.
Returns: None
Raises:
  • ValueError: If you don’t provide either embeddings or documents
  • ValueError: If the length of ids, embeddings, metadatas, or documents don’t match
  • ValueError: If you don’t provide an embedding function and don’t provide embeddings
  • ValueError: If you provide both embeddings and documents
  • ValueError: If you provide an id that already exists
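
Some of the conditions above can be checked locally before calling add. The following is a sketch of such a pre-flight check, not Chroma's own validation: check_add_args is a hypothetical helper, it covers only embeddings, metadatas, and documents, and it can only detect duplicate ids within the batch, not ids already stored. The demo function shows the check next to a real add call with explicit embeddings and requires chromadb.

```python
from typing import Optional, Sequence


def check_add_args(
    ids: Sequence[str],
    embeddings: Optional[Sequence[Sequence[float]]] = None,
    metadatas: Optional[Sequence[dict]] = None,
    documents: Optional[Sequence[str]] = None,
) -> None:
    """Raise ValueError for argument problems that can be detected locally."""
    if embeddings is None and documents is None:
        raise ValueError("provide embeddings or documents")
    if len(set(ids)) != len(ids):
        raise ValueError("ids must be unique within a batch")
    columns = (
        ("embeddings", embeddings),
        ("metadatas", metadatas),
        ("documents", documents),
    )
    for label, values in columns:
        if values is not None and len(values) != len(ids):
            raise ValueError(
                f"{label} has {len(values)} entries but ids has {len(ids)}"
            )


def demo() -> None:
    import chromadb  # lazy import: only needed for the real call

    client = chromadb.EphemeralClient()
    collection = client.create_collection("demo")
    args = dict(
        ids=["a", "b"],
        embeddings=[[0.1, 0.2], [0.3, 0.4]],
        metadatas=[{"color": "red"}, {"color": "blue"}],
    )
    check_add_args(**args)
    collection.add(**args)
    print(collection.count())
```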

get

Get embeddings and their associated data from the data store. If no ids or where filter is provided, returns all embeddings up to limit, starting at offset.
ids
Union[str, IDs, None]
The ids of the embeddings to get. Optional.
where
Optional[Where]
A Where type dict used to filter results by. E.g. {"$and": [{"color" : "red"}, {"price": {"$gte": 4.20}}]}. Optional.
limit
Optional[int]
The number of documents to return. Optional.
offset
Optional[int]
The offset to start returning results from. Useful for paging results with limit. Optional.
where_document
Optional[Dict[Where, Union[str, List[Dict[Where, Union[str, List[WhereDocument]]]]]]]
A WhereDocument type dict used to filter by the documents. E.g. {"$contains": "hello"}. Optional.
include
List[Literal[documents, embeddings, metadatas, distances, uris, data]]
A list of what to include in the results. Can contain "embeddings", "metadatas", "documents". Ids are always included. Defaults to ["metadatas", "documents"]. Optional.
Returns: A GetResult object containing the results.
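
To make the where syntax concrete, here is a small pure-Python evaluator for a subset of the filter operators. It is only an illustration of how such a filter reads, not how Chroma evaluates it: matches is a hypothetical function, and treating a missing metadata key as a non-match is an assumption.

```python
from typing import Any, Dict

_COMPARATORS = {
    "$eq": lambda a, b: a == b,
    "$ne": lambda a, b: a != b,
    "$gt": lambda a, b: a > b,
    "$gte": lambda a, b: a >= b,
    "$lt": lambda a, b: a < b,
    "$lte": lambda a, b: a <= b,
    "$in": lambda a, b: a in b,
    "$nin": lambda a, b: a not in b,
}


def matches(metadata: Dict[str, Any], where: Dict[str, Any]) -> bool:
    """Return True if metadata satisfies the where filter (subset only)."""
    for key, condition in where.items():
        if key == "$and":
            if not all(matches(metadata, sub) for sub in condition):
                return False
        elif key == "$or":
            if not any(matches(metadata, sub) for sub in condition):
                return False
        else:
            if key not in metadata:
                return False  # assumption: a missing key never matches
            value = metadata[key]
            if isinstance(condition, dict):
                # Operator form, e.g. {"price": {"$gte": 4.20}}
                ok = all(
                    _COMPARATORS[op](value, arg) for op, arg in condition.items()
                )
                if not ok:
                    return False
            elif value != condition:
                # Shorthand form, e.g. {"color": "red"} means equality
                return False
    return True
```

With the example filter from the where parameter, a record with color "red" and price 5.0 matches, while one with price 4.0 does not.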

peek

Get the first few results in the database, up to limit.
limit
int
The number of results to return.
Returns: A GetResult object containing the results.

query

Get the n_results nearest neighbor embeddings for provided query_embeddings or query_texts.
query_embeddings
Optional[Embeddings]
The embeddings to get the closest neighbors of. Optional.
query_texts
Union[str, IDs, None]
The document texts to get the closest neighbors of. Optional.
query_images
Optional[Embeddings]
The images to get the closest neighbors of. Optional.
query_uris
Union[str, IDs, None]
The URIs to be used with data loader. Optional.
ids
Union[str, IDs, None]
A subset of ids to search within. Optional.
n_results
int
The number of neighbors to return for each query_embedding or query_texts. Optional.
where
Optional[Where]
A Where type dict used to filter results by. E.g. {"$and": [{"color" : "red"}, {"price": {"$gte": 4.20}}]}. Optional.
where_document
Optional[Dict[Where, Union[str, List[Dict[Where, Union[str, List[WhereDocument]]]]]]]
A WhereDocument type dict used to filter by the documents. E.g. {"$contains": "hello"}. Optional.
include
List[Literal[documents, embeddings, metadatas, distances, uris, data]]
A list of what to include in the results. Can contain "embeddings", "metadatas", "documents", "distances". Ids are always included. Defaults to ["metadatas", "documents", "distances"]. Optional.
Returns: A QueryResult object containing the results.
Raises:
  • ValueError: If you don’t provide either query_embeddings, query_texts, or query_images
  • ValueError: If you provide both query_embeddings and query_texts
  • ValueError: If you provide both query_embeddings and query_images
  • ValueError: If you provide both query_texts and query_images
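
Query results are nested one level deeper than get results, with one inner list per query. The sketch below pairs each returned id with its distance. It assumes the result can be indexed like a dict using the field names listed under QueryResult; nearest_per_query is a hypothetical helper and the sample values are made up.

```python
from typing import Any, Dict, List, Tuple


def nearest_per_query(result: Dict[str, Any]) -> List[List[Tuple[str, float]]]:
    """Pair each returned id with its distance, one inner list per query."""
    distances = result.get("distances")
    if distances is None:
        raise ValueError('query() must be called with "distances" in include')
    return [
        list(zip(ids, dists)) for ids, dists in zip(result["ids"], distances)
    ]


# Hand-written sample in the shape of a QueryResult for two queries.
# A real one would come from something like:
#   collection.query(query_embeddings=[[0.1, 0.2], [0.9, 0.8]], n_results=2)
sample = {
    "ids": [["a", "b"], ["c", "a"]],
    "distances": [[0.01, 0.25], [0.02, 0.40]],
}
pairs = nearest_per_query(sample)
```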

modify

Modify the collection name or metadata
name
Optional[str]
The updated name for the collection. Optional.
metadata
Optional[Dict[str, Any]]
The updated metadata for the collection. Optional.
configuration
Optional[UpdateCollectionConfiguration]
Returns: None

update

Update the embeddings, metadatas or documents for provided ids.
ids
Union[str, IDs]
required
The ids of the embeddings to update
embeddings
Optional[Embeddings]
The embeddings to update. If None, embeddings will be computed based on the documents or images using the embedding_function set for the Collection. Optional.
metadatas
Union[Metadata, List[Metadata], None]
The metadata to associate with the embeddings. When querying, you can filter on this metadata. Optional.
documents
Union[str, IDs, None]
The documents to associate with the embeddings. Optional.
images
Optional[Embeddings]
The images to associate with the embeddings. Optional.
uris
Union[str, IDs, None]
Returns: None

upsert

Update the embeddings, metadatas or documents for provided ids, or create them if they don’t exist.
ids
Union[str, IDs]
required
The ids of the embeddings to update
embeddings
Optional[Embeddings]
The embeddings to add. If None, embeddings will be computed based on the documents using the embedding_function set for the Collection. Optional.
metadatas
Union[Metadata, List[Metadata], None]
The metadata to associate with the embeddings. When querying, you can filter on this metadata. Optional.
documents
Union[str, IDs, None]
The documents to associate with the embeddings. Optional.
images
Optional[Embeddings]
uris
Union[str, IDs, None]
Returns: None

delete

Delete the embeddings based on ids and/or a where filter
ids
Optional[IDs]
The ids of the embeddings to delete
where
Optional[Where]
A Where type dict used to filter the deletion by. E.g. {"$and": [{"color" : "red"}, {"price": {"$gte": 4.20}}]}. Optional.
where_document
Optional[Dict[Where, Union[str, List[Dict[Where, Union[str, List[WhereDocument]]]]]]]
A WhereDocument type dict used to filter the deletion by the document content. E.g. {"$contains": "hello"}. Optional.
Returns: None
Raises:
  • ValueError: If you don’t provide either ids, where, or where_document

Embedding Functions

EmbeddingFunction

A protocol for embedding functions. To implement a new embedding function,
Methods __init__(), build_from_config(), default_space(), embed_query(), embed_with_retries(), get_config(), is_legacy(), name(), supported_spaces(), validate_config(), validate_config_update()
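
As an illustration of the general shape of an embedding function, here is a toy, duck-typed class that hashes each document into a fixed-length vector. Several things here are assumptions: that the object is invoked through __call__ with a list of documents (this method is not in the list above), the exact signatures of name, get_config, and build_from_config, and the class name HashEmbeddingSketch. In real use you would likely subclass Chroma's EmbeddingFunction rather than rely on duck typing, and the vectors carry no semantic meaning.

```python
import hashlib
from typing import Any, Dict, List


class HashEmbeddingSketch:
    """Toy embedding function: deterministic, not semantically meaningful."""

    def __init__(self, dim: int = 8) -> None:
        if not 1 <= dim <= 32:
            raise ValueError("dim must be between 1 and 32 (SHA-256 has 32 bytes)")
        self.dim = dim

    def __call__(self, input: List[str]) -> List[List[float]]:
        # One vector per document; each digest byte is scaled into [0, 1].
        vectors = []
        for text in input:
            digest = hashlib.sha256(text.encode("utf-8")).digest()
            vectors.append([b / 255.0 for b in digest[: self.dim]])
        return vectors

    @staticmethod
    def name() -> str:
        return "hash_embedding_sketch"

    def get_config(self) -> Dict[str, Any]:
        return {"dim": self.dim}

    @staticmethod
    def build_from_config(config: Dict[str, Any]) -> "HashEmbeddingSketch":
        return HashEmbeddingSketch(dim=config["dim"])
```

The get_config and build_from_config pair is meant to round-trip, so a function rebuilt from its own config produces the same vectors.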

SparseEmbeddingFunction

A protocol for sparse vector functions. To implement a new sparse vector function,
Methods __init__(), build_from_config(), embed_query(), embed_with_retries(), get_config(), name(), validate_config(), validate_config_update()

Types

Embedding

Embedding[Tuple[Any, Ellipsis], dtype[Union[int32, float32]]]

SparseVector

Represents a sparse vector using parallel arrays for indices and values.
Properties
indices
List[int]
values
List[float]
labels
Optional[IDs]
Methods __init__(), from_dict(), to_dict()
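
To show what "parallel arrays" means in practice, the sketch below stores only the non-zero positions and their values, and computes a dot product without building dense vectors. SparseVecSketch is a hypothetical stand-in, not Chroma's SparseVector; the dot and to_dense methods are illustrative additions and do not appear in the method list above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SparseVecSketch:
    indices: List[int]   # positions of the non-zero entries
    values: List[float]  # values[i] belongs to position indices[i]

    def __post_init__(self) -> None:
        if len(self.indices) != len(self.values):
            raise ValueError("indices and values must have the same length")

    def dot(self, other: "SparseVecSketch") -> float:
        # Only positions present in both vectors contribute to the sum.
        lookup = dict(zip(other.indices, other.values))
        return sum(v * lookup.get(i, 0.0) for i, v in zip(self.indices, self.values))

    def to_dense(self, dim: int) -> List[float]:
        dense = [0.0] * dim
        for i, v in zip(self.indices, self.values):
            dense[i] = v
        return dense


a = SparseVecSketch(indices=[0, 3, 7], values=[1.0, 2.0, 3.0])
b = SparseVecSketch(indices=[3, 7, 9], values=[4.0, 0.5, 1.0])
overlap = a.dot(b)  # positions 3 and 7 overlap: 2.0 * 4.0 + 3.0 * 0.5
```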

Schema

Collection schema for configuring indexes and encryption.
Properties
defaults
ValueTypes
keys
Dict[str, ValueTypes]
cmek
Optional[Cmek]
Methods __init__(), create_index(), delete_index(), deserialize_from_json(), serialize_to_json(), set_cmek()

Search

Payload for hybrid search operations.
Methods __init__(), group_by(), limit(), rank(), select(), select_all(), to_dict(), where()

Select

Selection configuration for search results.
Properties
keys
Set[Union[Key, str]]
Methods __init__(), from_dict(), to_dict()

Knn

KNN-based ranking.
Properties
query
Optional[Embeddings]
key
Union[Key, str]
limit
int
default
Optional[float]
return_rank
bool
Methods __init__(), abs(), exp(), from_dict(), log(), max(), min(), to_dict()

Rrf

Reciprocal Rank Fusion for combining multiple ranking strategies.
Properties
ranks
List[Rank]
k
int
weights
Optional[List[float]]
normalize
bool
Methods __init__(), abs(), exp(), from_dict(), log(), max(), min(), to_dict()
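
For readers unfamiliar with Reciprocal Rank Fusion, the sketch below implements the textbook form: each ranking contributes weight / (k + rank) to a document's score, with ranks starting at 1. This is not Chroma's implementation, and its sign convention, rank base, and normalization may differ. rrf_scores is a hypothetical function, k=60 is the constant commonly used in the RRF literature, and reading normalize as "scale the weights to sum to 1" is an assumption.

```python
from typing import Dict, List, Optional, Sequence


def rrf_scores(
    rankings: Sequence[Sequence[str]],
    k: int = 60,
    weights: Optional[List[float]] = None,
    normalize: bool = False,
) -> Dict[str, float]:
    """Fuse several ranked id lists; higher score means better in this sketch."""
    if weights is None:
        weights = [1.0] * len(rankings)
    if len(weights) != len(rankings):
        raise ValueError("need exactly one weight per ranking")
    if normalize:
        total = sum(weights)
        weights = [w / total for w in weights]
    scores: Dict[str, float] = {}
    for weight, ranking in zip(weights, rankings):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    return scores


# Two rankings of the same corpus, e.g. one dense and one sparse.
fused = rrf_scores([["a", "b", "c"], ["b", "c"]])
best = max(fused, key=fused.get)
```

A document that appears near the top of several rankings accumulates more score than one that tops a single ranking, which is why "b" wins here.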

GetResult

Properties
ids
IDs
embeddings
Optional[Embeddings]
documents
Optional[IDs]
uris
Optional[IDs]
data
Optional[Optional[Embeddings]]
metadatas
Optional[List[Metadata]]
included
List[Literal[documents, embeddings, metadatas, distances, uris, data]]

QueryResult

Properties
ids
List[IDs]
embeddings
Optional[Embeddings]
documents
Optional[List[IDs]]
uris
Optional[List[IDs]]
data
Optional[List[Optional[Embeddings]]]
metadatas
Optional[List[List[Metadata]]]
distances
Optional[List[List[float]]]
included
List[Literal[documents, embeddings, metadatas, distances, uris, data]]

SearchResult

Column-major response from the search API with conversion methods.
Properties
ids
List[IDs]
documents
List[Optional[List[Optional[str]]]]
embeddings
List[Optional[List[Optional[List[float]]]]]
metadatas
List[Optional[List[Optional[Dict[str, Any]]]]]
scores
List[Optional[List[Optional[float]]]]
select
List[IDs]
Methods rows()
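
"Column-major" here means each field is stored as its own list, indexed first by query and then by hit. The sketch below shows one way such columns could be turned into per-hit records. It is an assumption about the general idea, not a description of what rows() returns; to_rows and the record keys are hypothetical.

```python
from typing import Any, Dict, List, Optional


def _column(columns: Optional[List[Optional[list]]], query: int, n: int) -> list:
    """Return the column for one query, or n Nones if it was not selected."""
    col = columns[query] if columns is not None else None
    return col if col is not None else [None] * n


def to_rows(
    ids: List[List[str]],
    documents: Optional[List[Optional[list]]] = None,
    scores: Optional[List[Optional[list]]] = None,
) -> List[List[Dict[str, Any]]]:
    """Convert column-major search output into one list of records per query."""
    result = []
    for q, query_ids in enumerate(ids):
        docs = _column(documents, q, len(query_ids))
        scs = _column(scores, q, len(query_ids))
        result.append(
            [
                {"id": i, "document": d, "score": s}
                for i, d, s in zip(query_ids, docs, scs)
            ]
        )
    return result


rows = to_rows(
    ids=[["a", "b"], ["c"]],
    documents=[["doc a", "doc b"], None],  # second query has no documents
    scores=[[0.1, 0.2], [0.3]],
)
```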