Sharding

Distributed Chroma shards data across collections. Each collection has its own cold starts and rate limits, so the workload on one collection does not interfere with the workload on another. If your data can be sharded, you are strongly encouraged to shard it: doing so usually costs less and performs better. For example, an AI platform that uses Chroma to store customers’ isolated knowledge bases should put each customer’s data in its own collection.
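The per-customer layout described above could be sketched as follows. This is a minimal illustration, not a prescribed pattern: `collection_name_for`, `get_customer_collection`, and the `kb` prefix are hypothetical names, and the naming rules in the comment are an assumption that should be checked against your Chroma version. The only Chroma call used is `get_or_create_collection` on a client object.

```python
import re

# Assumption: collection names must be at least 3 characters, start and end
# with a lowercase letter or digit, and use only [a-z0-9._-] in between.
# This helper is deliberately stricter and also replaces dots.
_INVALID = re.compile(r"[^a-z0-9_-]+")


def collection_name_for(customer_id: str, prefix: str = "kb") -> str:
    """Map a customer ID to a deterministic, conservative collection name."""
    slug = _INVALID.sub("-", customer_id.strip().lower()).strip("-_")
    if not slug:
        raise ValueError(f"cannot derive a collection name from {customer_id!r}")
    return f"{prefix}-{slug}"


def get_customer_collection(client, customer_id: str):
    """Return the collection that holds only this customer's data.

    `client` is any object exposing get_or_create_collection(name=...),
    such as a Chroma client.
    """
    return client.get_or_create_collection(name=collection_name_for(customer_id))
```

Every read and write for a customer then goes through `get_customer_collection(client, customer_id)`, so one customer's traffic stays in its own collection. Because the helper normalizes IDs, two different IDs can map to the same name (for example `Acme Corp` and `acme-corp`); passing a stable internal identifier avoids that.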

Indexes

By default, Chroma builds indexes for all data, including full-text search (FTS) and regex search on the document, as well as inverted indexes on all metadata values. These indexes add overhead to every write. If you do not use FTS or regex search, or do not filter by a given metadata value, you can disable the corresponding indexes using the Schema.