Distributed Chroma shards data across collections. Individual collections have
isolated cold starts and rate limits, which prevents the workload of one
collection from interfering with the workload of another.

If you have data that can be sharded, you are strongly encouraged to do so. It
will usually cost less and perform better. For example, if an AI platform is
using Chroma to store customers’ isolated knowledge bases, it should put each
customer’s data in its own collection.
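
As a minimal sketch of the per-customer layout described above, the helpers below route every write and query to a collection derived from the customer ID. The `kb-` prefix, the helper names, and the sanitizing rule are assumptions made for illustration, not part of Chroma. The `client` argument is assumed to be any Chroma client object exposing `get_or_create_collection`.

```python
import re


def collection_name_for(customer_id: str) -> str:
    """Map a customer ID to a collection name (hypothetical naming scheme)."""
    # Replace anything outside letters, digits, '_' and '-' with '-',
    # then trim separators so the name starts and ends with an alphanumeric.
    slug = re.sub(r"[^a-zA-Z0-9_-]", "-", customer_id).strip("-_")
    if not slug:
        raise ValueError("customer_id has no usable characters")
    return f"kb-{slug}"


def add_documents(client, customer_id, ids, documents):
    """Write documents into the collection that belongs to one customer."""
    collection = client.get_or_create_collection(
        name=collection_name_for(customer_id)
    )
    collection.add(ids=ids, documents=documents)
    return collection


def search(client, customer_id, query, n_results=5):
    """Query only the given customer's collection."""
    collection = client.get_or_create_collection(
        name=collection_name_for(customer_id)
    )
    return collection.query(query_texts=[query], n_results=n_results)
```

Because each call resolves the collection from the customer ID, one customer's traffic never touches another customer's collection, which is what keeps their cold starts and rate limits isolated.
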

By default, Chroma builds indexes for all data, including full-text and regex
search on the document, as well as inverted indexes on all metadata values.
These indexes add overhead when writing to Chroma.

If you are not using FTS or regex, or if you are not filtering by a metadata
value, you can disable these indexes using the Schema.