☁️ GCP Deployment

A Simple GCP Deployment#

You can deploy Chroma on a long-running server, and connect to it remotely.

For convenience, we have provided a very simple Terraform configuration to experiment with deploying Chroma to Google Compute Engine.

Step 1: Set up your GCP credentials#

In your GCP project, create a service account for deploying Chroma. It will need the following roles:

  • Service Account User
  • Compute Admin
  • Compute Network Admin
  • Storage Admin

Create a JSON key file for this service account, and download it. Set the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of your JSON key file:

shell
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/your/key.json"
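
A mistyped path in this variable tends to surface only later, as an authentication error during the Terraform run. As a minimal sketch of a local sanity check (the helper name check_service_account_key is hypothetical, and the field list is an assumption based on the usual layout of downloaded service account keys), something like the following could be run first:

```python
import json
import os
from pathlib import Path

# Fields normally present in a downloaded service account key (assumed layout).
REQUIRED_FIELDS = ("type", "project_id", "client_email", "private_key")


def check_service_account_key(path):
    """Return the key's project_id, or raise ValueError if the file looks wrong."""
    key_path = Path(path)
    if not key_path.is_file():
        raise ValueError(f"no key file at {key_path}")
    try:
        data = json.loads(key_path.read_text())
    except json.JSONDecodeError as exc:
        raise ValueError(f"{key_path} is not valid JSON") from exc
    if not isinstance(data, dict):
        raise ValueError(f"{key_path} does not contain a JSON object")
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"key file is missing fields: {missing}")
    if data["type"] != "service_account":
        raise ValueError(f"unexpected key type: {data['type']!r}")
    return data["project_id"]


if __name__ == "__main__":
    key_file = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if key_file is None:
        print("GOOGLE_APPLICATION_CREDENTIALS is not set")
    else:
        try:
            print("key belongs to project:", check_service_account_key(key_file))
        except ValueError as err:
            print("problem with key file:", err)
```

The check only inspects the file's shape. It does not confirm that the service account actually holds the roles listed above.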

Step 2: Install Terraform#

Download Terraform and follow the installation instructions for your OS.

Step 3: Configure your GCP Settings#

Create a chroma.tfvars file. Use it to define the following variables for your GCP project ID, region, and zone:

text

Step 4: Initialize and deploy with Terraform#

Download our GCP Terraform configuration to the same directory as your chroma.tfvars file. Then run the following commands to deploy your Chroma stack.

Initialize Terraform:

shell
terraform init

Plan the deployment, and review it to ensure it matches your expectations:

shell
terraform plan -var-file chroma.tfvars

If you did not customize our configuration, you should be deploying an e2-small instance.

Finally, apply the deployment:

shell
terraform apply -var-file chroma.tfvars
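
If you would rather script the init, plan, and apply steps, the sketch below shows one possible approach in Python. It saves the plan to a file and applies exactly that file, so what gets deployed is what was reviewed. The function names and the chroma.tfplan file name are hypothetical; the -var-file and -out flags are standard Terraform CLI options.

```python
import subprocess


def terraform_commands(var_file="chroma.tfvars", plan_file="chroma.tfplan"):
    """Build the argv lists for init, plan (saved to a file), and apply."""
    return [
        ["terraform", "init"],
        ["terraform", "plan", f"-var-file={var_file}", f"-out={plan_file}"],
        # Applying a saved plan file runs exactly the reviewed changes.
        ["terraform", "apply", plan_file],
    ]


def deploy(var_file="chroma.tfvars"):
    """Run init and plan, then apply only after an explicit confirmation."""
    init_cmd, plan_cmd, apply_cmd = terraform_commands(var_file)
    subprocess.run(init_cmd, check=True)
    subprocess.run(plan_cmd, check=True)  # prints the plan for review
    answer = input("Apply this plan? [y/N] ")
    if answer.strip().lower() == "y":
        subprocess.run(apply_cmd, check=True)
    else:
        print("Plan not applied.")
```

Calling deploy() from the directory that holds the Terraform configuration and chroma.tfvars runs the three steps in order and stops if any command fails.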

Customize the Stack (optional)#

If you want to use a machine type different from the default e2-small, in your chroma.tfvars add the machine_type variable and set it to your desired machine:

text

After a few minutes, you can get the IP address of your instance with:

shell
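
If you want to read the address programmatically, one option is to parse the output of terraform output -json, which prints every output of the stack as a JSON object. The sketch below assumes nothing about the output names defined in the Chroma configuration; it simply returns all of them, and the name example_ip in the usage note is a placeholder.

```python
import json
import subprocess


def parse_outputs(raw_json):
    """Map each Terraform output name to its value.

    `terraform output -json` wraps every output in an object that has
    "sensitive", "type", and "value" keys; only "value" is kept here.
    """
    return {name: entry["value"] for name, entry in json.loads(raw_json).items()}


def read_outputs():
    """Run `terraform output -json` in the current directory and parse it."""
    result = subprocess.run(
        ["terraform", "output", "-json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_outputs(result.stdout)
```

For instance, parse_outputs('{"example_ip": {"sensitive": false, "type": "string", "value": "203.0.113.10"}}') returns {"example_ip": "203.0.113.10"}.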

Step 5: Chroma Client Set-Up#

Once your Compute Engine instance is up and running with Chroma, all you need to do is configure your HttpClient to use the server's IP address and port 8000. Since you are running a Chroma server on GCP, our thin-client package may be enough for your application.

python
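
A minimal sketch of the client side, assuming the chromadb (or thin-client chromadb-client) package is installed and that the IP address shown is replaced with your instance's address. The helper names are hypothetical; HttpClient and heartbeat are the client calls being illustrated.

```python
import ipaddress


def http_client_kwargs(instance_ip, port=8000):
    """Validate the instance address and build the HttpClient arguments."""
    ipaddress.ip_address(instance_ip)  # raises ValueError for a malformed IP
    return {"host": instance_ip, "port": port}


def connect(instance_ip):
    """Connect to the remote Chroma server and confirm that it responds."""
    import chromadb  # imported lazily so the helper above has no dependencies

    client = chromadb.HttpClient(**http_client_kwargs(instance_ip))
    client.heartbeat()  # raises if the server cannot be reached
    return client


if __name__ == "__main__":
    # 203.0.113.10 is a documentation-only placeholder address.
    print(http_client_kwargs("203.0.113.10"))
```

Calling connect with your instance's IP address returns a client that can be used like a local Chroma client.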

Step 6: Clean Up (optional)#

To destroy the stack and remove all GCP resources, use the terraform destroy command.

shell
terraform destroy -var-file chroma.tfvars

Authentication with GCP#

By default, the Compute Engine instance created by our Terraform configuration runs with no authentication. There are many ways to secure your Chroma instance on GCP. This guide uses a simple setup based on Chroma's native authentication support.

You can learn more about authentication with Chroma in the Auth Guide.

Static API Token Authentication#

Customize Chroma's Terraform Configuration#

If, for example, you want the static API token to be "test-token", set the following variables in your chroma.tfvars. This will set Authorization: Bearer test-token as your authentication header.

text

To use an X-Chroma-Token: test-token authentication header instead, set the ChromaAuthTokenTransportHeader parameter:

text
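
To make the difference between the two header styles concrete, the sketch below builds the header a client would send in each case. The function auth_header is a hypothetical illustration written for this guide, not part of Chroma.

```python
def auth_header(token, transport_header="Authorization"):
    """Return the HTTP header that carries the static API token."""
    if transport_header == "Authorization":
        # Default style: the token is sent as a bearer credential.
        return {"Authorization": f"Bearer {token}"}
    # Custom header style: the token is sent as the header's raw value.
    return {transport_header: token}


if __name__ == "__main__":
    print(auth_header("test-token"))
    # {'Authorization': 'Bearer test-token'}
    print(auth_header("test-token", "X-Chroma-Token"))
    # {'X-Chroma-Token': 'test-token'}
```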

Client Set-Up#

Add the CHROMA_CLIENT_AUTH_CREDENTIALS environment variable to your local environment, and set it to the token you provided to the server (test-token in this example):

shell

We will use Chroma's Settings object to define the authentication method on the client.

python

If you are using a custom CHROMA_AUTH_TOKEN_TRANSPORT_HEADER (like X-Chroma-Token), add it to your Settings:

python
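
The two client configurations described above can be sketched as follows. This assumes a chromadb release whose Settings object accepts the chroma_client_auth_provider, chroma_client_auth_credentials, and chroma_auth_token_transport_header fields, and whose token provider lives at the module path shown; both are version-dependent assumptions, so check them against the Auth Guide for your installed version. The helper names are hypothetical.

```python
import os

# Assumed provider path; it has differed between chromadb releases.
TOKEN_PROVIDER = "chromadb.auth.token_authn.TokenAuthClientProvider"


def token_auth_settings(token, transport_header=None):
    """Build the Settings keyword arguments for static token authentication."""
    values = {
        "chroma_client_auth_provider": TOKEN_PROVIDER,
        "chroma_client_auth_credentials": token,
    }
    if transport_header is not None:
        # Only needed when the server uses a custom header such as X-Chroma-Token.
        values["chroma_auth_token_transport_header"] = transport_header
    return values


def connect(host, transport_header=None):
    """Create an authenticated HttpClient using the token from the environment."""
    import chromadb
    from chromadb.config import Settings

    token = os.environ["CHROMA_CLIENT_AUTH_CREDENTIALS"]
    settings = Settings(**token_auth_settings(token, transport_header))
    return chromadb.HttpClient(host=host, port=8000, settings=settings)
```

With the default Authorization header, call connect with the instance IP alone; for the custom header, pass transport_header="X-Chroma-Token" as well.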

Observability with GCP#

Chroma is instrumented with OpenTelemetry hooks for observability. It currently exports only OpenTelemetry traces. These should allow you to understand how requests flow through the system and quickly identify bottlenecks.

Tracing is configured with four environment variables:

  • CHROMA_OTEL_COLLECTION_ENDPOINT: where to send observability data. Example: api.honeycomb.com.
  • CHROMA_OTEL_SERVICE_NAME: Service name for OTel traces. Default: chromadb.
  • CHROMA_OTEL_COLLECTION_HEADERS: Headers to use when sending observability data. Often used to send API and app keys. For example {"x-honeycomb-team": "abc"}.
  • CHROMA_OTEL_GRANULARITY: A value from the OpenTelemetryGranularity enum. Specifies how detailed tracing should be.
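
As an illustration of how these four variables fit together, the sketch below reads and validates them in Python. It is not Chroma's own parsing code. The granularity values listed and the "none" fallback are assumptions about the OpenTelemetryGranularity enum, and read_otel_config is a hypothetical helper; the chromadb default for the service name comes from the list above.

```python
import json
import os

# Assumed members of the OpenTelemetryGranularity enum.
GRANULARITIES = ("none", "operation", "operation_and_segment", "all")


def read_otel_config(env=os.environ):
    """Collect the tracing settings from a mapping of environment variables."""
    headers = json.loads(env.get("CHROMA_OTEL_COLLECTION_HEADERS", "{}"))
    if not isinstance(headers, dict):
        raise ValueError("CHROMA_OTEL_COLLECTION_HEADERS must be a JSON object")
    granularity = env.get("CHROMA_OTEL_GRANULARITY", "none")  # assumed fallback
    if granularity not in GRANULARITIES:
        raise ValueError(f"unknown granularity: {granularity!r}")
    return {
        "endpoint": env.get("CHROMA_OTEL_COLLECTION_ENDPOINT", ""),
        "service_name": env.get("CHROMA_OTEL_SERVICE_NAME", "chromadb"),
        "headers": headers,
        "granularity": granularity,
    }
```

Note that the headers variable holds a JSON string, so a value such as {"x-honeycomb-team": "abc"} must be quoted as a whole when it is set in a shell or configuration file.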

To enable tracing on your Chroma server, simply define the following variables in your chroma.tfvars:

text