☁️ AWS Deployment

A Simple AWS Deployment

You can deploy Chroma on a long-running server, and connect to it remotely.

There are many possible configurations, but for convenience we have provided a very simple AWS CloudFormation template to experiment with deploying Chroma to EC2 on AWS.

Step 1: Get an AWS Account

You will need an AWS Account. You can use one you already have, or create a new one.

Step 2: Get credentials

For this example, we will be using the AWS command line interface. There are several ways to configure the AWS CLI, but for the purposes of these examples we will presume that you have obtained an AWS access key and will be using environment variables to configure AWS.

Export the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables in your shell:

shell

You can also configure AWS to use a region of your choice using the AWS_REGION environment variable:

shell
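If you drive the AWS CLI from a script, it can help to confirm that the credential variables are actually set before calling it. The following is a minimal sketch using only the Python standard library; the helper name `missing_aws_variables` is illustrative and not part of Chroma or the AWS CLI. `AWS_REGION` is not checked because it is optional.

```python
import os

# The two variables this guide asks you to export.
REQUIRED_VARIABLES = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_variables(environ=None):
    """Return the names of required AWS variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARIABLES if not environ.get(name)]


missing = missing_aws_variables()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("AWS credentials found in the environment.")
```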

Step 3: Run CloudFormation

Chroma publishes a CloudFormation template to S3 for each release.

To launch the template using AWS CloudFormation, run the following command.

Replace --stack-name my-chroma-stack with a different stack name, if you wish.

Command Line

Wait a few minutes for the server to boot up, and Chroma will be available! You can get the public IP address of your new Chroma server using the AWS console, or using the following command:

Command Line
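If you prefer to read the stack's outputs programmatically, the JSON printed by `aws cloudformation describe-stacks --stack-name my-chroma-stack --output json` can be parsed with a few lines of Python. This is a sketch only: the sample response below is made up for illustration, `ServerIp` is a hypothetical output key (check your own stack's outputs for the real name), and `203.0.113.10` is a reserved documentation address.

```python
import json


def stack_outputs(describe_stacks_json):
    """Map OutputKey -> OutputValue for the first stack in the response."""
    response = json.loads(describe_stacks_json)
    outputs = response["Stacks"][0].get("Outputs", [])
    return {item["OutputKey"]: item["OutputValue"] for item in outputs}


# Made-up sample response; the output key and address are placeholders.
sample = json.dumps({
    "Stacks": [{
        "StackName": "my-chroma-stack",
        "Outputs": [{"OutputKey": "ServerIp", "OutputValue": "203.0.113.10"}],
    }]
})

print(stack_outputs(sample))
```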

Note that even after the IP address of your instance is available, it may still take a few minutes for Chroma to be up and running.
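One way to handle that delay from a script is to poll the server's port until it accepts connections. The sketch below uses only the Python standard library; the function name and the timeout values are illustrative assumptions. An open port only shows that something is listening on port 8000, not that Chroma has finished starting, so treat a `True` result as a hint rather than a guarantee.

```python
import socket
import time


def wait_for_port(host, port=8000, timeout=300.0, interval=5.0):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Opening and immediately closing a connection is enough to
            # show that a process is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("<your-instance-ip>")` blocks for up to five minutes and returns `True` as soon as the port opens.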

Customize the Stack (optional)

The CloudFormation template allows you to pass particular key/value pairs to override aspects of the stack. Available keys are:

  • InstanceType - the AWS instance type to run (default: t3.small)
  • KeyName - the AWS EC2 KeyPair to use, which allows you to access the instance via SSH (default: none)

To set a CloudFormation stack's parameters using the AWS CLI, use the --parameters command line option. Parameters must be specified using the format ParameterKey={parameter},ParameterValue={value}, with multiple parameters separated by spaces.

For example, the following command launches a new stack similar to the one above, but on an m5.4xlarge EC2 instance, and adds a KeyPair named mykey so that anyone with the associated private key can SSH into the machine:

shell
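To make the parameter format concrete, here is a small Python sketch that assembles an `aws cloudformation create-stack` invocation from a dictionary of overrides. Note that the AWS CLI shorthand spells each entry as `ParameterKey=...,ParameterValue=...`. The helper names are illustrative, and the template URL is a placeholder that you must replace with the URL of the template for your release. Values that contain commas would need extra escaping, which this sketch does not handle.

```python
import shlex


def format_parameters(overrides):
    """Turn {"InstanceType": "m5.4xlarge"} into AWS CLI shorthand strings."""
    return [
        f"ParameterKey={key},ParameterValue={value}"
        for key, value in overrides.items()
    ]


def build_create_stack_command(stack_name, template_url, overrides=None):
    """Build the argument list for `aws cloudformation create-stack`."""
    command = [
        "aws", "cloudformation", "create-stack",
        "--stack-name", stack_name,
        "--template-url", template_url,
    ]
    if overrides:
        command += ["--parameters", *format_parameters(overrides)]
    return command


command = build_create_stack_command(
    "my-chroma-stack",
    "<template-url-for-your-release>",  # placeholder, not a real URL
    {"InstanceType": "m5.4xlarge", "KeyName": "mykey"},
)

# Print a shell-quoted version that can be reviewed before running it.
print(shlex.join(command))
```

Building the command as a list keeps each parameter a separate argument, which is also the form `subprocess.run` expects if you later decide to execute it from Python.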

Step 4: Chroma Client Set-Up

Once your EC2 instance is up and running with Chroma, all you need to do is configure your HttpClient to use the server's IP address and port 8000. Since you are running a Chroma server on AWS, our thin-client package may be enough for your application.

python

Step 5: Clean Up (optional)

To destroy the stack and remove all AWS resources, use the AWS CLI delete-stack command.

shell

Authentication with AWS

By default, the EC2 instance created by our CloudFormation template will run with no authentication. There are many ways to secure your Chroma instance on AWS. In this guide we will use a simple set-up using Chroma's native authentication support.

You can learn more about authentication with Chroma in the Auth Guide.

Static API Token Authentication

Customize Chroma's CloudFormation Stack

If, for example, you want the static API token to be "test-token", pass the following parameters when creating your Chroma stack. This will set Authorization: Bearer test-token as your authentication header.

shell

To send the token in an X-Chroma-Token: test-token header instead, set the ChromaAuthTokenTransportHeader parameter:

shell
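The difference between the two header styles can be illustrated with a short Python sketch. It only shows what each header looks like on the wire, as described above; it is not Chroma's implementation, and the helper name `auth_headers` is hypothetical.

```python
def auth_headers(token, transport_header="Authorization"):
    """Return the HTTP header a client would send for a static API token."""
    if transport_header == "Authorization":
        # Default style: the token is sent as a bearer credential.
        return {"Authorization": f"Bearer {token}"}
    # Custom header style: the raw token is the header value.
    return {transport_header: token}


print(auth_headers("test-token"))
print(auth_headers("test-token", "X-Chroma-Token"))
```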

Client Set-Up

Add the CHROMA_CLIENT_AUTH_CREDENTIALS environment variable to your local environment, and set it to the token you provided to the server (test-token in this example):

shell

We will use Chroma's Settings object to define the authentication method on the client.

python

If you are using a custom CHROMA_AUTH_TOKEN_TRANSPORT_HEADER (like X-Chroma-Token), add it to your Settings:

python

Observability with AWS

Chroma is instrumented with OpenTelemetry hooks for observability. It currently exports only OpenTelemetry traces. These should allow you to understand how requests flow through the system and quickly identify bottlenecks.

Tracing is configured with four environment variables:

  • CHROMA_OTEL_COLLECTION_ENDPOINT: where to send observability data. Example: api.honeycomb.com.
  • CHROMA_OTEL_SERVICE_NAME: Service name for OTel traces. Default: chromadb.
  • CHROMA_OTEL_COLLECTION_HEADERS: Headers to use when sending observability data. Often used to send API and app keys. For example {"x-honeycomb-team": "abc"}.
  • CHROMA_OTEL_GRANULARITY: A value from the OpenTelemetryGranularity enum. Specifies how detailed tracing should be.
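Because CHROMA_OTEL_COLLECTION_HEADERS holds a JSON object, it is easy to get the quoting wrong by hand. The sketch below builds the four values from ordinary Python data, assuming the headers are JSON-encoded as in the example above. The helper name is illustrative, and the granularity value is left to the caller because the valid names come from the OpenTelemetryGranularity enum.

```python
import json


def otel_environment(endpoint, headers, service_name="chromadb", granularity=None):
    """Return tracing settings as a dict of environment variable strings."""
    env = {
        "CHROMA_OTEL_COLLECTION_ENDPOINT": endpoint,
        "CHROMA_OTEL_SERVICE_NAME": service_name,
        # Headers are serialized to a JSON object string.
        "CHROMA_OTEL_COLLECTION_HEADERS": json.dumps(headers),
    }
    if granularity is not None:
        env["CHROMA_OTEL_GRANULARITY"] = granularity
    return env


settings = otel_environment("api.honeycomb.com", {"x-honeycomb-team": "abc"})
for name, value in settings.items():
    print(f"{name}={value}")
```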

To enable tracing on your Chroma server, pass your desired values as parameters when creating your CloudFormation stack:

shell

Troubleshooting

Error: No default VPC for this user

If you get an error saying No default VPC for this user when creating ChromaInstanceSecurityGroup, go to the VPC section of the AWS console and create a default VPC for your account in the region you are deploying to.