Braintrust

Braintrust is an enterprise-grade stack for building AI products. It includes evaluations, a prompt playground, dataset management, tracing, and more.

Braintrust provides TypeScript and Python libraries for running and logging evaluations, and it integrates well with Chroma.

Example evaluation script in Python (refer to the tutorial above for the full implementation):
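
As a minimal, dependency-free sketch of the shape such an evaluation takes (a dataset of input/expected pairs, a task, and a scorer), the snippet below uses a toy in-memory retriever in place of a Chroma collection and a plain loop in place of Braintrust's evaluation runner. All names here (`DOCS`, `DATA`, `retrieve`, `exact_match`, `run_eval`) are hypothetical and chosen for illustration; this is not the Braintrust or Chroma API.

```python
import re

# Toy corpus standing in for documents stored in a Chroma collection
# (hypothetical data, for illustration only).
DOCS = {
    "doc1": "Chroma is an open-source embedding database.",
    "doc2": "Braintrust logs and scores evaluation runs.",
}

# Evaluation dataset: each case pairs an input query with the expected document id.
DATA = [
    {"input": "which tool is an embedding database", "expected": "doc1"},
    {"input": "which tool scores evaluation runs", "expected": "doc2"},
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query: str) -> str:
    """Task under evaluation: return the id of the document that shares
    the most words with the query. A real script would query a Chroma
    collection here instead of doing word overlap."""
    query_words = tokenize(query)
    return max(DOCS, key=lambda doc_id: len(query_words & tokenize(DOCS[doc_id])))


def exact_match(output: str, expected: str) -> float:
    """Scorer: 1.0 when the output equals the expected value, else 0.0."""
    return 1.0 if output == expected else 0.0


def run_eval(data, task, scorer):
    """Run the task on every case and record its output and score."""
    results = []
    for case in data:
        output = task(case["input"])
        results.append(
            {
                "input": case["input"],
                "output": output,
                "score": scorer(output, case["expected"]),
            }
        )
    return results


if __name__ == "__main__":
    for row in run_eval(DATA, retrieve, exact_match):
        print(row)
```

In the full script, the call to `retrieve` would be replaced by a query against a Chroma collection, and `run_eval` by the evaluation runner from the Braintrust library, which also logs the results.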


Learn more: docs.