n_results, distance_function, embedding_model, chunk_size, etc.
For more information on how to use DeepEval, see the DeepEval docs.
Getting Started
Step 1: Installation
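The installation command itself is not shown in this section; assuming the published PyPI package names (`deepeval` and `chromadb`), a minimal setup would be:

```shell
# Install DeepEval (the evaluation framework) and Chroma (the vector store)
pip install deepeval chromadb
```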
Step 2: Preparing a Test Case
Prepare a query, generate a response using your RAG pipeline, and store the retrieval context from your Chroma retriever to create an LLMTestCase for evaluation.
Step 3: Evaluation
Define retriever metrics like Contextual Precision, Contextual Recall, and Contextual Relevancy to evaluate test cases. Recall ensures enough relevant vectors are retrieved, while relevancy reduces noise by filtering out irrelevant ones.
Balancing recall and relevancy is key.
distance_function and embedding_model affect recall, while n_results and chunk_size impact relevancy.
Step 4: Visualize and Optimize
To visualize evaluation results, log in to Confident AI (the DeepEval platform) from the DeepEval CLI. Once logged in, evaluate will automatically send evaluation results to Confident AI, where you can visualize and analyze performance metrics, identify failing retriever hyperparameters, and optimize your Chroma retriever for better accuracy.
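The exact login command is not shown in this section; assuming the standard DeepEval CLI, logging in looks like:

```shell
# Authenticates the DeepEval CLI with your Confident AI account
deepeval login
```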
To learn more about how to use the platform, please see this Quickstart Guide.
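The hyperparameter optimization loop described above can be sketched as a plain grid sweep. The search space values and the `build_and_evaluate` helper below are illustrative placeholders, not part of DeepEval's or Chroma's API:

```python
from itertools import product

# Illustrative search space over the retriever hyperparameters named above.
# "cosine" and "l2" are two of Chroma's supported distance functions.
search_space = {
    "n_results": [3, 5, 10],
    "chunk_size": [256, 512],
    "distance_function": ["cosine", "l2"],
}

def build_and_evaluate(config: dict) -> float:
    # Placeholder: rebuild the Chroma collection with `config`, re-run your
    # test cases through deepeval.evaluate(...), and return a mean score.
    return 0.0

# Enumerate every combination of hyperparameters.
configs = [
    dict(zip(search_space, values))
    for values in product(*search_space.values())
]

# Pick the configuration with the best evaluation score.
best = max(configs, key=build_and_evaluate)
```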