Haystack

Haystack is an open-source LLM framework in Python. It provides embedders, generators, and rankers via a number of LLM providers; tooling for preprocessing and data preparation; connectors to a number of vector databases, including Chroma; and more. Haystack lets you build custom LLM applications using both components readily available in Haystack and your own custom components. Some of the most common applications you can build with Haystack are retrieval-augmented generation (RAG) pipelines, question answering, and semantic search.

| Docs | GitHub | Haystack Integrations | Tutorials |

You can use Chroma together with Haystack by installing the integration and using the ChromaDocumentStore.

Installation

Terminal
pip install chroma-haystack

Usage

Write documents into a ChromaDocumentStore

Python
import os
from pathlib import Path

from haystack import Pipeline
from haystack.components.converters import TextFileToDocument
from haystack.components.writers import DocumentWriter
from chroma_haystack import ChromaDocumentStore

file_paths = ["data" / Path(name) for name in os.listdir("data")]

document_store = ChromaDocumentStore()

indexing = Pipeline()
indexing.add_component("converter", TextFileToDocument())
indexing.add_component("writer", DocumentWriter(document_store))
indexing.connect("converter", "writer")
indexing.run({"converter": {"sources": file_paths}})
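The indexing pipeline above is a small directed graph: connect("converter", "writer") wires the converter's output into the writer's input, and run() pushes data through the graph. The dataflow idea can be sketched with a toy, stdlib-only class (this is an illustration of the concept, not Haystack's actual Pipeline implementation):

```python
# Toy illustration of pipeline dataflow; component names and behavior are
# hypothetical stand-ins for Haystack's converter and writer components.
class ToyPipeline:
    def __init__(self):
        self.components = {}   # name -> callable
        self.edges = []        # (source, destination) connections

    def add_component(self, name, fn):
        self.components[name] = fn

    def connect(self, src, dst):
        self.edges.append((src, dst))

    def run(self, inputs):
        # Walk each connection, feeding the source's output to the destination.
        results = dict(inputs)
        for src, dst in self.edges:
            produced = self.components[src](results[src])
            results[dst] = self.components[dst](produced)
        return results

store = []  # stands in for the document store
pipe = ToyPipeline()
pipe.add_component("converter", lambda paths: [{"content": p} for p in paths])
pipe.add_component("writer", lambda docs: store.extend(docs) or len(docs))
pipe.connect("converter", "writer")
pipe.run({"converter": ["a.txt", "b.txt"]})  # store now holds both documents
```

In the real pipeline, TextFileToDocument plays the converter role (files in, Document objects out) and DocumentWriter plays the writer role (Documents in, persisted to the ChromaDocumentStore).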

Build RAG on top of Chroma

Python
from chroma_haystack.retriever import ChromaQueryRetriever
from haystack import Pipeline
from haystack.components.generators import HuggingFaceTGIGenerator
from haystack.components.builders import PromptBuilder

prompt = """
Answer the query based on the provided context.
If the context does not contain the answer, say 'Answer not found'.
Context:
{% for doc in documents %}
  {{ doc.content }}
{% endfor %}
query: {{query}}
Answer:
"""
prompt_builder = PromptBuilder(template=prompt)

llm = HuggingFaceTGIGenerator(model="mistralai/Mixtral-8x7B-Instruct-v0.1", token="YOUR_HF_TOKEN")
llm.warm_up()

# Reuse the document_store populated by the indexing pipeline above
retriever = ChromaQueryRetriever(document_store)

querying = Pipeline()
querying.add_component("retriever", retriever)
querying.add_component("prompt_builder", prompt_builder)
querying.add_component("llm", llm)
querying.connect("retriever.documents", "prompt_builder.documents")
querying.connect("prompt_builder", "llm")

query = "What is Chroma?"  # example query
results = querying.run({"retriever": {"queries": [query], "top_k": 3}, "prompt_builder": {"query": query}})
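Before the text reaches the generator, PromptBuilder fills the Jinja template above with the retrieved documents and the query. As a plain-Python illustration of roughly what the rendered prompt looks like (the document contents and query here are hypothetical):

```python
# Hypothetical retrieved documents and query, for illustration only.
documents = [
    "Chroma is an open-source vector database.",
    "Haystack pipelines connect components into graphs.",
]
query = "What is Chroma?"

# Emulate the template's rendering with plain string formatting:
# the {% for %} loop concatenates document contents into the Context block.
context = "\n".join(documents)
prompt = (
    "Answer the query based on the provided context. "
    "If the context does not contain the answer, say 'Answer not found'.\n"
    f"Context:\n{context}\n"
    f"query: {query}\n"
    "Answer:"
)
print(prompt)
```

The generator then completes this prompt, so the answer is grounded in the top_k documents the ChromaQueryRetriever pulled from the store.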