The instructor-embeddings library is another option, especially when running on a machine with a CUDA-capable GPU. Instructor models are a good local alternative to OpenAI (see the Massive Text Embedding Benchmark rankings). The embedding function requires the InstructorEmbedding package. To install it, run `pip install InstructorEmbedding`.
There are three models available. The default is `hkunlp/instructor-base`, and for better performance you can use `hkunlp/instructor-large` or `hkunlp/instructor-xl`. You can also specify whether to run on `cpu` (default) or `cuda`.
Keep in mind that the large and xl models are 1.5 GB and 5 GB respectively, so they are best suited to running on a GPU.
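As a rough illustration of the model and device choices described above, here is a minimal sketch that calls the InstructorEmbedding package directly rather than through the embedding function this page refers to. The helper names `pick_model_and_device` and `embed`, the `size` and `use_gpu` parameters, and the instruction string are all assumptions made for this example. It also assumes the package's `INSTRUCTOR` class accepts a `device` argument, as its SentenceTransformer base class does; this may depend on the installed versions.

```python
from typing import List, Tuple

# Model names as listed in the text above.
MODELS = {
    "base": "hkunlp/instructor-base",
    "large": "hkunlp/instructor-large",
    "xl": "hkunlp/instructor-xl",
}


def pick_model_and_device(size: str = "base", use_gpu: bool = False) -> Tuple[str, str]:
    """Hypothetical helper: map a size label to (model_name, device)."""
    if size not in MODELS:
        raise ValueError(f"unknown size {size!r}; expected one of {sorted(MODELS)}")
    return MODELS[size], ("cuda" if use_gpu else "cpu")


def embed(
    texts: List[str],
    instruction: str = "Represent the document for retrieval:",  # illustrative only
    size: str = "base",
    use_gpu: bool = False,
):
    """Hypothetical helper: embed texts with an Instructor model."""
    # Imported lazily so pick_model_and_device works without the package installed.
    from InstructorEmbedding import INSTRUCTOR

    model_name, device = pick_model_and_device(size, use_gpu)
    model = INSTRUCTOR(model_name, device=device)  # downloads weights on first use
    # Instructor models take [instruction, text] pairs as input.
    return model.encode([[instruction, text] for text in texts])
```

Calling `embed(["some text"])` would use the base model on the CPU, while `embed(["some text"], size="xl", use_gpu=True)` would load the xl model onto the GPU. Loading a model triggers the download, so the first call can be slow.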