Chroma Vector Database Tutorial

Chroma is an AI-native, open-source vector database. It makes knowledge, facts, and skills pluggable for LLMs, which makes it easier to build LLM applications.

Chroma provides tools to:

Store embeddings and their metadata
Embed documents and queries
Search embeddings
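
To make these three capabilities concrete, here is a minimal pure-Python sketch of what "store embeddings with metadata, then search them" amounts to conceptually. It does not use Chroma. The ToyStore class, the hand-written 3-dimensional vectors, and the choice of squared Euclidean distance are all illustrative assumptions, not Chroma's actual implementation.

```python
from dataclasses import dataclass, field


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


@dataclass
class ToyStore:
    """Hypothetical in-memory store mapping id -> (embedding, metadata)."""
    items: dict = field(default_factory=dict)

    def add(self, ids, embeddings, metadatas):
        # Store each embedding together with its metadata under its id.
        for id_, emb, meta in zip(ids, embeddings, metadatas):
            self.items[id_] = (emb, meta)

    def query(self, query_embedding, n_results=2):
        # Score every stored embedding, then keep the closest n_results.
        scored = [
            (squared_l2(query_embedding, emb), id_, meta)
            for id_, (emb, meta) in self.items.items()
        ]
        scored.sort(key=lambda row: row[0])
        return scored[:n_results]


store = ToyStore()
store.add(
    ids=["id1", "id2"],
    embeddings=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],  # made-up vectors
    metadatas=[{"source": "doc1"}, {"source": "doc2"}],
)

# A made-up query vector that lies close to id2's embedding.
hits = store.query([0.1, 0.9, 0.0], n_results=2)
for distance, id_, meta in hits:
    print(id_, meta["source"], round(distance, 2))
```

A real vector database replaces the hand-written vectors with the output of an embedding model and the linear scan with an index, but the ranking idea is the same: a smaller distance means a closer match.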

Chroma prioritizes:

Simplicity and developer productivity
Speed (it also happens to be very fast)

Getting Started with Chroma

Official examples

Python

In Python, Chroma can run embedded in a Python script or as a separate server.

pip install chromadb

JavaScript

In JavaScript, use the Chroma JS/TS client to connect to a Chroma server.

yarn add chromadb chromadb-default-embed

Complete example

My own demo here is written in Python.

pip install chromadb

Running the code below as-is gives you a complete demo:

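import chromadb

# In-memory client: the data lives only for the duration of this script.
chroma_client = chromadb.Client()

collection = chroma_client.create_collection(name="my_collection")

# No embeddings are passed, so Chroma embeds the documents
# with its default embedding function.
collection.add(
    documents=["This is a document about engineer", "This is a document about steak"],
    metadatas=[{"source": "doc1"}, {"source": "doc2"}],
    ids=["id1", "id2"]
)

# The query text is embedded the same way and compared against the stored documents.
results = collection.query(
    query_texts=["Which food is the best?"],
    n_results=2
)

print(results)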

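Here I add two documents with the IDs "id1" and "id2" (their metadata marks the sources as "doc1" and "doc2"). To keep things simple, each document is a single string: "This is a document about engineer" and "This is a document about steak". The query is: Which food is the best?

Returned result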

{
 'ids': [
  ['id2', 'id1']
 ],
 'distances': [
  [1.5835548639297485, 2.1740970611572266]
 ],
 'metadatas': [
  [{
   'source': 'doc2'
  }, {
   'source': 'doc1'
  }]
 ],
 'embeddings': None,
 'documents': [
  ['This is a document about steak', 'This is a document about engineer']
 ]
}

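The document about steak (id2) comes back first, with the smaller distance, so the query returned the expected result.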
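
Each field in the returned dictionary is a list of lists: the outer list has one entry per query text, and the inner list holds that query's matches in ranked order. As a rough sketch of how to read it, the helper below flattens one query's results into a list of rows. The name flatten_results is hypothetical (it is not part of Chroma), and the dictionary is copied from the output shown above so the snippet runs without Chroma installed.

```python
def flatten_results(results, query_index=0):
    """Turn one query's slice of a Chroma-style result dict into ranked rows."""
    ids = results["ids"][query_index]
    distances = results["distances"][query_index]
    metadatas = results["metadatas"][query_index]
    documents = results["documents"][query_index]
    return [
        {"id": i, "distance": d, "metadata": m, "document": doc}
        for i, d, m, doc in zip(ids, distances, metadatas, documents)
    ]


# Result dictionary copied from the demo output above.
results = {
    "ids": [["id2", "id1"]],
    "distances": [[1.5835548639297485, 2.1740970611572266]],
    "metadatas": [[{"source": "doc2"}, {"source": "doc1"}]],
    "embeddings": None,
    "documents": [
        ["This is a document about steak", "This is a document about engineer"]
    ],
}

rows = flatten_results(results)
for rank, row in enumerate(rows, start=1):
    print(f"{rank}. {row['id']} distance={row['distance']:.4f} {row['document']}")
```

Because only one query text was passed, query_index=0 is the only valid index here. With several query texts, the same helper can be called once per index.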