🐍 Embeddings
SurrealDB can store, index, and query high-dimensional vector embeddings alongside your regular data, which enables semantic search and other machine learning workloads. The sections below show how to generate embeddings with several common providers and store them in SurrealDB.
More details and providers in LangChain Embedding models documentation.
Ollama
from langchain_ollama import OllamaEmbeddings
from langchain_surrealdb.vectorstores import SurrealDBVectorStore

vector_store = SurrealDBVectorStore(
    OllamaEmbeddings(model="all-minilm:22m"),
    conn,
)
More Ollama embedding models in their documentation.
OpenAI
Requires OPENAI_API_KEY environment variable.
from langchain_openai import OpenAIEmbeddings

vector_store = SurrealDBVectorStore(
    OpenAIEmbeddings(model="text-embedding-3-large"),
    conn,
)
Mistral
Requires MISTRAL_API_KEY environment variable.
from langchain_mistralai import MistralAIEmbeddings

vector_store = SurrealDBVectorStore(
    MistralAIEmbeddings(model="mistral-embed"),
    conn,
)
Sentence Transformers
from langchain_huggingface import HuggingFaceEmbeddings

vector_store = SurrealDBVectorStore(
    HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-MiniLM-L6-v2"
    ),
    conn,
)
More SentenceTransformer models in their documentation.
AWS Bedrock
from langchain_aws import BedrockEmbeddings

vector_store = SurrealDBVectorStore(
    BedrockEmbeddings(model_id="amazon.titan-embed-text-v2:0"),
    conn,
)
Gemini
Requires GOOGLE_API_KEY environment variable.
from langchain_google_genai import GoogleGenerativeAIEmbeddings

vector_store = SurrealDBVectorStore(
    GoogleGenerativeAIEmbeddings(model="models/embedding-001"),
    conn,
)
Then, to query the vector store using similarity search:
from langchain_core.documents import Document

doc1 = Document(
    page_content="SurrealDB is the ultimate multi-model database for AI applications",
    metadata={"key": "sdb"},
)
doc2 = Document(
    page_content="Surrealism is an artistic and cultural movement that emerged in the early 20th century",
    metadata={"key": "surrealism"},
)

# Insert both documents with explicit IDs
vector_store.add_documents(documents=[doc1, doc2], ids=["1", "2"])

# Placeholder query; replace with your own text
q = "this is your query text"
results = vector_store.similarity_search_with_score(query=q, k=2)
for doc, score in results:
    print(f"• [{score:.0%}]: {doc.page_content}")

# Each result is a (document, score) pair; take the best-ranked document
top_match = results[0][0]
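To make the scores above more concrete, here is a minimal sketch of how a similarity ranking can be computed by hand. It assumes cosine similarity as the metric, which the text above does not confirm for the vector store, and it uses made-up three-dimensional vectors instead of real embeddings. The function names `cosine_similarity` and `rank_by_similarity` are hypothetical helpers, not part of LangChain or SurrealDB.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs, k):
    """Return the k (key, score) pairs with the highest cosine similarity."""
    scored = [(key, cosine_similarity(query_vec, vec)) for key, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors standing in for real embeddings (illustrative values only)
doc_vecs = {
    "sdb": [0.9, 0.1, 0.0],
    "surrealism": [0.1, 0.9, 0.2],
}
ranked = rank_by_similarity([1.0, 0.0, 0.0], doc_vecs, k=2)
for key, score in ranked:
    print(f"{key}: {score:.2f}")
```

A real vector store does the same kind of ranking over model-generated vectors with hundreds or thousands of dimensions, usually with an index so that it does not have to compare the query against every stored vector.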
Find an example in Minimal LangChain chatbot example with vector and graph.
Ollama
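import ollama
text = "Your text string"
response = ollama.embed(model="all-minilm:22m", input=text)
embedding = response["embeddings"][0]  # embed() returns one vector per input
conn.create("documents", {"content": text, "embedding": embedding})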
More Ollama embedding models in their documentation.
OpenAI
from openai import OpenAI

client = OpenAI()
text = "Your text string"
response = client.embeddings.create(
    input=text,
    model="text-embedding-3-small",
)
conn.create("documents", {"content": text, "embedding": response.data[0].embedding})
More info in OpenAI embeddings documentation.
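Sentence Transformers
from sentence_transformers import SentenceTransformer

text = "Your text string"
st = SentenceTransformer("all-MiniLM-L6-v2")
# encode() returns a NumPy array; convert it to a plain list before storing
embedding = st.encode(text).tolist()
conn.create("documents", {"content": text, "embedding": embedding})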
More SentenceTransformer models in their documentation.
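Each model produces vectors of a fixed size (for example, 384 dimensions for all-MiniLM-L6-v2 and 1536 by default for text-embedding-3-small), so vectors from different models should not be mixed in the same field. The sketch below shows one way to guard against that before calling `conn.create`. The helper `make_document_record` is hypothetical and not part of the SurrealDB SDK; it only builds the record dictionary.

```python
def make_document_record(text, embedding, expected_dim):
    """Build a record for the documents table, checking the vector size first.

    `embedding` can be any iterable of numbers (list, tuple, NumPy array).
    """
    vector = [float(x) for x in embedding]
    if len(vector) != expected_dim:
        raise ValueError(
            f"expected a {expected_dim}-dimensional vector, got {len(vector)}"
        )
    return {"content": text, "embedding": vector}


# Toy 3-dimensional vector standing in for a real embedding
record = make_document_record("Your text string", (0.1, 0.2, 0.3), expected_dim=3)
print(record)
# With a live connection, the record could then be stored with:
# conn.create("documents", record)
```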
To query the vector store using similarity search:
k = 5
query_embedding = st.encode("this is your query text")
res = conn.query(
    f"""
    SELECT
        *,
        vector::distance::knn() AS dist
    FROM documents
    WHERE embedding <|{k}|> $vector;
    """,
    {
        "vector": query_embedding.tolist(),
    },
)
This requires an index to be created beforehand. Refer to the vector search cheat sheet.
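As a rough illustration of what creating such an index might look like, the sketch below builds a `DEFINE INDEX` statement for the `documents` table. It assumes an MTREE index, since that is the index type the `<|k|>` form of the operator is used with, a cosine distance, and 384 dimensions to match all-MiniLM-L6-v2. The helper `build_vector_index_statement` and the default index name are hypothetical; check the cheat sheet for the index types and options your SurrealDB version supports.

```python
def build_vector_index_statement(table, field, dimension, distance="COSINE", index_name=None):
    """Return a SurrealQL statement defining an MTREE vector index (sketch)."""
    if dimension <= 0:
        raise ValueError("dimension must be a positive integer")
    name = index_name or f"idx_{table}_{field}"
    return (
        f"DEFINE INDEX {name} ON {table} "
        f"FIELDS {field} MTREE DIMENSION {dimension} DIST {distance};"
    )


statement = build_vector_index_statement("documents", "embedding", 384)
print(statement)
# With a live connection, run the statement once before querying:
# conn.query(statement)
```

The dimension has to match the length of the vectors stored in the field, so switching to a different embedding model generally means defining a new index.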
Examples above assume you have a DB connection like this:
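from surrealdb import Surreal

# Assumes a local instance on the default port; adjust the URL to your setup
conn = Surreal("ws://localhost:8000/rpc")
conn.signin({"username": "root", "password": "secret"})
conn.use("test_ns", "test_db")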