Use Case
From raw docs to semantic search in five minutes
Most RAG setups require stitching together an embedding model, a vector database, chunking logic, and a retrieval layer. Lux collapses all of it into one database. Ingest, embed, search, retrieve -- one connection string, zero infrastructure overhead.
The problem
You want your LLM to answer questions about your data. Simple enough in theory. In practice, you are now choosing between OpenAI, Cohere, and Voyage for embeddings. You are evaluating Pinecone vs Weaviate vs Chroma vs pgvector for storage. You are writing chunking logic that splits on paragraphs but respects sentence boundaries and handles markdown headers. You are building a retrieval API that embeds the query, searches the vector store, and formats results for the prompt.
Most teams spend days on this plumbing before writing a single prompt. And once it is running, you are managing two separate services (your app database and your vector store), paying two separate bills, and debugging failures across two separate systems. The vector store goes down? Your app still runs, but the LLM now answers with no retrieval context, so its output is far more likely to be hallucinated.
It does not have to be this complicated. The core operation is: store text as vectors, find similar vectors at query time, return the text. That should be a database feature, not an architecture diagram.
Without Lux vs With Lux
Before
LangChain / LlamaIndex: orchestration framework
Pinecone / Weaviate: vector storage & search
OpenAI / Cohere: embedding generation
Custom chunking logic: text splitting & overlap
Retrieval API: query, rank, format glue code
After
Just Lux: Ingest + Embed + Search + Retrieve
One database. One connection string: LUX_DIRECT_URL or lux://localhost:6379
Store document chunks
Embed your text, store it as a vector with the original content as metadata. Each chunk is instantly searchable.
import { Lux } from "@luxdb/sdk"

const db = new Lux(process.env.LUX_DIRECT_URL ?? "lux://localhost:6379")

// Each chunk carries its text plus the metadata to store alongside the vector.
const chunks = [
  { text: "Lux supports over 200 Redis commands including strings, lists, sets, sorted sets, and hashes.", source: "docs/overview.md", section: "compatibility" },
  { text: "Vector search uses cosine similarity to find the closest matching documents in sub-millisecond time.", source: "docs/vectors.md", section: "search" },
  { text: "The embedding pipeline in the dashboard handles chunking and vector generation automatically.", source: "docs/cloud.md", section: "embedding" },
]

// embed() stands in for your own embedding call; it is not defined in this snippet.
for (const [i, chunk] of chunks.entries()) {
  const vec = await embed(chunk.text)
  await db.vset(`doc:${i}`, vec, {
    metadata: { text: chunk.text, source: chunk.source, section: chunk.section }
  })
}
How it works
Lux's vector engine stores embeddings in-memory alongside their metadata. When you call VSEARCH, it computes cosine similarity against every stored vector and returns the top-k matches with their metadata. For RAG, this means your original document text comes back with each result, ready to be injected into a prompt. No secondary lookup. No joining across tables. One command, full context.
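To make the search step concrete, here is a minimal sketch of exhaustive cosine-similarity search in plain TypeScript. It illustrates the idea described above and is not Lux's implementation: the `StoredVector` and `Match` shapes, the function names, and the two-dimensional sample vectors are all assumptions made for the example.

```typescript
// Hypothetical shape of a stored entry: an id, its embedding, and its metadata.
interface StoredVector {
  id: string
  vector: number[]
  metadata: Record<string, string>
}

// Hypothetical shape of a search hit.
interface Match {
  id: string
  similarity: number
  metadata: Record<string, string>
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 if either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch")
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  if (normA === 0 || normB === 0) return 0
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Score every stored vector against the query, sort descending, keep the k best.
function topK(query: number[], store: StoredVector[], k: number): Match[] {
  return store
    .map(({ id, vector, metadata }) => ({
      id,
      similarity: cosineSimilarity(query, vector),
      metadata,
    }))
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, k)
}

// Toy 2-dimensional vectors; real embeddings have hundreds or thousands of dimensions.
const toyStore: StoredVector[] = [
  { id: "doc:0", vector: [1, 0], metadata: { text: "about Redis commands" } },
  { id: "doc:1", vector: [0, 1], metadata: { text: "about vector search" } },
  { id: "doc:2", vector: [1, 1], metadata: { text: "about both" } },
]

// doc:1 points the same way as the query, so it ranks first; doc:2 ranks second.
const matches = topK([0, 2], toyStore, 2)
```

Because each match carries its metadata, the caller already has the original text in hand once the ranking is done, which is the property the RAG flow relies on.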
The Lux Cloud dashboard includes a built-in embedding pipeline that handles the entire ingestion flow. Paste your text or upload a document, configure your chunk size and overlap, pick your embedding provider (OpenRouter or OpenAI), and the pipeline chunks, embeds, and stores everything automatically. You can go from a raw PDF to a searchable knowledge base without writing any code. For programmatic ingestion, the SDK gives you full control: generate your own embeddings, attach whatever metadata you want, and store vectors with VSET.
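If you ingest through the SDK, you supply the chunking yourself. The sketch below shows one simple way to interpret "chunk size and overlap": a fixed-size character window that slides forward by `size - overlap`. This is an assumption for illustration, not a description of how the dashboard pipeline chunks; `chunkText` is a hypothetical helper, and a production chunker would likely respect sentence or heading boundaries.

```typescript
// Split text into windows of `size` characters.
// Consecutive windows share `overlap` characters, so context is not cut cleanly at a boundary.
function chunkText(text: string, size: number, overlap: number): string[] {
  if (size <= 0) throw new Error("size must be positive")
  if (overlap < 0 || overlap >= size) throw new Error("overlap must be in [0, size)")

  const step = size - overlap
  const chunks: string[] = []
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size))
    // Stop once a window has reached the end, so no redundant tail chunk is emitted.
    if (start + size >= text.length) break
  }
  return chunks
}

// Windows of 4 characters, each sharing 2 characters with the previous one.
const pieces = chunkText("abcdefghij", 4, 2)
// → ["abcd", "cdef", "efgh", "ghij"]
```

Each resulting chunk would then be embedded and stored with VSET, with the source and position attached as metadata.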
For many app-scale RAG workloads, the bottleneck is not ANN index sophistication. It is crossing services, serializing metadata, and performing secondary lookups. Lux keeps vectors, cache, and application state in one process, so small-to-mid-sized knowledge bases can be searched without another network hop. Very large corpora still need workload-specific benchmarking and may prefer dedicated vector infrastructure.
And because Lux speaks the Redis protocol, you can use it as your cache and your vector store simultaneously. Store your LLM response cache, your rate limiting counters, your session data, and your document vectors in the same database. One fewer service to deploy, monitor, and pay for.
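As an illustration of the non-vector side, here is a sketch of a fixed-window rate limiter built on the two standard Redis commands INCR and EXPIRE. The `CounterStore` interface and its method names are hypothetical: they describe the minimum a client would need to expose, not the Lux SDK's actual API. The in-memory `MemoryStore` exists only so the sketch runs without a server.

```typescript
// Hypothetical minimal client surface: only the two operations this sketch needs.
interface CounterStore {
  incr(key: string): Promise<number>
  expire(key: string, seconds: number): Promise<void>
}

// In-memory stand-in for a real connection. TTLs are recorded but not enforced.
class MemoryStore implements CounterStore {
  private counts = new Map<string, number>()
  readonly ttls = new Map<string, number>()

  async incr(key: string): Promise<number> {
    const next = (this.counts.get(key) ?? 0) + 1
    this.counts.set(key, next)
    return next
  }

  async expire(key: string, seconds: number): Promise<void> {
    this.ttls.set(key, seconds)
  }
}

// Fixed-window limiter: one counter per user per window.
// A request is allowed while the counter for the current window is <= limit.
async function allowRequest(
  store: CounterStore,
  userId: string,
  limit: number,
  windowSeconds: number,
  nowMs: number = Date.now()
): Promise<boolean> {
  const window = Math.floor(nowMs / (windowSeconds * 1000))
  const key = `ratelimit:${userId}:${window}`
  const count = await store.incr(key)
  // Set the expiry only when the counter is first created, so old windows clean themselves up.
  if (count === 1) await store.expire(key, windowSeconds)
  return count <= limit
}
```

With a real connection in place of `MemoryStore`, these counters would live in the same database as the document vectors and the response cache.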
Full retrieval-augmented generation flow
Embed the user's question, search for relevant chunks, extract the text, and feed it to your LLM as context.
import { Lux } from "@luxdb/sdk"

const db = new Lux(process.env.LUX_DIRECT_URL ?? "lux://localhost:6379")

// embed() and llm stand in for your own embedding call and LLM client; neither is defined in this snippet.
async function askWithContext(question: string) {
  // Serve repeated questions from the cache.
  const cacheKey = `rag:cache:${question}`
  const cached = await db.get(cacheKey)
  if (cached) return JSON.parse(cached)

  // Embed the question and fetch the five nearest chunks with their metadata.
  const queryVec = await embed(question)
  const results = await db.vsearch(queryVec, { k: 5, meta: true })

  // Keep only chunks similar enough to be useful as context.
  const relevant = results.filter(r => r.similarity > 0.65)
  const context = relevant
    .map(r => `[${r.metadata.source}] ${r.metadata.text}`)
    .join("\n\n")

  const response = await llm.chat({
    system: `Answer based on the following documentation. If the docs don't cover it, say so.\n\n${context}`,
    messages: [{ role: "user", content: question }]
  })

  // Report only the sources that were actually placed in the prompt.
  const answer = {
    text: response.text,
    sources: relevant.map(r => r.metadata.source),
    confidence: results[0]?.similarity ?? 0
  }

  // Cache the answer; ex: 3600 mirrors the Redis EX option, i.e. a one-hour expiry.
  await db.set(cacheKey, JSON.stringify(answer), { ex: 3600 })
  return answer
}
Feature deep-dive
One-Step Embedding
The dashboard embedding pipeline chunks and embeds your documents automatically. Upload a file or paste text, configure chunk size and overlap, pick your provider, and start searching immediately. No embedding code required for the common case.
Metadata Filtering
Attach structured metadata to every vector: source file, section, date, category, version, or any field your pipeline needs. Filter at query time to scope searches to specific documents, date ranges, or content types without re-embedding.
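The query-time filter syntax is not shown on this page, so the sketch below demonstrates the same scoping idea on the client side instead: it post-filters a list of search results by exact metadata match. The `SearchResult` shape follows the `similarity` and `metadata` fields used in the retrieval example, `filterByMetadata` is a hypothetical helper, and the similarity values are made up for the example.

```typescript
// Result shape assumed from the retrieval example: a score plus the stored metadata.
interface SearchResult {
  similarity: number
  metadata: Record<string, string>
}

// Keep only results whose metadata matches every key/value pair in `where`.
function filterByMetadata(
  results: SearchResult[],
  where: Record<string, string>
): SearchResult[] {
  return results.filter(r =>
    Object.entries(where).every(([key, value]) => r.metadata[key] === value)
  )
}

// Illustrative values only.
const sampleResults: SearchResult[] = [
  { similarity: 0.91, metadata: { source: "docs/vectors.md", section: "search" } },
  { similarity: 0.84, metadata: { source: "docs/cloud.md", section: "embedding" } },
  { similarity: 0.72, metadata: { source: "docs/vectors.md", section: "indexing" } },
]

// Scope the results to a single source file; order by similarity is preserved.
const scoped = filterByMetadata(sampleResults, { source: "docs/vectors.md" })
```

Post-filtering like this can return fewer than k results, which is one reason to prefer filtering inside the search itself when the database supports it.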
Semantic + Keyword Hybrid
Coming soon: combine vector similarity with keyword matching for higher-precision retrieval. Semantic search catches meaning, keyword search catches exact terms. The hybrid approach makes it less likely that a relevant document is missed, even when the user's query uses domain-specific terminology.
Performance
<1ms: Vector search latency @ 10K document chunks
Auto: Chunking & embedding, configurable size & overlap
0: Infrastructure beyond Lux (cache + vectors + state in one)
Ship your RAG pipeline in minutes, not days. Start free on Lux Cloud.