RAG (Retrieval-Augmented Generation) in Mastra
Trust: ★★★☆☆ (0.90) · 0 validations · developer_reference
Published: 2026-05-10 · Source: crawler_authoritative
Situation
Guide for implementing Retrieval-Augmented Generation in Mastra to enhance LLM outputs with relevant context from custom data sources, targeting developers building RAG-powered applications.
Insight
Mastra's RAG system provides standardized APIs to process and embed documents, supporting multiple vector stores including pgvector, Pinecone, Qdrant, and MongoDB. The system offers chunking and embedding strategies for optimal retrieval, along with observability features for tracking embedding and retrieval performance. Key components:
- The MDocument class from @mastra/rag for document handling
- embedMany from the 'ai' package for generating embeddings
- ModelRouterEmbeddingModel from @mastra/core/llm for routing to embedding providers
- Vector store classes such as PgVector from @mastra/pg
Document processing supports several chunking strategies, including recursive and sliding-window approaches, with configurable chunk size and overlap parameters. The workflow: initialize a document via MDocument.fromText() or MDocument.fromPdf(), create chunks with doc.chunk({strategy, size, overlap}), generate embeddings using embedMany() with the embedding model, and store the vectors in the chosen vector database. Similarity search is then performed via the vector store's query() method with a topK parameter to retrieve the most relevant chunks.
Action
To implement RAG in Mastra:
1) Initialize a document using MDocument.fromText() or MDocument.fromPdf().
2) Create chunks by calling doc.chunk({strategy: 'recursive', size: 512, overlap: 50}), choosing the strategy, size, and overlap parameters to suit your content.
3) Generate embeddings by passing the chunk texts to embedMany() with a ModelRouterEmbeddingModel instance (e.g., 'openai/text-embedding-3-small').
4) Initialize and configure your vector database (PgVector, Pinecone, Qdrant, or MongoDB) with the appropriate connection parameters.
5) Store the embeddings using the vector store's upsert() method with an indexName and the vectors.
6) At query time, embed the user query and call the vector store's query() method with queryVector, topK, and indexName to retrieve the most similar chunks.
Prerequisites: set the POSTGRES_CONNECTION_STRING environment variable for PgVector, and install the required packages: @mastra/rag, @mastra/pg, zod, and the 'ai' package. A full worked example appears under "Original content" below.
Result
Returns an array of similar chunks from the vector database query, ordered by relevance. Each result contains the chunk text and associated metadata that can be used as context for LLM prompts.
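For illustration, here is a minimal sketch of assembling retrieved chunks into an LLM prompt. It assumes each result exposes its stored chunk text at metadata.text (as stored in the upsert example below); the exact result shape can vary by vector store, so treat the accessor as an assumption.
// A minimal sketch (TypeScript): turn query results into prompt context.
// Assumes each result carries the stored chunk text at metadata.text;
// adjust the accessor to match your vector store's result shape.
type QueryResult = { id: string; score: number; metadata?: { text?: string } }
function buildPrompt(question: string, results: QueryResult[]): string {
  const context = results
    .map((r, i) => `[${i + 1}] ${r.metadata?.text ?? ''}`)
    .join('\n\n')
  return `Answer using only the context below.\n\nContext:\n${context}\n\nQuestion: ${question}`
}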
Applicability
Requires a Node.js environment. Uses the @mastra/rag, @mastra/pg, and @mastra/core/llm packages. For pgvector, requires PostgreSQL with the pgvector extension and the POSTGRES_CONNECTION_STRING environment variable.
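Because a missing connection string otherwise surfaces only when the store is first used, a fail-fast check at startup can help; a minimal sketch:
// A minimal sketch: fail fast if the pgvector connection string is missing,
// rather than letting PgVector error on first use.
const connectionString = process.env.POSTGRES_CONNECTION_STRING
if (!connectionString) {
  throw new Error('POSTGRES_CONNECTION_STRING is not set (required for PgVector)')
}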
Original content
RAG (Retrieval-Augmented Generation) in Mastra
RAG in Mastra helps you enhance LLM outputs by incorporating relevant context from your own data sources, improving accuracy and grounding responses in real information.
Mastra’s RAG system provides:
- Standardized APIs to process and embed documents
- Support for multiple vector stores
- Chunking and embedding strategies for optimal retrieval
- Observability for tracking embedding and retrieval performance
Example
To implement RAG, you process your documents into chunks, create embeddings, store them in a vector database, and then retrieve relevant context at query time.
import { embed, embedMany } from 'ai'
import { PgVector } from '@mastra/pg'
import { MDocument } from '@mastra/rag'
import { ModelRouterEmbeddingModel } from '@mastra/core/llm'
// 1. Initialize document
const doc = MDocument.fromText(`Your document text here...`)
// 2. Create chunks
const chunks = await doc.chunk({
  strategy: 'recursive',
  size: 512,
  overlap: 50,
})
// 3. Generate embeddings; we need to pass the text of each chunk
const { embeddings } = await embedMany({
  values: chunks.map(chunk => chunk.text),
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
})
// 4. Store in vector database, using an index name of 'embeddings';
// storing each chunk's text as metadata lets query results return it
const pgVector = new PgVector({
  id: 'pg-vector',
  connectionString: process.env.POSTGRES_CONNECTION_STRING,
})
await pgVector.upsert({
  indexName: 'embeddings',
  vectors: embeddings,
  metadata: chunks.map(chunk => ({ text: chunk.text })),
})
// 5. Query similar chunks: the query vector is the embedding of the query
const { embedding: queryVector } = await embed({
  value: 'Your question about the document...', // example user query
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
})
const results = await pgVector.query({
  indexName: 'embeddings',
  queryVector,
  topK: 3,
})
console.log('Similar chunks:', results)

This example shows the essentials: initialize a document, create chunks, generate embeddings, store them, and query for similar content.
Document processing
The basic building block of RAG is document processing. Documents can be chunked using various strategies (recursive, sliding window, etc.) and enriched with metadata. See the chunking and embedding doc.
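For instance, a minimal sketch of tuning the chunk() options already shown above (strategy, size, overlap): smaller chunks with more overlap favor precise matches, while larger chunks preserve more surrounding context per result.
// A minimal sketch using only the chunk() options shown in the example above.
// Smaller chunks, more overlap: precise retrieval, less context per chunk.
const fineChunks = await doc.chunk({ strategy: 'recursive', size: 256, overlap: 64 })
// Larger chunks, less overlap: fewer vectors, more context per result.
const coarseChunks = await doc.chunk({ strategy: 'recursive', size: 1024, overlap: 32 })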
Vector storage
Mastra supports multiple vector stores for embedding persistence and similarity search, including pgvector, Pinecone, Qdrant, and MongoDB. See the vector database doc.
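Because the stores share upsert()/query() semantics, swapping backends is mostly a constructor change. The sketch below assumes @mastra/qdrant exposes a QdrantVector class with url/apiKey options; those names are assumptions, not confirmed by this page, so verify against the vector database doc.
// A hedged sketch: upsert()/query() mirror the PgVector example above;
// the QdrantVector import and constructor options are assumptions.
import { QdrantVector } from '@mastra/qdrant'
const qdrant = new QdrantVector({
  url: process.env.QDRANT_URL, // assumed option name
  apiKey: process.env.QDRANT_API_KEY, // assumed option name
})
await qdrant.upsert({ indexName: 'embeddings', vectors: embeddings })
const hits = await qdrant.query({ indexName: 'embeddings', queryVector, topK: 3 })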
Links
- Platform: Dev Framework · Mastra
- Source: https://mastra.ai/docs/rag/overview
See also: