RAG (Retrieval-Augmented Generation) in Mastra
Trust: ★★★☆☆ (0.90) · 0 validations · developer_reference
Published: 2026-05-10 · Source: crawler_authoritative
Situation
Guide for implementing Retrieval-Augmented Generation in Mastra to enhance LLM outputs with relevant context from custom data sources, targeting developers building RAG-powered applications.
Insight
Mastra's RAG system provides standardized APIs to process and embed documents, supporting multiple vector stores including pgvector, Pinecone, Qdrant, and MongoDB. The system offers chunking and embedding strategies for optimal retrieval, along with observability features for tracking embedding and retrieval performance. Key components:
- The MDocument class from @mastra/rag for document handling
- embedMany from the 'ai' package for generating embeddings
- ModelRouterEmbeddingModel from @mastra/core/llm for routing to embedding providers
- Vector store classes such as PgVector from @mastra/pg
Document processing supports several chunking strategies, including recursive and sliding-window approaches, with configurable chunk size and overlap parameters. The workflow: initialize a document via MDocument.fromText() or MDocument.fromPdf(), create chunks with doc.chunk({strategy, size, overlap}), generate embeddings using embedMany() with the embedding model, and store the vectors in the chosen vector database. Similarity search is then performed via the vector store's query() method with a topK parameter to retrieve the most relevant chunks.
Action
To implement RAG in Mastra:
1) Initialize a document using MDocument.fromText() or MDocument.fromPdf().
2) Create chunks by calling doc.chunk({strategy: 'recursive', size: 512, overlap: 50}), choosing the strategy, size, and overlap parameters to suit your content.
3) Generate embeddings by passing the chunk texts to embedMany() with a ModelRouterEmbeddingModel instance (e.g., 'openai/text-embedding-3-small').
4) Initialize and configure your vector database (PgVector, Pinecone, Qdrant, or MongoDB) with the appropriate connection parameters.
5) Store the embeddings using the vector store's upsert() method with an indexName and the vectors.
6) At query time, embed the user query and call the vector store's query() method with queryVector, topK, and indexName to retrieve the most similar chunks.
Prerequisites: set the POSTGRES_CONNECTION_STRING environment variable for PgVector, and install the required packages: @mastra/rag, @mastra/pg, zod, and the 'ai' package. A full worked example appears under "Original content" below.
Result
Returns an array of similar chunks from the vector database query, ordered by relevance. Each result contains the chunk text and associated metadata that can be used as context for LLM prompts.
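For illustration, here is a minimal sketch of assembling retrieved chunks into an LLM prompt. It assumes each result exposes its stored chunk text at metadata.text (as stored in the upsert example below); the exact result shape can vary by vector store, so treat the accessor as an assumption.
// A minimal sketch (TypeScript): turn query results into prompt context.
// Assumes each result carries the stored chunk text at metadata.text;
// adjust the accessor to match your vector store's result shape.
type QueryResult = { id: string; score: number; metadata?: { text?: string } }
function buildPrompt(question: string, results: QueryResult[]): string {
  const context = results
    .map((r, i) => `[${i + 1}] ${r.metadata?.text ?? ''}`)
    .join('\n\n')
  return `Answer using only the context below.\n\nContext:\n${context}\n\nQuestion: ${question}`
}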
Applicability
Requires a Node.js environment. Uses the @mastra/rag, @mastra/pg, and @mastra/core/llm packages. For pgvector, requires PostgreSQL with the pgvector extension and the POSTGRES_CONNECTION_STRING environment variable.
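Because a missing connection string otherwise surfaces only when the store is first used, a fail-fast check at startup can help; a minimal sketch:
// A minimal sketch: fail fast if the pgvector connection string is missing,
// rather than letting PgVector error on first use.
const connectionString = process.env.POSTGRES_CONNECTION_STRING
if (!connectionString) {
  throw new Error('POSTGRES_CONNECTION_STRING is not set (required for PgVector)')
}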
Original content
RAG (Retrieval-Augmented Generation) in Mastra
RAG in Mastra helps you enhance LLM outputs by incorporating relevant context from your own data sources, improving accuracy and grounding responses in real information.
Mastra’s RAG system provides:
- Standardized APIs to process and embed documents
- Support for multiple vector stores
- Chunking and embedding strategies for optimal retrieval
- Observability for tracking embedding and retrieval performance
Example
To implement RAG, you process your documents into chunks, create embeddings, store them in a vector database, and then retrieve relevant context at query time.
import { embed, embedMany } from 'ai'
import { PgVector } from '@mastra/pg'
import { MDocument } from '@mastra/rag'
import { ModelRouterEmbeddingModel } from '@mastra/core/llm'
// 1. Initialize document
const doc = MDocument.fromText(`Your document text here...`)
// 2. Create chunks
const chunks = await doc.chunk({
  strategy: 'recursive',
  size: 512,
  overlap: 50,
})
// 3. Generate embeddings; we need to pass the text of each chunk
const { embeddings } = await embedMany({
  values: chunks.map(chunk => chunk.text),
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
})
// 4. Store in vector database, using an index name of 'embeddings';
// storing each chunk's text as metadata lets query results return it
const pgVector = new PgVector({
  id: 'pg-vector',
  connectionString: process.env.POSTGRES_CONNECTION_STRING,
})
await pgVector.upsert({
  indexName: 'embeddings',
  vectors: embeddings,
  metadata: chunks.map(chunk => ({ text: chunk.text })),
})
// 5. Query similar chunks: the query vector is the embedding of the query
const { embedding: queryVector } = await embed({
  value: 'Your question about the document...', // example user query
  model: new ModelRouterEmbeddingModel('openai/text-embedding-3-small'),
})
const results = await pgVector.query({
  indexName: 'embeddings',
  queryVector,
  topK: 3,
})
console.log('Similar chunks:', results)

This example shows the essentials: initialize a document, create chunks, generate embeddings, store them, and query for similar content.
Document processing
The basic building block of RAG is document processing. Documents can be chunked using various strategies (recursive, sliding window, etc.) and enriched with metadata. See the chunking and embedding doc.
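For instance, a minimal sketch of tuning the chunk() options already shown above (strategy, size, overlap): smaller chunks with more overlap favor precise matches, while larger chunks preserve more surrounding context per result.
// A minimal sketch using only the chunk() options shown in the example above.
// Smaller chunks, more overlap: precise retrieval, less context per chunk.
const fineChunks = await doc.chunk({ strategy: 'recursive', size: 256, overlap: 64 })
// Larger chunks, less overlap: fewer vectors, more context per result.
const coarseChunks = await doc.chunk({ strategy: 'recursive', size: 1024, overlap: 32 })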
Vector storage
Mastra supports multiple vector stores for embedding persistence and similarity search, including pgvector, Pinecone, Qdrant, and MongoDB. See the vector database doc.
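Because the stores share upsert()/query() semantics, swapping backends is mostly a constructor change. The sketch below assumes @mastra/qdrant exposes a QdrantVector class with url/apiKey options; those names are assumptions, not confirmed by this page, so verify against the vector database doc.
// A hedged sketch: upsert()/query() mirror the PgVector example above;
// the QdrantVector import and constructor options are assumptions.
import { QdrantVector } from '@mastra/qdrant'
const qdrant = new QdrantVector({
  url: process.env.QDRANT_URL, // assumed option name
  apiKey: process.env.QDRANT_API_KEY, // assumed option name
})
await qdrant.upsert({ indexName: 'embeddings', vectors: embeddings })
const hits = await qdrant.query({ indexName: 'embeddings', queryVector, topK: 3 })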
Links
- Platform: Dev Framework · Mastra
- Source: https://mastra.ai/docs/rag/overview
See also: