# RAG (Retrieval-Augmented Generation)
Empire Chain provides a powerful RAG implementation that combines document processing, vector stores, and LLMs for enhanced question-answering capabilities.
## Basic Usage
```python
from empire_chain.vector_stores import QdrantVectorStore
from empire_chain.embeddings import OpenAIEmbeddings
from empire_chain.llms.llms import GroqLLM
from empire_chain.tools.file_reader import DocumentReader

# Initialize components
vector_store = QdrantVectorStore(":memory:")
embeddings = OpenAIEmbeddings("text-embedding-3-small")
llm = GroqLLM("llama3-8b-8192")
reader = DocumentReader()

# Read and process the document
file_path = "input.pdf"
text = reader.read(file_path)

# Create and store the document embedding
text_embedding = embeddings.embed(text)
vector_store.add(text, text_embedding)

# Embed the query and retrieve the most relevant texts
text_query = "What is the main topic of this document?"
query_embedding = embeddings.embed(text_query)
relevant_texts = vector_store.query(query_embedding, k=3)

# Generate a response grounded in the retrieved context
context = "\n".join(relevant_texts)
prompt = f"Based on the following context, {text_query}\n\nContext: {context}"
response = llm.generate(prompt)
print(response)
```
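Note that for brevity this example embeds the entire document as a single vector; for longer documents you would typically split the text into chunks and store one embedding per chunk so that `k=3` retrieval returns focused passages.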
## Audio Input Support
Empire Chain's RAG system also supports audio input through speech-to-text conversion:
```python
from empire_chain.stt.stt import GroqSTT

# Initialize the speech-to-text component
stt = GroqSTT()

# Convert audio to text
audio_query = stt.transcribe("audio.mp3")

# Embed the transcribed query, reusing the embeddings instance from above
query_embedding = embeddings.embed(audio_query)

# Process as before...
```
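The remaining steps mirror the text-based flow, reusing the `vector_store` and `llm` instances from the Basic Usage example:

```python
# Retrieve context for the transcribed query and generate an answer,
# exactly as in the text-based flow above
relevant_texts = vector_store.query(query_embedding, k=3)
context = "\n".join(relevant_texts)
prompt = f"Based on the following context, {audio_query}\n\nContext: {context}"
response = llm.generate(prompt)
```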
## Supported Components
- Vector Stores: Qdrant, ChromaDB
- Embeddings: OpenAI, HuggingFace
- LLMs: OpenAI, Anthropic, Groq
- Document Types: PDF, DOCX, TXT, JSON, CSV, Google Drive files
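Swapping components leaves the rest of the pipeline unchanged. As a minimal sketch, assuming the ChromaDB and HuggingFace integrations are exposed as `ChromaVectorStore` and `HuggingFaceEmbeddings` (these class names and constructor arguments are illustrative, not confirmed API; check the package for the exact identifiers):

```python
# Hypothetical component swap -- class names below are assumptions
from empire_chain.vector_stores import ChromaVectorStore   # assumed name
from empire_chain.embeddings import HuggingFaceEmbeddings  # assumed name

vector_store = ChromaVectorStore(":memory:")               # assumed constructor
embeddings = HuggingFaceEmbeddings("sentence-transformers/all-MiniLM-L6-v2")  # assumed model arg

# The embed/add/query/generate flow is identical to the Basic Usage example.
```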
For more examples and advanced usage, check out the RAG cookbooks in the repository.