Core Concepts
Empire Chain is built around several core concepts that work together to provide a comprehensive AI development framework.
Language Models (LLMs)
Empire Chain supports multiple LLM providers through a unified interface:
from empire_chain.llms import OpenAILLM, AnthropicLLM, GroqLLM
# OpenAI
openai_llm = OpenAILLM("gpt-4")
# Anthropic
anthropic_llm = AnthropicLLM("claude-3-sonnet")
# Groq
groq_llm = GroqLLM("mixtral-8x7b")
Each LLM implementation provides consistent methods:
- generate(): Generate text based on a prompt
- Error handling and retry logic
- Streaming support where available
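To illustrate what a unified interface with retry logic can look like, here is a minimal, self-contained sketch. It does not use Empire Chain itself: the `LLM` protocol, the `EchoLLM` and `FlakyLLM` stand-in providers, and the `generate_with_retry` helper are all hypothetical names chosen for this example, and the framework's real retry behavior may differ.

```python
import time
from typing import Protocol


class LLM(Protocol):
    """Hypothetical shared interface: every provider exposes generate()."""

    def generate(self, prompt: str) -> str: ...


class EchoLLM:
    """Stand-in provider for this sketch; makes no network calls."""

    def generate(self, prompt: str) -> str:
        return f"echo: {prompt}"


class FlakyLLM:
    """Stand-in provider that fails a fixed number of times, then succeeds."""

    def __init__(self, failures: int) -> None:
        self.failures = failures

    def generate(self, prompt: str) -> str:
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("temporary failure")
        return f"recovered: {prompt}"


def generate_with_retry(llm: LLM, prompt: str, retries: int = 3, delay: float = 0.0) -> str:
    """Call llm.generate(), retrying on ConnectionError up to `retries` times."""
    for attempt in range(retries + 1):
        try:
            return llm.generate(prompt)
        except ConnectionError:
            if attempt == retries:
                raise  # out of retries: surface the original error
            time.sleep(delay * (2 ** attempt))  # exponential backoff
    raise ValueError("retries must be non-negative")
```

Because the helper depends only on the `generate()` method, any provider object with that method can be passed in unchanged, which is the practical benefit of a shared interface.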
Vector Stores
Vector stores are used for efficient similarity search and retrieval:
from empire_chain.vector_stores import QdrantVectorStore, ChromaVectorStore
# In-memory Qdrant store
qdrant_store = QdrantVectorStore(":memory:")
# Persistent ChromaDB store
chroma_store = ChromaVectorStore()
Common operations:
- add(): Add text and embeddings
- query(): Retrieve similar documents
- delete(): Remove documents
- clear(): Reset the store
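The four operations above can be sketched with a small in-memory store that ranks documents by cosine similarity. This is an illustration only: `InMemoryVectorStore`, its method signatures, and the `doc_id` parameter are assumptions made for the example, not the signatures of `QdrantVectorStore` or `ChromaVectorStore`.

```python
import math
from typing import Dict, List, Sequence, Tuple


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class InMemoryVectorStore:
    """Hypothetical store mirroring add/query/delete/clear."""

    def __init__(self) -> None:
        self._items: Dict[str, Tuple[str, List[float]]] = {}

    def add(self, doc_id: str, text: str, embedding: Sequence[float]) -> None:
        self._items[doc_id] = (text, list(embedding))

    def query(self, embedding: Sequence[float], k: int = 3) -> List[str]:
        # Rank every stored document by similarity to the query vector.
        ranked = sorted(
            self._items.values(),
            key=lambda item: _cosine(embedding, item[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]

    def delete(self, doc_id: str) -> None:
        self._items.pop(doc_id, None)

    def clear(self) -> None:
        self._items.clear()
```

A brute-force scan like this is fine for a handful of documents; dedicated vector databases exist because they index vectors so that queries stay fast at much larger scale.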
Embeddings
Embeddings convert text into vector representations:
from empire_chain.embeddings import OpenAIEmbeddings
embeddings = OpenAIEmbeddings("text-embedding-3-small")
vector = embeddings.embed("Your text here")
Features:
- Batched processing
- Caching support
- Error handling
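As a rough illustration of how batching and caching can combine, the sketch below wraps an arbitrary batch-embedding function. `CachedBatchEmbedder`, `embed_many`, and `toy_embed_batch` are hypothetical names for this example. The toy function returns made-up two-dimensional vectors so the code runs without an API key, and it does not reflect how `OpenAIEmbeddings` is implemented.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def toy_embed_batch(texts: Sequence[str]) -> List[Vector]:
    """Stand-in for a real embedding call: deterministic fake vectors."""
    return [[float(len(t)), float(sum(ord(c) for c in t) % 97)] for t in texts]


class CachedBatchEmbedder:
    """Embeds texts in fixed-size batches and remembers previous results."""

    def __init__(self, embed_batch: Callable[[Sequence[str]], List[Vector]], batch_size: int = 2) -> None:
        self._embed_batch = embed_batch
        self._batch_size = batch_size
        self._cache: Dict[str, Vector] = {}
        self.backend_calls = 0  # counts calls to the underlying function

    def embed_many(self, texts: Sequence[str]) -> List[Vector]:
        # Unique texts not yet cached, in first-seen order.
        missing = list(dict.fromkeys(t for t in texts if t not in self._cache))
        for start in range(0, len(missing), self._batch_size):
            batch = missing[start:start + self._batch_size]
            vectors = self._embed_batch(batch)
            self.backend_calls += 1
            self._cache.update(zip(batch, vectors))
        return [self._cache[t] for t in texts]
```

Repeated texts are embedded once, and a second call with already-seen texts makes no backend calls at all, which is where the cost savings come from when the backend is a paid API.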
Document Processing
The document processing system handles various file formats:
from empire_chain.file_reader import DocumentReader
reader = DocumentReader()
text = reader.read("document.pdf") # Supports PDF, DOCX, etc.
Capabilities:
- PDF processing with PyPDF2
- Word document processing with python-docx
- Text extraction and cleaning
- Metadata handling
Speech Processing
Speech-to-text capabilities are provided through various models.
Features:
- Audio file support
- Real-time transcription
- Multiple language support
Web Crawling
Web content extraction is handled through crawl4ai:
from empire_chain.crawl4ai import Crawler
crawler = Crawler()
data = crawler.crawl("https://example.com")
Capabilities:
- HTML parsing
- Content extraction
- Rate limiting
- Error handling
Data Visualization
The visualization system provides tools for data analysis:
from empire_chain.visualizer import DataAnalyzer, ChartFactory
analyzer = DataAnalyzer()
data = analyzer.analyze(your_data)
chart = ChartFactory.create_chart('Bar Graph', data)
Chart types:
- Bar graphs
- Line charts
- Scatter plots
- Custom visualizations
Interactive Interfaces
Streamlit-based interfaces for various applications:
from empire_chain.llms import OpenAILLM
from empire_chain.vector_stores import QdrantVectorStore
from empire_chain.streamlit import Chatbot, VisionChatbot, PDFChatbot
# Text chatbot
chatbot = Chatbot(llm=OpenAILLM("gpt-4"))
# Vision chatbot
vision_bot = VisionChatbot()
# PDF chatbot backed by an in-memory vector store
pdf_bot = PDFChatbot(
    llm=OpenAILLM("gpt-4"),
    vector_store=QdrantVectorStore(":memory:"),
)
Features:
- File upload
- Interactive chat
- Real-time responses
- Error handling
PhiData Agents
Specialized agents for specific tasks:
from empire_chain.phidata_agents import PhiWebAgent, PhiFinanceAgent
web_agent = PhiWebAgent()
finance_agent = PhiFinanceAgent()
Capabilities:
- Web search and analysis
- Financial data processing
- Task automation
- Structured output
Document Analysis
Advanced document analysis with Docling:
from empire_chain.docling import Docling
docling = Docling()
analysis = docling.generate("Analyze this document")
Features:
- Content analysis
- Topic extraction
- Summary generation
- Key point identification
Processing Pipeline
The processing pipeline consists of several stages:
- Input Processing
  - Document loading
  - Format detection
  - Initial preprocessing
- Content Extraction
  - Text extraction
  - Structure analysis
  - Metadata collection
- Analysis
  - Content analysis
  - Feature extraction
  - Entity recognition
- Output Generation
  - Response formatting
  - Result compilation
  - Export handling
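The four stages can be pictured as functions that each take a document and return it enriched. The sketch below is a toy under stated assumptions: the `Document` dataclass, the stage functions, and `run_pipeline` are hypothetical names, and the "entity recognition" step is a deliberately crude capitalized-word heuristic rather than anything Empire Chain actually does.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Document:
    raw: str
    text: str = ""
    metadata: Dict[str, object] = field(default_factory=dict)
    entities: List[str] = field(default_factory=list)
    output: str = ""


Stage = Callable[[Document], Document]


def input_processing(doc: Document) -> Document:
    doc.raw = doc.raw.strip()          # initial preprocessing
    doc.metadata["format"] = "text"    # format detection (hard-coded here)
    return doc


def content_extraction(doc: Document) -> Document:
    doc.text = " ".join(doc.raw.split())                 # collapse whitespace
    doc.metadata["word_count"] = len(doc.text.split())   # metadata collection
    return doc


def analysis(doc: Document) -> Document:
    # Crude stand-in for entity recognition: keep capitalized words.
    doc.entities = [w.strip(".,") for w in doc.text.split() if w[:1].isupper()]
    return doc


def output_generation(doc: Document) -> Document:
    count = doc.metadata["word_count"]
    doc.output = f"{count} words; entities: {', '.join(doc.entities)}"
    return doc


STAGES: List[Stage] = [input_processing, content_extraction, analysis, output_generation]


def run_pipeline(doc: Document, stages: List[Stage]) -> Document:
    for stage in stages:
        doc = stage(doc)
    return doc
```

Keeping every stage to the same signature means stages can be reordered, replaced, or extended without touching the runner.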
Visualization System
The visualization system provides tools for:
- Data plotting
- Process monitoring
- Result analysis
- Interactive dashboards
RAG Architecture
The RAG (Retrieval Augmented Generation) system consists of:
Components
- Document Indexer
  - Processes and indexes documents
  - Creates searchable representations
- Retriever
  - Searches for relevant information
  - Ranks and filters results
- Generator
  - Combines retrieved information
  - Generates coherent responses
Flow
graph LR
A[Input Query] --> B[Retriever]
B --> C[Context Selection]
C --> D[Generator]
D --> E[Response]
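The query, retrieval, context selection, and generation steps in this flow can be sketched without any external services. Everything here is an assumption made for illustration: retrieval uses simple word overlap instead of embeddings, `generate` merely formats a prompt where a real system would call an LLM, and none of the function names come from Empire Chain.

```python
import re
from typing import List, Set

DOCUMENTS = [
    "Qdrant stores vectors for similarity search.",
    "Streamlit renders interactive chat interfaces.",
    "Embeddings convert text into vectors.",
]


def tokenize(text: str) -> Set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: List[str], k: int = 2) -> List[str]:
    """Retriever: rank documents by shared words, dropping non-matches."""
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(doc)), doc) for doc in documents]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked if score > 0][:k]


def select_context(passages: List[str], max_chars: int = 200) -> str:
    """Context selection: keep passages until the character budget is spent."""
    context = ""
    for passage in passages:
        if len(context) + len(passage) + 1 > max_chars:
            break
        context += passage + "\n"
    return context.strip()


def generate(query: str, context: str) -> str:
    """Generator stand-in: a real system would send this prompt to an LLM."""
    return f"Question: {query}\nContext:\n{context}"


def answer(query: str, documents: List[str]) -> str:
    return generate(query, select_context(retrieve(query, documents)))
```

Calling `answer("How are vectors stored for search?", DOCUMENTS)` builds a prompt whose context includes the Qdrant sentence, since it shares the most words with the query.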
Error Handling
Empire Chain uses a hierarchical error system:
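As a general illustration of what a hierarchical error system looks like in Python, consider the sketch below. The class names are hypothetical and chosen only for this example; Empire Chain's actual exception classes may be named and organized differently.

```python
class FrameworkError(Exception):
    """Hypothetical root of the hierarchy; catching it catches everything below."""


class LLMError(FrameworkError):
    """Hypothetical: any failure while calling a language model."""


class RateLimitError(LLMError):
    """Hypothetical: the provider asked the caller to slow down."""


class VectorStoreError(FrameworkError):
    """Hypothetical: any failure in a vector store operation."""


def describe(error: Exception) -> str:
    """Handlers run from most specific to most general."""
    try:
        raise error
    except RateLimitError:
        return "retry later"
    except LLMError:
        return "llm failure"
    except FrameworkError:
        return "framework failure"
    except Exception:
        return "unexpected failure"
```

The value of such a hierarchy is that callers choose their own granularity: one handler for the root class covers every framework failure, while a handler for a leaf class can apply a targeted recovery such as backing off.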
Configuration System
Levels of Configuration
- Global Configuration
  - System-wide settings
  - Default behaviors
- Component Configuration
  - Component-specific settings
  - Override capabilities
- Runtime Configuration
  - Dynamic settings
  - Session-specific overrides
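One common way to model layered settings like these in Python is `collections.ChainMap`, which looks keys up in the first mapping that contains them. The keys and values below are invented for the example, and whether Empire Chain resolves configuration this way is an assumption.

```python
from collections import ChainMap
from typing import Any, Dict

# Hypothetical settings for illustration only.
GLOBAL_DEFAULTS: Dict[str, Any] = {"model": "gpt-4", "temperature": 0.7, "timeout": 30}
COMPONENT_CONFIG: Dict[str, Any] = {"temperature": 0.2}


def resolve_config(runtime_overrides: Dict[str, Any]) -> ChainMap:
    """Runtime settings win over component settings, which win over globals."""
    return ChainMap(runtime_overrides, COMPONENT_CONFIG, GLOBAL_DEFAULTS)


config = resolve_config({"timeout": 5})
```

Here `config["timeout"]` comes from the runtime layer, `config["temperature"]` from the component layer, and `config["model"]` falls through to the global defaults.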
Event System
The event system allows for:
- Progress monitoring
- Status updates
- Error tracking
- Custom callbacks
from empire_chain.events import EventHandler
def on_document_processed(event):
    print(f"Processed: {event.document_id}")
handler = EventHandler()
handler.subscribe("document_processed", on_document_processed)
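For readers curious how a subscribe-and-dispatch mechanism of this kind can work internally, here is a minimal sketch. `SimpleEventBus` and its `emit` method are hypothetical and are not the implementation of `EventHandler`; in particular, how Empire Chain publishes events is not shown in this document.

```python
from collections import defaultdict
from types import SimpleNamespace
from typing import Any, Callable, DefaultDict, List

Callback = Callable[[SimpleNamespace], None]


class SimpleEventBus:
    """Hypothetical publish/subscribe helper keyed by event name."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, event_name: str, callback: Callback) -> None:
        self._subscribers[event_name].append(callback)

    def emit(self, event_name: str, **payload: Any) -> int:
        """Call every subscriber for event_name; return how many were called."""
        event = SimpleNamespace(name=event_name, **payload)
        callbacks = list(self._subscribers.get(event_name, []))
        for callback in callbacks:
            callback(event)
        return len(callbacks)


processed: List[str] = []
bus = SimpleEventBus()
bus.subscribe("document_processed", lambda event: processed.append(event.document_id))
bus.emit("document_processed", document_id="doc-1")
```

Payload fields become attributes on the event object, which is why a callback can read `event.document_id` directly.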
Extension System
Empire Chain can be extended through:
- Custom Processors
- Model Adapters
- Pipeline Stages
- Visualization Components
Example of a custom processor:
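A minimal sketch of the idea follows, under the assumption that a processor is an object with a single `process()` method. The `Processor` base class, both subclasses, and `apply_processors` are hypothetical names for illustration and are not Empire Chain's actual extension API.

```python
from abc import ABC, abstractmethod
from typing import Iterable


class Processor(ABC):
    """Hypothetical base class: a processor transforms text into text."""

    @abstractmethod
    def process(self, text: str) -> str: ...


class WhitespaceNormalizer(Processor):
    """Collapses runs of whitespace and trims both ends."""

    def process(self, text: str) -> str:
        return " ".join(text.split())


class LowercaseProcessor(Processor):
    """Lowercases the text."""

    def process(self, text: str) -> str:
        return text.lower()


def apply_processors(text: str, processors: Iterable[Processor]) -> str:
    """Run each processor in order, feeding its output to the next."""
    for processor in processors:
        text = processor.process(text)
    return text
```

With these definitions, `apply_processors("  Hello   World ", [WhitespaceNormalizer(), LowercaseProcessor()])` yields `"hello world"`.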