Features

Dual Memory System

Conscious Ingest Mode

  • One-shot Working Memory: Essential memories are injected once at the start of each session
  • Background Analysis: Conversation patterns are analyzed automatically every 6 hours
  • Essential Memory Promotion: Key personal facts are promoted for instant access
  • Human-like Recall: Works like short-term memory, keeping important facts immediately at hand
  • Performance Optimized: Minimal token usage and fast response times (see the sketch after this list)
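
For contrast with the combined setup shown below, conscious-only mode is a single flag. A minimal sketch, reusing the same Memori calls that appear in the integration examples later in this section:

from memori import Memori

# Conscious-only: essential memories are injected once at session start;
# no per-query database search is performed
memori = Memori(conscious_ingest=True)
memori.enable()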

Auto Ingest Mode

  • Dynamic Context Search: Every query is analyzed for relevant memories
  • Full Database Search: The entire memory database is searched intelligently
  • Context-Aware Injection: The 3-5 most relevant memories are injected per LLM call
  • Retrieval Agent: AI-powered search strategy and ranking
  • Rich Context: Higher token usage in exchange for maximum context awareness (see the sketch after this list)
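
The auto-only counterpart, again a minimal sketch using the same API:

from memori import Memori

# Auto-only: each LLM call triggers an intelligent search over the full
# memory database; nothing is pre-injected at session start
memori = Memori(auto_ingest=True)
memori.enable()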

Combined Mode

# Best of both worlds
memori = Memori(
    conscious_ingest=True,  # Essential working memory
    auto_ingest=True,       # Dynamic context search
    database_connect="postgresql://..."
)

Three-Layer Intelligence

graph TD
    A[Retrieval Agent] --> B[Dynamic Search & Ranking]
    C[Conscious Agent] --> D[Essential Memory Promotion]
    E[Memory Agent] --> F[Structured Processing]

    B --> G[Auto-Ingest Context]
    D --> H[Conscious Context]
    F --> I[Categorized Storage]
    G --> J[Intelligent Context Injection]
    H --> J

Memory Types & Categories

Automatic Categorization

Category    | Description               | Examples
------------|---------------------------|----------------------------------
Facts       | Objective information     | "I use PostgreSQL for databases"
Preferences | Personal choices          | "I prefer clean, readable code"
Skills      | Abilities & expertise     | "Experienced with FastAPI"
Context     | Project information       | "Working on e-commerce platform"
Rules       | Guidelines & constraints  | "Always write tests first"
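
Categories can also be queried directly. A short sketch using search_memories_by_category, the same call that appears in the Memory Analytics section below:

# Fetch memories by category (singular category names, as in the
# Memory Analytics examples later in this document)
skills = memori.search_memories_by_category("skill")
preferences = memori.search_memories_by_category("preference")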

Retention Policies

  • Short-term: Recent activities, temporary information (7 days)
  • Long-term: Important information, learned skills, preferences
  • Permanent: Critical rules, core preferences, essential facts

Universal Integration

Works with ANY LLM Library via LiteLLM

LiteLLM (Recommended)

from litellm import completion
from memori import Memori

memori = Memori(
    conscious_ingest=True,
    auto_ingest=True
)
memori.enable()

# Automatic context injection with dual modes
response = completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Help me code"}]
)

OpenAI Direct

import openai
from memori import Memori

memori = Memori(conscious_ingest=True)
memori.enable()

client = openai.OpenAI()
# All conversations automatically recorded
response = client.chat.completions.create(...)

Azure OpenAI

from memori import Memori
from memori.core.providers import ProviderConfig

azure_provider = ProviderConfig.from_azure(
    api_key="your-azure-key",
    azure_endpoint="https://your-resource.openai.azure.com/",
    azure_deployment="gpt-4o"
)

memori = Memori(
    provider_config=azure_provider,
    conscious_ingest=True
)
memori.enable()

Anthropic

import anthropic
from memori import Memori

memori = Memori(conscious_ingest=True)
memori.enable()

client = anthropic.Anthropic()
# All conversations automatically recorded
response = client.messages.create(...)

Custom/Ollama

from memori import Memori
from memori.core.providers import ProviderConfig

ollama_provider = ProviderConfig.from_custom(
    base_url="http://localhost:11434/v1",
    api_key="ollama",
    model="llama3.2:3b"
)

memori = Memori(
    provider_config=ollama_provider,
    conscious_ingest=True
)
memori.enable()

Production Architecture

Modular Design

memori/
├── core/              # Main Memori class, providers, memory modes
├── agents/            # AI-powered memory processing agents
├── database/          # Multi-database support with cloud options
├── integrations/      # LLM provider integrations (LiteLLM native)
├── config/            # Configuration management with Pydantic
├── utils/             # Helpers, validation, logging
└── tools/             # Memory search and function calling tools

Database Support

  • SQLite: Perfect for development and small applications
  • PostgreSQL: Production-ready with full-text search and JSON support
  • MySQL: Enterprise database support with modern features
  • Cloud Databases: Neon, Supabase, GibsonAI serverless options
  • Connection Pooling: Optimized performance with connection management (see the connection-string sketch after this list)
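
Backends are selected through the database_connect URL. A sketch: the PostgreSQL form appears elsewhere in this document, while the SQLite URL follows the usual SQLAlchemy convention and should be treated as an assumption:

from memori import Memori

# Development: file-backed SQLite (assumed URL format)
dev = Memori(database_connect="sqlite:///memori.db", conscious_ingest=True)

# Production: PostgreSQL, as used in the Advanced Configuration example
prod = Memori(
    database_connect="postgresql://user:pass@localhost/memori",
    conscious_ingest=True,
    auto_ingest=True
)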

Configuration Management

from memori import ConfigManager

# Auto-load from multiple sources
config = ConfigManager()
config.auto_load()

# Loads from (in priority order):
# 1. Environment variables
# 2. memori.json/yaml files
# 3. Default Pydantic settings

memori = Memori()  # Uses loaded configuration
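
Because environment variables take top priority, a deployment can override the database without touching code. A sketch using the variable names shown in the Cloud-Native Support section below:

import os
from memori import ConfigManager, Memori

# Highest-priority source: environment variables
os.environ["MEMORI_DATABASE__CONNECTION_STRING"] = "postgresql://user:pass@localhost/memori"

config = ConfigManager()
config.auto_load()  # the variable above wins over memori.json/yaml and defaults

memori = Memori()  # uses the loaded configuration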

Performance Features

Dual Mode Token Optimization

Conscious Mode (Working Memory):
- Essential memories: 150-200 tokens
- One-shot injection per session
- Minimal overhead, maximum relevance

Auto Mode (Dynamic Search):
- Relevant context: 200-300 tokens
- Per-query intelligent search
- Rich context, performance optimized

Traditional Context Injection:
- Full conversation history: 2000+ tokens
- No intelligence, maximum waste
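
Taken at face value, these figures make the trade-off concrete: even the combined setup's ~300 tokens is roughly an 85% reduction against a 2,000-token full-history injection ((2000 - 300) / 2000 = 0.85), before counting the relevance gains.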

Efficiency Metrics

  • LiteLLM Native Integration: No monkey-patching overhead
  • Async Background Processing: Analysis doesn't block conversations
  • Intelligent Caching: Smart caching of search results and promotions
  • Provider Optimization: Efficient client management and connection reuse

Memory Mode Comparison

Feature          | Conscious Mode | Auto Mode      | Combined
-----------------|----------------|----------------|-----------------
Token Usage      | ~150 tokens    | ~250 tokens    | ~300 tokens
Response Time    | Fastest        | Fast           | Medium
Context Richness | Essential only | Query-specific | Best of both
Use Case         | Quick access   | Deep context   | Production apps

Security & Reliability

Data Protection

  • Input Sanitization: Protection against injection attacks
  • Credential Safety: Secure handling of API keys and secrets
  • Error Context: Detailed logging without exposing sensitive data
  • Graceful Degradation: Continues operation when components fail

Production Ready

  • Connection Pooling: Automatic database connection management
  • Resource Cleanup: Proper cleanup of resources and connections
  • Error Handling: Comprehensive exception handling with context
  • Monitoring: Built-in logging and performance metrics

Developer Experience

Simple Setup

# A few lines to enable dual memory modes
memori = Memori(
    conscious_ingest=True,
    auto_ingest=True
)
memori.enable()

# No more repeating context!

Advanced Configuration

# Production configuration with provider support
from memori.core.providers import ProviderConfig

azure_provider = ProviderConfig.from_azure(
    api_key="your-azure-key",
    azure_endpoint="https://your-resource.openai.azure.com/",
    azure_deployment="gpt-4o"
)

memori = Memori(
    database_connect="postgresql://user:pass@localhost/memori",
    provider_config=azure_provider,
    conscious_ingest=True,
    auto_ingest=True,
    namespace="production_app"
)

Memory Tools & Function Calling

from memori import create_memory_tool

# Create memory search tool
memory_tool = create_memory_tool(memori)

# Use with AI agents and function calling
def search_memory(query: str) -> str:
    """Search agent's memory for past conversations"""
    result = memory_tool.execute(query=query)
    return str(result) if result else "No relevant memories found"

# Function calling integration
tools = [{
    "type": "function",
    "function": {
        "name": "search_memory",
        "description": "Search memory for relevant information",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query"}
            },
            "required": ["query"]
        }
    }
}]

completion(model="gpt-4o", messages=[...], tools=tools)
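
Completing the loop follows the standard OpenAI-style tool-calling pattern, which LiteLLM mirrors. A sketch, with messages standing in for the conversation so far:

import json
from litellm import completion

messages = [{"role": "user", "content": "What database do I use?"}]
response = completion(model="gpt-4o", messages=messages, tools=tools)
message = response.choices[0].message

# If the model requested the tool, run the search and return the result
if message.tool_calls:
    messages.append(message)  # keep the assistant's tool request in history
    for call in message.tool_calls:
        if call.function.name == "search_memory":
            args = json.loads(call.function.arguments)
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": search_memory(args["query"]),
            })
    response = completion(model="gpt-4o", messages=messages, tools=tools)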

Memory Analytics

Real-time Statistics

# Get comprehensive memory insights
stats = memori.get_memory_stats()
print(f"Total conversations: {stats.get('chat_history_count', 0)}")
print(f"Short-term memories: {stats.get('short_term_count', 0)}")
print(f"Long-term memories: {stats.get('long_term_count', 0)}")

# Get essential conversations (conscious mode)
essential = memori.get_essential_conversations()

# Trigger manual analysis
memori.trigger_conscious_analysis()

# Search by category
skills = memori.search_memories_by_category("skill")
preferences = memori.search_memories_by_category("preference")

Memory Mode Monitoring

# Check which modes are enabled
print(f"Conscious mode: {memori.conscious_ingest}")
print(f"Auto mode: {memori.auto_ingest}")

# Monitor performance
config_info = memori.memory_manager.get_config_info()
print(f"Provider: {memori.provider_config.api_type if memori.provider_config else 'default'}")

Debug Mode

# See what's happening behind the scenes
memori = Memori(
    conscious_ingest=True,
    auto_ingest=True,
    verbose=True  # Shows agent activity and mode switching
)

Extensibility

Custom Memory Processing

  • Create specialized agents for specific domains
  • Extend memory processing with custom logic
  • Domain-specific categorization and entity extraction
  • Custom retention policies and importance scoring

Provider System

# Extend provider support
class CustomProviderConfig(ProviderConfig):
    @classmethod
    def from_custom_service(cls, endpoint: str, auth: str):
        # Map service-specific settings onto the base provider fields
        return cls(base_url=endpoint, api_key=auth)

Memory Tools Extensions

  • Custom search functions for specific use cases
  • Domain-specific memory tools
  • Integration with AI agent frameworks
  • Function calling extensions for complex workflows

Plugin Architecture

  • Memory processing plugins for different domains
  • Custom database adapters for specialized storage
  • Integration with external knowledge systems
  • Event-driven architecture for real-time processing

Scalability

Enterprise Features

  • Multi-tenant Support: Separate memory spaces with namespaces (see the sketch after this list)
  • Horizontal Scaling: Distributed database support and load balancing
  • Provider Flexibility: Support for Azure, AWS, custom endpoints
  • Configuration Management: Centralized config with environment-specific settings
  • Monitoring: Comprehensive observability for production deployments
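
Namespace isolation reuses the namespace parameter from the Advanced Configuration example. A sketch:

from memori import Memori

# One memory space per tenant; namespaces keep memories isolated
tenant_a = Memori(namespace="tenant_a", conscious_ingest=True)
tenant_b = Memori(namespace="tenant_b", conscious_ingest=True)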

Performance Optimization

  • Indexed Search: Full-text search with proper indexing and ranking
  • Memory Compression: Intelligent consolidation over time
  • Adaptive Analysis: Dynamic frequency based on usage patterns
  • Connection Pooling: Optimized database connections for high throughput
  • Provider Caching: Smart caching for frequently accessed memories

Cloud-Native Support

# Serverless database integration
memori = Memori(
    database_connect="postgresql://user:pass@neon-serverless:5432/memori",
    provider_config=azure_provider,
    conscious_ingest=True,
    auto_ingest=True
)

# Environment-based configuration
# MEMORI_DATABASE__CONNECTION_STRING=postgresql://...
# MEMORI_AGENTS__OPENAI_API_KEY=sk-...
config = ConfigManager()
config.auto_load()

Future Roadmap

Planned Features

  • Enhanced Provider Support: Claude, Gemini, and more structured outputs
  • Vector Search: Semantic similarity search with embeddings
  • Memory Relationships: Understanding connections between facts and entities
  • Team Memory: Shared memory spaces for collaborative AI applications
  • Memory Migration: Easy import/export of memory data between instances
  • Advanced Analytics: Memory insights, conversation patterns, and usage analytics
  • Real-time Sync: Multi-instance memory synchronization for distributed systems