Vector Database Strategy
Overview
This document outlines the strategy for implementing layered knowledge systems using vector databases to provide NPCs and the Dungeon Master with contextual lore, regional history, and world knowledge.
Status: Planning Phase
Last Updated: November 26, 2025
Decision: Use Weaviate for vector database implementation
Knowledge Hierarchy
Three-Tier Vector Database Structure
- World Lore DB (Global)
- Broad historical events, mythology, major kingdoms, legendary figures
- Accessible to all NPCs and DM for player questions
- Examples: "The Great War 200 years ago", "The origin of magic", "The Five Kingdoms"
- Scope: Universal knowledge any educated NPC might know
- Regional/Town Lore DB (Location-specific)
- Local history, notable events, landmarks, politics, rumors
- Current town leadership, recent events, local legends
- Trade routes, neighboring settlements, regional conflicts
- Scope: Knowledge specific to geographic area
- NPC Persona (Individual, YAML-defined)
- Personal background, personality, motivations
- Specific knowledge based on profession/role
- Personal relationships and secrets
- Scope: Character-specific information (already implemented in /api/app/data/npcs/*.yaml)
How Knowledge Layers Work Together
Contextual Knowledge Layering
When an NPC engages in conversation, build their knowledge context by:
- Always include: NPC persona + their region's lore DB
- Conditionally include: World lore (if the topic seems historical/broad)
- Use semantic search: Query each DB for relevant chunks based on conversation topic
Example Interaction Flow
Player asks tavern keeper: "Tell me about the old ruins north of town"
- Check NPC persona: "Are ruins mentioned in their background?"
- Query Regional DB: "old ruins + north + [town name]"
- If no hits, query World Lore DB: "ancient ruins + [region name]"
- Combine results with NPC personality filter
Result: NPC responds with appropriate lore, or authentically says "I don't know about that" if nothing is found.
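The fallback flow above can be sketched as a small function. This is a minimal sketch, not the final implementation: `search_regional` and `search_world` are hypothetical stand-ins for the real Weaviate queries, injected here so the routing logic can be shown on its own.

```python
def lookup_lore(topic: str, region_id: str,
                search_regional, search_world) -> list[str]:
    """Query the Regional DB first; fall back to World Lore on a miss."""
    hits = search_regional(f"{topic} {region_id}")
    if hits:
        return hits
    # No regional hits: broaden the search to world-level lore.
    return search_world(topic)

# Hypothetical stand-ins for the real vector searches:
regional = lambda q: [] if "ruins" in q else ["Harvest festival gossip"]
world = lambda q: ["Ancient ruins predate the Five Kingdoms..."]
```

If both searches come back empty, the caller passes nothing to the NPC context, which is what enables the authentic "I don't know about that" response.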
Knowledge Boundaries & Authenticity
NPCs Have Knowledge Limitations Based On:
- Profession: Blacksmith knows metallurgy lore, scholar knows history, farmer knows agricultural traditions
- Social Status: Nobles know court politics, commoners know street rumors
- Age/Experience: Elder NPCs might reference events from decades ago
- Travel History: Has this NPC been outside their region?
Implementation of "I don't know"
Add metadata to vector DB entries:
required_profession: ["scholar", "priest"]
social_class: ["noble", "merchant"]
knowledge_type: "academic" | "common" | "secret"
region_id: "thornhelm"
time_period: "ancient" | "recent" | "current"
Filter results before passing to the NPC's AI context, allowing authentic "I haven't heard of that" responses.
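The filtering step could look like the sketch below. The field names follow the metadata above; the `npc` dict shape (`profession`, `social_class`) is an assumption, since NPC personas live in YAML and their exact schema is outside this document.

```python
def visible_to_npc(chunk_meta: dict, npc: dict) -> bool:
    """Decide whether a lore chunk may enter this NPC's AI context.

    A missing or null restriction means the chunk is unrestricted.
    """
    req_prof = chunk_meta.get("required_profession")
    if req_prof and npc["profession"] not in req_prof:
        return False
    req_class = chunk_meta.get("social_class")
    if req_class and npc["social_class"] not in req_class:
        return False
    return True

farmer = {"profession": "farmer", "social_class": "commoner"}
academic_chunk = {"required_profession": ["scholar", "priest"],
                  "knowledge_type": "academic"}
common_chunk = {"required_profession": None, "knowledge_type": "common"}
```

Dropping the academic chunk before prompt assembly is what lets the farmer answer "I haven't heard of that" without the model ever seeing the restricted lore.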
Retrieval-Augmented Generation (RAG) Pattern
Building AI Prompts for NPC Dialogue
[NPC Persona from YAML]
+
[Top 3-5 relevant chunks from Regional DB based on conversation topic]
+
[Top 2-3 relevant chunks from World Lore if topic is broad/historical]
+
[Conversation history from character's npc_interactions]
→ Feed to Claude with instruction to stay in character and admit ignorance if uncertain
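A prompt assembler for the layering above might look like this sketch. The section labels and the closing instruction wording are illustrative assumptions, not final prompt text.

```python
def build_npc_prompt(persona: str, regional_chunks: list[str],
                     world_chunks: list[str], history: list[str]) -> str:
    """Assemble the RAG prompt layers in the order described above."""
    sections = [
        persona,
        "Relevant local lore:\n" + "\n".join(regional_chunks[:5]),
    ]
    if world_chunks:  # only included when the topic is broad/historical
        sections.append("Relevant world lore:\n" + "\n".join(world_chunks[:3]))
    if history:  # from character.npc_interactions
        sections.append("Recent conversation:\n" + "\n".join(history))
    sections.append("Stay in character. If the lore above does not cover "
                    "the topic, say you do not know.")
    return "\n\n".join(sections)
```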
DM Knowledge vs NPC Knowledge
DM Mode (Player talks directly to DM, not through NPC):
- DM has access to ALL databases without restrictions
- DM can reveal as much or as little as narratively appropriate
- DM can generate content not in databases (creative liberty)
NPC Mode (Player talks to specific NPC):
- NPC knowledge filtered by persona/role/location
- NPC can redirect: "You should ask the town elder about that" or "I've heard scholars at the university know more"
- Creates natural quest hooks and information-gathering gameplay
Technical Implementation
Technology Choice: Weaviate
Reasons for Weaviate:
- Self-hosted option for dev/beta
- Managed cloud service (Weaviate Cloud Services) for production
- Same API for both self-hosted and managed (easy migration)
- Rich metadata filtering capabilities
- Multi-tenancy support
- GraphQL API (fits strong typing preference)
- Hybrid search (semantic + keyword)
Storage & Indexing Strategy
Where Each DB Lives:
- World Lore: Single global vector DB collection
- Regional DBs: One collection with region metadata filtering
- Could use Weaviate multi-tenancy for efficient isolation
- Lazy-load when character enters region
- Cache in Redis for active sessions
- NPC Personas: Remain in YAML (structured data; no semantic search needed)
Weaviate Collections Structure:
Collections:
- WorldLore
- Metadata: knowledge_type, time_period, required_profession
- RegionalLore
- Metadata: region_id, knowledge_type, social_class
- Rumors (optional: dynamic/time-sensitive content)
- Metadata: region_id, expiration_date, source_npc
Semantic Chunk Strategy
Chunk lore content by logical units:
- Events: "The Battle of Thornhelm (Year 1204) - A decisive victory..."
- Locations: "The Abandoned Lighthouse - Once a beacon for traders..."
- Figures: "Lord Varric the Stern - Current ruler of Thornhelm..."
- Rumors/Gossip: "Strange lights have been seen in the forest lately..."
Each chunk gets embedded and stored with rich metadata for filtering.
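Turning a parsed lore entry into an upload-ready chunk could be as simple as the sketch below. The output field names (`chunk_id`, `text`) are assumptions; the input shape follows the example lore entry format planned for /api/app/data/lore/.

```python
def lore_entry_to_chunk(entry: dict) -> dict:
    """Flatten one parsed YAML lore entry into an object ready for upload.

    Title and content are combined into the text to embed, so semantic
    search can match on either; metadata rides along for filtering.
    """
    return {
        "chunk_id": entry["id"],
        "text": f"{entry['title']} - {entry['content'].strip()}",
        "metadata": entry.get("metadata", {}),
        "tags": entry.get("tags", []),
    }

entry = {
    "id": "thornhelm_founding",
    "title": "The Founding of Thornhelm",
    "content": "Thornhelm was founded in the year 847...\n",
    "metadata": {"region_id": "thornhelm", "knowledge_type": "common"},
    "tags": ["founding", "history"],
}
chunk = lore_entry_to_chunk(entry)
```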
Development Workflow
Index-Once Strategy
Rationale:
- Lore is relatively static (updates only during major version releases)
- Read-heavy workload (perfect for vector DBs)
- Cost-effective (one-time embedding generation)
- Allows thorough testing before deployment
Workflow Phases
Development:
- Write lore content (YAML/JSON/Markdown)
- Run embedding script locally
- Upload to local Weaviate instance (Docker)
- Test NPC conversations
- Iterate on lore content
Beta/Staging:
- Same self-hosted Weaviate, separate instance
- Finalize lore content
- Generate production embeddings
- Performance testing
Production:
- Migrate to Weaviate Cloud Services
- Upload final embedded lore
- Players query read-only
- No changes until next major update
Self-Hosted Development Setup
Docker Compose Example:
services:
  weaviate:
    image: semitechnologies/weaviate:latest
    ports:
      - "8080:8080"
    environment:
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true' # Dev only
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
    volumes:
      - weaviate_data:/var/lib/weaviate
volumes:
  weaviate_data:
Hardware Requirements (Self-Hosted):
- RAM: 4-8GB sufficient for beta
- CPU: Low (no heavy re-indexing)
- Storage: Minimal (vectors are compact)
Migration Path: Dev → Production
Zero-Code Migration
- Export data from self-hosted Weaviate (backup tools)
- Create Weaviate Cloud Services cluster
- Import data to WCS
- Change the WEAVIATE_URL environment variable
- Deploy code (no code changes required)
Environment Configuration:
# /api/config/development.yaml
weaviate:
  url: "http://localhost:8080"
  api_key: null

# /api/config/production.yaml
weaviate:
  url: "https://your-cluster.weaviate.network"
  api_key: "${WEAVIATE_API_KEY}" # From .env
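Resolving the `${WEAVIATE_API_KEY}` placeholder at load time can be done with a few lines of stdlib code. This is a sketch of one approach; the function name and the assumption that the app expands `${VAR}` references itself (rather than relying on a config library) are illustrative.

```python
import os
import re

def resolve_env_refs(value: str) -> str:
    """Expand ${VAR} references in a config value from the environment.

    Unset variables expand to an empty string.
    """
    return re.sub(r"\$\{(\w+)\}",
                  lambda m: os.environ.get(m.group(1), ""), value)

os.environ["WEAVIATE_API_KEY"] = "test-key"  # stand-in for .env loading
config = {"url": "https://your-cluster.weaviate.network",
          "api_key": resolve_env_refs("${WEAVIATE_API_KEY}")}
```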
Embedding Strategy
One-Time Embedding Generation
Since embeddings are generated once per release, prioritize quality over cost.
Embedding Model Options:
| Model | Pros | Cons | Recommendation |
|---|---|---|---|
| OpenAI text-embedding-3-large | High quality, good semantic understanding | Paid per use | Production |
| Cohere Embed v3 | Optimized for search, multilingual | Paid per use | Production alternative |
| sentence-transformers (OSS) | Free, self-hosted, fast iteration | Lower quality | Development/Testing |
Recommendation:
- Development: Use open-source models (iterate faster, zero cost)
- Production: Use OpenAI, or a hosted model such as multilingual-e5-large on Replicate (https://replicate.com/beautyyuyanli/multilingual-e5-large); quality matters for player experience
Embedding Generation Script
Will be implemented in /api/scripts/generate_lore_embeddings.py:
- Read lore files (YAML/JSON/Markdown)
- Chunk content appropriately
- Generate embeddings using chosen model
- Upload to Weaviate with metadata
- Validate retrieval quality
Content Management
Lore Content Structure
Storage Location: /api/app/data/lore/
/api/app/data/lore/
  world/
    history.yaml
    mythology.yaml
    kingdoms.yaml
  regions/
    thornhelm/
      history.yaml
      locations.yaml
      rumors.yaml
    silverwood/
      history.yaml
      locations.yaml
      rumors.yaml
Example Lore Entry (YAML):
- id: "thornhelm_founding"
  title: "The Founding of Thornhelm"
  content: |
    Thornhelm was founded in the year 847 by Lord Theron the Bold,
    a retired general seeking to establish a frontier town...
  metadata:
    region_id: "thornhelm"
    knowledge_type: "common"
    time_period: "historical"
    required_profession: null # Anyone can know this
    social_class: null # All classes
  tags:
    - "founding"
    - "lord-theron"
    - "history"
Version Control for Lore Updates
Complete Re-Index Strategy (Simplest, recommended):
- Delete old collections during maintenance window
- Upload new lore with embeddings
- Atomic cutover
- Works great for infrequent major updates
Alternative: Versioned Collections (Overkill for our use case):
- WorldLore_v1, WorldLore_v2 collections
- More overhead, probably unnecessary
Performance & Cost Optimization
Cost Considerations
Embedding Generation:
- One-time cost per lore chunk
- Only re-generate during major updates
- Estimated cost: $X per 1000 chunks (TBD based on model choice)
Vector Search:
- Query-time embedding is a single short string per search (negligible cost); retrieval itself is just index lookup
- Self-hosted: Infrastructure cost only
- Managed (WCS): Pay for storage + queries
Optimization Strategies:
- Pre-compute all embeddings at build time
- Cache frequently accessed regional DBs in Redis
- Only search World Lore DB if regional search returns no results (fallback pattern)
- Use cheaper embedding models for non-critical content
Retrieval Performance
Expected Query Times:
- Semantic search: < 100ms
- With metadata filtering: < 150ms
- Hybrid search: < 200ms
Caching Strategy:
- Cache top N regional lore chunks per active region in Redis
- TTL: 1 hour (or until session ends)
- Invalidate on major lore updates
Multiplayer Considerations
Shared World State
If multiple characters are in the same town talking to NPCs:
- Regional DB: Shared (same lore for everyone)
- World DB: Shared
- NPC Interactions: Character-specific (stored in character.npc_interactions)
Result: NPCs can reference world events consistently across players while maintaining individual relationships.
Testing Strategy
Validation Steps
- Retrieval Quality Testing
- Does semantic search return relevant lore?
- Are metadata filters working correctly?
- Do NPCs find appropriate information?
- NPC Knowledge Boundaries
- Can a farmer access academic knowledge? (Should be filtered out)
- Do profession filters work as expected?
- Do NPCs authentically say "I don't know" when appropriate?
- Performance Testing
- Query response times under load
- Cache hit rates
- Memory usage with multiple active regions
- Content Quality
- Is lore consistent across databases?
- Are there contradictions between world/regional lore?
- Is chunk size appropriate for context?
Implementation Phases
Phase 1: Proof of Concept (Current)
- Set up local Weaviate with Docker
- Create sample lore chunks (20-30 entries for one town)
- Generate embeddings and upload to Weaviate
- Build simple API endpoint for querying Weaviate
- Test NPC conversation with lore augmentation
Phase 2: Core Implementation
- Define lore content structure (YAML schema)
- Write lore for starter region
- Implement embedding generation script
- Create Weaviate service layer in /api/app/services/weaviate_service.py
- Integrate with NPC conversation system
- Add DM lore query endpoints
Phase 3: Content Expansion
- Write world lore content
- Write lore for additional regions
- Implement knowledge filtering logic
- Add lore discovery system (optional: player codex)
Phase 4: Production Readiness
- Migrate to Weaviate Cloud Services
- Performance optimization and caching
- Backup and disaster recovery
- Monitoring and alerting
Open Questions
- Authoring Tools: How will we create/maintain lore content efficiently?
- Manual YAML editing?
- AI-generated lore with human review?
- Web-based CMS?
- Lore Discovery: Should players unlock lore entries (codex-style) as they learn about them?
- Could be fun for completionists
- Adds gameplay loop around exploration
- Dynamic Lore: How to handle time-sensitive rumors or evolving world state?
- Separate "Rumors" collection with expiration dates?
- Regional events that trigger new lore entries?
- Chunk Size: What's optimal for context vs. precision?
- Too small: NPCs miss broader context
- Too large: Less precise retrieval
- Needs testing to determine
- Consistency Validation: How to ensure regional lore doesn't contradict world lore?
- Automated consistency checks?
- Manual review process?
- Lore versioning and dependency tracking?
Future Enhancements
- Player-Generated Lore: Allow DMs to add custom lore entries during sessions
- Lore Relationships: Graph connections between related lore entries
- Multilingual Support: Embed lore in multiple languages
- Seasonal/Event Lore: Time-based lore that appears during special events
- Quest Integration: Automatic lore unlock based on quest completion
References
- Weaviate Documentation: https://weaviate.io/developers/weaviate
- RAG Pattern Best Practices: (TBD)
- Embedding Model Comparisons: (TBD)
Notes
This strategy aligns with the project's core principles:
- Strong typing: Lore models will use dataclasses
- Configuration-driven: Lore content in YAML/JSON
- Microservices architecture: Weaviate is independent service
- Cost-conscious: Index-once strategy minimizes ongoing costs
- Future-proof: Easy migration from self-hosted to managed