
Vector Database Strategy

Overview

This document outlines the strategy for implementing layered knowledge systems using vector databases to provide NPCs and the Dungeon Master with contextual lore, regional history, and world knowledge.

Status: Planning Phase
Last Updated: November 26, 2025
Decision: Use Weaviate for vector database implementation


Knowledge Hierarchy

Three-Tier Vector Database Structure

  1. World Lore DB (Global)

    • Broad historical events, mythology, major kingdoms, legendary figures
    • Accessible to all NPCs and DM for player questions
    • Examples: "The Great War 200 years ago", "The origin of magic", "The Five Kingdoms"
    • Scope: Universal knowledge any educated NPC might know
  2. Regional/Town Lore DB (Location-specific)

    • Local history, notable events, landmarks, politics, rumors
    • Current town leadership, recent events, local legends
    • Trade routes, neighboring settlements, regional conflicts
    • Scope: Knowledge specific to geographic area
  3. NPC Persona (Individual, YAML-defined)

    • Personal background, personality, motivations
    • Specific knowledge based on profession/role
    • Personal relationships and secrets
    • Scope: Character-specific information (already implemented in /api/app/data/npcs/*.yaml)

How Knowledge Layers Work Together

Contextual Knowledge Layering

When an NPC engages in conversation, build their knowledge context by:

  • Always include: NPC persona + their region's lore DB
  • Conditionally include: World lore (if the topic seems historical/broad)
  • Use semantic search: Query each DB for relevant chunks based on conversation topic

Example Interaction Flow

Player asks tavern keeper: "Tell me about the old ruins north of town"

  1. Check NPC persona: "Are ruins mentioned in their background?"
  2. Query Regional DB: "old ruins + north + [town name]"
  3. If no hits, query World Lore DB: "ancient ruins + [region name]"
  4. Combine results with NPC personality filter

Result: NPC responds with appropriate lore, or authentically says "I don't know about that" if nothing is found.
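
This fallback flow could look like the following sketch, assuming a hypothetical weaviate_service wrapper with a search(collection, query, filters, limit) method (all names are illustrative, not a final API):

def get_npc_lore_context(npc, player_query, region_id):
    """Gather lore for an NPC reply: regional DB first, world DB as fallback."""
    # 1. The persona is always included (already loaded from YAML).
    context = {"persona": npc.persona}

    # 2. Query the regional collection, scoped to the NPC's home region.
    regional_hits = weaviate_service.search(
        collection="RegionalLore",
        query=player_query,
        filters={"region_id": region_id},
        limit=5,
    )
    context["regional"] = regional_hits

    # 3. Only fall back to world lore if the regional search returned nothing.
    if not regional_hits:
        context["world"] = weaviate_service.search(
            collection="WorldLore", query=player_query, limit=3
        )

    # 4. The caller applies the NPC personality filter before prompting the LLM.
    return context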


Knowledge Boundaries & Authenticity

NPCs Have Knowledge Limitations Based On:

  • Profession: Blacksmith knows metallurgy lore, scholar knows history, farmer knows agricultural traditions
  • Social Status: Nobles know court politics, commoners know street rumors
  • Age/Experience: Elder NPCs might reference events from decades ago
  • Travel History: Has this NPC been outside their region?

Implementation of "I don't know"

Add metadata to vector DB entries:

  • required_profession: ["scholar", "priest"]
  • social_class: ["noble", "merchant"]
  • knowledge_type: "academic" | "common" | "secret"
  • region_id: "thornhelm"
  • time_period: "ancient" | "recent" | "current"

Filter results before passing to the NPC's AI context, allowing authentic "I haven't heard of that" responses.
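
As a sketch, that filtering could use the Weaviate Python client (v4 API), assuming a RegionalLore collection with these metadata properties and a configured vectorizer; the property values here are illustrative:

import weaviate
from weaviate.classes.query import Filter

client = weaviate.connect_to_local()  # local dev instance
lore = client.collections.get("RegionalLore")

# Semantic search restricted to lore this NPC is allowed to know.
results = lore.query.near_text(
    query="the old ruins north of town",
    filters=(
        Filter.by_property("region_id").equal("thornhelm")
        & Filter.by_property("knowledge_type").equal("common")
    ),
    limit=5,
)

# An empty result list becomes an authentic "I haven't heard of that."
for obj in results.objects:
    print(obj.properties["title"])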


Retrieval-Augmented Generation (RAG) Pattern

Building AI Prompts for NPC Dialogue

[NPC Persona from YAML]
+
[Top 3-5 relevant chunks from Regional DB based on conversation topic]
+
[Top 2-3 relevant chunks from World Lore if topic is broad/historical]
+
[Conversation history from character's npc_interactions]
→ Feed to Claude with instruction to stay in character and admit ignorance if uncertain
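
In code, the prompt assembly might look like this sketch (field names such as persona_text are placeholders, not the implemented models):

def build_npc_prompt(npc, regional_chunks, world_chunks, history):
    """Assemble the layered RAG prompt for a single NPC reply."""
    lore = "\n".join(c["content"] for c in regional_chunks + world_chunks)
    return (
        f"{npc.persona_text}\n\n"
        f"Lore you may draw on:\n{lore}\n\n"
        f"Recent conversation:\n{history}\n\n"
        "Stay in character. If the lore above does not cover a question, "
        "admit you do not know rather than inventing facts."
    )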

DM Knowledge vs NPC Knowledge

DM Mode (Player talks directly to DM, not through NPC):

  • DM has access to ALL databases without restrictions
  • DM can reveal as much or as little as narratively appropriate
  • DM can generate content not in databases (creative liberty)

NPC Mode (Player talks to specific NPC):

  • NPC knowledge filtered by persona/role/location
  • NPC can redirect: "You should ask the town elder about that" or "I've heard scholars at the university know more"
  • Creates natural quest hooks and information-gathering gameplay

Technical Implementation

Technology Choice: Weaviate

Reasons for Weaviate:

  • Self-hosted option for dev/beta
  • Managed cloud service (Weaviate Cloud Services) for production
  • Same API for both self-hosted and managed (easy migration)
  • Rich metadata filtering capabilities
  • Multi-tenancy support
  • GraphQL API (fits strong typing preference)
  • Hybrid search (semantic + keyword)

Storage & Indexing Strategy

Where Each DB Lives:

  • World Lore: Single global vector DB collection
  • Regional DBs: One collection with region metadata filtering
    • Could use Weaviate multi-tenancy for efficient isolation
    • Lazy-load when character enters region
    • Cache in Redis for active sessions
  • NPC Personas: Remain in YAML (structured data; no semantic search needed)

Weaviate Collections Structure:

Collections:
- WorldLore
  - Metadata: knowledge_type, time_period, required_profession
- RegionalLore
  - Metadata: region_id, knowledge_type, social_class
- Rumors (optional: dynamic/time-sensitive content)
  - Metadata: region_id, expiration_date, source_npc
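
Creating these collections with the v4 Python client could look like the sketch below; the vectorizer is a placeholder until the embedding model is chosen (see Embedding Strategy):

import weaviate
from weaviate.classes.config import Configure, DataType, Property

client = weaviate.connect_to_local()
client.collections.create(
    "RegionalLore",
    vectorizer_config=Configure.Vectorizer.text2vec_openai(),  # placeholder choice
    properties=[
        Property(name="title", data_type=DataType.TEXT),
        Property(name="content", data_type=DataType.TEXT),
        Property(name="region_id", data_type=DataType.TEXT),
        Property(name="knowledge_type", data_type=DataType.TEXT),
        Property(name="social_class", data_type=DataType.TEXT_ARRAY),
    ],
)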

Semantic Chunk Strategy

Chunk lore content by logical units:

  • Events: "The Battle of Thornhelm (Year 1204) - A decisive victory..."
  • Locations: "The Abandoned Lighthouse - Once a beacon for traders..."
  • Figures: "Lord Varric the Stern - Current ruler of Thornhelm..."
  • Rumors/Gossip: "Strange lights have been seen in the forest lately..."

Each chunk gets embedded and stored with rich metadata for filtering.
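
Given the project's strong-typing preference (see Notes), each chunk could be modeled as a dataclass before embedding; this is a sketch, not a finalized schema:

from dataclasses import dataclass, field

@dataclass
class LoreChunk:
    """One embeddable unit of lore plus the metadata used for filtering."""
    id: str
    title: str
    content: str
    region_id: str | None = None        # None for world-level lore
    knowledge_type: str = "common"      # "academic" | "common" | "secret"
    time_period: str = "current"        # "ancient" | "recent" | "current"
    required_profession: list[str] | None = None  # None = anyone can know this
    social_class: list[str] | None = None         # None = all classes
    tags: list[str] = field(default_factory=list)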


Development Workflow

Index-Once Strategy

Rationale:

  • Lore is relatively static (updates only during major version releases)
  • Read-heavy workload (perfect for vector DBs)
  • Cost-effective (one-time embedding generation)
  • Allows thorough testing before deployment

Workflow Phases

Development:

  1. Write lore content (YAML/JSON/Markdown)
  2. Run embedding script locally
  3. Upload to local Weaviate instance (Docker)
  4. Test NPC conversations
  5. Iterate on lore content

Beta/Staging:

  1. Same self-hosted Weaviate, separate instance
  2. Finalize lore content
  3. Generate production embeddings
  4. Performance testing

Production:

  1. Migrate to Weaviate Cloud Services
  2. Upload final embedded lore
  3. Players query read-only
  4. No changes until next major update

Self-Hosted Development Setup

Docker Compose Example:

services:
  weaviate:
    image: semitechnologies/weaviate:latest  # pin a version tag for reproducible dev setups
    ports:
      - "8080:8080"
    environment:
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'  # Dev only
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
    volumes:
      - weaviate_data:/var/lib/weaviate

volumes:
  weaviate_data:  # named volumes must be declared at the top level

Hardware Requirements (Self-Hosted):

  • RAM: 4-8GB sufficient for beta
  • CPU: Low (no heavy re-indexing)
  • Storage: Minimal (vectors are compact)

Migration Path: Dev → Production

Zero-Code Migration

  1. Export data from self-hosted Weaviate (backup tools)
  2. Create Weaviate Cloud Services cluster
  3. Import data to WCS
  4. Change environment variable: WEAVIATE_URL
  5. Deploy code (no code changes required)

Environment Configuration:

# /api/config/development.yaml
weaviate:
  url: "http://localhost:8080"
  api_key: null

# /api/config/production.yaml
weaviate:
  url: "https://your-cluster.weaviate.network"
  api_key: "${WEAVIATE_API_KEY}"  # From .env

Embedding Strategy

One-Time Embedding Generation

Since embeddings are generated once per release, prioritize quality over cost.

Embedding Model Options:

| Model | Pros | Cons | Recommendation |
| --- | --- | --- | --- |
| OpenAI text-embedding-3-large | High quality, good semantic understanding | Paid per use | Production |
| Cohere Embed v3 | Optimized for search, multilingual | Paid per use | Production alternative |
| sentence-transformers (OSS) | Free, self-hosted, fast iteration | Lower quality | Development/testing |

Recommendation: Use sentence-transformers for local development and testing, then generate the final production embeddings with OpenAI text-embedding-3-large; since embeddings are generated once per release, the paid model is a one-time cost.

Embedding Generation Script

Will be implemented in /api/scripts/generate_lore_embeddings.py:

  1. Read lore files (YAML/JSON/Markdown)
  2. Chunk content appropriately
  3. Generate embeddings using chosen model
  4. Upload to Weaviate with metadata
  5. Validate retrieval quality
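
A condensed sketch of that script, assuming sentence-transformers for local development (model choice, file routing, and property names are placeholders):

from pathlib import Path

import weaviate
import yaml
from sentence_transformers import SentenceTransformer

LORE_DIR = Path("api/app/data/lore")
model = SentenceTransformer("all-MiniLM-L6-v2")  # dev-quality model; swap for production

def load_entries():
    """Yield every lore entry from the YAML files (world/regional routing omitted)."""
    for path in LORE_DIR.rglob("*.yaml"):
        yield from yaml.safe_load(path.read_text())

def main():
    client = weaviate.connect_to_local()
    lore = client.collections.get("RegionalLore")
    with lore.batch.dynamic() as batch:
        for entry in load_entries():
            batch.add_object(
                properties={"title": entry["title"], "content": entry["content"], **entry["metadata"]},
                vector=model.encode(entry["content"]).tolist(),
            )
    client.close()

if __name__ == "__main__":
    main()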

Content Management

Lore Content Structure

Storage Location: /api/app/data/lore/

/api/app/data/lore/
  world/
    history.yaml
    mythology.yaml
    kingdoms.yaml
  regions/
    thornhelm/
      history.yaml
      locations.yaml
      rumors.yaml
    silverwood/
      history.yaml
      locations.yaml
      rumors.yaml

Example Lore Entry (YAML):

- id: "thornhelm_founding"
  title: "The Founding of Thornhelm"
  content: |
    Thornhelm was founded in the year 847 by Lord Theron the Bold,
    a retired general seeking to establish a frontier town...
  metadata:
    region_id: "thornhelm"
    knowledge_type: "common"
    time_period: "historical"
    required_profession: null  # Anyone can know this
    social_class: null  # All classes
  tags:
    - "founding"
    - "lord-theron"
    - "history"

Version Control for Lore Updates

Complete Re-Index Strategy (Simplest, recommended):

  1. Delete old collections during maintenance window
  2. Upload new lore with embeddings
  3. Atomic cutover
  4. Works great for infrequent major updates
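
With the v4 client, that re-index can stay small; create_collections and upload_lore are hypothetical helpers wrapping the schema definition and the embedding script above:

# Run inside the maintenance window.
client.collections.delete("RegionalLore")  # drop the old collection
client.collections.delete("WorldLore")
create_collections(client)  # recreate schemas (hypothetical helper)
upload_lore(client)         # re-run the embedding upload (hypothetical helper)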

Alternative: Versioned Collections (Overkill for our use case):

  • WorldLore_v1, WorldLore_v2
  • More overhead, probably unnecessary

Performance & Cost Optimization

Cost Considerations

Embedding Generation:

  • One-time cost per lore chunk
  • Only re-generate during major updates
  • Estimated cost: $X per 1000 chunks (TBD based on model choice)

Vector Search:

  • No embedding cost for queries (just retrieval)
  • Self-hosted: Infrastructure cost only
  • Managed (WCS): Pay for storage + queries

Optimization Strategies:

  • Pre-compute all embeddings at build time
  • Cache frequently accessed regional DBs in Redis
  • Only search World Lore DB if regional search returns no results (fallback pattern)
  • Use cheaper embedding models for non-critical content

Retrieval Performance

Expected Query Times:

  • Semantic search: < 100ms
  • With metadata filtering: < 150ms
  • Hybrid search: < 200ms

Caching Strategy:

  • Cache top N regional lore chunks per active region in Redis
  • TTL: 1 hour (or until session ends)
  • Invalidate on major lore updates
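
The Redis side could be a thin wrapper like this sketch, using the redis-py client (key format and TTL are illustrative):

import json

import redis

r = redis.Redis()

def cache_region_lore(region_id: str, chunks: list[dict], ttl_seconds: int = 3600):
    """Cache the top lore chunks for an active region with a 1-hour TTL."""
    r.setex(f"lore:region:{region_id}", ttl_seconds, json.dumps(chunks))

def get_cached_region_lore(region_id: str) -> list[dict] | None:
    cached = r.get(f"lore:region:{region_id}")
    return json.loads(cached) if cached else None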

Multiplayer Considerations

Shared World State

If multiple characters are in the same town talking to NPCs:

  • Regional DB: Shared (same lore for everyone)
  • World DB: Shared
  • NPC Interactions: Character-specific (stored in character.npc_interactions)

Result: NPCs can reference world events consistently across players while maintaining individual relationships.


Testing Strategy

Validation Steps

  1. Retrieval Quality Testing

    • Does semantic search return relevant lore?
    • Are metadata filters working correctly?
    • Do NPCs find appropriate information?
  2. NPC Knowledge Boundaries

    • Can a farmer access academic knowledge? (Should be filtered out)
    • Do profession filters work as expected?
    • Do NPCs authentically say "I don't know" when appropriate?
  3. Performance Testing

    • Query response times under load
    • Cache hit rates
    • Memory usage with multiple active regions
  4. Content Quality

    • Is lore consistent across databases?
    • Are there contradictions between world/regional lore?
    • Is chunk size appropriate for context?

Implementation Phases

Phase 1: Proof of Concept (Current)

  • Set up local Weaviate with Docker
  • Create sample lore chunks (20-30 entries for one town)
  • Generate embeddings and upload to Weaviate
  • Build simple API endpoint for querying Weaviate
  • Test NPC conversation with lore augmentation

Phase 2: Core Implementation

  • Define lore content structure (YAML schema)
  • Write lore for starter region
  • Implement embedding generation script
  • Create Weaviate service layer in /api/app/services/weaviate_service.py
  • Integrate with NPC conversation system
  • Add DM lore query endpoints

Phase 3: Content Expansion

  • Write world lore content
  • Write lore for additional regions
  • Implement knowledge filtering logic
  • Add lore discovery system (optional: player codex)

Phase 4: Production Readiness

  • Migrate to Weaviate Cloud Services
  • Performance optimization and caching
  • Backup and disaster recovery
  • Monitoring and alerting

Open Questions

  1. Authoring Tools: How will we create/maintain lore content efficiently?

    • Manual YAML editing?
    • AI-generated lore with human review?
    • Web-based CMS?
  2. Lore Discovery: Should players unlock lore entries (codex-style) as they learn about them?

    • Could be fun for completionists
    • Adds gameplay loop around exploration
  3. Dynamic Lore: How to handle time-sensitive rumors or evolving world state?

    • Separate "Rumors" collection with expiration dates?
    • Regional events that trigger new lore entries?
  4. Chunk Size: What's optimal for context vs. precision?

    • Too small: NPCs miss broader context
    • Too large: Less precise retrieval
    • Needs testing to determine
  5. Consistency Validation: How to ensure regional lore doesn't contradict world lore?

    • Automated consistency checks?
    • Manual review process?
    • Lore versioning and dependency tracking?

Future Enhancements

  • Player-Generated Lore: Allow DMs to add custom lore entries during sessions
  • Lore Relationships: Graph connections between related lore entries
  • Multilingual Support: Embed lore in multiple languages
  • Seasonal/Event Lore: Time-based lore that appears during special events
  • Quest Integration: Automatic lore unlock based on quest completion

Notes

This strategy aligns with the project's core principles:

  • Strong typing: Lore models will use dataclasses
  • Configuration-driven: Lore content in YAML/JSON
  • Microservices architecture: Weaviate runs as an independent service
  • Cost-conscious: Index-once strategy minimizes ongoing costs
  • Future-proof: Easy migration from self-hosted to managed