Code_of_Conquest_Bright_Dawn/docs/arch.md

High-level view

[ Browser / Web GUI ]
        │  HTTPS (HTTP/3) via Cloudflare (DNS/WAF/CDN)
        ▼
[ Caddy API Gateway ]  — routing, TLS, real client IP, SSE/WebSocket pass-through
        │
        ├── /auth/*  →  [ Auth Service (Flask) ]
        ├── /api/*   →  [ Game API (Flask) ]
        │                 ├── calls → [ AI-DM Service (Flask) → Replicate ]
        │                 └── calls → [ Embeddings Service (Flask) ]
        │                                   └── KNN over pgvector
        │
        ├── presign → direct upload/download ↔ [ Garage (S3-compatible) ]
        │
        └── infra cache / rate limits ↔ [ Redis ]
                            
                          ┌─────────────────────────────────────────────┐
                          │                                             │
                          │          [ Postgres 16 + pgvector ]         │
                          │      (auth, game OLTP + semantic vectors)   │
                          └─────────────────────────────────────────────┘

Services & responsibilities

  • Web GUI

    • Player UX for auth, character management, sessions, chat.
    • Uses REST for CRUD; SSE/WebSocket for live DM replies/typing.
  • Caddy API Gateway

    • Edge routing for /auth, /api, /ai, /vec.
    • TLS termination behind Cloudflare; preserves real client IP; gzip/br.
    • Pass-through for SSE/WebSocket; access logging.
  • Auth Service (Flask)

    • Registration, login, refresh; JWT issuance/validation.
    • Owns player identity and credentials.
    • Simple rate limits via Redis.
  • Game API (Flask)

    • Core game domain (characters, sessions, inventory, rules orchestration).
    • Persists messages; orchestrates retrieval and AI calls.
    • Streams DM replies to clients (SSE/WebSocket).
    • Generates pre-signed URLs for Garage uploads/downloads.
  • AI-DM Service (Flask)

    • Thin, deterministic wrapper around Replicate models (prompt shaping, retries, timeouts).
    • Optional async path via job queue if responses are slow.
  • Embeddings Service (Flask)

    • Text → vector embedding (chosen model) and vector writes.
    • KNN search API (top-K over pgvector) for context retrieval.
    • Manages embedding version/dimension; supports re-embed workflows.
  • Postgres 16 + pgvector

    • Single source of truth for auth & game schemas.
    • Stores messages with a vector column; IVFFlat or HNSW index for similarity search.
  • Garage (S3-compatible)

    • Object storage for player assets (character sheets, images, exports).
    • Access via pre-signed URLs (private buckets by default).
  • Redis

    • Caching hot reads (recent messages/session state).
    • Rate limiting tokens; optional Dramatiq broker for long jobs.
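
The "rate limiting tokens" kept in Redis suggest a token-bucket scheme. The sketch below is a minimal in-memory illustration of that idea, not the project's implementation: the class name, the `capacity` and `refill_per_sec` parameters, and the use of a local object instead of a Redis key are all assumptions made for the example.

```python
import time
from typing import Optional


class TokenBucket:
    """In-memory token bucket (hypothetical helper).

    A Redis-backed version would store `tokens` and `updated_at` per client key.
    """

    def __init__(self, capacity: float, refill_per_sec: float) -> None:
        self.capacity = capacity              # maximum burst size
        self.refill_per_sec = refill_per_sec  # sustained requests per second
        self.tokens = capacity                # start with a full bucket
        self.updated_at: Optional[float] = None

    def allow(self, now: Optional[float] = None) -> bool:
        """Return True and consume one token if the request may proceed."""
        now = time.monotonic() if now is None else now
        if self.updated_at is not None:
            # Refill in proportion to elapsed time, never above capacity.
            elapsed = max(0.0, now - self.updated_at)
            self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

With `capacity=2` and `refill_per_sec=1.0`, two back-to-back requests pass, a third is rejected, and one more is allowed a second later. The optional `now` argument exists only to make the behaviour easy to test.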

Data boundaries

  • Auth schema (Postgres)

    • players(id, email, password_hash, created_at, …)
    • Ownership: only the Auth Service reads or writes this schema; other services obtain identity through Auth or from JWT claims.
  • Game schema (Postgres)

    • characters(id, player_id, name, clazz, level, sheet_json, …)

    • sessions(id, player_id, title, created_at, …)

    • messages(id, session_id, role, content, embedding vector(…)=NULL, created_at, …)

    • Indices:

      • messages(session_id, created_at)
      • messages USING hnsw|ivfflat (embedding vector_cosine_ops)
  • Objects (Garage)

    • Buckets: player-assets, exports, etc.
    • Keys include tenant/player and content hashes; metadata stored in DB.
  • Cache/queues (Redis)

    • Keys for rate limits, short-lived session state, optional job queues.
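
To make the object-key convention above concrete ("keys include tenant/player and content hashes"), here is one possible layout. The `players/<player_id>/<sha256>.<ext>` pattern, the function name, and the `bin` fallback extension are assumptions for illustration; the document does not specify the exact format.

```python
import hashlib


def build_object_key(player_id: str, content: bytes, filename: str) -> str:
    """Build a content-addressed key scoped to one player (hypothetical layout)."""
    # Hashing the content means identical uploads map to the same key.
    digest = hashlib.sha256(content).hexdigest()
    # Keep only a lower-cased extension from the client-supplied filename.
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else "bin"
    return f"players/{player_id}/{digest}.{ext}"
```

Because the key is derived from the content, re-uploading the same file is idempotent, and the checksum stored in the metadata row can be compared against the key itself.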

Core request flows

A) Player message → DM reply (sync POC)

  1. Web GUI → POST /api/sessions/{id}/messages (JWT).
  2. Game API writes player message (content only).
  3. Game API sends the content to the Embeddings Service, which returns a vector → Game API updates messages.embedding.
  4. Embeddings Service (or direct SQL) performs KNN to fetch top-K prior messages.
  5. Game API calls AI-DM Service with {prompt, context, system}.
  6. AI-DM calls Replicate, returns text.
  7. Game API writes DM message (+ embedding), emits SSE/WebSocket event to client.
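
The seven steps above can be sketched as a single orchestration function. This is a rough outline under stated assumptions: `Message`, `TurnDeps`, and `handle_player_message` are hypothetical names, the store is a plain list standing in for the messages table, and the embedding, KNN, generation, and event-emission calls are injected as callables rather than real HTTP clients.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

Vector = List[float]


@dataclass
class Message:
    session_id: str
    role: str                      # "player" or "dm"
    content: str
    embedding: Optional[Vector] = None


@dataclass
class TurnDeps:
    """Stand-ins for the Embeddings Service, KNN query, AI-DM Service, and SSE emitter."""
    embed: Callable[[str], Vector]
    knn: Callable[[str, Vector, int], Sequence[Message]]
    generate: Callable[[str, Sequence[str], str], str]
    emit: Callable[[Message], None]


def handle_player_message(
    store: List[Message],
    deps: TurnDeps,
    session_id: str,
    text: str,
    system: str = "You are the DM.",   # assumed default system prompt
    top_k: int = 5,                     # assumed context size
) -> Message:
    player_msg = Message(session_id, "player", text)
    store.append(player_msg)                                    # step 2: content only
    player_msg.embedding = deps.embed(text)                     # step 3: attach vector
    context = deps.knn(session_id, player_msg.embedding, top_k) # step 4: top-K prior messages
    reply = deps.generate(text, [m.content for m in context], system)  # steps 5-6
    dm_msg = Message(session_id, "dm", reply, deps.embed(reply))
    store.append(dm_msg)                                        # step 7: persist DM message
    deps.emit(dm_msg)                                           # step 7: notify the client
    return dm_msg
```

Injecting the four dependencies keeps the flow testable without a database or network, and matches the stated goal of being able to swap pieces later.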

B) Asset upload (character sheet/map)

  1. Web GUI → POST /api/assets/presign {bucket, key, contentType} (JWT).
  2. Game API validates ACLs → returns pre-signed PUT URL for Garage.
  3. Browser uploads directly to Garage.
  4. Game API records/updates asset metadata row (owner, key, checksum, type).

C) Authentication

  1. Web GUI → Auth: POST /auth/register or POST /auth/login.
  2. Auth returns {access, refresh} JWTs.
  3. Subsequent API calls include access token (Caddy passes through).
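
As an illustration of step 2, the following sketch issues and verifies an {access, refresh} pair using only the standard library. It assumes HS256 signing, a shared secret, a `typ` claim to distinguish the two tokens, and TTLs of 15 minutes and 14 days; none of these choices come from the document, and a real service would more likely rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Dict, Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: bytes) -> bytes:
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def encode_jwt(claims: dict, secret: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def decode_jwt(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    """Verify signature and expiry; raises ValueError on any failure."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, sig_b64 = parts
    # The signature is always recomputed as HS256, so the header's alg is never trusted.
    expected = _sign(f"{header_b64}.{payload_b64}", secret)
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    if claims.get("exp", float("inf")) <= (time.time() if now is None else now):
        raise ValueError("token expired")
    return claims


def issue_token_pair(
    player_id: str,
    secret: bytes,
    now: Optional[float] = None,
    access_ttl: int = 15 * 60,             # assumed: 15 minutes
    refresh_ttl: int = 14 * 24 * 3600,     # assumed: 14 days
) -> Dict[str, str]:
    issued = int(time.time() if now is None else now)

    def make(kind: str, ttl: int) -> str:
        claims = {"sub": player_id, "typ": kind, "iat": issued, "exp": issued + ttl}
        return encode_jwt(claims, secret)

    return {"access": make("access", access_ttl), "refresh": make("refresh", refresh_ttl)}
```

Downstream services such as the Game API would call something like `decode_jwt` on the access token and read the player id from the `sub` claim, which is how they can avoid querying the auth schema directly.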

D) Retrieval-augmented turn (refine/search only)

  1. Game API (server-side) computes query embedding for player prompt.
  2. KNN over messages.embedding returns top-K context.
  3. Context trimmed/serialized and sent to AI-DM Service.
  4. Reply streamed back to client; transcripts persisted.
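
Steps 2 and 3 can be illustrated in plain Python. The sketch ranks stored messages by cosine distance, which is the metric implied by the `vector_cosine_ops` index, and then trims the result to a character budget. The function names, the character-based budget, and the handling of zero-length vectors are assumptions; in the real system the ranking would run inside Postgres.

```python
import math
from typing import List, Optional, Sequence, Tuple

# Roughly equivalent pgvector query (<=> is the cosine-distance operator):
#   SELECT content FROM messages
#   WHERE session_id = %s AND embedding IS NOT NULL
#   ORDER BY embedding <=> %s LIMIT %s;


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # assumption for this sketch: treat zero vectors as unrelated
    return 1.0 - dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    rows: Sequence[Tuple[str, Optional[Sequence[float]]]],
    k: int,
) -> List[str]:
    """Return the content of the k rows closest to the query, skipping unembedded rows."""
    scored = [(cosine_distance(query, emb), content) for content, emb in rows if emb is not None]
    scored.sort(key=lambda pair: pair[0])
    return [content for _, content in scored[:k]]


def trim_context(snippets: Sequence[str], max_chars: int) -> List[str]:
    """Keep the best-ranked snippets, in order, until the character budget is spent."""
    kept: List[str] = []
    used = 0
    for snippet in snippets:
        if used + len(snippet) > max_chars:
            break
        kept.append(snippet)
        used += len(snippet)
    return kept
```

A token-based budget would be more accurate for model context limits; characters are used here only to keep the example dependency-free.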

E) Long/slow generations (async job queue)

  1. Game API enqueues job (Redis/Dramatiq) to AI-DM.
  2. Returns {job_id}; Web GUI subscribes via SSE.
  3. Worker completes → Game API writes DM message and emits event.
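
The async path can be sketched with an in-memory queue standing in for Redis/Dramatiq. Everything here is illustrative: `InMemoryJobs`, the event names `queued`, `dm_message`, and `failed`, and the payload fields are hypothetical. Only the wire format of the frames (an `event:` line, a `data:` line, then a blank line) follows the standard SSE format.

```python
import json
import uuid
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple


def format_sse(event: str, data: dict) -> str:
    """Serialize one Server-Sent Events frame; json.dumps keeps the payload on one line."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


class InMemoryJobs:
    """Single-process stand-in for the job queue and the per-job event stream."""

    def __init__(self, generate: Callable[[str], str]) -> None:
        self._generate = generate                       # stand-in for the AI-DM call
        self._pending: Deque[Tuple[str, str, str]] = deque()
        self.events: Dict[str, List[str]] = {}          # job_id -> SSE frames so far

    def enqueue(self, session_id: str, prompt: str) -> str:
        """Steps 1-2: queue the job and hand the job id back to the caller."""
        job_id = uuid.uuid4().hex
        self._pending.append((job_id, session_id, prompt))
        self.events[job_id] = [format_sse("queued", {"job_id": job_id})]
        return job_id

    def run_next(self) -> bool:
        """Step 3: process one job as a worker would; returns False if the queue is empty."""
        if not self._pending:
            return False
        job_id, session_id, prompt = self._pending.popleft()
        try:
            reply = self._generate(prompt)
            payload = {"job_id": job_id, "session_id": session_id, "role": "dm", "content": reply}
            frame = format_sse("dm_message", payload)
        except Exception as exc:  # report the failure to the subscriber instead of dropping it
            frame = format_sse("failed", {"job_id": job_id, "error": str(exc)})
        self.events[job_id].append(frame)
        return True
```

In a deployed setup the worker would run in a separate process and the Game API would persist the DM message before emitting the event; the sketch collapses both into `run_next` to show the ordering.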

This design keeps each service small and focused: Flask for every application service, Caddy + Cloudflare at the edge, Postgres + pgvector for state and search, and Garage for durable assets, with clean seams for swapping pieces as you scale.