# High-level view
```
[ Browser / Web GUI ]
│ HTTPS (HTTP/3) via Cloudflare (DNS/WAF/CDN)
[ Caddy API Gateway ] — routing, TLS, real client IP, SSE/WebSocket pass-through
├── /auth/* → [ Auth Service (Appwrite) ]
├── /api/* → [ Game API (Flask) ]
│   ├── calls → [ AI-DM Service (Flask) → Replicate ]
│   └── calls → [ Embeddings Service (Flask) ]
│       └── KNN over pgvector
├── presign → direct upload/download ↔ [ Appwrite ]
└── infra cache / rate limits ↔ [ Redis ]
┌────────────────────────────────────────┐
│                                        │
│  [ Postgres 16 + pgvector ]            │
│  (auth, game OLTP + semantic vectors)  │
└────────────────────────────────────────┘
```
---
## Services & responsibilities
* **Web GUI**
  * Player UX for auth, character management, sessions, chat.
  * Uses REST for CRUD; SSE/WebSocket for live DM replies/typing.
* **Caddy API Gateway**
  * Edge routing for `/auth`, `/api`, `/ai`, `/vec`.
  * TLS termination behind Cloudflare; preserves real client IP; gzip/br.
  * Pass-through for SSE/WebSocket; access logging.
* **Auth Service (Flask)**
  * Registration, login, refresh; JWT issuance/validation.
  * Owns player identity and credentials.
  * Simple rate limits via Redis.
* **Game API (Flask)**
  * Core game domain (characters, sessions, inventory, rules orchestration).
  * Persists messages; orchestrates retrieval and AI calls.
  * Streams DM replies to clients (SSE/WebSocket).
  * Generates pre-signed URLs for Garage uploads/downloads.
* **AI-DM Service (Flask)**
  * Thin, deterministic wrapper around **Replicate** models (prompt shaping, retries, timeouts).
  * Optional async path via job queue if responses are slow.
* **Embeddings Service (Flask)**
  * Text → vector embedding (chosen model) and vector writes.
  * KNN search API (top-K over `pgvector`) for context retrieval.
  * Manages embedding version/dimension; supports re-embed workflows.
* **Postgres 16 + pgvector**
  * Single source of truth for auth & game schemas.
  * Stores messages with a `vector` column; IVFFlat or HNSW index for similarity search.
* **Garage (S3-compatible)**
  * Object storage for player assets (character sheets, images, exports).
  * Access via pre-signed URLs (private buckets by default).
* **Redis**
  * Caching hot reads (recent messages/session state).
  * Rate limiting tokens; optional Dramatiq broker for long jobs.
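
The AI-DM bullet above mentions retries and timeouts without saying how they might look. The following is a minimal sketch of one common approach, exponential backoff around an arbitrary callable. The names `call_with_retries` and `TransientError`, the default attempt count, and the delays are all assumptions for illustration; the real Replicate call would be whatever function is passed in as `fn`.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, 5xx)."""


def call_with_retries(
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on TransientError with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(attempts):
        try:
            return fn()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles each time: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```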
---
## Data boundaries
* **Auth schema (Postgres)**
  * `players(id, email, password_hash, created_at, …)`
  * Service: **Auth** exclusively reads/writes; others read via Auth or JWT claims.
* **Game schema (Postgres)**
  * `characters(id, player_id, name, clazz, level, sheet_json, …)`
  * `sessions(id, player_id, title, created_at, …)`
  * `messages(id, session_id, role, content, embedding vector(…)=NULL, created_at, …)`
  * Indices:
    * `messages(session_id, created_at)`
    * `messages USING hnsw|ivfflat (embedding vector_cosine_ops)`
* **Objects (Garage)**
  * Buckets: `player-assets`, `exports`, etc.
  * Keys include tenant/player and content hashes; metadata stored in DB.
* **Cache/queues (Redis)**
  * Keys for rate limits, short-lived session state, optional job queues.
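
The object-key convention above ("tenant/player and content hashes") could be realized in several ways. As a sketch only, the helper below assumes a `<player_id>/<sha256>/<filename>` layout; the function name `build_object_key`, the choice of SHA-256, and the exact segment order are assumptions, not something this document specifies.

```python
import hashlib


def build_object_key(player_id: str, filename: str, content: bytes) -> str:
    """Return '<player_id>/<sha256 hex>/<filename>' for an uploaded object."""
    # Reject separators so a caller cannot escape their own prefix.
    if "/" in player_id or "/" in filename:
        raise ValueError("player_id and filename must not contain '/'")
    digest = hashlib.sha256(content).hexdigest()
    return f"{player_id}/{digest}/{filename}"
```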
---
## Core request flows
### A) Player message → DM reply (sync POC)
1. Web GUI → `POST /api/sessions/{id}/messages` (JWT).
2. **Game API** writes player message (content only).
3. **Game API** sends the content to the **Embeddings Service**, which returns a vector; **Game API** then updates `messages.embedding`.
4. **Embeddings Service** (or direct SQL) performs KNN to fetch top-K prior messages.
5. **Game API** calls **AI-DM Service** with `{prompt, context, system}`.
6. **AI-DM** calls **Replicate**, returns text.
7. **Game API** writes DM message (+ embedding), emits SSE/WebSocket event to client.
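
The steps above can be sketched as a single orchestration function. This is an illustration under stated assumptions, not the project's code: `TurnHandler`, `Message`, and the four injected callables are hypothetical names, the in-memory `store` list stands in for Postgres, and the HTTP layer and JWT check from step 1 are left out.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence

Vector = List[float]


@dataclass
class Message:
    session_id: str
    role: str  # "player" or "dm"
    content: str
    embedding: Optional[Vector] = None


@dataclass
class TurnHandler:
    embed: Callable[[str], Vector]                         # Embeddings Service
    knn: Callable[[str, Vector, int], Sequence[Message]]   # top-K prior messages
    generate: Callable[[str, Sequence[str], str], str]     # AI-DM Service
    emit: Callable[[Message], None]                        # SSE/WebSocket push
    store: List[Message] = field(default_factory=list)     # stands in for Postgres

    def handle(self, session_id: str, text: str, system: str, k: int = 5) -> Message:
        player = Message(session_id, "player", text)       # step 2: content only
        self.store.append(player)
        player.embedding = self.embed(text)                # step 3: add embedding
        context = self.knn(session_id, player.embedding, k)  # step 4: retrieval
        # Steps 5-6: prompt, context, and system text go to the AI-DM wrapper.
        reply = self.generate(text, [m.content for m in context], system)
        dm = Message(session_id, "dm", reply, self.embed(reply))  # step 7
        self.store.append(dm)
        self.emit(dm)
        return dm
```

Passing the collaborators in as callables keeps the sketch testable with plain lambdas; in a real deployment each would presumably be an HTTP client for the corresponding service.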
### B) Asset upload (character sheet/map)
1. Web GUI → `POST /api/assets/presign {bucket, key, contentType}` (JWT).
2. **Game API** validates ACLs → returns pre-signed PUT URL for **Garage**.
3. Browser uploads directly to **Garage**.
4. **Game API** records/updates asset metadata row (owner, key, checksum, type).
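
Step 2 of this flow ("validates ACLs") is where most of the risk sits, so here is a hedged sketch of what that check might look like. The rule that a player may only write under a key prefix equal to their own ID is an assumption, as are the names `presign_upload` and `presign_put`; the bucket set lists only the two buckets this document names. Actual URL signing is delegated to the injected `presign_put` callable.

```python
from typing import Callable

# Only the bucket names mentioned in this document; a real list may be longer.
ALLOWED_BUCKETS = {"player-assets", "exports"}


def presign_upload(
    player_id: str,
    bucket: str,
    key: str,
    content_type: str,
    presign_put: Callable[[str, str, str], str],
) -> str:
    """Validate the request, then delegate URL signing to presign_put."""
    if bucket not in ALLOWED_BUCKETS:
        raise PermissionError(f"unknown bucket: {bucket}")
    # Assumed ACL rule: a player may only write under their own key prefix.
    if not key.startswith(f"{player_id}/") or ".." in key.split("/"):
        raise PermissionError("key is outside the caller's prefix")
    return presign_put(bucket, key, content_type)
```

Because Garage is S3-compatible, `presign_put` could plausibly wrap an S3 client call such as boto3's `generate_presigned_url("put_object", ...)`, though which client the project uses is not stated here.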
### C) Authentication
1. Web GUI → **Auth** `POST /auth/register` / `POST /auth/login`.
2. **Auth** returns `{access, refresh}` JWTs.
3. Subsequent API calls include access token (Caddy passes through).
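
To make the `{access, refresh}` pair concrete, the sketch below issues and verifies HS256-signed JWTs using only the standard library. It is meant to show the mechanics, not to be the Auth service's implementation: the function names, the claim set (`sub`, `exp`), and the shared-secret scheme are assumptions, and a maintained JWT library would normally be preferred in production.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that the JWT encoding strips.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(player_id: str, secret: bytes, ttl: int, now: Optional[float] = None) -> str:
    """Return a signed token for player_id that expires ttl seconds from now."""
    now = time.time() if now is None else now
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps({"sub": player_id, "exp": int(now) + ttl}).encode())
    signing_input = f"{header}.{payload}".encode("ascii")
    sig = hmac.new(secret, signing_input, hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url(sig)}"


def verify_token(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    """Return the claims if the signature is valid and the token is unexpired."""
    now = time.time() if now is None else now
    try:
        header, payload, sig = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    signing_input = f"{header}.{payload}".encode("ascii")
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking signature bytes via timing.
    if not hmac.compare_digest(expected, _b64url_decode(sig)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload))
    if claims["exp"] <= now:
        raise ValueError("token expired")
    return claims
```

A login handler could then return something like `{"access": issue_token(pid, secret, 900), "refresh": issue_token(pid, secret, 86400)}`, where both lifetimes are illustrative values rather than settings taken from this project.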
### D) Retrieval-augmented turn (refine/search only)
1. **Game API** (server-side) computes query embedding for player prompt.
2. KNN over `messages.embedding` returns top-K context.
3. Context trimmed/serialized and sent to **AI-DM Service**.
4. Reply streamed back to client; transcripts persisted.
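
The KNN step can be illustrated without a database. The sketch below ranks rows by cosine distance in pure Python, which is the same quantity pgvector's `<=>` operator computes when a query orders by `embedding <=> query_vector`. The function names and the `(content, embedding)` row shape are assumptions for illustration; in the real system the ranking would happen inside Postgres using the vector index.

```python
import math
from typing import List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity of a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / norm


def top_k(
    query: Sequence[float],
    rows: Sequence[Tuple[str, Sequence[float]]],
    k: int,
) -> List[str]:
    """Return the contents of the k rows closest to query."""
    # Exact scan over all rows; an HNSW or IVFFlat index approximates this.
    ranked = sorted(rows, key=lambda row: cosine_distance(query, row[1]))
    return [content for content, _ in ranked[:k]]
```

For example, with rows `[("a", [1, 0]), ("b", [0, 1]), ("c", [1, 1])]` and query `[1, 0.1]`, `top_k(..., k=2)` returns `["a", "c"]`.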
### E) Long/slow generations (async job queue)
1. **Game API** enqueues job (Redis/Dramatiq) to **AI-DM**.
2. Returns `{job_id}`; Web GUI subscribes via SSE.
3. Worker completes → **Game API** writes DM message and emits event.
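
The enqueue, return `job_id`, then notify pattern above can be sketched in-process. Here the standard library's `queue.Queue` and a daemon thread stand in for the Redis/Dramatiq broker and worker, and the `results` dict stands in for the persisted DM message; `JobRunner` and its method names are hypothetical, and error handling for failed generations is omitted.

```python
import queue
import threading
import uuid
from typing import Callable, Dict


class JobRunner:
    """In-process stand-in for a Redis-backed queue plus one worker."""

    def __init__(self, generate: Callable[[str], str], emit: Callable[[str, str], None]):
        self._generate = generate  # slow AI-DM call
        self._emit = emit          # pushes (job_id, reply) to subscribers
        self._jobs: "queue.Queue[tuple]" = queue.Queue()
        self.results: Dict[str, str] = {}
        threading.Thread(target=self._work, daemon=True).start()

    def enqueue(self, prompt: str) -> str:
        """Queue a generation and return its job_id immediately."""
        job_id = uuid.uuid4().hex
        self._jobs.put((job_id, prompt))
        return job_id

    def _work(self) -> None:
        while True:
            job_id, prompt = self._jobs.get()
            try:
                reply = self._generate(prompt)
                self.results[job_id] = reply   # persist the DM message
                self._emit(job_id, reply)      # notify SSE subscribers
            finally:
                self._jobs.task_done()

    def wait(self) -> None:
        """Block until every queued job has been processed."""
        self._jobs.join()
```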

This keeps each service small and focused, leans on Flask everywhere, uses **Caddy + Cloudflare** at the edge, **Postgres + pgvector** for state and search, and **Garage** for durable assets, with clean seams to swap pieces as you scale.