# High-level view

```
[ Browser / Web GUI ]
        │  HTTPS (HTTP/3) via Cloudflare (DNS/WAF/CDN)
        ▼
[ Caddy API Gateway ] - routing, TLS, real client IP, SSE/WebSocket pass-through
        │
        ├── /auth/*  → [ Auth Service (Appwrite) ]
        ├── /api/*   → [ Game API (Flask) ]
        │                 ├── calls → [ AI-DM Service (Flask) → Replicate ]
        │                 └── calls → [ Embeddings Service (Flask) ]
        │                                └── KNN over pgvector
        │
        ├── presign → direct upload/download ↔ [ Appwrite ]
        │
        └── infra cache / rate limits ↔ [ Redis ]

┌─────────────────────────────────────────────┐
│                                             │
│  [ Postgres 16 + pgvector ]                 │
│  (auth, game OLTP + semantic vectors)       │
└─────────────────────────────────────────────┘
```

## Front end / Back end auth

---

```
[Frontend] ── login(email,pass) ─▶ [API Gateway] ─▶ [Auth Service]
    │                                                     │
    │ <── Set-Cookie: access_token=JWT ───────────────────┘
    │
    ├─▶ call /game/start (cookie auto-attached)
    │
    └─▶ logout → clear cookie
```

---

## Services & responsibilities

* **Web GUI**

  * Player UX for auth, character management, sessions, and chat.
  * Uses REST for CRUD; SSE/WebSocket for live DM replies and typing indicators.

* **Caddy API Gateway**

  * Edge routing for `/auth`, `/api`, `/ai`, `/vec`.
  * TLS termination behind Cloudflare; preserves the real client IP; gzip/Brotli compression.
  * Pass-through for SSE/WebSocket; access logging.

* **Auth Service (Flask)**

  * Registration, login, refresh; JWT issuance/validation.
  * Owns player identity and credentials.
  * Simple rate limits via Redis.

* **Game API (Flask)**

  * Core game domain (characters, sessions, inventory, rules orchestration).
  * Persists messages; orchestrates retrieval and AI calls.
  * Streams DM replies to clients (SSE/WebSocket).
  * Generates pre-signed URLs for Garage uploads/downloads.

* **AI-DM Service (Flask)**

  * Thin, deterministic wrapper around **Replicate** models (prompt shaping, retries, timeouts).
  * Optional async path via a job queue if responses are slow.

* **Embeddings Service (Flask)**

  * Text → vector embedding (chosen model) and vector writes.
  * KNN search API (top-K over `pgvector`) for context retrieval.
  * Manages embedding version/dimension; supports re-embed workflows.

* **Postgres 16 + pgvector**

  * Single source of truth for the auth and game schemas.
  * Stores messages with a `vector` column; an IVFFlat or HNSW index serves similarity search.

* **Garage (S3-compatible)**

  * Object storage for player assets (character sheets, images, exports).
  * Access via pre-signed URLs (private buckets by default).

* **Redis**

  * Caches hot reads (recent messages, session state).
  * Holds rate-limit tokens; optionally acts as the Dramatiq broker for long-running jobs.
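
The "retries, timeouts" responsibility of the AI-DM wrapper can be sketched roughly as below. This is a minimal illustration, not the service's actual code: `call_with_retries`, its parameters, and the choice of `TimeoutError`/`ConnectionError` as retryable errors are all assumptions, and the real Replicate call is represented by an injected callable.

```python
import time
from typing import Callable, Optional, Tuple, Type


def call_with_retries(
    fn: Callable[[float], str],
    *,
    attempts: int = 3,
    timeout_s: float = 30.0,
    base_delay_s: float = 0.5,
    retry_on: Tuple[Type[BaseException], ...] = (TimeoutError, ConnectionError),
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fn(timeout_s), retrying transient failures with exponential backoff.

    `fn` is a hypothetical stand-in for the model call; it receives the
    per-attempt timeout and returns the generated text.
    """
    last_error: Optional[BaseException] = None
    for attempt in range(attempts):
        try:
            return fn(timeout_s)
        except retry_on as exc:
            last_error = exc
            # Back off before the next attempt: 0.5s, 1s, 2s, ...
            if attempt < attempts - 1:
                sleep(base_delay_s * (2 ** attempt))
    raise RuntimeError(f"model call failed after {attempts} attempts") from last_error
```

Injecting `sleep` keeps the helper easy to test without real delays; errors outside `retry_on` propagate immediately rather than being retried.
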

---

## Data boundaries

* **Auth schema (Postgres)**

  * `players(id, email, password_hash, created_at, …)`
  * Only the **Auth** service reads and writes this schema; other services obtain identity via Auth or JWT claims.

* **Game schema (Postgres)**

  * `characters(id, player_id, name, clazz, level, sheet_json, …)`
  * `sessions(id, player_id, title, created_at, …)`
  * `messages(id, session_id, role, content, embedding vector(…)=NULL, created_at, …)`
  * Indices:

    * `messages(session_id, created_at)`
    * `messages USING hnsw|ivfflat (embedding vector_cosine_ops)`

* **Objects (Garage)**

  * Buckets: `player-assets`, `exports`, etc.
  * Keys include tenant/player and content hashes; metadata is stored in the DB.

* **Cache/queues (Redis)**

  * Keys for rate limits, short-lived session state, and optional job queues.
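
One way to build object keys that "include tenant/player and content hashes" is sketched below. The key layout, the 16-character hash prefix, and the `build_asset_key` helper are illustrative assumptions; the document does not specify the actual scheme.

```python
import hashlib


def build_asset_key(tenant: str, player_id: str, filename: str, content: bytes) -> str:
    """Return a key shaped like '<tenant>/<player_id>/<sha256-prefix>/<filename>'.

    The layout is an assumption for illustration, not the project's real scheme.
    """
    digest = hashlib.sha256(content).hexdigest()
    # Keep only the final path component so callers cannot smuggle in directories.
    safe_name = filename.replace("\\", "/").rsplit("/", 1)[-1]
    return f"{tenant}/{player_id}/{digest[:16]}/{safe_name}"
```

Because the hash is derived from the bytes, re-uploading identical content yields the same key, which makes deduplication and checksum verification straightforward.
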

---

## Core request flows

### A) Player message → DM reply (sync POC)

1. Web GUI → `POST /api/sessions/{id}/messages` (JWT).
2. **Game API** writes the player message (content only).
3. **Embeddings Service** returns the vector → **Game API** updates `messages.embedding`.
4. **Embeddings Service** (or direct SQL) performs KNN to fetch the top-K prior messages.
5. **Game API** calls **AI-DM Service** with `{prompt, context, system}`.
6. **AI-DM** calls **Replicate** and returns text.
7. **Game API** writes the DM message (+ embedding) and emits an SSE/WebSocket event to the client.
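
Steps 2 to 7 can be sketched as a single orchestration function. This is a simplified illustration under stated assumptions: the `Message` dataclass, the in-memory `store` list (standing in for Postgres), and the injected `embed`, `knn`, and `generate` callables (standing in for the Embeddings and AI-DM services) are all hypothetical, and event emission is left to the caller.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence

Vector = List[float]


@dataclass
class Message:
    session_id: str
    role: str  # "player" or "dm"
    content: str
    embedding: Vector = field(default_factory=list)


def handle_player_message(
    session_id: str,
    text: str,
    store: List[Message],  # stand-in for the messages table
    embed: Callable[[str], Vector],  # stand-in for the Embeddings Service
    knn: Callable[[Vector, Sequence[Message]], List[Message]],
    generate: Callable[[dict], str],  # stand-in for the AI-DM Service
    system_prompt: str = "You are the Dungeon Master.",
) -> Message:
    # Prior messages in this session that already have an embedding.
    prior = [m for m in store if m.session_id == session_id and m.embedding]
    # Step 2: persist the player message with content only.
    player_msg = Message(session_id, "player", text)
    store.append(player_msg)
    # Step 3: fill in the embedding once the vector is available.
    player_msg.embedding = embed(text)
    # Step 4: retrieve the nearest prior messages as context.
    context = knn(player_msg.embedding, prior)
    # Steps 5-6: call the AI-DM service with {prompt, context, system}.
    reply = generate(
        {"prompt": text, "context": [m.content for m in context], "system": system_prompt}
    )
    # Step 7: persist the DM message with its embedding; the caller emits the event.
    dm_msg = Message(session_id, "dm", reply, embed(reply))
    store.append(dm_msg)
    return dm_msg
```

Passing the collaborators in as callables mirrors the service boundaries: each one can be swapped for an HTTP client or a test stub without touching the flow itself.
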
### B) Asset upload (character sheet/map)

1. Web GUI → `POST /api/assets/presign {bucket, key, contentType}` (JWT).
2. **Game API** validates ACLs → returns a pre-signed PUT URL for **Garage**.
3. Browser uploads directly to **Garage**.
4. **Game API** records/updates the asset metadata row (owner, key, checksum, type).
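
The ACL check in step 2 might look something like the sketch below. The allowed content types, the rule that keys must start with the caller's player id, and the `presign_put` callable are assumptions made for illustration. With an S3-compatible client such as boto3, `presign_put` could wrap `generate_presigned_url("put_object", ...)`; here it is injected so the validation logic stands alone.

```python
from typing import Callable

ALLOWED_BUCKETS = {"player-assets", "exports"}  # bucket names from "Data boundaries"
ALLOWED_CONTENT_TYPES = {"application/pdf", "image/png", "image/jpeg"}  # assumption


class PresignDenied(Exception):
    """Raised when a presign request fails validation."""


def presign_upload(
    player_id: str,
    bucket: str,
    key: str,
    content_type: str,
    presign_put: Callable[[str, str, str, int], str],  # hypothetical signer
    expires_s: int = 300,
) -> str:
    if bucket not in ALLOWED_BUCKETS:
        raise PresignDenied(f"bucket not allowed: {bucket}")
    if content_type not in ALLOWED_CONTENT_TYPES:
        raise PresignDenied(f"content type not allowed: {content_type}")
    parts = key.split("/")
    # Assumed rule: a player may only write under their own prefix,
    # with no empty or parent-directory path segments.
    if not key.startswith(f"{player_id}/") or ".." in parts or "" in parts:
        raise PresignDenied("key is outside the player's prefix")
    return presign_put(bucket, key, content_type, expires_s)
```

Validating before signing matters because the browser talks to Garage directly afterwards; once the URL is issued, the Game API is no longer in the request path.
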
### C) Authentication

1. Web GUI → **Auth** `POST /auth/register` / `POST /auth/login`.
2. **Auth** returns `{access, refresh}` JWTs.
3. Subsequent API calls include the access token (Caddy passes it through unchanged).
### D) Retrieval-augmented turn (refine/search only)

1. **Game API** (server-side) computes the query embedding for the player prompt.
2. KNN over `messages.embedding` returns the top-K context.
3. Context is trimmed/serialized and sent to **AI-DM Service**.
4. Reply is streamed back to the client; transcripts are persisted.
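
Steps 2 and 3 are sketched below in plain Python to show the ranking and trimming logic. In the database the ranking would presumably be an `ORDER BY embedding <=> query LIMIT k` query, since `<=>` is pgvector's cosine-distance operator and matches the `vector_cosine_ops` index; the in-memory `rows`, the character budget, and the function names here are assumptions for illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; zero-length vectors are treated as maximally distant."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


def top_k(query: Sequence[float],
          rows: Sequence[Tuple[str, Sequence[float]]],
          k: int = 3) -> List[str]:
    """Return the contents of the k rows nearest to the query, closest first."""
    ranked = sorted(rows, key=lambda row: cosine_distance(query, row[1]))
    return [content for content, _ in ranked[:k]]


def trim_context(snippets: Sequence[str], max_chars: int = 2000) -> str:
    """Join snippets in rank order, stopping before the budget is exceeded."""
    kept: List[str] = []
    used = 0
    for snippet in snippets:
        cost = len(snippet) + 1  # +1 for the joining newline
        if used + cost > max_chars:
            break
        kept.append(snippet)
        used += cost
    return "\n".join(kept)
```

Trimming in rank order means the least relevant snippets are dropped first when the prompt budget is tight. A token-based budget would be more precise than characters but depends on the chosen model's tokenizer.
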
### E) Long/slow generations (async job queue)

1. **Game API** enqueues a job (Redis/Dramatiq) for **AI-DM**.
2. Returns `{job_id}`; Web GUI subscribes via SSE.
3. Worker completes → **Game API** writes the DM message and emits an event.
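
The enqueue, subscribe, and emit cycle can be sketched with an in-process queue and a worker thread. This stands in for Redis/Dramatiq purely for illustration: `JobRunner`, the `dm_message` and `job_failed` event names, and the payload shape are assumptions, and only the `event:`/`data:` frame layout follows the Server-Sent Events format.

```python
import json
import queue
import threading
import uuid
from typing import Callable, Dict


def format_sse(event: str, data: dict) -> str:
    """Serialize one Server-Sent Events frame (terminated by a blank line)."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


class JobRunner:
    """In-process stand-in for a Redis/Dramatiq queue (illustrative only)."""

    def __init__(self, generate: Callable[[str], str]) -> None:
        self._generate = generate
        self._jobs: "queue.Queue" = queue.Queue()
        # One event queue per job; an SSE handler would read frames from it.
        self.events: Dict[str, "queue.Queue"] = {}
        threading.Thread(target=self._work, daemon=True).start()

    def enqueue(self, prompt: str) -> str:
        """Step 1-2: register the job and hand back its id immediately."""
        job_id = uuid.uuid4().hex
        self.events[job_id] = queue.Queue()
        self._jobs.put((job_id, prompt))
        return job_id

    def _work(self) -> None:
        """Step 3: run jobs one at a time and publish a frame per result."""
        while True:
            job_id, prompt = self._jobs.get()
            try:
                content = self._generate(prompt)
                frame = format_sse("dm_message", {"job_id": job_id, "content": content})
            except Exception as exc:
                frame = format_sse("job_failed", {"job_id": job_id, "error": str(exc)})
            self.events[job_id].put(frame)
```

A subscriber under these assumptions would call `runner.events[job_id].get(timeout=...)` and write the returned frame to the open SSE response. With a real broker the worker runs in a separate process, so the event hand-off would go through Redis rather than a shared dictionary.
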

This design keeps each service small and focused, leans on Flask everywhere, uses **Caddy + Cloudflare** at the edge, **Postgres + pgvector** for state and search, and **Garage** for durable assets, with clean seams to swap pieces as you scale.