ptarrant/Code_of_Conquest

Phillip Tarrant 8315fa51c9 first commit

2025-11-24 23:10:55 -06:00

Deployment & Operations

Local Development Setup

Prerequisites

Tool	Version	Purpose
Python	3.11+	Backend runtime
Docker	Latest	Local services
Redis	7.0+	Job queue & caching
Git	Latest	Version control

Setup Steps

# 1. Clone repository
git clone <repo-url>
cd code_of_conquest

# 2. Create virtual environment
python3 -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Configure environment
cp .env.example .env
# Edit .env with your API keys and settings

# 5. Start local services
docker-compose up -d

# 6. Start RQ workers (skip if the rq-worker service from docker-compose is already running)
rq worker ai_tasks combat_tasks marketplace_tasks &

# 7. Run Flask development server
flask run --debug

Environment Variables

Variable	Description	Required
`FLASK_ENV`	development/production (deprecated in Flask 2.2 and ignored by Flask 2.3+, so the app must read it itself)	Yes
`SECRET_KEY`	Flask secret key	Yes
`REPLICATE_API_KEY`	Replicate API key	Yes
`ANTHROPIC_API_KEY`	Anthropic API key	Yes
`APPWRITE_ENDPOINT`	Appwrite server URL	Yes
`APPWRITE_PROJECT_ID`	Appwrite project ID	Yes
`APPWRITE_API_KEY`	Appwrite API key	Yes
`REDIS_URL`	Redis connection URL	Yes
`LOG_LEVEL`	Logging level (DEBUG/INFO/WARNING/ERROR)	No
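
A startup check can catch a missing variable before the first request fails. The following is a minimal sketch using only the standard library; the names `REQUIRED_VARS` and `missing_env_vars` are illustrative and are not taken from the project code.

```python
import os
from typing import List, Mapping

# Variables the table above marks as required.
REQUIRED_VARS = (
    "FLASK_ENV",
    "SECRET_KEY",
    "REPLICATE_API_KEY",
    "ANTHROPIC_API_KEY",
    "APPWRITE_ENDPOINT",
    "APPWRITE_PROJECT_ID",
    "APPWRITE_API_KEY",
    "REDIS_URL",
)


def missing_env_vars(environ: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or blank, in table order."""
    return [name for name in REQUIRED_VARS if not environ.get(name, "").strip()]
```

Calling `missing_env_vars()` during application startup and aborting when the list is non-empty keeps configuration errors out of request handling.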

Docker Compose (Local Development)

docker-compose.yml:

version: '3.8'
services:
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis_data:/data

  rq-worker:
    build: .
    command: rq worker ai_tasks combat_tasks marketplace_tasks --url redis://redis:6379
    depends_on:
      - redis
    env_file:
      - .env
    environment:
      - REDIS_URL=redis://redis:6379

volumes:
  redis_data:

Testing Strategy

Manual Testing (Preferred)

API Testing Document: docs/API_TESTING.md

Contains:

Endpoint examples
Sample curl/httpie commands
Expected responses
Authentication setup

Example API Test:

# Login
curl -X POST http://localhost:5000/api/v1/auth/login \
  -H "Content-Type: application/json" \
  -d '{"email": "test@example.com", "password": "password123"}'

# Create character (with auth token)
curl -X POST http://localhost:5000/api/v1/characters \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <token>" \
  -d '{"name": "Aragorn", "class_id": "vanguard"}'
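
The same two calls can be scripted in Python for repeatable manual checks. This sketch only builds the requests with the standard library and does not send them; the helper name `json_post` is hypothetical, and the shape of the login response (where the token lives) is not assumed here.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:5000/api/v1"


def json_post(path: str, payload: dict, token: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the local API."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Same payloads as the curl examples above.
login = json_post("/auth/login", {"email": "test@example.com", "password": "password123"})
create = json_post("/characters", {"name": "Aragorn", "class_id": "vanguard"}, token="<token>")

# To send a request against a running dev server:
#     with urllib.request.urlopen(login) as resp:
#         print(resp.status, json.load(resp))
```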

Unit Tests (Optional)

Framework: pytest

Test Categories:

Category	Location	Focus
Combat	`tests/test_combat.py`	Damage calculations, effect processing
Skills	`tests/test_skills.py`	Skill unlock logic, prerequisites
Marketplace	`tests/test_marketplace.py`	Bidding logic, auction processing
Character	`tests/test_character.py`	Character creation, stats

Run Tests:

# All tests
pytest

# Specific test file
pytest tests/test_combat.py

# With coverage
pytest --cov=app tests/

Load Testing

Tool: Locust or Apache Bench

Test Scenarios:

Scenario	Target	Success Criteria
Concurrent AI requests	50 concurrent users	< 5s response time
Marketplace browsing	100 concurrent users	< 1s response time
Session realtime updates	10 players per session	< 100ms update latency
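
Whatever tool produces the latency samples, the pass/fail decision can be automated. The sketch below assumes the success criteria apply to the 95th percentile, which the table does not state; the scenario keys and function names are illustrative.

```python
import statistics
from typing import List

# Thresholds from the table above, in seconds.
CRITERIA = {
    "concurrent_ai_requests": 5.0,
    "marketplace_browsing": 1.0,
    "session_realtime_updates": 0.1,
}


def p95(samples: List[float]) -> float:
    """95th percentile of the samples (requires at least two samples)."""
    return statistics.quantiles(samples, n=100, method="inclusive")[94]


def meets_criteria(scenario: str, samples: List[float]) -> bool:
    """True when the scenario's p95 latency is under its threshold."""
    return p95(samples) < CRITERIA[scenario]
```

Using a percentile instead of the mean keeps a handful of slow outliers from hiding behind many fast responses.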

Production Deployment

Deployment Checklist

Pre-Deployment:

All environment variables configured
Appwrite collections created with proper permissions
Redis configured and accessible
RQ workers running
SSL certificates installed
Rate limiting configured
Error logging/monitoring set up (Sentry recommended)
Backup strategy for Appwrite data

Production Configuration:

DEBUG = False in Flask
Secure session keys (random, long)
CORS restricted to production domain
Rate limits appropriate for production
AI cost alerts configured
CDN for static assets (optional)

Dockerfile

FROM python:3.11-slim

WORKDIR /app

# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application
COPY . .

# Create non-root user
RUN useradd -m appuser && chown -R appuser:appuser /app
USER appuser

# Expose port
EXPOSE 5000

# Run application
CMD ["gunicorn", "--bind", "0.0.0.0:5000", "--workers", "4", "wsgi:app"]

Build & Push Script

scripts/build_and_push.sh:

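#!/bin/bash
# Build the image for the current branch, optionally tag it :latest and push it.
set -euo pipefail

# Get current git branch; Docker tags cannot contain "/", so map it to "-"
BRANCH=$(git rev-parse --abbrev-ref HEAD)
TAG=${BRANCH//\//-}

# Ask for tag options
read -r -p "Tag as :latest? (y/n) " TAG_LATEST
read -r -p "Push to registry? (y/n) " PUSH_IMAGE

# Build image
docker build -t "ai-dungeon-master:${TAG}" .

if [ "$TAG_LATEST" = "y" ]; then
    docker tag "ai-dungeon-master:${TAG}" ai-dungeon-master:latest
fi

# Note: pushing requires an image name qualified with your registry/namespace
if [ "$PUSH_IMAGE" = "y" ]; then
    docker push "ai-dungeon-master:${TAG}"
    if [ "$TAG_LATEST" = "y" ]; then
        docker push ai-dungeon-master:latest
    fi
fi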

Production Environment

Recommended Stack:

Web Server: Nginx (reverse proxy)
WSGI Server: Gunicorn (4+ workers)
Process Manager: Supervisor or systemd
Redis: Standalone or Redis Cluster
RQ Workers: Separate instances for each queue

Scaling Strategy:

Component	Scaling Method	Trigger
Flask API	Horizontal (add workers)	CPU > 70%
RQ Workers	Horizontal (add workers)	Queue length > 100
Redis	Vertical (upgrade instance)	Memory > 80%
Appwrite	Managed by Appwrite	N/A

Monitoring & Logging

Application Logging

Logging Configuration:

Level	Use Case	Examples
DEBUG	Development only	Variable values, function calls
INFO	Normal operations	User actions, API calls
WARNING	Potential issues	Rate limit approaching, slow queries
ERROR	Errors (recoverable)	Failed AI calls, validation errors
CRITICAL	Critical failures	Database connection lost, service down
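
The `LOG_LEVEL` environment variable has to be mapped onto one of these levels somewhere. A minimal sketch with the standard `logging` module follows; falling back to INFO when the variable is unset or unrecognized is an assumption, since the default is not documented here, and the function names are illustrative.

```python
import logging
import os
from typing import Mapping, Optional

_LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def resolve_log_level(name: Optional[str], default: int = logging.INFO) -> int:
    """Map a LOG_LEVEL string to a logging constant, falling back to the default."""
    if not name:
        return default
    return _LEVELS.get(name.strip().upper(), default)


def configure_logging(environ: Mapping[str, str] = os.environ) -> int:
    """Configure the root logger from LOG_LEVEL and return the level used."""
    level = resolve_log_level(environ.get("LOG_LEVEL"))
    logging.basicConfig(level=level, format="%(asctime)s %(levelname)s %(name)s %(message)s")
    return level
```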

Structured Logging with Structlog:

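import structlog

logger = structlog.get_logger(__name__)

# Example values; in the app these come from the current request
session_id = "sess_123"
character_id = "char_456"

# Key-value pairs become structured fields on the log event
logger.info(
    "Combat action executed",
    session_id=session_id,
    character_id=character_id,
    action_type="attack",
    damage=15,
)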

Monitoring Tools

Recommended Tools:

Tool	Purpose	Priority
Sentry	Error tracking and alerting	High
Prometheus	Metrics collection	Medium
Grafana	Metrics visualization	Medium
Uptime Robot	Uptime monitoring	High
CloudWatch	AWS logs/metrics (if using AWS)	Medium

Key Metrics to Monitor

Metric	Alert Threshold	Action
API response time	> 3s average	Scale workers
Error rate	> 5%	Investigate logs
AI API errors	> 10%	Check API status
Queue length	> 500	Add workers
Redis memory	> 80%	Upgrade instance
CPU usage	> 80%	Scale horizontally
AI cost per day	> budget × 1.2	Investigate usage
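
The thresholds above translate directly into a rule table. The sketch below is one possible encoding, assuming percentages are reported as fractions between 0 and 1; the `Metrics` fields and `triggered_actions` are hypothetical names, not part of the project.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Metrics:
    avg_response_s: float   # average API response time in seconds
    error_rate: float       # fraction of requests that failed
    ai_error_rate: float    # fraction of AI API calls that failed
    queue_length: int       # pending RQ jobs
    redis_memory: float     # fraction of available Redis memory in use
    cpu: float              # fraction of CPU in use
    ai_cost_today: float
    ai_daily_budget: float


def triggered_actions(m: Metrics) -> List[str]:
    """Return the actions whose alert thresholds are exceeded, in table order."""
    checks = [
        (m.avg_response_s > 3.0, "Scale workers"),
        (m.error_rate > 0.05, "Investigate logs"),
        (m.ai_error_rate > 0.10, "Check API status"),
        (m.queue_length > 500, "Add workers"),
        (m.redis_memory > 0.80, "Upgrade instance"),
        (m.cpu > 0.80, "Scale horizontally"),
        (m.ai_cost_today > m.ai_daily_budget * 1.2, "Investigate usage"),
    ]
    return [action for fired, action in checks if fired]
```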

AI Cost Tracking

Log Structure:

Field	Type	Purpose
`user_id`	str	Track per-user usage
`model`	str	Which model used
`tier`	str	FREE/STANDARD/PREMIUM
`tokens_used`	int	Token count
`cost_estimate`	float	Estimated cost
`timestamp`	datetime	When called
`context_type`	str	What prompted the call

Daily Report:

Total AI calls per tier
Total tokens used
Estimated cost
Top users by usage
Anomaly detection (unusual spikes)
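
The log structure and the first four report items can be sketched as follows. This is an illustration built from the field table above, not the project's implementation; the class and function names are assumptions, ranking "top users" by token count is one possible reading, and anomaly detection is left out.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class AICallLog:
    user_id: str
    model: str
    tier: str
    tokens_used: int
    cost_estimate: float
    timestamp: datetime
    context_type: str


def daily_report(logs: List[AICallLog], top_n: int = 3) -> dict:
    """Aggregate one day's AI call logs into the report fields listed above."""
    tokens_per_user: Counter = Counter()
    for log in logs:
        tokens_per_user[log.user_id] += log.tokens_used
    return {
        "calls_per_tier": dict(Counter(log.tier for log in logs)),
        "total_tokens": sum(log.tokens_used for log in logs),
        "estimated_cost": round(sum(log.cost_estimate for log in logs), 4),
        "top_users": tokens_per_user.most_common(top_n),
    }
```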

Security

Authentication & Authorization

Implementation:

Layer	Method	Details
User Auth	Appwrite Auth	Email/password, OAuth providers
API Auth	JWT tokens	Bearer token in Authorization header
Session Validation	Every API call	Verify token, check expiry
Resource Access	User ID check	Users can only access their own data

Input Validation

Validation Strategy:

Input Type	Validation	Tools
JSON payloads	Schema validation	Marshmallow or Pydantic
Character names	Sanitize, length limits	Bleach library
Chat messages	Sanitize, profanity filter	Custom validators
AI prompts	Template-based only	Jinja2 (no direct user input)

Example Validation:

Field	Rules
Character name	3-20 chars, alphanumeric + spaces only
Gold amount	Positive integer, max 999,999,999
Action text	Max 500 chars, sanitized HTML
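
These rules are small enough to express as plain functions. The sketch below uses only the standard library: it reads "alphanumeric" as ASCII letters and digits, and it uses `html.escape` as a stand-in for a real sanitizer such as Bleach. Both choices, and the function names, are assumptions.

```python
import html
import re

_NAME_RE = re.compile(r"[A-Za-z0-9 ]{3,20}")
MAX_GOLD = 999_999_999
MAX_ACTION_CHARS = 500


def valid_character_name(name: str) -> bool:
    """3-20 characters, ASCII letters, digits and spaces only."""
    return _NAME_RE.fullmatch(name) is not None


def valid_gold_amount(amount: object) -> bool:
    """Positive integer up to MAX_GOLD; bool is rejected although it subclasses int."""
    return isinstance(amount, int) and not isinstance(amount, bool) and 0 < amount <= MAX_GOLD


def clean_action_text(text: str) -> str:
    """Enforce the 500-character limit, then escape HTML metacharacters."""
    if len(text) > MAX_ACTION_CHARS:
        raise ValueError(f"Action text exceeds {MAX_ACTION_CHARS} characters")
    return html.escape(text)
```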

Rate Limiting

Implementation: Flask-Limiter with Redis backend

Limits by Tier:

Tier	API Calls/Min	AI Calls/Day	Marketplace Actions/Day
FREE	30	50	N/A
BASIC	60	200	N/A
PREMIUM	120	1000	50
ELITE	300	Unlimited	100

Rate Limit Bypass:

Admin accounts
Health check endpoints
Static assets
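
To make the per-minute limits concrete, here is an in-memory fixed-window counter. It is not Flask-Limiter and has no Redis backend; it only illustrates the counting logic, never evicts old windows, and the class name and `is_admin` flag are hypothetical.

```python
import time
from collections import defaultdict
from typing import Callable, Dict, Tuple

# API calls per minute, from the table above.
API_CALLS_PER_MIN = {"FREE": 30, "BASIC": 60, "PREMIUM": 120, "ELITE": 300}


class FixedWindowLimiter:
    """In-memory fixed-window counter: one counter per (user, minute)."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        self._counts: Dict[Tuple[str, int], int] = defaultdict(int)

    def allow(self, user_id: str, tier: str, is_admin: bool = False) -> bool:
        """Record one call and report whether it is within the tier's limit."""
        if is_admin:  # admin accounts bypass rate limits
            return True
        key = (user_id, int(self._clock() // 60))
        if self._counts[key] >= API_CALLS_PER_MIN[tier]:
            return False
        self._counts[key] += 1
        return True
```

A fixed window is the simplest strategy to reason about; it allows short bursts at window boundaries, which a sliding window would smooth out.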

API Security

Configuration:

Setting	Value	Reason
CORS	Production domain only	Prevent unauthorized access
HTTPS	Required	Encrypt data in transit
API Keys	Environment variables	Never in code
Appwrite Permissions	Least privilege	Collection-level security
SQL Injection	N/A	App issues no raw SQL; all data access goes through the Appwrite API
XSS	Sanitize all inputs	Prevent script injection
CSRF	CSRF tokens	For form submissions

Data Protection

Access Control Matrix:

Resource	Owner	Party Member	Public	System
Characters	RW	R	-	RW
Sessions	R	RW (turn)	-	RW
Marketplace Listings	RW (own)	-	R	RW
Transactions	R (own)	-	-	RW

RW = Read/Write, R = Read only, - = No access
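
The matrix can be encoded as a lookup table with deny-by-default semantics. In this sketch the role and resource keys are illustrative, and the "(turn)" and "(own)" qualifiers are not modeled; they would need extra checks against the session state and the resource owner.

```python
# Permissions from the matrix above: "r" = read, "w" = write, "" = no access.
ACCESS = {
    "characters": {"owner": "rw", "party_member": "r", "public": "", "system": "rw"},
    "sessions": {"owner": "r", "party_member": "rw", "public": "", "system": "rw"},
    "marketplace_listings": {"owner": "rw", "party_member": "", "public": "r", "system": "rw"},
    "transactions": {"owner": "r", "party_member": "", "public": "", "system": "rw"},
}


def is_allowed(resource: str, role: str, action: str) -> bool:
    """action is "r" or "w"; unknown resources, roles or actions are denied."""
    if action not in ("r", "w"):
        return False
    return action in ACCESS.get(resource, {}).get(role, "")
```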

Secrets Management

Never Commit:

API keys
Database credentials
Secret keys
Tokens

Best Practices:

Use .env for local development
Use environment variables in production
Use secrets manager (AWS Secrets Manager, HashiCorp Vault) in production
Rotate keys regularly
Different keys for dev/staging/prod

Backup & Recovery

Appwrite Data Backup

Strategy:

Data Type	Backup Frequency	Retention	Method
Characters	Daily	30 days	Appwrite export
Sessions (active)	Hourly	7 days	Appwrite export
Marketplace	Daily	30 days	Appwrite export
Transactions	Daily	90 days	Appwrite export

Backup Script:

Export collections to JSON
Compress and encrypt
Upload to S3 or object storage
Verify backup integrity
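
The compress and verify steps can be sketched with the standard library alone. The export from Appwrite, encryption, and the upload are deliberately left out because they depend on tooling this document does not specify; `write_backup` and `verify_backup` are hypothetical helpers, and the sample document is made up.

```python
import gzip
import hashlib
import json
import tempfile
from pathlib import Path
from typing import List


def write_backup(documents: List[dict], dest: Path) -> str:
    """Write documents as gzip-compressed JSON and return the file's SHA-256."""
    payload = json.dumps(documents, sort_keys=True).encode("utf-8")
    dest.write_bytes(gzip.compress(payload))
    return hashlib.sha256(dest.read_bytes()).hexdigest()


def verify_backup(path: Path, expected_sha256: str) -> List[dict]:
    """Check the checksum, then decompress and parse to prove the file is readable."""
    raw = path.read_bytes()
    if hashlib.sha256(raw).hexdigest() != expected_sha256:
        raise ValueError(f"Checksum mismatch for {path}")
    return json.loads(gzip.decompress(raw))


# Round trip in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "characters.json.gz"
    digest = write_backup([{"id": "c1", "name": "Aragorn"}], target)
    restored = verify_backup(target, digest)
```

Storing the checksum alongside the uploaded object lets a later restore confirm the file was not truncated or altered in storage.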

Disaster Recovery Plan

Scenario	RTO	RPO	Steps
Database corruption	4 hours	24 hours	Restore from latest backup
API server down	15 minutes	0	Restart/failover to standby
Redis failure	5 minutes	Session data loss	Restart, users re-login
Complete infrastructure loss	24 hours	24 hours	Restore from backups to new infrastructure

RTO = Recovery Time Objective, RPO = Recovery Point Objective

CI/CD Pipeline

Recommended Workflow

Stage	Actions	Tools
1. Commit	Developer pushes to `dev` branch	Git
2. Build	Run tests, lint code	GitHub Actions, pytest, flake8
3. Test	Unit tests, integration tests	pytest
4. Build Image	Create Docker image	Docker
5. Deploy to Staging	Deploy to staging environment	Docker, SSH
6. Manual Test	QA testing on staging	Manual
7. Merge to Beta	Promote to beta branch	Git
8. Deploy to Beta	Deploy to beta environment	Docker, SSH
9. Merge to Master	Production promotion	Git
10. Deploy to Prod	Deploy to production	Docker, SSH
11. Tag Release	Create version tag	Git
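
The branch-to-environment promotion in the table can be stated as a small mapping that a deploy script could consult. The names below are illustrative only.

```python
from typing import Optional

# Which environment each long-lived branch deploys to, per the table above.
DEPLOY_TARGETS = {"dev": "staging", "beta": "beta", "master": "production"}


def deploy_target(branch: str) -> Optional[str]:
    """Return the environment for a branch, or None if the branch is not deployed."""
    return DEPLOY_TARGETS.get(branch)
```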

GitHub Actions Example

name: CI/CD

on:
  push:
    branches: [ dev, beta, master ]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - name: Install dependencies
        run: pip install -r requirements.txt
      - name: Run tests
        run: pytest
      - name: Lint
        run: flake8 app/

  build:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build Docker image
        run: docker build -t ai-dungeon-master:${{ github.ref_name }} .
      # Pushing needs a prior registry login and a registry-qualified image name
      - name: Push to registry
        run: docker push ai-dungeon-master:${{ github.ref_name }}

Performance Optimization

Caching Strategy

Cache Type	What to Cache	TTL
Redis Cache	Session data	30 minutes
Redis Cache	Character data (read-heavy)	5 minutes
Redis Cache	Marketplace listings	1 minute
Redis Cache	NPC shop items	1 hour
Browser Cache	Static assets	1 year
Browser Cache	API responses (GET)	30 seconds
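
The Redis TTLs can be kept in one place so that every cache write uses the same expiry. The sketch below is an in-memory stand-in for set-with-expiry semantics, useful for illustrating and testing the policy without a Redis server; the key names and the `TTLCache` class are assumptions.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple

# Redis cache TTLs from the table above, in seconds.
TTL_SECONDS = {
    "session": 30 * 60,
    "character": 5 * 60,
    "marketplace_listings": 60,
    "npc_shop_items": 60 * 60,
}


class TTLCache:
    """Tiny in-memory cache whose entries expire after the TTL for their kind."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def set(self, kind: str, key: str, value: Any) -> None:
        """Store a value with the expiry configured for this kind of data."""
        self._store[f"{kind}:{key}"] = (self._clock() + TTL_SECONDS[kind], value)

    def get(self, kind: str, key: str) -> Optional[Any]:
        """Return the cached value, or None if it is missing or expired."""
        full_key = f"{kind}:{key}"
        entry = self._store.get(full_key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[full_key]
            return None
        return value
```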

Database Optimization

Appwrite Indexing:

Index userId on characters collection
Index status on game_sessions collection
Index listing_type + status on marketplace_listings
Index created_at for time-based queries

AI Call Optimization

Strategies:

Strategy	Impact	Implementation
Batch requests	Reduce API calls	Combine multiple actions
Cache common responses	Reduce cost	Cache item descriptions
Prompt optimization	Reduce tokens	Shorter, more efficient prompts
Model selection	Reduce cost	Use cheaper models when appropriate
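
As an illustration of the "cache common responses" row, the following sketch memoizes generated text by a hash of the prompt so that identical prompts trigger only one AI call. The `generate` callable stands in for whatever function actually calls the model; it and the class name are hypothetical.

```python
import hashlib
from typing import Callable, Dict


class DescriptionCache:
    """Memoize generated text by a SHA-256 hash of the prompt."""

    def __init__(self, generate: Callable[[str], str]) -> None:
        self._generate = generate
        self._cache: Dict[str, str] = {}
        self.misses = 0  # number of prompts that required a real AI call

    def get(self, prompt: str) -> str:
        """Return the cached response, calling the generator only on a miss."""
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._generate(prompt)
        return self._cache[key]
```

This only pays off for prompts that repeat exactly, such as item descriptions rendered from a fixed template.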

Troubleshooting

Common Issues

Issue	Symptoms	Solution
RQ workers not processing	Jobs stuck in queue	Check Redis connection, restart workers
AI calls failing	401/403 (auth) or 429 (rate limit) errors	Verify API keys, check rate limits
Appwrite connection errors	Database errors	Check Appwrite status, verify credentials
Session not updating	Stale data in UI	Check Appwrite Realtime connection
High latency	Slow API responses	Check RQ queue length, scale workers

Debug Mode

Enable Debug Logging:

export LOG_LEVEL=DEBUG
flask run --debug

Debug Endpoints (development only):

GET /debug/health - Health check
GET /debug/redis - Redis connection status
GET /debug/queues - RQ queue status

Resources

Resource	URL
Appwrite Docs	https://appwrite.io/docs
RQ Docs	https://python-rq.org/
Flask Docs	https://flask.palletsprojects.com/
Structlog Docs	https://www.structlog.org/
HTMX Docs	https://htmx.org/docs/
Anthropic API	https://docs.anthropic.com/
Replicate API	https://replicate.com/docs
