first commit

2025-11-24 23:10:55 -06:00
commit 8315fa51c9
279 changed files with 74600 additions and 0 deletions
--- a/api/docs/USAGE_TRACKING.md
+++ b/api/docs/USAGE_TRACKING.md
@@ -0,0 +1,614 @@
+# Usage Tracking & Cost Controls
+
+## Overview
+
+Code of Conquest tracks every AI operation and enforces cost controls on it. This keeps costs sustainable, keeps usage fair across tiers, and gives visibility into system usage patterns.
+
+**Key Components:**
+- **UsageTrackingService** - Logs all AI usage and calculates costs
+- **RateLimiterService** - Enforces tier-based daily limits
+- **AIUsageLog** - Data model for usage events
+
+---
+
+## Architecture
+
+```
+┌─────────────────────┐
+│   AI Task Jobs      │
+├─────────────────────┤
+│ UsageTrackingService│  ← Logs usage, calculates costs
+├─────────────────────┤
+│  RateLimiterService │  ← Enforces limits before processing
+├─────────────────────┤
+│   Redis + Appwrite  │  ← Storage layer
+└─────────────────────┘
+```
+
+---
+
+## Usage Tracking Service
+
+**File:** `app/services/usage_tracking_service.py`
+
+### Initialization
+
+```python
+from app.services.usage_tracking_service import UsageTrackingService
+
+tracker = UsageTrackingService()
+```
+
+**Required Environment Variables:**
+```bash
+APPWRITE_ENDPOINT=https://cloud.appwrite.io/v1
+APPWRITE_PROJECT_ID=your-project-id
+APPWRITE_API_KEY=your-api-key
+APPWRITE_DATABASE_ID=main
+```
+
+### Logging Usage
+
+```python
+from app.models.ai_usage import TaskType
+
+# Log a usage event
+usage_log = tracker.log_usage(
+    user_id="user_123",
+    model="anthropic/claude-3.5-sonnet",
+    tokens_input=150,
+    tokens_output=450,
+    task_type=TaskType.STORY_PROGRESSION,
+    session_id="sess_789",
+    character_id="char_456",
+    request_duration_ms=2500,
+    success=True
+)
+
+print(f"Log ID: {usage_log.log_id}")
+print(f"Cost: ${usage_log.estimated_cost:.6f}")
+```
+
+### Querying Usage
+
+**Daily Usage:**
+```python
+from datetime import date
+
+# Get today's usage
+usage = tracker.get_daily_usage("user_123", date.today())
+
+print(f"Requests: {usage.total_requests}")
+print(f"Tokens: {usage.total_tokens}")
+print(f"Input tokens: {usage.total_input_tokens}")
+print(f"Output tokens: {usage.total_output_tokens}")
+print(f"Cost: ${usage.estimated_cost:.4f}")
+print(f"By task: {usage.requests_by_task}")
+# {"story_progression": 10, "combat_narration": 3, ...}
+```
+
+**Monthly Cost:**
+```python
+# Get November 2025 cost
+monthly = tracker.get_monthly_cost("user_123", 2025, 11)
+
+print(f"Monthly requests: {monthly.total_requests}")
+print(f"Monthly tokens: {monthly.total_tokens}")
+print(f"Monthly cost: ${monthly.estimated_cost:.2f}")
+```
+
+**Admin Monitoring:**
+```python
+# Get total platform cost for a day
+total_cost = tracker.get_total_daily_cost(date.today())
+print(f"Platform daily cost: ${total_cost:.2f}")
+
+# Get user request count for rate limiting
+count = tracker.get_user_request_count_today("user_123")
+```
+
+### Cost Estimation
+
+**Static Methods (no instance needed):**
+```python
+from app.services.usage_tracking_service import UsageTrackingService
+
+# Estimate cost for specific request
+cost = UsageTrackingService.estimate_cost_for_model(
+    model="anthropic/claude-3.5-sonnet",
+    tokens_input=100,
+    tokens_output=400
+)
+print(f"Estimated: ${cost:.6f}")
+
+# Get model pricing
+info = UsageTrackingService.get_model_cost_info("anthropic/claude-3.5-sonnet")
+print(f"Input: ${info['input']}/1K tokens")
+print(f"Output: ${info['output']}/1K tokens")
+```
+
+---
+
+## Model Pricing
+
+Costs per 1,000 tokens (USD):
+
+| Model | Input | Output | Tier |
+|-------|-------|--------|------|
+| `meta/meta-llama-3-8b-instruct` | $0.0001 | $0.0001 | Free |
+| `meta/meta-llama-3-70b-instruct` | $0.0006 | $0.0006 | - |
+| `anthropic/claude-3.5-haiku` | $0.001 | $0.005 | Basic |
+| `anthropic/claude-3.5-sonnet` | $0.003 | $0.015 | Premium |
+| `anthropic/claude-4.5-sonnet` | $0.003 | $0.015 | Elite |
+| `anthropic/claude-3-opus` | $0.015 | $0.075 | - |
+
+**Default cost for unknown models:** $0.001 input, $0.005 output per 1K tokens
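+
+As a worked example of the table above, the sketch below applies the per-1K rates to a token count. The `PRICING` dictionary and `estimate_cost` function are illustrative stand-ins written for this document, not the actual `UsageTrackingService` implementation; only the rates and the unknown-model default are taken from the table.
+
+```python
+# Illustrative sketch: (input, output) rates in USD per 1,000 tokens, copied from the table.
+PRICING = {
+    "meta/meta-llama-3-8b-instruct": (0.0001, 0.0001),
+    "meta/meta-llama-3-70b-instruct": (0.0006, 0.0006),
+    "anthropic/claude-3.5-haiku": (0.001, 0.005),
+    "anthropic/claude-3.5-sonnet": (0.003, 0.015),
+    "anthropic/claude-4.5-sonnet": (0.003, 0.015),
+    "anthropic/claude-3-opus": (0.015, 0.075),
+}
+DEFAULT_RATES = (0.001, 0.005)  # documented fallback for unknown models
+
+
+def estimate_cost(model: str, tokens_input: int, tokens_output: int) -> float:
+    """Return the estimated USD cost of one request (hypothetical helper)."""
+    rate_in, rate_out = PRICING.get(model, DEFAULT_RATES)
+    return (tokens_input / 1000) * rate_in + (tokens_output / 1000) * rate_out
+
+
+# 150 input + 450 output tokens on claude-3.5-sonnet:
+# 0.150 * 0.003 + 0.450 * 0.015 = 0.00045 + 0.00675 = 0.0072 USD
+print(f"${estimate_cost('anthropic/claude-3.5-sonnet', 150, 450):.4f}")
+```
+
+Output tokens dominate the cost for the Anthropic models, since their output rate is five times the input rate.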
+
+---
+
+## Token Estimation
+
+Since the Replicate API doesn't return exact token counts, tokens are estimated based on text length.
+
+### Estimation Formula
+
+```python
+# Approximate 4 characters per token
+tokens = len(text) // 4
+```
+
+### How Tokens Are Calculated
+
+**Input Tokens:**
+- Calculated from the full prompt sent to the AI
+- Includes: user prompt + system prompt
+- Estimated at: `len(prompt + system_prompt) // 4`
+
+**Output Tokens:**
+- Calculated from the AI's response text
+- Estimated at: `len(response_text) // 4`
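+
+The rules above can be sketched as a small helper. The function names here are hypothetical and only mirror the documented `len(text) // 4` heuristic; the project's own logic lives in the Replicate client and may differ in detail.
+
+```python
+CHARS_PER_TOKEN = 4  # documented approximation
+
+
+def estimate_tokens(text: str) -> int:
+    """Estimate a token count from character length (integer division rounds down)."""
+    return len(text) // CHARS_PER_TOKEN
+
+
+def estimate_request_tokens(prompt: str, system_prompt: str, response_text: str) -> tuple[int, int, int]:
+    """Return (input, output, total) token estimates for one request."""
+    tokens_input = estimate_tokens(prompt + system_prompt)
+    tokens_output = estimate_tokens(response_text)
+    return tokens_input, tokens_output, tokens_input + tokens_output
+
+
+# 300-char prompt + 100-char system prompt, 800-char response
+print(estimate_request_tokens("p" * 300, "s" * 100, "r" * 800))
+```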
+
+### ReplicateResponse Structure
+
+The Replicate client returns both input and output token estimates:
+
+```python
+@dataclass
+class ReplicateResponse:
+    text: str
+    tokens_used: int      # Total (input + output)
+    tokens_input: int     # Estimated input tokens
+    tokens_output: int    # Estimated output tokens
+    model: str
+    generation_time: float
+```
+
+### Example Token Counts
+
+| Content | Characters | Estimated Tokens |
+|---------|------------|------------------|
+| Short prompt | 400 chars | ~100 tokens |
+| Full DM prompt | 4,000 chars | ~1,000 tokens |
+| Short response | 200 chars | ~50 tokens |
+| Full narrative | 800 chars | ~200 tokens |
+
+### Accuracy Notes
+
+- Estimation is approximate (~75-80% accurate)
+- Real tokenization varies by model
+- Prefer over-estimating for cost budgeting; note that `len(text) // 4` rounds down
+- Logs use estimates, so logged figures may differ from billing reconciliation
+
+---
+
+## Data Models
+
+**File:** `app/models/ai_usage.py`
+
+### AIUsageLog
+
+```python
+@dataclass
+class AIUsageLog:
+    log_id: str                    # Unique identifier
+    user_id: str                   # User who made request
+    timestamp: datetime            # When request was made
+    model: str                     # Model identifier
+    tokens_input: int              # Input/prompt tokens
+    tokens_output: int             # Output/response tokens
+    tokens_total: int              # Total tokens
+    estimated_cost: float          # Cost in USD
+    task_type: TaskType            # Type of task
+    session_id: Optional[str]      # Game session
+    character_id: Optional[str]    # Character
+    request_duration_ms: int       # Duration
+    success: bool                  # Success status
+    error_message: Optional[str]   # Error if failed
+```
+
+### TaskType Enum
+
+```python
+class TaskType(str, Enum):
+    STORY_PROGRESSION = "story_progression"
+    COMBAT_NARRATION = "combat_narration"
+    QUEST_SELECTION = "quest_selection"
+    NPC_DIALOGUE = "npc_dialogue"
+    GENERAL = "general"
+```
+
+### Summary Objects
+
+```python
+@dataclass
+class DailyUsageSummary:
+    date: date
+    user_id: str
+    total_requests: int
+    total_tokens: int
+    total_input_tokens: int
+    total_output_tokens: int
+    estimated_cost: float
+    requests_by_task: Dict[str, int]
+
+@dataclass
+class MonthlyUsageSummary:
+    year: int
+    month: int
+    user_id: str
+    total_requests: int
+    total_tokens: int
+    estimated_cost: float
+    daily_breakdown: list
+```
+
+---
+
+## Rate Limiter Service
+
+**File:** `app/services/rate_limiter_service.py`
+
+### Daily Turn Limits
+
+| Tier | Limit | Cost Level |
+|------|-------|------------|
+| FREE | 20 turns/day | Zero |
+| BASIC | 50 turns/day | Low |
+| PREMIUM | 100 turns/day | Medium |
+| ELITE | 200 turns/day | High |
+
+Counters reset at midnight UTC.
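+
+A minimal sketch of how the next reset time could be computed, assuming the reset is the first midnight UTC after the current moment. `next_reset_time` is a hypothetical helper for illustration; how `RateLimiterService` derives `reset_time` internally is not shown in this document.
+
+```python
+from datetime import datetime, timedelta, timezone
+
+
+def next_reset_time(now: datetime) -> datetime:
+    """Return the next midnight UTC after `now` (expects a timezone-aware datetime)."""
+    now_utc = now.astimezone(timezone.utc)
+    midnight_today = now_utc.replace(hour=0, minute=0, second=0, microsecond=0)
+    return midnight_today + timedelta(days=1)
+
+
+now = datetime(2025, 11, 21, 15, 30, tzinfo=timezone.utc)
+print(next_reset_time(now).isoformat())
+```
+
+Working in UTC throughout avoids counters resetting at different moments for users in different time zones.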
+
+### Custom Action Limits
+
+Free-text actions (beyond preset buttons) have additional limits per tier:
+
+| Tier | Custom Actions/Day | Character Limit |
+|------|-------------------|-----------------|
+| FREE | 10 | 150 chars |
+| BASIC | 50 | 300 chars |
+| PREMIUM | Unlimited | 500 chars |
+| ELITE | Unlimited | 500 chars |
+
+**Configuration:** These values are defined in `config/*.yaml` under `rate_limiting.tiers`:
+```yaml
+rate_limiting:
+  tiers:
+    free:
+      custom_actions_per_day: 10
+      custom_action_char_limit: 150
+```
+
+**Access in code:**
+```python
+from app.config import get_config
+
+config = get_config()
+tier_config = config.rate_limiting.tiers['free']
+print(tier_config.custom_actions_per_day)      # 10
+print(tier_config.custom_action_char_limit)    # 150
+```
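+
+To illustrate how these limits might be enforced, the sketch below checks a free-text action against the per-tier values from the table. `CustomActionLimits`, `LIMITS`, and `check_custom_action` are hypothetical names, and representing "Unlimited" as `None` is an assumption; the real service reads its values from the config shown above.
+
+```python
+from dataclasses import dataclass
+from typing import Optional
+
+
+@dataclass(frozen=True)
+class CustomActionLimits:
+    per_day: Optional[int]  # None means unlimited (assumed representation)
+    char_limit: int
+
+
+# Values copied from the Custom Action Limits table.
+LIMITS = {
+    "free": CustomActionLimits(10, 150),
+    "basic": CustomActionLimits(50, 300),
+    "premium": CustomActionLimits(None, 500),
+    "elite": CustomActionLimits(None, 500),
+}
+
+
+def check_custom_action(tier: str, text: str, used_today: int) -> Optional[str]:
+    """Return an error message if the action is not allowed, otherwise None."""
+    limits = LIMITS[tier]
+    if len(text) > limits.char_limit:
+        return f"Action exceeds {limits.char_limit} characters"
+    if limits.per_day is not None and used_today >= limits.per_day:
+        return f"Daily custom action limit reached ({limits.per_day})"
+    return None
+
+
+print(check_custom_action("free", "I search the room for traps", used_today=10))
+```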
+
+### Basic Usage
+
+```python
+from app.services.rate_limiter_service import RateLimiterService, RateLimitExceeded
+from app.ai.model_selector import UserTier
+
+limiter = RateLimiterService()
+
+# Check and increment (typical flow)
+try:
+    limiter.check_rate_limit("user_123", UserTier.PREMIUM)
+    # Process AI request...
+    limiter.increment_usage("user_123")
+except RateLimitExceeded as e:
+    print(f"Limit reached: {e.current_usage}/{e.limit}")
+    print(f"Resets at: {e.reset_time}")
+```
+
+### Query Methods
+
+```python
+# Get current usage
+current = limiter.get_current_usage("user_123")
+
+# Get remaining turns
+remaining = limiter.get_remaining_turns("user_123", UserTier.PREMIUM)
+print(f"Remaining: {remaining} turns")
+
+# Get comprehensive info
+info = limiter.get_usage_info("user_123", UserTier.PREMIUM)
+# {
+#     "user_id": "user_123",
+#     "user_tier": "premium",
+#     "current_usage": 45,
+#     "daily_limit": 100,
+#     "remaining": 55,
+#     "reset_time": "2025-11-22T00:00:00+00:00",
+#     "is_limited": False
+# }
+
+# Get limit for tier
+limit = limiter.get_limit_for_tier(UserTier.ELITE)  # 200
+```
+
+### Admin Functions
+
+```python
+# Reset user's daily counter (testing/admin)
+limiter.reset_usage("user_123")
+```
+
+### RateLimitExceeded Exception
+
+```python
+class RateLimitExceeded(Exception):
+    user_id: str
+    user_tier: UserTier
+    limit: int
+    current_usage: int
+    reset_time: datetime
+```
+
+Provides all information needed for user-friendly error messages.
+
+---
+
+## Integration Pattern
+
+### In AI Task Jobs
+
+```python
+import time
+
+from app.ai.model_selector import UserTier
+from app.ai.narrative_generator import NarrativeGenerator
+from app.models.ai_usage import TaskType
+from app.services.rate_limiter_service import RateLimiterService, RateLimitExceeded
+from app.services.usage_tracking_service import UsageTrackingService
+
+def process_ai_request(user_id: str, user_tier: UserTier, action: str, ...):
+    limiter = RateLimiterService()
+    tracker = UsageTrackingService()
+    generator = NarrativeGenerator()
+
+    # 1. Check rate limit BEFORE processing
+    try:
+        limiter.check_rate_limit(user_id, user_tier)
+    except RateLimitExceeded as e:
+        return {
+            "error": "rate_limit_exceeded",
+            "message": f"Daily limit reached ({e.limit} turns). Resets at {e.reset_time}",
+            "remaining": 0,
+            "reset_time": e.reset_time.isoformat()
+        }
+
+    # 2. Generate AI response
+    start_time = time.time()
+    response = generator.generate_story_response(...)
+    duration_ms = int((time.time() - start_time) * 1000)
+
+    # 3. Log usage (tokens are estimated in ReplicateClient)
+    tracker.log_usage(
+        user_id=user_id,
+        model=response.model,
+        tokens_input=response.tokens_input,   # From prompt length
+        tokens_output=response.tokens_output, # From response length
+        task_type=TaskType.STORY_PROGRESSION,
+        session_id=session_id,
+        request_duration_ms=duration_ms,
+        success=True
+    )
+
+    # 4. Increment rate limit counter
+    limiter.increment_usage(user_id)
+
+    return {"narrative": response.narrative, ...}
+```
+
+### API Endpoint Pattern
+
+```python
+@bp.route('/sessions/<session_id>/action', methods=['POST'])
+@require_auth
+def take_action(session_id):
+    user = get_current_user()
+    limiter = RateLimiterService()
+
+    # Check limit and return remaining info
+    try:
+        limiter.check_rate_limit(user.id, user.tier)
+    except RateLimitExceeded as e:
+        return api_response(
+            status=429,
+            error={
+                "code": "RATE_LIMIT_EXCEEDED",
+                "message": "Daily turn limit reached",
+                "details": {
+                    "limit": e.limit,
+                    "current": e.current_usage,
+                    "reset_time": e.reset_time.isoformat()
+                }
+            }
+        )
+
+    # Queue AI job...
+    remaining = limiter.get_remaining_turns(user.id, user.tier)
+
+    return api_response(
+        status=202,
+        result={
+            "job_id": job.id,
+            "remaining_turns": remaining
+        }
+    )
+```
+
+---
+
+## Appwrite Collection Schema
+
+**Collection:** `ai_usage_logs`
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `log_id` | string | Primary key |
+| `user_id` | string | User identifier |
+| `timestamp` | datetime | Request time (UTC) |
+| `model` | string | Model identifier |
+| `tokens_input` | integer | Input tokens |
+| `tokens_output` | integer | Output tokens |
+| `tokens_total` | integer | Total tokens |
+| `estimated_cost` | double | Cost in USD |
+| `task_type` | string | Task type enum |
+| `session_id` | string | Optional session |
+| `character_id` | string | Optional character |
+| `request_duration_ms` | integer | Duration |
+| `success` | boolean | Success status |
+| `error_message` | string | Error if failed |
+
+**Indexes:**
+- `user_id` + `timestamp` (for daily queries)
+- `timestamp` (for admin monitoring)
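+
+A daily query presumably filters on `user_id` plus a `timestamp` range, which is what the composite index serves. The sketch below computes such a range as a half-open UTC interval; `utc_day_bounds` is a hypothetical helper, and how the bounds are passed to Appwrite is deliberately left out.
+
+```python
+from datetime import date, datetime, time, timedelta, timezone
+
+
+def utc_day_bounds(day: date) -> tuple[datetime, datetime]:
+    """Return [start, end) datetimes covering one UTC calendar day."""
+    start = datetime.combine(day, time.min, tzinfo=timezone.utc)
+    return start, start + timedelta(days=1)
+
+
+start, end = utc_day_bounds(date(2025, 11, 21))
+# A log belongs to the day when start <= log.timestamp < end
+print(start.isoformat(), end.isoformat())
+```
+
+Using an exclusive upper bound avoids counting a request logged exactly at midnight in two different days.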
+
+---
+
+## Cost Management Best Practices
+
+### 1. Pre-request Validation
+
+Always check rate limits before processing:
+
+```python
+limiter.check_rate_limit(user_id, user_tier)
+```
+
+### 2. Log All Requests
+
+Log both successful and failed requests:
+
+```python
+tracker.log_usage(
+    ...,
+    success=False,
+    error_message="Model timeout"
+)
+```
+
+### 3. Monitor Platform Costs
+
+```python
+from datetime import date
+
+# Daily monitoring: check the higher threshold first so only one alert fires
+daily_cost = tracker.get_total_daily_cost(date.today())
+
+if daily_cost > 100:
+    send_alert("CRITICAL: Daily AI cost exceeded $100")
+elif daily_cost > 50:
+    send_alert("WARNING: Daily AI cost exceeded $50")
+```
+
+### 4. Cost Estimation for UI
+
+Show users estimated costs before actions:
+
+```python
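+# Rough estimate: pads base_tokens by 1.5x and assumes the same
+# token count for both input and output
+cost_info = UsageTrackingService.get_model_cost_info(model)
+estimated = (base_tokens * 1.5 / 1000) * (cost_info['input'] + cost_info['output'])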
+```
+
+### 5. Tier Upgrade Prompts
+
+When rate limited, prompt upgrades:
+
+```python
+if e.user_tier == UserTier.FREE:
+    message = "Upgrade to Basic for 50 turns/day!"
+elif e.user_tier == UserTier.BASIC:
+    message = "Upgrade to Premium for 100 turns/day!"
+```
+
+---
+
+## Target Cost Goals
+
+- **Development:** < $50/day
+- **Production target:** < $500/month total
+- **Cost per user:** ~$0.10/day (premium tier average)
+
+### Cost Breakdown by Tier (estimated daily)
+
+| Tier | Avg Requests | Avg Cost/Request | Daily Cost |
+|------|-------------|-----------------|------------|
+| FREE | 10 | $0.00 | $0.00 |
+| BASIC | 30 | $0.003 | $0.09 |
+| PREMIUM | 60 | $0.01 | $0.60 |
+| ELITE | 100 | $0.02 | $2.00 |
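+
+The Daily Cost column is the product of the two columns before it. The sketch below reproduces that arithmetic and adds a monthly projection; the 30-day month and the per-user framing are assumptions for illustration, not figures from the project.
+
+```python
+# (avg requests/day, avg cost/request in USD), copied from the table above
+TIER_ESTIMATES = {
+    "FREE": (10, 0.00),
+    "BASIC": (30, 0.003),
+    "PREMIUM": (60, 0.01),
+    "ELITE": (100, 0.02),
+}
+
+
+def daily_cost(tier: str) -> float:
+    """Estimated daily cost for one user on the given tier."""
+    requests, cost_per_request = TIER_ESTIMATES[tier]
+    return requests * cost_per_request
+
+
+def monthly_cost(tier: str, days: int = 30) -> float:
+    """Project the daily estimate over a month (30 days assumed)."""
+    return daily_cost(tier) * days
+
+
+for tier in TIER_ESTIMATES:
+    print(f"{tier}: ${daily_cost(tier):.2f}/day, ${monthly_cost(tier):.2f}/month")
+```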
+
+---
+
+## Testing
+
+### Unit Tests
+
+```python
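+import pytest
+
+from app.ai.model_selector import UserTier
+from app.models.ai_usage import TaskType
+from app.services.rate_limiter_service import RateLimiterService, RateLimitExceeded
+from app.services.usage_tracking_service import UsageTrackingService
+
+# test_usage_tracking_service.py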
+def test_log_usage():
+    tracker = UsageTrackingService()
+    log = tracker.log_usage(
+        user_id="test_user",
+        model="meta/meta-llama-3-8b-instruct",
+        tokens_input=100,
+        tokens_output=200,
+        task_type=TaskType.STORY_PROGRESSION
+    )
+    assert log.tokens_total == 300
+    assert log.estimated_cost > 0
+
+# test_rate_limiter_service.py
+def test_rate_limit_exceeded():
+    limiter = RateLimiterService()
+    limiter.reset_usage("test_user")  # start from a clean counter
+
+    # Reach the free tier limit (20 turns/day)
+    for _ in range(20):
+        limiter.increment_usage("test_user")
+
+    with pytest.raises(RateLimitExceeded):
+        limiter.check_rate_limit("test_user", UserTier.FREE)
+```
+
+### Integration Testing
+
+```bash
+# Check Redis connection
+redis-cli ping
+
+# Check Appwrite connection
+python -c "from app.services.usage_tracking_service import UsageTrackingService; UsageTrackingService()"
+```
+
+---
+
+## Future Enhancements (Deferred)
+
+- **Task 7.15:** Cost monitoring and alerts (daily job, email alerts)
+- Billing integration
+- Usage quotas per session
+- Real-time cost dashboard
+- Cost projections