# Usage Tracking & Cost Controls

## Overview

Code of Conquest implements comprehensive usage tracking and cost controls for AI operations. This ensures sustainable costs, fair usage across tiers, and visibility into system usage patterns.

**Key Components:**
- **UsageTrackingService** - Logs all AI usage and calculates costs
- **RateLimiterService** - Enforces tier-based daily limits
- **AIUsageLog** - Data model for usage events

---

## Architecture

```
┌─────────────────────┐
│   AI Task Jobs      │
├─────────────────────┤
│ UsageTrackingService│  ← Logs usage, calculates costs
├─────────────────────┤
│ RateLimiterService  │  ← Enforces limits before processing
├─────────────────────┤
│  Redis + Appwrite   │  ← Storage layer
└─────────────────────┘
```

---

## Usage Tracking Service

**File:** `app/services/usage_tracking_service.py`

### Initialization

```python
from app.services.usage_tracking_service import UsageTrackingService

tracker = UsageTrackingService()
```

**Required Environment Variables:**

```bash
APPWRITE_ENDPOINT=https://cloud.appwrite.io/v1
APPWRITE_PROJECT_ID=your-project-id
APPWRITE_API_KEY=your-api-key
APPWRITE_DATABASE_ID=main
```

### Logging Usage

```python
from app.models.ai_usage import TaskType

# Log a usage event
usage_log = tracker.log_usage(
    user_id="user_123",
    model="anthropic/claude-3.5-sonnet",
    tokens_input=150,
    tokens_output=450,
    task_type=TaskType.STORY_PROGRESSION,
    session_id="sess_789",
    character_id="char_456",
    request_duration_ms=2500,
    success=True
)

print(f"Log ID: {usage_log.log_id}")
print(f"Cost: ${usage_log.estimated_cost:.6f}")
```

### Querying Usage

**Daily Usage:**

```python
from datetime import date

# Get today's usage
usage = tracker.get_daily_usage("user_123", date.today())

print(f"Requests: {usage.total_requests}")
print(f"Tokens: {usage.total_tokens}")
print(f"Input tokens: {usage.total_input_tokens}")
print(f"Output tokens: {usage.total_output_tokens}")
print(f"Cost: ${usage.estimated_cost:.4f}")
print(f"By task: {usage.requests_by_task}")
# {"story_progression": 10, "combat_narration": 3, ...}
```

**Monthly Cost:**

```python
# Get November 2025 cost
monthly = tracker.get_monthly_cost("user_123", 2025, 11)

print(f"Monthly requests: {monthly.total_requests}")
print(f"Monthly tokens: {monthly.total_tokens}")
print(f"Monthly cost: ${monthly.estimated_cost:.2f}")
```

**Admin Monitoring:**

```python
# Get total platform cost for a day
total_cost = tracker.get_total_daily_cost(date.today())
print(f"Platform daily cost: ${total_cost:.2f}")

# Get user request count for rate limiting
count = tracker.get_user_request_count_today("user_123")
```

### Cost Estimation

**Static Methods (no instance needed):**

```python
from app.services.usage_tracking_service import UsageTrackingService

# Estimate cost for specific request
cost = UsageTrackingService.estimate_cost_for_model(
    model="anthropic/claude-3.5-sonnet",
    tokens_input=100,
    tokens_output=400
)
print(f"Estimated: ${cost:.6f}")

# Get model pricing
info = UsageTrackingService.get_model_cost_info("anthropic/claude-3.5-sonnet")
print(f"Input: ${info['input']}/1K tokens")
print(f"Output: ${info['output']}/1K tokens")
```

---

## Model Pricing

Costs per 1,000 tokens (USD):

| Model | Input | Output | Tier |
|-------|-------|--------|------|
| `meta/meta-llama-3-8b-instruct` | $0.0001 | $0.0001 | Free |
| `meta/meta-llama-3-70b-instruct` | $0.0006 | $0.0006 | - |
| `anthropic/claude-3.5-haiku` | $0.001 | $0.005 | Basic |
| `anthropic/claude-3.5-sonnet` | $0.003 | $0.015 | Premium |
| `anthropic/claude-4.5-sonnet` | $0.003 | $0.015 | Elite |
| `anthropic/claude-3-opus` | $0.015 | $0.075 | - |

**Default cost for unknown models:** $0.001 input, $0.005 output per 1K tokens

---

## Token Estimation

Since the Replicate API doesn't return exact token counts, tokens are estimated based on text length.
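As a concrete illustration, the sketch below combines a character-based token estimate with the per-1K-token prices from the Model Pricing table. It assumes roughly four characters per token. The names `PRICING`, `DEFAULT_PRICING`, `estimate_tokens`, and `estimate_cost` are hypothetical and chosen for this example only; they are not the project's actual implementation.

```python
# Illustrative sketch only: every name here is hypothetical, not the project's API.

# Per-1K-token prices (USD), copied from the Model Pricing table.
PRICING = {
    "meta/meta-llama-3-8b-instruct": {"input": 0.0001, "output": 0.0001},
    "anthropic/claude-3.5-sonnet": {"input": 0.003, "output": 0.015},
}

# Fallback prices for models that are not listed in the table.
DEFAULT_PRICING = {"input": 0.001, "output": 0.005}


def estimate_tokens(text: str) -> int:
    """Estimate the token count, assuming about 4 characters per token."""
    return len(text) // 4


def estimate_cost(model: str, prompt: str, response: str) -> float:
    """Estimate the USD cost of one request from its prompt and response text."""
    prices = PRICING.get(model, DEFAULT_PRICING)
    tokens_in = estimate_tokens(prompt)
    tokens_out = estimate_tokens(response)
    return (tokens_in / 1000) * prices["input"] + (tokens_out / 1000) * prices["output"]


if __name__ == "__main__":
    prompt = "x" * 4000    # about 1,000 estimated input tokens
    response = "y" * 800   # about 200 estimated output tokens
    cost = estimate_cost("anthropic/claude-3.5-sonnet", prompt, response)
    print(f"Estimated cost: ${cost:.6f}")
```

With these inputs the input and output sides each contribute about $0.003, so one full-size story turn on the Premium model costs roughly $0.006.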
### Estimation Formula

```python
# Approximate 4 characters per token
tokens = len(text) // 4
```

### How Tokens Are Calculated

**Input Tokens:**
- Calculated from the full prompt sent to the AI
- Includes: user prompt + system prompt
- Estimated at: `len(prompt + system_prompt) // 4`

**Output Tokens:**
- Calculated from the AI's response text
- Estimated at: `len(response_text) // 4`

### ReplicateResponse Structure

The Replicate client returns both input and output token estimates:

```python
@dataclass
class ReplicateResponse:
    text: str
    tokens_used: int        # Total (input + output)
    tokens_input: int       # Estimated input tokens
    tokens_output: int      # Estimated output tokens
    model: str
    generation_time: float
```

### Example Token Counts

| Content | Characters | Estimated Tokens |
|---------|------------|------------------|
| Short prompt | 400 chars | ~100 tokens |
| Full DM prompt | 4,000 chars | ~1,000 tokens |
| Short response | 200 chars | ~50 tokens |
| Full narrative | 800 chars | ~200 tokens |

### Accuracy Notes

- Estimation is approximate (~75-80% accurate)
- Real tokenization varies by model
- Better to over-estimate for cost budgeting
- Logs use estimates; billing reconciliation may differ

---

## Data Models

**File:** `app/models/ai_usage.py`

### AIUsageLog

```python
@dataclass
class AIUsageLog:
    log_id: str                    # Unique identifier
    user_id: str                   # User who made request
    timestamp: datetime            # When request was made
    model: str                     # Model identifier
    tokens_input: int              # Input/prompt tokens
    tokens_output: int             # Output/response tokens
    tokens_total: int              # Total tokens
    estimated_cost: float          # Cost in USD
    task_type: TaskType            # Type of task
    session_id: Optional[str]      # Game session
    character_id: Optional[str]    # Character
    request_duration_ms: int       # Duration
    success: bool                  # Success status
    error_message: Optional[str]   # Error if failed
```

### TaskType Enum

```python
class TaskType(str, Enum):
    STORY_PROGRESSION = "story_progression"
    COMBAT_NARRATION = "combat_narration"
    QUEST_SELECTION = "quest_selection"
    NPC_DIALOGUE = "npc_dialogue"
    GENERAL = "general"
```

### Summary Objects

```python
@dataclass
class DailyUsageSummary:
    date: date
    user_id: str
    total_requests: int
    total_tokens: int
    total_input_tokens: int
    total_output_tokens: int
    estimated_cost: float
    requests_by_task: Dict[str, int]

@dataclass
class MonthlyUsageSummary:
    year: int
    month: int
    user_id: str
    total_requests: int
    total_tokens: int
    estimated_cost: float
    daily_breakdown: list
```

---

## Rate Limiter Service

**File:** `app/services/rate_limiter_service.py`

### Daily Turn Limits

| Tier | Limit | Cost Level |
|------|-------|------------|
| FREE | 20 turns/day | Zero |
| BASIC | 50 turns/day | Low |
| PREMIUM | 100 turns/day | Medium |
| ELITE | 200 turns/day | High |

Counters reset at midnight UTC.

### Custom Action Limits

Free-text actions (beyond preset buttons) have additional limits per tier:

| Tier | Custom Actions/Day | Character Limit |
|------|-------------------|-----------------|
| FREE | 10 | 150 chars |
| BASIC | 50 | 300 chars |
| PREMIUM | Unlimited | 500 chars |
| ELITE | Unlimited | 500 chars |

**Configuration:** These values are defined in `config/*.yaml` under `rate_limiting.tiers`:

```yaml
rate_limiting:
  tiers:
    free:
      custom_actions_per_day: 10
      custom_action_char_limit: 150
```

**Access in code:**

```python
from app.config import get_config

config = get_config()
tier_config = config.rate_limiting.tiers['free']
print(tier_config.custom_actions_per_day)    # 10
print(tier_config.custom_action_char_limit)  # 150
```

### Basic Usage

```python
from app.services.rate_limiter_service import RateLimiterService, RateLimitExceeded
from app.ai.model_selector import UserTier

limiter = RateLimiterService()

# Check and increment (typical flow)
try:
    limiter.check_rate_limit("user_123", UserTier.PREMIUM)
    # Process AI request...
    limiter.increment_usage("user_123")
except RateLimitExceeded as e:
    print(f"Limit reached: {e.current_usage}/{e.limit}")
    print(f"Resets at: {e.reset_time}")
```

### Query Methods

```python
# Get current usage
current = limiter.get_current_usage("user_123")

# Get remaining turns
remaining = limiter.get_remaining_turns("user_123", UserTier.PREMIUM)
print(f"Remaining: {remaining} turns")

# Get comprehensive info
info = limiter.get_usage_info("user_123", UserTier.PREMIUM)
# {
#     "user_id": "user_123",
#     "user_tier": "premium",
#     "current_usage": 45,
#     "daily_limit": 100,
#     "remaining": 55,
#     "reset_time": "2025-11-22T00:00:00+00:00",
#     "is_limited": False
# }

# Get limit for tier
limit = limiter.get_limit_for_tier(UserTier.ELITE)  # 200
```

### Admin Functions

```python
# Reset user's daily counter (testing/admin)
limiter.reset_usage("user_123")
```

### RateLimitExceeded Exception

```python
class RateLimitExceeded(Exception):
    user_id: str
    user_tier: UserTier
    limit: int
    current_usage: int
    reset_time: datetime
```

The exception carries all the information needed to build a user-friendly error message.

---

## Integration Pattern

### In AI Task Jobs

```python
import time

from app.services.rate_limiter_service import RateLimiterService, RateLimitExceeded
from app.services.usage_tracking_service import UsageTrackingService
from app.ai.model_selector import UserTier
from app.ai.narrative_generator import NarrativeGenerator
from app.models.ai_usage import TaskType

def process_ai_request(user_id: str, user_tier: UserTier, action: str, ...):
    limiter = RateLimiterService()
    tracker = UsageTrackingService()
    generator = NarrativeGenerator()

    # 1. Check rate limit BEFORE processing
    try:
        limiter.check_rate_limit(user_id, user_tier)
    except RateLimitExceeded as e:
        return {
            "error": "rate_limit_exceeded",
            "message": f"Daily limit reached ({e.limit} turns). Resets at {e.reset_time}",
            "remaining": 0,
            "reset_time": e.reset_time.isoformat()
        }

    # 2. Generate AI response
    start_time = time.time()
    response = generator.generate_story_response(...)
    duration_ms = int((time.time() - start_time) * 1000)

    # 3. Log usage (tokens are estimated in ReplicateClient)
    tracker.log_usage(
        user_id=user_id,
        model=response.model,
        tokens_input=response.tokens_input,    # From prompt length
        tokens_output=response.tokens_output,  # From response length
        task_type=TaskType.STORY_PROGRESSION,
        session_id=session_id,
        request_duration_ms=duration_ms,
        success=True
    )

    # 4. Increment rate limit counter
    limiter.increment_usage(user_id)

    return {"narrative": response.narrative, ...}
```

### API Endpoint Pattern

```python
@bp.route('/sessions/<session_id>/action', methods=['POST'])
@require_auth
def take_action(session_id):
    user = get_current_user()
    limiter = RateLimiterService()

    # Check limit and return remaining info
    try:
        limiter.check_rate_limit(user.id, user.tier)
    except RateLimitExceeded as e:
        return api_response(
            status=429,
            error={
                "code": "RATE_LIMIT_EXCEEDED",
                "message": "Daily turn limit reached",
                "details": {
                    "limit": e.limit,
                    "current": e.current_usage,
                    "reset_time": e.reset_time.isoformat()
                }
            }
        )

    # Queue AI job...

    remaining = limiter.get_remaining_turns(user.id, user.tier)
    return api_response(
        status=202,
        result={
            "job_id": job.id,
            "remaining_turns": remaining
        }
    )
```

---

## Appwrite Collection Schema

**Collection:** `ai_usage_logs`

| Field | Type | Description |
|-------|------|-------------|
| `log_id` | string | Primary key |
| `user_id` | string | User identifier |
| `timestamp` | datetime | Request time (UTC) |
| `model` | string | Model identifier |
| `tokens_input` | integer | Input tokens |
| `tokens_output` | integer | Output tokens |
| `tokens_total` | integer | Total tokens |
| `estimated_cost` | double | Cost in USD |
| `task_type` | string | Task type enum |
| `session_id` | string | Optional session |
| `character_id` | string | Optional character |
| `request_duration_ms` | integer | Duration |
| `success` | boolean | Success status |
| `error_message` | string | Error if failed |

**Indexes:**
- `user_id` + `timestamp` (for daily queries)
- `timestamp` (for admin monitoring)

---

## Cost Management Best Practices

### 1. Pre-request Validation

Always check rate limits before processing:

```python
limiter.check_rate_limit(user_id, user_tier)
```

### 2. Log All Requests

Log both successful and failed requests:

```python
tracker.log_usage(
    ...,
    success=False,
    error_message="Model timeout"
)
```

### 3. Monitor Platform Costs

```python
# Daily monitoring: check the higher threshold first so only one alert fires
daily_cost = tracker.get_total_daily_cost(date.today())

if daily_cost > 100:
    send_alert("CRITICAL: Daily AI cost exceeded $100")
elif daily_cost > 50:
    send_alert("WARNING: Daily AI cost exceeded $50")
```

### 4. Cost Estimation for UI

Show users estimated costs before actions:

```python
cost_info = UsageTrackingService.get_model_cost_info(model)
estimated = (base_tokens * 1.5 / 1000) * (cost_info['input'] + cost_info['output'])
```

### 5. Tier Upgrade Prompts

When a user is rate limited, prompt an upgrade:

```python
if e.user_tier == UserTier.FREE:
    message = "Upgrade to Basic for 50 turns/day!"
elif e.user_tier == UserTier.BASIC:
    message = "Upgrade to Premium for 100 turns/day!"
```

---

## Target Cost Goals

- **Development:** < $50/day
- **Production target:** < $500/month total
- **Cost per user:** ~$0.10/day (premium tier average)

### Cost Breakdown by Tier (estimated daily)

| Tier | Avg Requests | Avg Cost/Request | Daily Cost |
|------|-------------|-----------------|------------|
| FREE | 10 | $0.00 | $0.00 |
| BASIC | 30 | $0.003 | $0.09 |
| PREMIUM | 60 | $0.01 | $0.60 |
| ELITE | 100 | $0.02 | $2.00 |

---

## Testing

### Unit Tests

```python
# test_usage_tracking_service.py
from app.models.ai_usage import TaskType
from app.services.usage_tracking_service import UsageTrackingService

def test_log_usage():
    tracker = UsageTrackingService()
    log = tracker.log_usage(
        user_id="test_user",
        model="meta/meta-llama-3-8b-instruct",
        tokens_input=100,
        tokens_output=200,
        task_type=TaskType.STORY_PROGRESSION
    )
    assert log.tokens_total == 300
    assert log.estimated_cost > 0

# test_rate_limiter_service.py
import pytest
from app.ai.model_selector import UserTier
from app.services.rate_limiter_service import RateLimiterService, RateLimitExceeded

def test_rate_limit_exceeded():
    limiter = RateLimiterService()

    # Exceed free tier limit
    for _ in range(20):
        limiter.increment_usage("test_user")

    with pytest.raises(RateLimitExceeded):
        limiter.check_rate_limit("test_user", UserTier.FREE)
```

### Integration Testing

```bash
# Check Redis connection
redis-cli ping

# Check Appwrite connection
python -c "from app.services.usage_tracking_service import UsageTrackingService; UsageTrackingService()"
```

---

## Future Enhancements (Deferred)

- **Task 7.15:** Cost monitoring and alerts (daily job, email alerts)
- Billing integration
- Usage quotas per session
- Real-time cost dashboard
- Cost projections
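
The cost projections item could start from the per-tier estimates in the Cost Breakdown by Tier table. The following is a minimal sketch under stated assumptions: the per-user daily costs are copied from that table, while `DAILY_COST_PER_USER`, `project_monthly_cost`, the user counts, and the 30-day month are hypothetical and not part of the existing codebase.

```python
from typing import Dict

# Estimated daily cost per active user (USD), copied from the
# "Cost Breakdown by Tier" table. The dictionary name is hypothetical.
DAILY_COST_PER_USER: Dict[str, float] = {
    "free": 0.00,
    "basic": 0.09,
    "premium": 0.60,
    "elite": 2.00,
}


def project_monthly_cost(active_users: Dict[str, int], days: int = 30) -> float:
    """Project the monthly platform cost from active user counts per tier."""
    daily_total = sum(
        DAILY_COST_PER_USER[tier] * count for tier, count in active_users.items()
    )
    return daily_total * days


if __name__ == "__main__":
    # Made-up user mix, for illustration only.
    users = {"free": 100, "basic": 20, "premium": 10, "elite": 2}
    print(f"Projected monthly cost: ${project_monthly_cost(users):.2f}")
```

For this made-up user mix the projection comes to roughly $354 for a 30-day month, which stays under the $500/month production target.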