Usage Tracking & Cost Controls
Overview
Code of Conquest implements comprehensive usage tracking and cost controls for AI operations. This ensures sustainable costs, fair usage across tiers, and visibility into system usage patterns.
Key Components:
- UsageTrackingService - Logs all AI usage and calculates costs
- RateLimiterService - Enforces tier-based daily limits
- AIUsageLog - Data model for usage events
Architecture
┌──────────────────────┐
│ AI Task Jobs         │
├──────────────────────┤
│ UsageTrackingService │ ← Logs usage, calculates costs
├──────────────────────┤
│ RateLimiterService   │ ← Enforces limits before processing
├──────────────────────┤
│ Redis + Appwrite     │ ← Storage layer
└──────────────────────┘
Usage Tracking Service
File: app/services/usage_tracking_service.py
Initialization
from app.services.usage_tracking_service import UsageTrackingService
tracker = UsageTrackingService()
Required Environment Variables:
APPWRITE_ENDPOINT=https://cloud.appwrite.io/v1
APPWRITE_PROJECT_ID=your-project-id
APPWRITE_API_KEY=your-api-key
APPWRITE_DATABASE_ID=main
Logging Usage
from app.models.ai_usage import TaskType
# Log a usage event
usage_log = tracker.log_usage(
    user_id="user_123",
    model="anthropic/claude-3.5-sonnet",
    tokens_input=150,
    tokens_output=450,
    task_type=TaskType.STORY_PROGRESSION,
    session_id="sess_789",
    character_id="char_456",
    request_duration_ms=2500,
    success=True
)
print(f"Log ID: {usage_log.log_id}")
print(f"Cost: ${usage_log.estimated_cost:.6f}")
Querying Usage
Daily Usage:
from datetime import date
# Get today's usage
usage = tracker.get_daily_usage("user_123", date.today())
print(f"Requests: {usage.total_requests}")
print(f"Tokens: {usage.total_tokens}")
print(f"Input tokens: {usage.total_input_tokens}")
print(f"Output tokens: {usage.total_output_tokens}")
print(f"Cost: ${usage.estimated_cost:.4f}")
print(f"By task: {usage.requests_by_task}")
# {"story_progression": 10, "combat_narration": 3, ...}
Monthly Cost:
# Get November 2025 cost
monthly = tracker.get_monthly_cost("user_123", 2025, 11)
print(f"Monthly requests: {monthly.total_requests}")
print(f"Monthly tokens: {monthly.total_tokens}")
print(f"Monthly cost: ${monthly.estimated_cost:.2f}")
Admin Monitoring:
# Get total platform cost for a day
total_cost = tracker.get_total_daily_cost(date.today())
print(f"Platform daily cost: ${total_cost:.2f}")
# Get user request count for rate limiting
count = tracker.get_user_request_count_today("user_123")
Cost Estimation
Static Methods (no instance needed):
from app.services.usage_tracking_service import UsageTrackingService
# Estimate cost for specific request
cost = UsageTrackingService.estimate_cost_for_model(
    model="anthropic/claude-3.5-sonnet",
    tokens_input=100,
    tokens_output=400
)
print(f"Estimated: ${cost:.6f}")
# Get model pricing
info = UsageTrackingService.get_model_cost_info("anthropic/claude-3.5-sonnet")
print(f"Input: ${info['input']}/1K tokens")
print(f"Output: ${info['output']}/1K tokens")
Model Pricing
Costs per 1,000 tokens (USD):
| Model | Input | Output | Tier |
|---|---|---|---|
| meta/meta-llama-3-8b-instruct | $0.0001 | $0.0001 | Free |
| meta/meta-llama-3-70b-instruct | $0.0006 | $0.0006 | - |
| anthropic/claude-3.5-haiku | $0.001 | $0.005 | Basic |
| anthropic/claude-3.5-sonnet | $0.003 | $0.015 | Premium |
| anthropic/claude-4.5-sonnet | $0.003 | $0.015 | Elite |
| anthropic/claude-3-opus | $0.015 | $0.075 | - |
Default cost for unknown models: $0.001 input, $0.005 output per 1K tokens
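The pricing table implies a simple per-request calculation. The following is a minimal sketch, assuming input and output tokens are each priced per 1,000 at the listed rates and that unknown models fall back to the default. `MODEL_COSTS`, `DEFAULT_COST` and `estimate_cost` are illustrative names, not necessarily how `UsageTrackingService` is implemented.

```python
# Rates in USD per 1,000 tokens, copied from the pricing table above.
MODEL_COSTS = {
    "meta/meta-llama-3-8b-instruct": {"input": 0.0001, "output": 0.0001},
    "anthropic/claude-3.5-haiku": {"input": 0.001, "output": 0.005},
    "anthropic/claude-3.5-sonnet": {"input": 0.003, "output": 0.015},
}

# Fallback for models that are not in the table.
DEFAULT_COST = {"input": 0.001, "output": 0.005}


def estimate_cost(model: str, tokens_input: int, tokens_output: int) -> float:
    """Return the estimated cost in USD for one request (hypothetical helper)."""
    rates = MODEL_COSTS.get(model, DEFAULT_COST)
    input_cost = (tokens_input / 1000) * rates["input"]
    output_cost = (tokens_output / 1000) * rates["output"]
    return input_cost + output_cost


# 150 input and 450 output tokens on claude-3.5-sonnet:
# 0.15 * 0.003 + 0.45 * 0.015 = 0.00045 + 0.00675
cost = estimate_cost("anthropic/claude-3.5-sonnet", 150, 450)
print(f"${cost:.6f}")
```

Output tokens dominate the total here because the output rate is five times the input rate for this model.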
Token Estimation
Since the Replicate API doesn't return exact token counts, tokens are estimated based on text length.
Estimation Formula
# Approximate 4 characters per token
tokens = len(text) // 4
How Tokens Are Calculated
Input Tokens:
- Calculated from the full prompt sent to the AI
- Includes: user prompt + system prompt
- Estimated at: len(prompt + system_prompt) // 4
Output Tokens:
- Calculated from the AI's response text
- Estimated at: len(response_text) // 4
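Put together, the two rules above can be sketched as follows. `estimate_tokens` is a hypothetical helper name and the placeholder strings stand in for real prompts; only the `len(text) // 4` rule comes from this document.

```python
def estimate_tokens(text: str) -> int:
    """Approximate token count at 4 characters per token (integer division)."""
    return len(text) // 4


# Placeholder text sized to make the arithmetic easy to follow.
prompt = "x" * 300          # user prompt, 300 characters
system_prompt = "y" * 100   # system prompt, 100 characters
response_text = "z" * 200   # model response, 200 characters

tokens_input = estimate_tokens(prompt + system_prompt)   # 400 chars
tokens_output = estimate_tokens(response_text)           # 200 chars
tokens_used = tokens_input + tokens_output

print(tokens_input, tokens_output, tokens_used)
```

Because the division floors, texts shorter than 4 characters estimate to 0 tokens, which is one reason to treat these numbers as a lower-leaning approximation.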
ReplicateResponse Structure
The Replicate client returns both input and output token estimates:
@dataclass
class ReplicateResponse:
    text: str
    tokens_used: int        # Total (input + output)
    tokens_input: int       # Estimated input tokens
    tokens_output: int      # Estimated output tokens
    model: str
    generation_time: float
Example Token Counts
| Content | Characters | Estimated Tokens |
|---|---|---|
| Short prompt | 400 chars | ~100 tokens |
| Full DM prompt | 4,000 chars | ~1,000 tokens |
| Short response | 200 chars | ~50 tokens |
| Full narrative | 800 chars | ~200 tokens |
Accuracy Notes
- Estimation is approximate (~75-80% accurate)
- Real tokenization varies by model
- Better to over-estimate for cost budgeting
- Logs use estimates; billing reconciliation may differ
Data Models
File: app/models/ai_usage.py
AIUsageLog
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class AIUsageLog:
    log_id: str                    # Unique identifier
    user_id: str                   # User who made request
    timestamp: datetime            # When request was made
    model: str                     # Model identifier
    tokens_input: int              # Input/prompt tokens
    tokens_output: int             # Output/response tokens
    tokens_total: int              # Total tokens
    estimated_cost: float          # Cost in USD
    task_type: TaskType            # Type of task
    session_id: Optional[str]      # Game session
    character_id: Optional[str]    # Character
    request_duration_ms: int       # Duration
    success: bool                  # Success status
    error_message: Optional[str]   # Error if failed
TaskType Enum
class TaskType(str, Enum):
    STORY_PROGRESSION = "story_progression"
    COMBAT_NARRATION = "combat_narration"
    QUEST_SELECTION = "quest_selection"
    NPC_DIALOGUE = "npc_dialogue"
    GENERAL = "general"
Summary Objects
@dataclass
class DailyUsageSummary:
    date: date
    user_id: str
    total_requests: int
    total_tokens: int
    total_input_tokens: int
    total_output_tokens: int
    estimated_cost: float
    requests_by_task: Dict[str, int]

@dataclass
class MonthlyUsageSummary:
    year: int
    month: int
    user_id: str
    total_requests: int
    total_tokens: int
    estimated_cost: float
    daily_breakdown: list
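To illustrate how individual usage logs could roll up into a daily summary, here is a minimal sketch. It assumes logs are available as plain dictionaries carrying the `AIUsageLog` field names; `summarize_day` is a hypothetical helper, and the real service may aggregate differently (for example, inside the database query).

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass
class DailyUsageSummary:
    date: date
    user_id: str
    total_requests: int
    total_tokens: int
    total_input_tokens: int
    total_output_tokens: int
    estimated_cost: float
    requests_by_task: Dict[str, int]


def summarize_day(user_id: str, day: date, logs: List[dict]) -> DailyUsageSummary:
    """Aggregate one user's logs for one day (hypothetical helper)."""
    tokens_in = sum(log["tokens_input"] for log in logs)
    tokens_out = sum(log["tokens_output"] for log in logs)
    return DailyUsageSummary(
        date=day,
        user_id=user_id,
        total_requests=len(logs),
        total_tokens=tokens_in + tokens_out,
        total_input_tokens=tokens_in,
        total_output_tokens=tokens_out,
        estimated_cost=sum(log["estimated_cost"] for log in logs),
        # Count requests per task type, e.g. {"story_progression": 2, ...}
        requests_by_task=dict(Counter(log["task_type"] for log in logs)),
    )


logs = [
    {"tokens_input": 150, "tokens_output": 450, "estimated_cost": 0.0072, "task_type": "story_progression"},
    {"tokens_input": 100, "tokens_output": 400, "estimated_cost": 0.0063, "task_type": "story_progression"},
    {"tokens_input": 80, "tokens_output": 120, "estimated_cost": 0.0020, "task_type": "combat_narration"},
]
summary = summarize_day("user_123", date(2025, 11, 21), logs)
print(summary.total_requests, summary.total_tokens, summary.requests_by_task)
```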
Rate Limiter Service
File: app/services/rate_limiter_service.py
Daily Turn Limits
| Tier | Limit | Cost Level |
|---|---|---|
| FREE | 20 turns/day | Zero |
| BASIC | 50 turns/day | Low |
| PREMIUM | 100 turns/day | Medium |
| ELITE | 200 turns/day | High |
Counters reset at midnight UTC.
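The midnight UTC reset can be sketched as below. The key format in `daily_counter_key` is an assumption made for illustration (this document does not show how the service names its Redis keys); the reset-time calculation follows directly from "midnight UTC".

```python
from datetime import datetime, timedelta, timezone


def next_reset_time(now: datetime) -> datetime:
    """Return the next midnight UTC after `now` (which must be timezone-aware)."""
    now_utc = now.astimezone(timezone.utc)
    tomorrow = now_utc.date() + timedelta(days=1)
    return datetime(tomorrow.year, tomorrow.month, tomorrow.day, tzinfo=timezone.utc)


def daily_counter_key(user_id: str, now: datetime) -> str:
    """Build a per-user, per-UTC-day counter key (hypothetical key format)."""
    day = now.astimezone(timezone.utc).date().isoformat()
    return f"rate_limit:daily:{user_id}:{day}"


now = datetime(2025, 11, 21, 15, 30, tzinfo=timezone.utc)
print(next_reset_time(now).isoformat())   # 2025-11-22T00:00:00+00:00
print(daily_counter_key("user_123", now)) # rate_limit:daily:user_123:2025-11-21
```

Including the UTC date in the key means a new day automatically starts from a fresh counter, so no explicit reset job is needed under this design.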
Custom Action Limits
Free-text actions (beyond preset buttons) have additional limits per tier:
| Tier | Custom Actions/Day | Character Limit |
|---|---|---|
| FREE | 10 | 150 chars |
| BASIC | 50 | 300 chars |
| PREMIUM | Unlimited | 500 chars |
| ELITE | Unlimited | 500 chars |
Configuration: These values are defined in config/*.yaml under rate_limiting.tiers:
tiers:
  free:
    custom_actions_per_day: 10
    custom_action_char_limit: 150
Access in code:
from app.config import get_config
config = get_config()
tier_config = config.rate_limiting.tiers['free']
print(tier_config.custom_actions_per_day) # 10
print(tier_config.custom_action_char_limit) # 150
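A custom action check against these limits might look like the sketch below. The limits are copied from the table above; `CUSTOM_ACTION_LIMITS` and `validate_custom_action` are hypothetical names, and representing "Unlimited" as `None` is an assumption about how the configuration could encode it.

```python
from typing import Optional, Tuple

# Values from the Custom Action Limits table; None stands for "Unlimited".
CUSTOM_ACTION_LIMITS = {
    "free": {"per_day": 10, "char_limit": 150},
    "basic": {"per_day": 50, "char_limit": 300},
    "premium": {"per_day": None, "char_limit": 500},
    "elite": {"per_day": None, "char_limit": 500},
}


def validate_custom_action(tier: str, text: str, used_today: int) -> Tuple[bool, Optional[str]]:
    """Return (allowed, reason). `reason` is None when the action is allowed."""
    limits = CUSTOM_ACTION_LIMITS[tier]
    if len(text) > limits["char_limit"]:
        return False, f"Action exceeds {limits['char_limit']} characters"
    if limits["per_day"] is not None and used_today >= limits["per_day"]:
        return False, f"Daily custom action limit reached ({limits['per_day']})"
    return True, None


print(validate_custom_action("free", "I search the room for traps.", used_today=3))
print(validate_custom_action("free", "I search the room for traps.", used_today=10))
print(validate_custom_action("premium", "a" * 501, used_today=999))
```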
Basic Usage
from app.services.rate_limiter_service import RateLimiterService, RateLimitExceeded
from app.ai.model_selector import UserTier
limiter = RateLimiterService()
# Check and increment (typical flow)
try:
    limiter.check_rate_limit("user_123", UserTier.PREMIUM)
    # Process AI request...
    limiter.increment_usage("user_123")
except RateLimitExceeded as e:
    print(f"Limit reached: {e.current_usage}/{e.limit}")
    print(f"Resets at: {e.reset_time}")
Query Methods
# Get current usage
current = limiter.get_current_usage("user_123")
# Get remaining turns
remaining = limiter.get_remaining_turns("user_123", UserTier.PREMIUM)
print(f"Remaining: {remaining} turns")
# Get comprehensive info
info = limiter.get_usage_info("user_123", UserTier.PREMIUM)
# {
# "user_id": "user_123",
# "user_tier": "premium",
# "current_usage": 45,
# "daily_limit": 100,
# "remaining": 55,
# "reset_time": "2025-11-22T00:00:00+00:00",
# "is_limited": False
# }
# Get limit for tier
limit = limiter.get_limit_for_tier(UserTier.ELITE) # 200
Admin Functions
# Reset user's daily counter (testing/admin)
limiter.reset_usage("user_123")
RateLimitExceeded Exception
class RateLimitExceeded(Exception):
    user_id: str
    user_tier: UserTier
    limit: int
    current_usage: int
    reset_time: datetime
Provides all information needed for user-friendly error messages.
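As an illustration of how those fields can drive a user-facing message, here is a self-contained sketch. The constructor signature, the `UserTier` values and `friendly_message` are assumptions for the example; only the attribute names come from the class shown above.

```python
from datetime import datetime, timezone
from enum import Enum


class UserTier(str, Enum):
    # Stand-in for app.ai.model_selector.UserTier; the string values are assumed.
    FREE = "free"
    BASIC = "basic"
    PREMIUM = "premium"
    ELITE = "elite"


class RateLimitExceeded(Exception):
    """Sketch of the exception with an explicit constructor (assumed signature)."""

    def __init__(self, user_id: str, user_tier: UserTier, limit: int,
                 current_usage: int, reset_time: datetime):
        self.user_id = user_id
        self.user_tier = user_tier
        self.limit = limit
        self.current_usage = current_usage
        self.reset_time = reset_time
        super().__init__(f"Daily limit reached: {current_usage}/{limit} turns")


def friendly_message(exc: RateLimitExceeded) -> str:
    """Format the exception fields for display (hypothetical helper)."""
    reset = exc.reset_time.strftime("%H:%M UTC")
    return f"You have used {exc.current_usage} of {exc.limit} turns today. Turns reset at {reset}."


error = RateLimitExceeded(
    "user_123", UserTier.FREE, 20, 20,
    datetime(2025, 11, 22, tzinfo=timezone.utc),
)
print(friendly_message(error))
```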
Integration Pattern
In AI Task Jobs
import time

from app.ai.model_selector import UserTier
from app.ai.narrative_generator import NarrativeGenerator
from app.models.ai_usage import TaskType
from app.services.rate_limiter_service import RateLimiterService, RateLimitExceeded
from app.services.usage_tracking_service import UsageTrackingService

def process_ai_request(user_id: str, user_tier: UserTier, action: str, session_id: str, ...):
    limiter = RateLimiterService()
    tracker = UsageTrackingService()
    generator = NarrativeGenerator()

    # 1. Check rate limit BEFORE processing
    try:
        limiter.check_rate_limit(user_id, user_tier)
    except RateLimitExceeded as e:
        return {
            "error": "rate_limit_exceeded",
            "message": f"Daily limit reached ({e.limit} turns). Resets at {e.reset_time}",
            "remaining": 0,
            "reset_time": e.reset_time.isoformat()
        }

    # 2. Generate AI response and measure how long it takes
    start_time = time.time()
    response = generator.generate_story_response(...)
    duration_ms = int((time.time() - start_time) * 1000)

    # 3. Log usage (tokens are estimated in ReplicateClient)
    tracker.log_usage(
        user_id=user_id,
        model=response.model,
        tokens_input=response.tokens_input,    # From prompt length
        tokens_output=response.tokens_output,  # From response length
        task_type=TaskType.STORY_PROGRESSION,
        session_id=session_id,
        request_duration_ms=duration_ms,
        success=True
    )

    # 4. Increment rate limit counter only after a successful request
    limiter.increment_usage(user_id)
    return {"narrative": response.narrative, ...}
API Endpoint Pattern
@bp.route('/sessions/<session_id>/action', methods=['POST'])
@require_auth
def take_action(session_id):
    user = get_current_user()
    limiter = RateLimiterService()

    # Check limit and return remaining info
    try:
        limiter.check_rate_limit(user.id, user.tier)
    except RateLimitExceeded as e:
        return api_response(
            status=429,
            error={
                "code": "RATE_LIMIT_EXCEEDED",
                "message": "Daily turn limit reached",
                "details": {
                    "limit": e.limit,
                    "current": e.current_usage,
                    "reset_time": e.reset_time.isoformat()
                }
            }
        )

    # Queue AI job...
    remaining = limiter.get_remaining_turns(user.id, user.tier)
    return api_response(
        status=202,
        result={
            "job_id": job.id,
            "remaining_turns": remaining
        }
    )
Appwrite Collection Schema
Collection: ai_usage_logs
| Field | Type | Description |
|---|---|---|
| log_id | string | Primary key |
| user_id | string | User identifier |
| timestamp | datetime | Request time (UTC) |
| model | string | Model identifier |
| tokens_input | integer | Input tokens |
| tokens_output | integer | Output tokens |
| tokens_total | integer | Total tokens |
| estimated_cost | double | Cost in USD |
| task_type | string | Task type enum |
| session_id | string | Optional session |
| character_id | string | Optional character |
| request_duration_ms | integer | Duration |
| success | boolean | Success status |
| error_message | string | Error if failed |
Indexes:
- user_id + timestamp (for daily queries)
- timestamp (for admin monitoring)
Cost Management Best Practices
1. Pre-request Validation
Always check rate limits before processing:
limiter.check_rate_limit(user_id, user_tier)
2. Log All Requests
Log both successful and failed requests:
tracker.log_usage(
    ...,
    success=False,
    error_message="Model timeout"
)
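One way to make sure failures are always logged is to wrap the generation call, as in the sketch below. `FakeTracker` and `run_and_log` are hypothetical names used so the example runs without Appwrite; the keyword arguments mirror the `log_usage` call shown in this document, and logging zero tokens on failure is an assumption.

```python
import time
from typing import Callable


class FakeTracker:
    """Stand-in for UsageTrackingService that just records log_usage calls."""

    def __init__(self):
        self.calls = []

    def log_usage(self, **kwargs):
        self.calls.append(kwargs)


def run_and_log(tracker, user_id: str, model: str, prompt: str,
                generate: Callable[[str], str]) -> str:
    """Call `generate(prompt)` and log the outcome whether it succeeds or fails."""
    start = time.time()
    tokens_input = len(prompt) // 4  # same 4-chars-per-token estimate as above
    try:
        text = generate(prompt)
    except Exception as exc:
        tracker.log_usage(
            user_id=user_id, model=model, task_type="general",
            tokens_input=tokens_input, tokens_output=0,
            request_duration_ms=int((time.time() - start) * 1000),
            success=False, error_message=str(exc),
        )
        raise  # let the caller decide how to surface the failure
    tracker.log_usage(
        user_id=user_id, model=model, task_type="general",
        tokens_input=tokens_input, tokens_output=len(text) // 4,
        request_duration_ms=int((time.time() - start) * 1000),
        success=True,
    )
    return text


def failing_model(prompt: str) -> str:
    raise TimeoutError("Model timeout")


tracker = FakeTracker()
try:
    run_and_log(tracker, "user_123", "anthropic/claude-3.5-haiku", "x" * 400, failing_model)
except TimeoutError:
    pass
print(tracker.calls[-1]["success"], tracker.calls[-1]["error_message"])
```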
3. Monitor Platform Costs
# Daily monitoring: check the higher threshold first so only one alert fires
daily_cost = tracker.get_total_daily_cost(date.today())
if daily_cost > 100:
    send_alert("CRITICAL: Daily AI cost exceeded $100")
elif daily_cost > 50:
    send_alert("WARNING: Daily AI cost exceeded $50")
4. Cost Estimation for UI
Show users estimated costs before actions:
cost_info = UsageTrackingService.get_model_cost_info(model)
estimated = (base_tokens * 1.5 / 1000) * (cost_info['input'] + cost_info['output'])
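For a concrete number, the formula above can be evaluated with the claude-3.5-sonnet rates from the pricing table. This is a sketch: `estimate_action_cost` is a hypothetical wrapper, and the `base_tokens` value of 1,000 is chosen only to make the arithmetic easy to follow.

```python
def estimate_action_cost(base_tokens: int, cost_info: dict, padding: float = 1.5) -> float:
    """Apply the UI estimate: padded tokens (in thousands) times input + output rates."""
    padded_thousands = base_tokens * padding / 1000
    return padded_thousands * (cost_info["input"] + cost_info["output"])


# claude-3.5-sonnet rates per 1K tokens, from the pricing table.
sonnet = {"input": 0.003, "output": 0.015}

# (1000 * 1.5 / 1000) * (0.003 + 0.015) = 1.5 * 0.018
estimated = estimate_action_cost(1000, sonnet)
print(f"${estimated:.4f}")
```

Note that the formula charges the padded token count at both the input and the output rate, so it leans toward over-estimating, which matches the budgeting advice earlier in this document.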
5. Tier Upgrade Prompts
When rate limited, prompt upgrades:
if e.user_tier == UserTier.FREE:
    message = "Upgrade to Basic for 50 turns/day!"
elif e.user_tier == UserTier.BASIC:
    message = "Upgrade to Premium for 100 turns/day!"
Target Cost Goals
- Development: < $50/day
- Production target: < $500/month total
- Cost per user: ~$0.10/day (premium tier average)
Cost Breakdown by Tier (estimated daily)
| Tier | Avg Requests | Avg Cost/Request | Daily Cost |
|---|---|---|---|
| FREE | 10 | $0.00 | $0.00 |
| BASIC | 30 | $0.003 | $0.09 |
| PREMIUM | 60 | $0.01 | $0.60 |
| ELITE | 100 | $0.02 | $2.00 |
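The Daily Cost column is the product of the two columns before it. The sketch below reproduces that arithmetic and adds a 30-day projection per user; the 30-day multiplier is an assumption for illustration, not a figure from this document.

```python
# (average requests per day, average cost per request in USD), from the table above.
TIER_ESTIMATES = {
    "FREE": (10, 0.00),
    "BASIC": (30, 0.003),
    "PREMIUM": (60, 0.01),
    "ELITE": (100, 0.02),
}

daily_costs = {}
for tier, (requests, cost_per_request) in TIER_ESTIMATES.items():
    daily = requests * cost_per_request
    daily_costs[tier] = daily
    # Assumed 30-day month for a rough per-user projection.
    print(f"{tier}: ${daily:.2f}/day, ${daily * 30:.2f} per 30 days")
```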
Testing
Unit Tests
import pytest

from app.ai.model_selector import UserTier
from app.models.ai_usage import TaskType
from app.services.rate_limiter_service import RateLimiterService, RateLimitExceeded
from app.services.usage_tracking_service import UsageTrackingService

# test_usage_tracking_service.py
def test_log_usage():
    tracker = UsageTrackingService()
    log = tracker.log_usage(
        user_id="test_user",
        model="meta/meta-llama-3-8b-instruct",
        tokens_input=100,
        tokens_output=200,
        task_type=TaskType.STORY_PROGRESSION
    )
    assert log.tokens_total == 300
    assert log.estimated_cost > 0

# test_rate_limiter_service.py
def test_rate_limit_exceeded():
    limiter = RateLimiterService()
    # Use up the free tier limit (20 turns/day)
    for _ in range(20):
        limiter.increment_usage("test_user")
    with pytest.raises(RateLimitExceeded):
        limiter.check_rate_limit("test_user", UserTier.FREE)
Integration Testing
# Check Redis connection
redis-cli ping
# Check Appwrite connection
python -c "from app.services.usage_tracking_service import UsageTrackingService; UsageTrackingService()"
Future Enhancements (Deferred)
- Task 7.15: Cost monitoring and alerts (daily job, email alerts)
- Billing integration
- Usage quotas per session
- Real-time cost dashboard
- Cost projections