# Usage Tracking & Cost Controls

## Overview

Code of Conquest implements comprehensive usage tracking and cost controls for AI operations. This ensures sustainable costs, fair usage across tiers, and visibility into system usage patterns.

**Key Components:**
- **UsageTrackingService** - Logs all AI usage and calculates costs
- **RateLimiterService** - Enforces tier-based daily limits
- **AIUsageLog** - Data model for usage events

---

## Architecture

```
┌─────────────────────┐
│    AI Task Jobs     │
├─────────────────────┤
│ UsageTrackingService│  ← Logs usage, calculates costs
├─────────────────────┤
│ RateLimiterService  │  ← Enforces limits before processing
├─────────────────────┤
│ Redis + Appwrite    │  ← Storage layer
└─────────────────────┘
```

---

## Usage Tracking Service

**File:** `app/services/usage_tracking_service.py`

### Initialization

```python
from app.services.usage_tracking_service import UsageTrackingService

tracker = UsageTrackingService()
```

**Required Environment Variables:**
```bash
APPWRITE_ENDPOINT=https://cloud.appwrite.io/v1
APPWRITE_PROJECT_ID=your-project-id
APPWRITE_API_KEY=your-api-key
APPWRITE_DATABASE_ID=main
```

### Logging Usage

```python
from app.models.ai_usage import TaskType

# Log a usage event
usage_log = tracker.log_usage(
    user_id="user_123",
    model="anthropic/claude-3.5-sonnet",
    tokens_input=150,
    tokens_output=450,
    task_type=TaskType.STORY_PROGRESSION,
    session_id="sess_789",
    character_id="char_456",
    request_duration_ms=2500,
    success=True
)

print(f"Log ID: {usage_log.log_id}")
print(f"Cost: ${usage_log.estimated_cost:.6f}")
```

### Querying Usage

**Daily Usage:**
```python
from datetime import date

# Get today's usage
usage = tracker.get_daily_usage("user_123", date.today())

print(f"Requests: {usage.total_requests}")
print(f"Tokens: {usage.total_tokens}")
print(f"Input tokens: {usage.total_input_tokens}")
print(f"Output tokens: {usage.total_output_tokens}")
print(f"Cost: ${usage.estimated_cost:.4f}")
print(f"By task: {usage.requests_by_task}")
# {"story_progression": 10, "combat_narration": 3, ...}
```

**Monthly Cost:**
```python
# Get November 2025 cost
monthly = tracker.get_monthly_cost("user_123", 2025, 11)

print(f"Monthly requests: {monthly.total_requests}")
print(f"Monthly tokens: {monthly.total_tokens}")
print(f"Monthly cost: ${monthly.estimated_cost:.2f}")
```

**Admin Monitoring:**
```python
# Get total platform cost for a day
total_cost = tracker.get_total_daily_cost(date.today())
print(f"Platform daily cost: ${total_cost:.2f}")

# Get user request count for rate limiting
count = tracker.get_user_request_count_today("user_123")
```

### Cost Estimation

**Static Methods (no instance needed):**
```python
from app.services.usage_tracking_service import UsageTrackingService

# Estimate cost for specific request
cost = UsageTrackingService.estimate_cost_for_model(
    model="anthropic/claude-3.5-sonnet",
    tokens_input=100,
    tokens_output=400
)
print(f"Estimated: ${cost:.6f}")

# Get model pricing
info = UsageTrackingService.get_model_cost_info("anthropic/claude-3.5-sonnet")
print(f"Input: ${info['input']}/1K tokens")
print(f"Output: ${info['output']}/1K tokens")
```

---

## Model Pricing

Costs per 1,000 tokens (USD):

| Model | Input | Output | Tier |
|-------|-------|--------|------|
| `meta/meta-llama-3-8b-instruct` | $0.0001 | $0.0001 | Free |
| `meta/meta-llama-3-70b-instruct` | $0.0006 | $0.0006 | - |
| `anthropic/claude-3.5-haiku` | $0.001 | $0.005 | Basic |
| `anthropic/claude-3.5-sonnet` | $0.003 | $0.015 | Premium |
| `anthropic/claude-4.5-sonnet` | $0.003 | $0.015 | Elite |
| `anthropic/claude-3-opus` | $0.015 | $0.075 | - |

**Default cost for unknown models:** $0.001 input, $0.005 output per 1K tokens
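
As a rough illustration of how the per-1K rates above turn into a per-request cost, the sketch below multiplies each token count by its rate and falls back to the default rates for unknown models. `PRICING`, `DEFAULT_PRICING`, and `estimate_cost` are hypothetical names used only for this example; the real calculation lives in `UsageTrackingService` and may differ.

```python
# Hypothetical sketch, not the service's actual code.
# Rates are copied from the pricing table (USD per 1,000 tokens).
PRICING = {
    "meta/meta-llama-3-8b-instruct": {"input": 0.0001, "output": 0.0001},
    "anthropic/claude-3.5-haiku": {"input": 0.001, "output": 0.005},
    "anthropic/claude-3.5-sonnet": {"input": 0.003, "output": 0.015},
}

# Fallback used when the model is not in the table.
DEFAULT_PRICING = {"input": 0.001, "output": 0.005}


def estimate_cost(model: str, tokens_input: int, tokens_output: int) -> float:
    """Return the estimated USD cost of one request."""
    rates = PRICING.get(model, DEFAULT_PRICING)
    input_cost = (tokens_input / 1000) * rates["input"]
    output_cost = (tokens_output / 1000) * rates["output"]
    return input_cost + output_cost


# 150 input tokens and 450 output tokens on claude-3.5-sonnet:
# 0.15 * 0.003 + 0.45 * 0.015 = 0.0072 USD
cost = estimate_cost("anthropic/claude-3.5-sonnet", 150, 450)
print(f"${cost:.6f}")
```

Because output tokens cost five times as much as input tokens on the Claude models, response length tends to dominate the bill in this example.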

---

## Token Estimation

Since the Replicate API doesn't return exact token counts, tokens are estimated based on text length.

### Estimation Formula

```python
# Approximate 4 characters per token
tokens = len(text) // 4
```

### How Tokens Are Calculated

**Input Tokens:**
- Calculated from the full prompt sent to the AI
- Includes: user prompt + system prompt
- Estimated at: `len(prompt + system_prompt) // 4`

**Output Tokens:**
- Calculated from the AI's response text
- Estimated at: `len(response_text) // 4`
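
The two rules above can be sketched as a pair of small helpers. The function names `estimate_tokens` and `estimate_request_tokens` are assumptions made for this example and are not taken from the Replicate client.

```python
def estimate_tokens(text: str) -> int:
    """Approximate token count at roughly 4 characters per token."""
    return len(text) // 4


def estimate_request_tokens(prompt: str, system_prompt: str, response_text: str):
    """Return (input, output, total) token estimates for one request."""
    # Input covers the user prompt plus the system prompt.
    tokens_input = estimate_tokens(prompt + system_prompt)
    # Output covers only the model's response text.
    tokens_output = estimate_tokens(response_text)
    return tokens_input, tokens_output, tokens_input + tokens_output


# 300-char prompt + 100-char system prompt, 800-char response
tokens_in, tokens_out, total = estimate_request_tokens("p" * 300, "s" * 100, "r" * 800)
print(tokens_in, tokens_out, total)  # 100 200 300
```

Note that integer division rounds down, so very short strings (under 4 characters) estimate to zero tokens.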

### ReplicateResponse Structure

The Replicate client returns both input and output token estimates:

```python
@dataclass
class ReplicateResponse:
    text: str
    tokens_used: int       # Total (input + output)
    tokens_input: int      # Estimated input tokens
    tokens_output: int     # Estimated output tokens
    model: str
    generation_time: float
```

### Example Token Counts

| Content | Characters | Estimated Tokens |
|---------|------------|------------------|
| Short prompt | 400 chars | ~100 tokens |
| Full DM prompt | 4,000 chars | ~1,000 tokens |
| Short response | 200 chars | ~50 tokens |
| Full narrative | 800 chars | ~200 tokens |

### Accuracy Notes

- Estimation is approximate (~75-80% accurate)
- Real tokenization varies by model
- Better to over-estimate for cost budgeting
- Logs use estimates; billing reconciliation may differ

---

## Data Models

**File:** `app/models/ai_usage.py`

### AIUsageLog

```python
@dataclass
class AIUsageLog:
    log_id: str                    # Unique identifier
    user_id: str                   # User who made request
    timestamp: datetime            # When request was made
    model: str                     # Model identifier
    tokens_input: int              # Input/prompt tokens
    tokens_output: int             # Output/response tokens
    tokens_total: int              # Total tokens
    estimated_cost: float          # Cost in USD
    task_type: TaskType            # Type of task
    session_id: Optional[str]      # Game session
    character_id: Optional[str]    # Character
    request_duration_ms: int       # Duration
    success: bool                  # Success status
    error_message: Optional[str]   # Error if failed
```

### TaskType Enum

```python
class TaskType(str, Enum):
    STORY_PROGRESSION = "story_progression"
    COMBAT_NARRATION = "combat_narration"
    QUEST_SELECTION = "quest_selection"
    NPC_DIALOGUE = "npc_dialogue"
    GENERAL = "general"
```

### Summary Objects

```python
@dataclass
class DailyUsageSummary:
    date: date
    user_id: str
    total_requests: int
    total_tokens: int
    total_input_tokens: int
    total_output_tokens: int
    estimated_cost: float
    requests_by_task: Dict[str, int]

@dataclass
class MonthlyUsageSummary:
    year: int
    month: int
    user_id: str
    total_requests: int
    total_tokens: int
    estimated_cost: float
    daily_breakdown: list
```
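
To make the relationship between individual log entries and the summary fields concrete, here is a minimal sketch of how a list of logs might be folded into daily totals. `LogRecord` is a hypothetical, trimmed-down stand-in for `AIUsageLog`, and `summarize` is an illustrative helper; the real service presumably queries Appwrite and may aggregate differently.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class LogRecord:
    """Hypothetical stand-in carrying only the fields needed for totals."""
    tokens_input: int
    tokens_output: int
    estimated_cost: float
    task_type: str


def summarize(logs):
    """Fold log records into totals matching the DailyUsageSummary fields."""
    total_in = sum(log.tokens_input for log in logs)
    total_out = sum(log.tokens_output for log in logs)
    return {
        "total_requests": len(logs),
        "total_input_tokens": total_in,
        "total_output_tokens": total_out,
        "total_tokens": total_in + total_out,
        "estimated_cost": sum(log.estimated_cost for log in logs),
        # Count how many requests were made per task type.
        "requests_by_task": dict(Counter(log.task_type for log in logs)),
    }


logs = [
    LogRecord(150, 450, 0.0072, "story_progression"),
    LogRecord(100, 200, 0.0033, "story_progression"),
    LogRecord(80, 120, 0.0020, "combat_narration"),
]
summary = summarize(logs)
print(summary["total_tokens"], summary["requests_by_task"])
```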

---

## Rate Limiter Service

**File:** `app/services/rate_limiter_service.py`

### Daily Turn Limits

| Tier | Limit | Cost Level |
|------|-------|------------|
| FREE | 20 turns/day | Zero |
| BASIC | 50 turns/day | Low |
| PREMIUM | 100 turns/day | Medium |
| ELITE | 200 turns/day | High |

Counters reset at midnight UTC.
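
For reference, the next reset moment can be derived from the current time with the standard library alone. This is a sketch under the assumption that "midnight UTC" means the start of the next UTC calendar day; `next_reset_time` is a hypothetical helper, not a method of `RateLimiterService`.

```python
from datetime import datetime, timedelta, timezone


def next_reset_time(now: datetime) -> datetime:
    """Return the next midnight UTC after the given timezone-aware datetime."""
    now_utc = now.astimezone(timezone.utc)
    tomorrow = now_utc.date() + timedelta(days=1)
    return datetime(tomorrow.year, tomorrow.month, tomorrow.day, tzinfo=timezone.utc)


now = datetime(2025, 11, 21, 15, 30, tzinfo=timezone.utc)
reset = next_reset_time(now)
print(reset.isoformat())  # 2025-11-22T00:00:00+00:00
```

Working in UTC throughout avoids daylight-saving surprises and keeps the reset moment identical for every user regardless of locale.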

### Custom Action Limits

Free-text actions (beyond preset buttons) have additional limits per tier:

| Tier | Custom Actions/Day | Character Limit |
|------|-------------------|-----------------|
| FREE | 10 | 150 chars |
| BASIC | 50 | 300 chars |
| PREMIUM | Unlimited | 500 chars |
| ELITE | Unlimited | 500 chars |

**Configuration:** These values are defined in `config/*.yaml` under `rate_limiting.tiers`:
```yaml
tiers:
  free:
    custom_actions_per_day: 10
    custom_action_char_limit: 150
```

**Access in code:**
```python
from app.config import get_config

config = get_config()
tier_config = config.rate_limiting.tiers['free']
print(tier_config.custom_actions_per_day)  # 10
print(tier_config.custom_action_char_limit)  # 150
```
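
A check against these limits might look like the sketch below. It hard-codes the table values in a hypothetical `CUSTOM_ACTION_LIMITS` dictionary (with `None` standing for "Unlimited") rather than reading the real config, and `validate_custom_action` is an illustrative name, not an existing function in the codebase.

```python
# Hypothetical in-code copy of the limits table; None means "Unlimited".
CUSTOM_ACTION_LIMITS = {
    "free": {"per_day": 10, "char_limit": 150},
    "basic": {"per_day": 50, "char_limit": 300},
    "premium": {"per_day": None, "char_limit": 500},
    "elite": {"per_day": None, "char_limit": 500},
}


def validate_custom_action(tier: str, text: str, used_today: int):
    """Return (allowed, reason) for a free-text action."""
    limits = CUSTOM_ACTION_LIMITS[tier]
    # Reject actions that are too long for the tier.
    if len(text) > limits["char_limit"]:
        return False, f"Action exceeds {limits['char_limit']} characters"
    # Reject once the daily allowance is used up (skipped for unlimited tiers).
    if limits["per_day"] is not None and used_today >= limits["per_day"]:
        return False, f"Daily custom action limit reached ({limits['per_day']})"
    return True, "ok"


print(validate_custom_action("free", "I search the room for traps", 3))
print(validate_custom_action("free", "I attack", 10))
```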

### Basic Usage

```python
from app.services.rate_limiter_service import RateLimiterService, RateLimitExceeded
from app.ai.model_selector import UserTier

limiter = RateLimiterService()

# Check and increment (typical flow)
try:
    limiter.check_rate_limit("user_123", UserTier.PREMIUM)
    # Process AI request...
    limiter.increment_usage("user_123")
except RateLimitExceeded as e:
    print(f"Limit reached: {e.current_usage}/{e.limit}")
    print(f"Resets at: {e.reset_time}")
```

### Query Methods

```python
# Get current usage
current = limiter.get_current_usage("user_123")

# Get remaining turns
remaining = limiter.get_remaining_turns("user_123", UserTier.PREMIUM)
print(f"Remaining: {remaining} turns")

# Get comprehensive info
info = limiter.get_usage_info("user_123", UserTier.PREMIUM)
# {
#     "user_id": "user_123",
#     "user_tier": "premium",
#     "current_usage": 45,
#     "daily_limit": 100,
#     "remaining": 55,
#     "reset_time": "2025-11-22T00:00:00+00:00",
#     "is_limited": False
# }

# Get limit for tier
limit = limiter.get_limit_for_tier(UserTier.ELITE)  # 200
```

### Admin Functions

```python
# Reset user's daily counter (testing/admin)
limiter.reset_usage("user_123")
```

### RateLimitExceeded Exception

```python
class RateLimitExceeded(Exception):
    user_id: str
    user_tier: UserTier
    limit: int
    current_usage: int
    reset_time: datetime
```

Provides all information needed for user-friendly error messages.

---

## Integration Pattern

### In AI Task Jobs

```python
import time

from app.ai.model_selector import UserTier
from app.ai.narrative_generator import NarrativeGenerator
from app.models.ai_usage import TaskType
from app.services.rate_limiter_service import RateLimiterService, RateLimitExceeded
from app.services.usage_tracking_service import UsageTrackingService

# Remaining parameters are elided here and collected in **kwargs.
def process_ai_request(user_id: str, user_tier: UserTier, action: str, session_id: str, **kwargs):
    limiter = RateLimiterService()
    tracker = UsageTrackingService()
    generator = NarrativeGenerator()

    # 1. Check rate limit BEFORE processing
    try:
        limiter.check_rate_limit(user_id, user_tier)
    except RateLimitExceeded as e:
        return {
            "error": "rate_limit_exceeded",
            "message": f"Daily limit reached ({e.limit} turns). Resets at {e.reset_time}",
            "remaining": 0,
            "reset_time": e.reset_time.isoformat()
        }

    # 2. Generate AI response
    start_time = time.time()
    response = generator.generate_story_response(...)
    duration_ms = int((time.time() - start_time) * 1000)

    # 3. Log usage (tokens are estimated in ReplicateClient)
    tracker.log_usage(
        user_id=user_id,
        model=response.model,
        tokens_input=response.tokens_input,    # From prompt length
        tokens_output=response.tokens_output,  # From response length
        task_type=TaskType.STORY_PROGRESSION,
        session_id=session_id,
        request_duration_ms=duration_ms,
        success=True
    )

    # 4. Increment rate limit counter
    limiter.increment_usage(user_id)

    # Other response fields are elided.
    return {"narrative": response.narrative}
```

### API Endpoint Pattern

```python
@bp.route('/sessions/<session_id>/action', methods=['POST'])
@require_auth
def take_action(session_id):
    user = get_current_user()
    limiter = RateLimiterService()

    # Check limit and return remaining info
    try:
        limiter.check_rate_limit(user.id, user.tier)
    except RateLimitExceeded as e:
        return api_response(
            status=429,
            error={
                "code": "RATE_LIMIT_EXCEEDED",
                "message": "Daily turn limit reached",
                "details": {
                    "limit": e.limit,
                    "current": e.current_usage,
                    "reset_time": e.reset_time.isoformat()
                }
            }
        )

    # Queue AI job...
    remaining = limiter.get_remaining_turns(user.id, user.tier)

    return api_response(
        status=202,
        result={
            "job_id": job.id,
            "remaining_turns": remaining
        }
    )
```

---

## Appwrite Collection Schema

**Collection:** `ai_usage_logs`

| Field | Type | Description |
|-------|------|-------------|
| `log_id` | string | Primary key |
| `user_id` | string | User identifier |
| `timestamp` | datetime | Request time (UTC) |
| `model` | string | Model identifier |
| `tokens_input` | integer | Input tokens |
| `tokens_output` | integer | Output tokens |
| `tokens_total` | integer | Total tokens |
| `estimated_cost` | double | Cost in USD |
| `task_type` | string | Task type enum |
| `session_id` | string | Optional session |
| `character_id` | string | Optional character |
| `request_duration_ms` | integer | Duration |
| `success` | boolean | Success status |
| `error_message` | string | Error if failed |

**Indexes:**
- `user_id` + `timestamp` (for daily queries)
- `timestamp` (for admin monitoring)

---

## Cost Management Best Practices

### 1. Pre-request Validation

Always check rate limits before processing:

```python
limiter.check_rate_limit(user_id, user_tier)
```

### 2. Log All Requests

Log both successful and failed requests:

```python
tracker.log_usage(
    ...,
    success=False,
    error_message="Model timeout"
)
```

### 3. Monitor Platform Costs

```python
# Daily monitoring
daily_cost = tracker.get_total_daily_cost(date.today())

# Check the higher threshold first so only one alert fires per check
if daily_cost > 100:
    send_alert("CRITICAL: Daily AI cost exceeded $100")
elif daily_cost > 50:
    send_alert("WARNING: Daily AI cost exceeded $50")
```

### 4. Cost Estimation for UI

Show users estimated costs before actions:

```python
cost_info = UsageTrackingService.get_model_cost_info(model)
estimated = (base_tokens * 1.5 / 1000) * (cost_info['input'] + cost_info['output'])
```

### 5. Tier Upgrade Prompts

When a user is rate limited, prompt an upgrade:

```python
if e.user_tier == UserTier.FREE:
    message = "Upgrade to Basic for 50 turns/day!"
elif e.user_tier == UserTier.BASIC:
    message = "Upgrade to Premium for 100 turns/day!"
```

---

## Target Cost Goals

- **Development:** < $50/day
- **Production target:** < $500/month total
- **Cost per user:** ~$0.10/day (premium tier average)

### Cost Breakdown by Tier (estimated daily)

| Tier | Avg Requests | Avg Cost/Request | Daily Cost |
|------|-------------|-----------------|------------|
| FREE | 10 | $0.00 | $0.00 |
| BASIC | 30 | $0.003 | $0.09 |
| PREMIUM | 60 | $0.01 | $0.60 |
| ELITE | 100 | $0.02 | $2.00 |
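
The Daily Cost column is simply average requests multiplied by average cost per request. The sketch below reproduces that arithmetic and extends it to a rough per-user monthly figure; the 30-day month and the names `TIER_ESTIMATES`, `daily_cost`, and `monthly_cost` are assumptions made for illustration.

```python
# (avg requests per day, avg cost per request in USD), copied from the table
TIER_ESTIMATES = {
    "FREE": (10, 0.00),
    "BASIC": (30, 0.003),
    "PREMIUM": (60, 0.01),
    "ELITE": (100, 0.02),
}


def daily_cost(tier: str) -> float:
    """Estimated daily cost for one user of the given tier."""
    requests, cost_per_request = TIER_ESTIMATES[tier]
    return round(requests * cost_per_request, 2)


def monthly_cost(tier: str, days: int = 30) -> float:
    """Rough monthly cost per user, assuming a 30-day month by default."""
    return round(daily_cost(tier) * days, 2)


for tier in TIER_ESTIMATES:
    print(f"{tier}: ${daily_cost(tier):.2f}/day, ${monthly_cost(tier):.2f}/month")
```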

---

## Testing

### Unit Tests

```python
import pytest

from app.ai.model_selector import UserTier
from app.models.ai_usage import TaskType
from app.services.rate_limiter_service import RateLimiterService, RateLimitExceeded
from app.services.usage_tracking_service import UsageTrackingService

# test_usage_tracking_service.py
def test_log_usage():
    tracker = UsageTrackingService()
    log = tracker.log_usage(
        user_id="test_user",
        model="meta/meta-llama-3-8b-instruct",
        tokens_input=100,
        tokens_output=200,
        task_type=TaskType.STORY_PROGRESSION
    )
    assert log.tokens_total == 300
    assert log.estimated_cost > 0

# test_rate_limiter_service.py
def test_rate_limit_exceeded():
    limiter = RateLimiterService()

    # Use up the free tier limit (20 turns/day)
    for _ in range(20):
        limiter.increment_usage("test_user")

    with pytest.raises(RateLimitExceeded):
        limiter.check_rate_limit("test_user", UserTier.FREE)
```

### Integration Testing

```bash
# Check Redis connection
redis-cli ping

# Check Appwrite connection
python -c "from app.services.usage_tracking_service import UsageTrackingService; UsageTrackingService()"
```

---

## Future Enhancements (Deferred)

- **Task 7.15:** Cost monitoring and alerts (daily job, email alerts)
- Billing integration
- Usage quotas per session
- Real-time cost dashboard
- Cost projections