# Usage Tracking & Cost Controls

## Overview

Code of Conquest implements comprehensive usage tracking and cost controls for AI operations. This ensures sustainable costs, fair usage across tiers, and visibility into system usage patterns.

**Key Components:**
- **UsageTrackingService** - Logs all AI usage and calculates costs
- **RateLimiterService** - Enforces tier-based daily limits
- **AIUsageLog** - Data model for usage events

---

## Architecture

```
┌─────────────────────┐
│    AI Task Jobs     │
├─────────────────────┤
│ UsageTrackingService│  ← Logs usage, calculates costs
├─────────────────────┤
│ RateLimiterService  │  ← Enforces limits before processing
├─────────────────────┤
│  Redis + Appwrite   │  ← Storage layer
└─────────────────────┘
```

---

## Usage Tracking Service

**File:** `app/services/usage_tracking_service.py`

### Initialization

```python
from app.services.usage_tracking_service import UsageTrackingService

tracker = UsageTrackingService()
```

**Required Environment Variables:**
```bash
APPWRITE_ENDPOINT=https://cloud.appwrite.io/v1
APPWRITE_PROJECT_ID=your-project-id
APPWRITE_API_KEY=your-api-key
APPWRITE_DATABASE_ID=main
```

### Logging Usage

```python
from app.models.ai_usage import TaskType

# Log a usage event
usage_log = tracker.log_usage(
    user_id="user_123",
    model="anthropic/claude-3.5-sonnet",
    tokens_input=150,
    tokens_output=450,
    task_type=TaskType.STORY_PROGRESSION,
    session_id="sess_789",
    character_id="char_456",
    request_duration_ms=2500,
    success=True
)

print(f"Log ID: {usage_log.log_id}")
print(f"Cost: ${usage_log.estimated_cost:.6f}")
```

### Querying Usage

**Daily Usage:**
```python
from datetime import date

# Get today's usage
usage = tracker.get_daily_usage("user_123", date.today())

print(f"Requests: {usage.total_requests}")
print(f"Tokens: {usage.total_tokens}")
print(f"Input tokens: {usage.total_input_tokens}")
print(f"Output tokens: {usage.total_output_tokens}")
print(f"Cost: ${usage.estimated_cost:.4f}")
print(f"By task: {usage.requests_by_task}")
# {"story_progression": 10, "combat_narration": 3, ...}
```

**Monthly Cost:**
```python
# Get November 2025 cost
monthly = tracker.get_monthly_cost("user_123", 2025, 11)

print(f"Monthly requests: {monthly.total_requests}")
print(f"Monthly tokens: {monthly.total_tokens}")
print(f"Monthly cost: ${monthly.estimated_cost:.2f}")
```

**Admin Monitoring:**
```python
# Get total platform cost for a day
total_cost = tracker.get_total_daily_cost(date.today())
print(f"Platform daily cost: ${total_cost:.2f}")

# Get user request count for rate limiting
count = tracker.get_user_request_count_today("user_123")
```

### Cost Estimation

**Static Methods (no instance needed):**
```python
from app.services.usage_tracking_service import UsageTrackingService

# Estimate cost for specific request
cost = UsageTrackingService.estimate_cost_for_model(
    model="anthropic/claude-3.5-sonnet",
    tokens_input=100,
    tokens_output=400
)
print(f"Estimated: ${cost:.6f}")

# Get model pricing
info = UsageTrackingService.get_model_cost_info("anthropic/claude-3.5-sonnet")
print(f"Input: ${info['input']}/1K tokens")
print(f"Output: ${info['output']}/1K tokens")
```

---

## Model Pricing

Costs per 1,000 tokens (USD):

| Model | Input | Output | Tier |
|-------|-------|--------|------|
| `meta/meta-llama-3-8b-instruct` | $0.0001 | $0.0001 | Free |
| `meta/meta-llama-3-70b-instruct` | $0.0006 | $0.0006 | - |
| `anthropic/claude-3.5-haiku` | $0.001 | $0.005 | Basic |
| `anthropic/claude-3.5-sonnet` | $0.003 | $0.015 | Premium |
| `anthropic/claude-4.5-sonnet` | $0.003 | $0.015 | Elite |
| `anthropic/claude-3-opus` | $0.015 | $0.075 | - |

**Default cost for unknown models:** $0.001 input, $0.005 output per 1K tokens
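
The per-1K rates above turn into a request cost by pricing input and output tokens separately. The following is a minimal sketch of that arithmetic, not the service's actual implementation: `MODEL_COSTS`, `DEFAULT_COST`, and `estimate_cost` are hypothetical names, and the rates are copied from the table.

```python
# Per-1K-token prices (USD), copied from the pricing table above.
MODEL_COSTS = {
    "meta/meta-llama-3-8b-instruct": {"input": 0.0001, "output": 0.0001},
    "meta/meta-llama-3-70b-instruct": {"input": 0.0006, "output": 0.0006},
    "anthropic/claude-3.5-haiku": {"input": 0.001, "output": 0.005},
    "anthropic/claude-3.5-sonnet": {"input": 0.003, "output": 0.015},
    "anthropic/claude-4.5-sonnet": {"input": 0.003, "output": 0.015},
    "anthropic/claude-3-opus": {"input": 0.015, "output": 0.075},
}

# Fallback rates for models that are not in the table.
DEFAULT_COST = {"input": 0.001, "output": 0.005}


def estimate_cost(model: str, tokens_input: int, tokens_output: int) -> float:
    """Hypothetical helper: price a request from per-1K-token rates."""
    rates = MODEL_COSTS.get(model, DEFAULT_COST)
    input_cost = (tokens_input / 1000) * rates["input"]
    output_cost = (tokens_output / 1000) * rates["output"]
    return input_cost + output_cost


# 150 input tokens and 450 output tokens on claude-3.5-sonnet:
# 0.15 * 0.003 + 0.45 * 0.015 = 0.00045 + 0.00675 = 0.0072
cost = estimate_cost("anthropic/claude-3.5-sonnet", 150, 450)
print(f"${cost:.6f}")  # $0.007200
```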

---

## Token Estimation

Since the Replicate API doesn't return exact token counts, tokens are estimated based on text length.

### Estimation Formula

```python
# Approximate 4 characters per token
tokens = len(text) // 4
```

### How Tokens Are Calculated

**Input Tokens:**
- Calculated from the full prompt sent to the AI
- Includes: user prompt + system prompt
- Estimated at: `len(prompt + system_prompt) // 4`

**Output Tokens:**
- Calculated from the AI's response text
- Estimated at: `len(response_text) // 4`
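
Put together, the two estimates can be sketched as below. The function names `estimate_tokens` and `estimate_request_tokens` are hypothetical and only illustrate the formulas listed above. The real logic lives in the Replicate client.

```python
from typing import Tuple


def estimate_tokens(text: str) -> int:
    """Approximate token count at 4 characters per token (rounds down)."""
    return len(text) // 4


def estimate_request_tokens(
    prompt: str, system_prompt: str, response_text: str
) -> Tuple[int, int, int]:
    """Return (input, output, total) token estimates for one request."""
    tokens_input = estimate_tokens(prompt + system_prompt)
    tokens_output = estimate_tokens(response_text)
    return tokens_input, tokens_output, tokens_input + tokens_output


# 300-char prompt + 100-char system prompt, 200-char response
tokens_in, tokens_out, total = estimate_request_tokens(
    "a" * 300, "b" * 100, "c" * 200
)
print(tokens_in, tokens_out, total)  # 100 50 150
```

Because `//` rounds down, very short strings (under 4 characters) count as zero tokens in this sketch.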

### ReplicateResponse Structure

The Replicate client returns both input and output token estimates:

```python
@dataclass
class ReplicateResponse:
    text: str
    tokens_used: int      # Total (input + output)
    tokens_input: int     # Estimated input tokens
    tokens_output: int    # Estimated output tokens
    model: str
    generation_time: float
```

### Example Token Counts

| Content | Characters | Estimated Tokens |
|---------|------------|------------------|
| Short prompt | 400 chars | ~100 tokens |
| Full DM prompt | 4,000 chars | ~1,000 tokens |
| Short response | 200 chars | ~50 tokens |
| Full narrative | 800 chars | ~200 tokens |

### Accuracy Notes

- Estimation is approximate (~75-80% accurate)
- Real tokenization varies by model
- Better to over-estimate for cost budgeting
- Logs use estimates; billing reconciliation may differ

---

## Data Models

**File:** `app/models/ai_usage.py`

### AIUsageLog

```python
@dataclass
class AIUsageLog:
    log_id: str                    # Unique identifier
    user_id: str                   # User who made request
    timestamp: datetime            # When request was made
    model: str                     # Model identifier
    tokens_input: int              # Input/prompt tokens
    tokens_output: int             # Output/response tokens
    tokens_total: int              # Total tokens
    estimated_cost: float          # Cost in USD
    task_type: TaskType            # Type of task
    session_id: Optional[str]      # Game session
    character_id: Optional[str]    # Character
    request_duration_ms: int       # Duration
    success: bool                  # Success status
    error_message: Optional[str]   # Error if failed
```

### TaskType Enum

```python
class TaskType(str, Enum):
    STORY_PROGRESSION = "story_progression"
    COMBAT_NARRATION = "combat_narration"
    QUEST_SELECTION = "quest_selection"
    NPC_DIALOGUE = "npc_dialogue"
    GENERAL = "general"
```

### Summary Objects

```python
@dataclass
class DailyUsageSummary:
    date: date
    user_id: str
    total_requests: int
    total_tokens: int
    total_input_tokens: int
    total_output_tokens: int
    estimated_cost: float
    requests_by_task: Dict[str, int]

@dataclass
class MonthlyUsageSummary:
    year: int
    month: int
    user_id: str
    total_requests: int
    total_tokens: int
    estimated_cost: float
    daily_breakdown: list
```

---

## Rate Limiter Service

**File:** `app/services/rate_limiter_service.py`

### Daily Turn Limits

Limits are loaded from config (`rate_limiting.tiers.{tier}.ai_calls_per_day`):

| Tier | Limit | Cost Level |
|------|-------|------------|
| FREE | 50 turns/day | Zero |
| BASIC | 200 turns/day | Low |
| PREMIUM | 1000 turns/day | Medium |
| ELITE | Unlimited | High |

Counters reset at midnight UTC. A value of `-1` in config means unlimited.
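
The reset rule and the `-1` sentinel can be illustrated with a short sketch. It assumes the service computes the reset time as the next midnight UTC and treats `-1` as "no limit". The helper names `next_reset_time` and `remaining_turns` are hypothetical, and returning `-1` for the remaining count of an unlimited tier is an assumption, not confirmed behavior.

```python
from datetime import datetime, timedelta, timezone

UNLIMITED = -1  # config value meaning "no daily limit"


def next_reset_time(now: datetime) -> datetime:
    """Return the next midnight UTC after `now` (expects an aware datetime)."""
    now_utc = now.astimezone(timezone.utc)
    midnight = now_utc.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight + timedelta(days=1)


def remaining_turns(limit: int, current_usage: int) -> int:
    """Return turns left today, or UNLIMITED when the tier has no cap."""
    if limit == UNLIMITED:
        return UNLIMITED
    return max(0, limit - current_usage)


now = datetime(2025, 11, 26, 15, 30, tzinfo=timezone.utc)
print(next_reset_time(now).isoformat())  # 2025-11-27T00:00:00+00:00
print(remaining_turns(50, 15))           # 35
```

Clamping at zero keeps the remaining count from going negative if a counter overshoots its limit.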

### Usage API Endpoint

Get current usage info via `GET /api/v1/usage`:

```json
{
  "user_id": "user_123",
  "user_tier": "free",
  "current_usage": 15,
  "daily_limit": 50,
  "remaining": 35,
  "reset_time": "2025-11-27T00:00:00+00:00",
  "is_limited": false,
  "is_unlimited": false
}
```

### Custom Action Limits

Free-text actions (beyond preset buttons) have additional limits per tier:

| Tier | Custom Actions/Day | Character Limit |
|------|-------------------|-----------------|
| FREE | 10 | 150 chars |
| BASIC | 50 | 300 chars |
| PREMIUM | Unlimited | 500 chars |
| ELITE | Unlimited | 500 chars |

**Configuration:** These values are defined in `config/*.yaml` under `rate_limiting.tiers`:
```yaml
tiers:
  free:
    custom_actions_per_day: 10
    custom_action_char_limit: 150
```

**Access in code:**
```python
from app.config import get_config

config = get_config()
tier_config = config.rate_limiting.tiers['free']
print(tier_config.custom_actions_per_day)  # 10
print(tier_config.custom_action_char_limit)  # 150
```
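
A validation step built on these two limits might look like the sketch below. It is illustrative only: `CustomActionLimits`, `CUSTOM_ACTION_LIMITS`, and `validate_custom_action` are hypothetical names, the values are copied from the table above rather than loaded from config, and `None` stands in for "Unlimited".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CustomActionLimits:
    per_day: Optional[int]  # None means unlimited
    char_limit: int


# Values copied from the table above; real code would read them from config.
CUSTOM_ACTION_LIMITS = {
    "free": CustomActionLimits(per_day=10, char_limit=150),
    "basic": CustomActionLimits(per_day=50, char_limit=300),
    "premium": CustomActionLimits(per_day=None, char_limit=500),
    "elite": CustomActionLimits(per_day=None, char_limit=500),
}


def validate_custom_action(tier: str, text: str, used_today: int) -> Optional[str]:
    """Return an error message, or None when the action is allowed."""
    limits = CUSTOM_ACTION_LIMITS[tier]
    if len(text) > limits.char_limit:
        return f"Action too long: {len(text)}/{limits.char_limit} characters"
    if limits.per_day is not None and used_today >= limits.per_day:
        return f"Daily custom action limit reached ({limits.per_day})"
    return None


print(validate_custom_action("free", "I search the room", used_today=3))   # None
print(validate_custom_action("free", "I search the room", used_today=10))
```

Checking length before the daily count means an over-long action is rejected without consuming any quota.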

### Basic Usage

```python
from app.services.rate_limiter_service import RateLimiterService, RateLimitExceeded
from app.ai.model_selector import UserTier

limiter = RateLimiterService()

# Check and increment (typical flow)
try:
    limiter.check_rate_limit("user_123", UserTier.PREMIUM)
    # Process AI request...
    limiter.increment_usage("user_123")
except RateLimitExceeded as e:
    print(f"Limit reached: {e.current_usage}/{e.limit}")
    print(f"Resets at: {e.reset_time}")
```

### Query Methods

```python
# Get current usage
current = limiter.get_current_usage("user_123")

# Get remaining turns
remaining = limiter.get_remaining_turns("user_123", UserTier.PREMIUM)
print(f"Remaining: {remaining} turns")

# Get comprehensive info
info = limiter.get_usage_info("user_123", UserTier.PREMIUM)
# {
#     "user_id": "user_123",
#     "user_tier": "premium",
#     "current_usage": 45,
#     "daily_limit": 1000,
#     "remaining": 955,
#     "reset_time": "2025-11-22T00:00:00+00:00",
#     "is_limited": False,
#     "is_unlimited": False
# }

# Get limit for tier (-1 means unlimited)
limit = limiter.get_limit_for_tier(UserTier.ELITE)  # -1 (unlimited)
```

### Admin Functions

```python
# Reset user's daily counter (testing/admin)
limiter.reset_usage("user_123")
```

### RateLimitExceeded Exception

```python
class RateLimitExceeded(Exception):
    user_id: str
    user_tier: UserTier
    limit: int
    current_usage: int
    reset_time: datetime
```

Provides all information needed for user-friendly error messages.
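
As an illustration, the exception's `limit`, `current_usage`, and `reset_time` fields are enough to build a message with a countdown. The helper below is a hypothetical sketch. It takes plain values rather than the exception itself so it runs standalone.

```python
from datetime import datetime, timezone


def format_limit_message(
    limit: int, current_usage: int, reset_time: datetime, now: datetime
) -> str:
    """Build a user-facing message from rate limit details (hypothetical helper)."""
    seconds_left = max(0, int((reset_time - now).total_seconds()))
    hours, remainder = divmod(seconds_left, 3600)
    minutes = remainder // 60
    return (
        f"Daily limit reached ({current_usage}/{limit} turns). "
        f"Resets in {hours}h {minutes}m."
    )


reset = datetime(2025, 11, 27, 0, 0, tzinfo=timezone.utc)
now = datetime(2025, 11, 26, 21, 15, tzinfo=timezone.utc)
message = format_limit_message(50, 50, reset, now)
print(message)  # Daily limit reached (50/50 turns). Resets in 2h 45m.
```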

---

## Integration Pattern

### In AI Task Jobs

```python
import time

from app.ai.model_selector import UserTier
from app.ai.narrative_generator import NarrativeGenerator
from app.models.ai_usage import TaskType
from app.services.rate_limiter_service import RateLimiterService, RateLimitExceeded
from app.services.usage_tracking_service import UsageTrackingService

# Other parameters are elided; session_id is listed because it is logged below
def process_ai_request(user_id: str, user_tier: UserTier, action: str, session_id: str):
    limiter = RateLimiterService()
    tracker = UsageTrackingService()
    generator = NarrativeGenerator()

    # 1. Check rate limit BEFORE processing
    try:
        limiter.check_rate_limit(user_id, user_tier)
    except RateLimitExceeded as e:
        return {
            "error": "rate_limit_exceeded",
            "message": f"Daily limit reached ({e.limit} turns). Resets at {e.reset_time}",
            "remaining": 0,
            "reset_time": e.reset_time.isoformat()
        }

    # 2. Generate AI response
    start_time = time.time()
    response = generator.generate_story_response(...)
    duration_ms = int((time.time() - start_time) * 1000)

    # 3. Log usage (tokens are estimated in ReplicateClient)
    tracker.log_usage(
        user_id=user_id,
        model=response.model,
        tokens_input=response.tokens_input,    # From prompt length
        tokens_output=response.tokens_output,  # From response length
        task_type=TaskType.STORY_PROGRESSION,
        session_id=session_id,
        request_duration_ms=duration_ms,
        success=True
    )

    # 4. Increment rate limit counter
    limiter.increment_usage(user_id)

    return {"narrative": response.narrative}  # plus any other response fields
```

### API Endpoint Pattern

```python
@bp.route('/sessions/<session_id>/action', methods=['POST'])
@require_auth
def take_action(session_id):
    user = get_current_user()
    limiter = RateLimiterService()

    # Check limit and return remaining info
    try:
        limiter.check_rate_limit(user.id, user.tier)
    except RateLimitExceeded as e:
        return api_response(
            status=429,
            error={
                "code": "RATE_LIMIT_EXCEEDED",
                "message": "Daily turn limit reached",
                "details": {
                    "limit": e.limit,
                    "current": e.current_usage,
                    "reset_time": e.reset_time.isoformat()
                }
            }
        )

    # Queue AI job...
    remaining = limiter.get_remaining_turns(user.id, user.tier)

    return api_response(
        status=202,
        result={
            "job_id": job.id,
            "remaining_turns": remaining
        }
    )
```

---

## Appwrite Collection Schema

**Collection:** `ai_usage_logs`

| Field | Type | Description |
|-------|------|-------------|
| `log_id` | string | Primary key |
| `user_id` | string | User identifier |
| `timestamp` | datetime | Request time (UTC) |
| `model` | string | Model identifier |
| `tokens_input` | integer | Input tokens |
| `tokens_output` | integer | Output tokens |
| `tokens_total` | integer | Total tokens |
| `estimated_cost` | double | Cost in USD |
| `task_type` | string | Task type enum |
| `session_id` | string | Optional session |
| `character_id` | string | Optional character |
| `request_duration_ms` | integer | Duration |
| `success` | boolean | Success status |
| `error_message` | string | Error if failed |

**Indexes:**
- `user_id` + `timestamp` (for daily queries)
- `timestamp` (for admin monitoring)
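
A daily query against these indexes needs a UTC time window for the requested date. The sketch below shows one way to compute that window. `utc_day_bounds` is a hypothetical helper, and how the bounds are passed to Appwrite is left out because it depends on the SDK version in use.

```python
from datetime import date, datetime, time, timedelta, timezone
from typing import Tuple


def utc_day_bounds(day: date) -> Tuple[datetime, datetime]:
    """Return [start, end) of the given calendar day in UTC."""
    start = datetime.combine(day, time.min, tzinfo=timezone.utc)
    return start, start + timedelta(days=1)


start, end = utc_day_bounds(date(2025, 11, 26))
print(start.isoformat())  # 2025-11-26T00:00:00+00:00
print(end.isoformat())    # 2025-11-27T00:00:00+00:00
```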

---

## Cost Management Best Practices

### 1. Pre-request Validation

Always check rate limits before processing:

```python
limiter.check_rate_limit(user_id, user_tier)
```

### 2. Log All Requests

Log both successful and failed requests:

```python
tracker.log_usage(
    ...,
    success=False,
    error_message="Model timeout"
)
```

### 3. Monitor Platform Costs

```python
from datetime import date

# Daily monitoring
daily_cost = tracker.get_total_daily_cost(date.today())

# Check the higher threshold first so that only one alert fires
if daily_cost > 100:
    send_alert("CRITICAL: Daily AI cost exceeded $100")
elif daily_cost > 50:
    send_alert("WARNING: Daily AI cost exceeded $50")
```

### 4. Cost Estimation for UI

Show users estimated costs before actions:

```python
cost_info = UsageTrackingService.get_model_cost_info(model)
estimated = (base_tokens * 1.5 / 1000) * (cost_info['input'] + cost_info['output'])
```

### 5. Tier Upgrade Prompts

When rate limited, prompt upgrades:

```python
if e.user_tier == UserTier.FREE:
    message = "Upgrade to Basic for 200 turns/day!"
elif e.user_tier == UserTier.BASIC:
    message = "Upgrade to Premium for 1000 turns/day!"
elif e.user_tier == UserTier.PREMIUM:
    message = "Upgrade to Elite for unlimited turns!"
```

---

## Target Cost Goals

- **Development:** < $50/day
- **Production target:** < $500/month total
- **Cost per user:** ~$0.10/day (premium tier average)

### Cost Breakdown by Tier (estimated daily)

| Tier | Avg Requests | Avg Cost/Request | Daily Cost |
|------|-------------|-----------------|------------|
| FREE | 10 | $0.00 | $0.00 |
| BASIC | 30 | $0.003 | $0.09 |
| PREMIUM | 60 | $0.01 | $0.60 |
| ELITE | 100 | $0.02 | $2.00 |
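
Each daily cost in the table is the average request count multiplied by the average cost per request. The sketch below reproduces that arithmetic and extends it to a rough monthly projection. The user counts passed to `projected_monthly_cost` are made-up inputs for illustration, and both function names are hypothetical.

```python
from typing import Dict

# (avg requests per day, avg cost per request in USD), copied from the table above.
TIER_ESTIMATES = {
    "FREE": (10, 0.00),
    "BASIC": (30, 0.003),
    "PREMIUM": (60, 0.01),
    "ELITE": (100, 0.02),
}


def daily_cost_per_user(tier: str) -> float:
    """Estimated daily cost of one user on the given tier."""
    avg_requests, avg_cost = TIER_ESTIMATES[tier]
    return avg_requests * avg_cost


def projected_monthly_cost(users_by_tier: Dict[str, int], days: int = 30) -> float:
    """Rough monthly projection for a hypothetical mix of users."""
    daily_total = sum(
        count * daily_cost_per_user(tier) for tier, count in users_by_tier.items()
    )
    return daily_total * days


for tier in TIER_ESTIMATES:
    print(f"{tier}: ${daily_cost_per_user(tier):.2f}/day")

# Illustrative mix: 10 Basic users and 5 Premium users
projected = projected_monthly_cost({"BASIC": 10, "PREMIUM": 5})
print(f"Projected: ${projected:.2f}/month")  # Projected: $117.00/month
```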

---

## Testing

### Unit Tests

```python
# test_usage_tracking_service.py
from app.models.ai_usage import TaskType
from app.services.usage_tracking_service import UsageTrackingService

def test_log_usage():
    tracker = UsageTrackingService()
    log = tracker.log_usage(
        user_id="test_user",
        model="meta/meta-llama-3-8b-instruct",
        tokens_input=100,
        tokens_output=200,
        task_type=TaskType.STORY_PROGRESSION
    )
    assert log.tokens_total == 300
    assert log.estimated_cost > 0

# test_rate_limiter_service.py
import pytest

from app.ai.model_selector import UserTier
from app.services.rate_limiter_service import RateLimiterService, RateLimitExceeded

def test_rate_limit_exceeded():
    limiter = RateLimiterService()

    # Exceed free tier limit (50 from config)
    for _ in range(50):
        limiter.increment_usage("test_user")

    with pytest.raises(RateLimitExceeded):
        limiter.check_rate_limit("test_user", UserTier.FREE)
```

### Integration Testing

```bash
# Check Redis connection
redis-cli ping

# Check Appwrite connection
python -c "from app.services.usage_tracking_service import UsageTrackingService; UsageTrackingService()"
```

---

## Future Enhancements (Deferred)

- **Task 7.15:** Cost monitoring and alerts (daily job, email alerts)
- Billing integration
- Usage quotas per session
- Real-time cost dashboard
- Cost projections