# Phase 2 Implementation Plan: Flask Web App Core
**Status:** Step 4 Complete ✅ - Authentication System (Days 7-8)
**Progress:** 8/14 days complete (57%)
**Estimated Duration:** 14 days (2 weeks)
**Dependencies:** Phase 1 Complete ✅
## Progress Summary
- ✅ **Step 1: Database & Service Layer** (Days 1-2) - COMPLETE
- ScanService with full CRUD operations
- Pagination and validation utilities
- Database migration for indexes
- 15 unit tests (100% passing)
- 1,668 lines of code added
- ✅ **Step 2: Scan API Endpoints** (Days 3-4) - COMPLETE
- All 5 scan endpoints implemented
- Comprehensive error handling and logging
- 24 integration tests written
- 300+ lines of code added
- ✅ **Step 3: Background Job Queue** (Days 5-6) - COMPLETE
- APScheduler integration with BackgroundScheduler
- Scan execution in background threads
- SchedulerService with job management
- Database migration for scan timing fields
- 13 unit tests (scheduler, timing, errors)
- 600+ lines of code added
- ✅ **Step 4: Authentication System** (Days 7-8) - COMPLETE
- Flask-Login integration with single-user support
- User model with bcrypt password hashing
- Login, logout, and password setup routes
- @login_required and @api_auth_required decorators
- All API endpoints protected with authentication
- Bootstrap 5 dark theme UI templates
- 30+ authentication tests
- 1,200+ lines of code added
- 📋 **Step 5: Basic UI Templates** (Days 9-10) - NEXT
- 📋 **Step 6: Docker & Deployment** (Day 11) - Pending
- 📋 **Step 7: Error Handling & Logging** (Day 12) - Pending
- 📋 **Step 8: Testing & Documentation** (Days 13-14) - Pending
---
## Table of Contents
1. [Overview](#overview)
2. [Current State Analysis](#current-state-analysis)
3. [Files to Create](#files-to-create)
4. [Files to Modify](#files-to-modify)
5. [Step-by-Step Implementation](#step-by-step-implementation)
6. [Dependencies & Prerequisites](#dependencies--prerequisites)
7. [Testing Approach](#testing-approach)
8. [Potential Challenges & Solutions](#potential-challenges--solutions)
9. [Success Criteria](#success-criteria)
10. [Migration Path](#migration-path)
11. [Estimated Timeline](#estimated-timeline)
12. [Key Design Decisions](#key-design-decisions)
13. [Documentation Deliverables](#documentation-deliverables)
---
## Overview
Phase 2 focuses on creating the core web application functionality by:
1. **REST API for Scans** - Trigger scans and retrieve results via API
2. **Background Job Queue** - Execute scans asynchronously using APScheduler
3. **Authentication** - Simple Flask-Login session management
4. **Scanner Integration** - Save scan results to database automatically
5. **Basic UI** - Login page and dashboard placeholder
### Goals
- ✅ Working REST API for scan management (trigger, list, view, delete, status)
- ✅ Background scan execution with status tracking
- ✅ Basic authentication system with session management
- ✅ Scanner saves all results to database
- ✅ Simple login page and dashboard placeholder
- ✅ Production-ready Docker deployment
---
## Current State Analysis
### What's Already Done (Phase 1)
- ✅ Database schema with 11 models (Scan, ScanSite, ScanIP, ScanPort, ScanService, ScanCertificate, ScanTLSVersion, Schedule, Alert, AlertRule, Setting)
- ✅ SQLAlchemy models with relationships
- ✅ Flask app factory pattern in `web/app.py`
- ✅ Settings management with encryption (SettingsManager, PasswordManager)
- ✅ API blueprint stubs for scans, schedules, alerts, settings
- ✅ Alembic migrations system
- ✅ Docker deployment infrastructure
### Scanner Capabilities (src/scanner.py)
The existing scanner has these key characteristics:
- **scan() method** returns `(report_dict, timestamp)` tuple
- **generate_outputs()** creates JSON, HTML, ZIP files
- **Five-phase scanning:** ping, TCP ports, UDP ports, service detection, HTTP/HTTPS analysis
- **Screenshot capture** with Playwright
- **Results structured** by sites → IPs → ports → services
**Key Methods:**
- `scan()` - Main workflow, returns report and timestamp
- `generate_outputs(report, timestamp)` - Creates JSON/HTML/ZIP files
- `save_report(report, timestamp)` - Saves JSON to disk
- `_run_masscan()` - Port scanning
- `_run_nmap_service_detection()` - Service detection
- `_run_http_analysis()` - HTTP/HTTPS and SSL/TLS analysis
---
## Files to Create
### Backend Services (Core Logic)
#### 1. `web/services/__init__.py`
Services package initialization.
#### 2. `web/services/scan_service.py`
Core service for scan orchestration and database integration.
**Class: ScanService**
Methods:
- `trigger_scan(config_file, triggered_by='manual', schedule_id=None)` → scan_id
- Validate config file exists
- Create Scan record with status='running'
- Queue background job
- Return scan_id
- `get_scan(scan_id)` → scan dict with all related data
- Query Scan with all relationships
- Format for API response
- Include sites, IPs, ports, services, certificates, TLS versions
- `list_scans(page=1, per_page=20, status_filter=None)` → paginated results
- Query with pagination
- Filter by status if provided
- Return total count and items
- `delete_scan(scan_id)` → cleanup DB + files
- Delete database records (cascade handles relationships)
- Delete JSON, HTML, ZIP files
- Delete screenshot directory
- Handle missing files gracefully
- `get_scan_status(scan_id)` → status dict
- Return current scan status
- Include progress percentage if available
- Return error message if failed
- `_save_scan_to_db(report, scan_id, status='completed')` → persist results
- Update Scan record with duration, file paths
- Call _map_report_to_models()
- Commit transaction
- `_map_report_to_models(report, scan_obj)` → create related records
- Map JSON structure to database models
- Create ScanSite, ScanIP, ScanPort, ScanService records
- Create ScanCertificate and ScanTLSVersion records
- Handle nested relationships properly
#### 3. `web/services/scheduler_service.py`
APScheduler integration for scheduled scans.
**Class: SchedulerService**
Methods:
- `init_scheduler(app)` → setup APScheduler
- Initialize BackgroundScheduler
- Load existing schedules from DB
- Start scheduler
- `add_job(schedule_id, config_file, cron_expression)` → create scheduled job
- Parse cron expression
- Add job to scheduler
- Store job_id in database
- `remove_job(schedule_id)` → cancel job
- Remove from scheduler
- Update database
- `trigger_scheduled_scan(schedule_id)` → manual trigger
- Load schedule from DB
- Trigger scan via ScanService
- Update last_run timestamp
- `update_schedule_times(schedule_id, last_run, next_run)` → DB update
- Update Schedule record
- Commit transaction
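As a sketch of the cron-parsing step inside `add_job()`, a five-field cron expression maps naturally onto the keyword arguments APScheduler's `CronTrigger` accepts. The helper name below is hypothetical; the real service would pass the resulting dict as `CronTrigger(**kwargs)`:

```python
# Hypothetical helper: split a standard 5-field cron expression into the
# keyword arguments APScheduler's CronTrigger expects, in the usual
# minute / hour / day / month / day-of-week order.
def parse_cron_expression(expr):
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError(f"Expected 5 cron fields, got {len(fields)}")
    keys = ("minute", "hour", "day", "month", "day_of_week")
    return dict(zip(keys, fields))

# Example: run nightly at 02:30
kwargs = parse_cron_expression("30 2 * * *")
# kwargs == {'minute': '30', 'hour': '2', 'day': '*', 'month': '*',
#            'day_of_week': '*'}
```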
### Authentication System
#### 4. `web/auth/__init__.py`
Authentication package initialization.
#### 5. `web/auth/routes.py`
Login/logout routes blueprint.
**Routes:**
- `GET /login` - Render login form
- `POST /login` - Authenticate and create session
- Verify password via PasswordManager
- Create Flask-Login session
- Redirect to dashboard
- `GET /logout` - Destroy session
- Logout user
- Redirect to login page
#### 6. `web/auth/decorators.py`
Custom authentication decorators.
**Decorators:**
- `@login_required` - Wrapper for Flask-Login's login_required
- `@api_auth_required` - For API endpoints (session-based for Phase 2)
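A framework-free sketch of the `@api_auth_required` pattern. The real decorator would consult Flask-Login's `current_user` and return a Flask JSON response; here a plain dict stands in for the session so the example runs on its own:

```python
from functools import wraps

# Stand-in for Flask's session; the real decorator would check
# flask_login.current_user.is_authenticated instead.
session = {"logged_in": False}

def api_auth_required(func):
    """Return a 401 JSON-style tuple instead of redirecting, so API
    clients get a machine-readable error rather than a login page."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        if not session.get("logged_in"):
            return {"error": "authentication required"}, 401
        return func(*args, **kwargs)
    return wrapper

@api_auth_required
def list_scans():
    return {"scans": []}, 200

print(list_scans())          # ({'error': 'authentication required'}, 401)
session["logged_in"] = True
print(list_scans())          # ({'scans': []}, 200)
```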
#### 7. `web/auth/models.py`
User model for Flask-Login.
**Class: User**
- Simple class representing the single application user
- Load from settings table (app_password)
- Implement Flask-Login required methods (get_id, is_authenticated, etc.)
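A minimal sketch of the single-user model, implementing the four members Flask-Login requires without importing the library (the real class would typically just subclass `flask_login.UserMixin`):

```python
# Single-user model sketch: one fixed identity, loaded by the
# user_loader callback via User.get(). Properties mirror the
# interface Flask-Login expects on user objects.
class User:
    USER_ID = "1"  # single-user app: one fixed identity

    @property
    def is_authenticated(self):
        return True

    @property
    def is_active(self):
        return True

    @property
    def is_anonymous(self):
        return False

    def get_id(self):
        return self.USER_ID

    @staticmethod
    def get(user_id):
        # user_loader callback: only one valid user id exists
        return User() if user_id == User.USER_ID else None
```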
### Frontend Templates
#### 8. `web/templates/base.html`
Base layout with Bootstrap 5 dark theme.
**Features:**
- Navigation bar (Dashboard, Scans, Settings, Logout)
- Flash message display
- Jinja2 blocks: title, content
- Footer with version info
- Bootstrap 5 dark theme CSS
- Mobile responsive
#### 9. `web/templates/login.html`
Login page.
**Features:**
- Username/password form
- Error message display
- Remember me checkbox (optional)
- Redirect to dashboard after login
- Clean, minimal design
#### 10. `web/templates/dashboard.html`
Dashboard placeholder.
**Features:**
- Welcome message
- "Run Scan Now" button (manual trigger)
- Recent scans table (5 most recent)
- Summary stats:
- Total scans
- Last scan time
- Scans running
- Link to full scan history
### Background Jobs
#### 11. `web/jobs/__init__.py`
Jobs package initialization.
#### 12. `web/jobs/scan_job.py`
Background scan execution.
**Function: execute_scan(config_file, scan_id, db_url)**
- Run scanner in subprocess
- Update scan status in DB (running → completed/failed)
- Handle exceptions and log errors
- Store scan results in database
- Generate JSON/HTML/ZIP files
**Implementation:**
```python
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

from web.models import Scan
from src.scanner import SneakyScanner
from web.services.scan_service import ScanService


def execute_scan(config_file, scan_id, db_url):
    """Execute scan in background and save to database."""
    engine = create_engine(db_url)
    Session = sessionmaker(bind=engine)
    session = Session()
    try:
        # Update status to running
        scan = session.query(Scan).get(scan_id)
        scan.status = 'running'
        session.commit()

        # Run scanner
        scanner = SneakyScanner(config_file)
        report, timestamp = scanner.scan()

        # Generate outputs (JSON, HTML, ZIP)
        scanner.generate_outputs(report, timestamp)

        # Save to database
        scan_service = ScanService(session)
        scan_service._save_scan_to_db(report, scan_id, status='completed')
    except Exception:
        # Mark as failed; roll back any half-finished transaction first
        session.rollback()
        scan = session.query(Scan).get(scan_id)
        scan.status = 'failed'
        session.commit()
        raise
    finally:
        session.close()
```
### Utilities
#### 13. `web/utils/pagination.py`
Pagination helper.
**Function: paginate(query, page, per_page)**
- Apply offset and limit to SQLAlchemy query
- Return paginated results with metadata
- Handle edge cases (invalid page, empty results)
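The offset/limit arithmetic can be sketched with a plain list standing in for the query; the real `paginate()` would call `query.offset(...).limit(...)` on a SQLAlchemy query instead of slicing:

```python
import math

def paginate(items, page=1, per_page=20):
    """Pure-Python sketch of the offset/limit math with edge-case
    handling: invalid pages clamp into range, empty input yields one
    empty page."""
    page = max(page, 1)
    total = len(items)
    pages = max(math.ceil(total / per_page), 1)
    page = min(page, pages)  # clamp out-of-range pages to the last page
    start = (page - 1) * per_page
    return {
        "items": items[start:start + per_page],
        "page": page,
        "per_page": per_page,
        "total": total,
        "pages": pages,
    }

result = paginate(list(range(45)), page=3, per_page=20)
# result["items"] == [40, 41, 42, 43, 44], result["pages"] == 3
```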
#### 14. `web/utils/validators.py`
Input validation utilities.
**Functions:**
- `validate_config_file(path)` → check file exists and is valid YAML
- `validate_scan_status(status)` → enum validation (running, completed, failed)
- `validate_page_params(page, per_page)` → sanitize pagination params
### Web Routes
#### 15. `web/routes/__init__.py`
Web routes package initialization.
#### 16. `web/routes/main.py`
Main web routes (dashboard, etc.).
**Routes:**
- `GET /` - Redirect to dashboard
- `GET /dashboard` - Dashboard page (@login_required)
- `GET /scans` - Scan list page (@login_required)
- `GET /scans/<id>` - Scan details page (@login_required)
### Testing
#### 17. `tests/__init__.py`
Test package initialization.
#### 18. `tests/conftest.py`
Pytest fixtures and configuration.
**Fixtures:**
- `app` - Flask app instance with test config
- `client` - Flask test client
- `db` - Test database session
- `sample_scan` - Sample scan data for testing
#### 19. `tests/test_scan_api.py`
API endpoint tests.
**Tests:**
- `test_list_scans` - GET /api/scans
- `test_get_scan` - GET /api/scans/<id>
- `test_trigger_scan` - POST /api/scans
- `test_delete_scan` - DELETE /api/scans/<id>
- `test_scan_status` - GET /api/scans/<id>/status
- `test_pagination` - List scans with page/per_page params
- `test_authentication` - Verify auth required
#### 20. `tests/test_scan_service.py`
ScanService unit tests.
**Tests:**
- `test_trigger_scan` - Scan creation and queuing
- `test_get_scan` - Retrieve scan with relationships
- `test_list_scans` - Pagination and filtering
- `test_delete_scan` - Cleanup files and DB records
- `test_save_scan_to_db` - Database persistence
- `test_map_report_to_models` - JSON to DB mapping
#### 21. `tests/test_authentication.py`
Authentication tests.
**Tests:**
- `test_login_success` - Valid credentials
- `test_login_failure` - Invalid credentials
- `test_logout` - Session destruction
- `test_protected_route` - Requires authentication
---
## Files to Modify
### Backend Updates
#### 1. `web/app.py`
Flask application factory - add authentication and scheduler.
**Changes:**
- Import Flask-Login and configure LoginManager
- Import APScheduler and initialize in app factory
- Register auth blueprint
- Register web routes blueprint
- Add user_loader callback for Flask-Login
- Add before_request handler for authentication
- Initialize scheduler service
**New imports:**
```python
from flask_login import LoginManager
from web.auth.routes import bp as auth_bp
from web.routes.main import bp as main_bp
from web.services.scheduler_service import SchedulerService
```
**New code:**
```python
# Initialize Flask-Login
login_manager = LoginManager()
login_manager.login_view = 'auth.login'
login_manager.init_app(app)


@login_manager.user_loader
def load_user(user_id):
    from web.auth.models import User
    return User.get(user_id)


# Initialize APScheduler
scheduler_service = SchedulerService()
scheduler_service.init_scheduler(app)

# Register blueprints
app.register_blueprint(auth_bp, url_prefix='/auth')
app.register_blueprint(main_bp, url_prefix='/')
```
#### 2. `web/api/scans.py`
Implement all scan endpoints (currently stubs).
**Changes:**
- Import ScanService
- Implement all endpoint logic
- Add authentication decorators
- Add proper error handling
- Add input validation
- Add logging
**Endpoint implementations:**
```python
from web.services.scan_service import ScanService
from web.auth.decorators import api_auth_required
from web.utils.validators import validate_config_file


@bp.route('', methods=['POST'])
@api_auth_required
def trigger_scan():
    """Trigger a new scan."""
    data = request.get_json() or {}
    config_file = data.get('config_file')

    # Validate config file
    if not validate_config_file(config_file):
        return jsonify({'error': 'Invalid config file'}), 400

    # Trigger scan
    scan_service = ScanService(current_app.db_session)
    scan_id = scan_service.trigger_scan(config_file, triggered_by='api')

    return jsonify({
        'scan_id': scan_id,
        'status': 'running',
        'message': 'Scan queued successfully'
    }), 201

# Similar implementations for GET, DELETE, status endpoints...
```
#### 3. `src/scanner.py`
Minor modifications for progress callbacks (optional).
**Changes (optional):**
- Add optional `progress_callback` parameter to scan() method
- Call callback at each phase (ping, TCP, UDP, services, HTTP)
- No breaking changes to existing functionality
**Example:**
```python
def scan(self, progress_callback=None):
    """Run scan with optional progress reporting."""
    if progress_callback:
        progress_callback('phase', 'ping', 0)

    # Run ping scan
    ping_results = self._run_ping_scan(all_ips)

    if progress_callback:
        progress_callback('phase', 'tcp_scan', 20)

    # Continue with other phases...
```
#### 4. `requirements-web.txt`
Add missing dependencies.
**Add:**
```
Flask-APScheduler==1.13.1
```
All other dependencies already present from Phase 1.
#### 5. `docker-compose-web.yml`
Updates for production deployment.
**Changes:**
- Add environment variable for scheduler threads
- Ensure proper volume mounts for data persistence
- Add healthcheck for web service
- Configure restart policy
**Example additions:**
```yaml
environment:
  - SCHEDULER_EXECUTORS=2  # Number of concurrent scan jobs
  - SCHEDULER_JOB_DEFAULTS_MAX_INSTANCES=3
healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:5000/api/settings/health"]
  interval: 30s
  timeout: 10s
  retries: 3
restart: unless-stopped
```
#### 6. Create New Alembic Migration
**File:** `migrations/versions/002_add_scan_indexes.py`
Add database indexes for better query performance:
- Index on scans.timestamp (for sorting recent scans)
- Index on scans.status (for filtering)
- Verify foreign key indexes exist (usually auto-created)
**Migration code:**
```python
def upgrade():
    op.create_index('ix_scans_status', 'scans', ['status'])
    # timestamp index already exists from 001 migration


def downgrade():
    op.drop_index('ix_scans_status', 'scans')
```
#### 7. `docs/ai/ROADMAP.md`
Update with Phase 2 progress.
**Changes:**
- Mark Phase 2 tasks as completed
- Update success metrics
- Add Phase 2 completion date
---
## Step-by-Step Implementation
### Step 1: Database & Service Layer ✅ COMPLETE (Days 1-2)
**Priority: CRITICAL** - Foundation for everything else
**Status:** ✅ Complete - Committed: d7c68a2
**Tasks Completed:**
1. ✅ Created `web/services/` package
2. ✅ Implemented `ScanService` class (545 lines)
- `trigger_scan()` - Create scan records
- `get_scan()` - Retrieve with eager loading
- `list_scans()` - Paginated list with filtering
- `delete_scan()` - Remove DB records and files
- `get_scan_status()` - Poll scan status
- `_save_scan_to_db()` - Persist results
- `_map_report_to_models()` - Complex JSON-to-DB mapping
- ✅ Helper methods for dict conversion
3. ✅ Implemented pagination utility (`web/utils/pagination.py` - 153 lines)
- PaginatedResult class with metadata
- paginate() function for SQLAlchemy queries
- validate_page_params() for input sanitization
4. ✅ Implemented validators (`web/utils/validators.py` - 245 lines)
- validate_config_file() - YAML structure validation
- validate_scan_status() - Enum validation
- validate_scan_id(), validate_port(), validate_ip_address()
- sanitize_filename() - Security
5. ✅ Wrote comprehensive unit tests (374 lines)
- 15 tests covering all ScanService methods
- Test fixtures for DB, reports, config files
- Tests for trigger, get, list, delete, status
- Tests for complex database mapping
- **All tests passing ✓**
6. ✅ Created Alembic migration 002 for scan status index
**Testing Results:**
- ✅ All 15 unit tests passing
- ✅ Database records created correctly with nested relationships
- ✅ Pagination logic validated
- ✅ Foreign key relationships working
- ✅ Complex JSON-to-DB mapping successful
**Files Created:**
- web/services/__init__.py
- web/services/scan_service.py (545 lines)
- web/utils/pagination.py (153 lines)
- web/utils/validators.py (245 lines)
- migrations/versions/002_add_scan_indexes.py
- tests/__init__.py
- tests/conftest.py (142 lines)
- tests/test_scan_service.py (374 lines)
**Total:** 8 files, 1,668 lines added
**Key Challenge:** Mapping complex JSON structure to normalized database schema
**Solution Implemented:** Process in order (sites → IPs → ports → services → certs → TLS), use SQLAlchemy relationships for FK handling, flush() after each level for ID generation
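The ordering above can be sketched without SQLAlchemy: walk the report top-down, handing each child its parent's freshly assigned ID, which is the role `flush()` plays for autoincrement primary keys. The report field names here are illustrative, not the actual schema:

```python
# Pure-Python sketch of the sites -> IPs -> ports traversal: parent IDs
# are assigned before children are created, mirroring flush()-then-insert.
def map_report(report):
    rows, next_id = [], 1
    for site in report.get("sites", []):
        site_id, next_id = next_id, next_id + 1
        rows.append(("site", site_id, None, site["title"]))
        for ip in site.get("ips", []):
            ip_id, next_id = next_id, next_id + 1
            rows.append(("ip", ip_id, site_id, ip["address"]))
            for port in ip.get("ports", []):
                port_id, next_id = next_id, next_id + 1
                rows.append(("port", port_id, ip_id, port["number"]))
    return rows

report = {"sites": [{"title": "example", "ips": [
    {"address": "10.0.0.1", "ports": [{"number": 443}]}]}]}
rows = map_report(report)
# rows == [('site', 1, None, 'example'), ('ip', 2, 1, '10.0.0.1'),
#          ('port', 3, 2, 443)]
```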
### Step 2: Scan API Endpoints ✅ COMPLETE (Days 3-4)
**Priority: HIGH** - Core functionality
**Status:** ✅ Complete - Committed: [pending]
**Tasks Completed:**
1. ✅ Updated `web/api/scans.py`:
- ✅ Implemented `POST /api/scans` (trigger scan)
- ✅ Implemented `GET /api/scans` (list with pagination)
- ✅ Implemented `GET /api/scans/<id>` (get details)
- ✅ Implemented `DELETE /api/scans/<id>` (delete scan + files)
- ✅ Implemented `GET /api/scans/<id>/status` (status polling)
2. ✅ Added comprehensive error handling for all endpoints
3. ✅ Added structured logging with appropriate log levels
4. ✅ Wrote 24 integration tests covering:
- Empty and populated scan lists
- Pagination with multiple pages
- Status filtering
- Individual scan retrieval
- Scan triggering with validation
- Scan deletion
- Status polling
- Complete workflow integration test
- Error handling scenarios (404, 400, 500)
**Testing Results:**
- ✅ All endpoints properly handle errors (400, 404, 500)
- ✅ Pagination logic implemented with metadata
- ✅ Input validation through validators
- ✅ Logging at appropriate levels (info, warning, error, debug)
- ✅ Integration tests written and ready to run in Docker
**Files Modified:**
- web/api/scans.py (262 lines, all endpoints implemented)
**Files Created:**
- tests/test_scan_api.py (301 lines, 24 tests)
- tests/conftest.py (updated with Flask fixtures)
**Total:** 2 files modified, 563 lines added/modified
**Key Implementation Details:**
- All endpoints use ScanService for business logic
- Proper HTTP status codes (200, 201, 400, 404, 500)
- Consistent JSON error format with 'error' and 'message' keys
- SQLAlchemy error handling with graceful degradation
- Logging includes request details and scan IDs for traceability
**Key Challenge Addressed:** Long-running scans causing HTTP timeouts
**Solution Implemented:** POST /api/scans immediately returns scan_id with status 'running', client polls GET /api/scans/<id>/status for updates
### Step 3: Background Job Queue ✅ COMPLETE (Days 5-6)
**Priority: HIGH** - Async scan execution
**Status:** ✅ Complete - Committed: [pending]
**Tasks Completed:**
1. ✅ Created `web/jobs/` package structure
2. ✅ Implemented `web/jobs/scan_job.py` (130 lines):
- `execute_scan()` - Runs scanner in background thread
- Creates isolated database session per thread
- Updates scan status: running → completed/failed
- Handles exceptions with detailed error logging
- Stores error messages in database
- Tracks timing with started_at/completed_at
3. ✅ Created `SchedulerService` class (web/services/scheduler_service.py - 220 lines):
- Initialized APScheduler with BackgroundScheduler
- ThreadPoolExecutor for concurrent jobs (max 3 workers)
- `queue_scan()` - Queue immediate scan execution
- `add_scheduled_scan()` - Placeholder for future scheduled scans
- `remove_scheduled_scan()` - Remove scheduled jobs
- `list_jobs()` and `get_job_status()` - Job monitoring
- Graceful shutdown handling
4. ✅ Integrated APScheduler with Flask app (web/app.py):
- Created `init_scheduler()` function
- Initialized in app factory after extensions
- Stored scheduler in app context (`app.scheduler`)
5. ✅ Updated `ScanService.trigger_scan()` to queue background jobs:
- Added `scheduler` parameter
- Queues job immediately after creating scan record
- Handles job queuing failures gracefully
6. ✅ Added database fields for scan timing (migration 003):
- `started_at` - When scan execution began
- `completed_at` - When scan finished
- `error_message` - Error details for failed scans
7. ✅ Updated `ScanService.get_scan_status()` to include new fields
8. ✅ Updated API endpoint `POST /api/scans` to pass scheduler
**Testing Results:**
- ✅ 13 unit tests for background jobs and scheduler
- ✅ Tests for scheduler initialization
- ✅ Tests for job queuing and status tracking
- ✅ Tests for scan timing fields
- ✅ Tests for error handling and storage
- ✅ Tests for job listing and monitoring
- ✅ Integration test for full workflow (skipped by default - requires scanner)
**Files Created:**
- web/jobs/__init__.py (6 lines)
- web/jobs/scan_job.py (130 lines)
- web/services/scheduler_service.py (220 lines)
- migrations/versions/003_add_scan_timing_fields.py (38 lines)
- tests/test_background_jobs.py (232 lines)
**Files Modified:**
- web/app.py (added init_scheduler function and call)
- web/models.py (added 3 fields to Scan model)
- web/services/scan_service.py (updated trigger_scan and get_scan_status)
- web/api/scans.py (pass scheduler to trigger_scan)
**Total:** 5 files created, 4 files modified, 626 lines added
**Key Implementation Details:**
- BackgroundScheduler runs in separate thread pool
- Each background job gets isolated database session
- Scan status tracked through lifecycle: created → running → completed/failed
- Error messages captured and stored in database
- Graceful shutdown waits for running jobs
- Job IDs follow pattern: `scan_{scan_id}`
- Support for concurrent scans (max 3 default, configurable)
**Key Challenge Addressed:** Scanner requires privileged operations (masscan/nmap)
**Solution Implemented:**
- Scanner runs in subprocess from background thread
- Docker container provides necessary privileges (--privileged, --network host)
- Background thread isolation prevents web app crashes
- Database session per thread avoids SQLite locking issues
### Step 4: Authentication System ✅ COMPLETE (Days 7-8)
**Priority: HIGH** - Security
**Tasks:**
1. Create `web/auth/` package
2. Implement Flask-Login integration:
- Create User class (simple - single user)
- Configure LoginManager in app factory
- Implement user_loader callback
3. Create `auth/routes.py`:
- `GET /login` - render login form
- `POST /login` - authenticate and create session
- `GET /logout` - destroy session
4. Create `auth/decorators.py`:
- `@login_required` for web routes
- `@api_auth_required` for API endpoints
5. Apply decorators to all API endpoints
6. Test authentication flow
**Testing:**
- Test login with correct/incorrect password
- Verify session persistence
- Test logout
- Verify protected routes require auth
- Test API authentication
- Test session timeout
**Key Challenge:** Need both web UI and API authentication
**Solution:** Use Flask-Login sessions for both (Phase 5 adds token auth)
### Step 5: Basic UI Templates ⏱️ Days 9-10
**Priority: MEDIUM** - User interface
**Tasks:**
1. Create `base.html` template:
- Bootstrap 5 dark theme
- Navigation bar (Dashboard, Scans, Logout)
- Flash message display
- Footer with version info
2. Create `login.html`:
- Simple username/password form
- Error message display
- Redirect to dashboard after login
3. Create `dashboard.html`:
- Welcome message
- "Run Scan Now" button (manual trigger)
- Recent scans table (AJAX call to API)
- Summary stats (total scans, last scan time)
4. Create web routes blueprint (`web/routes/main.py`)
5. Style with Bootstrap 5 dark theme
6. Add minimal JavaScript for AJAX calls
**Testing:**
- Verify templates render correctly
- Test responsive layout (desktop, tablet, mobile)
- Test login flow end-to-end
- Verify dashboard displays data from API
- Test manual scan trigger
**Key Feature:** Dark theme matching existing HTML reports
### Step 6: Docker & Deployment ⏱️ Day 11
**Priority: MEDIUM** - Production readiness
**Tasks:**
1. Update Dockerfile if needed (mostly done in Phase 1)
2. Update `docker-compose-web.yml`:
- Verify volume mounts
- Add environment variables for scheduler
- Set proper restart policy
- Add healthcheck
3. Create `.env.example` file with configuration template
4. Test deployment workflow
5. Create deployment documentation
**Testing:**
- Build Docker image
- Run `docker-compose up`
- Test full workflow in Docker
- Verify volume persistence (database, scans)
- Test restart behavior
- Test healthcheck endpoint
**Deliverable:** Production-ready Docker deployment
### Step 7: Error Handling & Logging ⏱️ Day 12
**Priority: MEDIUM** - Robustness
**Tasks:**
1. Add comprehensive error handling:
- API error responses (JSON format)
- Web error pages (404, 500)
- Database transaction rollback on errors
2. Enhance logging:
- Structured logging for API calls
- Scan execution logging
- Error logging with stack traces
3. Add request/response logging middleware
4. Configure log rotation
**Testing:**
- Test error scenarios (invalid input, DB errors, scanner failures)
- Verify error logging
- Check log file rotation
- Test error pages render correctly
**Key Feature:** Helpful error messages for debugging
### Step 8: Testing & Documentation ⏱️ Days 13-14
**Priority: HIGH** - Quality assurance
**Tasks:**
1. Write comprehensive tests:
- Unit tests for services (ScanService, SchedulerService)
- Integration tests for API endpoints
- End-to-end tests for workflows (login → scan → view → delete)
2. Create API documentation:
- Endpoint descriptions with request/response examples
- Authentication instructions
- Error code reference
3. Update README.md:
- Phase 2 features
- Installation instructions
- Configuration guide
- API usage examples
4. Create `PHASE2_COMPLETE.md` in `docs/ai/`
5. Update `docs/ai/ROADMAP.md` with completion status
**Testing:**
- Run full test suite
- Achieve >80% code coverage
- Manual testing of all features
- Performance testing (multiple concurrent scans)
**Deliverables:**
- Comprehensive test suite
- API documentation
- Updated user documentation
- Phase 2 completion summary
---
## Dependencies & Prerequisites
### Python Packages
**Add to `requirements-web.txt`:**
```
Flask-APScheduler==1.13.1
```
**Already Present (from Phase 1):**
- Flask==3.0.0
- Werkzeug==3.0.1
- SQLAlchemy==2.0.23
- alembic==1.13.0
- Flask-Login==0.6.3
- bcrypt==4.1.2
- cryptography==41.0.7
- Flask-CORS==4.0.0
- marshmallow==3.20.1
- marshmallow-sqlalchemy==0.29.0
- APScheduler==3.10.4
- Flask-Mail==0.9.1
- python-dotenv==1.0.0
- pytest==7.4.3
- pytest-flask==1.3.0
### System Requirements
- Python 3.12+ (development)
- Docker and Docker Compose (deployment)
- SQLite3 (database)
- Masscan, Nmap, Playwright (scanner dependencies - in Dockerfile)
### Configuration Files
**New file: `.env.example`**
```bash
# Flask configuration
FLASK_ENV=production
FLASK_DEBUG=false
FLASK_HOST=0.0.0.0
FLASK_PORT=5000
# Database
DATABASE_URL=sqlite:////app/data/sneakyscanner.db
# Security
SECRET_KEY=your-secret-key-here-change-in-production
SNEAKYSCANNER_ENCRYPTION_KEY=your-encryption-key-here
# CORS (comma-separated origins)
CORS_ORIGINS=*
# Logging
LOG_LEVEL=INFO
# Scheduler
SCHEDULER_EXECUTORS=2
SCHEDULER_JOB_DEFAULTS_MAX_INSTANCES=3
```
---
## Testing Approach
### Unit Tests
**Framework:** pytest
**Coverage Target:** >80%
**Test Files:**
- `tests/test_scan_service.py` - ScanService methods
- `tests/test_scheduler_service.py` - SchedulerService methods
- `tests/test_validators.py` - Validation functions
- `tests/test_pagination.py` - Pagination helper
**Approach:**
- Mock database calls with pytest fixtures
- Test each method independently
- Test edge cases and error conditions
- Use `@pytest.mark.parametrize` for multiple scenarios
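As an illustration of the parametrize approach, assuming a `validate_scan_status()` with the signature described earlier (returning a boolean):

```python
import pytest

# Hypothetical stand-in for the Phase 2 validator under test.
def validate_scan_status(status):
    return status in {"running", "completed", "failed"}

# One test function covers many scenarios; each tuple becomes its own
# test case in pytest's output.
@pytest.mark.parametrize("status,expected", [
    ("running", True),
    ("completed", True),
    ("failed", True),
    ("queued", False),   # not a valid Phase 2 status
    ("", False),
])
def test_validate_scan_status(status, expected):
    assert validate_scan_status(status) is expected
```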
### Integration Tests
**Test Files:**
- `tests/test_scan_api.py` - API endpoints with real database
- `tests/test_authentication.py` - Auth flow
**Approach:**
- Use test database (separate from production)
- Test API endpoints end-to-end
- Verify database state after operations
- Test authentication required for protected routes
### End-to-End Tests
**Test Files:**
- `tests/test_workflows.py` - Complete user workflows
**Scenarios:**
- Login → Trigger scan → View results → Delete scan → Logout
- API: Trigger scan → Poll status → Get results → Delete
- Background job execution → Database update → File creation
**Approach:**
- Use pytest-flask for Flask testing
- Test both UI and API flows
- Verify files created/deleted
- Test concurrent operations
### Manual Testing Checklist
Phase 2 complete when all items checked:
- [ ] Login with correct password succeeds
- [ ] Login with incorrect password fails
- [ ] Trigger scan via UI button works
- [ ] Trigger scan via API `POST /api/scans` works
- [ ] Scan runs in background (doesn't block)
- [ ] Scan status updates correctly (running → completed)
- [ ] View scan list with pagination
- [ ] View scan details page
- [ ] Delete scan removes DB records and files
- [ ] Logout destroys session
- [ ] Access protected route while logged out redirects to login
- [ ] Background scan completes successfully
- [ ] Background scan handles errors gracefully (e.g., invalid config)
- [ ] Multiple concurrent scans work
- [ ] Docker Compose deployment works
- [ ] Database persists across container restarts
- [ ] Scan files persist in volume
---
## Potential Challenges & Solutions
### Challenge 1: Scanner Integration with Background Jobs
**Problem:** Scanner runs privileged operations (masscan/nmap require CAP_NET_RAW). How to execute from web app?
**Impact:** High - core functionality
**Solution:**
- Run scanner in subprocess with proper privileges
- Use Docker's `--privileged` mode and `--network host`
- Pass database URL to subprocess for status updates
- Isolate scanner execution errors from web app
- Use APScheduler's BackgroundScheduler (runs in threads)
**Implementation:**
```python
# web/jobs/scan_job.py
def execute_scan(config_file, scan_id, db_url):
    # Create new DB session for this thread
    engine = create_engine(db_url)
    Session = sessionmaker(bind=engine)
    session = Session()
    try:
        # Update status
        scan = session.query(Scan).get(scan_id)
        scan.status = 'running'
        session.commit()

        # Run scanner (has privileged access via Docker)
        scanner = SneakyScanner(config_file)
        report, timestamp = scanner.scan()
        scanner.generate_outputs(report, timestamp)

        # Save to DB
        scan_service = ScanService(session)
        scan_service._save_scan_to_db(report, scan_id)
    except Exception:
        # Roll back the failed transaction before marking the scan failed
        session.rollback()
        scan.status = 'failed'
        session.commit()
        raise
    finally:
        session.close()
```
### Challenge 2: Database Concurrency
**Problem:** Background jobs and web requests accessing SQLite simultaneously can cause locking issues.
**Impact:** Medium - affects reliability
**Solution:**
- Enable SQLite WAL (Write-Ahead Logging) mode for better concurrency
- Use SQLAlchemy scoped sessions (already configured)
- Add proper transaction handling and rollback
- Use connection pool (already configured in app.py)
- Consider timeout for busy database
**Implementation:**
```python
# web/app.py - when creating engine
from sqlalchemy import event

engine = create_engine(
    app.config['SQLALCHEMY_DATABASE_URI'],
    echo=app.debug,
    pool_pre_ping=True,
    pool_recycle=3600,
    connect_args={'timeout': 15}  # 15 second timeout for locks
)

# Enable WAL mode for SQLite
if 'sqlite' in app.config['SQLALCHEMY_DATABASE_URI']:
    @event.listens_for(engine, "connect")
    def set_sqlite_pragma(dbapi_conn, connection_record):
        cursor = dbapi_conn.cursor()
        cursor.execute("PRAGMA journal_mode=WAL")
        cursor.close()
```
### Challenge 3: Scan Status Tracking
**Problem:** Need to show real-time progress for long-running scans (can take 5-10 minutes).
**Impact:** Medium - affects UX
**Solution:**
- Store scan status in DB (running, completed, failed)
- Implement polling endpoint `GET /api/scans/<id>/status`
- Update status at each scan phase (ping, TCP, UDP, services, HTTP)
- Dashboard polls every 5 seconds for running scans
- Consider WebSocket for real-time updates (Phase 3 enhancement)
**Implementation:**
```javascript
// Dashboard JavaScript - poll for status
function pollScanStatus(scanId) {
    const interval = setInterval(async () => {
        const response = await fetch(`/api/scans/${scanId}/status`);
        const data = await response.json();
        if (data.status === 'completed' || data.status === 'failed') {
            clearInterval(interval);
            refreshScanList();
            return; // don't update the progress bar after completion
        }
        updateProgressBar(data.progress);
    }, 5000); // Poll every 5 seconds
}
```
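The polling client above assumes a matching server-side endpoint. A minimal sketch of what `GET /api/scans/<id>/status` could look like; the in-memory `SCANS` dict is a stand-in for the real database lookup against the Scan model:

```python
from flask import Flask, jsonify

app = Flask(__name__)

# Stand-in for the database lookup (real code queries the Scan model)
SCANS = {42: {'status': 'running', 'progress': 60,
              'current_phase': 'service_detection'}}

@app.route('/api/scans/<int:scan_id>/status')
def scan_status(scan_id):
    scan = SCANS.get(scan_id)
    if scan is None:
        return jsonify({'error': f'Scan {scan_id} not found'}), 404
    # Merge the id into the stored status fields
    return jsonify({'scan_id': scan_id, **scan})
```

The real endpoint would read `status`, `progress`, and `current_phase` from the Scan row that the background job updates.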
### Challenge 4: File Cleanup on Scan Deletion
**Problem:** JSON, HTML, ZIP, and screenshot files must be removed when a scan is deleted, and missing files shouldn't cause errors.
**Impact:** Medium - affects storage and cleanup
**Solution:**
- Store all file paths in DB (already in schema)
- Implement cleanup in `ScanService.delete_scan()`
- Use pathlib for safe file operations
- Handle missing files gracefully (log warning, continue)
- Use database cascade deletion for related records
- Delete screenshot directory recursively
**Implementation:**
```python
# web/services/scan_service.py
def delete_scan(self, scan_id):
    scan = self.db.query(Scan).get(scan_id)
    if not scan:
        raise ValueError(f"Scan {scan_id} not found")

    # Delete files (handle missing gracefully)
    for file_path in [scan.json_path, scan.html_path, scan.zip_path]:
        if file_path:
            try:
                Path(file_path).unlink()
            except FileNotFoundError:
                logger.warning(f"File not found: {file_path}")

    # Delete screenshot directory
    if scan.screenshot_dir:
        try:
            shutil.rmtree(scan.screenshot_dir)
        except FileNotFoundError:
            logger.warning(f"Directory not found: {scan.screenshot_dir}")

    # Delete DB record (cascade handles relationships)
    self.db.delete(scan)
    self.db.commit()
```
### Challenge 5: Authentication for API
**Problem:** Need to protect API endpoints but also allow programmatic access. Session cookies don't work well for API clients.
**Impact:** Medium - affects API usability
**Solution:**
- Use Flask-Login sessions for both web UI and API in Phase 2
- Require session cookie for API calls (works with curl -c/-b)
- Add `@api_auth_required` decorator to all endpoints
- Phase 5 will add token authentication for CLI client
- For now, document API usage with session cookies
**Implementation:**
```python
# web/auth/decorators.py
from functools import wraps
from flask_login import current_user
from flask import jsonify

def api_auth_required(f):
    @wraps(f)
    def decorated_function(*args, **kwargs):
        if not current_user.is_authenticated:
            return jsonify({'error': 'Authentication required'}), 401
        return f(*args, **kwargs)
    return decorated_function

# Usage in API
@bp.route('', methods=['POST'])
@api_auth_required
def trigger_scan():
    # Protected endpoint
    pass
```
**API Usage Example:**
```bash
# Login first to get session cookie
curl -X POST http://localhost:5000/auth/login \
-H "Content-Type: application/json" \
-d '{"password":"yourpassword"}' \
-c cookies.txt
# Use cookie for API calls
curl -X POST http://localhost:5000/api/scans \
-H "Content-Type: application/json" \
-d '{"config_file":"/app/configs/example.yaml"}' \
-b cookies.txt
```
### Challenge 6: Scanner Output Mapping to Database
**Problem:** Complex nested JSON structure needs to map to normalized relational database schema. Many relationships to handle.
**Impact:** High - core functionality
**Solution:**
- Create comprehensive `_map_report_to_models()` method
- Process in order: Scan → Sites → IPs → Ports → Services → Certificates → TLS Versions
- Use SQLAlchemy relationships for automatic FK handling
- Batch operations within single transaction
- Add detailed error logging for mapping issues
- Handle missing/optional fields gracefully
**Implementation Strategy:**
```python
def _map_report_to_models(self, report, scan_obj):
    """Map JSON report to database models."""
    # 1. Process sites
    for site_data in report['sites']:
        site = ScanSite(
            scan_id=scan_obj.id,
            site_name=site_data['name']
        )
        self.db.add(site)
        self.db.flush()  # Get site.id

        # 2. Process IPs for this site
        for ip_data in site_data['ips']:
            ip = ScanIP(
                scan_id=scan_obj.id,
                site_id=site.id,
                ip_address=ip_data['address'],
                ping_expected=ip_data['expected']['ping'],
                ping_actual=ip_data['actual']['ping']
            )
            self.db.add(ip)
            self.db.flush()

            # 3. Process ports for this IP
            for port_data in ip_data['actual']['tcp_ports']:
                port = ScanPort(
                    scan_id=scan_obj.id,
                    ip_id=ip.id,
                    port=port_data['port'],
                    protocol='tcp',
                    expected=port_data.get('expected', False),
                    state='open'
                )
                self.db.add(port)
                self.db.flush()

                # 4. Process services for this port
                service_data = port_data.get('service')
                if service_data:
                    service = ScanService(
                        scan_id=scan_obj.id,
                        port_id=port.id,
                        service_name=service_data.get('name'),
                        product=service_data.get('product'),
                        version=service_data.get('version'),
                        # ... more fields
                    )
                    self.db.add(service)
                    self.db.flush()

                    # 5. Process certificate if HTTPS
                    cert_data = service_data.get('http_info', {}).get('certificate')
                    if cert_data:
                        cert = ScanCertificate(
                            scan_id=scan_obj.id,
                            service_id=service.id,
                            # ... cert fields
                        )
                        self.db.add(cert)
                        self.db.flush()

                        # 6. Process TLS versions
                        for tls_data in cert_data.get('tls_versions', []):
                            tls = ScanTLSVersion(
                                scan_id=scan_obj.id,
                                certificate_id=cert.id,
                                # ... tls fields
                            )
                            self.db.add(tls)

    # Commit entire transaction
    self.db.commit()
```
### Challenge 7: Long-Running Scans Timeout
**Problem:** An HTTP request may time out during long scans (5-10 minutes); the browser or client gives up waiting.
**Impact:** High - affects UX
**Solution:**
- Queue scan job immediately
- Return scan_id right away (within seconds)
- Client polls `GET /api/scans/<id>/status` for progress
- Store scan status and progress in database
- Background job runs independently of HTTP connection
- Dashboard auto-refreshes scan list
**Flow:**
1. User clicks "Run Scan" button
2. POST /api/scans → creates Scan record, queues job, returns scan_id
3. JavaScript starts polling status endpoint every 5 seconds
4. Background job runs scanner, updates status in DB
5. When status changes to 'completed', stop polling and refresh scan list
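The queue-and-return flow above can be sketched as a Flask handler. This is a simplified illustration: `queue_scan_job` and the in-memory `SCANS` dict stand in for the real APScheduler hand-off and the database.

```python
from flask import Flask, jsonify, request
import itertools

app = Flask(__name__)
_scan_ids = itertools.count(1)
SCANS = {}  # stand-in for the Scan table

def queue_scan_job(scan_id):
    """Stand-in: real code hands the job to APScheduler here."""
    pass

@app.route('/api/scans', methods=['POST'])
def trigger_scan():
    payload = request.get_json(silent=True) or {}
    config_file = payload.get('config_file')
    if not config_file:
        return jsonify({'error': 'config_file is required'}), 400
    # Create the record and queue the job, then return immediately;
    # the scan itself runs in a background thread
    scan_id = next(_scan_ids)
    SCANS[scan_id] = {'status': 'running', 'config_file': config_file}
    queue_scan_job(scan_id)
    return jsonify({'scan_id': scan_id, 'status': 'running',
                    'message': 'Scan queued successfully'}), 201
```

The handler returns within milliseconds, so no HTTP timeout can occur regardless of scan duration.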
---
## Success Criteria
Phase 2 is **COMPLETE** when all criteria are met:
### API Functionality
- [ ] `POST /api/scans` triggers background scan and returns scan_id
- [ ] `GET /api/scans` lists scans with pagination (page, per_page params)
- [ ] `GET /api/scans/<id>` returns full scan details from database
- [ ] `DELETE /api/scans/<id>` removes scan records and files
- [ ] `GET /api/scans/<id>/status` shows current scan progress
### Database Integration
- [ ] Scan results automatically saved to database after completion
- [ ] All relationships populated correctly (sites, IPs, ports, services, certs, TLS)
- [ ] Database queries work efficiently (indexes in place)
- [ ] Cascade deletion works for related records
### Background Jobs
- [ ] Scans execute in background (don't block HTTP requests)
- [ ] Multiple scans can run concurrently
- [ ] Scan status updates correctly (running → completed/failed)
- [ ] Failed scans marked appropriately with error message
### Authentication
- [ ] Login page renders and accepts password
- [ ] Successful login creates session and redirects to dashboard
- [ ] Invalid password shows error message
- [ ] Logout destroys session
- [ ] Protected routes require authentication
- [ ] API endpoints require authentication
### User Interface
- [ ] Dashboard displays welcome message and stats
- [ ] Dashboard shows recent scans in table
- [ ] "Run Scan Now" button triggers scan
- [ ] Login page has clean design
- [ ] Templates use Bootstrap 5 dark theme
- [ ] Navigation works between pages
### File Management
- [ ] JSON, HTML, ZIP files still generated (backward compatible)
- [ ] Screenshot directory created with images
- [ ] Files referenced correctly in database
- [ ] Delete scan removes all associated files
### Deployment
- [ ] Docker Compose starts web app successfully
- [ ] Database persists across container restarts
- [ ] Scan files persist in mounted volume
- [ ] Healthcheck endpoint responds correctly
- [ ] Logs written to volume
### Testing
- [ ] All unit tests pass
- [ ] All integration tests pass
- [ ] Test coverage >80%
- [ ] Manual testing checklist complete
### Documentation
- [ ] API endpoints documented with examples
- [ ] README.md updated with Phase 2 features
- [ ] PHASE2_COMPLETE.md created
- [ ] ROADMAP.md updated
---
## Migration Path
### From Phase 1 to Phase 2
**No Breaking Changes:**
- Database schema already complete (Phase 1)
- Existing `scanner.py` code unchanged (backward compatible)
- YAML config format unchanged
- JSON/HTML/ZIP output format unchanged
- Docker deployment configuration compatible
**Additions:**
- New API endpoint implementations (replace stubs)
- New service layer (ScanService, SchedulerService)
- New authentication system
- New UI templates
- Background job system
**Migration Steps:**
1. Pull latest code
2. Install new dependency: `pip install Flask-APScheduler`
3. Run new Alembic migration: `alembic upgrade head`
4. Set application password if not set: `python3 init_db.py --password YOUR_PASSWORD`
5. Rebuild Docker image: `docker-compose -f docker-compose-web.yml build`
6. Start services: `docker-compose -f docker-compose-web.yml up -d`
### Backward Compatibility
**CLI Scanner:**
- Continues to work standalone: `python3 src/scanner.py configs/example.yaml`
- Still generates JSON/HTML/ZIP files
- No changes to command-line interface
**Existing Scans:**
- Old scan JSON files not automatically imported to database
- Can be imported manually if needed (not in Phase 2 scope)
- New scans saved to both files and database
**Configuration:**
- Existing YAML configs work without modification
- Settings from Phase 1 preserved
- No config changes required
---
## Estimated Timeline
**Total Duration:** 14 working days (2 weeks)
### Week 1: Backend Foundation
- **Days 1-2:** Database & Service Layer
- ScanService implementation
- Database mapping logic
- Unit tests
- **Days 3-4:** Scan API Endpoints
- All 5 endpoints implemented
- Input validation
- Integration tests
- **Days 5-6:** Background Job Queue
- APScheduler integration
- Job execution logic
- Concurrent scan testing
- **Day 7:** Authentication System (Part 1)
- Flask-Login setup
- User model
- Login/logout routes
### Week 2: Frontend & Polish
- **Day 8:** Authentication System (Part 2)
- Decorators
- Apply to all endpoints
- Authentication tests
- **Days 9-10:** Basic UI Templates
- Base template
- Login page
- Dashboard
- Web routes
- **Day 11:** Docker & Deployment
- Docker Compose updates
- Deployment testing
- Production configuration
- **Day 12:** Error Handling & Logging
- Error pages
- Logging enhancements
- Error scenarios testing
- **Days 13-14:** Testing & Documentation
- Complete test suite
- API documentation
- README updates
- PHASE2_COMPLETE.md
### Critical Path
The critical path (tasks that must complete before others):
1. Service Layer (Days 1-2) → Everything depends on this
2. API Endpoints (Days 3-4) → Required for UI and background jobs
3. Background Jobs (Days 5-6) → Required for async scan execution
4. Authentication (Days 7-8) → Required for security
UI templates and documentation can proceed in parallel after Day 8.
---
## Key Design Decisions
### Decision 1: Background Job Processing
**Choice:** APScheduler BackgroundScheduler
**Alternatives Considered:**
- Celery with Redis/RabbitMQ
- Python threading module
- Subprocess with cron
**Rationale:**
- APScheduler is lightweight (no external dependencies)
- BackgroundScheduler runs in threads (simple, no message broker needed)
- Sufficient for single-user application
- Can handle concurrent scans
- Easy to integrate with Flask
- Meets all Phase 2 requirements
**Trade-offs:**
- ✅ Simple deployment (no Redis needed)
- ✅ Low resource usage
- ✅ Built-in job scheduling
- ❌ Less scalable than Celery (but not needed for single-user)
- ❌ Jobs lost if app crashes (acceptable for this use case)
### Decision 2: Authentication System
**Choice:** Flask-Login with single-user password
**Alternatives Considered:**
- Multi-user with SQLite user table
- JWT tokens
- Basic HTTP auth
- No authentication
**Rationale:**
- Simple and meets requirements (single-user, self-hosted)
- Flask-Login is well-maintained and integrated with Flask
- Session-based auth works for both UI and API
- Sufficient security for local/internal deployment
- Easy to implement and test
**Trade-offs:**
- ✅ Simple implementation
- ✅ Works for UI and API
- ✅ Secure (bcrypt password hashing)
- ❌ Not suitable for multi-user (not a requirement)
- ❌ Session cookies don't work well for CLI clients (Phase 5 adds tokens)
### Decision 3: Database Storage Strategy
**Choice:** Store complete scan results in normalized database schema
**Alternatives Considered:**
- Store only metadata, keep JSON files for details
- Store JSON blob in database
- Hybrid approach (metadata in DB, details in files)
**Rationale:**
- Enables powerful queries (find all scans with cert expiring in 30 days)
- Required for trending and comparison features (Phase 4)
- Normalized schema is more flexible for future features
- Small storage overhead acceptable (scans are small)
- Still generate JSON/HTML/ZIP for backward compatibility
**Trade-offs:**
- ✅ Enables advanced queries
- ✅ Required for Phase 3-4 features
- ✅ Clean separation of concerns
- ❌ More complex mapping logic
- ❌ Slightly larger database size (minimal impact)
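As an illustration of the queries this enables, here is a self-contained sketch of the "certificates expiring in 30 days" example; the table and column names are illustrative, not the real schema:

```python
from datetime import datetime, timedelta
from sqlalchemy import create_engine, Column, Integer, String, DateTime
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Certificate(Base):
    __tablename__ = 'scan_certificates'
    id = Column(Integer, primary_key=True)
    subject = Column(String)
    not_after = Column(DateTime)  # certificate expiry

engine = create_engine('sqlite:///:memory:')
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

session.add_all([
    Certificate(subject='soon.example',
                not_after=datetime.utcnow() + timedelta(days=10)),
    Certificate(subject='later.example',
                not_after=datetime.utcnow() + timedelta(days=90)),
])
session.commit()

# Query impossible with flat JSON files: certs expiring within 30 days
cutoff = datetime.utcnow() + timedelta(days=30)
expiring = session.query(Certificate).filter(Certificate.not_after <= cutoff).all()
```

With a JSON-blob or file-only approach, answering this would mean parsing every report on every query.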
### Decision 4: Scanner Execution Model
**Choice:** Execute scanner in subprocess from web app
**Alternatives Considered:**
- Refactor scanner into library (import directly)
- Separate scanner service (microservice)
- CLI wrapper
**Rationale:**
- Maintains isolation (scanner errors don't crash web app)
- Reuses existing scanner code (no refactoring needed)
- Handles privileged operations via Docker
- Simple to implement
- Backward compatible with CLI usage
**Trade-offs:**
- ✅ Isolation and stability
- ✅ Reuses existing code
- ✅ Backward compatible
- ❌ Slightly more overhead than library import (minimal)
- ❌ Inter-process communication needed (solved with DB)
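A minimal sketch of the subprocess model; the CLI invocation mirrors the backward-compatible command in the Migration Path section, and the timeout value is illustrative:

```python
import subprocess
import sys

def run_scanner_subprocess(config_file: str) -> int:
    """Run the CLI scanner in a child process and report its exit code."""
    # A scanner crash surfaces as a non-zero exit code,
    # not an exception inside the web process
    result = subprocess.run(
        [sys.executable, 'src/scanner.py', config_file],
        capture_output=True, text=True, timeout=3600,
    )
    if result.returncode != 0:
        print(f"scan failed: {result.stderr.strip()}")
    return result.returncode
```

Results flow back through the shared database rather than pipes, which is the inter-process communication trade-off noted above.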
### Decision 5: API Authentication (Phase 2)
**Choice:** Session-based authentication via Flask-Login
**Alternatives Considered:**
- API tokens (Bearer authentication)
- OAuth2
- No authentication for API, only UI
**Rationale:**
- Consistent with web UI authentication
- Simple to implement (already using Flask-Login)
- Works for testing and initial API usage
- Phase 5 will add token authentication for CLI client
- Secure enough for single-user self-hosted deployment
**Trade-offs:**
- ✅ Consistent with UI auth
- ✅ Simple implementation
- ✅ Secure for intended use case
- ❌ Not ideal for programmatic access (Phase 5 improvement)
- ❌ Requires cookie management in API clients
**API Usage Pattern (Phase 2):**
```bash
# Login to get session cookie
curl -X POST http://localhost:5000/auth/login \
-d '{"password":"yourpass"}' \
-c cookies.txt
# Use session cookie for API calls
curl -X GET http://localhost:5000/api/scans \
-b cookies.txt
```
---
## Documentation Deliverables
### 1. API Documentation (`docs/ai/API_REFERENCE.md`)
**Contents:**
- Endpoint reference (all 5 scan endpoints)
- Request/response examples
- Authentication instructions
- Error codes and messages
- Pagination parameters
- Status codes
**Example:**
````markdown
## POST /api/scans
Trigger a new scan.
**Authentication:** Required (session cookie)
**Request Body:**
```json
{
  "config_file": "/app/configs/example.yaml"
}
```
**Response (201 Created):**
```json
{
  "scan_id": 42,
  "status": "running",
  "message": "Scan queued successfully"
}
```
````
### 2. Deployment Guide (`docs/ai/DEPLOYMENT.md`)
**Contents:**
- Docker Compose setup
- Environment variables
- Volume configuration
- Database initialization
- First-time setup
- Troubleshooting
### 3. Developer Guide (`docs/ai/DEVELOPMENT.md`)
**Contents:**
- Project structure
- Architecture overview
- Database schema
- Service layer design
- Adding new features
- Running tests
- Code style guide
### 4. User Guide (`README.md` updates)
**Contents:**
- Phase 2 features
- Web UI usage
- API usage
- Configuration
- Common tasks
- FAQ
### 5. Phase 2 Completion Summary (`docs/ai/PHASE2_COMPLETE.md`)
**Contents:**
- What was delivered
- Success criteria checklist
- Known limitations
- Next steps (Phase 3)
- Migration instructions
- Testing results
---
## Appendix: Example API Calls
### Authentication
```bash
# Login
curl -X POST http://localhost:5000/auth/login \
-H "Content-Type: application/json" \
-d '{"password":"yourpassword"}' \
-c cookies.txt
# Logout
curl -X GET http://localhost:5000/auth/logout \
-b cookies.txt
```
### Scan Management
```bash
# Trigger scan
curl -X POST http://localhost:5000/api/scans \
-H "Content-Type: application/json" \
-d '{"config_file":"/app/configs/example.yaml"}' \
-b cookies.txt
# List scans (with pagination)
curl -X GET "http://localhost:5000/api/scans?page=1&per_page=20" \
-b cookies.txt
# Get scan details
curl -X GET http://localhost:5000/api/scans/42 \
-b cookies.txt
# Get scan status
curl -X GET http://localhost:5000/api/scans/42/status \
-b cookies.txt
# Delete scan
curl -X DELETE http://localhost:5000/api/scans/42 \
-b cookies.txt
```
### Example Responses
**GET /api/scans/42:**
```json
{
  "id": 42,
  "timestamp": "2025-11-14T10:30:00Z",
  "duration": 125.5,
  "status": "completed",
  "title": "Production Network Scan",
  "config_file": "/app/configs/production.yaml",
  "json_path": "/app/output/scan_report_20251114_103000.json",
  "html_path": "/app/output/scan_report_20251114_103000.html",
  "zip_path": "/app/output/scan_report_20251114_103000.zip",
  "screenshot_dir": "/app/output/scan_report_20251114_103000_screenshots",
  "triggered_by": "api",
  "sites": [
    {
      "id": 101,
      "name": "Production DC",
      "ips": [
        {
          "id": 201,
          "address": "192.168.1.10",
          "ping_expected": true,
          "ping_actual": true,
          "ports": [
            {
              "id": 301,
              "port": 443,
              "protocol": "tcp",
              "state": "open",
              "expected": true,
              "services": [
                {
                  "id": 401,
                  "service_name": "https",
                  "product": "nginx",
                  "version": "1.24.0",
                  "http_protocol": "https",
                  "screenshot_path": "scan_report_20251114_103000_screenshots/192_168_1_10_443.png"
                }
              ]
            }
          ]
        }
      ]
    }
  ]
}
```
**GET /api/scans/42/status:**
```json
{
  "scan_id": 42,
  "status": "running",
  "progress": 60,
  "current_phase": "service_detection",
  "started_at": "2025-11-14T10:30:00Z"
}
```
---
**End of Phase 2 Plan**
This plan will be followed during Phase 2 implementation. Upon completion, a new document `PHASE2_COMPLETE.md` will summarize actual implementation, challenges encountered, and lessons learned.