From 6fe24c39074e3ce98b099c7a8ef03d0302cf0a85 Mon Sep 17 00:00:00 2001 From: Phillip Tarrant Date: Mon, 17 Nov 2025 12:05:11 -0600 Subject: [PATCH] adding Phase4 --- docs/ai/Phase4.md | 1483 ++++++++++++++++++++++++++++++++++++++++++++ docs/ai/ROADMAP.md | 9 +- 2 files changed, 1488 insertions(+), 4 deletions(-) create mode 100644 docs/ai/Phase4.md diff --git a/docs/ai/Phase4.md b/docs/ai/Phase4.md new file mode 100644 index 0000000..af693f3 --- /dev/null +++ b/docs/ai/Phase4.md @@ -0,0 +1,1483 @@ +# Phase 4: Config Creator - CSV Upload & Management + +**Status:** Ready to Start +**Priority:** HIGH - Core usability feature +**Estimated Duration:** 4-5 days +**Dependencies:** Phase 2 Complete (REST API, Authentication) + +--- + +## Overview + +Phase 4 introduces a **Config Creator** feature that allows users to manage scan configurations through the web UI instead of manually creating YAML files. This dramatically improves usability by providing: + +1. **CSV Template Download** - Pre-formatted CSV template for config creation +2. **CSV Upload & Conversion** - Upload filled CSV, automatically convert to YAML +3. **YAML Upload** - Direct YAML upload for advanced users +4. **Config Management UI** - List, view, download, and delete configs +5. **Integration** - Seamlessly works with existing scan triggers and schedules + +### User Workflow + +**Current State (Manual):** +``` +1. User creates YAML file locally (requires YAML knowledge) +2. User uploads to server via Docker volume or container shell +3. User references config in scan trigger/schedule +``` + +**New Workflow (Phase 4):** +``` +1. User clicks "Download CSV Template" in web UI +2. User fills CSV with sites, IPs, expected ports in Excel/Google Sheets +3. User uploads CSV via drag-drop or file picker +4. System validates CSV and converts to YAML +5. User previews generated YAML and confirms +6. 
Config saved and immediately available for scans/schedules +``` + +--- + +## Design Decisions + +Based on project requirements and complexity analysis: + +### 1. CSV Scope: One CSV = One Config ✓ +- Each CSV file creates a single YAML config file +- All rows share the same `scan_title` (first column) +- Simpler to implement and understand +- Users create multiple CSVs for multiple configs + +### 2. Creation Methods: CSV/YAML Upload Only ✓ +- CSV upload with conversion (primary method) +- Direct YAML upload (for advanced users) +- Form-based editor deferred to future phase +- Focused scope for faster delivery + +### 3. Versioning: No Version History ✓ +- Configs overwrite when updated (no `.bak` files) +- Simpler implementation, less storage +- Users can download existing configs before editing + +### 4. Deletion Safety: Block if Used by Schedules ✓ +- Prevent deletion of configs referenced by active schedules +- Show error message listing which schedules use the config +- Safest approach, prevents schedule failures + +### 5. 
Additional Scope Decisions +- **File naming:** Auto-generated from scan title (sanitized) +- **File extensions:** Accept `.yaml`, `.yml` for uploads +- **CSV export:** Not in Phase 4 (future enhancement) +- **Config editing:** Download → Edit locally → Re-upload (no inline editor yet) + +--- + +## CSV Template Specification + +### CSV Format + +**Columns (in order):** +``` +scan_title,site_name,ip_address,ping_expected,tcp_ports,udp_ports,services +``` + +**Example CSV:** +```csv +scan_title,site_name,ip_address,ping_expected,tcp_ports,udp_ports,services +Sneaky Infra Scan,Production Web Servers,10.10.20.4,true,"22,53,80",53,"ssh,domain,http" +Sneaky Infra Scan,Production Web Servers,10.10.20.11,true,"22,111,3128,8006","","ssh,rpcbind,squid" +Sneaky Infra Scan,Database Servers,10.10.30.5,true,"22,3306","","" +Sneaky Infra Scan,Database Servers,10.10.30.6,false,"22,5432","","" +``` + +### Field Specifications + +| Column | Type | Required | Format | Example | Description | +|--------|------|----------|--------|---------|-------------| +| `scan_title` | string | Yes | Any text | `"Sneaky Infra Scan"` | Becomes YAML `title` field. Must be same for all rows. | +| `site_name` | string | Yes | Any text | `"Production Web Servers"` | Logical grouping. Rows with same site_name are grouped together. | +| `ip_address` | string | Yes | IPv4 or IPv6 | `"10.10.20.4"` | Target IP address to scan. | +| `ping_expected` | boolean | No | `true`/`false` (case-insensitive) | `true` | Whether host should respond to ping. Default: `false` | +| `tcp_ports` | list[int] | No | Comma-separated, quoted if multiple | `"22,80,443"` or `22` | Expected TCP ports. Empty = no expected TCP ports. | +| `udp_ports` | list[int] | No | Comma-separated, quoted if multiple | `"53,123"` or `53` | Expected UDP ports. Empty = no expected UDP ports. | +| `services` | list[str] | No | Comma-separated, quoted if multiple | `"ssh,http,https"` | Expected service names (optional). 
| + +### Validation Rules + +**Row-level validation:** +1. `scan_title` must be non-empty and same across all rows +2. `site_name` must be non-empty +3. `ip_address` must be valid IPv4 or IPv6 format +4. `ping_expected` must be `true`, `false`, `TRUE`, `FALSE`, or empty (default false) +5. Port numbers must be integers 1-65535 +6. Port lists must be comma-separated (spaces optional) +7. Duplicate IPs within same config are allowed (different expected values) + +**File-level validation:** +1. CSV must have exactly 7 columns with correct headers +2. Must have at least 1 data row (besides header) +3. Must have at least 1 site defined +4. Must have at least 1 IP per site + +**Filename validation:** +1. Generated filename from scan_title (lowercase, spaces→hyphens, special chars removed) +2. Must not conflict with existing config files +3. Max filename length: 200 characters + +### CSV-to-YAML Conversion Logic + +**Python pseudocode:** +```python +def csv_to_yaml(csv_content: str) -> str: + rows = parse_csv(csv_content) + + # Extract scan title (same for all rows) + scan_title = rows[0]['scan_title'] + + # Group rows by site_name + sites = {} + for row in rows: + site_name = row['site_name'] + if site_name not in sites: + sites[site_name] = [] + + # Parse ports and services + tcp_ports = parse_port_list(row['tcp_ports']) + udp_ports = parse_port_list(row['udp_ports']) + services = parse_string_list(row['services']) + ping = parse_bool(row['ping_expected'], default=False) + + sites[site_name].append({ + 'address': row['ip_address'], + 'expected': { + 'ping': ping, + 'tcp_ports': tcp_ports, + 'udp_ports': udp_ports, + 'services': services # Optional, omit if empty + } + }) + + # Build YAML structure + yaml_data = { + 'title': scan_title, + 'sites': [ + { + 'name': site_name, + 'ips': ips + } + for site_name, ips in sites.items() + ] + } + + return yaml.dump(yaml_data, sort_keys=False) +``` + +**Example Conversion:** + +**Input CSV:** +```csv 
+scan_title,site_name,ip_address,ping_expected,tcp_ports,udp_ports,services +Prod Scan,Web Servers,10.10.20.4,true,"22,80,443",53,ssh +Prod Scan,Web Servers,10.10.20.5,true,"22,80",,"ssh,http" +``` + +**Output YAML:** +```yaml +title: Prod Scan +sites: + - name: Web Servers + ips: + - address: 10.10.20.4 + expected: + ping: true + tcp_ports: [22, 80, 443] + udp_ports: [53] + services: [ssh] + - address: 10.10.20.5 + expected: + ping: true + tcp_ports: [22, 80] + udp_ports: [] + services: [ssh, http] +``` + +--- + +## Architecture + +### Backend Components + +#### 1. API Blueprint: `web/api/configs.py` + +**New endpoints:** + +| Method | Endpoint | Description | Auth | Request Body | Response | +|--------|----------|-------------|------|--------------|----------| +| `GET` | `/api/configs` | List all config files | Required | - | `{ "configs": [{filename, title, path, created_at, used_by_schedules}] }` | +| `GET` | `/api/configs/<filename>` | Get config content (YAML) | Required | - | `{ "filename": "...", "content": "...", "parsed": {...} }` | +| `POST` | `/api/configs/upload-csv` | Upload CSV and convert to YAML | Required | `multipart/form-data` with file | `{ "filename": "...", "preview": "...", "success": true }` | +| `POST` | `/api/configs/upload-yaml` | Upload YAML directly | Required | `multipart/form-data` with file | `{ "filename": "...", "success": true }` | +| `GET` | `/api/configs/template` | Download CSV template | Required | - | CSV file download | +| `DELETE` | `/api/configs/<filename>` | Delete config file | Required | - | `{ "success": true }` or error if used by schedules | + +**Error responses:** +- `400` - Invalid CSV format, validation errors +- `401` - Authentication required +- `404` - Config file not found +- `409` - Config filename conflict +- `413` - Uploaded file exceeds size limit +- `422` - Cannot delete (used by schedules) +- `500` - Server error + +**Blueprint structure:** +```python +# web/api/configs.py +from flask import Blueprint, jsonify, request, send_file +from werkzeug.utils import secure_filename +from web.auth.decorators 
import api_auth_required +from web.services.config_service import ConfigService +import logging + +bp = Blueprint('configs', __name__) +logger = logging.getLogger(__name__) + +@bp.route('', methods=['GET']) +@api_auth_required +def list_configs(): + """List all config files with metadata""" + pass + +@bp.route('/<filename>', methods=['GET']) +@api_auth_required +def get_config(filename: str): + """Get config file content""" + pass + +@bp.route('/upload-csv', methods=['POST']) +@api_auth_required +def upload_csv(): + """Upload CSV and convert to YAML""" + pass + +@bp.route('/upload-yaml', methods=['POST']) +@api_auth_required +def upload_yaml(): + """Upload YAML file directly""" + pass + +@bp.route('/template', methods=['GET']) +@api_auth_required +def download_template(): + """Download CSV template""" + pass + +@bp.route('/<filename>', methods=['DELETE']) +@api_auth_required +def delete_config(filename: str): + """Delete config file""" + pass +``` + +#### 2. Service Layer: `web/services/config_service.py` + +**Class definition:** +```python +from typing import Any, Dict, List, Tuple + +class ConfigService: + """Business logic for config management""" + + def __init__(self, configs_dir: str = '/app/configs'): + self.configs_dir = configs_dir + + def list_configs(self) -> List[Dict[str, Any]]: + """ + List all config files with metadata. + + Returns: + [ + { + "filename": "prod-scan.yaml", + "title": "Prod Scan", + "path": "/app/configs/prod-scan.yaml", + "created_at": "2025-11-15T10:30:00Z", + "size_bytes": 1234, + "used_by_schedules": ["Daily Scan", "Weekly Audit"] + } + ] + """ + pass + + def get_config(self, filename: str) -> Dict[str, Any]: + """ + Get config file content and parsed data. + + Returns: + { + "filename": "prod-scan.yaml", + "content": "title: Prod Scan\n...", + "parsed": {"title": "Prod Scan", "sites": [...]} + } + """ + pass + + def create_from_yaml(self, filename: str, content: str) -> str: + """ + Create config from YAML content. 
+ + Args: + filename: Desired filename (will be sanitized) + content: YAML content string + + Returns: + Final filename + + Raises: + ValueError: If content invalid or filename conflict + """ + pass + + def create_from_csv(self, csv_file, suggested_filename: str = None) -> Tuple[str, str]: + """ + Create config from CSV file. + + Args: + csv_file: File object from request.files + suggested_filename: Optional filename (else auto-generate from title) + + Returns: + (final_filename, yaml_preview) + + Raises: + ValueError: If CSV invalid + """ + pass + + def delete_config(self, filename: str) -> None: + """ + Delete config file if not used by schedules. + + Raises: + FileNotFoundError: If config doesn't exist + ValueError: If config used by active schedules + """ + pass + + def validate_config_content(self, content: Dict) -> Tuple[bool, str]: + """ + Validate parsed YAML config structure. + + Returns: + (is_valid, error_message) + """ + pass + + def get_schedules_using_config(self, filename: str) -> List[str]: + """ + Get list of schedule names using this config. + + Returns: + ["Daily Scan", "Weekly Audit"] + """ + pass + + def generate_filename_from_title(self, title: str) -> str: + """ + Generate safe filename from scan title. + + Example: "Prod Scan 2025" -> "prod-scan-2025.yaml" + """ + pass +``` + +#### 3. CSV Parser: `web/utils/csv_parser.py` + +**Class definition:** +```python +class CSVConfigParser: + """Parse and validate CSV config files""" + + REQUIRED_COLUMNS = [ + 'scan_title', 'site_name', 'ip_address', + 'ping_expected', 'tcp_ports', 'udp_ports', 'services' + ] + + def __init__(self): + pass + + def parse_csv_to_yaml(self, csv_file) -> str: + """ + Convert CSV file to YAML string. + + Args: + csv_file: File object or file path + + Returns: + YAML string + + Raises: + ValueError: If CSV invalid + """ + pass + + def validate_csv_structure(self, csv_file) -> Tuple[bool, List[str]]: + """ + Validate CSV structure and content. 
+ + Returns: + (is_valid, error_messages) + """ + pass + + def _parse_port_list(self, value: str) -> List[int]: + """Parse comma-separated port list""" + pass + + def _parse_string_list(self, value: str) -> List[str]: + """Parse comma-separated string list""" + pass + + def _parse_bool(self, value: str, default: bool = False) -> bool: + """Parse boolean value (true/false/1/0)""" + pass + + def _validate_ip_address(self, ip: str) -> bool: + """Validate IPv4/IPv6 address format""" + pass + + def _validate_port(self, port: int) -> bool: + """Validate port number (1-65535)""" + pass +``` + +#### 4. Template Generator: `web/utils/template_generator.py` + +**Function:** +```python +import csv +import io + +def generate_csv_template() -> str: + """ + Generate CSV template with headers and example rows. + + Returns: + CSV string with headers and 2 example rows + """ + template = [ + ['scan_title', 'site_name', 'ip_address', 'ping_expected', 'tcp_ports', 'udp_ports', 'services'], + ['Example Infrastructure Scan', 'Production Web Servers', '10.10.20.4', 'true', '22,80,443', '53', 'ssh,http,https'], + ['Example Infrastructure Scan', 'Production Web Servers', '10.10.20.5', 'true', '22,3306', '', 'ssh,mysql'], + ] + + output = io.StringIO() + writer = csv.writer(output) + writer.writerows(template) + return output.getvalue() +``` + +### Frontend Components + +#### 1. New Route: Configs Management Page + +**File:** `web/routes/main.py` + +```python +@bp.route('/configs') +@login_required +def configs(): + """Config management page""" + return render_template('configs.html') + +@bp.route('/configs/upload') +@login_required +def upload_config(): + """Config upload page""" + return render_template('config_upload.html') +``` + +#### 2. 
Template: Config List Page + +**File:** `web/templates/configs.html` + +**Features:** +- Table listing all configs with columns: + - Filename + - Title (from YAML) + - Created date + - Size + - Used by schedules (badge count) + - Actions (view, download, delete) +- "Create New Config" button → redirects to upload page +- "Download CSV Template" button +- Delete confirmation modal +- Search/filter box (client-side) + +**Layout:** +```html +{% extends "base.html" %} + +{% block content %} +
+<!-- Configs table with columns: Filename | Title | Created | Size | Used By | Actions -->
+ + + +{% endblock %} +``` + +#### 3. Template: Config Upload Page + +**File:** `web/templates/config_upload.html` + +**Features:** +- Two upload methods (tabs): + - **Tab 1: CSV Upload** (default) + - Drag-drop zone or file picker (`.csv` only) + - Instructions: "Download template, fill with your data, upload here" + - Preview pane showing generated YAML after upload + - "Save Config" button (disabled until valid upload) + - **Tab 2: YAML Upload** (for advanced users) + - Drag-drop zone or file picker (`.yaml`, `.yml` only) + - Direct upload without conversion +- Real-time validation feedback +- Error messages with specific issues +- Success message with link to configs list + +**Layout:** +```html +{% extends "base.html" %} + +{% block content %} +
+<!-- Page title: Create New Configuration -->
+<!-- CSV tab, Step 1: Upload CSV. Hint: "Download template first if you haven't already." Dropzone: "Drag & drop CSV file here or click to browse" -->
+<!-- CSV tab, Step 2: Preview & Save. Placeholder: "Upload a CSV file to see preview" -->
+<!-- YAML tab: Upload YAML Configuration. Hint: "For advanced users: upload a YAML config file directly." Dropzone: "Drag & drop YAML file here or click to browse" -->
+{% endblock %} +``` + +#### 4. JavaScript: Config Manager + +**File:** `web/static/js/config-manager.js` + +**Functions:** +```javascript +class ConfigManager { + constructor() { + this.apiBase = '/api/configs'; + } + + // List configs and populate table + async loadConfigs() { + const response = await fetch(this.apiBase); + const data = await response.json(); + this.renderConfigsTable(data.configs); + } + + // Upload CSV file + async uploadCSV(file) { + const formData = new FormData(); + formData.append('file', file); + + const response = await fetch(`${this.apiBase}/upload-csv`, { + method: 'POST', + body: formData + }); + + if (!response.ok) { + const error = await response.json(); + throw new Error(error.message); + } + + return await response.json(); + } + + // Upload YAML file + async uploadYAML(file, filename) { + // Similar to uploadCSV + } + + // Delete config + async deleteConfig(filename) { + const response = await fetch(`${this.apiBase}/${filename}`, { + method: 'DELETE' + }); + + if (!response.ok) { + const error = await response.json(); + throw new Error(error.message); + } + + return await response.json(); + } + + // Show preview of YAML + showPreview(yamlContent) { + document.getElementById('yaml-content').textContent = yamlContent; + document.getElementById('yaml-preview').style.display = 'block'; + document.getElementById('preview-placeholder').style.display = 'none'; + } + + // Render configs table + renderConfigsTable(configs) { + // Populate table with config data + } + + // Client-side search filter + filterConfigs(searchTerm) { + // Filter table rows by search term + } +} + +// Drag-drop handlers +function setupDropzone(elementId, fileInputId, fileType) { + const dropzone = document.getElementById(elementId); + const fileInput = document.getElementById(fileInputId); + + dropzone.addEventListener('click', () => fileInput.click()); + + dropzone.addEventListener('dragover', (e) => { + e.preventDefault(); + dropzone.classList.add('dragover'); + 
}); + + dropzone.addEventListener('drop', (e) => { + e.preventDefault(); + dropzone.classList.remove('dragover'); + const file = e.dataTransfer.files[0]; + handleFileUpload(file, fileType); + }); + + fileInput.addEventListener('change', (e) => { + const file = e.target.files[0]; + handleFileUpload(file, fileType); + }); +} +``` + +#### 5. CSS Styling + +**File:** `web/static/css/config-manager.css` + +```css +.dropzone { + border: 2px dashed #6c757d; + border-radius: 8px; + padding: 40px; + text-align: center; + cursor: pointer; + transition: all 0.3s ease; + background-color: #f8f9fa; +} + +.dropzone:hover, +.dropzone.dragover { + border-color: #0d6efd; + background-color: #e7f1ff; +} + +.dropzone i { + font-size: 48px; + color: #6c757d; + margin-bottom: 16px; +} + +#yaml-preview pre { + background-color: #f8f9fa; + border: 1px solid #dee2e6; + border-radius: 4px; + padding: 16px; + max-height: 400px; + overflow-y: auto; +} + +.config-actions { + white-space: nowrap; +} + +.config-actions .btn { + margin-right: 4px; +} + +.schedule-badge { + background-color: #0d6efd; + color: white; + padding: 4px 8px; + border-radius: 4px; + font-size: 0.85em; +} +``` + +### Navigation Integration + +**Update:** `web/templates/base.html` + +Add "Configs" link to navigation menu: +```html + +``` + +--- + +## Implementation Tasks + +### Task Breakdown (15 tasks) + +#### Backend (8 tasks) + +1. **Create CSV parser utility** (`web/utils/csv_parser.py`) + - Implement `CSVConfigParser` class + - Parse CSV rows into dict structure + - Validate CSV structure (columns, data types) + - Handle edge cases (empty cells, quotes, commas in values) + - Write unit tests (10+ test cases) + +2. **Create template generator** (`web/utils/template_generator.py`) + - Generate CSV template with headers + example rows + - Write unit tests + +3. 
**Create config service** (`web/services/config_service.py`) + - Implement `ConfigService` class with all methods + - CSV-to-YAML conversion logic + - Filename sanitization and conflict detection + - Schedule dependency checking + - File operations (read, write, delete) + - Write unit tests (15+ test cases) + +4. **Create configs API blueprint** (`web/api/configs.py`) + - Implement 6 API endpoints + - Error handling with proper HTTP status codes + - Request validation + - File upload handling (multipart/form-data) + - File download headers + - Integrate with ConfigService + +5. **Register configs blueprint** (`web/app.py`) + - Import and register blueprint + - Add to API routes + +6. **Add file upload limits** (`web/app.py`) + - Set MAX_CONTENT_LENGTH for uploads (2MB for CSV/YAML) + - Add file size validation + +7. **Update existing services** (if needed) + - `schedule_service.py`: Add config dependency check method + - Ensure config validation called before scan trigger + +8. **Write integration tests** (`tests/test_config_api.py`) + - Test all API endpoints (20+ test cases) + - Test CSV upload → conversion → scan trigger workflow + - Test error cases (invalid CSV, filename conflict, delete protection) + - Test file download + +#### Frontend (5 tasks) + +9. **Create configs list template** (`web/templates/configs.html`) + - Table layout with columns + - Action buttons (view, download, delete) + - Delete confirmation modal + - Search box + - Links to upload page and template download + +10. **Create config upload template** (`web/templates/config_upload.html`) + - Two-tab interface (CSV upload, YAML upload) + - Drag-drop zones for both tabs + - YAML preview pane + - Error message displays + - Success/failure feedback + +11. **Add routes to main blueprint** (`web/routes/main.py`) + - `/configs` - List configs + - `/configs/upload` - Upload page + +12. 
**Create JavaScript module** (`web/static/js/config-manager.js`) + - ConfigManager class + - API fetch wrappers for all endpoints + - Drag-drop handlers + - File upload with FormData + - Table rendering and search filtering + - Preview display + - Error display helpers + +13. **Create CSS styling** (`web/static/css/config-manager.css`) + - Dropzone styling with hover effects + - Preview pane styling + - Table action buttons + - Responsive layout adjustments + +#### Integration & Documentation (2 tasks) + +14. **Update navigation** (`web/templates/base.html`) + - Add "Configs" link to navbar + - Set active state based on current route + +15. **Update documentation** + - Update README.md with Config Creator section + - Update API_REFERENCE.md with new endpoints + - Create this document (Phase4.md) ✓ + +--- + +## Testing Strategy + +### Unit Tests + +**File:** `tests/test_csv_parser.py` + +Test cases (10+): +- Valid CSV with multiple sites and IPs +- Single site, single IP +- Empty optional fields (udp_ports, services) +- Boolean parsing (true/false/TRUE/FALSE) +- Port list parsing (single port, multiple ports, quoted/unquoted) +- Invalid IP addresses +- Invalid port numbers (0, 65536, negative) +- Missing required columns +- Inconsistent scan_title across rows +- Empty CSV (no data rows) +- Malformed CSV (wrong number of columns) + +**File:** `tests/test_config_service.py` + +Test cases (15+): +- List configs (empty directory, multiple configs) +- Get config (valid, non-existent) +- Create from YAML (valid, invalid syntax, duplicate filename) +- Create from CSV (valid, invalid CSV, duplicate filename) +- Delete config (valid, non-existent, used by schedule) +- Validate config content (valid, missing title, missing sites, invalid structure) +- Get schedules using config (none, multiple) +- Generate filename from title (various titles, special characters, long titles) + +### Integration Tests + +**File:** `tests/test_config_api.py` + +Test cases (20+): +- **GET 
/api/configs** + - List configs successfully + - Empty list when no configs exist + - Includes schedule usage counts +- **GET /api/configs/<filename>** + - Get existing config + - 404 for non-existent config + - Returns parsed YAML data +- **POST /api/configs/upload-csv** + - Upload valid CSV → creates YAML + - Upload invalid CSV → 400 error with details + - Upload CSV with duplicate filename → 409 error + - Upload non-CSV file → 400 error + - Upload file too large → 413 error +- **POST /api/configs/upload-yaml** + - Upload valid YAML → creates config + - Upload invalid YAML syntax → 400 error + - Upload YAML missing required fields → 400 error +- **GET /api/configs/template** + - Download CSV template + - Returns text/csv mimetype + - Template has correct headers and examples +- **DELETE /api/configs/<filename>** + - Delete unused config → success + - Delete used-by-schedule config → 422 error with schedule list + - Delete non-existent config → 404 error +- **Authentication** + - All endpoints require auth + - Unauthenticated requests → 401 error + +### End-to-End Tests + +**File:** `tests/test_config_workflow.py` + +Test cases (5): +1. **Complete CSV workflow:** + - Download template + - Upload modified CSV + - Verify YAML created correctly + - Trigger scan with new config + - Verify scan completes successfully + +2. **Config deletion protection:** + - Create config + - Create schedule using config + - Attempt to delete config → fails with error + - Disable schedule + - Delete config → succeeds + +3. **Filename conflict handling:** + - Create config "test-scan.yaml" + - Upload CSV with same title + - Verify error returned + - User changes filename → succeeds + +4. **YAML direct upload:** + - Upload valid YAML + - Config immediately usable + - Trigger scan → works + +5. 
**CSV validation errors:** + - Upload CSV with invalid IP + - Verify clear error message returned + - Fix CSV and re-upload → succeeds + +### Manual Testing Checklist + +**UI/UX:** +- [ ] Drag-drop file upload works +- [ ] File picker works +- [ ] Preview shows correct YAML +- [ ] Error messages are clear and actionable +- [ ] Success messages appear after save +- [ ] Table search/filter works +- [ ] Delete confirmation modal works +- [ ] Navigation links work +- [ ] Responsive layout on mobile + +**File Handling:** +- [ ] CSV template downloads correctly +- [ ] CSV with special characters (commas, quotes) parses correctly +- [ ] Large CSV files upload successfully +- [ ] YAML files with UTF-8 characters work +- [ ] Generated YAML is valid and scanner accepts it + +**Error Cases:** +- [ ] Invalid CSV format shows helpful error +- [ ] Duplicate filename shows conflict error +- [ ] Delete protected config shows which schedules use it +- [ ] Network errors handled gracefully +- [ ] File too large shows size limit error + +--- + +## Security Considerations + +### Input Validation + +1. **Filename sanitization:** + - Use `werkzeug.utils.secure_filename()` + - Remove path traversal attempts (`../`, `./`) + - Limit filename length (200 chars) + - Only allow alphanumeric, hyphens, underscores + +2. **File type validation:** + - Check file extension (`.csv`, `.yaml`, `.yml`) + - Verify MIME type matches extension + - Reject executable or script file extensions + +3. **CSV content validation:** + - Validate all IPs with `ipaddress` module + - Validate port ranges (1-65535) + - Limit CSV row count (max 1000 rows) + - Limit CSV file size (max 2MB) + +4. **YAML parsing security:** + - Always use `yaml.safe_load()` (never `yaml.load()`) + - Prevents arbitrary code execution + - Only load basic data types (dict, list, str, int, bool) + +### Access Control + +1. 
**Authentication required:** + - All API endpoints require `@api_auth_required` + - All web routes require `@login_required` + - Single-user model (no multi-tenant concerns) + +2. **File system access:** + - Restrict all operations to `/app/configs/` directory + - Validate no path traversal in any file operations + - Use `os.path.join()` and `os.path.abspath()` safely + +### File Upload Security + +1. **Size limits:** + - CSV: 2MB max + - YAML: 2MB max + - Configurable in Flask config + +2. **Rate limiting (future consideration):** + - Limit upload frequency per session + - Prevent DoS via repeated large uploads + +3. **Virus scanning (future consideration):** + - For production deployments + - Scan uploaded files with ClamAV + +--- + +## Database Changes + +**No database schema changes required for Phase 4.** + +Configs are stored as files, not in database. However, for future enhancement, consider adding: + +```python +# Optional future enhancement (not Phase 4) +class Config(Base): + __tablename__ = 'configs' + + id = Column(Integer, primary_key=True) + filename = Column(String(255), unique=True, nullable=False, index=True) + title = Column(String(255), nullable=False) + created_at = Column(DateTime, default=datetime.utcnow, nullable=False) + updated_at = Column(DateTime, default=datetime.utcnow, onupdate=datetime.utcnow) + created_by = Column(String(50)) # 'csv', 'yaml', 'manual' + file_hash = Column(String(64)) # SHA256 hash for change detection +``` + +Benefits of database tracking: +- Faster metadata queries (no need to parse YAML for title) +- Change detection and versioning +- Usage statistics (how often config used) +- Search and filtering + +**Decision:** Keep Phase 4 simple (file-based only). Add database tracking in Phase 5+ if needed. 
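
Because Phase 4 stays file-based, most of the metadata a database row would hold can be derived from the files themselves. The sketch below shows one possible approach and is not the project's implementation. `collect_config_metadata` and `has_changed` are hypothetical helper names. The returned keys mirror the `list_configs()` docstring, plus a `file_hash` comparable to the proposed column. The file modification time stands in for `created_at`, which is an assumption, since many filesystems do not record a creation time. The title is omitted because reading it would require parsing the YAML.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, List


def collect_config_metadata(configs_dir: str) -> List[Dict[str, Any]]:
    """Return file-based metadata for every .yaml/.yml file in configs_dir."""
    entries = []
    for path in sorted(Path(configs_dir).iterdir()):
        if not path.is_file() or path.suffix.lower() not in {".yaml", ".yml"}:
            continue
        stat = path.stat()
        # Assumption: modification time is used as a stand-in for created_at.
        modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
        entries.append({
            "filename": path.name,
            "path": str(path),
            "created_at": modified.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "size_bytes": stat.st_size,
            # SHA-256 of the raw bytes, comparable to the proposed file_hash column.
            "file_hash": hashlib.sha256(path.read_bytes()).hexdigest(),
        })
    return entries


def has_changed(entry: Dict[str, Any], previous_hash: str) -> bool:
    """Detect a content change by comparing against a previously stored hash."""
    return entry["file_hash"] != previous_hash
```

Hashing every file on each listing is cheap at the 2MB upload limit. If the number of configs grows large, that cost is one of the reasons the database table above might become worthwhile.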
+ +--- + +## Integration with Existing Features + +### Dashboard Integration + +**Current:** Dashboard has "Run Scan Now" button with config dropdown + +**Change:** Config dropdown populated via `/api/configs` instead of filesystem scan + +**File:** `web/templates/dashboard.html` + +```javascript +// Before (Phase 3) +fetch('/api/configs-list') // Doesn't exist yet + +// After (Phase 4) +fetch('/api/configs') + .then(res => res.json()) + .then(data => { + const select = document.getElementById('config-select'); + data.configs.forEach(config => { + const option = new Option(config.title, config.filename); + select.add(option); + }); + }); +``` + +### Schedule Management Integration + +**Current:** Schedule form has config file input (text field or dropdown) + +**Change:** Config selector uses `/api/configs` to show available configs + +**File:** `web/templates/schedule_form.html` + +```html + + + + + +

+ Don't see your config? Create a new one +

+``` + +### Scan Trigger Integration + +**Current:** Scan trigger validates config file exists + +**Change:** No changes needed, validation already in place via `validators.validate_config_file()` + +**File:** `web/services/scan_service.py` + +```python +def trigger_scan(self, config_file: str, triggered_by: str = 'manual'): + # Existing validation + is_valid, error = validate_config_file(config_file) + if not is_valid: + raise ValueError(f"Invalid config file: {error}") + + # Continue with scan... +``` + +--- + +## Success Criteria + +### Phase 4 Complete When: + +1. **CSV Template Download:** + - ✓ User can download CSV template from UI + - ✓ Template includes headers and example rows + - ✓ Template format matches specification + +2. **CSV Upload & Conversion:** + - ✓ User can upload CSV via drag-drop or file picker + - ✓ CSV validates correctly (structure, data types, IPs, ports) + - ✓ CSV converts to valid YAML + - ✓ Generated YAML matches expected structure + - ✓ Config saved to `/app/configs/` directory + +3. **YAML Upload:** + - ✓ User can upload YAML directly + - ✓ YAML validates before saving + - ✓ Invalid YAML shows clear error message + +4. **Config Management:** + - ✓ User can list all configs with metadata + - ✓ User can view config content (YAML) + - ✓ User can download existing configs + - ✓ User can delete unused configs + - ✓ Deletion blocked if config used by schedules + +5. **Integration:** + - ✓ Generated configs work with scan trigger + - ✓ Configs appear in dashboard scan selector + - ✓ Configs appear in schedule form selector + - ✓ No breaking changes to existing workflows + +6. **Testing:** + - ✓ All unit tests pass (25+ tests) + - ✓ All integration tests pass (20+ tests) + - ✓ Manual testing checklist complete + +7. 
**Documentation:** + - ✓ README.md updated with Config Creator section + - ✓ API_REFERENCE.md updated with new endpoints + - ✓ Phase4.md created (this document) + +--- + +## Timeline Estimate + +**Total Duration:** 4-5 days + +### Day 1: Backend Foundation (CSV Parser & Service) +- Create `csv_parser.py` with CSVConfigParser class (3 hours) +- Write unit tests for CSV parser (2 hours) +- Create `template_generator.py` (1 hour) +- Create `config_service.py` skeleton (2 hours) + +### Day 2: Backend Implementation (Config Service & API) +- Implement all ConfigService methods (4 hours) +- Write unit tests for ConfigService (3 hours) +- Create `configs.py` API blueprint (2 hours) + +### Day 3: Backend Testing & Frontend Start +- Write integration tests for API endpoints (3 hours) +- Create `configs.html` template (2 hours) +- Create `config_upload.html` template (2 hours) +- Add routes to `main.py` (1 hour) + +### Day 4: Frontend JavaScript & Styling +- Create `config-manager.js` with all functionality (4 hours) +- Implement drag-drop handlers (2 hours) +- Create `config-manager.css` styling (1 hour) +- Update navigation in `base.html` (30 min) +- Manual testing and bug fixes (2 hours) + +### Day 5: Integration & Documentation +- End-to-end testing (2 hours) +- Update dashboard/schedule integration (2 hours) +- Update README.md and API_REFERENCE.md (2 hours) +- Final manual testing checklist (2 hours) +- Bug fixes and polish (2 hours) + +**Buffer:** +1 day for unexpected issues or scope additions + +--- + +## Future Enhancements (Post-Phase 4) + +Not in scope for Phase 4, but consider for future phases: + +1. **Config Editor UI (Phase 5+):** + - Web form to create configs without CSV + - Add/remove sites and IPs dynamically + - Port picker with common presets + - Inline YAML editor with syntax highlighting + +2. 
**Config Versioning (Phase 5+):** + - Track config changes over time + - Compare versions (diff view) + - Rollback to previous version + - Store versions in database + +3. **CSV Export (Phase 5+):** + - Export existing YAML configs to CSV + - Edit in Excel and re-upload + - Useful for bulk updates + +4. **Config Templates (Phase 6+):** + - Pre-built templates for common scenarios: + - Web server infrastructure + - Database cluster + - Network devices + - User selects template, fills IPs, done + +5. **Bulk Import (Phase 6+):** + - Upload multiple CSV files at once + - ZIP archive with multiple configs + - Import from external sources (CMDB, spreadsheet) + +6. **Config Validation on Schedule (Phase 6+):** + - Periodic validation of all configs + - Alert if config becomes invalid (file deleted, corrupted) + - Auto-disable schedules with invalid configs + +7. **Config Metadata & Tags (Phase 6+):** + - Add tags/labels to configs (environment, team, etc.) + - Filter/search by tags + - Group related configs + +8. **Config Diff Tool (Phase 6+):** + - Compare two configs side-by-side + - Highlight differences (IPs, ports, sites) + - Useful for environment parity checks + +--- + +## Open Questions + +### Resolved: +- ✓ One CSV = one config or multiple? **One config** +- ✓ Form editor or upload only? **Upload only (form later)** +- ✓ Config versioning? **No (Phase 4), maybe later** +- ✓ Delete protection? **Block if used by schedules** + +### Remaining: +- **Filename handling:** Auto-generate from title or let user specify? + - **Recommendation:** Auto-generate with option to customize (add filename input field) +- **Duplicate IP handling:** Allow or reject duplicate IPs in same config? + - **Recommendation:** Allow (user might scan same IP with different expected ports) +- **Config validation frequency:** Validate on upload only, or re-validate periodically? 
+ - **Recommendation:** Upload only (Phase 4), periodic validation in Phase 6+ +- **CSV encoding:** Support only UTF-8 or other encodings? + - **Recommendation:** UTF-8 only (standard, avoids complexity) + +--- + +## Risk Assessment + +### Low Risk: +- CSV parsing (standard library, well-tested) +- File upload handling (Flask/werkzeug built-in) +- YAML generation (PyYAML library) + +### Medium Risk: +- Complex CSV validation edge cases (commas in values, quotes) + - **Mitigation:** Comprehensive unit tests, use csv.reader() +- Filename conflicts and race conditions + - **Mitigation:** Check existence before write, use atomic operations + +### High Risk: +- Breaking existing scan/schedule workflows + - **Mitigation:** Extensive integration tests, no changes to existing validation +- Security vulnerabilities (path traversal, code injection) + - **Mitigation:** Use secure_filename(), yaml.safe_load(), input validation + +### Contingency: +- If CSV parsing proves too complex: Start with YAML upload only, add CSV in Phase 4.5 +- If schedule deletion check is too slow: Cache schedule-config mappings +- If file-based config management becomes a bottleneck: Migrate to database in Phase 5 + +--- + +## Deployment Notes + +### Requirements: +- No new Python dependencies (csv, os, and io are in the stdlib; yaml is provided by PyYAML, an existing dependency) +- No new system dependencies +- No database migrations + +### Configuration: +Add to `.env` (optional): +```bash +# Config upload limits +MAX_CONFIG_SIZE_MB=2 +MAX_CSV_ROWS=1000 +CONFIGS_DIR=/app/configs +``` + +### Deployment Steps: +1. Pull latest code +2. Restart Flask app: `docker-compose -f docker-compose-web.yml restart web` +3. Verify `/api/configs/template` endpoint works +4. 
Test CSV upload with template + +### Rollback Plan: +- No database changes, so rollback is safe +- Revert code changes and restart +- Configs created during Phase 4 remain valid (files are backward compatible) + +--- + +## References + +### Related Documentation: +- [Phase 2 Complete](PHASE2_COMPLETE.md) - REST API patterns, authentication +- [API Reference](API_REFERENCE.md) - Existing API structure +- [Deployment Guide](DEPLOYMENT.md) - Production deployment + +### External Resources: +- [Python csv module docs](https://docs.python.org/3/library/csv.html) +- [PyYAML documentation](https://pyyaml.org/wiki/PyYAMLDocumentation) +- [Flask file upload guide](https://flask.palletsprojects.com/en/3.0.x/patterns/fileuploads/) +- [Werkzeug secure_filename](https://werkzeug.palletsprojects.com/en/3.0.x/utils/#werkzeug.utils.secure_filename) + +### Code References: +- `src/scanner.py:42-54` - Config loading and validation +- `web/utils/validators.py:14-85` - Existing validation patterns +- `web/services/scan_service.py` - Scan trigger with config validation +- `web/api/scans.py` - API endpoint patterns + +--- + +## Changelog + +| Date | Version | Changes | +|------|---------|---------| +| 2025-11-17 | 1.0 | Initial Phase 4 plan created based on user requirements and design decisions | + +--- + +**Last Updated:** 2025-11-17 +**Next Review:** After Phase 4 implementation complete +**Approval Status:** Ready for Implementation + +--- + +**Phase 4 Goal:** Enable non-technical users to create scan configurations via CSV upload, eliminating the need for manual YAML editing and server file access. 
diff --git a/docs/ai/ROADMAP.md b/docs/ai/ROADMAP.md index ab427bd..f70f14c 100644 --- a/docs/ai/ROADMAP.md +++ b/docs/ai/ROADMAP.md @@ -15,10 +15,11 @@ - Basic UI templates (dashboard, scans, login) - Comprehensive error handling and logging - 100 tests passing (1,825 lines of test code) -- ⏳ **Phase 3: Dashboard & Scheduling** - Next up (Weeks 5-6) -- 📋 **Phase 4: Email & Comparisons** - Planned (Weeks 7-8) -- 📋 **Phase 5: CLI as API Client** - Planned (Week 9) -- 📋 **Phase 6: Advanced Features** - Planned (Weeks 10+) +- ✅ **Phase 3: Dashboard & Scheduling** - Complete (2025-11-14) +- 📋 **Phase 4: Config Creator** - Next up +- 📋 **Phase 5: Email & Comparisons** - Planned (Weeks 7-8) +- 📋 **Phase 6: CLI as API Client** - Planned (Week 9) +- 📋 **Phase 7: Advanced Features** - Planned (Weeks 10+) ## Vision & Goals