SneakyScan/docs/ai/Phase4.md
2025-11-17 12:05:11 -06:00

# Phase 4: Config Creator - CSV Upload & Management
- **Status:** Ready to Start
- **Priority:** HIGH - Core usability feature
- **Estimated Duration:** 4-5 days
- **Dependencies:** Phase 2 Complete (REST API, Authentication)

---
## Overview
Phase 4 introduces a **Config Creator** feature that lets users manage scan configurations through the web UI instead of hand-writing YAML files and copying them onto the server. It provides:
1. **CSV Template Download** - Pre-formatted CSV template for config creation
2. **CSV Upload & Conversion** - Upload filled CSV, automatically convert to YAML
3. **YAML Upload** - Direct YAML upload for advanced users
4. **Config Management UI** - List, view, download, and delete configs
5. **Integration** - Seamlessly works with existing scan triggers and schedules
### User Workflow
**Current State (Manual):**
```
1. User creates YAML file locally (requires YAML knowledge)
2. User uploads to server via Docker volume or container shell
3. User references config in scan trigger/schedule
```
**New Workflow (Phase 4):**
```
1. User clicks "Download CSV Template" in web UI
2. User fills CSV with sites, IPs, expected ports in Excel/Google Sheets
3. User uploads CSV via drag-drop or file picker
4. System validates CSV and converts to YAML
5. User previews generated YAML and confirms
6. Config saved and immediately available for scans/schedules
```
---
## Design Decisions
Based on project requirements and complexity analysis:
### 1. CSV Scope: One CSV = One Config ✓
- Each CSV file creates a single YAML config file
- All rows share the same `scan_title` (first column)
- Simpler to implement and understand
- Users create multiple CSVs for multiple configs
### 2. Creation Methods: CSV/YAML Upload Only ✓
- CSV upload with conversion (primary method)
- Direct YAML upload (for advanced users)
- Form-based editor deferred to future phase
- Focused scope for faster delivery
### 3. Versioning: No Version History ✓
- Configs overwrite when updated (no `.bak` files)
- Simpler implementation, less storage
- Users can download existing configs before editing
### 4. Deletion Safety: Block if Used by Schedules ✓
- Prevent deletion of configs referenced by active schedules
- Show error message listing which schedules use the config
- Safest approach, prevents schedule failures
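
The blocking rule can be sketched as a pure function. The schedule record shape (`name`, `config_file`, and `enabled` keys) and the function name are assumptions for illustration, not the project's actual schedule model:

```python
import os
from typing import Dict, List


def schedules_blocking_delete(filename: str, schedules: List[Dict]) -> List[str]:
    """Return names of enabled schedules that reference the given config file."""
    blocking = []
    for schedule in schedules:
        # Compare by basename so '/app/configs/prod.yaml' matches 'prod.yaml'.
        referenced = os.path.basename(schedule.get("config_file", ""))
        if schedule.get("enabled", True) and referenced == filename:
            blocking.append(schedule["name"])
    return blocking
```

An empty result would mean the delete may proceed; a non-empty one supplies the schedule names for the error message.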
### 5. Additional Scope Decisions
- **File naming:** Auto-generated from scan title (sanitized)
- **File extensions:** Accept `.yaml`, `.yml` for uploads
- **CSV export:** Not in Phase 4 (future enhancement)
- **Config editing:** Download → Edit locally → Re-upload (no inline editor yet)
---
## CSV Template Specification
### CSV Format
**Columns (in order):**
```
scan_title,site_name,ip_address,ping_expected,tcp_ports,udp_ports,services
```
**Example CSV:**
```csv
scan_title,site_name,ip_address,ping_expected,tcp_ports,udp_ports,services
Sneaky Infra Scan,Production Web Servers,10.10.20.4,true,"22,53,80",53,"ssh,domain,http"
Sneaky Infra Scan,Production Web Servers,10.10.20.11,true,"22,111,3128,8006","","ssh,rpcbind,squid"
Sneaky Infra Scan,Database Servers,10.10.30.5,true,"22,3306","",""
Sneaky Infra Scan,Database Servers,10.10.30.6,false,"22,5432","",""
```
### Field Specifications
| Column | Type | Required | Format | Example | Description |
|--------|------|----------|--------|---------|-------------|
| `scan_title` | string | Yes | Any text | `"Sneaky Infra Scan"` | Becomes YAML `title` field. Must be same for all rows. |
| `site_name` | string | Yes | Any text | `"Production Web Servers"` | Logical grouping. Rows with same site_name are grouped together. |
| `ip_address` | string | Yes | IPv4 or IPv6 | `"10.10.20.4"` | Target IP address to scan. |
| `ping_expected` | boolean | No | `true`/`false` (case-insensitive) | `true` | Whether host should respond to ping. Default: `false` |
| `tcp_ports` | list[int] | No | Comma-separated, quoted if multiple | `"22,80,443"` or `22` | Expected TCP ports. Empty = no expected TCP ports. |
| `udp_ports` | list[int] | No | Comma-separated, quoted if multiple | `"53,123"` or `53` | Expected UDP ports. Empty = no expected UDP ports. |
| `services` | list[str] | No | Comma-separated, quoted if multiple | `"ssh,http,https"` | Expected service names (optional). |
### Validation Rules
**Row-level validation:**
1. `scan_title` must be non-empty and same across all rows
2. `site_name` must be non-empty
3. `ip_address` must be valid IPv4 or IPv6 format
4. `ping_expected` must be `true` or `false` (case-insensitive), or empty (defaults to `false`)
5. Port numbers must be integers 1-65535
6. Port lists must be comma-separated (spaces optional)
7. Duplicate IPs within same config are allowed (different expected values)
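
A minimal sketch of the per-row checks, using only the standard library. The function name, the error wording, and the choice to collect errors rather than stop at the first one are assumptions, not part of the spec:

```python
import ipaddress
from typing import Dict, List


def validate_row(row: Dict[str, str], line_no: int) -> List[str]:
    """Return a list of human-readable errors for one CSV data row."""
    errors = []
    if not (row.get("site_name") or "").strip():
        errors.append(f"Row {line_no}: site_name is required")
    try:
        # Accepts both IPv4 and IPv6; raises ValueError otherwise.
        ipaddress.ip_address((row.get("ip_address") or "").strip())
    except ValueError:
        errors.append(f"Row {line_no}: invalid ip_address {row.get('ip_address')!r}")
    ping = (row.get("ping_expected") or "").strip().lower()
    if ping not in ("", "true", "false"):
        errors.append(f"Row {line_no}: ping_expected must be true or false")
    for column in ("tcp_ports", "udp_ports"):
        for token in (row.get(column) or "").split(","):
            token = token.strip()
            if not token:
                continue  # empty cell or stray comma
            is_number = token.isascii() and token.isdigit()
            if not is_number or not 1 <= int(token) <= 65535:
                errors.append(f"Row {line_no}: invalid port {token!r} in {column}")
    return errors
```

Returning every error for a row lets the upload page show all problems at once instead of one per attempt.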
**File-level validation:**
1. CSV must have exactly 7 columns with correct headers
2. Must have at least 1 data row (besides header)
3. Must have at least 1 site defined
4. Must have at least 1 IP per site
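
The file-level checks might look like the sketch below. The function name and messages are hypothetical; since every valid row carries a non-empty `site_name` and `ip_address`, rules 3 and 4 follow from rule 2 plus the row-level checks, so they are not tested separately here:

```python
import csv
import io
from typing import List

EXPECTED_HEADERS = ["scan_title", "site_name", "ip_address", "ping_expected",
                    "tcp_ports", "udp_ports", "services"]


def validate_file_structure(csv_content: str) -> List[str]:
    """Check headers, data row count, column count, and scan_title consistency."""
    rows = list(csv.reader(io.StringIO(csv_content)))
    if not rows:
        return ["CSV is empty"]
    if [cell.strip() for cell in rows[0]] != EXPECTED_HEADERS:
        return ["Expected headers: " + ",".join(EXPECTED_HEADERS)]
    errors, titles, data_count = [], set(), 0
    for row_no, row in enumerate(rows[1:], start=2):
        if not any(cell.strip() for cell in row):
            continue  # ignore blank lines
        data_count += 1
        if len(row) != len(EXPECTED_HEADERS):
            errors.append(f"Row {row_no}: expected 7 columns, found {len(row)}")
            continue
        titles.add(row[0].strip())
    if data_count == 0:
        errors.append("CSV must contain at least one data row")
    if "" in titles:
        errors.append("scan_title must not be empty")
    if len(titles) > 1:
        errors.append("scan_title must be identical on every row")
    return errors
```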
**Filename validation:**
1. Filename is generated from `scan_title` (lowercase, spaces→hyphens, special characters removed, `.yaml` appended)
2. Must not conflict with existing config files
3. Max filename length: 200 characters
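
One possible reading of the filename rules, as a sketch. Collapsing repeated hyphens and rejecting titles that sanitize to nothing are assumptions beyond what the rules state; the conflict check against existing files is left to the caller:

```python
import re


def generate_filename_from_title(title: str, max_length: int = 200) -> str:
    """Turn a scan title into a lowercase, hyphenated .yaml filename."""
    slug = title.strip().lower()
    slug = re.sub(r"\s+", "-", slug)         # spaces -> hyphens
    slug = re.sub(r"[^a-z0-9_-]", "", slug)  # drop all other characters
    slug = re.sub(r"-{2,}", "-", slug).strip("-")
    if not slug:
        raise ValueError("Title produces an empty filename")
    extension = ".yaml"
    # Truncate the slug so the full name stays within max_length.
    return slug[: max_length - len(extension)] + extension
```

Because `/` and `.` are stripped, a title such as `../../etc/passwd` reduces to a plain name with no path components.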
### CSV-to-YAML Conversion Logic
**Python pseudocode:**
```python
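import csv
import io

import yaml  # PyYAML


def parse_port_list(value: str) -> list:
    """'22, 80' -> [22, 80]; empty cell -> []."""
    return [int(p) for p in (value or '').split(',') if p.strip()]


def parse_string_list(value: str) -> list:
    """'ssh,http' -> ['ssh', 'http']; empty cell -> []."""
    return [s.strip() for s in (value or '').split(',') if s.strip()]


def parse_bool(value: str, default: bool = False) -> bool:
    """'true'/'TRUE' -> True; empty cell -> default."""
    value = (value or '').strip().lower()
    return default if not value else value == 'true'


def csv_to_yaml(csv_content: str) -> str:
    # Assumes the CSV has already passed validation.
    rows = list(csv.DictReader(io.StringIO(csv_content)))
    if not rows:
        raise ValueError('CSV has no data rows')

    # Extract scan title (same for all rows)
    scan_title = rows[0]['scan_title']

    # Group rows by site_name (dicts preserve insertion order)
    sites = {}
    for row in rows:
        # Parse ports and services
        expected = {
            'ping': parse_bool(row['ping_expected'], default=False),
            'tcp_ports': parse_port_list(row['tcp_ports']),
            'udp_ports': parse_port_list(row['udp_ports']),
        }
        services = parse_string_list(row['services'])
        if services:  # Optional, omit if empty
            expected['services'] = services
        sites.setdefault(row['site_name'], []).append({
            'address': row['ip_address'],
            'expected': expected,
        })

    # Build YAML structure
    yaml_data = {
        'title': scan_title,
        'sites': [
            {'name': site_name, 'ips': ips}
            for site_name, ips in sites.items()
        ],
    }
    return yaml.dump(yaml_data, sort_keys=False)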
```
**Example Conversion:**
**Input CSV:**
```csv
scan_title,site_name,ip_address,ping_expected,tcp_ports,udp_ports,services
Prod Scan,Web Servers,10.10.20.4,true,"22,80,443",53,ssh
Prod Scan,Web Servers,10.10.20.5,true,"22,80",,"ssh,http"
```
**Output YAML:**
```yaml
title: Prod Scan
sites:
  - name: Web Servers
    ips:
      - address: 10.10.20.4
        expected:
          ping: true
          tcp_ports: [22, 80, 443]
          udp_ports: [53]
          services: [ssh]
      - address: 10.10.20.5
        expected:
          ping: true
          tcp_ports: [22, 80]
          udp_ports: []
          services: [ssh, http]
```
---
## Architecture
### Backend Components
#### 1. API Blueprint: `web/api/configs.py`
**New endpoints:**
| Method | Endpoint | Description | Auth | Request Body | Response |
|--------|----------|-------------|------|--------------|----------|
| `GET` | `/api/configs` | List all config files | Required | - | `{ "configs": [{filename, title, path, created_at, used_by_schedules}] }` |
| `GET` | `/api/configs/<filename>` | Get config content (YAML) | Required | - | `{ "filename": "...", "content": "...", "parsed": {...} }` |
| `POST` | `/api/configs/upload-csv` | Upload CSV and convert to YAML | Required | `multipart/form-data` with file | `{ "filename": "...", "preview": "...", "success": true }` |
| `POST` | `/api/configs/upload-yaml` | Upload YAML directly | Required | `multipart/form-data` with file | `{ "filename": "...", "success": true }` |
| `GET` | `/api/configs/template` | Download CSV template | Required | - | CSV file download |
| `DELETE` | `/api/configs/<filename>` | Delete config file | Required | - | `{ "success": true }` or error if used by schedules |
**Error responses:**
- `400` - Invalid CSV or YAML, validation errors
- `401` - Not authenticated
- `404` - Config file not found
- `409` - Config filename conflict
- `413` - Uploaded file too large
- `422` - Cannot delete (used by schedules)
- `500` - Server error
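
One way to keep these codes consistent across endpoints is a single mapping from service-layer exceptions to responses. In this sketch, `ConfigConflictError`, `ConfigInUseError`, and `error_response` are hypothetical names; the service docstrings in this plan raise plain `ValueError` for these cases, so distinct exception types are an assumption. The `message` key mirrors what the frontend reads from error bodies.

```python
from typing import Tuple


class ConfigConflictError(Exception):
    """Hypothetical: raised when the target filename already exists."""


class ConfigInUseError(Exception):
    """Hypothetical: raised when schedules still reference the config."""


def error_response(exc: Exception) -> Tuple[dict, int]:
    """Map a service-layer exception to a JSON body and HTTP status code."""
    if isinstance(exc, FileNotFoundError):
        status = 404
    elif isinstance(exc, ConfigConflictError):
        status = 409
    elif isinstance(exc, ConfigInUseError):
        status = 422
    elif isinstance(exc, ValueError):
        status = 400
    else:
        status = 500
    # Avoid leaking internal details on unexpected errors.
    message = str(exc) if status != 500 else "Internal server error"
    return {"success": False, "message": message}, status
```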
**Blueprint structure:**
```python
# web/api/configs.py
import logging

from flask import Blueprint, jsonify, request, send_file
from werkzeug.utils import secure_filename

from web.auth.decorators import api_auth_required
from web.services.config_service import ConfigService

# NOTE: the templates in this plan call url_for('api_configs.download_template');
# keep the blueprint's registered name and those calls in sync.
bp = Blueprint('configs', __name__)
logger = logging.getLogger(__name__)


@bp.route('', methods=['GET'])
@api_auth_required
def list_configs():
    """List all config files with metadata"""
    pass


@bp.route('/<filename>', methods=['GET'])
@api_auth_required
def get_config(filename: str):
    """Get config file content"""
    pass


@bp.route('/upload-csv', methods=['POST'])
@api_auth_required
def upload_csv():
    """Upload CSV and convert to YAML"""
    pass


@bp.route('/upload-yaml', methods=['POST'])
@api_auth_required
def upload_yaml():
    """Upload YAML file directly"""
    pass


@bp.route('/template', methods=['GET'])
@api_auth_required
def download_template():
    """Download CSV template"""
    pass


@bp.route('/<filename>', methods=['DELETE'])
@api_auth_required
def delete_config(filename: str):
    """Delete config file"""
    pass
```
#### 2. Service Layer: `web/services/config_service.py`
**Class definition:**
```python
from typing import Any, Dict, List, Optional, Tuple


class ConfigService:
    """Business logic for config management"""

    def __init__(self, configs_dir: str = '/app/configs'):
        self.configs_dir = configs_dir

    def list_configs(self) -> List[Dict[str, Any]]:
        """
        List all config files with metadata.

        Returns:
            [
                {
                    "filename": "prod-scan.yaml",
                    "title": "Prod Scan",
                    "path": "/app/configs/prod-scan.yaml",
                    "created_at": "2025-11-15T10:30:00Z",
                    "size_bytes": 1234,
                    "used_by_schedules": ["Daily Scan", "Weekly Audit"]
                }
            ]
        """
        pass

    def get_config(self, filename: str) -> Dict[str, Any]:
        """
        Get config file content and parsed data.

        Returns:
            {
                "filename": "prod-scan.yaml",
                "content": "title: Prod Scan\n...",
                "parsed": {"title": "Prod Scan", "sites": [...]}
            }
        """
        pass

    def create_from_yaml(self, filename: str, content: str) -> str:
        """
        Create config from YAML content.

        Args:
            filename: Desired filename (will be sanitized)
            content: YAML content string

        Returns:
            Final filename

        Raises:
            ValueError: If content invalid or filename conflict
        """
        pass

    def create_from_csv(self, csv_file, suggested_filename: Optional[str] = None) -> Tuple[str, str]:
        """
        Create config from CSV file.

        Args:
            csv_file: File object from request.files
            suggested_filename: Optional filename (else auto-generate from title)

        Returns:
            (final_filename, yaml_preview)

        Raises:
            ValueError: If CSV invalid
        """
        pass

    def delete_config(self, filename: str) -> None:
        """
        Delete config file if not used by schedules.

        Raises:
            FileNotFoundError: If config doesn't exist
            ValueError: If config used by active schedules
        """
        pass

    def validate_config_content(self, content: Dict) -> Tuple[bool, str]:
        """
        Validate parsed YAML config structure.

        Returns:
            (is_valid, error_message)
        """
        pass

    def get_schedules_using_config(self, filename: str) -> List[str]:
        """
        Get list of schedule names using this config.

        Returns:
            ["Daily Scan", "Weekly Audit"]
        """
        pass

    def generate_filename_from_title(self, title: str) -> str:
        """
        Generate safe filename from scan title.

        Example: "Prod Scan 2025" -> "prod-scan-2025.yaml"
        """
        pass
```
#### 3. CSV Parser: `web/utils/csv_parser.py`
**Class definition:**
```python
from typing import List, Tuple


class CSVConfigParser:
    """Parse and validate CSV config files"""

    REQUIRED_COLUMNS = [
        'scan_title', 'site_name', 'ip_address',
        'ping_expected', 'tcp_ports', 'udp_ports', 'services'
    ]

    def __init__(self):
        pass

    def parse_csv_to_yaml(self, csv_file) -> str:
        """
        Convert CSV file to YAML string.

        Args:
            csv_file: File object or file path

        Returns:
            YAML string

        Raises:
            ValueError: If CSV invalid
        """
        pass

    def validate_csv_structure(self, csv_file) -> Tuple[bool, List[str]]:
        """
        Validate CSV structure and content.

        Returns:
            (is_valid, error_messages)
        """
        pass

    def _parse_port_list(self, value: str) -> List[int]:
        """Parse comma-separated port list"""
        pass

    def _parse_string_list(self, value: str) -> List[str]:
        """Parse comma-separated string list"""
        pass

    def _parse_bool(self, value: str, default: bool = False) -> bool:
        """Parse boolean value (true/false, case-insensitive; empty -> default)"""
        pass

    def _validate_ip_address(self, ip: str) -> bool:
        """Validate IPv4/IPv6 address format"""
        pass

    def _validate_port(self, port: int) -> bool:
        """Validate port number (1-65535)"""
        pass
```
#### 4. Template Generator: `web/utils/template_generator.py`
**Function:**
```python
import csv
import io


def generate_csv_template() -> str:
    """
    Generate CSV template with headers and example rows.

    Returns:
        CSV string with headers and 2 example rows
    """
    template = [
        ['scan_title', 'site_name', 'ip_address', 'ping_expected', 'tcp_ports', 'udp_ports', 'services'],
        ['Example Infrastructure Scan', 'Production Web Servers', '10.10.20.4', 'true', '22,80,443', '53', 'ssh,http,https'],
        ['Example Infrastructure Scan', 'Production Web Servers', '10.10.20.5', 'true', '22,3306', '', 'ssh,mysql'],
    ]
    output = io.StringIO()
    writer = csv.writer(output)  # quotes fields that contain commas
    writer.writerows(template)
    return output.getvalue()
```
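
The multi-port cells in the template need no manual quoting because the standard `csv` writer quotes any field containing a comma. This short round-trip check illustrates the behavior, using one of the example rows above:

```python
import csv
import io

row = ["Example Infrastructure Scan", "Production Web Servers", "10.10.20.4",
       "true", "22,80,443", "53", "ssh,http,https"]

buffer = io.StringIO()
csv.writer(buffer).writerow(row)
line = buffer.getvalue()

# Reading the line back yields the original seven fields,
# with "22,80,443" still a single cell.
parsed = next(csv.reader(io.StringIO(line)))
```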
### Frontend Components
#### 1. New Route: Configs Management Page
**File:** `web/routes/main.py`
```python
@bp.route('/configs')
@login_required
def configs():
    """Config management page"""
    return render_template('configs.html')


@bp.route('/configs/upload')
@login_required
def upload_config():
    """Config upload page"""
    return render_template('config_upload.html')
```
#### 2. Template: Config List Page
**File:** `web/templates/configs.html`
**Features:**
- Table listing all configs with columns:
  - Filename
  - Title (from YAML)
  - Created date
  - Size
  - Used by schedules (badge count)
  - Actions (view, download, delete)
- "Create New Config" button → redirects to upload page
- "Download CSV Template" button
- Delete confirmation modal
- Search/filter box (client-side)
**Layout:**
```html
{% extends "base.html" %}
{% block content %}
<div class="container mt-4">
<div class="d-flex justify-content-between align-items-center mb-4">
<h2>Configuration Files</h2>
<div>
<a href="{{ url_for('main.upload_config') }}" class="btn btn-primary">
<i class="bi bi-plus-circle"></i> Create New Config
</a>
<a href="{{ url_for('api_configs.download_template') }}"
class="btn btn-outline-secondary" download>
<i class="bi bi-download"></i> Download CSV Template
</a>
</div>
</div>
<div class="card">
<div class="card-body">
<div class="mb-3">
<input type="text" id="search" class="form-control"
placeholder="Search configs...">
</div>
<table class="table table-hover" id="configs-table">
<thead>
<tr>
<th>Filename</th>
<th>Title</th>
<th>Created</th>
<th>Size</th>
<th>Used By</th>
<th>Actions</th>
</tr>
</thead>
<tbody>
<!-- Populated via JavaScript -->
</tbody>
</table>
</div>
</div>
</div>
<!-- Delete Confirmation Modal -->
<div class="modal fade" id="deleteModal" tabindex="-1">
<!-- Modal content -->
</div>
{% endblock %}
```
#### 3. Template: Config Upload Page
**File:** `web/templates/config_upload.html`
**Features:**
- Two upload methods (tabs):
  - **Tab 1: CSV Upload** (default)
    - Drag-drop zone or file picker (`.csv` only)
    - Instructions: "Download template, fill with your data, upload here"
    - Preview pane showing generated YAML after upload
    - "Save Config" button (disabled until valid upload)
  - **Tab 2: YAML Upload** (for advanced users)
    - Drag-drop zone or file picker (`.yaml`, `.yml` only)
    - Direct upload without conversion
- Real-time validation feedback
- Error messages with specific issues
- Success message with link to configs list
**Layout:**
```html
{% extends "base.html" %}
{% block content %}
<div class="container mt-4">
<h2>Create New Configuration</h2>
<ul class="nav nav-tabs mb-4" id="uploadTabs" role="tablist">
<li class="nav-item">
<a class="nav-link active" id="csv-tab" data-bs-toggle="tab"
href="#csv" role="tab">Upload CSV</a>
</li>
<li class="nav-item">
<a class="nav-link" id="yaml-tab" data-bs-toggle="tab"
href="#yaml" role="tab">Upload YAML</a>
</li>
</ul>
<div class="tab-content" id="uploadTabsContent">
<!-- CSV Upload Tab -->
<div class="tab-pane fade show active" id="csv" role="tabpanel">
<div class="row">
<div class="col-md-6">
<div class="card">
<div class="card-body">
<h5>Step 1: Upload CSV</h5>
<p class="text-muted">
<a href="{{ url_for('api_configs.download_template') }}" download>
Download template
</a> first if you haven't already.
</p>
<div id="csv-dropzone" class="dropzone">
<i class="bi bi-cloud-upload"></i>
<p>Drag & drop CSV file here or click to browse</p>
<input type="file" id="csv-file-input" accept=".csv" hidden>
</div>
<div id="csv-errors" class="alert alert-danger mt-3" style="display:none;">
</div>
</div>
</div>
</div>
<div class="col-md-6">
<div class="card">
<div class="card-body">
<h5>Step 2: Preview & Save</h5>
<div id="yaml-preview" style="display:none;">
<pre><code id="yaml-content" class="language-yaml"></code></pre>
<button id="save-config-btn" class="btn btn-success mt-3">
Save Configuration
</button>
</div>
<div id="preview-placeholder" class="text-muted text-center py-5">
Upload a CSV file to see preview
</div>
</div>
</div>
</div>
</div>
</div>
<!-- YAML Upload Tab -->
<div class="tab-pane fade" id="yaml" role="tabpanel">
<div class="card">
<div class="card-body">
<h5>Upload YAML Configuration</h5>
<p class="text-muted">
For advanced users: upload a YAML config file directly.
</p>
<div id="yaml-dropzone" class="dropzone">
<i class="bi bi-cloud-upload"></i>
<p>Drag & drop YAML file here or click to browse</p>
<input type="file" id="yaml-file-input" accept=".yaml,.yml" hidden>
</div>
<button id="upload-yaml-btn" class="btn btn-primary mt-3" disabled>
Upload YAML
</button>
<div id="yaml-errors" class="alert alert-danger mt-3" style="display:none;">
</div>
</div>
</div>
</div>
</div>
</div>
{% endblock %}
```
#### 4. JavaScript: Config Manager
**File:** `web/static/js/config-manager.js`
**Functions:**
```javascript
class ConfigManager {
    constructor() {
        this.apiBase = '/api/configs';
    }

    // List configs and populate table
    async loadConfigs() {
        const response = await fetch(this.apiBase);
        const data = await response.json();
        this.renderConfigsTable(data.configs);
    }

    // Upload CSV file
    async uploadCSV(file) {
        const formData = new FormData();
        formData.append('file', file);
        const response = await fetch(`${this.apiBase}/upload-csv`, {
            method: 'POST',
            body: formData
        });
        if (!response.ok) {
            const error = await response.json();
            throw new Error(error.message);
        }
        return await response.json();
    }

    // Upload YAML file
    async uploadYAML(file, filename) {
        // Similar to uploadCSV
    }

    // Delete config
    async deleteConfig(filename) {
        // Encode the filename so it is always a single URL path segment
        const response = await fetch(`${this.apiBase}/${encodeURIComponent(filename)}`, {
            method: 'DELETE'
        });
        if (!response.ok) {
            const error = await response.json();
            throw new Error(error.message);
        }
        return await response.json();
    }

    // Show preview of YAML
    showPreview(yamlContent) {
        document.getElementById('yaml-content').textContent = yamlContent;
        document.getElementById('yaml-preview').style.display = 'block';
        document.getElementById('preview-placeholder').style.display = 'none';
    }

    // Render configs table
    renderConfigsTable(configs) {
        // Populate table with config data
    }

    // Client-side search filter
    filterConfigs(searchTerm) {
        // Filter table rows by search term
    }
}

// Drag-drop handlers
function setupDropzone(elementId, fileInputId, fileType) {
    const dropzone = document.getElementById(elementId);
    const fileInput = document.getElementById(fileInputId);

    dropzone.addEventListener('click', () => fileInput.click());

    dropzone.addEventListener('dragover', (e) => {
        e.preventDefault();
        dropzone.classList.add('dragover');
    });

    dropzone.addEventListener('drop', (e) => {
        e.preventDefault();
        dropzone.classList.remove('dragover');
        const file = e.dataTransfer.files[0];
        handleFileUpload(file, fileType);
    });

    fileInput.addEventListener('change', (e) => {
        const file = e.target.files[0];
        handleFileUpload(file, fileType);
    });
}
```
#### 5. CSS Styling
**File:** `web/static/css/config-manager.css`
```css
.dropzone {
border: 2px dashed #6c757d;
border-radius: 8px;
padding: 40px;
text-align: center;
cursor: pointer;
transition: all 0.3s ease;
background-color: #f8f9fa;
}
.dropzone:hover,
.dropzone.dragover {
border-color: #0d6efd;
background-color: #e7f1ff;
}
.dropzone i {
font-size: 48px;
color: #6c757d;
margin-bottom: 16px;
}
#yaml-preview pre {
background-color: #f8f9fa;
border: 1px solid #dee2e6;
border-radius: 4px;
padding: 16px;
max-height: 400px;
overflow-y: auto;
}
.config-actions {
white-space: nowrap;
}
.config-actions .btn {
margin-right: 4px;
}
.schedule-badge {
background-color: #0d6efd;
color: white;
padding: 4px 8px;
border-radius: 4px;
font-size: 0.85em;
}
```
### Navigation Integration
**Update:** `web/templates/base.html`
Add "Configs" link to navigation menu:
```html
<nav class="navbar navbar-expand-lg navbar-dark bg-dark">
<div class="container-fluid">
<a class="navbar-brand" href="/">SneakyScanner</a>
<ul class="navbar-nav">
<li class="nav-item">
<a class="nav-link" href="{{ url_for('main.dashboard') }}">Dashboard</a>
</li>
<li class="nav-item">
<a class="nav-link" href="{{ url_for('main.scans') }}">Scans</a>
</li>
<li class="nav-item">
<a class="nav-link" href="{{ url_for('main.schedules') }}">Schedules</a>
</li>
<li class="nav-item">
<a class="nav-link" href="{{ url_for('main.configs') }}">Configs</a> <!-- NEW -->
</li>
<li class="nav-item">
<a class="nav-link" href="{{ url_for('main.settings') }}">Settings</a>
</li>
</ul>
</div>
</nav>
```
---
## Implementation Tasks
### Task Breakdown (15 tasks)
#### Backend (8 tasks)
1. **Create CSV parser utility** (`web/utils/csv_parser.py`)
- Implement `CSVConfigParser` class
- Parse CSV rows into dict structure
- Validate CSV structure (columns, data types)
- Handle edge cases (empty cells, quotes, commas in values)
- Write unit tests (10+ test cases)
2. **Create template generator** (`web/utils/template_generator.py`)
- Generate CSV template with headers + example rows
- Write unit tests
3. **Create config service** (`web/services/config_service.py`)
- Implement `ConfigService` class with all methods
- CSV-to-YAML conversion logic
- Filename sanitization and conflict detection
- Schedule dependency checking
- File operations (read, write, delete)
- Write unit tests (15+ test cases)
4. **Create configs API blueprint** (`web/api/configs.py`)
- Implement 6 API endpoints
- Error handling with proper HTTP status codes
- Request validation
- File upload handling (multipart/form-data)
- File download headers
- Integrate with ConfigService
5. **Register configs blueprint** (`web/app.py`)
- Import and register blueprint
- Add to API routes
6. **Add file upload limits** (`web/app.py`)
- Set MAX_CONTENT_LENGTH for uploads (2MB for CSV/YAML)
- Add file size validation
7. **Update existing services** (if needed)
- `schedule_service.py`: Add config dependency check method
- Ensure config validation called before scan trigger
8. **Write integration tests** (`tests/test_config_api.py`)
- Test all API endpoints (20+ test cases)
- Test CSV upload → conversion → scan trigger workflow
- Test error cases (invalid CSV, filename conflict, delete protection)
- Test file download
#### Frontend (5 tasks)
9. **Create configs list template** (`web/templates/configs.html`)
- Table layout with columns
- Action buttons (view, download, delete)
- Delete confirmation modal
- Search box
- Links to upload page and template download
10. **Create config upload template** (`web/templates/config_upload.html`)
- Two-tab interface (CSV upload, YAML upload)
- Drag-drop zones for both tabs
- YAML preview pane
- Error message displays
- Success/failure feedback
11. **Add routes to main blueprint** (`web/routes/main.py`)
- `/configs` - List configs
- `/configs/upload` - Upload page
12. **Create JavaScript module** (`web/static/js/config-manager.js`)
- ConfigManager class
- API fetch wrappers for all endpoints
- Drag-drop handlers
- File upload with FormData
- Table rendering and search filtering
- Preview display
- Error display helpers
13. **Create CSS styling** (`web/static/css/config-manager.css`)
- Dropzone styling with hover effects
- Preview pane styling
- Table action buttons
- Responsive layout adjustments
#### Integration & Documentation (2 tasks)
14. **Update navigation** (`web/templates/base.html`)
- Add "Configs" link to navbar
- Set active state based on current route
15. **Update documentation**
- Update README.md with Config Creator section
- Update API_REFERENCE.md with new endpoints
- Create this document (Phase4.md) ✓
---
## Testing Strategy
### Unit Tests
**File:** `tests/test_csv_parser.py`
Test cases (10+):
- Valid CSV with multiple sites and IPs
- Single site, single IP
- Empty optional fields (udp_ports, services)
- Boolean parsing (true/false/TRUE/FALSE)
- Port list parsing (single port, multiple ports, quoted/unquoted)
- Invalid IP addresses
- Invalid port numbers (0, 65536, negative)
- Missing required columns
- Inconsistent scan_title across rows
- Empty CSV (no data rows)
- Malformed CSV (wrong number of columns)
**File:** `tests/test_config_service.py`
Test cases (15+):
- List configs (empty directory, multiple configs)
- Get config (valid, non-existent)
- Create from YAML (valid, invalid syntax, duplicate filename)
- Create from CSV (valid, invalid CSV, duplicate filename)
- Delete config (valid, non-existent, used by schedule)
- Validate config content (valid, missing title, missing sites, invalid structure)
- Get schedules using config (none, multiple)
- Generate filename from title (various titles, special characters, long titles)
### Integration Tests
**File:** `tests/test_config_api.py`
Test cases (20+):
- **GET /api/configs**
  - List configs successfully
  - Empty list when no configs exist
  - Includes schedule usage counts
- **GET /api/configs/<filename>**
  - Get existing config
  - 404 for non-existent config
  - Returns parsed YAML data
- **POST /api/configs/upload-csv**
  - Upload valid CSV → creates YAML
  - Upload invalid CSV → 400 error with details
  - Upload CSV with duplicate filename → 409 error
  - Upload non-CSV file → 400 error
  - Upload file too large → 413 error
- **POST /api/configs/upload-yaml**
  - Upload valid YAML → creates config
  - Upload invalid YAML syntax → 400 error
  - Upload YAML missing required fields → 400 error
- **GET /api/configs/template**
  - Download CSV template
  - Returns text/csv mimetype
  - Template has correct headers and examples
- **DELETE /api/configs/<filename>**
  - Delete unused config → success
  - Delete used-by-schedule config → 422 error with schedule list
  - Delete non-existent config → 404 error
- **Authentication**
  - All endpoints require auth
  - Unauthenticated requests → 401 error
### End-to-End Tests
**File:** `tests/test_config_workflow.py`
Test cases (5):
1. **Complete CSV workflow:**
   - Download template
   - Upload modified CSV
   - Verify YAML created correctly
   - Trigger scan with new config
   - Verify scan completes successfully
2. **Config deletion protection:**
   - Create config
   - Create schedule using config
   - Attempt to delete config → fails with error
   - Disable schedule
   - Delete config → succeeds
3. **Filename conflict handling:**
   - Create config "test-scan.yaml"
   - Upload CSV with same title
   - Verify error returned
   - User changes filename → succeeds
4. **YAML direct upload:**
   - Upload valid YAML
   - Config immediately usable
   - Trigger scan → works
5. **CSV validation errors:**
   - Upload CSV with invalid IP
   - Verify clear error message returned
   - Fix CSV and re-upload → succeeds
### Manual Testing Checklist
**UI/UX:**
- [ ] Drag-drop file upload works
- [ ] File picker works
- [ ] Preview shows correct YAML
- [ ] Error messages are clear and actionable
- [ ] Success messages appear after save
- [ ] Table search/filter works
- [ ] Delete confirmation modal works
- [ ] Navigation links work
- [ ] Responsive layout on mobile
**File Handling:**
- [ ] CSV template downloads correctly
- [ ] CSV with special characters (commas, quotes) parses correctly
- [ ] Large CSV files upload successfully
- [ ] YAML files with UTF-8 characters work
- [ ] Generated YAML is valid and scanner accepts it
**Error Cases:**
- [ ] Invalid CSV format shows helpful error
- [ ] Duplicate filename shows conflict error
- [ ] Delete protected config shows which schedules use it
- [ ] Network errors handled gracefully
- [ ] File too large shows size limit error
---
## Security Considerations
### Input Validation
1. **Filename sanitization:**
- Use `werkzeug.utils.secure_filename()`
- Remove path traversal attempts (`../`, `./`)
- Limit filename length (200 chars)
- Only allow alphanumeric, hyphens, underscores
2. **File type validation:**
- Check file extension (`.csv`, `.yaml`, `.yml`)
- Verify MIME type matches extension
- Reject executable or script file extensions
3. **CSV content validation:**
- Validate all IPs with `ipaddress` module
- Validate port ranges (1-65535)
- Limit CSV row count (max 1000 rows)
- Limit CSV file size (max 2MB)
4. **YAML parsing security:**
- Always use `yaml.safe_load()` (never `yaml.load()`)
- Prevents arbitrary code execution
- Only load basic data types (dict, list, str, int, bool)
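
The extension, size, and row-count limits listed above could be enforced before any parsing happens. This is a sketch under stated assumptions: `check_upload` and the constant names are hypothetical, and counting physical lines slightly overcounts rows when a quoted cell contains a newline.

```python
import os

MAX_UPLOAD_BYTES = 2 * 1024 * 1024  # 2MB limit from this plan
MAX_CSV_ROWS = 1000                 # row limit from this plan
ALLOWED_EXTENSIONS = {".csv", ".yaml", ".yml"}


def check_upload(filename: str, data: bytes) -> None:
    """Raise ValueError if the upload breaks an extension, size, or row limit."""
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"Unsupported file type: {extension or 'none'}")
    if len(data) > MAX_UPLOAD_BYTES:
        raise ValueError("File exceeds the 2MB upload limit")
    if extension == ".csv":
        # Count non-empty lines, minus the header row.
        data_rows = sum(1 for line in data.splitlines() if line.strip()) - 1
        if data_rows > MAX_CSV_ROWS:
            raise ValueError(f"CSV has {data_rows} rows; the limit is {MAX_CSV_ROWS}")
```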
### Access Control
1. **Authentication required:**
- All API endpoints require `@api_auth_required`
- All web routes require `@login_required`
- Single-user model (no multi-tenant concerns)
2. **File system access:**
- Restrict all operations to `/app/configs/` directory
- Validate no path traversal in any file operations
- Resolve the final path (for example with `os.path.realpath()`) and reject any result outside the configs directory
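
A minimal sketch of the containment check, assuming a helper named `resolve_config_path` (hypothetical). It resolves symlinks and `..` segments first, then verifies the result still sits under the configs directory:

```python
import os


def resolve_config_path(configs_dir: str, filename: str) -> str:
    """Return the absolute path for filename, refusing anything outside configs_dir."""
    base = os.path.realpath(configs_dir)
    candidate = os.path.realpath(os.path.join(base, filename))
    # commonpath equals base only when candidate is located under base.
    if candidate == base or os.path.commonpath([base, candidate]) != base:
        raise ValueError(f"Invalid config filename: {filename!r}")
    return candidate
```

Comparing with `os.path.commonpath()` rather than a string prefix avoids treating `/app/configs-old/x.yaml` as if it were inside `/app/configs`.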
### File Upload Security
1. **Size limits:**
- CSV: 2MB max
- YAML: 2MB max
- Configurable in Flask config
2. **Rate limiting (future consideration):**
- Limit upload frequency per session
- Prevent DoS via repeated large uploads
3. **Virus scanning (future consideration):**
- For production deployments
- Scan uploaded files with ClamAV
---
## Database Changes
**No database schema changes required for Phase 4.**
Configs are stored as files, not in database. However, for future enhancement, consider adding:
```python
# Optional future enhancement (not Phase 4)
from datetime import datetime

from sqlalchemy import Column, DateTime, Integer, String

# Base is assumed to be the project's existing declarative base.


class Config(Base):
    __tablename__ = 'configs'

    id = Column(Integer, primary_key=True)
    filename = Column(String(255), unique=True, nullable=False, index=True)
    title = Column(String(255), nullable=False)
    created_at = Column(DateTime, default=datetime.utcnow, nullable=False)
    updated_at = Column(DateTime, default=datetime.utcnow, onupdate=datetime.utcnow)
    created_by = Column(String(50))  # 'csv', 'yaml', 'manual'
    file_hash = Column(String(64))  # SHA256 hash for change detection
```
Benefits of database tracking:
- Faster metadata queries (no need to parse YAML for title)
- Change detection and versioning
- Usage statistics (how often config used)
- Search and filtering
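
If the `file_hash` column is ever added, the digest could be computed as in this sketch. `file_sha256` is a hypothetical helper; the 64-character hex digest matches the `String(64)` column width:

```python
import hashlib


def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns b"" at end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing a stored digest with a freshly computed one would reveal configs edited outside the web UI.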
**Decision:** Keep Phase 4 simple (file-based only). Add database tracking in Phase 5+ if needed.

---
## Integration with Existing Features
### Dashboard Integration
**Current:** Dashboard has "Run Scan Now" button with config dropdown
**Change:** Config dropdown populated via `/api/configs` instead of filesystem scan
**File:** `web/templates/dashboard.html`
```javascript
// Before (Phase 3)
fetch('/api/configs-list') // Doesn't exist yet
// After (Phase 4)
fetch('/api/configs')
.then(res => res.json())
.then(data => {
const select = document.getElementById('config-select');
data.configs.forEach(config => {
const option = new Option(config.title, config.filename);
select.add(option);
});
});
```
### Schedule Management Integration
**Current:** Schedule form has config file input (text field or dropdown)
**Change:** Config selector uses `/api/configs` to show available configs
**File:** `web/templates/schedule_form.html`
```html
<!-- Before -->
<input type="text" name="config_file" placeholder="/app/configs/example.yaml">
<!-- After -->
<select name="config_file" id="config-select">
<option value="">Select a configuration...</option>
<!-- Populated via JavaScript from /api/configs -->
</select>
<p class="text-muted">
Don't see your config? <a href="{{ url_for('main.upload_config') }}">Create a new one</a>
</p>
```
### Scan Trigger Integration
**Current:** Scan trigger validates config file exists
**Change:** No changes needed, validation already in place via `validators.validate_config_file()`
**File:** `web/services/scan_service.py`
```python
def trigger_scan(self, config_file: str, triggered_by: str = 'manual'):
    # Existing validation
    is_valid, error = validate_config_file(config_file)
    if not is_valid:
        raise ValueError(f"Invalid config file: {error}")
    # Continue with scan...
```
---
## Success Criteria
### Phase 4 Complete When:
1. **CSV Template Download:**
   - ✓ User can download CSV template from UI
   - ✓ Template includes headers and example rows
   - ✓ Template format matches specification
2. **CSV Upload & Conversion:**
   - ✓ User can upload CSV via drag-drop or file picker
   - ✓ CSV validates correctly (structure, data types, IPs, ports)
   - ✓ CSV converts to valid YAML
   - ✓ Generated YAML matches expected structure
   - ✓ Config saved to `/app/configs/` directory
3. **YAML Upload:**
   - ✓ User can upload YAML directly
   - ✓ YAML validates before saving
   - ✓ Invalid YAML shows clear error message
4. **Config Management:**
   - ✓ User can list all configs with metadata
   - ✓ User can view config content (YAML)
   - ✓ User can download existing configs
   - ✓ User can delete unused configs
   - ✓ Deletion blocked if config used by schedules
5. **Integration:**
   - ✓ Generated configs work with scan trigger
   - ✓ Configs appear in dashboard scan selector
   - ✓ Configs appear in schedule form selector
   - ✓ No breaking changes to existing workflows
6. **Testing:**
   - ✓ All unit tests pass (25+ tests)
   - ✓ All integration tests pass (20+ tests)
   - ✓ Manual testing checklist complete
7. **Documentation:**
   - ✓ README.md updated with Config Creator section
   - ✓ API_REFERENCE.md updated with new endpoints
   - ✓ Phase4.md created (this document)
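
The delete-protection rule under Config Management (criterion 4) can be sketched as a guard that runs before the file is removed. This is a minimal sketch under stated assumptions: schedules are represented as plain dicts with `name` and `config_file` keys, and `ConfigInUseError` and `delete_config` are hypothetical names. A real service would likely query the schedules table instead.

```python
import os
from pathlib import Path
from typing import Dict, List


class ConfigInUseError(Exception):
    """Raised when a config is still referenced by at least one schedule."""


def delete_config(configs_dir: Path, filename: str,
                  schedules: List[Dict[str, str]]) -> None:
    """Delete a config file unless a schedule still references it.

    `schedules` has an assumed shape: dicts with 'name' and 'config_file' keys.
    """
    # Compare by base name so '/app/configs/x.yaml' and 'x.yaml' both match.
    in_use = [
        s["name"] for s in schedules
        if os.path.basename(s["config_file"]) == filename
    ]
    if in_use:
        raise ConfigInUseError(
            f"Cannot delete {filename}: used by schedules {', '.join(in_use)}"
        )
    target = configs_dir / filename
    if not target.is_file():
        raise FileNotFoundError(f"Config not found: {filename}")
    target.unlink()
```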
---
## Timeline Estimate
**Total Duration:** 4-5 days
### Day 1: Backend Foundation (CSV Parser & Service)
- Create `csv_parser.py` with CSVConfigParser class (3 hours)
- Write unit tests for CSV parser (2 hours)
- Create `template_generator.py` (1 hour)
- Create `config_service.py` skeleton (2 hours)
### Day 2: Backend Implementation (Config Service & API)
- Implement all ConfigService methods (4 hours)
- Write unit tests for ConfigService (3 hours)
- Create `configs.py` API blueprint (2 hours)
### Day 3: Backend Testing & Frontend Start
- Write integration tests for API endpoints (3 hours)
- Create `configs.html` template (2 hours)
- Create `config_upload.html` template (2 hours)
- Add routes to `main.py` (1 hour)
### Day 4: Frontend JavaScript & Styling
- Create `config-manager.js` with all functionality (4 hours)
- Implement drag-drop handlers (2 hours)
- Create `config-manager.css` styling (1 hour)
- Update navigation in `base.html` (30 min)
- Manual testing and bug fixes (2 hours)
### Day 5: Integration & Documentation
- End-to-end testing (2 hours)
- Update dashboard/schedule integration (2 hours)
- Update README.md and API_REFERENCE.md (2 hours)
- Final manual testing checklist (2 hours)
- Bug fixes and polish (2 hours)
**Buffer:** +1 day for unexpected issues or scope additions
---
## Future Enhancements (Post-Phase 4)
Not in scope for Phase 4, but consider for future phases:
1. **Config Editor UI (Phase 5+):**
   - Web form to create configs without CSV
   - Add/remove sites and IPs dynamically
   - Port picker with common presets
   - Inline YAML editor with syntax highlighting
2. **Config Versioning (Phase 5+):**
   - Track config changes over time
   - Compare versions (diff view)
   - Rollback to previous version
   - Store versions in database
3. **CSV Export (Phase 5+):**
   - Export existing YAML configs to CSV
   - Edit in Excel and re-upload
   - Useful for bulk updates
4. **Config Templates (Phase 6+):**
   - Pre-built templates for common scenarios:
     - Web server infrastructure
     - Database cluster
     - Network devices
   - User selects template, fills IPs, done
5. **Bulk Import (Phase 6+):**
   - Upload multiple CSV files at once
   - ZIP archive with multiple configs
   - Import from external sources (CMDB, spreadsheet)
6. **Config Validation on Schedule (Phase 6+):**
   - Periodic validation of all configs
   - Alert if config becomes invalid (file deleted, corrupted)
   - Auto-disable schedules with invalid configs
7. **Config Metadata & Tags (Phase 6+):**
   - Add tags/labels to configs (environment, team, etc.)
   - Filter/search by tags
   - Group related configs
8. **Config Diff Tool (Phase 6+):**
   - Compare two configs side-by-side
   - Highlight differences (IPs, ports, sites)
   - Useful for environment parity checks
---
## Open Questions
### Resolved:
- ✓ One CSV = one config or multiple? **One config**
- ✓ Form editor or upload only? **Upload only (form later)**
- ✓ Config versioning? **No (Phase 4), maybe later**
- ✓ Delete protection? **Block if used by schedules**
### Remaining:
- **Filename handling:** Auto-generate from title or let user specify?
  - **Recommendation:** Auto-generate with option to customize (add filename input field)
- **Duplicate IP handling:** Allow or reject duplicate IPs in same config?
  - **Recommendation:** Allow (user might scan same IP with different expected ports)
- **Config validation frequency:** Validate on upload only, or re-validate periodically?
  - **Recommendation:** Upload only (Phase 4), periodic validation in Phase 6+
- **CSV encoding:** Support only UTF-8 or other encodings?
  - **Recommendation:** UTF-8 only (standard, avoids complexity)
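
The filename and encoding recommendations above could look like the following. This is a sketch, not the project's implementation: `filename_from_title` and `decode_csv_upload` are hypothetical helpers, and the slug rules (lowercase ASCII, hyphen-separated, `.yaml` suffix) are assumptions. Accepting a UTF-8 byte-order mark is also an assumption, made because spreadsheet exports often include one.

```python
import re
from typing import Optional


def filename_from_title(title: str, custom_name: Optional[str] = None) -> str:
    """Build a safe YAML filename from the scan title, or from a user override."""
    base = custom_name if custom_name else title
    # Drop a user-supplied extension so it is not folded into the slug.
    base = re.sub(r"\.ya?ml$", "", base.strip(), flags=re.IGNORECASE)
    # Keep ASCII letters and digits; collapse everything else into single hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", base.lower()).strip("-")
    if not slug:
        raise ValueError("Title must contain at least one letter or digit")
    return f"{slug}.yaml"


def decode_csv_upload(raw: bytes) -> str:
    """Decode uploaded bytes as UTF-8 only, rejecting other encodings."""
    try:
        # 'utf-8-sig' decodes UTF-8 and strips a leading byte-order mark if present.
        return raw.decode("utf-8-sig")
    except UnicodeDecodeError as exc:
        raise ValueError("CSV must be UTF-8 encoded") from exc
```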
---
## Risk Assessment
### Low Risk:
- CSV parsing (standard library, well-tested)
- File upload handling (Flask/werkzeug built-in)
- YAML generation (PyYAML library)
### Medium Risk:
- Complex CSV validation edge cases (commas in values, quotes)
  - **Mitigation:** Comprehensive unit tests, use csv.reader()
- Filename conflicts and race conditions
  - **Mitigation:** Check existence before write, and use atomic operations (such as exclusive-create mode) so the check and the write cannot race
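
Both mitigations above rely on standard-library behavior. The sketch below shows `csv.reader` keeping a quoted, comma-separated value in a single field, and an exclusive-create write that fails instead of overwriting an existing file. The column layout in the usage note and the helper names `parse_rows` and `save_new_config` are hypothetical.

```python
import csv
import io
from pathlib import Path
from typing import List


def parse_rows(csv_text: str) -> List[List[str]]:
    """Parse CSV text; quoted fields may contain commas and doubled quotes."""
    # newline="" leaves line endings untouched so the csv module handles them.
    return list(csv.reader(io.StringIO(csv_text, newline="")))


def save_new_config(path: Path, yaml_text: str) -> None:
    """Write a new config using exclusive-create mode.

    Mode 'x' raises FileExistsError if the file already exists, so there is
    no window between an existence check and the write.
    """
    with open(path, "x", encoding="utf-8") as handle:
        handle.write(yaml_text)
```

For example, `parse_rows('site,ip,ports\n"Web, Prod",10.0.0.1,"80,443"\n')` yields a second row of three fields, with `Web, Prod` and `80,443` each kept intact.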
### High Risk:
- Breaking existing scan/schedule workflows
  - **Mitigation:** Extensive integration tests, no changes to existing validation
- Security vulnerabilities (path traversal, code injection)
  - **Mitigation:** Use secure_filename(), yaml.safe_load(), input validation
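
As a complement to `secure_filename()`, a containment check can confirm that a resolved path stays inside the configs directory. The sketch below uses only the standard library. `resolve_config_path` is a hypothetical helper and the extension whitelist is an assumption, not something this plan specifies.

```python
import os
from pathlib import Path

ALLOWED_EXTENSIONS = {".yaml", ".yml"}  # assumed whitelist


def resolve_config_path(configs_dir: Path, filename: str) -> Path:
    """Return the absolute path for `filename`, rejecting anything outside configs_dir."""
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        raise ValueError(f"Unsupported extension: {filename}")
    base = configs_dir.resolve()
    # resolve() collapses '..' segments and follows symlinks before the comparison.
    candidate = (base / filename).resolve()
    # The common path equals base only when candidate lies inside base.
    if os.path.commonpath([str(base), str(candidate)]) != str(base):
        raise ValueError(f"Path escapes configs directory: {filename}")
    return candidate
```

A name such as `../secret.yaml` or an absolute path such as `/etc/passwd.yaml` raises `ValueError`, while `prod.yaml` resolves to a path under the configs directory.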
### Contingency:
- If CSV parsing proves too complex: Start with YAML upload only, add CSV in Phase 4.5
- If the schedule check on deletion is too slow: Cache schedule-config mappings
- If file-based config management becomes a bottleneck: Migrate to database in Phase 5
---
## Deployment Notes
### Requirements:
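- No new Python dependencies (`csv`, `os`, and `io` are in the standard library; `yaml` comes from the third-party PyYAML package, which should already be installed because the scanner loads YAML configs)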
- No new system dependencies
- No database migrations
### Configuration:
Add to `.env` (optional):
```bash
# Config upload limits
MAX_CONFIG_SIZE_MB=2
MAX_CSV_ROWS=1000
CONFIGS_DIR=/app/configs
```
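
One way the application could read these optional settings, falling back to the values shown above when a variable is unset. `load_upload_limits` and the shape of the returned dict are hypothetical; the variable names and defaults come from the `.env` example.

```python
import os
from typing import Mapping, Optional


def load_upload_limits(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read config upload limits, using the .env example values as defaults."""
    # Accept an explicit mapping so the function can be tested without os.environ.
    env = os.environ if env is None else env
    max_size_mb = int(env.get("MAX_CONFIG_SIZE_MB", "2"))
    return {
        "max_config_bytes": max_size_mb * 1024 * 1024,
        "max_csv_rows": int(env.get("MAX_CSV_ROWS", "1000")),
        "configs_dir": env.get("CONFIGS_DIR", "/app/configs"),
    }
```

The byte limit could, for instance, be assigned to Flask's `MAX_CONTENT_LENGTH` setting so oversized uploads are rejected before they reach the parser.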
### Deployment Steps:
1. Pull latest code
2. Restart Flask app: `docker-compose -f docker-compose-web.yml restart web`
3. Verify `/api/configs/template` endpoint works
4. Test CSV upload with template
### Rollback Plan:
- No database changes, so rollback is safe
- Revert code changes and restart
- Configs created during Phase 4 remain valid (files are backward compatible)
---
## References
### Related Documentation:
- [Phase 2 Complete](PHASE2_COMPLETE.md) - REST API patterns, authentication
- [API Reference](API_REFERENCE.md) - Existing API structure
- [Deployment Guide](DEPLOYMENT.md) - Production deployment
### External Resources:
- [Python csv module docs](https://docs.python.org/3/library/csv.html)
- [PyYAML documentation](https://pyyaml.org/wiki/PyYAMLDocumentation)
- [Flask file upload guide](https://flask.palletsprojects.com/en/3.0.x/patterns/fileuploads/)
- [Werkzeug secure_filename](https://werkzeug.palletsprojects.com/en/3.0.x/utils/#werkzeug.utils.secure_filename)
### Code References:
- `src/scanner.py:42-54` - Config loading and validation
- `web/utils/validators.py:14-85` - Existing validation patterns
- `web/services/scan_service.py` - Scan trigger with config validation
- `web/api/scans.py` - API endpoint patterns
---
## Changelog
| Date | Version | Changes |
|------|---------|---------|
| 2025-11-17 | 1.0 | Initial Phase 4 plan created based on user requirements and design decisions |
---
**Last Updated:** 2025-11-17
**Next Review:** After Phase 4 implementation complete
**Approval Status:** Ready for Implementation
---
**Phase 4 Goal:** Enable non-technical users to create scan configurations via CSV upload, eliminating the need for manual YAML editing and server file access.