Files
SneakyScan/web/api/scans.py
Phillip Tarrant ee0c5a2c3c Phase 2 Step 3: Implement Background Job Queue
Implemented APScheduler integration for background scan execution,
enabling async job processing without blocking HTTP requests.

## Changes

### Background Jobs (web/jobs/)
- scan_job.py - Execute scans in background threads
  - execute_scan() with isolated database sessions
  - Comprehensive error handling and logging
  - Scan status lifecycle tracking
  - Timing and error message storage

### Scheduler Service (web/services/scheduler_service.py)
- SchedulerService class for job management
- APScheduler BackgroundScheduler integration
- ThreadPoolExecutor for concurrent jobs (max 3 workers)
- queue_scan() - Immediate job execution
- Job monitoring: list_jobs(), get_job_status()
- Graceful shutdown handling

### Flask Integration (web/app.py)
- init_scheduler() function
- Scheduler initialization in app factory
- Stored scheduler in app context (app.scheduler)

### Database Schema (migration 003)
- Added scan timing fields:
  - started_at - Scan execution start time
  - completed_at - Scan execution completion time
  - error_message - Error details for failed scans

### Service Layer Updates (web/services/scan_service.py)
- trigger_scan() accepts scheduler parameter
- Queues background jobs after creating scan record
- get_scan_status() includes new timing and error fields
- _save_scan_to_db() sets completed_at timestamp

### API Updates (web/api/scans.py)
- POST /api/scans passes scheduler to trigger_scan()
- Scans now execute in background automatically

### Model Updates (web/models.py)
- Added started_at, completed_at, error_message to Scan model

### Testing (tests/test_background_jobs.py)
- 13 unit tests for background job execution
- Scheduler initialization and configuration tests
- Job queuing and status tracking tests
- Scan timing field tests
- Error handling and storage tests
- Integration test for full workflow (skipped by default)

## Features

- Async scan execution without blocking HTTP requests
- Concurrent scan support (configurable max workers)
- Isolated database sessions per background thread
- Scan lifecycle tracking: created → running → completed/failed
- Error messages captured and stored in database
- Job monitoring and management capabilities
- Graceful shutdown waits for running jobs

## Implementation Notes

- Scanner runs in subprocess from background thread
- Docker provides necessary privileges (--privileged, --network host)
- Each job gets isolated SQLAlchemy session (avoid locking)
- Job IDs follow pattern: scan_{scan_id}
- Background jobs survive across requests
- Failed jobs store error messages in database

## Documentation (docs/ai/PHASE2.md)
- Updated progress: 6/14 days complete (43%)
- Marked Step 3 as complete
- Added detailed implementation notes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-14 09:24:00 -06:00

301 lines
8.8 KiB
Python

"""
Scans API blueprint.
Handles endpoints for triggering scans, listing scan history, and retrieving
scan results.
"""
import logging
from flask import Blueprint, current_app, jsonify, request
from sqlalchemy.exc import SQLAlchemyError
from web.services.scan_service import ScanService
from web.utils.validators import validate_config_file, validate_page_params
bp = Blueprint('scans', __name__)
logger = logging.getLogger(__name__)
@bp.route('', methods=['GET'])
def list_scans():
"""
List all scans with pagination.
Query params:
page: Page number (default: 1)
per_page: Items per page (default: 20, max: 100)
status: Filter by status (running, completed, failed)
Returns:
JSON response with scans list and pagination info
"""
try:
# Get and validate query parameters
page = request.args.get('page', 1, type=int)
per_page = request.args.get('per_page', 20, type=int)
status_filter = request.args.get('status', None, type=str)
# Validate pagination params
page, per_page = validate_page_params(page, per_page)
# Get scans from service
scan_service = ScanService(current_app.db_session)
paginated_result = scan_service.list_scans(
page=page,
per_page=per_page,
status_filter=status_filter
)
logger.info(f"Listed scans: page={page}, per_page={per_page}, status={status_filter}, total={paginated_result.total}")
return jsonify({
'scans': paginated_result.items,
'total': paginated_result.total,
'page': paginated_result.page,
'per_page': paginated_result.per_page,
'total_pages': paginated_result.total_pages,
'has_prev': paginated_result.has_prev,
'has_next': paginated_result.has_next
})
except ValueError as e:
logger.warning(f"Invalid request parameters: {str(e)}")
return jsonify({
'error': 'Invalid request',
'message': str(e)
}), 400
except SQLAlchemyError as e:
logger.error(f"Database error listing scans: {str(e)}")
return jsonify({
'error': 'Database error',
'message': 'Failed to retrieve scans'
}), 500
except Exception as e:
logger.error(f"Unexpected error listing scans: {str(e)}", exc_info=True)
return jsonify({
'error': 'Internal server error',
'message': 'An unexpected error occurred'
}), 500
@bp.route('/<int:scan_id>', methods=['GET'])
def get_scan(scan_id):
"""
Get details for a specific scan.
Args:
scan_id: Scan ID
Returns:
JSON response with scan details
"""
try:
# Get scan from service
scan_service = ScanService(current_app.db_session)
scan = scan_service.get_scan(scan_id)
if not scan:
logger.warning(f"Scan not found: {scan_id}")
return jsonify({
'error': 'Not found',
'message': f'Scan with ID {scan_id} not found'
}), 404
logger.info(f"Retrieved scan details: {scan_id}")
return jsonify(scan)
except SQLAlchemyError as e:
logger.error(f"Database error retrieving scan {scan_id}: {str(e)}")
return jsonify({
'error': 'Database error',
'message': 'Failed to retrieve scan'
}), 500
except Exception as e:
logger.error(f"Unexpected error retrieving scan {scan_id}: {str(e)}", exc_info=True)
return jsonify({
'error': 'Internal server error',
'message': 'An unexpected error occurred'
}), 500
@bp.route('', methods=['POST'])
def trigger_scan():
"""
Trigger a new scan.
Request body:
config_file: Path to YAML config file
Returns:
JSON response with scan_id and status
"""
try:
# Get request data
data = request.get_json() or {}
config_file = data.get('config_file')
# Validate required fields
if not config_file:
logger.warning("Scan trigger request missing config_file")
return jsonify({
'error': 'Invalid request',
'message': 'config_file is required'
}), 400
# Trigger scan via service
scan_service = ScanService(current_app.db_session)
scan_id = scan_service.trigger_scan(
config_file=config_file,
triggered_by='api',
scheduler=current_app.scheduler
)
logger.info(f"Scan {scan_id} triggered via API: config={config_file}")
return jsonify({
'scan_id': scan_id,
'status': 'running',
'message': 'Scan queued successfully'
}), 201
except ValueError as e:
# Config file validation error
logger.warning(f"Invalid config file: {str(e)}")
return jsonify({
'error': 'Invalid request',
'message': str(e)
}), 400
except SQLAlchemyError as e:
logger.error(f"Database error triggering scan: {str(e)}")
return jsonify({
'error': 'Database error',
'message': 'Failed to create scan'
}), 500
except Exception as e:
logger.error(f"Unexpected error triggering scan: {str(e)}", exc_info=True)
return jsonify({
'error': 'Internal server error',
'message': 'An unexpected error occurred'
}), 500
@bp.route('/<int:scan_id>', methods=['DELETE'])
def delete_scan(scan_id):
"""
Delete a scan and its associated files.
Args:
scan_id: Scan ID to delete
Returns:
JSON response with deletion status
"""
try:
# Delete scan via service
scan_service = ScanService(current_app.db_session)
scan_service.delete_scan(scan_id)
logger.info(f"Scan {scan_id} deleted successfully")
return jsonify({
'scan_id': scan_id,
'message': 'Scan deleted successfully'
}), 200
except ValueError as e:
# Scan not found
logger.warning(f"Scan deletion failed: {str(e)}")
return jsonify({
'error': 'Not found',
'message': str(e)
}), 404
except SQLAlchemyError as e:
logger.error(f"Database error deleting scan {scan_id}: {str(e)}")
return jsonify({
'error': 'Database error',
'message': 'Failed to delete scan'
}), 500
except Exception as e:
logger.error(f"Unexpected error deleting scan {scan_id}: {str(e)}", exc_info=True)
return jsonify({
'error': 'Internal server error',
'message': 'An unexpected error occurred'
}), 500
@bp.route('/<int:scan_id>/status', methods=['GET'])
def get_scan_status(scan_id):
"""
Get current status of a running scan.
Args:
scan_id: Scan ID
Returns:
JSON response with scan status and progress
"""
try:
# Get scan status from service
scan_service = ScanService(current_app.db_session)
status = scan_service.get_scan_status(scan_id)
if not status:
logger.warning(f"Scan not found for status check: {scan_id}")
return jsonify({
'error': 'Not found',
'message': f'Scan with ID {scan_id} not found'
}), 404
logger.debug(f"Retrieved status for scan {scan_id}: {status['status']}")
return jsonify(status)
except SQLAlchemyError as e:
logger.error(f"Database error retrieving scan status {scan_id}: {str(e)}")
return jsonify({
'error': 'Database error',
'message': 'Failed to retrieve scan status'
}), 500
except Exception as e:
logger.error(f"Unexpected error retrieving scan status {scan_id}: {str(e)}", exc_info=True)
return jsonify({
'error': 'Internal server error',
'message': 'An unexpected error occurred'
}), 500
@bp.route('/<int:scan_id1>/compare/<int:scan_id2>', methods=['GET'])
def compare_scans(scan_id1, scan_id2):
"""
Compare two scans and show differences.
Args:
scan_id1: First scan ID
scan_id2: Second scan ID
Returns:
JSON response with comparison results
"""
# TODO: Implement in Phase 4
return jsonify({
'scan_id1': scan_id1,
'scan_id2': scan_id2,
'diff': {},
'message': 'Scan comparison endpoint - to be implemented in Phase 4'
})
# Health check endpoint
@bp.route('/health', methods=['GET'])
def health_check():
"""
Health check endpoint for monitoring.
Returns:
JSON response with API health status
"""
return jsonify({
'status': 'healthy',
'api': 'scans',
'version': '1.0.0-phase1'
})