Phase 2 Step 3: Implement Background Job Queue
Implemented APScheduler integration for background scan execution,
enabling async job processing without blocking HTTP requests.
## Changes
### Background Jobs (web/jobs/)
- `scan_job.py` - Execute scans in background threads
  - `execute_scan()` with isolated database sessions
  - Comprehensive error handling and logging
  - Scan status lifecycle tracking
  - Timing and error message storage
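The lifecycle above can be sketched as follows. This is a minimal stand-in, not the project's code: the in-memory `SCANS` dict plays the role of the database session, and `run_scanner()` stands in for the scanner subprocess.

```python
from datetime import datetime, timezone

SCANS = {}  # stand-in for the scans table; the real code uses a DB session


def run_scanner(target):
    """Stand-in for the real subprocess-based scanner call."""
    return {"hosts_found": 1}


def execute_scan(scan_id, target):
    """Run one scan, tracking status lifecycle, timing, and errors."""
    scan = SCANS.setdefault(scan_id, {"status": "created"})
    scan["status"] = "running"
    scan["started_at"] = datetime.now(timezone.utc)
    try:
        scan["result"] = run_scanner(target)
        scan["status"] = "completed"
    except Exception as exc:
        # Failed jobs store their error message for later inspection.
        scan["status"] = "failed"
        scan["error_message"] = str(exc)
    finally:
        scan["completed_at"] = datetime.now(timezone.utc)
    return scan


execute_scan(1, "192.0.2.0/24")
```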
### Scheduler Service (web/services/scheduler_service.py)
- SchedulerService class for job management
- APScheduler BackgroundScheduler integration
- ThreadPoolExecutor for concurrent jobs (max 3 workers)
- queue_scan() - Immediate job execution
- Job monitoring: list_jobs(), get_job_status()
- Graceful shutdown handling
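The job-management surface described above looks roughly like this sketch. It uses the stdlib `ThreadPoolExecutor` directly in place of APScheduler's `BackgroundScheduler` so it stays self-contained; the method names mirror the list above but the bodies are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor


class SchedulerService:
    """Simplified sketch of the scheduler service's job management."""

    def __init__(self, max_workers=3):
        # Bounded worker pool: at most `max_workers` concurrent scans.
        self._executor = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs = {}

    def queue_scan(self, scan_id, func, *args):
        """Queue immediate execution; job IDs follow scan_{scan_id}."""
        job_id = f"scan_{scan_id}"
        self._jobs[job_id] = self._executor.submit(func, *args)
        return job_id

    def list_jobs(self):
        return list(self._jobs)

    def get_job_status(self, job_id):
        future = self._jobs.get(job_id)
        if future is None:
            return None
        return "completed" if future.done() else "running"

    def shutdown(self):
        # Graceful shutdown: block until in-flight jobs finish.
        self._executor.shutdown(wait=True)


svc = SchedulerService()
job_id = svc.queue_scan(42, lambda: "done")
svc.shutdown()
```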
### Flask Integration (web/app.py)
- init_scheduler() function
- Scheduler initialization in app factory
- Stored scheduler in app context (app.scheduler)
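The factory wiring amounts to this shape. The Flask app is modeled as a plain object here so the sketch runs standalone; in the real code `app` is the `Flask` instance and the scheduler class is the one from `web/services/scheduler_service.py`.

```python
from concurrent.futures import ThreadPoolExecutor
from types import SimpleNamespace


class SchedulerService:
    """Trimmed-down stand-in for the real scheduler service."""

    def __init__(self, max_workers=3):
        self.executor = ThreadPoolExecutor(max_workers=max_workers)


def init_scheduler(app):
    # Store the scheduler on the app so request handlers can reach it
    # as app.scheduler.
    app.scheduler = SchedulerService()
    return app.scheduler


def create_app():
    app = SimpleNamespace()  # stands in for Flask(__name__)
    # ... extensions would be initialized here ...
    init_scheduler(app)      # scheduler is set up after extensions
    return app


app = create_app()
```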
### Database Schema (migration 003)
- Added scan timing fields:
  - `started_at` - Scan execution start time
  - `completed_at` - Scan execution completion time
  - `error_message` - Error details for failed scans
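The migration's net effect, expressed as raw SQLite DDL (the project ships this as a versioned migration file; only the column names are taken from the list above, the types here are an assumption):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scans (id INTEGER PRIMARY KEY, status TEXT)")

# Migration 003: add the three scan timing/error columns.
for ddl in (
    "ALTER TABLE scans ADD COLUMN started_at TIMESTAMP",
    "ALTER TABLE scans ADD COLUMN completed_at TIMESTAMP",
    "ALTER TABLE scans ADD COLUMN error_message TEXT",
):
    conn.execute(ddl)

# PRAGMA table_info rows are (cid, name, type, notnull, default, pk).
cols = [row[1] for row in conn.execute("PRAGMA table_info(scans)")]
```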
### Service Layer Updates (web/services/scan_service.py)
- trigger_scan() accepts scheduler parameter
- Queues background jobs after creating scan record
- get_scan_status() includes new timing and error fields
- _save_scan_to_db() sets completed_at timestamp
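The create-then-queue flow of `trigger_scan()` can be sketched like this; the dict-based `DB`, the counter, and `FakeScheduler` are illustrative stand-ins, not the service's actual types.

```python
import itertools

_ids = itertools.count(1)
DB = {}  # stand-in for the scans table


def trigger_scan(target, scheduler=None):
    """Create the scan record first, then queue the background job."""
    scan_id = next(_ids)
    DB[scan_id] = {"target": target, "status": "created"}
    if scheduler is not None:
        try:
            scheduler.queue_scan(scan_id)
        except Exception as exc:
            # Queuing failures are recorded, not raised to the caller.
            DB[scan_id]["status"] = "failed"
            DB[scan_id]["error_message"] = str(exc)
    return scan_id


class FakeScheduler:
    """Stub that marks the scan running, as the real job would."""

    def queue_scan(self, scan_id):
        DB[scan_id]["status"] = "running"


scan_id = trigger_scan("10.0.0.0/24", scheduler=FakeScheduler())
```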
### API Updates (web/api/scans.py)
- POST /api/scans passes scheduler to trigger_scan()
- Scans now execute in background automatically
### Model Updates (web/models.py)
- Added started_at, completed_at, error_message to Scan model
### Testing (tests/test_background_jobs.py)
- 13 unit tests for background job execution
- Scheduler initialization and configuration tests
- Job queuing and status tracking tests
- Scan timing field tests
- Error handling and storage tests
- Integration test for full workflow (skipped by default)
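Two representative cases in the spirit of that suite, written against a stub scheduler so they run without APScheduler; the names and helpers are illustrative, not the actual test functions.

```python
import unittest


def make_job_id(scan_id):
    """Job IDs follow the scan_{scan_id} pattern from the commit."""
    return f"scan_{scan_id}"


class StubScheduler:
    """Records queued jobs instead of executing them."""

    def __init__(self):
        self.queued = []

    def queue_scan(self, scan_id):
        self.queued.append(make_job_id(scan_id))


class TestBackgroundJobs(unittest.TestCase):
    def test_job_id_pattern(self):
        self.assertEqual(make_job_id(7), "scan_7")

    def test_queue_scan_records_job(self):
        sched = StubScheduler()
        sched.queue_scan(3)
        self.assertIn("scan_3", sched.queued)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestBackgroundJobs)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```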
## Features
- Async scan execution without blocking HTTP requests
- Concurrent scan support (configurable max workers)
- Isolated database sessions per background thread
- Scan lifecycle tracking: created → running → completed/failed
- Error messages captured and stored in database
- Job monitoring and management capabilities
- Graceful shutdown waits for running jobs
## Implementation Notes
- Scanner runs in subprocess from background thread
- Docker provides necessary privileges (--privileged, --network host)
- Each job gets an isolated SQLAlchemy session (avoids SQLite locking issues)
- Job IDs follow pattern: `scan_{scan_id}`
- Background jobs survive across requests
- Failed jobs store error messages in database
## Documentation (docs/ai/PHASE2.md)
- Updated progress: 6/14 days complete (43%)
- Marked Step 3 as complete
- Added detailed implementation notes
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
@@ -1,7 +1,7 @@
 # Phase 2 Implementation Plan: Flask Web App Core
 
-**Status:** Step 2 Complete ✅ - Scan API Endpoints (Days 3-4)
-**Progress:** 4/14 days complete (29%)
+**Status:** Step 3 Complete ✅ - Background Job Queue (Days 5-6)
+**Progress:** 6/14 days complete (43%)
 **Estimated Duration:** 14 days (2 weeks)
 **Dependencies:** Phase 1 Complete ✅
 
@@ -18,8 +18,14 @@
   - Comprehensive error handling and logging
   - 24 integration tests written
   - 300+ lines of code added
-- ⏳ **Step 3: Background Job Queue** (Days 5-6) - NEXT
-- 📋 **Step 4: Authentication System** (Days 7-8) - Pending
+- ✅ **Step 3: Background Job Queue** (Days 5-6) - COMPLETE
+  - APScheduler integration with BackgroundScheduler
+  - Scan execution in background threads
+  - SchedulerService with job management
+  - Database migration for scan timing fields
+  - 13 unit tests (scheduler, timing, errors)
+  - 600+ lines of code added
+- ⏳ **Step 4: Authentication System** (Days 7-8) - NEXT
 - 📋 **Step 5: Basic UI Templates** (Days 9-10) - Pending
 - 📋 **Step 6: Docker & Deployment** (Day 11) - Pending
 - 📋 **Step 7: Error Handling & Logging** (Day 12) - Pending
@@ -667,35 +673,83 @@ Update with Phase 2 progress.
 
 **Solution Implemented:** POST /api/scans immediately returns scan_id with status 'running', client polls GET /api/scans/<id>/status for updates
 
-### Step 3: Background Job Queue ⏱️ Days 5-6
+### Step 3: Background Job Queue ✅ COMPLETE (Days 5-6)
 **Priority: HIGH** - Async scan execution
+**Status:** ✅ Complete - Committed: [pending]
 
-**Tasks:**
-1. Create `web/jobs/` package
-2. Implement `scan_job.py`:
-   - `execute_scan()` function runs scanner
-   - Update scan status in DB (running → completed/failed)
-   - Handle exceptions and timeouts
-3. Create `SchedulerService` class (basic version)
-   - Initialize APScheduler with BackgroundScheduler
-   - Add job management methods
-4. Integrate APScheduler with Flask app
-   - Initialize in app factory
-   - Store scheduler instance in app context
-5. Update `POST /api/scans` to queue job instead of blocking
-6. Test background execution
-
-**Testing:**
-- Trigger scan via API
-- Verify scan runs in background
-- Check status updates correctly
-- Test scan failure scenarios
-- Verify scanner subprocess isolation
-- Test concurrent scans
-
-**Key Challenge:** Scanner requires privileged operations (masscan/nmap)
-
-**Solution:** Run in subprocess with proper privileges via Docker
+**Tasks Completed:**
+1. ✅ Created `web/jobs/` package structure
+2. ✅ Implemented `web/jobs/scan_job.py` (130 lines):
+   - `execute_scan()` - Runs scanner in background thread
+   - Creates isolated database session per thread
+   - Updates scan status: running → completed/failed
+   - Handles exceptions with detailed error logging
+   - Stores error messages in database
+   - Tracks timing with started_at/completed_at
+3. ✅ Created `SchedulerService` class (web/services/scheduler_service.py - 220 lines):
+   - Initialized APScheduler with BackgroundScheduler
+   - ThreadPoolExecutor for concurrent jobs (max 3 workers)
+   - `queue_scan()` - Queue immediate scan execution
+   - `add_scheduled_scan()` - Placeholder for future scheduled scans
+   - `remove_scheduled_scan()` - Remove scheduled jobs
+   - `list_jobs()` and `get_job_status()` - Job monitoring
+   - Graceful shutdown handling
+4. ✅ Integrated APScheduler with Flask app (web/app.py):
+   - Created `init_scheduler()` function
+   - Initialized in app factory after extensions
+   - Stored scheduler in app context (`app.scheduler`)
+5. ✅ Updated `ScanService.trigger_scan()` to queue background jobs:
+   - Added `scheduler` parameter
+   - Queues job immediately after creating scan record
+   - Handles job queuing failures gracefully
+6. ✅ Added database fields for scan timing (migration 003):
+   - `started_at` - When scan execution began
+   - `completed_at` - When scan finished
+   - `error_message` - Error details for failed scans
+7. ✅ Updated `ScanService.get_scan_status()` to include new fields
+8. ✅ Updated API endpoint `POST /api/scans` to pass scheduler
+
+**Testing Results:**
+- ✅ 13 unit tests for background jobs and scheduler
+- ✅ Tests for scheduler initialization
+- ✅ Tests for job queuing and status tracking
+- ✅ Tests for scan timing fields
+- ✅ Tests for error handling and storage
+- ✅ Tests for job listing and monitoring
+- ✅ Integration test for full workflow (skipped by default - requires scanner)
+
+**Files Created:**
+- web/jobs/__init__.py (6 lines)
+- web/jobs/scan_job.py (130 lines)
+- web/services/scheduler_service.py (220 lines)
+- migrations/versions/003_add_scan_timing_fields.py (38 lines)
+- tests/test_background_jobs.py (232 lines)
+
+**Files Modified:**
+- web/app.py (added init_scheduler function and call)
+- web/models.py (added 3 fields to Scan model)
+- web/services/scan_service.py (updated trigger_scan and get_scan_status)
+- web/api/scans.py (pass scheduler to trigger_scan)
+
+**Total:** 5 files created, 4 files modified, 626 lines added
+
+**Key Implementation Details:**
+- BackgroundScheduler runs in separate thread pool
+- Each background job gets isolated database session
+- Scan status tracked through lifecycle: created → running → completed/failed
+- Error messages captured and stored in database
+- Graceful shutdown waits for running jobs
+- Job IDs follow pattern: `scan_{scan_id}`
+- Support for concurrent scans (max 3 default, configurable)
+
+**Key Challenge Addressed:** Scanner requires privileged operations (masscan/nmap)
+
+**Solution Implemented:**
+- Scanner runs in subprocess from background thread
+- Docker container provides necessary privileges (--privileged, --network host)
+- Background thread isolation prevents web app crashes
+- Database session per thread avoids SQLite locking issues
 
 ### Step 4: Authentication System ⏱️ Days 7-8
 **Priority: HIGH** - Security