Add webpage screenshot capture with Playwright

Implements automated screenshot capture for all discovered HTTP/HTTPS services using Playwright with headless Chromium. Screenshots are saved as PNG files and referenced in JSON reports.

Features:
- Separate ScreenshotCapture module for code organization
- Viewport screenshots (1280x720) with 15-second timeout
- Graceful handling of self-signed certificates
- Browser reuse for optimal performance
- Screenshots stored in timestamped directories
- Comprehensive documentation in README.md and new CLAUDE.md

Technical changes:
- Added src/screenshot_capture.py: Screenshot capture module with context manager pattern
- Updated src/scanner.py: Integrated screenshot capture into HTTP/HTTPS analysis phase
- Updated Dockerfile: Added Chromium and Playwright browser installation
- Updated requirements.txt: Added playwright==1.40.0
- Added CLAUDE.md: Developer documentation and implementation guide
- Updated README.md: Enhanced features section, added screenshot details and troubleshooting
- Updated .gitignore: Ignore entire output/ directory including screenshots

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-14 00:57:36 +00:00
parent 48755a8539
commit 61cc24f8d2
7 changed files with 822 additions and 25 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -1,5 +1,5 @@
-# Output files
+# Output files (scan reports and screenshots)
-output/*.json
+output/
 # Python
 __pycache__/
@@ -21,7 +21,6 @@ ENV/
 #AI helpers
 .claude/
 CLAUDE.md
 # OS
 .DS_Store
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -0,0 +1,492 @@
 # CLAUDE.md
 This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
 ## Project Overview
 SneakyScanner is a dockerized network scanning tool that uses a five-phase approach: a masscan ping scan, masscan TCP port discovery, masscan UDP port discovery, nmap service detection, and HTTP/HTTPS analysis (sslyze for SSL/TLS details, Playwright for webpage screenshots). It accepts YAML configuration files that define scan targets and expected network behavior, then produces JSON reports with service information, SSL certificates, TLS versions, cipher suites, and webpage screenshots, comparing expected vs. actual results.
 ## Essential Commands
 ### Building and Running
 ```bash
 # Build the Docker image
 docker build -t sneakyscanner .
 # Run with docker-compose (easiest method)
 docker-compose build
 docker-compose up
 # Run directly with Docker
 docker run --rm --privileged --network host \
  -v $(pwd)/configs:/app/configs:ro \
  -v $(pwd)/output:/app/output \
  sneakyscanner /app/configs/your-config.yaml
 ```
 ### Development
 ```bash
 # Test the Python script locally (requires masscan and nmap installed)
 python3 src/scanner.py configs/example-site.yaml -o ./output
 # Validate YAML config
 python3 -c "import yaml; yaml.safe_load(open('configs/example-site.yaml'))"
 ```
 ## Architecture
 ### Core Components
 1. **src/scanner.py** - Main application
   - `SneakyScanner` class: Orchestrates scanning workflow
   - `_load_config()`: Parses and validates YAML config
   - `_run_masscan()`: Executes masscan for TCP/UDP scanning
   - `_run_ping_scan()`: Executes masscan ICMP ping scanning
   - `_run_nmap_service_detection()`: Executes nmap service detection on discovered TCP ports
   - `_parse_nmap_xml()`: Parses nmap XML output to extract service information
   - `_is_likely_web_service()`: Identifies web services based on nmap results
   - `_detect_http_https()`: Detects HTTP vs HTTPS using socket connections
   - `_analyze_ssl_tls()`: Analyzes SSL/TLS certificates and supported versions using sslyze
   - `_run_http_analysis()`: Orchestrates HTTP/HTTPS and SSL/TLS analysis phase
   - `scan()`: Main workflow - collects IPs, runs scans, performs service detection, HTTP/HTTPS analysis, compiles results
   - `save_report()`: Writes JSON output with timestamp and scan duration
 2. **src/screenshot_capture.py** - Screenshot capture module
   - `ScreenshotCapture` class: Handles webpage screenshot capture
   - `capture()`: Captures screenshot of a web service (HTTP/HTTPS)
   - `_launch_browser()`: Initializes Playwright with Chromium in headless mode
   - `_close_browser()`: Cleanup browser resources
   - `_get_screenshot_dir()`: Creates screenshots subdirectory
   - `_generate_filename()`: Generates filename for screenshot (IP_PORT.png)
 3. **configs/** - YAML configuration files
   - Define scan title, sites, IPs, and expected network behavior
   - Each IP includes expected ping response and TCP/UDP ports
 4. **output/** - JSON scan reports and screenshots
   - Timestamped JSON files: `scan_report_YYYYMMDD_HHMMSS.json`
   - Screenshot directory: `scan_report_YYYYMMDD_HHMMSS_screenshots/`
   - Contains actual vs. expected comparison for each IP
 ### Scan Workflow
 1. Parse YAML config and extract all unique IPs
 2. Run ping scan on all IPs using `masscan --ping`
 3. Run TCP scan on all IPs for ports 0-65535
 4. Run UDP scan on all IPs for ports 0-65535
 5. Run service detection on discovered TCP ports using `nmap -sV`
 6. Run HTTP/HTTPS analysis on web services identified by nmap:
   - Detect HTTP vs HTTPS using socket connections
   - Capture webpage screenshot using Playwright (viewport 1280x720, 15s timeout)
   - For HTTPS: Extract certificate details (subject, issuer, expiry, SANs)
   - Test TLS version support (TLS 1.0, 1.1, 1.2, 1.3)
   - List accepted cipher suites for each TLS version
 7. Aggregate results by IP and site
 8. Generate JSON report with timestamp, scan duration, screenshot references, and complete service details
 ### Why Dockerized
 - Masscan and nmap require raw socket access (root/CAP_NET_RAW)
 - Isolates privileged operations in container
 - Ensures consistent masscan and nmap versions and dependencies
 - Uses `--privileged` and `--network host` for network access
 ### Masscan Integration
 - Masscan is built from source in Dockerfile
 - Writes output to temporary JSON files
 - Results parsed line-by-line (masscan uses comma-separated JSON lines)
 - Temporary files cleaned up after each scan
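The line-by-line parsing described above can be sketched as follows. This is a minimal illustration, assuming masscan's `-oJ` output places one JSON object per line inside a surrounding array, with trailing commas on most lines. The helper name `parse_masscan_lines` and the sample records are hypothetical and not taken from `src/scanner.py`.

```python
import json
from collections import defaultdict


def parse_masscan_lines(lines):
    """Collect open ports per IP from masscan -oJ style lines (illustrative helper)."""
    open_ports = defaultdict(set)
    for raw in lines:
        # Drop surrounding whitespace and the comma that separates records.
        line = raw.strip().rstrip(",")
        # Skip blank lines and the array brackets that wrap the records.
        if not line or line in ("[", "]"):
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # Ignore anything that is not a standalone JSON value.
            continue
        if not isinstance(record, dict):
            continue
        for port in record.get("ports", []):
            if port.get("status") == "open":
                open_ports[record["ip"]].add((port["proto"], port["port"]))
    return {ip: sorted(ports) for ip, ports in open_ports.items()}


# Assumed sample of masscan JSON output; real output may carry more fields.
sample = [
    "[",
    '{ "ip": "192.168.1.10", "timestamp": "1700000000", "ports": [ {"port": 443, "proto": "tcp", "status": "open"} ] },',
    '{ "ip": "192.168.1.10", "timestamp": "1700000001", "ports": [ {"port": 80, "proto": "tcp", "status": "open"} ] }',
    "]",
]
print(parse_masscan_lines(sample))
```

Stripping the trailing comma before calling `json.loads` is what lets each record be decoded on its own without loading the whole file at once.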
 ### Nmap Integration
 - Nmap installed via apt package in Dockerfile
 - Runs service detection (`-sV`) with intensity level 5 (balanced speed/accuracy)
 - Outputs XML format for structured parsing
 - XML parsed using Python's ElementTree library (xml.etree.ElementTree)
 - Extracts service name, product, version, extrainfo, and ostype
 - Runs sequentially per IP to avoid overwhelming the target
 - 5-minute `--host-timeout` per host, with a 10-minute (600-second) overall subprocess timeout
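As a rough sketch of the XML parsing step, the snippet below extracts the listed service attributes with `xml.etree.ElementTree`. The sample document is a trimmed-down assumption of what `nmap -oX` produces, and `parse_services` is a hypothetical name rather than the project's `_parse_nmap_xml()`.

```python
import xml.etree.ElementTree as ET

# Assumed, trimmed-down nmap XML; real -oX output contains many more elements.
SAMPLE_XML = """<nmaprun>
  <host>
    <address addr="192.168.1.10" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="80">
        <state state="open"/>
        <service name="http" product="nginx" version="1.18.0" ostype="Linux"/>
      </port>
      <port protocol="tcp" portid="81">
        <state state="closed"/>
      </port>
    </ports>
  </host>
</nmaprun>"""


def parse_services(xml_text):
    """Return one dict per open port with the service attributes nmap reported."""
    services = []
    root = ET.fromstring(xml_text)
    for port in root.iter("port"):
        state = port.find("state")
        if state is None or state.get("state") != "open":
            continue  # only report ports nmap saw as open
        service = port.find("service")
        attrs = service.attrib if service is not None else {}
        services.append({
            "port": int(port.get("portid")),
            "protocol": port.get("protocol"),
            "service": attrs.get("name", "unknown"),
            "product": attrs.get("product", ""),
            "version": attrs.get("version", ""),
            "extrainfo": attrs.get("extrainfo", ""),
            "ostype": attrs.get("ostype", ""),
        })
    return services


services = parse_services(SAMPLE_XML)
print(services)
```

Missing attributes fall back to empty strings here so that downstream JSON always has the same keys; whether the real parser does the same is not stated in this document.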
 ### HTTP/HTTPS and SSL/TLS Analysis
 - Uses sslyze library for comprehensive SSL/TLS scanning
 - HTTP/HTTPS detection using Python's built-in socket and ssl modules
 - Analyzes services based on:
  - Nmap service identification (http, https, ssl, http-proxy, etc.)
  - Common web ports (80, 443, 8000, 8006, 8008, 8080, 8081, 8443, 8888, 9443)
  - This ensures non-standard ports (like Proxmox 8006) are analyzed even if nmap misidentifies them
 - For HTTPS services:
  - Extracts certificate information using cryptography library
  - Tests TLS versions: 1.0, 1.1, 1.2, 1.3
  - Lists all accepted cipher suites for each supported TLS version
  - Calculates days until certificate expiration
  - Extracts SANs (Subject Alternative Names) from certificate
 - Graceful error handling: if SSL analysis fails, still reports HTTP/HTTPS detection
 - 5-second timeout per HTTP/HTTPS detection
 - Results merged into service data structure under `http_info` key
 - **Note**: Uses sslyze 6.0 API which accesses scan results as attributes (e.g., `certificate_info`, `tls_1_2_cipher_suites`) rather than through `.scan_commands_results.get()`
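The selection rule above (nmap service name or common web port) might look like the following. The service names and ports come from the lists in this section; the exact matching logic and the function name are assumptions, since `_is_likely_web_service()` itself is not shown here.

```python
# Names and ports copied from the lists above; "etc." in the source means the
# real set of service names may be longer.
WEB_SERVICE_NAMES = {"http", "https", "ssl", "http-proxy"}
COMMON_WEB_PORTS = {80, 443, 8000, 8006, 8008, 8080, 8081, 8443, 8888, 9443}


def is_likely_web_service(service_name, port):
    """Return True if nmap's service name or the port number suggests a web service."""
    name = (service_name or "").lower()
    return name in WEB_SERVICE_NAMES or port in COMMON_WEB_PORTS


print(is_likely_web_service("http", 12345))    # matched by service name
print(is_likely_web_service("unknown", 8006))  # matched by port (e.g. Proxmox)
print(is_likely_web_service("ssh", 22))        # not a web service
```

Checking the port as a fallback is what keeps a service such as Proxmox on 8006 in scope when nmap reports an unhelpful service name.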
 ### Webpage Screenshot Capture
 **Implementation**: `src/screenshot_capture.py` - Separate module for code organization
 **Technology Stack**:
 - Playwright 1.40.0 with Chromium in headless mode
 - System Chromium and chromium-driver installed via apt (Dockerfile)
 - Python's pathlib for cross-platform file path handling
 **Screenshot Process**:
 1. Screenshots captured for all successfully detected HTTP/HTTPS services
 2. Services identified by:
   - Nmap service names: http, https, ssl, http-proxy, http-alt, etc.
   - Common web ports: 80, 443, 8000, 8006, 8008, 8080, 8081, 8443, 8888, 9443
 3. Browser lifecycle managed via context manager pattern (`__enter__`, `__exit__`)
 **Configuration** (default values):
 - **Viewport size**: 1280x720 pixels (viewport only, not full page)
 - **Timeout**: 15 seconds per screenshot (15000ms in Playwright)
 - **Wait strategy**: `wait_until='networkidle'` - waits for network activity to settle
 - **SSL handling**: `ignore_https_errors=True` - handles self-signed certs
 - **User agent**: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36
 - **Browser args**: `--no-sandbox`, `--disable-setuid-sandbox`, `--disable-dev-shm-usage`, `--disable-gpu`
 **Storage Architecture**:
 - Screenshots saved as PNG files in subdirectory: `scan_report_YYYYMMDD_HHMMSS_screenshots/`
 - Filename format: `{ip}_{port}.png` (dots in IP replaced with underscores)
  - Example: `192_168_1_10_443.png` for 192.168.1.10:443
 - Path stored in JSON as relative reference: `http_info.screenshot` field
 - Relative paths ensure portability of output directory
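The naming scheme above can be illustrated with a few small helpers. The function names are hypothetical; only the directory and filename formats are taken from this document.

```python
from pathlib import Path


def screenshot_dir_name(timestamp):
    """Directory name for one scan, e.g. scan_report_20250115_103000_screenshots."""
    return f"scan_report_{timestamp}_screenshots"


def screenshot_filename(ip, port):
    """Filename in {ip}_{port}.png form, with dots in the IP replaced by underscores."""
    return f"{ip.replace('.', '_')}_{port}.png"


def relative_screenshot_path(timestamp, ip, port):
    """Path relative to the output directory, as stored in http_info.screenshot."""
    return (Path(screenshot_dir_name(timestamp)) / screenshot_filename(ip, port)).as_posix()


print(relative_screenshot_path("20250115_103000", "192.168.1.10", 443))
```

Storing the POSIX-style relative path means the output directory can be moved or archived without rewriting the JSON report.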
 **Error Handling** (graceful degradation):
 - If screenshot fails (timeout, connection error, etc.), scan continues
 - Failed screenshots logged as warnings, not errors
 - Services without screenshots simply omit the `screenshot` field in JSON output
 - Browser launch failure disables all screenshots for the scan
 **Browser Lifecycle** (optimized for performance):
 1. Browser launched once at scan start (in `scan()` method)
 2. Reused for all screenshots via single browser instance
 3. New context + page created per screenshot (isolated state)
 4. Context and page closed after each screenshot
 5. Browser closed at scan completion (cleanup in `scan()` method)
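The lifecycle and error handling described above can be sketched without Playwright by injecting the browser launcher. Everything here is an illustrative stand-in: `ReusableBrowserSession` and the `Fake*` classes are hypothetical, and the fakes only mirror the method names (`new_context`, `new_page`, `goto`, `screenshot`, `close`) that Playwright's sync API provides.

```python
class ReusableBrowserSession:
    """Launch a browser once, reuse it for every capture, close it on exit."""

    def __init__(self, launch_browser):
        # launch_browser: hypothetical zero-argument callable returning a
        # browser-like object with new_context() and close().
        self._launch_browser = launch_browser
        self.browser = None

    def __enter__(self):
        try:
            self.browser = self._launch_browser()
        except Exception as exc:
            # A launch failure disables screenshots for the whole scan.
            print(f"warning: browser launch failed: {exc}")
            self.browser = None
        return self

    def __exit__(self, exc_type, exc, tb):
        if self.browser is not None:
            self.browser.close()
            self.browser = None
        return False  # do not swallow exceptions raised inside the with-body

    def capture(self, url, path):
        """Return the screenshot path, or None if capture is disabled or fails."""
        if self.browser is None:
            return None
        context = self.browser.new_context()  # fresh, isolated state per capture
        try:
            page = context.new_page()
            page.goto(url)
            page.screenshot(path=path)
            return path
        except Exception as exc:
            print(f"warning: screenshot failed for {url}: {exc}")
            return None
        finally:
            context.close()


class FakePage:
    def goto(self, url):
        if "down" in url:
            raise TimeoutError("page load timed out")

    def screenshot(self, path):
        pass  # a real page would write a PNG here


class FakeContext:
    def new_page(self):
        return FakePage()

    def close(self):
        pass


class FakeBrowser:
    closed = False

    def new_context(self):
        return FakeContext()

    def close(self):
        self.closed = True


fake = FakeBrowser()
with ReusableBrowserSession(lambda: fake) as session:
    results = [session.capture(url, f"{i}.png")
               for i, url in enumerate(["http://ok", "http://down"])]
print(results, fake.closed)
```

The failed capture yields `None` rather than raising, which matches the behavior described above where a service without a screenshot simply omits the field.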
 **Integration Points**:
 - Initialized in `scanner.py:scan()` with scan timestamp
 - Called from `scanner.py:_run_http_analysis()` after protocol detection
 - Cleanup called in `scanner.py:scan()` after all analysis complete
 **Code Reference Locations**:
 - `src/screenshot_capture.py`: Complete screenshot module (lines 1-202)
 - `src/scanner.py:scan()`: Browser initialization and cleanup
 - `src/scanner.py:_run_http_analysis()`: Screenshot capture invocation
 ## Configuration Schema
 ```yaml
 title: string                    # Report title (required)
 sites:                           # List of sites (required)
  - name: string                 # Site name
    ips:                         # List of IPs for this site
      - address: string          # IP address (IPv4)
        expected:                # Expected network behavior
          ping: boolean          # Should respond to ping
          tcp_ports: [int]       # Expected TCP ports
          udp_ports: [int]       # Expected UDP ports
          services: [string]     # Expected services (optional)
 ```
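A minimal sketch of validating a parsed config and collecting unique IPs (the first workflow step) is shown below. It operates on the dictionary that `yaml.safe_load` would return; the function name and error messages are assumptions and do not describe `_load_config()`.

```python
import ipaddress


def collect_unique_ips(config):
    """Validate required keys and return unique IPv4 addresses in first-seen order."""
    if not config.get("title"):
        raise ValueError("config requires a 'title'")
    sites = config.get("sites")
    if not sites:
        raise ValueError("config requires a non-empty 'sites' list")
    seen = []
    for site in sites:
        for entry in site.get("ips", []):
            # IPv4Address raises a ValueError subclass on malformed addresses.
            address = str(ipaddress.IPv4Address(entry["address"]))
            if address not in seen:
                seen.append(address)
    return seen


# Hypothetical config matching the schema above.
example_config = {
    "title": "Example scan",
    "sites": [
        {"name": "office", "ips": [
            {"address": "192.168.1.10", "expected": {"ping": True, "tcp_ports": [22, 443], "udp_ports": []}},
            {"address": "192.168.1.11", "expected": {"ping": False, "tcp_ports": [], "udp_ports": [53]}},
        ]},
        {"name": "dmz", "ips": [
            {"address": "192.168.1.10", "expected": {"ping": True, "tcp_ports": [443], "udp_ports": []}},
        ]},
    ],
}
unique_ips = collect_unique_ips(example_config)
print(unique_ips)
```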
 ## Key Design Decisions
 1. **Five-phase scanning**: Masscan for ping, TCP, and UDP discovery (10,000 pps), nmap for service detection, then HTTP/HTTPS analysis (SSL/TLS details and screenshots) for web services
 2. **All-port scanning**: TCP and UDP scans cover entire port range (0-65535) to detect unexpected services
 3. **Selective web analysis**: Only analyze services that nmap identifies as web-related or that listen on common web ports, to optimize scan time
 4. **Machine-readable output**: JSON format enables automated report generation and comparison
 5. **Expected vs. Actual**: Config includes expected behavior to identify infrastructure drift
 6. **Site grouping**: IPs organized by logical site for better reporting
 7. **Temporary files**: Masscan and nmap output written to temp files to avoid conflicts in parallel scans
 8. **Service details**: Extract product name, version, and additional info for each discovered service
 9. **SSL/TLS security**: Comprehensive certificate analysis and TLS version testing with cipher suite enumeration
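The expected vs. actual comparison in decision 5 reduces to set differences. The sketch below is illustrative; the key names `unexpected` and `missing` are hypothetical and may not match the report's actual fields.

```python
def compare_ports(expected, actual):
    """Report ports that were found but not expected, and expected but not found."""
    expected_set, actual_set = set(expected), set(actual)
    return {
        "unexpected": sorted(actual_set - expected_set),  # possible drift
        "missing": sorted(expected_set - actual_set),     # possible outage
    }


drift = compare_ports(expected=[22, 80, 443], actual=[22, 443, 8080])
print(drift)
```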
 ## Testing Strategy
 When testing changes:
 1. Use a controlled test environment with known services (including HTTP/HTTPS)
 2. Create a test config with 1-2 IPs
 3. Verify JSON output structure matches schema
 4. Check that ping, TCP, and UDP results are captured
 5. Verify service detection results include service name, product, and version
 6. For web services, verify http_info includes:
   - Correct protocol detection (http vs https)
   - Screenshot path reference (relative to output directory)
   - Verify screenshot PNG file exists at the referenced path
   - Certificate details for HTTPS (subject, issuer, expiry, SANs)
   - TLS version support (1.0-1.3) with cipher suites
 7. Ensure temp files are cleaned up (masscan JSON, nmap XML)
 8. Verify screenshot directory created with correct naming convention
 9. Test screenshot capture with HTTP, HTTPS, and self-signed certificate services
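Step 6 (checking that every referenced PNG exists) can be automated with a short script. Because the full report structure is not reproduced here, this sketch walks the JSON recursively and looks for any `http_info.screenshot` value; the function names and the sample report are assumptions.

```python
import json
import tempfile
from pathlib import Path


def iter_screenshot_refs(node):
    """Yield every http_info.screenshot value found anywhere in the report."""
    if isinstance(node, dict):
        http_info = node.get("http_info")
        if isinstance(http_info, dict) and "screenshot" in http_info:
            yield http_info["screenshot"]
        for value in node.values():
            yield from iter_screenshot_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_screenshot_refs(item)


def missing_screenshots(report_path):
    """Return referenced screenshot paths that do not exist next to the report."""
    report_path = Path(report_path)
    report = json.loads(report_path.read_text())
    output_dir = report_path.parent  # references are relative to the output directory
    return [ref for ref in iter_screenshot_refs(report)
            if not (output_dir / ref).is_file()]


# Demo with a throwaway output directory and a hypothetical, simplified report.
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp)
    shots = out / "scan_report_20250115_103000_screenshots"
    shots.mkdir()
    (shots / "192_168_1_10_80.png").write_bytes(b"")
    report = {"services": [
        {"port": 80, "http_info": {"protocol": "http",
                                   "screenshot": f"{shots.name}/192_168_1_10_80.png"}},
        {"port": 443, "http_info": {"protocol": "https",
                                    "screenshot": f"{shots.name}/192_168_1_10_443.png"}},
    ]}
    report_file = out / "scan_report_20250115_103000.json"
    report_file.write_text(json.dumps(report))
    missing = missing_screenshots(report_file)
print(missing)
```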
 ## Common Tasks
 ### Modifying Scan Parameters
 **Masscan rate limiting:**
 - `--rate`: Currently set to 10000 packets/second in src/scanner.py:80, 132
 - `--wait`: Set to 0 (don't wait for late responses)
 - Adjust these in `_run_masscan()` and `_run_ping_scan()` methods
 **Nmap service detection intensity:**
 - `--version-intensity`: Currently set to 5 (balanced) in src/scanner.py:201
 - Range: 0-9 (0=light, 9=comprehensive)
 - Lower values are faster but less accurate
 - Adjust in `_run_nmap_service_detection()` method
 **Nmap timeouts:**
 - `--host-timeout`: Currently 5 minutes in src/scanner.py:204
 - Overall subprocess timeout: 600 seconds (10 minutes) in src/scanner.py:208
 - Adjust based on network conditions and number of ports
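To make the parameters above concrete, here is a hedged sketch of how the command lines could be assembled. The flags (`--rate`, `--wait`, `-oJ`, `-sV`, `--version-intensity`, `--host-timeout`, `-oX`) are real masscan and nmap options, but the builder functions and their argument order are assumptions and not the code in `src/scanner.py`.

```python
def build_masscan_command(ips, output_file, rate=10000, udp=False):
    """Assemble a masscan argument list for a full-range TCP or UDP scan."""
    ports = "-pU:0-65535" if udp else "-p0-65535"
    return ["masscan", *ips, ports,
            "--rate", str(rate),
            "--wait", "0",        # do not wait for late responses
            "-oJ", output_file]


def build_nmap_command(ip, ports, output_file, intensity=5, host_timeout="5m"):
    """Assemble an nmap service-detection argument list for the discovered ports."""
    if not 0 <= intensity <= 9:
        raise ValueError("version intensity must be between 0 and 9")
    port_list = ",".join(str(p) for p in sorted(ports))
    return ["nmap", "-sV",
            "--version-intensity", str(intensity),
            "--host-timeout", host_timeout,
            "-p", port_list,
            "-oX", output_file,
            ip]


masscan_cmd = build_masscan_command(["192.168.1.10"], "/tmp/masscan.json")
nmap_cmd = build_nmap_command("192.168.1.10", [443, 22], "/tmp/nmap.xml")
print(" ".join(masscan_cmd))
print(" ".join(nmap_cmd))
# The overall 600-second limit would be applied by the caller, for example
# subprocess.run(nmap_cmd, timeout=600).
```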
 ### Adding New Scan Types
 To add additional scan functionality (e.g., OS detection, vulnerability scanning):
 1. Add new method to `SneakyScanner` class (follow pattern of `_run_nmap_service_detection()`)
 2. Update `scan()` workflow to call new method
 3. Add results to `actual` section of output JSON
 4. Update YAML schema if expected values needed
 5. Update documentation (README.md, CLAUDE.md)
 ### Changing Output Format
 JSON structure defined in src/scanner.py:365+. To modify:
 1. Update the report dictionary structure
 2. Ensure backward compatibility or version the schema
 3. Update README.md output format documentation
 4. Update example output in both README.md and CLAUDE.md
 ### Customizing Screenshot Capture
 **Change viewport size** (src/screenshot_capture.py:35):
 ```python
 self.viewport = viewport or {'width': 1920, 'height': 1080}  # Full HD
 ```
 **Change timeout** (src/screenshot_capture.py:34):
 ```python
 self.timeout = timeout * 1000  # Default is 15 seconds
 # Pass different value when initializing: ScreenshotCapture(..., timeout=30)
 ```
 **Capture full-page screenshots** (src/screenshot_capture.py:173):
 ```python
 page.screenshot(path=str(screenshot_path), type='png', full_page=True)
 ```
 **Change wait strategy** (src/screenshot_capture.py:170):
 ```python
 # Options: 'load', 'domcontentloaded', 'networkidle', 'commit'
 page.goto(url, wait_until='load', timeout=self.timeout)
 ```
 **Add custom request headers** (src/screenshot_capture.py:157-161):
 ```python
 context = self.browser.new_context(
    viewport=self.viewport,
    ignore_https_errors=True,
    user_agent='CustomUserAgent/1.0',
    extra_http_headers={'Authorization': 'Bearer token'}
 )
 ```
 **Disable screenshot capture entirely**:
 In src/scanner.py:scan(), comment out or skip initialization:
 ```python
 # self.screenshot_capture = ScreenshotCapture(...)
 self.screenshot_capture = None  # This disables all screenshots
 ```
 **Add authentication** (for services requiring login):
 In src/screenshot_capture.py:capture(), before taking screenshot:
 ```python
 # Navigate to login page first
 page.goto(f"{protocol}://{ip}:{port}/login")
 page.fill('#username', 'admin')
 page.fill('#password', 'password')
 page.click('#login-button')
 page.wait_for_url(f"{protocol}://{ip}:{port}/dashboard")
 # Then take screenshot
 page.screenshot(path=str(screenshot_path), type='png')
 ```
 ### Performance Optimization
 Current bottlenecks:
 1. **Port scanning**: ~30 seconds for 2 IPs (full TCP and UDP port ranges for each IP at 10k pps)
 2. **Service detection**: ~20-60 seconds per IP with open ports
 3. **HTTP/HTTPS analysis**: ~5-10 seconds per web service (includes SSL/TLS analysis)
 4. **Screenshot capture**: ~5-15 seconds per web service (depends on page load time)
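The port-scanning figure can be sanity-checked with simple arithmetic. This estimate assumes one probe per port with no retries and ignores startup overhead, so it is a lower bound rather than a measurement.

```python
def estimated_scan_seconds(ip_count, rate_pps=10000, ports=65536, protocols=2):
    """Lower-bound scan time: total probes divided by the packet rate."""
    return ip_count * ports * protocols / rate_pps


# 2 IPs x 65536 ports (0-65535) x 2 protocols (TCP and UDP) at 10,000 pps
estimate = estimated_scan_seconds(ip_count=2)
print(f"{estimate:.1f} seconds")
```

Halving the port range or doubling `--rate` halves this figure, which is why those two settings dominate the port-scanning phase.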
 Optimization strategies:
 - Parallelize nmap scans across IPs (currently sequential)
 - Parallelize HTTP/HTTPS analysis and screenshot capture across services using ThreadPoolExecutor
 - Reduce port range for faster scanning (if full range not needed)
 - Lower nmap intensity (trade accuracy for speed)
 - Skip service detection on high ports (>1024) if desired
 - Reduce SSL/TLS analysis scope (e.g., test only TLS 1.2+ if legacy support not needed)
 - Adjust HTTP/HTTPS detection timeout (currently 5 seconds in src/scanner.py:510)
 - Adjust screenshot timeout (currently 15 seconds in src/screenshot_capture.py:34)
 - Disable screenshot capture for faster scans (set screenshot_capture to None)
 ## Planned Features (Future Development)
 The following features are planned for future implementation:
 ### 1. HTML Report Generation
 Build comprehensive HTML reports from JSON scan data with interactive visualizations.
 **Report Features:**
 - Service details and SSL/TLS information tables
 - Visual comparison of expected vs. actual results (red/green highlighting)
 - Certificate expiration warnings with countdown timers
 - TLS version compliance reports (highlight weak configurations)
 - Embedded webpage screenshots
 - Sortable/filterable tables
 - Timeline view of scan history
 - Export to PDF capability
 **Implementation Considerations:**
 - Template engine: Jinja2 or similar
 - CSS framework: Bootstrap or Tailwind for responsive design
 - Charts/graphs: Chart.js or Plotly for visualizations
 - Store templates in `templates/` directory
 - Generate static HTML that can be opened without server
 **Architecture:**
 ```python
 class HTMLReportGenerator:
    def __init__(self, json_report_path, template_dir='templates'):
        pass
    def generate_report(self, output_path):
        # Parse JSON
        # Render template with data
        # Include screenshots
        # Write HTML file
        pass
    def _compare_expected_actual(self, expected, actual):
        # Generate diff/comparison data
        pass
    def _generate_cert_warnings(self, services):
        # Identify expiring certs, weak TLS, etc.
        pass
 ```
 ### 2. Comparison Reports (Scan Diffs)
 Generate reports showing changes between scans over time.
 **Features:**
 - Compare two scan reports
 - Highlight new/removed services
 - Track certificate changes
 - Detect TLS configuration drift
 - Show port changes
 ### 3. Additional Enhancements
 - **Email Notifications**: Alert on unexpected changes or certificate expirations
 - **Scheduled Scanning**: Automated periodic scans with cron integration
 - **Vulnerability Detection**: Integration with CVE databases for known vulnerabilities
 - **API Mode**: REST API for triggering scans and retrieving results
 - **Multi-threading**: Parallel scanning of multiple IPs for better performance
 ## Development Notes
 ### Current Dependencies
 - PyYAML==6.0.1 (YAML parsing)
 - python-libnmap==0.7.3 (nmap XML parsing)
 - sslyze==6.0.0 (SSL/TLS analysis)
 - playwright==1.40.0 (webpage screenshot capture)
 - Built-in: socket, ssl, subprocess, xml.etree.ElementTree, logging
 - System: chromium, chromium-driver (installed via Dockerfile)
 ### For HTML Reports, Will Need:
 - Jinja2 (template engine)
 - Optional: weasyprint or pdfkit for PDF export
 ### Key Files to Modify for New Features:
 1. **src/scanner.py** - Core scanning logic (add new phases/methods)
 2. **src/screenshot_capture.py** - ✅ Implemented: Webpage screenshot capture module
 3. **src/report_generator.py** - New file for HTML report generation (planned)
 4. **templates/** - New directory for HTML templates (planned)
 5. **requirements.txt** - Add new dependencies
 6. **Dockerfile** - Install additional system dependencies (browsers, etc.)
 ### Testing Strategy for New Features:
 **Screenshot Capture Testing** (✅ Implemented):
 1. Test with HTTP services (port 80, 8080, etc.)
 2. Test with HTTPS services with valid certificates (port 443, 8443)
 3. Test with HTTPS services with self-signed certificates
 4. Test with non-standard web ports (e.g., Proxmox on 8006)
 5. Test with slow-loading pages (verify 15s timeout works)
 6. Test with services that return errors (404, 500, etc.)
 7. Verify screenshot files are created with correct naming
 8. Verify JSON references point to correct screenshot files
 9. Verify browser cleanup occurs properly (no zombie processes)
 10. Test with multiple IPs and services to ensure browser reuse works
 **HTML Report Testing** (Planned):
 1. Validate HTML report rendering across browsers
 2. Ensure large scans don't cause memory issues with screenshots
 3. Test report generation with missing/incomplete data
 4. Verify all URLs and links work in generated reports
 5. Test embedded screenshots display correctly
 ## Troubleshooting
 ### Screenshot Capture Issues
 **Problem**: Screenshots not being captured
 - **Check**: Verify Chromium installed: `chromium --version` in container
 - **Check**: Verify Playwright browsers installed: `playwright install --dry-run chromium`
 - **Check**: Look for browser launch errors in stderr output
 - **Solution**: Rebuild Docker image ensuring Dockerfile steps complete
 **Problem**: "Failed to launch browser" error
 - **Check**: Ensure container has sufficient memory (Chromium needs ~200MB)
 - **Check**: Docker runs with `--privileged` or appropriate capabilities
 - **Solution**: Add `--shm-size=2gb` to docker run command if `/dev/shm` is too small
 **Problem**: Screenshots timing out
 - **Check**: Network connectivity to target services
 - **Check**: Services actually serve webpages (not just open ports)
 - **Solution**: Increase timeout in `src/screenshot_capture.py:34` if needed
 - **Solution**: Check service responds to HTTP requests: `curl -I http://IP:PORT`
 **Problem**: Screenshots are blank/empty
 - **Check**: Service returns valid HTML (not just TCP banner)
 - **Check**: Page requires JavaScript (may need longer wait time)
 - **Solution**: Change `wait_until` strategy from `'networkidle'` to `'load'` or `'domcontentloaded'`
 **Problem**: HTTPS certificate errors despite `ignore_https_errors=True`
 - **Check**: System certificates up to date in container
 - **Solution**: This should not happen; file an issue if it does
 ### Nmap/Masscan Issues
 **Problem**: No ports discovered
 - **Check**: Firewall rules allow scanning
 - **Check**: Targets are actually online (`ping` test)
 - **Solution**: Run manual masscan: `masscan -p80,443 192.168.1.10 --rate 1000`
 **Problem**: "Operation not permitted" error
 - **Check**: Container runs with `--privileged` or `CAP_NET_RAW`
 - **Solution**: Add `--privileged` flag to docker run command
 **Problem**: Service detection not working
 - **Check**: Nmap can connect to ports: `nmap -p 80 192.168.1.10`
 - **Check**: Services actually respond to nmap probes (some firewall/IPS block)
 - **Solution**: Adjust nmap intensity or timeout values
--- a/Dockerfile
+++ b/Dockerfile
@@ -7,6 +7,8 @@ RUN apt-get update && \
    build-essential \
    libpcap-dev \
    nmap \
    chromium \
    chromium-driver \
    && rm -rf /var/lib/apt/lists/*
 # Build and install masscan from source
@@ -24,6 +26,10 @@ WORKDIR /app
 COPY requirements.txt .
 RUN pip install --no-cache-dir -r requirements.txt
 # Install Playwright browsers (Chromium only)
 # Note: We skip --with-deps since we already installed system chromium and dependencies above
 RUN playwright install chromium
 # Copy application code
 COPY src/ ./src/
--- a/README.md
+++ b/README.md
@@ -1,27 +1,49 @@
 # SneakyScanner
-A dockerized network scanning tool that uses masscan for fast port discovery and nmap for service detection to perform comprehensive infrastructure audits. SneakyScanner accepts YAML-based configuration files to define sites, IPs, and expected network behavior, then generates machine-readable JSON reports with detailed service information.
+A dockerized network scanning tool that uses masscan for fast port discovery, nmap for service detection, and Playwright for webpage screenshots to perform comprehensive infrastructure audits. SneakyScanner accepts YAML-based configuration files to define sites, IPs, and expected network behavior, then generates machine-readable JSON reports with detailed service information and webpage screenshots.
 ## Features
- YAML-based configuration for defining scan targets and expectations
+### Network Discovery & Port Scanning
- Comprehensive scanning using masscan:
+- **YAML-based configuration** for defining scan targets and expectations
-  - Ping/ICMP echo detection
+- **Comprehensive scanning using masscan**:
-  - TCP port scanning (all 65535 ports)
+  - Ping/ICMP echo detection (masscan --ping)
-  - UDP port scanning (all 65535 ports)
+  - TCP port scanning (all 65535 ports at 10,000 pps)
- Service detection using nmap:
+  - UDP port scanning (all 65535 ports at 10,000 pps)
  - Fast network-wide discovery in seconds
 ### Service Detection & Enumeration
 - **Service detection using nmap**:
  - Identifies services running on discovered TCP ports
-  - Extracts product names and versions
+  - Extracts product names and versions (e.g., "OpenSSH 8.2p1", "nginx 1.18.0")
-  - Provides detailed service information
+  - Provides detailed service information including extra attributes
- HTTP/HTTPS analysis and SSL/TLS security assessment:
+  - Balanced intensity level (5) for accuracy and speed
 ### Security Assessment
 - **HTTP/HTTPS analysis and SSL/TLS security assessment**:
  - Detects HTTP vs HTTPS on web services
  - Extracts SSL certificate details (subject, issuer, expiration, SANs)
-  - Calculates days until certificate expiration
+  - Calculates days until certificate expiration for monitoring
  - Tests TLS version support (TLS 1.0, 1.1, 1.2, 1.3)
-  - Lists accepted cipher suites for each TLS version
+  - Lists all accepted cipher suites for each supported TLS version
- JSON output format for easy post-processing
+  - Identifies weak cryptographic configurations
- Dockerized for consistent execution environment and root privilege isolation
+
- Compare actual vs. expected network behavior
+### Visual Documentation
 - **Webpage screenshot capture** (NEW):
  - Automatically captures screenshots of all discovered web services (HTTP/HTTPS)
  - Uses Playwright with headless Chromium browser
  - Viewport screenshots (1280x720) for consistent sizing
  - 15-second timeout per page with graceful error handling
  - Handles self-signed certificates without errors
  - Saves screenshots as PNG files with references in JSON reports
  - Screenshots organized in timestamped directories
  - Browser reuse for optimal performance
 ### Reporting & Output
 - **Machine-readable JSON output** format for easy post-processing
 - **Dockerized** for consistent execution environment and root privilege isolation
 - **Expected vs. Actual comparison** to identify infrastructure drift
 - Timestamped reports with complete scan duration metrics
 ## Requirements
@@ -63,9 +85,9 @@ SneakyScanner uses a five-phase approach for comprehensive scanning:
 2. **TCP Port Discovery** (masscan): Scans all 65535 TCP ports at 10,000 packets/second - ~13 seconds per 2 IPs
 3. **UDP Port Discovery** (masscan): Scans all 65535 UDP ports at 10,000 packets/second - ~13 seconds per 2 IPs
 4. **Service Detection** (nmap): Identifies services on discovered TCP ports - ~20-60 seconds per IP with open ports
-5. **HTTP/HTTPS Analysis** (SSL/TLS): Detects web protocols and analyzes certificates - ~5-10 seconds per web service
+5. **HTTP/HTTPS Analysis** (Playwright, SSL/TLS): Detects web protocols, captures screenshots, and analyzes certificates - ~10-20 seconds per web service
-**Example**: Scanning 2 IPs with 10 open ports each (including 2-3 web services) typically takes 1.5-2.5 minutes total.
+**Example**: Scanning 2 IPs with 10 open ports each (including 2-3 web services) typically takes 2-3 minutes total.
 ### Using Docker Directly
@@ -104,7 +126,7 @@ See `configs/example-site.yaml` for a complete example.
 ## Output Format
-Scan results are saved as JSON files in the `output/` directory with timestamps. The report includes the total scan duration (in seconds) covering all phases: ping scan, TCP/UDP port discovery, and service detection.
+Scan results are saved as JSON files in the `output/` directory with timestamps. Screenshots are saved in a subdirectory with the same timestamp. The report includes the total scan duration (in seconds) covering all phases: ping scan, TCP/UDP port discovery, service detection, and screenshot capture.
 ```json
 {
@@ -142,7 +164,8 @@ Scan results are saved as JSON files in the `output/` directory with timestamps.
                "product": "nginx",
                "version": "1.18.0",
                "http_info": {
-                  "protocol": "http"
+                  "protocol": "http",
                  "screenshot": "scan_report_20250115_103000_screenshots/192_168_1_10_80.png"
                }
              },
              {
@@ -152,6 +175,7 @@ Scan results are saved as JSON files in the `output/` directory with timestamps.
                "product": "nginx",
                "http_info": {
                  "protocol": "https",
                  "screenshot": "scan_report_20250115_103000_screenshots/192_168_1_10_443.png",
                  "ssl_tls": {
                    "certificate": {
                      "subject": "CN=example.com",
@@ -207,18 +231,60 @@ Scan results are saved as JSON files in the `output/` directory with timestamps.
 }
 ```
 ## Screenshot Capture Details
 SneakyScanner automatically captures webpage screenshots for all discovered HTTP and HTTPS services, providing visual documentation of your infrastructure.
 ### How It Works
 1. **Automatic Detection**: During the HTTP/HTTPS analysis phase, SneakyScanner identifies web services based on:
   - Nmap service detection results (http, https, ssl, http-proxy)
   - Common web ports (80, 443, 8000, 8006, 8080, 8081, 8443, 8888, 9443)
 2. **Screenshot Capture**: For each web service:
   - Launches headless Chromium browser (once per scan, reused for all screenshots)
   - Navigates to the service URL (HTTP or HTTPS)
   - Waits for network to be idle (up to 15 seconds)
   - Captures viewport screenshot (1280x720 pixels)
   - Handles SSL certificate errors gracefully (e.g., self-signed certificates)
 3. **Storage**: Screenshots are saved as PNG files:
   - Directory: `output/scan_report_YYYYMMDD_HHMMSS_screenshots/`
   - Filename format: `{ip}_{port}.png` (e.g., `192_168_1_10_443.png`)
   - Referenced in JSON report under `http_info.screenshot`
 ### Screenshot Configuration
 Default settings (configured in `src/screenshot_capture.py`):
 - **Viewport size**: 1280x720 (captures visible area only, not full page)
 - **Timeout**: 15 seconds per page load
 - **Browser**: Chromium (headless mode)
 - **SSL handling**: Ignores HTTPS errors (works with self-signed certificates)
 - **User agent**: Mozilla/5.0 (Windows NT 10.0; Win64; x64)
 ### Error Handling
 Screenshots are captured on a best-effort basis:
 - If a screenshot fails (timeout, connection error, etc.), the scan continues
 - Failed screenshots are logged but don't stop the scan
 - Services without screenshots simply omit the `screenshot` field in JSON output
 ## Project Structure
 ```
 SneakyScanner/
 ├── src/
-│   └── scanner.py           # Main scanner application
+│   ├── scanner.py           # Main scanner application
 │   └── screenshot_capture.py # Webpage screenshot capture module
 ├── configs/
 │   └── example-site.yaml    # Example configuration
-├── output/                  # Scan results (JSON files)
+├── output/                  # Scan results
 │   ├── scan_report_*.json   # JSON reports with timestamps
 │   └── scan_report_*_screenshots/  # Screenshot directories
 ├── Dockerfile
 ├── docker-compose.yml
 ├── requirements.txt
 ├── CLAUDE.md                # Developer documentation
 └── README.md
 ```
@@ -232,7 +298,6 @@ Only use this tool on networks you own or have explicit authorization to scan. U
 ## Future Enhancements
 - **Webpage Screenshots**: Capture screenshots of discovered web services for visual verification
 - **HTML Report Generation**: Build comprehensive HTML reports from JSON output with:
  - Service details and SSL/TLS information
  - Visual comparison of expected vs. actual results
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,3 +1,4 @@
 PyYAML==6.0.1
 python-libnmap==0.7.3
 sslyze==6.0.0
 playwright==1.40.0
--- a/src/scanner.py
+++ b/src/scanner.py
@@ -5,6 +5,7 @@ SneakyScanner - Masscan-based network scanner with YAML configuration
 import argparse
 import json
 import logging
 import subprocess
 import sys
 import tempfile
@@ -18,6 +19,8 @@ import yaml
 from libnmap.process import NmapProcess
 from libnmap.parser import NmapParser
 from screenshot_capture import ScreenshotCapture
 # Force unbuffered output for Docker
 sys.stdout.reconfigure(line_buffering=True)
 sys.stderr.reconfigure(line_buffering=True)
@@ -31,6 +34,7 @@ class SneakyScanner:
        self.output_dir = Path(output_dir)
        self.output_dir.mkdir(parents=True, exist_ok=True)
        self.config = self._load_config()
        self.screenshot_capture = None
    def _load_config(self) -> Dict[str, Any]:
        """Load and validate YAML configuration"""
@@ -511,6 +515,16 @@ class SneakyScanner:
                result = {'protocol': protocol}
                # Capture screenshot if screenshot capture is enabled
                if self.screenshot_capture:
                    try:
                        screenshot_path = self.screenshot_capture.capture(ip, port, protocol)
                        if screenshot_path:
                            result['screenshot'] = screenshot_path
                    except Exception as e:
                        print(f"  Screenshot capture error for {ip}:{port}: {e}",
                              file=sys.stderr, flush=True)
                # If HTTPS, analyze SSL/TLS
                if protocol == 'https':
                    try:
@@ -545,6 +559,14 @@ class SneakyScanner:
        # Record start time
        start_time = time.time()
        scan_timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
        # Initialize screenshot capture
        self.screenshot_capture = ScreenshotCapture(
            output_dir=str(self.output_dir),
            scan_timestamp=scan_timestamp,
            timeout=15
        )
        # Collect all unique IPs
        all_ips = set()
@@ -658,6 +680,10 @@ class SneakyScanner:
            report['sites'].append(site_result)
        # Clean up screenshot capture browser
        if self.screenshot_capture:
            self.screenshot_capture._close_browser()
        return report
    def save_report(self, report: Dict[str, Any]) -> Path:
@@ -673,6 +699,13 @@ class SneakyScanner:
 def main():
    # Configure logging
    logging.basicConfig(
        level=logging.INFO,
        format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
        handlers=[logging.StreamHandler(sys.stderr)]
    )
    parser = argparse.ArgumentParser(
        description='SneakyScanner - Masscan-based network scanner'
    )
--- a/src/screenshot_capture.py
+++ b/src/screenshot_capture.py
@@ -0,0 +1,201 @@
 """
 Screenshot capture module for SneakyScanner.
 Uses Playwright with Chromium to capture screenshots of discovered web services.
 """
 import os
 import logging
 from pathlib import Path
 from playwright.sync_api import sync_playwright, TimeoutError as PlaywrightTimeout
 class ScreenshotCapture:
    """
    Handles webpage screenshot capture for web services discovered during scanning.
    Uses Playwright with Chromium in headless mode to capture viewport screenshots
    of HTTP and HTTPS services. Handles SSL certificate errors gracefully.
    """
    def __init__(self, output_dir, scan_timestamp, timeout=15, viewport=None):
        """
        Initialize the screenshot capture handler.
        Args:
            output_dir (str): Base output directory for scan reports
            scan_timestamp (str): Timestamp string for this scan (format: YYYYMMDD_HHMMSS)
            timeout (int): Timeout in seconds for page load and screenshot (default: 15)
            viewport (dict): Viewport size dict with 'width' and 'height' keys
                           (default: {'width': 1280, 'height': 720})
        """
        self.output_dir = output_dir
        self.scan_timestamp = scan_timestamp
        self.timeout = timeout * 1000  # Convert to milliseconds for Playwright
        self.viewport = viewport or {'width': 1280, 'height': 720}
        self.playwright = None
        self.browser = None
        self.screenshot_dir = None
        # Set up logging
        self.logger = logging.getLogger('SneakyScanner.Screenshot')
    def _get_screenshot_dir(self):
        """
        Create and return the screenshots subdirectory for this scan.
        Returns:
            Path: Path object for the screenshots directory
        """
        if self.screenshot_dir is None:
            dir_name = f"scan_report_{self.scan_timestamp}_screenshots"
            self.screenshot_dir = Path(self.output_dir) / dir_name
            self.screenshot_dir.mkdir(parents=True, exist_ok=True)
            self.logger.info(f"Created screenshot directory: {self.screenshot_dir}")
        return self.screenshot_dir
    def _generate_filename(self, ip, port):
        """
        Generate a filename for the screenshot.
        Args:
            ip (str): IP address of the service
            port (int): Port number of the service
        Returns:
            str: Filename in format: {ip}_{port}.png
        """
        # Replace dots in IP with underscores for filesystem compatibility
        safe_ip = ip.replace('.', '_')
        return f"{safe_ip}_{port}.png"
    def _launch_browser(self):
        """
        Launch Playwright and Chromium browser in headless mode.
        Returns:
            bool: True if browser launched successfully, False otherwise
        """
        if self.browser is not None:
            return True  # Already launched
        try:
            self.logger.info("Launching Chromium browser...")
            self.playwright = sync_playwright().start()
            self.browser = self.playwright.chromium.launch(
                headless=True,
                args=[
                    '--no-sandbox',
                    '--disable-setuid-sandbox',
                    '--disable-dev-shm-usage',
                    '--disable-gpu',
                ]
            )
            self.logger.info("Chromium browser launched successfully")
            return True
        except Exception as e:
            self.logger.error(f"Failed to launch browser: {e}")
            return False
    def _close_browser(self):
        """
        Close the browser and cleanup Playwright resources.
        """
        if self.browser:
            try:
                self.browser.close()
                self.logger.info("Browser closed")
            except Exception as e:
                self.logger.warning(f"Error closing browser: {e}")
            finally:
                self.browser = None
        if self.playwright:
            try:
                self.playwright.stop()
            except Exception as e:
                self.logger.warning(f"Error stopping playwright: {e}")
            finally:
                self.playwright = None
    def capture(self, ip, port, protocol):
        """
        Capture a screenshot of a web service.
        Args:
            ip (str): IP address of the service
            port (int): Port number of the service
            protocol (str): Protocol to use ('http' or 'https')
        Returns:
            str: Relative path to the screenshot file, or None if capture failed
        """
        # Validate protocol
        if protocol not in ['http', 'https']:
            self.logger.warning(f"Invalid protocol '{protocol}' for {ip}:{port}")
            return None
        # Launch browser if not already running
        if not self._launch_browser():
            return None
        # Build URL
        url = f"{protocol}://{ip}:{port}"
        # Generate screenshot filename
        filename = self._generate_filename(ip, port)
        screenshot_dir = self._get_screenshot_dir()
        screenshot_path = screenshot_dir / filename
        context = None
        try:
            self.logger.info(f"Capturing screenshot: {url}")
            # Create new browser context with viewport and SSL settings
            context = self.browser.new_context(
                viewport=self.viewport,
                ignore_https_errors=True,  # Handle self-signed certs
                user_agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'
            )
            # Create new page
            page = context.new_page()
            # Set default timeout
            page.set_default_timeout(self.timeout)
            # Navigate to URL
            page.goto(url, wait_until='networkidle', timeout=self.timeout)
            # Take screenshot (viewport only)
            page.screenshot(path=str(screenshot_path), type='png')
            self.logger.info(f"Screenshot saved: {screenshot_path}")
            # Return relative path (relative to output directory)
            return f"{screenshot_dir.name}/{filename}"
        except PlaywrightTimeout:
            self.logger.warning(f"Timeout capturing screenshot for {url}")
            return None
        except Exception as e:
            self.logger.warning(f"Failed to capture screenshot for {url}: {e}")
            return None
        finally:
            # Always close the context (this also closes its pages) so that
            # timeouts and errors do not leak browser resources
            if context is not None:
                try:
                    context.close()
                except Exception as e:
                    self.logger.warning(f"Error closing browser context: {e}")
    def __enter__(self):
        """Context manager entry."""
        self._launch_browser()
        return self
    def __exit__(self, exc_type, exc_val, exc_tb):
        """Context manager exit - cleanup browser resources."""
        self._close_browser()
        return False  # Don't suppress exceptions