CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

SneakyScanner is a dockerized network scanning tool that uses a five-phase approach: masscan for fast ping, TCP, and UDP port discovery, nmap for service detection, and sslyze plus Playwright for HTTP/HTTPS analysis (SSL/TLS details and webpage screenshots). It accepts YAML configuration files defining scan targets and expected network behavior, then produces comprehensive JSON reports with service information, SSL certificates, TLS versions, cipher suites, and webpage screenshots, comparing expected vs. actual results.

Essential Commands

Building and Running

# Build the Docker image
docker build -t sneakyscanner .

# Run with docker-compose (easiest method)
docker-compose build
docker-compose up

# Run directly with Docker
docker run --rm --privileged --network host \
  -v $(pwd)/configs:/app/configs:ro \
  -v $(pwd)/output:/app/output \
  sneakyscanner /app/configs/your-config.yaml

Development

# Test the Python script locally (requires masscan and nmap installed)
python3 src/scanner.py configs/example-site.yaml -o ./output

# Validate YAML config
python3 -c "import yaml; yaml.safe_load(open('configs/example-site.yaml'))"

Architecture

Core Components

  1. src/scanner.py - Main application

    • SneakyScanner class: Orchestrates scanning workflow
    • _load_config(): Parses and validates YAML config
    • _run_masscan(): Executes masscan for TCP/UDP scanning
    • _run_ping_scan(): Executes masscan ICMP ping scanning
    • _run_nmap_service_detection(): Executes nmap service detection on discovered TCP ports
    • _parse_nmap_xml(): Parses nmap XML output to extract service information
    • _is_likely_web_service(): Identifies web services based on nmap results
    • _detect_http_https(): Detects HTTP vs HTTPS using socket connections
    • _analyze_ssl_tls(): Analyzes SSL/TLS certificates and supported versions using sslyze
    • _run_http_analysis(): Orchestrates HTTP/HTTPS and SSL/TLS analysis phase
    • scan(): Main workflow - collects IPs, runs scans, performs service detection, HTTP/HTTPS analysis, compiles results
    • save_report(): Writes JSON output with timestamp and scan duration
  2. src/screenshot_capture.py - Screenshot capture module

    • ScreenshotCapture class: Handles webpage screenshot capture
    • capture(): Captures screenshot of a web service (HTTP/HTTPS)
    • _launch_browser(): Initializes Playwright with Chromium in headless mode
    • _close_browser(): Cleans up browser resources
    • _get_screenshot_dir(): Creates screenshots subdirectory
    • _generate_filename(): Generates filename for screenshot (IP_PORT.png)
  3. configs/ - YAML configuration files

    • Define scan title, sites, IPs, and expected network behavior
    • Each IP includes expected ping response and TCP/UDP ports
  4. output/ - JSON scan reports and screenshots

    • Timestamped JSON files: scan_report_YYYYMMDD_HHMMSS.json
    • Screenshot directory: scan_report_YYYYMMDD_HHMMSS_screenshots/
    • Contains actual vs. expected comparison for each IP

Scan Workflow

  1. Parse YAML config and extract all unique IPs
  2. Run ping scan on all IPs using masscan --ping
  3. Run TCP scan on all IPs for ports 0-65535
  4. Run UDP scan on all IPs for ports 0-65535
  5. Run service detection on discovered TCP ports using nmap -sV
  6. Run HTTP/HTTPS analysis on web services identified by nmap:
    • Detect HTTP vs HTTPS using socket connections
    • Capture webpage screenshot using Playwright (viewport 1280x720, 15s timeout)
    • For HTTPS: Extract certificate details (subject, issuer, expiry, SANs)
    • Test TLS version support (TLS 1.0, 1.1, 1.2, 1.3)
    • List accepted cipher suites for each TLS version
  7. Aggregate results by IP and site
  8. Generate JSON report with timestamp, scan duration, screenshot references, and complete service details

Why Dockerized

  • Masscan and nmap require raw socket access (root/CAP_NET_RAW)
  • Isolates privileged operations in container
  • Ensures consistent masscan and nmap versions and dependencies
  • Uses --privileged and --network host for network access

Masscan Integration

  • Masscan is built from source in Dockerfile
  • Writes output to temporary JSON files
  • Results parsed line-by-line (masscan uses comma-separated JSON lines); see the sketch after this list
  • Temporary files cleaned up after each scan
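
A minimal sketch of that line-by-line parsing (illustrative only; the real logic lives in _run_masscan() and may differ in detail):

import json

def parse_masscan_output(path):
    """Parse masscan -oJ output: one JSON object per line, wrapped in
    brackets and separated by trailing commas."""
    results = []
    with open(path) as f:
        for line in f:
            line = line.strip().rstrip(',')
            if not line or line in ('[', ']'):
                continue
            try:
                entry = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip masscan's non-JSON status lines
            for port in entry.get('ports', []):
                results.append((entry['ip'], port['port'], port['proto']))
    return results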

Nmap Integration

  • Nmap installed via apt package in Dockerfile
  • Runs service detection (-sV) with intensity level 5 (balanced speed/accuracy)
  • Outputs XML format for structured parsing
  • XML parsed using Python's ElementTree library (xml.etree.ElementTree); see the sketch after this list
  • Extracts service name, product, version, extrainfo, and ostype
  • Runs sequentially per IP to avoid overwhelming the target
  • 5-minute nmap host timeout (--host-timeout) and a 10-minute overall subprocess timeout per host
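
A minimal sketch of the XML parsing step (field names follow nmap's XML output; the real _parse_nmap_xml() may differ in detail):

import xml.etree.ElementTree as ET

def parse_nmap_services(xml_path):
    """Extract per-port service details from nmap -sV -oX output."""
    services = []
    root = ET.parse(xml_path).getroot()
    for host in root.findall('host'):
        addr = host.find('address').get('addr')
        for port in host.findall('.//port'):
            state = port.find('state')
            if state is None or state.get('state') != 'open':
                continue
            svc = port.find('service')
            get = svc.get if svc is not None else (lambda key: None)
            services.append({
                'ip': addr,
                'port': int(port.get('portid')),
                'service': get('name'),
                'product': get('product'),
                'version': get('version'),
                'extrainfo': get('extrainfo'),
                'ostype': get('ostype'),
            })
    return services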

HTTP/HTTPS and SSL/TLS Analysis

  • Uses sslyze library for comprehensive SSL/TLS scanning
  • HTTP/HTTPS detection using Python's built-in socket and ssl modules (sketched after this list)
  • Analyzes services based on:
    • Nmap service identification (http, https, ssl, http-proxy, etc.)
    • Common web ports (80, 443, 8000, 8006, 8008, 8080, 8081, 8443, 8888, 9443)
    • This ensures non-standard ports (like Proxmox 8006) are analyzed even if nmap misidentifies them
  • For HTTPS services:
    • Extracts certificate information using cryptography library
    • Tests TLS versions: 1.0, 1.1, 1.2, 1.3
    • Lists all accepted cipher suites for each supported TLS version
    • Calculates days until certificate expiration
    • Extracts SANs (Subject Alternative Names) from certificate
  • Graceful error handling: if SSL analysis fails, still reports HTTP/HTTPS detection
  • 5-second timeout per HTTP/HTTPS detection
  • Results merged into service data structure under http_info key
  • Note: Uses sslyze 6.0 API which accesses scan results as attributes (e.g., certificate_info, tls_1_2_cipher_suites) rather than through .scan_commands_results.get()
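
A minimal sketch of the socket-based protocol detection (illustrative; the real _detect_http_https() may differ in detail):

import socket
import ssl

def detect_protocol(ip, port, timeout=5):
    """Return 'https' if a TLS handshake succeeds, 'http' if only a plain
    TCP connection succeeds, or None if the port is unreachable."""
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE  # tolerate self-signed certificates
    try:
        with socket.create_connection((ip, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=ip):
                return 'https'
    except ssl.SSLError:
        pass  # port is open but does not speak TLS
    except OSError:
        return None  # connection failed entirely
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return 'http'
    except OSError:
        return None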

Webpage Screenshot Capture

Implementation: src/screenshot_capture.py - Separate module for code organization

Technology Stack:

  • Playwright 1.40.0 with Chromium in headless mode
  • System Chromium and chromium-driver installed via apt (Dockerfile)
  • Python's pathlib for cross-platform file path handling

Screenshot Process:

  1. Screenshots captured for all successfully detected HTTP/HTTPS services
  2. Services identified by:
    • Nmap service names: http, https, ssl, http-proxy, http-alt, etc.
    • Common web ports: 80, 443, 8000, 8006, 8008, 8080, 8081, 8443, 8888, 9443
  3. Browser lifecycle managed via context manager pattern (__enter__, __exit__)

Configuration (default values):

  • Viewport size: 1280x720 pixels (viewport only, not full page)
  • Timeout: 15 seconds per screenshot (15000ms in Playwright)
  • Wait strategy: wait_until='networkidle' - waits for network activity to settle
  • SSL handling: ignore_https_errors=True - handles self-signed certs
  • User agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36
  • Browser args: --no-sandbox, --disable-setuid-sandbox, --disable-dev-shm-usage, --disable-gpu

Storage Architecture:

  • Screenshots saved as PNG files in subdirectory: scan_report_YYYYMMDD_HHMMSS_screenshots/
  • Filename format: {ip}_{port}.png (dots in IP replaced with underscores)
    • Example: 192_168_1_10_443.png for 192.168.1.10:443
  • Path stored in JSON as relative reference: http_info.screenshot field
  • Relative paths ensure portability of output directory
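
The filename convention amounts to a one-liner (a sketch of what _generate_filename() produces; the actual helper may differ):

def screenshot_filename(ip, port):
    """'192.168.1.10', 443 -> '192_168_1_10_443.png'"""
    return f"{ip.replace('.', '_')}_{port}.png"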

Error Handling (graceful degradation):

  • If screenshot fails (timeout, connection error, etc.), scan continues
  • Failed screenshots logged as warnings, not errors
  • Services without screenshots simply omit the screenshot field in JSON output
  • Browser launch failure disables all screenshots for the scan

Browser Lifecycle (optimized for performance):

  1. Browser launched once at scan start (in scan() method)
  2. Reused for all screenshots via single browser instance
  3. New context + page created per screenshot (isolated state)
  4. Context and page closed after each screenshot
  5. Browser closed at scan completion (cleanup in scan() method)
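
A condensed sketch of this lifecycle using Playwright's sync API (BrowserSession is an illustrative name; the real ScreenshotCapture class in src/screenshot_capture.py is more elaborate):

from playwright.sync_api import sync_playwright

class BrowserSession:
    """Launch Chromium once, reuse it for many screenshots."""

    def __enter__(self):
        self._pw = sync_playwright().start()
        self.browser = self._pw.chromium.launch(
            headless=True,
            args=['--no-sandbox', '--disable-dev-shm-usage'],
        )
        return self

    def capture(self, url, path, timeout_ms=15000):
        # Fresh context + page per screenshot keeps state isolated
        context = self.browser.new_context(
            viewport={'width': 1280, 'height': 720},
            ignore_https_errors=True,
        )
        page = context.new_page()
        try:
            page.goto(url, wait_until='networkidle', timeout=timeout_ms)
            page.screenshot(path=path, type='png')
        finally:
            page.close()
            context.close()

    def __exit__(self, exc_type, exc, tb):
        self.browser.close()
        self._pw.stop()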

Integration Points:

  • Initialized in scanner.py:scan() with scan timestamp
  • Called from scanner.py:_run_http_analysis() after protocol detection
  • Cleanup called in scanner.py:scan() after all analysis complete

Code Reference Locations:

  • src/screenshot_capture.py: Complete screenshot module (lines 1-202)
  • src/scanner.py:scan(): Browser initialization and cleanup
  • src/scanner.py:_run_http_analysis(): Screenshot capture invocation

Configuration Schema

title: string                    # Report title (required)
sites:                           # List of sites (required)
  - name: string                 # Site name
    ips:                         # List of IPs for this site
      - address: string          # IP address (IPv4)
        expected:                # Expected network behavior
          ping: boolean          # Should respond to ping
          tcp_ports: [int]       # Expected TCP ports
          udp_ports: [int]       # Expected UDP ports
          services: [string]     # Expected services (optional)
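
A config in this schema can be loaded and its unique target IPs collected with a few lines (illustrative; the real _load_config() adds validation):

import yaml

def load_unique_ips(config_path):
    """Return the parsed config plus the deduplicated list of target IPs."""
    with open(config_path) as f:
        config = yaml.safe_load(f)
    ips = []
    for site in config.get('sites', []):
        for entry in site.get('ips', []):
            if entry['address'] not in ips:
                ips.append(entry['address'])
    return config, ips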

Key Design Decisions

  1. Five-phase scanning: Masscan for fast ping, TCP, and UDP discovery (10,000 pps), nmap for service detection, then HTTP/HTTPS analysis (SSL/TLS details and screenshots) for web services
  2. All-port scanning: TCP and UDP scans cover entire port range (0-65535) to detect unexpected services
  3. Selective web analysis: Only analyze services identified by nmap as web-related to optimize scan time
  4. Machine-readable output: JSON format enables automated report generation and comparison
  5. Expected vs. Actual: Config includes expected behavior to identify infrastructure drift
  6. Site grouping: IPs organized by logical site for better reporting
  7. Temporary files: Masscan and nmap output written to temp files to avoid conflicts in parallel scans
  8. Service details: Extract product name, version, and additional info for each discovered service
  9. SSL/TLS security: Comprehensive certificate analysis and TLS version testing with cipher suite enumeration

Testing Strategy

When testing changes:

  1. Use a controlled test environment with known services (including HTTP/HTTPS)
  2. Create a test config with 1-2 IPs
  3. Verify JSON output structure matches schema
  4. Check that ping, TCP, and UDP results are captured
  5. Verify service detection results include service name, product, and version
  6. For web services, verify http_info includes:
    • Correct protocol detection (http vs https)
    • Screenshot path reference (relative to the output directory), with the PNG file present at that path
    • Certificate details for HTTPS (subject, issuer, expiry, SANs)
    • TLS version support (1.0-1.3) with cipher suites
  7. Ensure temp files are cleaned up (masscan JSON, nmap XML)
  8. Verify screenshot directory created with correct naming convention
  9. Test screenshot capture with HTTP, HTTPS, and self-signed certificate services
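
The screenshot checks in steps 6 and 8 can be partly scripted; a small helper along these lines only assumes the http_info.screenshot field described above (illustrative):

import json
from pathlib import Path

def missing_screenshots(report_path):
    """Walk a scan report and list screenshot references that do not
    resolve to an existing file (paths are relative to the report)."""
    report_path = Path(report_path)
    report = json.loads(report_path.read_text())
    missing = []

    def walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if key == 'screenshot' and isinstance(value, str):
                    if not (report_path.parent / value).is_file():
                        missing.append(value)
                else:
                    walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(report)
    return missing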

Common Tasks

Modifying Scan Parameters

Masscan rate limiting:

  • --rate: Currently set to 10000 packets/second in src/scanner.py:80, 132
  • --wait: Set to 0 (don't wait for late responses)
  • Adjust these in _run_masscan() and _run_ping_scan() methods

Nmap service detection intensity:

  • --version-intensity: Currently set to 5 (balanced) in src/scanner.py:201
  • Range: 0-9 (0=light, 9=comprehensive)
  • Lower values are faster but less accurate
  • Adjust in _run_nmap_service_detection() method

Nmap timeouts:

  • --host-timeout: Currently 5 minutes in src/scanner.py:204
  • Overall subprocess timeout: 600 seconds (10 minutes) in src/scanner.py:208
  • Adjust based on network conditions and number of ports

Adding New Scan Types

To add additional scan functionality (e.g., OS detection, vulnerability scanning):

  1. Add new method to SneakyScanner class (follow pattern of _run_nmap_service_detection())
  2. Update scan() workflow to call new method
  3. Add results to actual section of output JSON
  4. Update YAML schema if expected values needed
  5. Update documentation (README.md, CLAUDE.md)
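
As a rough illustration of step 1, a new phase method mirrors the existing pattern: run the tool, write to a temp file, parse, clean up. The run_os_detection helper below is hypothetical and not part of the codebase:

import os
import subprocess
import tempfile
import xml.etree.ElementTree as ET

def run_os_detection(ip, timeout=600):
    """Hypothetical new phase: nmap OS detection for a single IP."""
    with tempfile.NamedTemporaryFile(suffix='.xml', delete=False) as tmp:
        xml_path = tmp.name
    subprocess.run(['nmap', '-O', '-oX', xml_path, ip],
                   capture_output=True, timeout=timeout, check=False)
    try:
        root = ET.parse(xml_path).getroot()
        return [m.get('name') for m in root.findall('.//osmatch')]
    finally:
        os.unlink(xml_path)  # mirror the temp-file cleanup used elsewhere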

Changing Output Format

JSON structure defined in src/scanner.py:365+. To modify:

  1. Update the report dictionary structure
  2. Ensure backward compatibility or version the schema
  3. Update README.md output format documentation
  4. Update example output in both README.md and CLAUDE.md

Customizing Screenshot Capture

Change viewport size (src/screenshot_capture.py:35):

self.viewport = viewport or {'width': 1920, 'height': 1080}  # Full HD

Change timeout (src/screenshot_capture.py:34):

self.timeout = timeout * 1000  # Default is 15 seconds
# Pass different value when initializing: ScreenshotCapture(..., timeout=30)

Capture full-page screenshots (src/screenshot_capture.py:173):

page.screenshot(path=str(screenshot_path), type='png', full_page=True)

Change wait strategy (src/screenshot_capture.py:170):

# Options: 'load', 'domcontentloaded', 'networkidle', 'commit'
page.goto(url, wait_until='load', timeout=self.timeout)

Add custom request headers (src/screenshot_capture.py:157-161):

context = self.browser.new_context(
    viewport=self.viewport,
    ignore_https_errors=True,
    user_agent='CustomUserAgent/1.0',
    extra_http_headers={'Authorization': 'Bearer token'}
)

Disable screenshot capture entirely: In src/scanner.py:scan(), comment out or skip initialization:

# self.screenshot_capture = ScreenshotCapture(...)
self.screenshot_capture = None  # This disables all screenshots

Add authentication (for services requiring login): In src/screenshot_capture.py:capture(), before taking screenshot:

# Navigate to login page first
page.goto(f"{protocol}://{ip}:{port}/login")
page.fill('#username', 'admin')
page.fill('#password', 'password')
page.click('#login-button')
page.wait_for_url(f"{protocol}://{ip}:{port}/dashboard")
# Then take screenshot
page.screenshot(path=str(screenshot_path), type='png')

Performance Optimization

Current bottlenecks:

  1. Port scanning: ~30 seconds for 2 IPs (ports 0-65535 each at 10k pps)
  2. Service detection: ~20-60 seconds per IP with open ports
  3. HTTP/HTTPS analysis: ~5-10 seconds per web service (includes SSL/TLS analysis)
  4. Screenshot capture: ~5-15 seconds per web service (depends on page load time)

Optimization strategies:

  • Parallelize nmap scans across IPs (currently sequential)
  • Parallelize HTTP/HTTPS analysis and screenshot capture across services using ThreadPoolExecutor (see the sketch after this list)
  • Reduce port range for faster scanning (if full range not needed)
  • Lower nmap intensity (trade accuracy for speed)
  • Skip service detection on high ports (>1024) if desired
  • Reduce SSL/TLS analysis scope (e.g., test only TLS 1.2+ if legacy support not needed)
  • Adjust HTTP/HTTPS detection timeout (currently 5 seconds in src/scanner.py:510)
  • Adjust screenshot timeout (currently 15 seconds in src/screenshot_capture.py:34)
  • Disable screenshot capture for faster scans (set screenshot_capture to None)
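
A sketch of the ThreadPoolExecutor idea (illustrative; analyze_service stands in for the per-service HTTP/HTTPS analysis, and screenshot capture may need to stay sequential or use one browser per worker, since the shared Playwright sync browser is not intended for cross-thread use):

from concurrent.futures import ThreadPoolExecutor, as_completed

def analyze_services_parallel(services, analyze_service, max_workers=4):
    """Run per-service analysis concurrently instead of sequentially."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(analyze_service, svc): svc for svc in services}
        for future in as_completed(futures):
            svc = futures[future]
            try:
                results[(svc['ip'], svc['port'])] = future.result()
            except Exception as exc:
                # Mirror the scanner's graceful degradation: log and continue
                print(f"analysis failed for {svc['ip']}:{svc['port']}: {exc}")
    return results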

Planned Features (Future Development)

The following features are planned for future implementation:

1. HTML Report Generation

Build comprehensive HTML reports from JSON scan data with interactive visualizations.

Report Features:

  • Service details and SSL/TLS information tables
  • Visual comparison of expected vs. actual results (red/green highlighting)
  • Certificate expiration warnings with countdown timers
  • TLS version compliance reports (highlight weak configurations)
  • Embedded webpage screenshots
  • Sortable/filterable tables
  • Timeline view of scan history
  • Export to PDF capability

Implementation Considerations:

  • Template engine: Jinja2 or similar
  • CSS framework: Bootstrap or Tailwind for responsive design
  • Charts/graphs: Chart.js or Plotly for visualizations
  • Store templates in templates/ directory
  • Generate static HTML that can be opened without server

Architecture:

class HTMLReportGenerator:
    def __init__(self, json_report_path, template_dir='templates'):
        pass

    def generate_report(self, output_path):
        # Parse JSON
        # Render template with data
        # Include screenshots
        # Write HTML file
        pass

    def _compare_expected_actual(self, expected, actual):
        # Generate diff/comparison data
        pass

    def _generate_cert_warnings(self, services):
        # Identify expiring certs, weak TLS, etc.
        pass

2. Comparison Reports (Scan Diffs)

Generate reports showing changes between scans over time.

Features:

  • Compare two scan reports
  • Highlight new/removed services
  • Track certificate changes
  • Detect TLS configuration drift
  • Show port changes
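
Not implemented yet; one possible shape for the core diff, operating on sets of (ip, port, proto) tuples extracted from two reports (illustrative only):

def diff_open_ports(previous, current):
    """previous/current: sets of (ip, port, proto) tuples from two scans."""
    return {
        'added': sorted(current - previous),
        'removed': sorted(previous - current),
        'unchanged': sorted(previous & current),
    }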

3. Additional Enhancements

  • Email Notifications: Alert on unexpected changes or certificate expirations
  • Scheduled Scanning: Automated periodic scans with cron integration
  • Vulnerability Detection: Integration with CVE databases for known vulnerabilities
  • API Mode: REST API for triggering scans and retrieving results
  • Multi-threading: Parallel scanning of multiple IPs for better performance

Development Notes

Current Dependencies

  • PyYAML==6.0.1 (YAML parsing)
  • python-libnmap==0.7.3 (nmap XML parsing)
  • sslyze==6.0.0 (SSL/TLS analysis)
  • playwright==1.40.0 (webpage screenshot capture)
  • Built-in: socket, ssl, subprocess, xml.etree.ElementTree, logging
  • System: chromium, chromium-driver (installed via Dockerfile)

For HTML Reports, Will Need:

  • Jinja2 (template engine)
  • Optional: weasyprint or pdfkit for PDF export

Key Files to Modify for New Features:

  1. src/scanner.py - Core scanning logic (add new phases/methods)
  2. src/screenshot_capture.py - Webpage screenshot capture module (implemented)
  3. src/report_generator.py - New file for HTML report generation (planned)
  4. templates/ - New directory for HTML templates (planned)
  5. requirements.txt - Add new dependencies
  6. Dockerfile - Install additional system dependencies (browsers, etc.)

Testing Strategy for New Features:

Screenshot Capture Testing (Implemented):

  1. Test with HTTP services (port 80, 8080, etc.)
  2. Test with HTTPS services with valid certificates (port 443, 8443)
  3. Test with HTTPS services with self-signed certificates
  4. Test with non-standard web ports (e.g., Proxmox on 8006)
  5. Test with slow-loading pages (verify 15s timeout works)
  6. Test with services that return errors (404, 500, etc.)
  7. Verify screenshot files are created with correct naming
  8. Verify JSON references point to correct screenshot files
  9. Verify browser cleanup occurs properly (no zombie processes)
  10. Test with multiple IPs and services to ensure browser reuse works

HTML Report Testing (Planned):

  1. Validate HTML report rendering across browsers
  2. Ensure large scans don't cause memory issues with screenshots
  3. Test report generation with missing/incomplete data
  4. Verify all URLs and links work in generated reports
  5. Test embedded screenshots display correctly

Troubleshooting

Screenshot Capture Issues

Problem: Screenshots not being captured

  • Check: Verify Chromium installed: chromium --version in container
  • Check: Verify Playwright browsers installed: playwright install --dry-run chromium
  • Check: Look for browser launch errors in stderr output
  • Solution: Rebuild Docker image ensuring Dockerfile steps complete

Problem: "Failed to launch browser" error

  • Check: Ensure container has sufficient memory (Chromium needs ~200MB)
  • Check: Docker runs with --privileged or appropriate capabilities
  • Solution: Add --shm-size=2gb to docker run command if /dev/shm is too small

Problem: Screenshots timing out

  • Check: Network connectivity to target services
  • Check: Services actually serve webpages (not just open ports)
  • Solution: Increase timeout in src/screenshot_capture.py:34 if needed
  • Solution: Check service responds to HTTP requests: curl -I http://IP:PORT

Problem: Screenshots are blank/empty

  • Check: Service returns valid HTML (not just TCP banner)
  • Check: Page requires JavaScript (may need longer wait time)
  • Solution: Change wait_until strategy from 'networkidle' to 'load' or 'domcontentloaded'

Problem: HTTPS certificate errors despite ignore_https_errors=True

  • Check: System certificates up to date in container
  • Solution: This should not happen; file an issue if it does

Nmap/Masscan Issues

Problem: No ports discovered

  • Check: Firewall rules allow scanning
  • Check: Targets are actually online (ping test)
  • Solution: Run manual masscan: masscan -p80,443 192.168.1.10 --rate 1000

Problem: "Operation not permitted" error

  • Check: Container runs with --privileged or CAP_NET_RAW
  • Solution: Add --privileged flag to docker run command

Problem: Service detection not working

  • Check: Nmap can connect to ports: nmap -p 80 192.168.1.10
  • Check: Services actually respond to nmap probes (some firewall/IPS block)
  • Solution: Adjust nmap intensity or timeout values