feat: HTTPS auto-normalization; robust TLS intel UI; global rules state; clean logging; preload

- Add SSL/TLS intelligence pipeline:
  - crt.sh lookup with expired-filtering and root-domain wildcard resolution
  - live TLS version/cipher probe with weak/legacy flags and probe notes
- UI: card + matrix rendering, raw JSON toggle, and host/wildcard cert lists
- Front page: checkbox to optionally fetch certificate/CT data

- Introduce `URLNormalizer` with punycode support and typo repair
  - Auto-prepend `https://` for bare domains (e.g., `google.com`)
  - Optional quick HTTPS reachability + `http://` fallback
- Provide singleton via function-cached `@singleton_loader`:
  - `get_url_normalizer()` reads defaults from Settings (if present)
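
A minimal sketch of the normalization and cached accessor described above. It assumes nothing about the real `URLNormalizer` internals: `UrlNormalizerSketch`, `normalize()`, and `get_url_normalizer_sketch()` are hypothetical names, and `functools.lru_cache` stands in for `@singleton_loader`. Typo repair, the reachability check, and Settings lookup are omitted.

```python
from functools import lru_cache
from urllib.parse import urlsplit, urlunsplit


class UrlNormalizerSketch:
    """Hypothetical stand-in for URLNormalizer; not the project's class."""

    def __init__(self, default_scheme: str = "https") -> None:
        self.default_scheme = default_scheme

    def normalize(self, raw: str) -> str:
        candidate = raw.strip()
        # Bare domains such as "google.com" get the default scheme prepended.
        if "://" not in candidate:
            candidate = f"{self.default_scheme}://{candidate}"
        parts = urlsplit(candidate)
        host = parts.hostname or ""
        # The stdlib "idna" codec (IDNA 2003) turns Unicode labels into punycode.
        ascii_host = host.encode("idna").decode("ascii")
        # Userinfo is dropped in this sketch; only host and port are kept.
        netloc = ascii_host if parts.port is None else f"{ascii_host}:{parts.port}"
        return urlunsplit(
            (parts.scheme.lower(), netloc, parts.path, parts.query, parts.fragment)
        )


@lru_cache(maxsize=None)
def get_url_normalizer_sketch() -> UrlNormalizerSketch:
    # lru_cache on a zero-argument function returns the same instance each call.
    return UrlNormalizerSketch()
```

Caching the accessor rather than the class keeps construction lazy: nothing is built until the first caller asks for the normalizer.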

- Standardize function-rule return shape to `(bool, dict|None)` across
  `form_*` and `script_*` rules; include structured payloads (`note`, hosts, ext, etc.)
- Harden `FunctionRuleAdapter`:
  - Coerce legacy returns `(bool)`, `(bool, str)` → normalized outputs
  - Adapt non-dict inputs to facts (category-aware and via provided adapter)
  - Return `(True, dict)` on match, `(False, None)` on miss
  - Bind-time logging with file:line + function id for diagnostics
- `RuleEngine`:
  - Back rules by private `self._rules`; `rules` property returns copy
  - Idempotent `add_rule(replace=False)` with in-place replace and regex (re)compile
  - Fix AttributeError from property assignment during `__init__`
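
The return-shape coercion and the idempotent `add_rule` behavior above can be sketched as follows. This is an illustration under assumptions, not the project's code: `coerce_result`, `RuleSketch`, and `RuleEngineSketch` are hypothetical names, rules are assumed to be identified by a `name` attribute, and a bare `True` is assumed to map to an empty payload dict.

```python
import re
from dataclasses import dataclass, field
from typing import Any, List, Optional, Tuple


def coerce_result(raw: Any) -> Tuple[bool, Optional[dict]]:
    """Normalize legacy returns (bool or (bool, str)) to (bool, dict | None)."""
    payload: Any = None
    if isinstance(raw, tuple):
        matched = bool(raw[0]) if raw else False
        if len(raw) > 1:
            payload = raw[1]
    else:
        matched = bool(raw)
    if not matched:
        return False, None
    if isinstance(payload, dict):
        return True, payload
    if payload is None:
        return True, {}  # assumption: a bare True yields an empty payload
    return True, {"note": str(payload)}  # legacy (bool, str) becomes a note


@dataclass
class RuleSketch:
    name: str
    pattern: Optional[str] = None
    compiled: Optional[re.Pattern] = field(default=None, init=False)

    def compile_if_needed(self) -> None:
        if self.pattern is not None:
            self.compiled = re.compile(self.pattern)


class RuleEngineSketch:
    def __init__(self) -> None:
        # Assign the private list; assigning to the read-only `rules`
        # property here would raise AttributeError.
        self._rules: List[RuleSketch] = []

    @property
    def rules(self) -> List[RuleSketch]:
        return list(self._rules)  # copy, so callers cannot mutate engine state

    def add_rule(self, rule: RuleSketch, replace: bool = False) -> None:
        for index, existing in enumerate(self._rules):
            if existing.name == rule.name:
                if replace:
                    rule.compile_if_needed()
                    self._rules[index] = rule  # in-place replace keeps order
                return  # duplicate without replace is a no-op
        rule.compile_if_needed()
        self._rules.append(rule)
```

Registering the same rule twice leaves one entry, which is what makes repeated registration during init harmless.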

- Replace hidden singleton factory with explicit builder + global state:
  - `app/rules/factory.py::build_rules_engine()` builds and logs totals
  - `app/state.py` exposes `set_rules_engine()` / `get_rules_engine()` as the single source of truth
  - `app/wsgi.py` builds once at preload and publishes via `set_rules_engine()`
- Add lightweight debug hooks (`SS_DEBUG_RULES=1`) to trace engine id and rule counts
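
A minimal sketch of what a process-global state module along these lines could look like. The function names come from this commit, but the bodies are assumptions: the lock, the `RuntimeError` on an unset engine, and the `print`-based debug trace are illustrative choices, not the contents of `app/state.py`.

```python
import os
import threading
from typing import Any, Optional

_lock = threading.Lock()
_rules_engine: Optional[Any] = None


def set_rules_engine(engine: Any) -> None:
    """Publish the engine built at preload so every consumer sees one instance."""
    global _rules_engine
    with _lock:
        _rules_engine = engine
    # Assumed debug hook: trace engine identity and rule count when enabled.
    if os.environ.get("SS_DEBUG_RULES") == "1":
        count = len(getattr(engine, "rules", []))
        print(f"rules engine published: id={id(engine)} rules={count}")


def get_rules_engine() -> Any:
    """Return the published engine; fail loudly if preload never ran."""
    if _rules_engine is None:
        raise RuntimeError(
            "rules engine not initialized; call set_rules_engine() during preload"
        )
    return _rules_engine
```

Raising on an unset engine, rather than building one lazily, keeps construction in a single place and surfaces ordering mistakes at the first call.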

- Unify logging wiring:
  - `wire_logging_once(app)` clears and attaches a single handler chain
  - Create two named loggers: `sneakyscope.app` and `sneakyscope.engine`
  - Disable propagation to prevent dupes; include pid/logger name in format
- Remove stray/duplicate handlers and import-time logging
- Optional dedup filter for bursty repeats (kept off by default)
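
The wiring above can be sketched with the standard `logging` module. The logger names are from this commit; everything else is an assumption: `wire_logging_once_sketch` takes no `app` argument, the format string is illustrative, and `DedupFilterSketch` is a hypothetical filter that stays off unless requested.

```python
import logging
import sys
import time

_LOG_FORMAT = "%(asctime)s pid=%(process)d %(name)s %(levelname)s %(message)s"
_wired = False


class DedupFilterSketch(logging.Filter):
    """Drop a record if the same message was seen within `window` seconds."""

    def __init__(self, window: float = 2.0) -> None:
        super().__init__()
        self.window = window
        self._last_seen = {}  # unbounded in this sketch

    def filter(self, record: logging.LogRecord) -> bool:
        key = (record.name, record.levelno, record.getMessage())
        now = time.monotonic()
        last = self._last_seen.get(key)
        self._last_seen[key] = now
        return last is None or (now - last) > self.window


def wire_logging_once_sketch(level: int = logging.INFO, dedup: bool = False) -> None:
    """Attach exactly one handler to each named logger; later calls are no-ops."""
    global _wired
    if _wired:
        return
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(logging.Formatter(_LOG_FORMAT))
    if dedup:
        handler.addFilter(DedupFilterSketch())
    for name in ("sneakyscope.app", "sneakyscope.engine"):
        logger = logging.getLogger(name)
        logger.handlers.clear()   # remove stray or duplicate handlers
        logger.addHandler(handler)
        logger.setLevel(level)
        logger.propagate = False  # avoid a second copy via the root logger
    _wired = True
```

Disabling propagation matters because a handler on the root logger would otherwise emit each record a second time.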

- Gunicorn: enable `--preload` in entrypoint to avoid thread races and double registration
- Document foreground vs background log “double consumer” caveat (attach vs `compose logs`)

- Jinja: replace `{% return %}` with structured `if/elif/else` branches
- Add toggle button to show raw JSON for TLS/CT section

- Consumers should import the rules engine via:
  - `from app.state import get_rules_engine`
- Use `build_rules_engine()` **only** during preload/init to construct the instance,
  then publish with `set_rules_engine()`. Do not call old singleton factories.

- New/changed modules (high level):
  - `app/utils/urltools.py` (+) — URLNormalizer + `get_url_normalizer()`
  - `app/rules/function_rules.py` (±) — normalized payload returns
  - `engine/function_rule_adapter.py` (±) — coercion, fact adaptation, bind logs
  - `app/utils/rules_engine.py` (±) — `_rules`, idempotent `add_rule`, fixes
  - `app/rules/factory.py` (±) — pure builder; totals logged post-registration
  - `app/state.py` (+) — process-global rules engine
  - `app/logging_setup.py` (±) — single chain, two named loggers
  - `app/wsgi.py` (±) — preload build + `set_rules_engine()`
  - `entrypoint.sh` (±) — add `--preload`
  - templates (±) — TLS card, raw toggle; front-page checkbox

Closes: flaky rule-type warnings, duplicate logs, and multi-worker race on rules init.
2025-08-21 22:05:16 -05:00
parent f639ad0934
commit 693f7d67b9
22 changed files with 1476 additions and 256 deletions


@@ -0,0 +1,182 @@
{# templates/_macros_ssl_tls.html #}
{% macro ssl_tls_card(ssl_tls) %}
<div class="card" id="ssl">
<h2 class="card-title">SSL/TLS Intelligence</h2>
{# -------- 1) Error branch -------- #}
{% if ssl_tls is none or 'error' in ssl_tls %}
<div class="badge badge-danger">Error</div>
<p class="muted">SSL/TLS enrichment failed or is unavailable.</p>
{% if ssl_tls and ssl_tls.error %}<pre class="prewrap">{{ ssl_tls.error }}</pre>{% endif %}
{# -------- 2) Skipped branch -------- #}
{% elif ssl_tls.skipped %}
<div class="badge badge-muted">Skipped</div>
{% if ssl_tls.reason %}<span class="muted small">{{ ssl_tls.reason }}</span>{% endif %}
<div class="section">
<button class="badge badge-muted" data-toggle="tls-raw">Toggle raw</button>
<pre id="tls-raw" hidden>{{ ssl_tls|tojson(indent=2) }}</pre>
</div>
{# -------- 3) Normal branch (render probe + crt.sh) -------- #}
{% else %}
{# ===================== LIVE PROBE ===================== #}
{% set probe = ssl_tls.probe if ssl_tls else None %}
<section class="section">
<div class="section-header">
<h3>Live TLS Probe</h3>
{% if probe %}
<span class="muted">Host:</span> <code>{{ probe.hostname }}:{{ probe.port }}</code>
{% endif %}
</div>
{% if not probe %}
<p class="muted">No probe data.</p>
{% else %}
<div class="tls-matrix">
{% set versions = ['TLS1.0','TLS1.1','TLS1.2','TLS1.3'] %}
{% for v in versions %}
{% set r = probe.results_by_version.get(v) if probe.results_by_version else None %}
<div class="tls-matrix-row">
<div class="tls-cell version">{{ v }}</div>
{% if r and r.supported %}
<div class="tls-cell status"><span class="badge badge-ok">Supported</span></div>
<div class="tls-cell cipher">
{% if r.selected_cipher %}
<span class="chip">{{ r.selected_cipher }}</span>
{% else %}
<span class="muted">cipher: n/a</span>
{% endif %}
</div>
<div class="tls-cell latency">
{% if r.handshake_seconds is not none %}
<span class="muted">{{ '%.0f' % (r.handshake_seconds*1000) }} ms</span>
{% else %}
<span class="muted"></span>
{% endif %}
</div>
{% else %}
<div class="tls-cell status"><span class="badge badge-muted">Not Supported</span></div>
<div class="tls-cell cipher">
{% if r and r.error %}
<span class="muted small">({{ r.error }})</span>
{% else %}
<span class="muted"></span>
{% endif %}
</div>
<div class="tls-cell latency"><span class="muted"></span></div>
{% endif %}
</div>
{% endfor %}
</div>
<div class="flag-row">
{% if probe.weak_protocols and probe.weak_protocols|length > 0 %}
<span class="badge badge-warn">Weak Protocols</span>
{% for wp in probe.weak_protocols %}
<span class="chip chip-warn">{{ wp }}</span>
{% endfor %}
{% endif %}
{% if probe.weak_ciphers and probe.weak_ciphers|length > 0 %}
<span class="badge badge-warn">Weak Ciphers</span>
{% for wc in probe.weak_ciphers %}
<span class="chip chip-warn">{{ wc }}</span>
{% endfor %}
{% endif %}
</div>
{% if probe.errors and probe.errors|length > 0 %}
<details class="details">
<summary>Probe Notes</summary>
<ul class="list">
{% for e in probe.errors %}
<li class="muted small">{{ e }}</li>
{% endfor %}
</ul>
</details>
{% endif %}
{% endif %}
</section>
<hr class="divider"/>
{# ===================== CRT.SH ===================== #}
{% set crtsh = ssl_tls.crtsh if ssl_tls else None %}
<section class="section">
<div class="section-header">
<h3>Certificate Transparency (crt.sh)</h3>
{% if crtsh %}
<span class="muted">Parsed:</span>
<code>{{ crtsh.hostname or 'n/a' }}</code>
{% if crtsh.root_domain %}
<span class="muted"> • Root:</span> <code>{{ crtsh.root_domain }}</code>
{% if crtsh.is_root_domain %}<span class="badge badge-ok">Root</span>{% else %}<span class="badge badge-muted">Subdomain</span>{% endif %}
{% endif %}
{% endif %}
</div>
{% if not crtsh %}
<p class="muted">No CT data.</p>
{% else %}
<div class="grid two">
<div>
<h4 class="muted">Host Certificates</h4>
{% set host_certs = crtsh.crtsh.host_certs if 'crtsh' in crtsh and crtsh.crtsh else None %}
{% if host_certs and host_certs|length > 0 %}
<ul class="list">
{% for c in host_certs[:10] %}
<li class="mono small">
<span class="chip">{{ c.get('issuer_name','issuer n/a') }}</span>
<span class="muted"></span>
<strong>{{ c.get('name_value','(name n/a)') }}</strong>
<span class="muted"> • not_before:</span> {{ c.get('not_before','?') }}
</li>
{% endfor %}
</ul>
{% if host_certs|length > 10 %}
<div class="muted small">(+ {{ host_certs|length - 10 }} more)</div>
{% endif %}
{% else %}
<p class="muted">No active host certs found.</p>
{% endif %}
</div>
<div>
<h4 class="muted">Wildcard on Root</h4>
{% set wc = crtsh.crtsh.wildcard_root_certs if 'crtsh' in crtsh and crtsh.crtsh else None %}
{% if wc and wc|length > 0 %}
<ul class="list">
{% for c in wc[:10] %}
<li class="mono small">
<span class="chip">{{ c.get('issuer_name','issuer n/a') }}</span>
<span class="muted"></span>
<strong>{{ c.get('name_value','(name n/a)') }}</strong>
<span class="muted"> • not_before:</span> {{ c.get('not_before','?') }}
</li>
{% endfor %}
</ul>
{% if wc|length > 10 %}
<div class="muted small">(+ {{ wc|length - 10 }} more)</div>
{% endif %}
{% else %}
<p class="muted">No wildcard/root certs found.</p>
{% endif %}
</div>
</div>
{% endif %}
</section>
{# ===================== RAW JSON TOGGLE ===================== #}
<div class="section">
<button class="badge badge-muted" data-toggle="tls-raw">Toggle raw</button>
<pre id="tls-raw" hidden>{{ ssl_tls|tojson(indent=2) }}</pre>
</div>
{% endif %}
<p><a href="#top-jump-list">Back to top</a></p>
</div>
{% endmacro %}