added notes to settings.yaml

moved core app config (name, version) out of settings and into app/app_settings.py added ability to brand SneakyScope to any name added caching of cert information from crt.sh (cache enable and lenght is configurable in settings.yaml) streamlined header/footer loading to be more correct
2025-08-23 20:37:44 -05:00
parent 5af8513e14
commit b59bf67329
17 changed files with 317 additions and 56 deletions
--- a/docs/rules-rebuild-remember.md
+++ b/docs/rules-rebuild-remember.md
@@ -0,0 +1,113 @@
+Got it — here’s a structured outline you can keep as a “memory primer” for when you want to resume this work (whether continuing the current engine or pivoting to a SaaS build). Think of it as a **checkpoint map** of what we’ve covered and what directions you flagged.
+
+---
+
+# SneakyScope – Rules Engine Refactor & SaaS Considerations (Outline)
+
+## 1. Current State
+
+* Engine: `app/rules/rules_engine.py` — mixed YAML + code rules, `(bool, str)` return shape expected.
+* Issue: function-based rules returning `(bool, dict)` → caused invalid type warnings.
+* Stopgap: `FunctionRuleAdapter` converted returns.
+* Desire: Remove YAML entirely; rules defined in code only.
+
+---
+
+## 2. Agreed Direction
+
+* **Rule definition approach**:
+
+  * Option B chosen → decorator-based registration.
+  * Every rule defined in `app/rules/` as Python functions.
+  * Rules register with metadata (`name`, `category`, `severity`, etc.).
+
+* **Return shape**:
+
+  * Always return a **Result dict** (no adapter needed).
+  * Engine enforces schema and fills in defaults.
+
+* **Engine relocation**:
+
+  * Move to `app/utils/rules_engine/`.
+  * Responsibilities: load, validate, freeze registry, run rules, aggregate results, log/report.
+
+---
+
+## 3. Result Schema (concept)
+
+* **Per RuleResult**
+
+  * Required: `ok: bool`, `message: str`.
+  * Identity: `name`, `category`, `severity`, `tags`, `rule_version`.
+  * Detail: `data: object|null`.
+  * Timing: `duration_ms`.
+  * Errors: structured `error` object if exceptions occur.
+  * Provenance: `source_module`, optional `policy` snapshot.
+
+* **Per AnalysisResult (run-level envelope)**
+
+  * Input scope: target URL, category, content hash, facts profile.
+  * Provenance: run\_id, engine\_version, ruleset\_checksum, timestamp, duration.
+  * Results: array of RuleResults.
+  * Summary: counts by severity, match count, errors, first match, top severity.
+  * Artifacts: references (screenshot, DOM snapshot, etc.).
+  * Policy snapshot: optional central policy/overrides.
+
+---
+
+## 4. Operational Standards
+
+* **Determinism**: identical inputs + ruleset\_checksum → identical results.
+* **Message stability**: avoid wording churn; expand via `data`.
+* **Size limits**: `message ≤ 256 chars`; `data ≤ 8–16 KB`.
+* **Errors**: `ok=false` if error present; always emit `message`.
+* **Severity**: rule sets default; policy may override.
+* **Tags**: controlled vocabulary; additive.
+
+---
+
+## 5. Migration Plan
+
+1. Create new `rules_engine` package in `app/utils/`.
+2. Add decorator/registry for rules.
+3. Port all rules from YAML → Python modules grouped by category.
+4. Delete YAML loader + adapters.
+5. Update call sites to build `facts` and call `engine.run(...)`.
+6. Add CI tests:
+
+   * Schema compliance.
+   * No duplicates.
+   * Ruleset checksum snapshot.
+7. Integration tests with real fixtures.
+8. Benchmark & harden (caps on input size, rule runtime).
+
+---
+
+## 6. SaaS Expansion (future)
+
+* **Multi-tenancy**: separate org/user scopes for data and rule runs.
+* **RBAC**: roles (admin, analyst, viewer).
+* **Compliance**: logging, retention, export, audit trails.
+* **Rules**: centrally maintained, not user-editable.
+* **APIs**: authenticated endpoints, per-user quotas.
+* **Observability**: per-tenant metrics, alerts.
+* **Security**: sandboxing, strict module allowlists, compliance with SOC2/ISO.
+* **Data controls**: PII redaction, encryption, retention policies.
+
+---
+
+## 7. Future-Proofing Hooks
+
+* Versioning: ruleset checksum + per-rule versions.
+* Extensibility: support `actions`, `links`, `evidence` in Result.
+* Policy: central config for thresholds/overrides.
+* Hot reload (optional, dev-only).
+* Rule provenance tracking (source\_module, commit SHA).
+
+---
+
+✅ This outline is enough to “re-hydrate” the context later — you won’t need to dig back into old logs to remember why `(bool, str)` didn’t fit, why YAML was removed, or what schema we were converging on.
+
+---
+
+Do you want me to also save this in a **short “README-spec” style** (like `RESULTS.md`) so it can live in your repo as the contract doc for rules, or should I keep this as just your personal checkpoint outline?