added notes to settings.yaml

moved core app config (name, version) out of settings and into app/app_settings.py
added ability to brand SneakyScope to any name
added caching of cert information from crt.sh (cache enable and lenght is configurable in settings.yaml)

streamlined header/footer loading to be more correct
This commit is contained in:
2025-08-23 20:37:44 -05:00
parent 5af8513e14
commit b59bf67329
17 changed files with 317 additions and 56 deletions

View File

@@ -0,0 +1,113 @@
Got it — heres a structured outline you can keep as a “memory primer” for when you want to resume this work (whether continuing the current engine or pivoting to a SaaS build). Think of it as a **checkpoint map** of what weve covered and what directions you flagged.
---
# SneakyScope Rules Engine Refactor & SaaS Considerations (Outline)
## 1. Current State
* Engine: `app/rules/rules_engine.py` — mixed YAML + code rules, `(bool, str)` return shape expected.
* Issue: function-based rules returning `(bool, dict)` → caused invalid type warnings.
* Stopgap: `FunctionRuleAdapter` converted returns.
* Desire: Remove YAML entirely; rules defined in code only.
---
## 2. Agreed Direction
* **Rule definition approach**:
* Option B chosen → decorator-based registration.
* Every rule defined in `app/rules/` as Python functions.
* Rules register with metadata (`name`, `category`, `severity`, etc.).
* **Return shape**:
* Always return a **Result dict** (no adapter needed).
* Engine enforces schema and fills in defaults.
* **Engine relocation**:
* Move to `app/utils/rules_engine/`.
* Responsibilities: load, validate, freeze registry, run rules, aggregate results, log/report.
---
## 3. Result Schema (concept)
* **Per RuleResult**
* Required: `ok: bool`, `message: str`.
* Identity: `name`, `category`, `severity`, `tags`, `rule_version`.
* Detail: `data: object|null`.
* Timing: `duration_ms`.
* Errors: structured `error` object if exceptions occur.
* Provenance: `source_module`, optional `policy` snapshot.
* **Per AnalysisResult (run-level envelope)**
* Input scope: target URL, category, content hash, facts profile.
* Provenance: run\_id, engine\_version, ruleset\_checksum, timestamp, duration.
* Results: array of RuleResults.
* Summary: counts by severity, match count, errors, first match, top severity.
* Artifacts: references (screenshot, DOM snapshot, etc.).
* Policy snapshot: optional central policy/overrides.
---
## 4. Operational Standards
* **Determinism**: identical inputs + ruleset\_checksum → identical results.
* **Message stability**: avoid wording churn; expand via `data`.
* **Size limits**: `message ≤ 256 chars`; `data ≤ 816 KB`.
* **Errors**: `ok=false` if error present; always emit `message`.
* **Severity**: rule sets default; policy may override.
* **Tags**: controlled vocabulary; additive.
---
## 5. Migration Plan
1. Create new `rules_engine` package in `app/utils/`.
2. Add decorator/registry for rules.
3. Port all rules from YAML → Python modules grouped by category.
4. Delete YAML loader + adapters.
5. Update call sites to build `facts` and call `engine.run(...)`.
6. Add CI tests:
* Schema compliance.
* No duplicates.
* Ruleset checksum snapshot.
7. Integration tests with real fixtures.
8. Benchmark & harden (caps on input size, rule runtime).
---
## 6. SaaS Expansion (future)
* **Multi-tenancy**: separate org/user scopes for data and rule runs.
* **RBAC**: roles (admin, analyst, viewer).
* **Compliance**: logging, retention, export, audit trails.
* **Rules**: centrally maintained, not user-editable.
* **APIs**: authenticated endpoints, per-user quotas.
* **Observability**: per-tenant metrics, alerts.
* **Security**: sandboxing, strict module allowlists, compliance with SOC2/ISO.
* **Data controls**: PII redaction, encryption, retention policies.
---
## 7. Future-Proofing Hooks
* Versioning: ruleset checksum + per-rule versions.
* Extensibility: support `actions`, `links`, `evidence` in Result.
* Policy: central config for thresholds/overrides.
* Hot reload (optional, dev-only).
* Rule provenance tracking (source\_module, commit SHA).
---
✅ This outline is enough to “re-hydrate” the context later — you wont need to dig back into old logs to remember why `(bool, str)` didnt fit, why YAML was removed, or what schema we were converging on.
---
Do you want me to also save this in a **short “README-spec” style** (like `RESULTS.md`) so it can live in your repo as the contract doc for rules, or should I keep this as just your personal checkpoint outline?