SneakyScope/docs/rules-rebuild-remember.md

Here is a structured outline to keep as a “memory primer” for resuming this work, whether that means continuing the current engine or pivoting to a SaaS build. Think of it as a **checkpoint map** of what we’ve covered and which directions you flagged.

---

# SneakyScope – Rules Engine Refactor & SaaS Considerations (Outline)

## 1. Current State

* Engine: `app/rules/rules_engine.py` — mixed YAML + code rules, `(bool, str)` return shape expected.
* Issue: function-based rules returning `(bool, dict)` → caused invalid type warnings.
* Stopgap: `FunctionRuleAdapter` converted returns.
* Desire: Remove YAML entirely; rules defined in code only.
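
To make the return-shape mismatch concrete, here is a minimal sketch of what a stopgap adapter could look like. It is not the actual `FunctionRuleAdapter` code; the name `adapt_legacy_return` and the choice to serialize the dict as JSON are assumptions for illustration only.

```python
import json
from typing import Any, Callable, Mapping, Tuple

# Hypothetical stand-in for the stopgap adapter described above.
def adapt_legacy_return(
    fn: Callable[[Mapping[str, Any]], Tuple[bool, Any]]
) -> Callable[[Mapping[str, Any]], Tuple[bool, str]]:
    """Wrap a rule that returns (bool, dict) so callers receive (bool, str)."""
    def wrapper(facts: Mapping[str, Any]) -> Tuple[bool, str]:
        ok, detail = fn(facts)
        if isinstance(detail, str):
            return bool(ok), detail
        # Assumption: flatten structured detail into a stable JSON string.
        return bool(ok), json.dumps(detail, sort_keys=True, default=str)
    return wrapper


def sample_rule(facts: Mapping[str, Any]) -> Tuple[bool, dict]:
    return True, {"b": 1, "a": 2}


print(adapt_legacy_return(sample_rule)({}))
```

The adapter works, but the structured detail gets squashed into a string, which is one reason a Result dict is the better long-term shape.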

---

## 2. Agreed Direction

* **Rule definition approach**:

  * Option B chosen → decorator-based registration.
  * Every rule defined in `app/rules/` as Python functions.
  * Rules register with metadata (`name`, `category`, `severity`, etc.).

* **Return shape**:

  * Always return a **Result dict** (no adapter needed).
  * Engine enforces schema and fills in defaults.

* **Engine relocation**:

  * Move to `app/utils/rules_engine/`.
  * Responsibilities: load, validate, freeze registry, run rules, aggregate results, log/report.
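
The decorator-based registration and the "freeze registry" responsibility could be sketched roughly as follows. `RuleRegistry`, the `rule(...)` decorator signature, the example rule, and the `form_action` fact key are all hypothetical names, not existing SneakyScope code.

```python
from types import MappingProxyType
from typing import Any, Callable, Dict, Mapping, Tuple

RuleFn = Callable[[Mapping[str, Any]], Dict[str, Any]]


class RuleRegistry:
    """Collects rule functions plus metadata (hypothetical helper)."""

    def __init__(self) -> None:
        self._rules: Dict[str, Dict[str, Any]] = {}
        self._frozen = False

    def rule(self, *, name: str, category: str, severity: str = "info",
             tags: Tuple[str, ...] = ()) -> Callable[[RuleFn], RuleFn]:
        """Decorator factory: attach metadata and register the function."""
        def decorator(fn: RuleFn) -> RuleFn:
            if self._frozen:
                raise RuntimeError("registry is frozen; register rules at import time")
            if name in self._rules:
                raise ValueError(f"duplicate rule name: {name!r}")
            self._rules[name] = {
                "name": name,
                "category": category,
                "severity": severity,
                "tags": tuple(tags),
                "source_module": fn.__module__,
                "fn": fn,
            }
            return fn  # the function itself is returned unchanged
        return decorator

    def freeze(self) -> Mapping[str, Dict[str, Any]]:
        """Block further registration and return a read-only view."""
        self._frozen = True
        return MappingProxyType(self._rules)


registry = RuleRegistry()


@registry.rule(name="form_action_insecure", category="form",
               severity="high", tags=("transport",))
def form_action_insecure(facts: Mapping[str, Any]) -> Dict[str, Any]:
    # Assumptions: facts has a "form_action" key; ok=True means the rule matched.
    action = str(facts.get("form_action", ""))
    matched = action.startswith("http://")
    message = ("form posts over plain HTTP" if matched
               else "form action is not plain HTTP")
    return {"ok": matched, "message": message}


rules = registry.freeze()
```

Freezing after import keeps the ruleset fixed for the lifetime of a run, which is what makes a ruleset checksum meaningful.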

---

## 3. Result Schema (concept)

* **Per RuleResult**

  * Required: `ok: bool`, `message: str`.
  * Identity: `name`, `category`, `severity`, `tags`, `rule_version`.
  * Detail: `data: object|null`.
  * Timing: `duration_ms`.
  * Errors: structured `error` object if exceptions occur.
  * Provenance: `source_module`, optional `policy` snapshot.

* **Per AnalysisResult (run-level envelope)**

  * Input scope: target URL, category, content hash, facts profile.
  * Provenance: `run_id`, `engine_version`, `ruleset_checksum`, timestamp, duration.
  * Results: array of RuleResults.
  * Summary: counts by severity, match count, errors, first match, top severity.
  * Artifacts: references (screenshot, DOM snapshot, etc.).
  * Policy snapshot: optional central policy/overrides.
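
As a rough illustration of the schema concept, the per-rule fields can be written as a `TypedDict`, with the run-level summary derived from a list of results. The severity ladder, the reading of `ok=True` as "matched", and counting severities over matches only are assumptions; the outline does not pin these down.

```python
from collections import Counter
from typing import Any, Dict, List, Optional, TypedDict

# Assumed severity ladder, lowest to highest (not fixed by the outline).
SEVERITY_ORDER = ["info", "low", "medium", "high", "critical"]


class RuleResult(TypedDict, total=False):
    ok: bool                          # required
    message: str                      # required
    name: str
    category: str
    severity: str
    tags: List[str]
    rule_version: str
    data: Optional[Any]
    duration_ms: float
    error: Optional[Dict[str, str]]   # structured error, if the rule raised
    source_module: str


def _rank(severity: str) -> int:
    """Position on the assumed ladder; unknown severities sort lowest."""
    return SEVERITY_ORDER.index(severity) if severity in SEVERITY_ORDER else -1


def summarize(results: List[RuleResult]) -> Dict[str, Any]:
    """Build the Summary block of the run-level envelope."""
    # Assumption: ok=True means the rule matched.
    matches = [r for r in results if r["ok"]]
    errors = [r for r in results if r.get("error")]
    severities = [r.get("severity", "info") for r in matches]
    return {
        "match_count": len(matches),
        "error_count": len(errors),
        "counts_by_severity": dict(Counter(severities)),
        "first_match": matches[0].get("name") if matches else None,
        "top_severity": max(severities, key=_rank, default=None),
    }
```

Keeping the summary as a pure function of the results array means it can always be recomputed from a stored run.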

---

## 4. Operational Standards

* **Determinism**: identical inputs + `ruleset_checksum` → identical results.
* **Message stability**: avoid wording churn; expand via `data`.
* **Size limits**: `message ≤ 256 chars`; `data ≤ 8–16 KB`.
* **Errors**: `ok=false` if error present; always emit `message`.
* **Severity**: rule sets default; policy may override.
* **Tags**: controlled vocabulary; additive.
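
One way the engine might enforce these standards is a normalization step applied to every rule return. This is a sketch under assumptions: `normalize_result` and `run_rule` are hypothetical names, the data cap uses the upper end of the 8 to 16 KB range, and over-long messages are truncated rather than rejected.

```python
import json
import time
from typing import Any, Callable, Dict, Mapping

MAX_MESSAGE_CHARS = 256
MAX_DATA_BYTES = 16 * 1024  # assumption: upper end of the 8 to 16 KB range


def normalize_result(raw: Any, defaults: Mapping[str, Any]) -> Dict[str, Any]:
    """Validate required fields, fill defaults, and apply size limits."""
    if not isinstance(raw, Mapping):
        raise TypeError("rule must return a Result dict")
    if not isinstance(raw.get("ok"), bool):
        raise TypeError("result['ok'] must be a bool")
    if not isinstance(raw.get("message"), str):
        raise TypeError("result['message'] must be a str")

    # Rule-provided values win over registry defaults (e.g. severity).
    result: Dict[str, Any] = {"data": None, "error": None, **defaults, **raw}
    result["message"] = result["message"][:MAX_MESSAGE_CHARS]

    if result["data"] is not None:
        encoded = json.dumps(result["data"], sort_keys=True, default=str)
        size = len(encoded.encode("utf-8"))
        if size > MAX_DATA_BYTES:
            # Assumption: replace oversized data with a small marker.
            result["data"] = {"truncated": True, "original_bytes": size}

    if result["error"] is not None:
        result["ok"] = False  # ok=false whenever an error is present
    return result


def run_rule(fn: Callable[[Mapping[str, Any]], Any],
             facts: Mapping[str, Any],
             defaults: Mapping[str, Any]) -> Dict[str, Any]:
    """Run one rule, capture exceptions as a structured error, record timing."""
    start = time.perf_counter()
    try:
        raw = fn(facts)
    except Exception as exc:  # a failing rule must not abort the whole run
        raw = {
            "ok": False,
            "message": f"rule raised {type(exc).__name__}",
            "error": {"type": type(exc).__name__, "message": str(exc)},
        }
    result = normalize_result(raw, defaults)
    result["duration_ms"] = (time.perf_counter() - start) * 1000.0
    return result
```

A rule that returns the wrong shape still raises `TypeError` here, so schema violations surface as bugs instead of being silently coerced.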

---

## 5. Migration Plan

1. Create new `rules_engine` package in `app/utils/`.
2. Add decorator/registry for rules.
3. Port all rules from YAML → Python modules grouped by category.
4. Delete YAML loader + adapters.
5. Update call sites to build `facts` and call `engine.run(...)`.
6. Add CI tests:

   * Schema compliance.
   * No duplicates.
   * Ruleset checksum snapshot.
7. Integration tests with real fixtures.
8. Benchmark & harden (caps on input size, rule runtime).
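
The CI checks in step 6 could lean on two small helpers like the ones below. This is a sketch, not a settled design: the function names are hypothetical, and the checksum here covers rule metadata only, whereas a fuller version would probably also hash each rule's source.

```python
import hashlib
import json
from typing import Any, Dict, Iterable, List


def ruleset_checksum(rules: Iterable[Dict[str, Any]]) -> str:
    """SHA-256 over a canonical, order-independent view of rule metadata."""
    canonical = sorted(
        (
            {
                "name": r["name"],
                "category": r["category"],
                "severity": r["severity"],
                # Assumption: rules without an explicit version count as "1".
                "rule_version": r.get("rule_version", "1"),
            }
            for r in rules
        ),
        key=lambda r: r["name"],
    )
    payload = json.dumps(canonical, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def find_duplicate_names(rules: Iterable[Dict[str, Any]]) -> List[str]:
    """Return rule names that appear more than once, in first-seen order."""
    seen = set()
    dupes: List[str] = []
    for r in rules:
        name = r["name"]
        if name in seen and name not in dupes:
            dupes.append(name)
        seen.add(name)
    return dupes


sample_rules = [
    {"name": "b_rule", "category": "script", "severity": "low"},
    {"name": "a_rule", "category": "form", "severity": "high"},
]
```

A snapshot test would then store the checksum for the committed ruleset and fail whenever it changes without the snapshot being updated, which makes ruleset changes deliberate.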

---

## 6. SaaS Expansion (future)

* **Multi-tenancy**: separate org/user scopes for data and rule runs.
* **RBAC**: roles (admin, analyst, viewer).
* **Compliance**: logging, retention, export, audit trails.
* **Rules**: centrally maintained, not user-editable.
* **APIs**: authenticated endpoints, per-user quotas.
* **Observability**: per-tenant metrics, alerts.
* **Security**: sandboxing, strict module allowlists, compliance with SOC2/ISO.
* **Data controls**: PII redaction, encryption, retention policies.

---

## 7. Future-Proofing Hooks

* Versioning: ruleset checksum + per-rule versions.
* Extensibility: support `actions`, `links`, `evidence` in Result.
* Policy: central config for thresholds/overrides.
* Hot reload (optional, dev-only).
* Rule provenance tracking (`source_module`, commit SHA).

---

✅ This outline is enough to “re-hydrate” the context later — you won’t need to dig back into old logs to remember why `(bool, str)` didn’t fit, why YAML was removed, or what schema we were converging on.
