Files
chicken_babies_site/app/services/slugs.py
Phillip Tarrant 9a8506970c feat: phase 4 admin CMS — dashboard, editor, media, CSRF
Head Hen CMS end-to-end: dashboard lists all posts (drafts + published),
Markdown editor with live preview + drag-drop image upload, Pillow media
pipeline re-encoding every upload to JPEG, post CRUD + publish toggle +
hard delete, About page edit, and double-submit CSRF cookie enforced on
every admin mutating endpoint (Phase 3's TODO markers resolved).

Slug auto-generated on create and server-locked once a post has been
published. Unpublish preserves `published_at` so re-publish keeps
original date ordering. Every admin write invalidates the read-side
Post/Page TTL caches and records an `auth_events` audit row.

CSRF middleware is narrow by design — issues/refreshes the `cb_csrf`
cookie only on `GET /admin*`, and mutating endpoints opt in via
`require_csrf_form` or `require_csrf_header` Depends. Public routes,
healthz, and pre-auth login stay untouched.

64 new tests cover slugs, CSRF, media, admin posts/pages services, and
end-to-end CMS routes. Tests never mock the DB — real temp SQLite files
per the CLAUDE.md mandate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 20:42:01 -05:00

107 lines
3.4 KiB
Python

"""Slug helpers for posts (and, eventually, any other slug-keyed row).
A slug is the URL-safe identifier used in public post URLs. Keeping the
algorithm tiny, dependency-free, and in its own module makes it easy to
test in isolation and to reuse for the Phase 4 admin create/update
flow.
Rules applied by :func:`slugify`:
- lowercase the input
- replace every run of non-alphanumeric characters with a single ``-``
- collapse consecutive ``-`` runs
- strip leading and trailing ``-``
- never return an empty string — callers that pass empty / all-punctuation
input get a deterministic fallback (``"post"``) so they can still
build a valid URL.
:func:`ensure_unique` suffixes ``-2``, ``-3`` ... on collision, checking
the database row presence via a callable the caller supplies. Keeping
the DB access injectable keeps this module trivially testable.
"""
from __future__ import annotations
import re
from typing import Callable
# Single-pass regex collapses any run of non-alphanumeric characters
# into a single hyphen. Unicode letters are NOT preserved — the URL
# column is ASCII-safe by design, so exotic characters collapse away.
_NON_ALNUM_RE: re.Pattern[str] = re.compile(r"[^a-z0-9]+")
# Fallback slug when the user submits a title that slugifies to the
# empty string (e.g. only punctuation). Keeps write paths from crashing
# on pathological input while remaining human-readable in the URL.
_FALLBACK_SLUG: str = "post"
def slugify(title: str) -> str:
"""Return a URL-safe slug derived from ``title``.
Parameters
----------
title:
Human-authored title, typically from an admin form. Treated as
untrusted — no assumption about length or character set.
Returns
-------
str
A lowercased, hyphen-separated string containing only
``[a-z0-9-]`` with no leading or trailing hyphens. Never
empty; returns :data:`_FALLBACK_SLUG` if the input produced
an empty result after normalization.
"""
lowered = (title or "").lower()
collapsed = _NON_ALNUM_RE.sub("-", lowered).strip("-")
if not collapsed:
return _FALLBACK_SLUG
return collapsed
def ensure_unique(
base: str,
exists: Callable[[str], bool],
*,
max_attempts: int = 1000,
) -> str:
"""Return a slug not currently in use, suffixing ``-2`` / ``-3`` as needed.
Parameters
----------
base:
Starting slug — typically the output of :func:`slugify`.
exists:
Callable that returns ``True`` if the candidate slug is already
taken. The admin service passes a closure that hits the DB.
max_attempts:
Defensive bound on suffix-iteration so a degenerate ``exists``
callable can never spin forever. 1000 is wildly more than any
realistic collision rate.
Returns
-------
str
A slug ``exists`` returned ``False`` for. Raises
:class:`RuntimeError` in the pathological case where every
suffix is taken up to ``max_attempts``.
"""
if not exists(base):
return base
# Start at -2 because the bare slug is already taken. -1 would be
# reserved for the same row we're competing with, which is confusing
# in the DB.
for n in range(2, max_attempts + 1):
candidate = f"{base}-{n}"
if not exists(candidate):
return candidate
raise RuntimeError(
f"could not allocate a unique slug after {max_attempts} attempts"
f" (base={base!r})"
)