diff --git a/06a-ALPHA-SCOPE.md b/06a-ALPHA-SCOPE.md index 7cd9e5e..41fd2bc 100644 --- a/06a-ALPHA-SCOPE.md +++ b/06a-ALPHA-SCOPE.md @@ -95,13 +95,11 @@ GET /api/v1/projects/{id}/nodes?status=todo&assignee=agent-1&unblocked=true - Flat comment stream per node. Markdown body. Author + timestamp. -### Infrastructure (~10 containers) +### Infrastructure (~8 containers) ``` api (FastAPI + uvicorn) frontend (Vue 3 + Vite) -worker (Taskiq — webhook delivery, skeleton inference) -neo4j postgres redis centrifugo diff --git a/10-ALPHA-DB-SCHEMA.md b/10-ALPHA-DB-SCHEMA.md new file mode 100644 index 0000000..ad919ed --- /dev/null +++ b/10-ALPHA-DB-SCHEMA.md @@ -0,0 +1,604 @@ +# Non-Linear: Alpha Database Schema (Postgres-Only) + +## Design Decisions + +### Why Postgres-Only + +Alpha ships without Neo4j (~8 containers). The graph in alpha is small (<500 nodes per project) with limited lateral link types (`blocks`, `relates_to`). Postgres handles this comfortably via: + +- Adjacency list (`parent_id`) for the decomposition tree +- `ltree` materialized path for fast subtree queries without recursive CTEs +- A `lateral_links` table for typed edges between nodes + +The dual-DB architecture (Neo4j for topology, Postgres for content) remains the long-term direction for Layer 3 code connections and cross-project edges. This schema is designed to migrate cleanly: the `nodes` table splits into Neo4j (topology + lightweight props) and Postgres (content) by extracting `description`, `description_html` into a `node_content` table keyed by the same UUID. + +### Unified Nodes Table + +Components and issues share a single `nodes` table with a `node_type` discriminator. 
Rationale: + +- The decomposition tree mixes both types (a component's child can be an issue, an issue's child can be a sub-issue) +- Tree queries (`parent_id`, `path`) operate uniformly across types +- Issue-specific columns (`status`, `assignee_id`) are nullable and ignored for components +- Avoids polymorphic joins for tree traversal + +### ltree for Subtree Queries + +Each node stores a materialized `path` column of type `ltree`. Example: `root.12.45`. This enables: + +- `SELECT * FROM nodes WHERE path <@ 'root.12'`: all descendants of a node (the node itself included) +- `SELECT * FROM nodes WHERE path @> 'root.12.45'`: all ancestors of a node (the node itself included) +- Index-backed, no recursion needed + +The path uses each node's integer short ID (the `short_id` column) as segments for compactness. Updated on reparent via a single `UPDATE ... SET path = new_parent_path || subpath(path, nlevel(old_path) - 1)` for the subtree, where `old_path` is the moved node's current path. + +### Labels as Array + +Labels are stored as `text[]` with a GIN index. For alpha's freeform tags this is simpler than a join table and supports queries like `WHERE labels @> ARRAY['bug', 'p0']`. A normalized `labels` table can be introduced post-alpha if label management (rename, merge, color) becomes necessary. + +### UUIDv7 + +All primary keys use UUIDv7 (time-sortable, generated application-side). Benefits: + +- Natural chronological ordering without a separate `created_at` sort +- Safe for distributed ID generation (no coordination needed) +- Same ID used across systems (future Neo4j migration, Centrifugo channels, webhook payloads) + +### Short IDs + +Human-readable IDs like `NL-42` are generated per-project using an atomic counter (`next_short_id` on the `projects` table). The prefix is configurable per project. Short IDs are unique within a project and immutable once assigned. 
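The `NL-42` convention above can be sketched in application code. This is a minimal illustration, not the project's implementation: `format_short_id` and `parse_short_id` are hypothetical helper names, and the accepted prefix pattern (a letter followed by letters or digits) is an assumption the schema does not state.

```python
import re
from typing import Tuple

# Assumed prefix shape: a letter followed by letters/digits. The schema only
# says the prefix is TEXT, so this pattern is an illustrative assumption.
_SHORT_ID_RE = re.compile(r"^([A-Za-z][A-Za-z0-9]*)-([1-9][0-9]*)$")


def format_short_id(prefix: str, short_id: int) -> str:
    """Render the human-readable ID from a project prefix and a node's integer."""
    if short_id < 1:
        raise ValueError("short_id counters start at 1")
    return f"{prefix}-{short_id}"


def parse_short_id(text: str) -> Tuple[str, int]:
    """Split a human-readable ID such as 'NL-42' into (prefix, integer)."""
    match = _SHORT_ID_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a short ID: {text!r}")
    return match.group(1), int(match.group(2))
```

A lookup by human-readable ID would then resolve the prefix to a project and query `nodes` on `(project_id, short_id)`, which the unique constraint already indexes.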
+ +--- + +## Extensions + +```sql +CREATE EXTENSION IF NOT EXISTS ltree; +CREATE EXTENSION IF NOT EXISTS pgcrypto; -- gen_random_uuid() fallback if app doesn't supply UUIDv7 +``` + +--- + +## Enums + +```sql +CREATE TYPE node_type AS ENUM ('component', 'issue'); + +CREATE TYPE node_status AS ENUM ( + 'backlog', + 'todo', + 'in_progress', + 'in_review', + 'done', + 'cancelled' +); + +CREATE TYPE link_type AS ENUM ('blocks', 'relates_to'); + +CREATE TYPE actor_type AS ENUM ('user', 'agent'); + +CREATE TYPE workspace_role AS ENUM ('owner', 'member'); + +CREATE TYPE project_role AS ENUM ('owner', 'member', 'agent'); + +CREATE TYPE repo_provider AS ENUM ('github', 'gitlab'); + +CREATE TYPE audit_action AS ENUM ( + 'node_created', + 'node_updated', + 'node_deleted', + 'node_reparented', + 'status_changed', + 'assignee_changed', + 'labels_changed', + 'link_created', + 'link_deleted', + 'comment_created', + 'comment_updated', + 'comment_deleted', + 'member_added', + 'member_removed', + 'member_role_changed', + 'repo_connected', + 'repo_disconnected', + 'webhook_created', + 'webhook_deleted' +); +``` + +--- + +## Tables + +### workspaces + +```sql +CREATE TABLE workspaces ( + id UUID PRIMARY KEY, + name TEXT NOT NULL, + slug TEXT NOT NULL UNIQUE, + created_at TIMESTAMPTZ NOT NULL DEFAULT now(), + updated_at TIMESTAMPTZ NOT NULL DEFAULT now() +); + +CREATE UNIQUE INDEX idx_workspaces_slug ON workspaces (slug); +``` + +### actors + +```sql +CREATE TABLE actors ( + id UUID PRIMARY KEY, + actor_type actor_type NOT NULL, + display_name TEXT NOT NULL, + email TEXT, + avatar_url TEXT, + authentik_uid TEXT UNIQUE, -- OIDC subject claim (users) + created_at TIMESTAMPTZ NOT NULL DEFAULT now(), + updated_at TIMESTAMPTZ NOT NULL DEFAULT now() +); + +CREATE INDEX idx_actors_email ON actors (email) WHERE email IS NOT NULL; +CREATE INDEX idx_actors_authentik ON actors (authentik_uid) WHERE authentik_uid IS NOT NULL; +``` + +### workspace_members + +```sql +CREATE TABLE workspace_members ( 
+ workspace_id UUID NOT NULL REFERENCES workspaces(id) ON DELETE CASCADE, + actor_id UUID NOT NULL REFERENCES actors(id) ON DELETE CASCADE, + role workspace_role NOT NULL DEFAULT 'member', + joined_at TIMESTAMPTZ NOT NULL DEFAULT now(), + + PRIMARY KEY (workspace_id, actor_id) +); + +CREATE INDEX idx_wm_actor ON workspace_members (actor_id); +``` + +### projects + +```sql +CREATE TABLE projects ( + id UUID PRIMARY KEY, + workspace_id UUID NOT NULL REFERENCES workspaces(id) ON DELETE CASCADE, + name TEXT NOT NULL, + slug TEXT NOT NULL, + short_id_prefix TEXT NOT NULL DEFAULT 'NL', -- e.g. "NL" → NL-1, NL-2 + next_short_id INTEGER NOT NULL DEFAULT 1, -- atomically incremented + root_node_id UUID, -- set after root node creation + settings JSONB NOT NULL DEFAULT '{}', -- custom statuses, defaults (post-alpha) + created_at TIMESTAMPTZ NOT NULL DEFAULT now(), + updated_at TIMESTAMPTZ NOT NULL DEFAULT now(), + + UNIQUE (workspace_id, slug) +); + +CREATE INDEX idx_projects_workspace ON projects (workspace_id); +``` + +### project_members + +```sql +CREATE TABLE project_members ( + project_id UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE, + actor_id UUID NOT NULL REFERENCES actors(id) ON DELETE CASCADE, + role project_role NOT NULL DEFAULT 'member', + joined_at TIMESTAMPTZ NOT NULL DEFAULT now(), + + PRIMARY KEY (project_id, actor_id) +); + +CREATE INDEX idx_pm_actor ON project_members (actor_id); +``` + +### nodes + +The core table. Stores both components and issues in a single table with the decomposition tree structure. 
+ +```sql +CREATE TABLE nodes ( + id UUID PRIMARY KEY, + project_id UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE, + node_type node_type NOT NULL, + short_id INTEGER NOT NULL, -- numeric part: 42 in "NL-42" + title TEXT NOT NULL, + description TEXT, -- markdown source + description_html TEXT, -- pre-rendered, sanitized HTML + + -- Tree structure + parent_id UUID REFERENCES nodes(id) ON DELETE SET NULL, + path ltree NOT NULL, -- materialized path for subtree queries + + -- Issue-specific (NULL for components) + status node_status, + assignee_id UUID REFERENCES actors(id) ON DELETE SET NULL, + created_by UUID NOT NULL REFERENCES actors(id), + + -- Shared + labels TEXT[] NOT NULL DEFAULT '{}', + + -- Timestamps + created_at TIMESTAMPTZ NOT NULL DEFAULT now(), + updated_at TIMESTAMPTZ NOT NULL DEFAULT now(), + + -- Constraints + UNIQUE (project_id, short_id), + CONSTRAINT chk_issue_has_status CHECK ( + (node_type = 'issue' AND status IS NOT NULL) + OR node_type = 'component' + ), + CONSTRAINT chk_component_no_status CHECK ( + (node_type = 'component' AND status IS NULL) + OR node_type = 'issue' + ) +); + +-- Tree traversal +CREATE INDEX idx_nodes_parent ON nodes (parent_id) WHERE parent_id IS NOT NULL; +CREATE INDEX idx_nodes_path ON nodes USING GIST (path); + +-- Filtering +CREATE INDEX idx_nodes_project ON nodes (project_id); +CREATE INDEX idx_nodes_status ON nodes (project_id, status) WHERE status IS NOT NULL; +CREATE INDEX idx_nodes_assignee ON nodes (assignee_id) WHERE assignee_id IS NOT NULL; +CREATE INDEX idx_nodes_labels ON nodes USING GIN (labels); +CREATE INDEX idx_nodes_type ON nodes (project_id, node_type); + +-- Full-text search +ALTER TABLE nodes ADD COLUMN search_vector tsvector + GENERATED ALWAYS AS ( + setweight(to_tsvector('english', coalesce(title, '')), 'A') || + setweight(to_tsvector('english', coalesce(description, '')), 'B') + ) STORED; + +CREATE INDEX idx_nodes_fts ON nodes USING GIN (search_vector); + +-- Short ID lookup +CREATE INDEX 
idx_nodes_short_id ON nodes (project_id, short_id); +``` + +### lateral_links + +Typed edges between nodes (not part of the decomposition tree). + +```sql +CREATE TABLE lateral_links ( + id UUID PRIMARY KEY, + project_id UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE, + source_id UUID NOT NULL REFERENCES nodes(id) ON DELETE CASCADE, + target_id UUID NOT NULL REFERENCES nodes(id) ON DELETE CASCADE, + link_type link_type NOT NULL, + created_by UUID NOT NULL REFERENCES actors(id), + created_at TIMESTAMPTZ NOT NULL DEFAULT now(), + + -- No self-links, no duplicate links in same direction + CONSTRAINT chk_no_self_link CHECK (source_id != target_id), + UNIQUE (source_id, target_id, link_type) +); + +CREATE INDEX idx_links_source ON lateral_links (source_id); +CREATE INDEX idx_links_target ON lateral_links (target_id); +CREATE INDEX idx_links_project ON lateral_links (project_id); +``` + +**Semantics:** + +- `blocks`: directed — `source` blocks `target`. Query "what blocks issue X" = `WHERE target_id = X AND link_type = 'blocks'`. +- `relates_to`: undirected — stored once (lower UUID as source by convention). Query both directions. + +### comments + +Flat comment stream per node. 
+ +```sql +CREATE TABLE comments ( + id UUID PRIMARY KEY, + node_id UUID NOT NULL REFERENCES nodes(id) ON DELETE CASCADE, + author_id UUID NOT NULL REFERENCES actors(id), + body TEXT NOT NULL, -- markdown source + body_html TEXT NOT NULL, -- pre-rendered, sanitized HTML + created_at TIMESTAMPTZ NOT NULL DEFAULT now(), + updated_at TIMESTAMPTZ NOT NULL DEFAULT now() +); + +CREATE INDEX idx_comments_node ON comments (node_id, created_at); +CREATE INDEX idx_comments_author ON comments (author_id); + +-- Full-text search on comments +ALTER TABLE comments ADD COLUMN search_vector tsvector + GENERATED ALWAYS AS ( + to_tsvector('english', coalesce(body, '')) + ) STORED; + +CREATE INDEX idx_comments_fts ON comments USING GIN (search_vector); +``` + +### audit_events + +Append-only change history. Every mutation is recorded. + +```sql +CREATE TABLE audit_events ( + id UUID PRIMARY KEY, + project_id UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE, + actor_id UUID NOT NULL REFERENCES actors(id), + action audit_action NOT NULL, + node_id UUID, -- NULL for non-node events (member changes, etc.) + before_data JSONB, -- snapshot of changed fields before mutation + after_data JSONB, -- snapshot of changed fields after mutation + metadata JSONB, -- additional context (e.g. commit SHA for linked changes) + created_at TIMESTAMPTZ NOT NULL DEFAULT now() +); + +CREATE INDEX idx_audit_project_time ON audit_events (project_id, created_at DESC); +CREATE INDEX idx_audit_node ON audit_events (node_id, created_at DESC) WHERE node_id IS NOT NULL; +CREATE INDEX idx_audit_actor ON audit_events (actor_id, created_at DESC); +``` + +### repo_connections + +Repositories linked to a project. Components reference these via `repo_connection_id`. 
+ +```sql +CREATE TABLE repo_connections ( + id UUID PRIMARY KEY, + project_id UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE, + provider repo_provider NOT NULL, + repo_url TEXT NOT NULL, + default_branch TEXT NOT NULL DEFAULT 'main', + access_token_enc TEXT, -- encrypted OAuth token + webhook_secret_hash TEXT, -- hashed webhook secret for incoming pushes + connected_by UUID NOT NULL REFERENCES actors(id), + connected_at TIMESTAMPTZ NOT NULL DEFAULT now(), + + UNIQUE (project_id, repo_url) +); + +CREATE INDEX idx_repos_project ON repo_connections (project_id); +``` + +### node_repo_links + +Maps components to specific paths within connected repositories. + +```sql +CREATE TABLE node_repo_links ( + node_id UUID PRIMARY KEY REFERENCES nodes(id) ON DELETE CASCADE, + repo_connection_id UUID NOT NULL REFERENCES repo_connections(id) ON DELETE CASCADE, + path TEXT, -- subdirectory within repo (NULL = repo root) + branch TEXT -- branch override (NULL = repo default) +); + +CREATE INDEX idx_nrl_repo ON node_repo_links (repo_connection_id); +``` + +### api_tokens + +Bearer tokens for agent access. + +```sql +CREATE TABLE api_tokens ( + id UUID PRIMARY KEY, + actor_id UUID NOT NULL REFERENCES actors(id) ON DELETE CASCADE, + token_hash TEXT NOT NULL UNIQUE, -- SHA-256 hash of the token (never store plaintext) + name TEXT NOT NULL, -- human-readable label ("triage-agent-prod") + last_used_at TIMESTAMPTZ, + expires_at TIMESTAMPTZ, + created_at TIMESTAMPTZ NOT NULL DEFAULT now(), + + CONSTRAINT chk_expiry CHECK (expires_at IS NULL OR expires_at > created_at) +); + +CREATE INDEX idx_tokens_actor ON api_tokens (actor_id); +CREATE INDEX idx_tokens_hash ON api_tokens (token_hash); +``` + +### webhook_configs + +Minimal webhook registration per project. + +```sql +CREATE TABLE webhook_configs ( + id UUID PRIMARY KEY, + project_id UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE, + url TEXT NOT NULL, + events TEXT[] NOT NULL DEFAULT '{}', -- e.g. 
{'node.status_changed', 'comment.added'} + active BOOLEAN NOT NULL DEFAULT true, + consecutive_failures INTEGER NOT NULL DEFAULT 0, + last_delivery_at TIMESTAMPTZ, + created_by UUID NOT NULL REFERENCES actors(id), + created_at TIMESTAMPTZ NOT NULL DEFAULT now() +); + +CREATE INDEX idx_webhooks_project ON webhook_configs (project_id); +``` + +--- + +## ltree Path Maintenance + +### Path Format + +Each node's `path` is composed of segments representing the chain of `short_id` values from root to node, prefixed with the project's root identifier: + +``` +root.12.45.78 + │ │ │ └── this node (short_id=78) + │ │ └── parent (short_id=45) + │ └── grandparent (short_id=12) + └── project root sentinel +``` + +Using integer short IDs as segments keeps paths compact and unique within a project. + +### On Node Creation + +```sql +-- Pseudocode (application layer): +-- 1. Atomically claim a short_id: +UPDATE projects SET next_short_id = next_short_id + 1 +WHERE id = :project_id +RETURNING next_short_id - 1 AS new_short_id; + +-- 2. Compute path from parent: +-- If parent_id IS NULL (root node): path = 'root' +-- Otherwise: path = parent.path || new_short_id::text (ltree || text appends one label) +INSERT INTO nodes (id, project_id, node_type, short_id, title, parent_id, path, ...) +VALUES (:id, :project_id, :type, :new_short_id, :title, :parent_id, :computed_path, ...); +``` + +### On Reparent + +When a node moves to a new parent, update the entire subtree's paths in one statement: + +```sql +-- :old_path = current node's path (e.g. 'root.12.45') +-- :new_parent_path = new parent's path (e.g. 'root.99') + +UPDATE nodes +SET + -- Offset nlevel(old_path) - 1 keeps the moved node's own label and everything below it, + -- so the moved row itself never asks subpath() for an empty path. + path = CAST(:new_parent_path AS ltree) + || subpath(path, nlevel(CAST(:old_path AS ltree)) - 1), + parent_id = CASE WHEN id = :node_id THEN :new_parent_id ELSE parent_id END, + updated_at = now() +WHERE path <@ CAST(:old_path AS ltree); +``` + +This updates the moved node and all its descendants in a single statement; the `WHERE path <@ ...` filter can use the GiST index on `path`. 
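The subtree rewrite above can be modeled in plain Python to check the path arithmetic without a database. This is a sketch under stated assumptions: paths are dot-separated strings standing in for `ltree` values, the function names are hypothetical, and the cycle guard (refusing to move a node under itself or its own descendant) is an added assumption that the schema document does not spell out.

```python
from typing import Dict


def is_descendant_or_self(path: str, ancestor: str) -> bool:
    """String stand-in for ltree's `path <@ ancestor`."""
    return path == ancestor or path.startswith(ancestor + ".")


def rewrite_path(path: str, old_path: str, new_parent_path: str) -> str:
    """Graft one row of the moved subtree under the new parent.

    Keeps the labels from the moved node's own label downward, which is the
    same slice as subpath(path, nlevel(old_path) - 1) in ltree terms.
    """
    keep_from = old_path.count(".")  # index of the moved node's own label
    tail = path.split(".")[keep_from:]
    return ".".join([new_parent_path] + tail)


def reparent_subtree(
    paths: Dict[str, str], old_path: str, new_parent_path: str
) -> Dict[str, str]:
    """Return a copy of `paths` (node id -> path) with the subtree moved."""
    # Assumed guard, not stated in the schema: a move into the node's own
    # subtree would make the node its own ancestor.
    if is_descendant_or_self(new_parent_path, old_path):
        raise ValueError("cannot move a node under itself or its descendant")
    return {
        node_id: rewrite_path(p, old_path, new_parent_path)
        if is_descendant_or_self(p, old_path)
        else p
        for node_id, p in paths.items()
    }
```

For example, moving `root.12.45` under `root.99` turns `root.12.45.78` into `root.99.45.78` and leaves siblings such as `root.124` untouched, because the prefix test requires a full label match rather than a raw string prefix.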
+ +--- + +## Short ID Generation + +Short IDs are assigned atomically using `UPDATE ... RETURNING`: + +```sql +-- Claim next short_id for a project (called from application layer) +UPDATE projects +SET next_short_id = next_short_id + 1 +WHERE id = :project_id +RETURNING next_short_id - 1 AS short_id; +``` + +The full human-readable ID is `{project.short_id_prefix}-{short_id}`, e.g. `NL-42`. This is computed at read time, not stored as a string — only the integer is persisted on the node. + +--- + +## Full-Text Search + +Search is powered by generated `tsvector` columns with GIN indexes on both `nodes` and `comments`. + +### Query Pattern + +```sql +-- Search nodes in a project +SELECT id, short_id, title, node_type, + ts_rank(search_vector, query) AS rank +FROM nodes, to_tsquery('english', :search_term) query +WHERE project_id = :project_id + AND search_vector @@ query +ORDER BY rank DESC +LIMIT 20; + +-- Search comments in a project (via join) +SELECT c.id, c.node_id, c.body, c.author_id, + ts_rank(c.search_vector, query) AS rank +FROM comments c +JOIN nodes n ON n.id = c.node_id, to_tsquery('english', :search_term) query +WHERE n.project_id = :project_id + AND c.search_vector @@ query +ORDER BY rank DESC +LIMIT 20; +``` + +### Command Palette Search + +The command palette performs a unified search across nodes (by title/description) with results ranked by relevance. The generated column approach means no trigger maintenance — the `search_vector` updates automatically on any `title` or `description` change. + +--- + +## Inbox Query + +Issues without a parent (triage inbox): + +```sql +SELECT * FROM nodes +WHERE project_id = :project_id + AND node_type = 'issue' + AND parent_id IS NULL +ORDER BY created_at DESC; +``` + +Note: the project's root node is the only component with `parent_id IS NULL`. Orphaned issues (inbox items) are distinguished by `node_type = 'issue'`. 
+ +--- + +## Unblocked Issues Query + +Issues that are not blocked by any open issue: + +```sql +SELECT n.* FROM nodes n +WHERE n.project_id = :project_id + AND n.node_type = 'issue' + AND n.status IN ('todo', 'in_progress') + AND NOT EXISTS ( + SELECT 1 FROM lateral_links ll + JOIN nodes blocker ON blocker.id = ll.source_id + WHERE ll.target_id = n.id + AND ll.link_type = 'blocks' + AND blocker.status NOT IN ('done', 'cancelled') + ); +``` + +--- + +## Cascade and Deletion Behavior + +| FK Relationship | On Delete | +|-----------------|-----------| +| `nodes.parent_id → nodes.id` | `SET NULL` (orphan to inbox, don't cascade-delete subtrees) | +| `nodes.project_id → projects.id` | `CASCADE` (project deletion removes all nodes) | +| `lateral_links → nodes` | `CASCADE` (removing a node removes its links) | +| `comments → nodes` | `CASCADE` (removing a node removes its comments) | +| `nodes.assignee_id → actors.id` | `SET NULL` (deleting an actor unassigns them) | +| `workspace_members → workspaces/actors` | `CASCADE` | +| `project_members → projects/actors` | `CASCADE` | + +--- + +## Constraints Summary + +| Constraint | Purpose | +|-----------|---------| +| `UNIQUE (project_id, short_id)` | Short IDs unique within project | +| `UNIQUE (workspace_id, slug)` on projects | Project slugs unique per workspace | +| `UNIQUE (source_id, target_id, link_type)` on links | No duplicate links | +| `CHECK (source_id != target_id)` on links | No self-links | +| `CHECK` on nodes | Issues must have status; components must not | +| `UNIQUE (project_id, repo_url)` on repo_connections | No duplicate repo links | + +--- + +## Migration Notes + +- **Alembic** manages all migrations. The initial migration creates extensions, enums, and all tables in dependency order. +- **ltree extension** must be created by a superuser or a user with `CREATE` privilege on the database. The Alembic migration should run `CREATE EXTENSION IF NOT EXISTS ltree` in an `op.execute()` call. 
+- **Generated columns** (search_vector) require Postgres 12+. Target minimum: Postgres 15. +- **UUIDv7** is generated application-side (Python `uuid7` package). Postgres stores it as standard `UUID` type — no special extension needed. + +--- + +## Future Migration Path (Post-Alpha) + +When introducing Neo4j for Layer 3 code connections: + +1. Extract graph topology from `nodes` → Neo4j nodes (id, short_id, title, status, labels, assignee_id, path) +2. Move `lateral_links` → Neo4j relationships +3. Keep `nodes` in Postgres but rename to `node_content` (description, description_html only) +4. Add `artifacts` table and Neo4j `Artifact` label + `HAS_ARTIFACT` edge +5. Add `cycles` table and Neo4j `IN_CYCLE` relationship + +The schema is designed so this split is additive — the UUID primary key is the cross-database join key, and no structural changes are needed to the Postgres tables beyond extracting topology fields. diff --git a/non-linear.html b/non-linear.html new file mode 100644 index 0000000..9bd548e --- /dev/null +++ b/non-linear.html @@ -0,0 +1,1173 @@ + + +
+ +