# Non-Linear: Alpha Database Schema (Postgres-Only) ## Design Decisions ### Why Postgres-Only Alpha ships without Neo4j (~8 containers). The graph in alpha is small (<500 nodes per project) with limited lateral link types (`blocks`, `relates_to`). Postgres handles this comfortably via: - Adjacency list (`parent_id`) for the decomposition tree - `ltree` materialized path for fast subtree queries without recursive CTEs - A `lateral_links` table for typed edges between nodes The dual-DB architecture (Neo4j for topology, Postgres for content) remains the long-term direction for Layer 3 code connections and cross-project edges. This schema is designed to migrate cleanly: the `nodes` table splits into Neo4j (topology + lightweight props) and Postgres (content) by extracting `description`, `description_html` into a `node_content` table keyed by the same UUID. ### Unified Nodes Table Components and issues share a single `nodes` table with a `node_type` discriminator. Rationale: - The decomposition tree mixes both types (a component's child can be an issue, an issue's child can be a sub-issue) - Tree queries (`parent_id`, `path`) operate uniformly across types - Issue-specific columns (`status`, `assignee_id`) are nullable and ignored for components - Avoids polymorphic joins for tree traversal ### ltree for Subtree Queries Each node stores a materialized `path` column of type `ltree`. Example: `root.comp_abc.issue_def`. This enables: - `SELECT * FROM nodes WHERE path <@ 'root.comp_abc'` — all descendants of a component - `SELECT * FROM nodes WHERE path @> 'root.comp_abc.issue_def'` — all ancestors - Index-backed, no recursion needed The path uses node short-integer IDs (the `seq` value) as segments for compactness. Updated on reparent via a single `UPDATE ... SET path = new_prefix || subpath(path, nlevel(old_prefix))` for the subtree. ### Labels as Array Labels are stored as `text[]` with a GIN index. For alpha's freeform tags this is simpler than a join table and supports queries like `WHERE labels @> ARRAY['bug', 'p0']`. A normalized `labels` table can be introduced post-alpha if label management (rename, merge, color) becomes necessary. ### UUIDv7 All primary keys use UUIDv7 (time-sortable, generated application-side). Benefits: - Natural chronological ordering without a separate `created_at` sort - Safe for distributed ID generation (no coordination needed) - Same ID used across systems (future Neo4j migration, Centrifugo channels, webhook payloads) ### Short IDs Human-readable IDs like `NL-42` are generated per-project using an atomic counter (`next_short_id` on the `projects` table). The prefix is configurable per project. Short IDs are unique within a project and immutable once assigned. --- ## Extensions ```sql CREATE EXTENSION IF NOT EXISTS ltree; CREATE EXTENSION IF NOT EXISTS pgcrypto; -- gen_random_uuid() fallback if app doesn't supply UUIDv7 ``` --- ## Enums ```sql CREATE TYPE node_type AS ENUM ('component', 'issue'); CREATE TYPE node_status AS ENUM ( 'backlog', 'todo', 'in_progress', 'in_review', 'done', 'cancelled' ); CREATE TYPE link_type AS ENUM ('blocks', 'relates_to'); CREATE TYPE actor_type AS ENUM ('user', 'agent'); CREATE TYPE workspace_role AS ENUM ('owner', 'member'); CREATE TYPE project_role AS ENUM ('owner', 'member', 'agent'); CREATE TYPE repo_provider AS ENUM ('github', 'gitlab'); CREATE TYPE audit_action AS ENUM ( 'node_created', 'node_updated', 'node_deleted', 'node_reparented', 'status_changed', 'assignee_changed', 'labels_changed', 'link_created', 'link_deleted', 'comment_created', 'comment_updated', 'comment_deleted', 'member_added', 'member_removed', 'member_role_changed', 'repo_connected', 'repo_disconnected', 'webhook_created', 'webhook_deleted' ); ``` --- ## Tables ### workspaces ```sql CREATE TABLE workspaces ( id UUID PRIMARY KEY, name TEXT NOT NULL, slug TEXT NOT NULL UNIQUE, created_at TIMESTAMPTZ NOT NULL DEFAULT now(), updated_at TIMESTAMPTZ NOT NULL DEFAULT now() ); CREATE UNIQUE INDEX idx_workspaces_slug ON workspaces (slug); ``` ### actors ```sql CREATE TABLE actors ( id UUID PRIMARY KEY, actor_type actor_type NOT NULL, display_name TEXT NOT NULL, email TEXT, avatar_url TEXT, authentik_uid TEXT UNIQUE, -- OIDC subject claim (users) created_at TIMESTAMPTZ NOT NULL DEFAULT now(), updated_at TIMESTAMPTZ NOT NULL DEFAULT now() ); CREATE INDEX idx_actors_email ON actors (email) WHERE email IS NOT NULL; CREATE INDEX idx_actors_authentik ON actors (authentik_uid) WHERE authentik_uid IS NOT NULL; ``` ### workspace_members ```sql CREATE TABLE workspace_members ( workspace_id UUID NOT NULL REFERENCES workspaces(id) ON DELETE CASCADE, actor_id UUID NOT NULL REFERENCES actors(id) ON DELETE CASCADE, role workspace_role NOT NULL DEFAULT 'member', joined_at TIMESTAMPTZ NOT NULL DEFAULT now(), PRIMARY KEY (workspace_id, actor_id) ); CREATE INDEX idx_wm_actor ON workspace_members (actor_id); ``` ### projects ```sql CREATE TABLE projects ( id UUID PRIMARY KEY, workspace_id UUID NOT NULL REFERENCES workspaces(id) ON DELETE CASCADE, name TEXT NOT NULL, slug TEXT NOT NULL, short_id_prefix TEXT NOT NULL DEFAULT 'NL', -- e.g. "NL" → NL-1, NL-2 next_short_id INTEGER NOT NULL DEFAULT 1, -- atomically incremented root_node_id UUID, -- set after root node creation settings JSONB NOT NULL DEFAULT '{}', -- custom statuses, defaults (post-alpha) created_at TIMESTAMPTZ NOT NULL DEFAULT now(), updated_at TIMESTAMPTZ NOT NULL DEFAULT now(), UNIQUE (workspace_id, slug) ); CREATE INDEX idx_projects_workspace ON projects (workspace_id); ``` ### project_members ```sql CREATE TABLE project_members ( project_id UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE, actor_id UUID NOT NULL REFERENCES actors(id) ON DELETE CASCADE, role project_role NOT NULL DEFAULT 'member', joined_at TIMESTAMPTZ NOT NULL DEFAULT now(), PRIMARY KEY (project_id, actor_id) ); CREATE INDEX idx_pm_actor ON project_members (actor_id); ``` ### nodes The core table. Stores both components and issues in a single table with the decomposition tree structure. ```sql CREATE TABLE nodes ( id UUID PRIMARY KEY, project_id UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE, node_type node_type NOT NULL, short_id INTEGER NOT NULL, -- numeric part: 42 in "NL-42" title TEXT NOT NULL, description TEXT, -- markdown source description_html TEXT, -- pre-rendered, sanitized HTML -- Tree structure parent_id UUID REFERENCES nodes(id) ON DELETE SET NULL, path ltree NOT NULL, -- materialized path for subtree queries -- Issue-specific (NULL for components) status node_status, assignee_id UUID REFERENCES actors(id) ON DELETE SET NULL, created_by UUID NOT NULL REFERENCES actors(id), -- Shared labels TEXT[] NOT NULL DEFAULT '{}', -- Timestamps created_at TIMESTAMPTZ NOT NULL DEFAULT now(), updated_at TIMESTAMPTZ NOT NULL DEFAULT now(), -- Constraints UNIQUE (project_id, short_id), CONSTRAINT chk_issue_has_status CHECK ( (node_type = 'issue' AND status IS NOT NULL) OR node_type = 'component' ), CONSTRAINT chk_component_no_status CHECK ( (node_type = 'component' AND status IS NULL) OR node_type = 'issue' ) ); -- Tree traversal CREATE INDEX idx_nodes_parent ON nodes (parent_id) WHERE parent_id IS NOT NULL; CREATE INDEX idx_nodes_path ON nodes USING GIST (path); -- Filtering CREATE INDEX idx_nodes_project ON nodes (project_id); CREATE INDEX idx_nodes_status ON nodes (project_id, status) WHERE status IS NOT NULL; CREATE INDEX idx_nodes_assignee ON nodes (assignee_id) WHERE assignee_id IS NOT NULL; CREATE INDEX idx_nodes_labels ON nodes USING GIN (labels); CREATE INDEX idx_nodes_type ON nodes (project_id, node_type); -- Full-text search ALTER TABLE nodes ADD COLUMN search_vector tsvector GENERATED ALWAYS AS ( setweight(to_tsvector('english', coalesce(title, '')), 'A') || setweight(to_tsvector('english', coalesce(description, '')), 'B') ) STORED; CREATE INDEX idx_nodes_fts ON nodes USING GIN (search_vector); -- Short ID lookup CREATE INDEX idx_nodes_short_id ON nodes (project_id, short_id); ``` ### lateral_links Typed edges between nodes (not part of the decomposition tree). ```sql CREATE TABLE lateral_links ( id UUID PRIMARY KEY, project_id UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE, source_id UUID NOT NULL REFERENCES nodes(id) ON DELETE CASCADE, target_id UUID NOT NULL REFERENCES nodes(id) ON DELETE CASCADE, link_type link_type NOT NULL, created_by UUID NOT NULL REFERENCES actors(id), created_at TIMESTAMPTZ NOT NULL DEFAULT now(), -- No self-links, no duplicate links in same direction CONSTRAINT chk_no_self_link CHECK (source_id != target_id), UNIQUE (source_id, target_id, link_type) ); CREATE INDEX idx_links_source ON lateral_links (source_id); CREATE INDEX idx_links_target ON lateral_links (target_id); CREATE INDEX idx_links_project ON lateral_links (project_id); ``` **Semantics:** - `blocks`: directed — `source` blocks `target`. Query "what blocks issue X" = `WHERE target_id = X AND link_type = 'blocks'`. - `relates_to`: undirected — stored once (lower UUID as source by convention). Query both directions. ### comments Flat comment stream per node. ```sql CREATE TABLE comments ( id UUID PRIMARY KEY, node_id UUID NOT NULL REFERENCES nodes(id) ON DELETE CASCADE, author_id UUID NOT NULL REFERENCES actors(id), body TEXT NOT NULL, -- markdown source body_html TEXT NOT NULL, -- pre-rendered, sanitized HTML created_at TIMESTAMPTZ NOT NULL DEFAULT now(), updated_at TIMESTAMPTZ NOT NULL DEFAULT now() ); CREATE INDEX idx_comments_node ON comments (node_id, created_at); CREATE INDEX idx_comments_author ON comments (author_id); -- Full-text search on comments ALTER TABLE comments ADD COLUMN search_vector tsvector GENERATED ALWAYS AS ( to_tsvector('english', coalesce(body, '')) ) STORED; CREATE INDEX idx_comments_fts ON comments USING GIN (search_vector); ``` ### audit_events Append-only change history. Every mutation is recorded. ```sql CREATE TABLE audit_events ( id UUID PRIMARY KEY, project_id UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE, actor_id UUID NOT NULL REFERENCES actors(id), action audit_action NOT NULL, node_id UUID, -- NULL for non-node events (member changes, etc.) before_data JSONB, -- snapshot of changed fields before mutation after_data JSONB, -- snapshot of changed fields after mutation metadata JSONB, -- additional context (e.g. commit SHA for linked changes) created_at TIMESTAMPTZ NOT NULL DEFAULT now() ); CREATE INDEX idx_audit_project_time ON audit_events (project_id, created_at DESC); CREATE INDEX idx_audit_node ON audit_events (node_id, created_at DESC) WHERE node_id IS NOT NULL; CREATE INDEX idx_audit_actor ON audit_events (actor_id, created_at DESC); ``` ### repo_connections Repositories linked to a project. Components reference these via `repo_connection_id`. ```sql CREATE TABLE repo_connections ( id UUID PRIMARY KEY, project_id UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE, provider repo_provider NOT NULL, repo_url TEXT NOT NULL, default_branch TEXT NOT NULL DEFAULT 'main', access_token_enc TEXT, -- encrypted OAuth token webhook_secret_hash TEXT, -- hashed webhook secret for incoming pushes connected_by UUID NOT NULL REFERENCES actors(id), connected_at TIMESTAMPTZ NOT NULL DEFAULT now(), UNIQUE (project_id, repo_url) ); CREATE INDEX idx_repos_project ON repo_connections (project_id); ``` ### node_repo_links Maps components to specific paths within connected repositories. ```sql CREATE TABLE node_repo_links ( node_id UUID PRIMARY KEY REFERENCES nodes(id) ON DELETE CASCADE, repo_connection_id UUID NOT NULL REFERENCES repo_connections(id) ON DELETE CASCADE, path TEXT, -- subdirectory within repo (NULL = repo root) branch TEXT -- branch override (NULL = repo default) ); CREATE INDEX idx_nrl_repo ON node_repo_links (repo_connection_id); ``` ### api_tokens Bearer tokens for agent access. ```sql CREATE TABLE api_tokens ( id UUID PRIMARY KEY, actor_id UUID NOT NULL REFERENCES actors(id) ON DELETE CASCADE, token_hash TEXT NOT NULL UNIQUE, -- SHA-256 hash of the token (never store plaintext) name TEXT NOT NULL, -- human-readable label ("triage-agent-prod") last_used_at TIMESTAMPTZ, expires_at TIMESTAMPTZ, created_at TIMESTAMPTZ NOT NULL DEFAULT now(), CONSTRAINT chk_expiry CHECK (expires_at IS NULL OR expires_at > created_at) ); CREATE INDEX idx_tokens_actor ON api_tokens (actor_id); CREATE INDEX idx_tokens_hash ON api_tokens (token_hash); ``` ### webhook_configs Minimal webhook registration per project. ```sql CREATE TABLE webhook_configs ( id UUID PRIMARY KEY, project_id UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE, url TEXT NOT NULL, events TEXT[] NOT NULL DEFAULT '{}', -- e.g. {'node.status_changed', 'comment.added'} active BOOLEAN NOT NULL DEFAULT true, consecutive_failures INTEGER NOT NULL DEFAULT 0, last_delivery_at TIMESTAMPTZ, created_by UUID NOT NULL REFERENCES actors(id), created_at TIMESTAMPTZ NOT NULL DEFAULT now() ); CREATE INDEX idx_webhooks_project ON webhook_configs (project_id); ``` --- ## ltree Path Maintenance ### Path Format Each node's `path` is composed of segments representing the chain of `short_id` values from root to node, prefixed with the project's root identifier: ``` root.12.45.78 │ │ │ └── this node (short_id=78) │ │ └── parent (short_id=45) │ └── grandparent (short_id=12) └── project root sentinel ``` Using integer short IDs as segments keeps paths compact and unique within a project. ### On Node Creation ```sql -- Pseudocode (application layer): -- 1. Atomically claim a short_id: UPDATE projects SET next_short_id = next_short_id + 1 WHERE id = :project_id RETURNING next_short_id - 1 AS new_short_id; -- 2. Compute path from parent: -- If parent_id IS NULL (root node): path = 'root' -- Otherwise: path = parent.path || '.' || new_short_id::text INSERT INTO nodes (id, project_id, node_type, short_id, title, parent_id, path, ...) VALUES (:id, :project_id, :type, :new_short_id, :title, :parent_id, :computed_path, ...); ``` ### On Reparent When a node moves to a new parent, update the entire subtree's paths in one statement: ```sql -- :old_path = current node's path (e.g. 'root.12.45') -- :new_parent_path = new parent's path (e.g. 'root.99') -- :node_short_id = the moved node's short_id segment UPDATE nodes SET path = :new_parent_path || '.' || :node_short_id::text || subpath(path, nlevel(:old_path)), parent_id = CASE WHEN id = :node_id THEN :new_parent_id ELSE parent_id END, updated_at = now() WHERE path <@ :old_path; ``` This updates the moved node and all its descendants in a single indexed operation. --- ## Short ID Generation Short IDs are assigned atomically using `UPDATE ... RETURNING`: ```sql -- Claim next short_id for a project (called from application layer) UPDATE projects SET next_short_id = next_short_id + 1 WHERE id = :project_id RETURNING next_short_id - 1 AS short_id; ``` The full human-readable ID is `{project.short_id_prefix}-{short_id}`, e.g. `NL-42`. This is computed at read time, not stored as a string — only the integer is persisted on the node. --- ## Full-Text Search Search is powered by generated `tsvector` columns with GIN indexes on both `nodes` and `comments`. ### Query Pattern ```sql -- Search nodes in a project SELECT id, short_id, title, node_type, ts_rank(search_vector, query) AS rank FROM nodes, to_tsquery('english', :search_term) query WHERE project_id = :project_id AND search_vector @@ query ORDER BY rank DESC LIMIT 20; -- Search comments in a project (via join) SELECT c.id, c.node_id, c.body, c.author_id, ts_rank(c.search_vector, query) AS rank FROM comments c JOIN nodes n ON n.id = c.node_id, to_tsquery('english', :search_term) query WHERE n.project_id = :project_id AND c.search_vector @@ query ORDER BY rank DESC LIMIT 20; ``` ### Command Palette Search The command palette performs a unified search across nodes (by title/description) with results ranked by relevance. The generated column approach means no trigger maintenance — the `search_vector` updates automatically on any `title` or `description` change. --- ## Inbox Query Issues without a parent (triage inbox): ```sql SELECT * FROM nodes WHERE project_id = :project_id AND node_type = 'issue' AND parent_id IS NULL ORDER BY created_at DESC; ``` Note: the project's root node is the only component with `parent_id IS NULL`. Orphaned issues (inbox items) are distinguished by `node_type = 'issue'`. --- ## Unblocked Issues Query Issues that are not blocked by any open issue: ```sql SELECT n.* FROM nodes n WHERE n.project_id = :project_id AND n.node_type = 'issue' AND n.status IN ('todo', 'in_progress') AND NOT EXISTS ( SELECT 1 FROM lateral_links ll JOIN nodes blocker ON blocker.id = ll.source_id WHERE ll.target_id = n.id AND ll.link_type = 'blocks' AND blocker.status NOT IN ('done', 'cancelled') ); ``` --- ## Cascade and Deletion Behavior | FK Relationship | On Delete | |-----------------|-----------| | `nodes.parent_id → nodes.id` | `SET NULL` (orphan to inbox, don't cascade-delete subtrees) | | `nodes.project_id → projects.id` | `CASCADE` (project deletion removes all nodes) | | `lateral_links → nodes` | `CASCADE` (removing a node removes its links) | | `comments → nodes` | `CASCADE` (removing a node removes its comments) | | `nodes.assignee_id → actors.id` | `SET NULL` (deleting an actor unassigns them) | | `workspace_members → workspaces/actors` | `CASCADE` | | `project_members → projects/actors` | `CASCADE` | --- ## Constraints Summary | Constraint | Purpose | |-----------|---------| | `UNIQUE (project_id, short_id)` | Short IDs unique within project | | `UNIQUE (workspace_id, slug)` on projects | Project slugs unique per workspace | | `UNIQUE (source_id, target_id, link_type)` on links | No duplicate links | | `CHECK (source_id != target_id)` on links | No self-links | | `CHECK` on nodes | Issues must have status; components must not | | `UNIQUE (project_id, repo_url)` on repo_connections | No duplicate repo links | --- ## Migration Notes - **Alembic** manages all migrations. The initial migration creates extensions, enums, and all tables in dependency order. - **ltree extension** must be created by a superuser or a user with `CREATE` privilege on the database. The Alembic migration should run `CREATE EXTENSION IF NOT EXISTS ltree` in an `op.execute()` call. - **Generated columns** (search_vector) require Postgres 12+. Target minimum: Postgres 15. - **UUIDv7** is generated application-side (Python `uuid7` package). Postgres stores it as standard `UUID` type — no special extension needed. --- ## Future Migration Path (Post-Alpha) When introducing Neo4j for Layer 3 code connections: 1. Extract graph topology from `nodes` → Neo4j nodes (id, short_id, title, status, labels, assignee_id, path) 2. Move `lateral_links` → Neo4j relationships 3. Keep `nodes` in Postgres but rename to `node_content` (description, description_html only) 4. Add `artifacts` table and Neo4j `Artifact` label + `HAS_ARTIFACT` edge 5. Add `cycles` table and Neo4j `IN_CYCLE` relationship The schema is designed so this split is additive — the UUID primary key is the cross-database join key, and no structural changes are needed to the Postgres tables beyond extracting topology fields.