# Non-Linear: Alpha Database Schema (Postgres-Only)

## Design Decisions

### Why Postgres-Only

Alpha ships without Neo4j (~8 containers). The graph in alpha is small (<500 nodes per project) with limited lateral link types (`blocks`, `relates_to`). Postgres handles this comfortably via:

- Adjacency list (`parent_id`) for the decomposition tree
- `ltree` materialized path for fast subtree queries without recursive CTEs
- A `lateral_links` table for typed edges between nodes

The dual-DB architecture (Neo4j for topology, Postgres for content) remains the long-term direction for Layer 3 code connections and cross-project edges. This schema is designed to migrate cleanly: the `nodes` table splits into Neo4j (topology + lightweight props) and Postgres (content) by extracting `description` and `description_html` into a `node_content` table keyed by the same UUID.

### Unified Nodes Table

Components and issues share a single `nodes` table with a `node_type` discriminator. Rationale:

- The decomposition tree mixes both types (a component's child can be an issue, an issue's child can be a sub-issue)
- Tree queries (`parent_id`, `path`) operate uniformly across types
- Issue-specific columns (`status`, `assignee_id`) are nullable and ignored for components
- Avoids polymorphic joins for tree traversal
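
The cost of the shared table is that the type/status rule has to be enforced explicitly. In the schema this is done with `CHECK` constraints on `nodes`; the same rule can also be checked in the application before the insert. A minimal Python sketch, assuming a hypothetical `NodeDraft` shape and `validate_node` helper (neither is part of the schema):

```python
from dataclasses import dataclass
from typing import Optional

# Values copied from the node_type and node_status enums.
NODE_TYPES = {"component", "issue"}
NODE_STATUSES = {"backlog", "todo", "in_progress", "in_review", "done", "cancelled"}


@dataclass
class NodeDraft:
    """Hypothetical in-memory shape of a row before it is inserted into `nodes`."""
    node_type: str
    title: str
    status: Optional[str] = None
    assignee_id: Optional[str] = None


def validate_node(node: NodeDraft) -> None:
    """Raise ValueError if the draft would violate the type/status rule."""
    if node.node_type not in NODE_TYPES:
        raise ValueError(f"unknown node_type: {node.node_type!r}")
    if node.node_type == "issue":
        if node.status not in NODE_STATUSES:
            raise ValueError("an issue needs one of the node_status values")
    elif node.status is not None:
        raise ValueError("a component must not carry a status")
```

Failing early in the application gives a clearer error than a constraint violation, but the database constraints remain the source of truth.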

### ltree for Subtree Queries

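Each node stores a materialized `path` column of type `ltree`. Example: `root.12.45`, where each segment after `root` is a `short_id` (see "Path Format" below). Paths are unique only within a project, because every project starts at `root`, so tree queries also filter on `project_id`. This enables:

- `SELECT * FROM nodes WHERE project_id = :project_id AND path <@ 'root.12'`: the node at `root.12` and all of its descendants
- `SELECT * FROM nodes WHERE project_id = :project_id AND path @> 'root.12.45'`: the node at `root.12.45` and all of its ancestors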
- Index-backed, no recursion needed

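The path uses each node's integer `short_id` as its segment for compactness. On reparent, the whole subtree is rewritten with a single `UPDATE ... SET path = new_prefix || subpath(path, nlevel(old_prefix))`, where `old_prefix` and `new_prefix` are the paths of the old and new parent, so the moved node's own label is preserved.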
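
`ltree` compares whole labels, not characters, which matters once short IDs grow past one digit. The sketch below models the `<@` and `@>` operators in plain Python to show the difference from a string-prefix match. The function names are illustrative only and not part of any library:

```python
def is_descendant_or_self(path: str, ancestor: str) -> bool:
    """Model of `path <@ ancestor`: ancestor's labels are a prefix of path's labels."""
    path_labels = path.split(".")
    ancestor_labels = ancestor.split(".")
    return path_labels[: len(ancestor_labels)] == ancestor_labels


def is_ancestor_or_self(path: str, descendant: str) -> bool:
    """Model of `path @> descendant`: the same test with the arguments swapped."""
    return is_descendant_or_self(descendant, path)
```

For example, `root.120` is not under `root.12` even though the string `"root.120"` starts with `"root.12"`, which is why a `LIKE 'root.12%'` filter would not be a safe substitute for the `ltree` operators.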

### Labels as Array

Labels are stored as `text[]` with a GIN index. For alpha's freeform tags this is simpler than a join table and supports queries like `WHERE labels @> ARRAY['bug', 'p0']`. A normalized `labels` table can be introduced post-alpha if label management (rename, merge, color) becomes necessary.

### UUIDv7

All `id` primary keys use UUIDv7 (time-sortable, generated application-side); the membership tables use composite keys instead. Benefits:

- Natural chronological ordering without a separate `created_at` sort
- Safe for distributed ID generation (no coordination needed)
- Same ID used across systems (future Neo4j migration, Centrifugo channels, webhook payloads)
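
To make the "time-sortable" property concrete, here is a standard-library sketch of the UUIDv7 bit layout (48-bit millisecond timestamp, version, 12 random bits, variant, 62 random bits). The name `uuid7_sketch` is hypothetical and the function exists only for illustration; it is not the generator the application uses.

```python
import secrets
import time
import uuid
from typing import Optional


def uuid7_sketch(unix_ms: Optional[int] = None) -> uuid.UUID:
    """Build a UUIDv7-shaped value from a millisecond timestamp plus random bits."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    value = (unix_ms & 0xFFFFFFFFFFFF) << 80  # bits 127..80: unix_ts_ms
    value |= 0x7 << 76                         # bits 79..76: version 7
    value |= secrets.randbits(12) << 64        # bits 75..64: rand_a
    value |= 0b10 << 62                        # bits 63..62: RFC variant
    value |= secrets.randbits(62)              # bits 61..0:  rand_b
    return uuid.UUID(int=value)
```

Because the timestamp occupies the most significant bits, IDs from different milliseconds sort chronologically both as integers and as strings. Two IDs generated within the same millisecond sort in random order under this sketch, so ordering is chronological only to millisecond granularity.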

### Short IDs

Human-readable IDs like `NL-42` are generated per-project using an atomic counter (`next_short_id` on the `projects` table). The prefix is configurable per project. Short IDs are unique within a project and immutable once assigned.

---

## Extensions

```sql
CREATE EXTENSION IF NOT EXISTS ltree;
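-- gen_random_uuid() is built into core since Postgres 13, so pgcrypto is optional
-- at the Postgres 15 target. The fallback yields UUIDv4, which is not time-sortable.
CREATE EXTENSION IF NOT EXISTS pgcrypto;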
```

---

## Enums

```sql
CREATE TYPE node_type AS ENUM ('component', 'issue');

CREATE TYPE node_status AS ENUM (
    'backlog',
    'todo',
    'in_progress',
    'in_review',
    'done',
    'cancelled'
);

CREATE TYPE link_type AS ENUM ('blocks', 'relates_to');

CREATE TYPE actor_type AS ENUM ('user', 'agent');

CREATE TYPE workspace_role AS ENUM ('owner', 'member');

CREATE TYPE project_role AS ENUM ('owner', 'member', 'agent');

CREATE TYPE repo_provider AS ENUM ('github', 'gitlab');

CREATE TYPE audit_action AS ENUM (
    'node_created',
    'node_updated',
    'node_deleted',
    'node_reparented',
    'status_changed',
    'assignee_changed',
    'labels_changed',
    'link_created',
    'link_deleted',
    'comment_created',
    'comment_updated',
    'comment_deleted',
    'member_added',
    'member_removed',
    'member_role_changed',
    'repo_connected',
    'repo_disconnected',
    'webhook_created',
    'webhook_deleted'
);
```

---

## Tables

### workspaces

```sql
CREATE TABLE workspaces (
    id          UUID PRIMARY KEY,
    name        TEXT NOT NULL,
    slug        TEXT NOT NULL UNIQUE,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- slug is already indexed by its UNIQUE constraint; a second unique index would be redundant.
```

### actors

```sql
CREATE TABLE actors (
    id              UUID PRIMARY KEY,
    actor_type      actor_type NOT NULL,
    display_name    TEXT NOT NULL,
    email           TEXT,
    avatar_url      TEXT,
    authentik_uid   TEXT UNIQUE,       -- OIDC subject claim (users)
    created_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at      TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE INDEX idx_actors_email ON actors (email) WHERE email IS NOT NULL;
-- authentik_uid is already indexed by its UNIQUE constraint; no separate index is needed.
```

### workspace_members

```sql
CREATE TABLE workspace_members (
    workspace_id    UUID NOT NULL REFERENCES workspaces(id) ON DELETE CASCADE,
    actor_id        UUID NOT NULL REFERENCES actors(id) ON DELETE CASCADE,
    role            workspace_role NOT NULL DEFAULT 'member',
    joined_at       TIMESTAMPTZ NOT NULL DEFAULT now(),

    PRIMARY KEY (workspace_id, actor_id)
);

CREATE INDEX idx_wm_actor ON workspace_members (actor_id);
```

### projects

```sql
CREATE TABLE projects (
    id              UUID PRIMARY KEY,
    workspace_id    UUID NOT NULL REFERENCES workspaces(id) ON DELETE CASCADE,
    name            TEXT NOT NULL,
    slug            TEXT NOT NULL,
    short_id_prefix TEXT NOT NULL DEFAULT 'NL',  -- e.g. "NL" → NL-1, NL-2
    next_short_id   INTEGER NOT NULL DEFAULT 1,  -- atomically incremented
    root_node_id    UUID,                        -- set after root node creation
    settings        JSONB NOT NULL DEFAULT '{}', -- custom statuses, defaults (post-alpha)
    created_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at      TIMESTAMPTZ NOT NULL DEFAULT now(),

    UNIQUE (workspace_id, slug)
);

CREATE INDEX idx_projects_workspace ON projects (workspace_id);
```

### project_members

```sql
CREATE TABLE project_members (
    project_id  UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
    actor_id    UUID NOT NULL REFERENCES actors(id) ON DELETE CASCADE,
    role        project_role NOT NULL DEFAULT 'member',
    joined_at   TIMESTAMPTZ NOT NULL DEFAULT now(),

    PRIMARY KEY (project_id, actor_id)
);

CREATE INDEX idx_pm_actor ON project_members (actor_id);
```

### nodes

The core table. Stores both components and issues in a single table with the decomposition tree structure.

```sql
CREATE TABLE nodes (
    id              UUID PRIMARY KEY,
    project_id      UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
    node_type       node_type NOT NULL,
    short_id        INTEGER NOT NULL,            -- numeric part: 42 in "NL-42"
    title           TEXT NOT NULL,
    description     TEXT,                        -- markdown source
    description_html TEXT,                       -- pre-rendered, sanitized HTML

    -- Tree structure
    parent_id       UUID REFERENCES nodes(id) ON DELETE SET NULL,
    path            ltree NOT NULL,              -- materialized path for subtree queries

    -- Issue-specific (NULL for components)
    status          node_status,
    assignee_id     UUID REFERENCES actors(id) ON DELETE SET NULL,

    -- Shared
    created_by      UUID NOT NULL REFERENCES actors(id),
    labels          TEXT[] NOT NULL DEFAULT '{}',

    -- Timestamps
    created_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at      TIMESTAMPTZ NOT NULL DEFAULT now(),

    -- Constraints
    UNIQUE (project_id, short_id),
    CONSTRAINT chk_issue_has_status CHECK (
        (node_type = 'issue' AND status IS NOT NULL)
        OR node_type = 'component'
    ),
    CONSTRAINT chk_component_no_status CHECK (
        (node_type = 'component' AND status IS NULL)
        OR node_type = 'issue'
    )
);

-- Tree traversal
CREATE INDEX idx_nodes_parent ON nodes (parent_id) WHERE parent_id IS NOT NULL;
CREATE INDEX idx_nodes_path ON nodes USING GIST (path);

-- Filtering
CREATE INDEX idx_nodes_project ON nodes (project_id);
CREATE INDEX idx_nodes_status ON nodes (project_id, status) WHERE status IS NOT NULL;
CREATE INDEX idx_nodes_assignee ON nodes (assignee_id) WHERE assignee_id IS NOT NULL;
CREATE INDEX idx_nodes_labels ON nodes USING GIN (labels);
CREATE INDEX idx_nodes_type ON nodes (project_id, node_type);

-- Full-text search
ALTER TABLE nodes ADD COLUMN search_vector tsvector
    GENERATED ALWAYS AS (
        setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
        setweight(to_tsvector('english', coalesce(description, '')), 'B')
    ) STORED;

CREATE INDEX idx_nodes_fts ON nodes USING GIN (search_vector);

-- Short ID lookup: served by the index behind UNIQUE (project_id, short_id);
-- a separate index on the same columns would be redundant.
```

### lateral_links

Typed edges between nodes (not part of the decomposition tree).

```sql
CREATE TABLE lateral_links (
    id          UUID PRIMARY KEY,
    project_id  UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
    source_id   UUID NOT NULL REFERENCES nodes(id) ON DELETE CASCADE,
    target_id   UUID NOT NULL REFERENCES nodes(id) ON DELETE CASCADE,
    link_type   link_type NOT NULL,
    created_by  UUID NOT NULL REFERENCES actors(id),
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now(),

    -- No self-links, no duplicate links in same direction
    CONSTRAINT chk_no_self_link CHECK (source_id != target_id),
    UNIQUE (source_id, target_id, link_type)
);

CREATE INDEX idx_links_source ON lateral_links (source_id);
CREATE INDEX idx_links_target ON lateral_links (target_id);
CREATE INDEX idx_links_project ON lateral_links (project_id);
```

**Semantics:**

- `blocks`: directed — `source` blocks `target`. Query "what blocks issue X" = `WHERE target_id = X AND link_type = 'blocks'`.
- `relates_to`: undirected — stored once (lower UUID as source by convention). Query both directions.
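
The "lower UUID as source" convention is not enforced by a constraint, so the application has to normalize the pair before inserting. A small sketch, assuming a hypothetical helper named `canonical_relates_to`:

```python
import uuid
from typing import Tuple


def canonical_relates_to(a: uuid.UUID, b: uuid.UUID) -> Tuple[uuid.UUID, uuid.UUID]:
    """Return (source_id, target_id) with the lower UUID first."""
    if a == b:
        raise ValueError("self-links are rejected by chk_no_self_link")
    # Python orders UUIDs by their 128-bit integer value, which agrees with
    # Postgres's bytewise uuid comparison.
    return (a, b) if a < b else (b, a)
```

With the pair normalized, `UNIQUE (source_id, target_id, link_type)` also prevents the same undirected link from being stored twice in opposite directions. Reads still need both sides, e.g. `WHERE link_type = 'relates_to' AND (source_id = :x OR target_id = :x)`.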

### comments

Flat comment stream per node.

```sql
CREATE TABLE comments (
    id          UUID PRIMARY KEY,
    node_id     UUID NOT NULL REFERENCES nodes(id) ON DELETE CASCADE,
    author_id   UUID NOT NULL REFERENCES actors(id),
    body        TEXT NOT NULL,                -- markdown source
    body_html   TEXT NOT NULL,               -- pre-rendered, sanitized HTML
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE INDEX idx_comments_node ON comments (node_id, created_at);
CREATE INDEX idx_comments_author ON comments (author_id);

-- Full-text search on comments
ALTER TABLE comments ADD COLUMN search_vector tsvector
    GENERATED ALWAYS AS (
        to_tsvector('english', coalesce(body, ''))
    ) STORED;

CREATE INDEX idx_comments_fts ON comments USING GIN (search_vector);
```

### audit_events

Append-only change history. Every mutation is recorded.

```sql
CREATE TABLE audit_events (
    id          UUID PRIMARY KEY,
    project_id  UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
    actor_id    UUID NOT NULL REFERENCES actors(id),
    action      audit_action NOT NULL,
    node_id     UUID,                        -- NULL for non-node events (member changes, etc.)
    before_data JSONB,                       -- snapshot of changed fields before mutation
    after_data  JSONB,                       -- snapshot of changed fields after mutation
    metadata    JSONB,                       -- additional context (e.g. commit SHA for linked changes)
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE INDEX idx_audit_project_time ON audit_events (project_id, created_at DESC);
CREATE INDEX idx_audit_node ON audit_events (node_id, created_at DESC) WHERE node_id IS NOT NULL;
CREATE INDEX idx_audit_actor ON audit_events (actor_id, created_at DESC);
```
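
Since `before_data` and `after_data` hold only the changed fields, the application needs a small diff step before writing an event. A minimal sketch, assuming rows are available as plain dicts; the helper name `changed_fields` and the choice to treat a missing key as `None` are assumptions:

```python
from typing import Any, Dict, Tuple


def changed_fields(
    before: Dict[str, Any], after: Dict[str, Any]
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (before_data, after_data) restricted to the fields whose values differ."""
    keys = [
        key
        for key in sorted(set(before) | set(after))
        if before.get(key) != after.get(key)
    ]
    before_data = {key: before.get(key) for key in keys}
    after_data = {key: after.get(key) for key in keys}
    return before_data, after_data
```

For a status change from `todo` to `done` this yields `{"status": "todo"}` and `{"status": "done"}`, keeping unchanged fields such as `title` out of the event.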

### repo_connections

Repositories linked to a project. Components reference these through `node_repo_links.repo_connection_id`.

```sql
CREATE TABLE repo_connections (
    id                  UUID PRIMARY KEY,
    project_id          UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
    provider            repo_provider NOT NULL,
    repo_url            TEXT NOT NULL,
    default_branch      TEXT NOT NULL DEFAULT 'main',
    access_token_enc    TEXT,                -- encrypted OAuth token
    webhook_secret_hash TEXT,                -- hashed webhook secret for incoming pushes
    connected_by        UUID NOT NULL REFERENCES actors(id),
    connected_at        TIMESTAMPTZ NOT NULL DEFAULT now(),

    UNIQUE (project_id, repo_url)
);

CREATE INDEX idx_repos_project ON repo_connections (project_id);
```

### node_repo_links

Maps components to specific paths within connected repositories.

```sql
CREATE TABLE node_repo_links (
    node_id             UUID PRIMARY KEY REFERENCES nodes(id) ON DELETE CASCADE,
    repo_connection_id  UUID NOT NULL REFERENCES repo_connections(id) ON DELETE CASCADE,
    path                TEXT,                -- subdirectory within repo (NULL = repo root)
    branch              TEXT                 -- branch override (NULL = repo default)
);

CREATE INDEX idx_nrl_repo ON node_repo_links (repo_connection_id);
```

### api_tokens

Bearer tokens for agent access.

```sql
CREATE TABLE api_tokens (
    id              UUID PRIMARY KEY,
    actor_id        UUID NOT NULL REFERENCES actors(id) ON DELETE CASCADE,
    token_hash      TEXT NOT NULL UNIQUE,    -- SHA-256 hash of the token (never store plaintext)
    name            TEXT NOT NULL,           -- human-readable label ("triage-agent-prod")
    last_used_at    TIMESTAMPTZ,
    expires_at      TIMESTAMPTZ,
    created_at      TIMESTAMPTZ NOT NULL DEFAULT now(),

    CONSTRAINT chk_expiry CHECK (expires_at IS NULL OR expires_at > created_at)
);

CREATE INDEX idx_tokens_actor ON api_tokens (actor_id);
-- token_hash is already indexed by its UNIQUE constraint; no separate index is needed.
```
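
A sketch of how a token could be issued and checked against `token_hash`, using only the Python standard library. The function names, the 32-byte token length, and the hex encoding of the digest are assumptions; the schema only specifies a SHA-256 hash stored as `TEXT`.

```python
import hashlib
import secrets
from datetime import datetime
from typing import Optional, Tuple


def hash_token(token: str) -> str:
    """Hex-encoded SHA-256 digest, the value stored in api_tokens.token_hash."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def issue_token() -> Tuple[str, str]:
    """Return (plaintext, token_hash). The plaintext is shown once and never stored."""
    token = secrets.token_urlsafe(32)
    return token, hash_token(token)


def is_expired(expires_at: Optional[datetime], now: datetime) -> bool:
    """A NULL expires_at means the token does not expire."""
    return expires_at is not None and expires_at <= now
```

On each request the server hashes the presented bearer token and looks it up by `token_hash`, so the plaintext never needs to be stored or compared directly.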

### webhook_configs

Minimal webhook registration per project.

```sql
CREATE TABLE webhook_configs (
    id                      UUID PRIMARY KEY,
    project_id              UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
    url                     TEXT NOT NULL,
    events                  TEXT[] NOT NULL DEFAULT '{}',  -- e.g. {'node.status_changed', 'comment.added'}
    active                  BOOLEAN NOT NULL DEFAULT true,
    consecutive_failures    INTEGER NOT NULL DEFAULT 0,
    last_delivery_at        TIMESTAMPTZ,
    created_by              UUID NOT NULL REFERENCES actors(id),
    created_at              TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE INDEX idx_webhooks_project ON webhook_configs (project_id);
```

---

## ltree Path Maintenance

### Path Format

Each node's `path` is composed of segments representing the chain of `short_id` values from root to node, prefixed with the project's root identifier:

```
root.12.45.78
 │    │   │  └── this node (short_id=78)
 │    │   └── parent (short_id=45)
 │    └── grandparent (short_id=12)
 └── project root sentinel
```

Using integer short IDs as segments keeps paths compact and unique within a project.

### On Node Creation

```sql
-- Application layer; run both statements in one transaction.
-- 1. Atomically claim a short_id (the row lock serializes concurrent claims):
UPDATE projects SET next_short_id = next_short_id + 1
WHERE id = :project_id
RETURNING next_short_id - 1 AS new_short_id;

-- 2. Compute the path from the parent, then insert:
--    root node (parent_id IS NULL): path = 'root'
--    otherwise:                     path = parent.path || new_short_id::text
--    (ltree || text appends a label, so no '.' separator is concatenated)
INSERT INTO nodes (id, project_id, node_type, short_id, title, parent_id, path, ...)
VALUES (:id, :project_id, :type, :new_short_id, :title, :parent_id, :computed_path, ...);
```
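
The two creation steps can be modeled in memory to check the arithmetic. This is only a sketch: `Project`, `claim_short_id`, and `child_path` are hypothetical names, and the in-memory counter has none of the locking that the `UPDATE ... RETURNING` statement provides.

```python
from dataclasses import dataclass


@dataclass
class Project:
    """In-memory stand-in for the relevant columns of a `projects` row."""
    short_id_prefix: str = "NL"
    next_short_id: int = 1


def claim_short_id(project: Project) -> int:
    """Mirror of `SET next_short_id = next_short_id + 1 RETURNING next_short_id - 1`."""
    project.next_short_id += 1
    return project.next_short_id - 1


def child_path(parent_path: str, short_id: int) -> str:
    """Append the new node's short_id as one more label below its parent's path."""
    return f"{parent_path}.{short_id}"
```

The first claim on a fresh project returns 1 and leaves the counter at 2, so a first child of the root gets the path `root.1`.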

### On Reparent

When a node moves to a new parent, update the entire subtree's paths in one statement:

```sql
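-- :node_id         = the node being moved
-- :old_path        = its current path (e.g. 'root.12.45')
-- :new_parent_id   = the new parent
-- :new_parent_path = the new parent's path (e.g. 'root.99')
-- Assumption: the application rejects moves where :new_parent_path <@ :old_path
-- (a node cannot move under itself or its own descendant).

UPDATE nodes
SET
    -- nlevel(old) - 1 keeps the moved node's own label and everything below it:
    -- 'root.12.45.78' becomes 'root.99' || '45.78'. Using nlevel(old) without
    -- the - 1 raises "invalid positions" for the moved node itself.
    path = CAST(:new_parent_path AS ltree)
               || subpath(path, nlevel(CAST(:old_path AS ltree)) - 1),
    parent_id = CASE WHEN id = :node_id THEN :new_parent_id ELSE parent_id END,
    updated_at = now()
WHERE project_id = :project_id   -- paths are unique only within a project
  AND path <@ CAST(:old_path AS ltree);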
```

This updates the moved node and all its descendants in a single indexed operation.
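
The path rewrite can be checked against a small in-memory model before trusting it on real data. The sketch below applies the same rule (new parent path, then the moved node's own label and everything below it) to a dict of node ID to path. The function name and the cycle guard are assumptions that the section above does not spell out.

```python
from typing import Dict


def reparent_paths(paths: Dict[str, str], node_id: str, new_parent_id: str) -> Dict[str, str]:
    """Return a copy of `paths` (node id -> dotted path) with node_id's subtree moved."""
    old = paths[node_id].split(".")
    new_parent = paths[new_parent_id].split(".")
    if new_parent[: len(old)] == old:
        raise ValueError("cannot move a node under itself or its own descendant")
    keep_from = len(old) - 1  # index of the moved node's own label
    result = {}
    for nid, path in paths.items():
        labels = path.split(".")
        if labels[: len(old)] == old:  # the moved node or one of its descendants
            result[nid] = ".".join(new_parent + labels[keep_from:])
        else:
            result[nid] = path
    return result
```

Moving the node at `root.12.45` under `root.99` turns `root.12.45.78` into `root.99.45.78` and leaves siblings such as `root.120` untouched.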

---

## Short ID Generation

Short IDs are assigned atomically using `UPDATE ... RETURNING`:

```sql
-- Claim next short_id for a project (called from application layer)
UPDATE projects
SET next_short_id = next_short_id + 1
WHERE id = :project_id
RETURNING next_short_id - 1 AS short_id;
```

The full human-readable ID is `{project.short_id_prefix}-{short_id}`, e.g. `NL-42`. This is computed at read time, not stored as a string — only the integer is persisted on the node.
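
A sketch of the read-time formatting and the reverse lookup direction (parsing `NL-42` from user input before querying by `(project_id, short_id)`). The helper names and the assumed prefix grammar (a letter followed by letters or digits) are illustrative; the schema itself places no restriction on `short_id_prefix`.

```python
import re
from typing import Tuple

# Assumed grammar: alphanumeric prefix starting with a letter, then a positive integer.
_SHORT_ID_RE = re.compile(r"^([A-Za-z][A-Za-z0-9]*)-([1-9][0-9]*)$")


def format_short_id(prefix: str, short_id: int) -> str:
    """Combine the project's prefix and the node's integer into the display form."""
    return f"{prefix}-{short_id}"


def parse_short_id(text: str) -> Tuple[str, int]:
    """Split a display ID such as 'NL-42' into (prefix, integer)."""
    match = _SHORT_ID_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a short ID: {text!r}")
    return match.group(1), int(match.group(2))
```

Because only the integer is stored, changing a project's prefix changes how every existing ID is displayed without touching the `nodes` table.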

---

## Full-Text Search

Search is powered by generated `tsvector` columns with GIN indexes on both `nodes` and `comments`.

### Query Pattern

```sql
-- Search nodes in a project.
-- websearch_to_tsquery (Postgres 11+) accepts raw user input; to_tsquery would
-- raise a syntax error on unquoted multi-word input such as "login bug".
SELECT n.id, n.short_id, n.title, n.node_type,
       ts_rank(n.search_vector, query) AS rank
FROM nodes n
CROSS JOIN websearch_to_tsquery('english', :search_term) AS query
WHERE n.project_id = :project_id
  AND n.search_vector @@ query
ORDER BY rank DESC
LIMIT 20;

-- Search comments in a project (via join to nodes for the project filter)
SELECT c.id, c.node_id, c.body, c.author_id,
       ts_rank(c.search_vector, query) AS rank
FROM comments c
JOIN nodes n ON n.id = c.node_id
CROSS JOIN websearch_to_tsquery('english', :search_term) AS query
WHERE n.project_id = :project_id
  AND c.search_vector @@ query
ORDER BY rank DESC
LIMIT 20;
```

### Command Palette Search

The command palette performs a unified search across nodes (by title/description) with results ranked by relevance. The generated column approach means no trigger maintenance — the `search_vector` updates automatically on any `title` or `description` change.

---

## Inbox Query

Issues without a parent (triage inbox):

```sql
SELECT * FROM nodes
WHERE project_id = :project_id
  AND node_type = 'issue'
  AND parent_id IS NULL
ORDER BY created_at DESC;
```

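Note: by design, the project's root node is the only component created with `parent_id IS NULL`, and orphaned issues (inbox items) are distinguished by `node_type = 'issue'`. Because `nodes.parent_id` is `ON DELETE SET NULL`, deleting a parent also leaves its child components with `parent_id IS NULL` and a stale `path`. The schema does not repair this automatically, so the application needs to re-home those nodes and rewrite their paths.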

---

## Unblocked Issues Query

Issues that are not blocked by any open issue:

```sql
SELECT n.* FROM nodes n
WHERE n.project_id = :project_id
  AND n.node_type = 'issue'
  AND n.status IN ('todo', 'in_progress')
  AND NOT EXISTS (
      SELECT 1 FROM lateral_links ll
      JOIN nodes blocker ON blocker.id = ll.source_id
      WHERE ll.target_id = n.id
        AND ll.link_type = 'blocks'
        AND blocker.status NOT IN ('done', 'cancelled')
  );
```
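
The `NOT EXISTS` logic is easy to misread, so here is the same rule over in-memory data. This is a sketch for checking expectations, not a replacement for the query, and the function name is hypothetical. It reproduces one subtle property of the SQL: a blocker with a `NULL` status (a component) never counts as an open blocker, because `NULL NOT IN (...)` is not true.

```python
from typing import Dict, List, Optional, Tuple

WORKABLE = {"todo", "in_progress"}
CLOSED = {"done", "cancelled"}


def unblocked_issues(
    statuses: Dict[str, Optional[str]], blocks: List[Tuple[str, str]]
) -> List[str]:
    """statuses: node id -> status (None for components); blocks: (source, target) pairs."""
    result = []
    for node_id, status in statuses.items():
        if status not in WORKABLE:
            continue
        has_open_blocker = any(
            target == node_id
            and statuses.get(source) is not None
            and statuses[source] not in CLOSED
            for source, target in blocks
        )
        if not has_open_blocker:
            result.append(node_id)
    return result
```

With `A` (in progress) blocking `B`, and `C` blocked only by a finished issue and a component, the function returns `A` and `C` but not `B`.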

---

## Cascade and Deletion Behavior

| FK Relationship | On Delete |
|-----------------|-----------|
| `nodes.parent_id → nodes.id` | `SET NULL` (orphan to inbox, don't cascade-delete subtrees) |
| `nodes.project_id → projects.id` | `CASCADE` (project deletion removes all nodes) |
| `lateral_links → nodes` | `CASCADE` (removing a node removes its links) |
| `comments → nodes` | `CASCADE` (removing a node removes its comments) |
| `nodes.assignee_id → actors.id` | `SET NULL` (deleting an actor unassigns them) |
| `workspace_members → workspaces/actors` | `CASCADE` |
| `project_members → projects/actors` | `CASCADE` |

---

## Constraints Summary

| Constraint | Purpose |
|-----------|---------|
| `UNIQUE (project_id, short_id)` | Short IDs unique within project |
| `UNIQUE (workspace_id, slug)` on projects | Project slugs unique per workspace |
| `UNIQUE (source_id, target_id, link_type)` on links | No duplicate links |
| `CHECK (source_id != target_id)` on links | No self-links |
| `CHECK` on nodes | Issues must have status; components must not |
| `UNIQUE (project_id, repo_url)` on repo_connections | No duplicate repo links |

---

## Migration Notes

- **Alembic** manages all migrations. The initial migration creates extensions, enums, and all tables in dependency order.
- **ltree extension** must be created by a superuser or a user with `CREATE` privilege on the database. The Alembic migration should run `CREATE EXTENSION IF NOT EXISTS ltree` in an `op.execute()` call.
- **Generated columns** (search_vector) require Postgres 12+. Target minimum: Postgres 15.
- **UUIDv7** is generated application-side (Python `uuid7` package). Postgres stores it as standard `UUID` type — no special extension needed.

---

## Future Migration Path (Post-Alpha)

When introducing Neo4j for Layer 3 code connections:

1. Extract graph topology from `nodes` → Neo4j nodes (id, short_id, title, status, labels, assignee_id, path)
2. Move `lateral_links` → Neo4j relationships
3. Keep the content in Postgres: reduce `nodes` to a `node_content` table (`id`, `description`, `description_html`) keyed by the same UUID
4. Add `artifacts` table and Neo4j `Artifact` label + `HAS_ARTIFACT` edge
5. Add `cycles` table and Neo4j `IN_CYCLE` relationship

The schema is designed so this split is additive — the UUID primary key is the cross-database join key, and no structural changes are needed to the Postgres tables beyond extracting topology fields.
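
As a rough illustration of steps 1 and 3, the split of a single `nodes` row could look like the sketch below. The field lists are taken from the steps above; the function name and the dict-based row shape are assumptions, and no Neo4j or Postgres driver is involved.

```python
from typing import Any, Dict, Tuple

# Step 1: properties that move to the Neo4j node.
TOPOLOGY_FIELDS = ("id", "short_id", "title", "status", "labels", "assignee_id", "path")
# Step 3: content that stays in Postgres, keyed by the same UUID.
CONTENT_FIELDS = ("description", "description_html")


def split_node(row: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split one `nodes` row into (neo4j_props, node_content_row)."""
    topology = {field: row.get(field) for field in TOPOLOGY_FIELDS}
    content = {"id": row.get("id")}
    content.update({field: row.get(field) for field in CONTENT_FIELDS})
    return topology, content
```

Both halves carry the same `id`, which is what lets the two stores be joined after the migration.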