Update documentation for Non-Linear project: refine infrastructure details, reducing container count from 10 to 8, and introduce a new database schema document outlining design decisions, table structures, and data types for Postgres-only implementation. Add initial HTML layout for the layered view interface.

Maxim Snesarev 2026-05-08 20:00:50 +03:00
parent 79a7277ad9
commit be7f9f6236
3 changed files with 1778 additions and 3 deletions


@ -95,13 +95,11 @@ GET /api/v1/projects/{id}/nodes?status=todo&assignee=agent-1&unblocked=true
- Flat comment stream per node. Markdown body. Author + timestamp.
-### Infrastructure (~10 containers)
+### Infrastructure (~8 containers)
```
api (FastAPI + uvicorn)
frontend (Vue 3 + Vite)
worker (Taskiq — webhook delivery, skeleton inference)
neo4j
postgres
redis
centrifugo

10-ALPHA-DB-SCHEMA.md Normal file

@ -0,0 +1,604 @@
# Non-Linear: Alpha Database Schema (Postgres-Only)
## Design Decisions
### Why Postgres-Only
Alpha ships without Neo4j (~8 containers). The graph in alpha is small (<500 nodes per project) with limited lateral link types (`blocks`, `relates_to`). Postgres handles this comfortably via:
- Adjacency list (`parent_id`) for the decomposition tree
- `ltree` materialized path for fast subtree queries without recursive CTEs
- A `lateral_links` table for typed edges between nodes
The dual-DB architecture (Neo4j for topology, Postgres for content) remains the long-term direction for Layer 3 code connections and cross-project edges. This schema is designed to migrate cleanly: the `nodes` table splits into Neo4j (topology + lightweight props) and Postgres (content) by extracting `description`, `description_html` into a `node_content` table keyed by the same UUID.
### Unified Nodes Table
Components and issues share a single `nodes` table with a `node_type` discriminator. Rationale:
- The decomposition tree mixes both types (a component's child can be an issue, an issue's child can be a sub-issue)
- Tree queries (`parent_id`, `path`) operate uniformly across types
- Issue-specific columns (`status`, `assignee_id`) are nullable; for components, `status` must be NULL (enforced by a CHECK constraint) and `assignee_id` is ignored
- Avoids polymorphic joins for tree traversal
### ltree for Subtree Queries
Each node stores a materialized `path` column of type `ltree`. Example: `root.12.45`. This enables:
- `SELECT * FROM nodes WHERE path <@ 'root.12'`: all descendants of node 12 (the node itself included)
- `SELECT * FROM nodes WHERE path @> 'root.12.45'`: all ancestors of node 45 (the node itself included)
- Index-backed, no recursion needed
The path uses each node's integer `short_id` as its segment for compactness. Because short IDs are only unique within a project, subtree queries must also filter on `project_id`. On reparent, the whole subtree is updated with a single `UPDATE ... SET path = new_prefix || subpath(path, nlevel(old_prefix))`, where `old_prefix` and `new_prefix` are the paths of the old and new parent.
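To make the operator semantics concrete, here is a minimal Python sketch that mimics `<@` and `@>` on dotted path strings. The helper names are hypothetical and exist only for illustration; in the schema itself Postgres evaluates these operators against the GiST index.

```python
def split_labels(path: str) -> list[str]:
    """Split an ltree-style path such as 'root.12.45' into its labels."""
    return path.split(".")


def is_descendant_or_self(path: str, ancestor: str) -> bool:
    """Mimic `path <@ ancestor`: true when `ancestor` is a label-wise prefix of `path`."""
    p, a = split_labels(path), split_labels(ancestor)
    return p[: len(a)] == a


def ancestors_or_self(path: str) -> list[str]:
    """List every path that satisfies `candidate @> path`, from the root down to `path`."""
    parts = split_labels(path)
    return [".".join(parts[: i + 1]) for i in range(len(parts))]
```

The comparison is label-wise rather than a plain string prefix check, so `root.123` is not treated as a descendant of `root.12`.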
### Labels as Array
Labels are stored as `text[]` with a GIN index. For alpha's freeform tags this is simpler than a join table and supports queries like `WHERE labels @> ARRAY['bug', 'p0']`. A normalized `labels` table can be introduced post-alpha if label management (rename, merge, color) becomes necessary.
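As a rough illustration of what `labels @> ARRAY[...]` matches, the containment check can be mirrored in Python as a subset test. The function name is hypothetical; the real filtering happens in Postgres through the GIN index.

```python
def has_all_labels(node_labels: list[str], required: list[str]) -> bool:
    """Mimic `labels @> ARRAY[...]`: every required label must be present on the node."""
    return set(required) <= set(node_labels)
```

Order and duplicates do not matter, and an empty `required` list matches every node, which is also how array containment behaves.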
### UUIDv7
All primary keys use UUIDv7 (time-sortable, generated application-side). Benefits:
- Natural chronological ordering without a separate `created_at` sort
- Safe for distributed ID generation (no coordination needed)
- Same ID used across systems (future Neo4j migration, Centrifugo channels, webhook payloads)
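For readers unfamiliar with the format, the sketch below builds a UUIDv7 with the Python standard library, following the RFC 9562 bit layout (48-bit Unix millisecond timestamp, version, 12 random bits, variant, 62 random bits). It is an illustration only: the function is hypothetical and is not the API of the `uuid7` package the project uses.

```python
import os
import time
import uuid
from typing import Optional


def uuid7(unix_ms: Optional[int] = None) -> uuid.UUID:
    """Build a UUIDv7: timestamp in the top 48 bits, random bits below."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    rand_a = (rand >> 64) & 0xFFF                 # 12 bits after the version
    rand_b = rand & ((1 << 62) - 1)               # 62 bits after the variant

    value = (unix_ms & ((1 << 48) - 1)) << 80     # bits 127..80: timestamp
    value |= 0x7 << 76                            # bits 79..76: version 7
    value |= rand_a << 64                         # bits 75..64
    value |= 0b10 << 62                           # bits 63..62: RFC variant
    value |= rand_b                               # bits 61..0
    return uuid.UUID(int=value)
```

Because the timestamp occupies the most significant bits, IDs created in different milliseconds compare in creation order; IDs created within the same millisecond are ordered by their random bits in this sketch.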
### Short IDs
Human-readable IDs like `NL-42` are generated per-project using an atomic counter (`next_short_id` on the `projects` table). The prefix is configurable per project. Short IDs are unique within a project and immutable once assigned.
---
## Extensions
```sql
CREATE EXTENSION IF NOT EXISTS ltree;
CREATE EXTENSION IF NOT EXISTS pgcrypto; -- only needed before Postgres 13; gen_random_uuid() is built in since 13 and returns UUIDv4, not UUIDv7
```
---
## Enums
```sql
CREATE TYPE node_type AS ENUM ('component', 'issue');
CREATE TYPE node_status AS ENUM (
'backlog',
'todo',
'in_progress',
'in_review',
'done',
'cancelled'
);
CREATE TYPE link_type AS ENUM ('blocks', 'relates_to');
CREATE TYPE actor_type AS ENUM ('user', 'agent');
CREATE TYPE workspace_role AS ENUM ('owner', 'member');
CREATE TYPE project_role AS ENUM ('owner', 'member', 'agent');
CREATE TYPE repo_provider AS ENUM ('github', 'gitlab');
CREATE TYPE audit_action AS ENUM (
'node_created',
'node_updated',
'node_deleted',
'node_reparented',
'status_changed',
'assignee_changed',
'labels_changed',
'link_created',
'link_deleted',
'comment_created',
'comment_updated',
'comment_deleted',
'member_added',
'member_removed',
'member_role_changed',
'repo_connected',
'repo_disconnected',
'webhook_created',
'webhook_deleted'
);
```
---
## Tables
### workspaces
```sql
CREATE TABLE workspaces (
id UUID PRIMARY KEY,
name TEXT NOT NULL,
slug TEXT NOT NULL UNIQUE,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- The UNIQUE constraint on slug already creates a unique index; no separate index is needed.
```
### actors
```sql
CREATE TABLE actors (
id UUID PRIMARY KEY,
actor_type actor_type NOT NULL,
display_name TEXT NOT NULL,
email TEXT,
avatar_url TEXT,
authentik_uid TEXT UNIQUE, -- OIDC subject claim (users)
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX idx_actors_email ON actors (email) WHERE email IS NOT NULL;
-- The UNIQUE constraint on authentik_uid already creates an index; no separate index is needed.
```
### workspace_members
```sql
CREATE TABLE workspace_members (
workspace_id UUID NOT NULL REFERENCES workspaces(id) ON DELETE CASCADE,
actor_id UUID NOT NULL REFERENCES actors(id) ON DELETE CASCADE,
role workspace_role NOT NULL DEFAULT 'member',
joined_at TIMESTAMPTZ NOT NULL DEFAULT now(),
PRIMARY KEY (workspace_id, actor_id)
);
CREATE INDEX idx_wm_actor ON workspace_members (actor_id);
```
### projects
```sql
CREATE TABLE projects (
id UUID PRIMARY KEY,
workspace_id UUID NOT NULL REFERENCES workspaces(id) ON DELETE CASCADE,
name TEXT NOT NULL,
slug TEXT NOT NULL,
short_id_prefix TEXT NOT NULL DEFAULT 'NL', -- e.g. "NL" → NL-1, NL-2
next_short_id INTEGER NOT NULL DEFAULT 1, -- atomically incremented
root_node_id UUID, -- set after root node creation
settings JSONB NOT NULL DEFAULT '{}', -- custom statuses, defaults (post-alpha)
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
UNIQUE (workspace_id, slug)
);
CREATE INDEX idx_projects_workspace ON projects (workspace_id);
```
### project_members
```sql
CREATE TABLE project_members (
project_id UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
actor_id UUID NOT NULL REFERENCES actors(id) ON DELETE CASCADE,
role project_role NOT NULL DEFAULT 'member',
joined_at TIMESTAMPTZ NOT NULL DEFAULT now(),
PRIMARY KEY (project_id, actor_id)
);
CREATE INDEX idx_pm_actor ON project_members (actor_id);
```
### nodes
The core table. Stores both components and issues in a single table with the decomposition tree structure.
```sql
CREATE TABLE nodes (
id UUID PRIMARY KEY,
project_id UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
node_type node_type NOT NULL,
short_id INTEGER NOT NULL, -- numeric part: 42 in "NL-42"
title TEXT NOT NULL,
description TEXT, -- markdown source
description_html TEXT, -- pre-rendered, sanitized HTML
-- Tree structure
parent_id UUID REFERENCES nodes(id) ON DELETE SET NULL,
path ltree NOT NULL, -- materialized path for subtree queries
-- Issue-specific (NULL for components)
status node_status,
assignee_id UUID REFERENCES actors(id) ON DELETE SET NULL,
created_by UUID NOT NULL REFERENCES actors(id),
-- Shared
labels TEXT[] NOT NULL DEFAULT '{}',
-- Timestamps
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
-- Constraints
UNIQUE (project_id, short_id),
CONSTRAINT chk_issue_has_status CHECK (
(node_type = 'issue' AND status IS NOT NULL)
OR node_type = 'component'
),
CONSTRAINT chk_component_no_status CHECK (
(node_type = 'component' AND status IS NULL)
OR node_type = 'issue'
)
);
-- Tree traversal
CREATE INDEX idx_nodes_parent ON nodes (parent_id) WHERE parent_id IS NOT NULL;
CREATE INDEX idx_nodes_path ON nodes USING GIST (path);
-- Filtering
CREATE INDEX idx_nodes_project ON nodes (project_id);
CREATE INDEX idx_nodes_status ON nodes (project_id, status) WHERE status IS NOT NULL;
CREATE INDEX idx_nodes_assignee ON nodes (assignee_id) WHERE assignee_id IS NOT NULL;
CREATE INDEX idx_nodes_labels ON nodes USING GIN (labels);
CREATE INDEX idx_nodes_type ON nodes (project_id, node_type);
-- Full-text search
ALTER TABLE nodes ADD COLUMN search_vector tsvector
GENERATED ALWAYS AS (
setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
setweight(to_tsvector('english', coalesce(description, '')), 'B')
) STORED;
CREATE INDEX idx_nodes_fts ON nodes USING GIN (search_vector);
-- Short ID lookup is served by the index behind UNIQUE (project_id, short_id); no separate index is needed.
```
### lateral_links
Typed edges between nodes (not part of the decomposition tree).
```sql
CREATE TABLE lateral_links (
id UUID PRIMARY KEY,
project_id UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
source_id UUID NOT NULL REFERENCES nodes(id) ON DELETE CASCADE,
target_id UUID NOT NULL REFERENCES nodes(id) ON DELETE CASCADE,
link_type link_type NOT NULL,
created_by UUID NOT NULL REFERENCES actors(id),
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
-- No self-links, no duplicate links in same direction
CONSTRAINT chk_no_self_link CHECK (source_id != target_id),
UNIQUE (source_id, target_id, link_type)
);
CREATE INDEX idx_links_source ON lateral_links (source_id);
CREATE INDEX idx_links_target ON lateral_links (target_id);
CREATE INDEX idx_links_project ON lateral_links (project_id);
```
**Semantics:**
- `blocks`: directed — `source` blocks `target`. Query "what blocks issue X" = `WHERE target_id = X AND link_type = 'blocks'`.
- `relates_to`: undirected — stored once (lower UUID as source by convention). Query both directions.
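The "lower UUID as source" convention is not enforced by the table itself, so the application has to normalize endpoints before inserting. A minimal sketch of that normalization follows; the function name is hypothetical.

```python
import uuid


def canonical_endpoints(
    link_type: str, a: uuid.UUID, b: uuid.UUID
) -> tuple[uuid.UUID, uuid.UUID]:
    """Return (source_id, target_id) in the order the row should be stored."""
    if a == b:
        raise ValueError("self-links are not allowed")
    if link_type == "relates_to":
        # Undirected: store once, lower UUID first, so the UNIQUE
        # constraint also rejects the reversed duplicate.
        return (a, b) if a < b else (b, a)
    # Directed types such as 'blocks' keep the caller's direction.
    return a, b
```

With endpoints normalized this way, `UNIQUE (source_id, target_id, link_type)` is enough to prevent both `A relates_to B` and `B relates_to A` from being stored.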
### comments
Flat comment stream per node.
```sql
CREATE TABLE comments (
id UUID PRIMARY KEY,
node_id UUID NOT NULL REFERENCES nodes(id) ON DELETE CASCADE,
author_id UUID NOT NULL REFERENCES actors(id),
body TEXT NOT NULL, -- markdown source
body_html TEXT NOT NULL, -- pre-rendered, sanitized HTML
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX idx_comments_node ON comments (node_id, created_at);
CREATE INDEX idx_comments_author ON comments (author_id);
-- Full-text search on comments
ALTER TABLE comments ADD COLUMN search_vector tsvector
GENERATED ALWAYS AS (
to_tsvector('english', coalesce(body, ''))
) STORED;
CREATE INDEX idx_comments_fts ON comments USING GIN (search_vector);
```
### audit_events
Append-only change history. Every mutation is recorded.
```sql
CREATE TABLE audit_events (
id UUID PRIMARY KEY,
project_id UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
actor_id UUID NOT NULL REFERENCES actors(id),
action audit_action NOT NULL,
node_id UUID, -- NULL for non-node events (member changes, etc.)
before_data JSONB, -- snapshot of changed fields before mutation
after_data JSONB, -- snapshot of changed fields after mutation
metadata JSONB, -- additional context (e.g. commit SHA for linked changes)
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX idx_audit_project_time ON audit_events (project_id, created_at DESC);
CREATE INDEX idx_audit_node ON audit_events (node_id, created_at DESC) WHERE node_id IS NOT NULL;
CREATE INDEX idx_audit_actor ON audit_events (actor_id, created_at DESC);
```
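One way to produce `before_data` and `after_data` is to diff the old and new field values and keep only the keys that changed. The helper below is a hedged sketch of that idea; its name and the dict-based row representation are assumptions, not part of the schema.

```python
from typing import Any


def changed_fields(
    old: dict[str, Any], new: dict[str, Any]
) -> tuple[dict[str, Any], dict[str, Any]]:
    """Return (before, after) snapshots containing only the fields that differ."""
    keys = sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))
    before = {k: old.get(k) for k in keys}
    after = {k: new.get(k) for k in keys}
    return before, after
```

For example, changing only the status of an issue yields `{"status": "todo"}` and `{"status": "done"}`, which keeps audit rows small compared with full-row snapshots.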
### repo_connections
Repositories linked to a project. Components reference these through `node_repo_links.repo_connection_id`.
```sql
CREATE TABLE repo_connections (
id UUID PRIMARY KEY,
project_id UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
provider repo_provider NOT NULL,
repo_url TEXT NOT NULL,
default_branch TEXT NOT NULL DEFAULT 'main',
access_token_enc TEXT, -- encrypted OAuth token
webhook_secret_hash TEXT, -- hashed webhook secret for incoming pushes
connected_by UUID NOT NULL REFERENCES actors(id),
connected_at TIMESTAMPTZ NOT NULL DEFAULT now(),
UNIQUE (project_id, repo_url)
);
CREATE INDEX idx_repos_project ON repo_connections (project_id);
```
### node_repo_links
Maps components to specific paths within connected repositories.
```sql
CREATE TABLE node_repo_links (
node_id UUID PRIMARY KEY REFERENCES nodes(id) ON DELETE CASCADE,
repo_connection_id UUID NOT NULL REFERENCES repo_connections(id) ON DELETE CASCADE,
path TEXT, -- subdirectory within repo (NULL = repo root)
branch TEXT -- branch override (NULL = repo default)
);
CREATE INDEX idx_nrl_repo ON node_repo_links (repo_connection_id);
```
### api_tokens
Bearer tokens for agent access.
```sql
CREATE TABLE api_tokens (
id UUID PRIMARY KEY,
actor_id UUID NOT NULL REFERENCES actors(id) ON DELETE CASCADE,
token_hash TEXT NOT NULL UNIQUE, -- SHA-256 hash of the token (never store plaintext)
name TEXT NOT NULL, -- human-readable label ("triage-agent-prod")
last_used_at TIMESTAMPTZ,
expires_at TIMESTAMPTZ,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
CONSTRAINT chk_expiry CHECK (expires_at IS NULL OR expires_at > created_at)
);
CREATE INDEX idx_tokens_actor ON api_tokens (actor_id);
-- The UNIQUE constraint on token_hash already creates an index; no separate index is needed.
```
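A minimal sketch of how the application layer might issue and verify tokens against `token_hash`, using only the Python standard library. The function names and the 32-byte token length are assumptions for illustration.

```python
import hashlib
import hmac
import secrets


def hash_token(token: str) -> str:
    """Return the hex SHA-256 digest that would be stored in api_tokens.token_hash."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def issue_token() -> tuple[str, str]:
    """Create a new token; show the plaintext once and persist only the hash."""
    token = secrets.token_urlsafe(32)
    return token, hash_token(token)


def token_matches(presented: str, stored_hash: str) -> bool:
    """Compare in constant time to avoid leaking digest prefixes via timing."""
    return hmac.compare_digest(hash_token(presented), stored_hash)
```

In practice the lookup would be `WHERE token_hash = :hash` using the digest of the presented token, so the plaintext never reaches the database.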
### webhook_configs
Minimal webhook registration per project.
```sql
CREATE TABLE webhook_configs (
id UUID PRIMARY KEY,
project_id UUID NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
url TEXT NOT NULL,
events TEXT[] NOT NULL DEFAULT '{}', -- e.g. {'node.status_changed', 'comment.added'}
active BOOLEAN NOT NULL DEFAULT true,
consecutive_failures INTEGER NOT NULL DEFAULT 0,
last_delivery_at TIMESTAMPTZ,
created_by UUID NOT NULL REFERENCES actors(id),
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX idx_webhooks_project ON webhook_configs (project_id);
```
---
## ltree Path Maintenance
### Path Format
Each node's `path` is composed of segments representing the chain of `short_id` values from root to node, prefixed with the project's root identifier:
```
root.12.45.78
│    │  │  └── this node (short_id=78)
│    │  └── parent (short_id=45)
│    └── grandparent (short_id=12)
└── project root sentinel
```
Using integer short IDs as segments keeps paths compact and unique within a project.
### On Node Creation
```sql
-- Application-layer flow; run both steps in one transaction.
-- 1. Atomically claim a short_id (the row lock serializes concurrent claims):
UPDATE projects SET next_short_id = next_short_id + 1
WHERE id = :project_id
RETURNING next_short_id - 1 AS new_short_id;
-- 2. Compute the path from the parent:
-- If parent_id IS NULL (root node): path = 'root'
-- Otherwise: path = parent.path || new_short_id::text
-- (ltree || text appends the label and inserts the '.' separator)
INSERT INTO nodes (id, project_id, node_type, short_id, title, parent_id, path, ...)
VALUES (:id, :project_id, :type, :new_short_id, :title, :parent_id, :computed_path, ...);
```
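If the path is computed in the application rather than in SQL, step 2 reduces to a small string helper. The sketch below assumes the `'root'` sentinel described above; the function and constant names are hypothetical.

```python
from typing import Optional

ROOT_LABEL = "root"  # project root sentinel used as the first path segment


def child_path(parent_path: Optional[str], short_id: int) -> str:
    """Compute a new node's path from its parent's path and its claimed short_id."""
    if parent_path is None:
        # No parent: this is the project's root node.
        return ROOT_LABEL
    if short_id < 1:
        raise ValueError("short_id must be a positive integer")
    return f"{parent_path}.{short_id}"
```

The resulting string is passed as `:computed_path` and cast to `ltree` by Postgres on insert.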
### On Reparent
When a node moves to a new parent, update the entire subtree's paths in one statement:
```sql
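-- :project_id = project that owns the subtree (paths are only unique per project)
-- :old_path = current node's path (e.g. 'root.12.45')
-- :new_parent_path = new parent's path (e.g. 'root.99')
-- subpath(path, nlevel(:old_path) - 1) keeps the moved node's own label and
-- everything below it, so one expression covers the node and its descendants.
UPDATE nodes
SET
path = :new_parent_path::ltree || subpath(path, nlevel(:old_path::ltree) - 1),
parent_id = CASE WHEN id = :node_id THEN :new_parent_id ELSE parent_id END,
updated_at = now()
WHERE project_id = :project_id
AND path <@ :old_path::ltree;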
```
This updates the moved node and all its descendants in a single indexed operation.
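The intended effect of the subtree update can be checked against a small in-memory model. The sketch below is a hypothetical test oracle, not the production code path: it rewrites dotted path strings the way the reparent is meant to, and also rejects moving a node under itself or one of its own descendants, which would create a cycle.

```python
def reparent_paths(paths: list[str], old_path: str, new_parent_path: str) -> list[str]:
    """Rewrite every path in the moved subtree; leave all other paths unchanged."""
    old = old_path.split(".")
    new_parent = new_parent_path.split(".")
    if new_parent[: len(old)] == old:
        raise ValueError("cannot move a node under itself or its own descendant")

    keep_from = len(old) - 1  # index of the moved node's own label
    result = []
    for path in paths:
        parts = path.split(".")
        if parts[: len(old)] == old:  # equivalent to: path <@ old_path
            result.append(".".join(new_parent + parts[keep_from:]))
        else:
            result.append(path)
    return result
```

Moving `root.12.45` under `root.99` turns `root.12.45.78` into `root.99.45.78`, while siblings such as `root.12` are untouched.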
---
## Short ID Generation
Short IDs are assigned atomically using `UPDATE ... RETURNING`:
```sql
-- Claim next short_id for a project (called from application layer)
UPDATE projects
SET next_short_id = next_short_id + 1
WHERE id = :project_id
RETURNING next_short_id - 1 AS short_id;
```
The full human-readable ID is `{project.short_id_prefix}-{short_id}`, e.g. `NL-42`. This is computed at read time, not stored as a string — only the integer is persisted on the node.
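Formatting and parsing the display form can live in two small helpers. This is a sketch under assumptions: the helper names are hypothetical, and the accepted prefix alphabet (a letter followed by letters or digits) is a guess, since the document only says the prefix is configurable.

```python
import re

# Assumed prefix shape: starts with a letter, then letters or digits.
_SHORT_ID = re.compile(r"^([A-Za-z][A-Za-z0-9]*)-([1-9][0-9]*)$")


def format_short_id(prefix: str, short_id: int) -> str:
    """Build the display form, e.g. ('NL', 42) -> 'NL-42'."""
    return f"{prefix}-{short_id}"


def parse_short_id(text: str) -> tuple[str, int]:
    """Split a display ID into (prefix, integer short_id) for lookup."""
    match = _SHORT_ID.match(text.strip())
    if match is None:
        raise ValueError(f"not a short ID: {text!r}")
    return match.group(1), int(match.group(2))
```

The parsed integer is what gets matched against `nodes.short_id` within the project.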
---
## Full-Text Search
Search is powered by generated `tsvector` columns with GIN indexes on both `nodes` and `comments`.
### Query Pattern
```sql
-- Search nodes in a project.
-- websearch_to_tsquery (Postgres 11+) accepts raw user input;
-- to_tsquery raises a syntax error on unquoted multi-word terms.
SELECT n.id, n.short_id, n.title, n.node_type,
ts_rank(n.search_vector, query) AS rank
FROM nodes n
CROSS JOIN websearch_to_tsquery('english', :search_term) query
WHERE n.project_id = :project_id
AND n.search_vector @@ query
ORDER BY rank DESC
LIMIT 20;
-- Search comments in a project (via join)
SELECT c.id, c.node_id, c.body, c.author_id,
ts_rank(c.search_vector, query) AS rank
FROM comments c
JOIN nodes n ON n.id = c.node_id
CROSS JOIN websearch_to_tsquery('english', :search_term) query
WHERE n.project_id = :project_id
AND c.search_vector @@ query
ORDER BY rank DESC
LIMIT 20;
```
### Command Palette Search
The command palette performs a unified search across nodes (by title/description) with results ranked by relevance. The generated column approach means no trigger maintenance — the `search_vector` updates automatically on any `title` or `description` change.
---
## Inbox Query
Issues without a parent (triage inbox):
```sql
SELECT * FROM nodes
WHERE project_id = :project_id
AND node_type = 'issue'
AND parent_id IS NULL
ORDER BY created_at DESC;
```
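Note: the project's root node is intended to be the only component with `parent_id IS NULL`, so inbox items are distinguished by `node_type = 'issue'`. Because `ON DELETE SET NULL` clears `parent_id` but leaves `path` untouched, deleting a parent leaves its children with stale paths and can also orphan child components; the application needs to rewrite those paths as part of the delete.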
---
## Unblocked Issues Query
Issues that are not blocked by any open issue:
```sql
SELECT n.* FROM nodes n
WHERE n.project_id = :project_id
AND n.node_type = 'issue'
AND n.status IN ('todo', 'in_progress')
AND NOT EXISTS (
SELECT 1 FROM lateral_links ll
JOIN nodes blocker ON blocker.id = ll.source_id
WHERE ll.target_id = n.id
AND ll.link_type = 'blocks'
AND blocker.status NOT IN ('done', 'cancelled')
);
```
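The same rule can be expressed over in-memory data, which is handy as a test oracle for the SQL. This is a sketch with hypothetical names (`Node`, `unblocked_issues`); `blocks` holds `(source_id, target_id)` pairs for `blocks` links only.

```python
from dataclasses import dataclass
from typing import Optional

CLOSED = {"done", "cancelled"}
ACTIONABLE = {"todo", "in_progress"}


@dataclass(frozen=True)
class Node:
    id: str
    node_type: str
    status: Optional[str]  # None for components


def unblocked_issues(nodes: list[Node], blocks: list[tuple[str, str]]) -> list[str]:
    """Return IDs of actionable issues that have no open blocker."""
    by_id = {n.id: n for n in nodes}
    result = []
    for n in nodes:
        if n.node_type != "issue" or n.status not in ACTIONABLE:
            continue
        has_open_blocker = any(
            target == n.id
            and source in by_id
            # A NULL status never satisfies NOT IN in SQL, so blockers
            # without a status (components) do not count as open.
            and by_id[source].status is not None
            and by_id[source].status not in CLOSED
            for source, target in blocks
        )
        if not has_open_blocker:
            result.append(n.id)
    return result
```

An issue blocked only by `done` or `cancelled` issues is returned, matching the `NOT EXISTS` subquery.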
---
## Cascade and Deletion Behavior
| FK Relationship | On Delete |
|-----------------|-----------|
| `nodes.parent_id → nodes.id` | `SET NULL` (orphan to inbox, don't cascade-delete subtrees) |
| `nodes.project_id → projects.id` | `CASCADE` (project deletion removes all nodes) |
| `lateral_links → nodes` | `CASCADE` (removing a node removes its links) |
| `comments → nodes` | `CASCADE` (removing a node removes its comments) |
| `nodes.assignee_id → actors.id` | `SET NULL` (deleting an actor unassigns them) |
| `workspace_members → workspaces/actors` | `CASCADE` |
| `project_members → projects/actors` | `CASCADE` |
---
## Constraints Summary
| Constraint | Purpose |
|-----------|---------|
| `UNIQUE (project_id, short_id)` | Short IDs unique within project |
| `UNIQUE (workspace_id, slug)` on projects | Project slugs unique per workspace |
| `UNIQUE (source_id, target_id, link_type)` on links | No duplicate links |
| `CHECK (source_id != target_id)` on links | No self-links |
| `CHECK` on nodes | Issues must have status; components must not |
| `UNIQUE (project_id, repo_url)` on repo_connections | No duplicate repo links |
---
## Migration Notes
- **Alembic** manages all migrations. The initial migration creates extensions, enums, and all tables in dependency order.
- **ltree extension** must be created by a superuser or a user with `CREATE` privilege on the database. The Alembic migration should run `CREATE EXTENSION IF NOT EXISTS ltree` in an `op.execute()` call.
- **Generated columns** (search_vector) require Postgres 12+. Target minimum: Postgres 15.
- **UUIDv7** is generated application-side (Python `uuid7` package). Postgres stores it as standard `UUID` type — no special extension needed.
---
## Future Migration Path (Post-Alpha)
When introducing Neo4j for Layer 3 code connections:
1. Extract graph topology from `nodes` → Neo4j nodes (id, short_id, title, status, labels, assignee_id, path)
2. Move `lateral_links` → Neo4j relationships
3. Keep `nodes` in Postgres but rename to `node_content` (description, description_html only)
4. Add `artifacts` table and Neo4j `Artifact` label + `HAS_ARTIFACT` edge
5. Add `cycles` table and Neo4j `IN_CYCLE` relationship
The schema is designed so this split is additive — the UUID primary key is the cross-database join key, and no structural changes are needed to the Postgres tables beyond extracting topology fields.

non-linear.html Normal file

File diff suppressed because it is too large