# Non-Linear: Tech Stack & Architecture
|
|
|
|
## Stack Overview
|
|
|
|
```
┌─────────────────────────────────────────────────────────┐
│                        FRONTEND                         │
│  Vue 3 + Tailwind + Headless UI + ECharts               │
│  Graph Viz: TBD (D3 vs Cytoscape — eval pending)        │
│  Command Palette: vue-command-palette / custom          │
│  Keybindings: VueUse useMagicKeys                       │
│  Icons: Lucide │ Font: Inter │ Motion: @vueuse/motion   │
│  State: Pinia │ HTTP: ofetch │ WS: centrifuge-js        │
├─────────────────────────────────────────────────────────┤
│                     CROSS-PLATFORM                      │
│  Desktop: Tauri (thin wrapper, no offline) — post-alpha │
│  Mobile: Capacitor (responsive web first) — v0.2+       │
├─────────────────────────────────────────────────────────┤
│                         BACKEND                         │
│  FastAPI (Python)                                       │
│  Taskiq (async task queue — webhooks, imports, agents)  │
├─────────────────────────────────────────────────────────┤
│                       DATA LAYER                        │
│  Neo4j — graph topology (nodes, edges, status, labels)  │
│  Postgres — content & metadata (rich text, comments,    │
│             attachments meta, audit logs, project cfg)  │
│  Redis — caching, rate limiting                         │
│  Meilisearch — full-text search (post-alpha)            │
│  MinIO — S3-compatible file storage (post-alpha)        │
├─────────────────────────────────────────────────────────┤
│                        REAL-TIME                        │
│  Centrifugo — WebSocket server, live updates, push      │
├─────────────────────────────────────────────────────────┤
│                          AUTH                           │
│  Authentik — OIDC, API tokens, role mgmt, SSO-ready     │
├─────────────────────────────────────────────────────────┤
│                        INFRA/OPS                        │
│  Caddy (reverse proxy + TLS)                            │
│  Vault (secrets) — post-alpha                           │
│  Prometheus + Grafana — post-alpha                      │
│  Loki + Tempo + OpenTelemetry — post-alpha              │
├─────────────────────────────────────────────────────────┤
│                       DEPLOYMENT                        │
│  Docker Compose (dev + single-node production)          │
└─────────────────────────────────────────────────────────┘
```
|
|
|
|
> **Alpha infrastructure:** Alpha ships with Neo4j, Postgres, Redis, Centrifugo, Caddy, Authentik, FastAPI, Taskiq worker, and Vue frontend (~10 containers). Meilisearch is replaced by Postgres `tsvector`. MinIO, Vault, and the observability stack (Prometheus/Grafana/Loki/Tempo) are introduced post-alpha. See [06a-ALPHA-SCOPE.md](06a-ALPHA-SCOPE.md) for the full alpha boundary.
|
|
|
|
## Data Boundary
|
|
|
|
### Neo4j — Graph Topology
|
|
|
|
Owns the decomposition tree and all overlay edges across the four data layers:

- **Node labels:** `Component` (Layer 1), `Issue` (Layer 2), `Artifact` (Layer 4), `Cycle`, `Project`
- Node identity (UUID), short ID
- Lightweight properties: status, labels, assignee_id, created_at, updated_at
- **Layer 1 edges:** `HAS_CHILD` between components (decomposition tree)
- **Layer 1→2 edges:** `HAS_CHILD` from components to issues (work attachment)
- **Layer 2 edges:** `HAS_CHILD` between issues (sub-tasks); `BLOCKS`, `DUPLICATES`, `RELATES_TO` between issues (work coordination)
- **Layer 3 edges:** `DEPENDS_ON`, `IMPORTS`, `CALLS_API`, `SHARES_DB` between components (code connections)
- **Layer 4 edges:** `HAS_ARTIFACT` from components/issues to artifacts
- Project root references, cycle membership

Apart from `HAS_CHILD`, which forms the decomposition tree across Layers 1 and 2, each edge type is scoped to a single layer. This makes layer-filtered queries straightforward: a Cypher query for "show me this subtree with only Layer 3 edges" filters by relationship type.
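
A minimal sketch of how such a layer filter could be assembled in the backend. The `LAYER_EDGE_TYPES` table and the `subtree_overlay_query` helper are hypothetical names, not part of the codebase; the edge types are copied from the list above, and the root ID is passed as the `$root_id` query parameter rather than interpolated.

```python
# Edge types per layer, copied from the list above (HAS_CHILD is the tree itself).
LAYER_EDGE_TYPES = {
    2: ["BLOCKS", "DUPLICATES", "RELATES_TO"],
    3: ["DEPENDS_ON", "IMPORTS", "CALLS_API", "SHARES_DB"],
    4: ["HAS_ARTIFACT"],
}


def subtree_overlay_query(layers: list[int]) -> str:
    """Build a Cypher query that walks a subtree via HAS_CHILD and returns
    only the overlay edges belonging to the requested layers."""
    rel_types = [t for layer in layers for t in LAYER_EDGE_TYPES[layer]]
    # Only static identifiers from the table above are joined into the query text.
    type_filter = "|".join(rel_types)
    return (
        "MATCH (root {id: $root_id})-[:HAS_CHILD*0..]->(n)\n"
        f"OPTIONAL MATCH (n)-[r:{type_filter}]->(m)\n"
        "RETURN n, r, m"
    )


if __name__ == "__main__":
    print(subtree_overlay_query([3]))
```

An unknown layer number raises `KeyError`, which keeps arbitrary strings out of the query text.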
|
|
|
|
**Why Neo4j over Postgres recursive CTEs:** Queries like "find all unblocked leaves in this subtree," "critical path through blocks links," "everything 3 hops from this node" are what Cypher is built for. CTEs get painful with lateral links and variable-depth queries. The gap widens with Layer 3 code connections (multi-hop dependency chains) and in v0.2+ with cross-project edges.
|
|
|
|
### Postgres — Content & Metadata
|
|
|
|
- **Rich text content:** issue and component descriptions (markdown)
|
|
- **Comment threads:** body, author, parent_comment_id (threading), timestamps
|
|
- **Attachment metadata:** filename, size, mime_type, s3_key, uploader_id, uploaded_at (inline attachments in comments/descriptions)
|
|
- **Artifact metadata (Layer 4):** title, kind, url/file_ref, mime_type — rich metadata for external docs, designs, and uploaded files attached to nodes
|
|
- **User/agent accounts:** profile data, preferences, notification settings
|
|
- **Project settings:** configuration, member lists, default policies
|
|
- **Audit logs:** who changed what, when, with before/after snapshots
|
|
- **Policy definitions:** role templates, custom permission rules
|
|
|
|
**Linked to Neo4j by UUID.** Neo4j node stores `id: "abc-123"`. Postgres stores full content keyed by same UUID. FastAPI joins them as needed. This applies to all node types: Components (Layer 1), Issues (Layer 2), and Artifacts (Layer 4).
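
As an illustration of the join step, the sketch below merges a graph node with its content row on the shared UUID. `merge_node` and the dict shapes are assumptions made for this example; the real handlers would work with driver records and SQLModel objects.

```python
def merge_node(graph_node: dict, content_by_id: dict[str, dict]) -> dict:
    """Combine a Neo4j node with its Postgres content row, matched on the shared UUID."""
    content = content_by_id.get(graph_node["id"], {})
    return {
        **graph_node,
        # Content fields are None when no Postgres row exists for this UUID.
        "description": content.get("description"),
        "description_html": content.get("description_html"),
    }


if __name__ == "__main__":
    node = {"id": "abc-123", "short_id": "NL-42", "status": "todo"}
    rows = {"abc-123": {"description": "# Refresh tokens", "description_html": "<h1>Refresh tokens</h1>"}}
    print(merge_node(node, rows))
```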
|
|
|
|
### Redis — Caching & Rate Limiting

- Subtree query cache (TTL, invalidated on graph mutations)
- Rate limiting for agent API
- Authentik token validation cache

Redis does not carry WebSocket pub/sub traffic: the backend publishes real-time events to Centrifugo through its HTTP server API (see Real-Time Updates).
|
|
|
|
### Meilisearch — Search Index (post-alpha)
|
|
|
|
> **Alpha:** Full-text search is handled by Postgres `tsvector` in alpha. Meilisearch is introduced post-alpha for typo-tolerant, prefix-aware search across large datasets.
|
|
|
|
- Indexes issue titles, descriptions, comments, labels
|
|
- Fed from both Neo4j and Postgres
|
|
- Powers command palette search (issues + commands in one result set)
|
|
- Typo-tolerant, prefix search, filtering by label/status/assignee
|
|
|
|
### MinIO — File Storage (post-alpha)
|
|
|
|
> **Alpha:** No file uploads. Links in descriptions suffice for the 2-week validation. MinIO is introduced post-alpha alongside Artifacts (Layer 4).
|
|
|
|
- S3-compatible API, self-hosted
|
|
- Stores attachment files (images, docs)
|
|
- Postgres stores metadata and S3 key; MinIO stores bytes
|
|
- Migration path to AWS S3: zero code changes
|
|
|
|
## Concrete Database Schemas
|
|
|
|
### UUID Strategy
|
|
|
|
All entities use UUIDv7 (time-sortable). Generated application-side by FastAPI before writing to either database. The same UUID is used as the primary key in both Neo4j and Postgres, serving as the cross-database join key.
|
|
|
|
### Neo4j Schema
|
|
|
|
Neo4j stores graph topology and lightweight node properties. All content lives in Postgres.
|
|
|
|
**Node labels and properties:**
|
|
|
|
```cypher
// Layer 1: Component node
CREATE (c:Component {
  id: "uuidv7",
  short_id: "NL-C12",
  title: "auth-service",
  status: null,                    // components have no status
  labels: ["backend", "core"],
  owner_id: "uuidv7",
  assignee_id: null,
  repo_provider: "github",
  repo_url: "https://github.com/team/auth",
  repo_path: "/src/oauth",
  repo_branch: "main",
  created_at: datetime(),
  updated_at: datetime()
})

// Layer 2: Issue node
CREATE (i:Issue {
  id: "uuidv7",
  short_id: "NL-42",
  title: "implement refresh tokens",
  status: "todo",
  labels: ["feature", "p1"],
  assignee_id: "uuidv7",
  created_by: "uuidv7",
  cycle_id: "uuidv7",
  created_at: datetime(),
  updated_at: datetime()
})

// Layer 4: Artifact node — POST-ALPHA
CREATE (a:Artifact {
  id: "uuidv7",
  title: "Login flow mockup",
  kind: "link",                    // "link" | "file" | "embed"
  url: "https://figma.com/...",    // for links/embeds
  file_ref: null,                  // MinIO s3_key for uploaded files
  mime_type: null,
  size_bytes: null,
  created_by: "uuidv7",
  created_at: datetime()
})

// Project root (virtual node linking to decomposition tree root)
CREATE (p:Project {
  id: "uuidv7",
  workspace_id: "uuidv7",
  root_id: "uuidv7"
})
```
|
|
|
|
**Relationships (organized by layer):**
|
|
|
|
```cypher
// Decomposition tree (parent → child) — Layer 1 + Layer 2
(component)-[:HAS_CHILD]->(component)   // Layer 1: structure nesting
(component)-[:HAS_CHILD]->(issue)       // Layer 1→2: work attached to structure
(issue)-[:HAS_CHILD]->(issue)           // Layer 2: sub-tasks

// Layer 2: Work coordination links (between issues)
(issue)-[:BLOCKS]->(issue)
(issue)-[:RELATES_TO]->(issue)
(issue)-[:DUPLICATES]->(issue)

// Layer 3: Code connection links (between components) — POST-ALPHA
(component)-[:DEPENDS_ON {source: "manual"}]->(component)
(component)-[:IMPORTS {source: "inferred"}]->(component)
(component)-[:CALLS_API {source: "inferred"}]->(component)
(component)-[:SHARES_DB {source: "manual"}]->(component)

// Layer 4: Artifact attachments — POST-ALPHA
(component)-[:HAS_ARTIFACT]->(artifact)
(issue)-[:HAS_ARTIFACT]->(artifact)

// Cycle membership
(issue)-[:IN_CYCLE]->(cycle:Cycle { id, name, start_date, end_date })
```
|
|
|
|
Layer 3 edges carry a `source` property (`"manual"` or `"inferred"`) to distinguish human-declared dependencies from code-analysis results.
|
|
|
|
**Indexes:**
|
|
|
|
```cypher
CREATE INDEX comp_id FOR (c:Component) ON (c.id);
CREATE INDEX comp_short FOR (c:Component) ON (c.short_id);
CREATE INDEX issue_id FOR (i:Issue) ON (i.id);
CREATE INDEX issue_short FOR (i:Issue) ON (i.short_id);
CREATE INDEX issue_status FOR (i:Issue) ON (i.status);
CREATE INDEX issue_assignee FOR (i:Issue) ON (i.assignee_id);
CREATE INDEX artifact_id FOR (a:Artifact) ON (a.id);
CREATE INDEX project_id FOR (p:Project) ON (p.id);
```
|
|
|
|
### Postgres Schema (SQLModel)
|
|
|
|
Postgres stores all content, metadata, and configuration. Managed via Alembic migrations.
|
|
|
|
```python
import uuid
from datetime import datetime

from sqlmodel import JSON, Column, Field, SQLModel
# Assumption: uuid7 comes from the third-party `uuid6` package
# (Python 3.14+ also ships uuid.uuid7 in the standard library).
from uuid6 import uuid7


class NodeContent(SQLModel, table=True):
    """Rich content for both components and issues."""
    id: uuid.UUID = Field(primary_key=True)  # matches Neo4j node id
    description: str | None = None           # markdown
    description_html: str | None = None      # pre-rendered, sanitized HTML


class Comment(SQLModel, table=True):
    id: uuid.UUID = Field(default_factory=uuid7, primary_key=True)
    node_id: uuid.UUID = Field(foreign_key="nodecontent.id", index=True)
    author_id: uuid.UUID = Field(foreign_key="actor.id")
    # Threading, as listed under "Comment threads" in the Data Boundary section
    parent_comment_id: uuid.UUID | None = Field(default=None, foreign_key="comment.id")
    body: str       # markdown
    body_html: str  # pre-rendered, sanitized HTML
    created_at: datetime
    updated_at: datetime


class CommentReaction(SQLModel, table=True):
    id: uuid.UUID = Field(default_factory=uuid7, primary_key=True)
    comment_id: uuid.UUID = Field(foreign_key="comment.id", index=True)
    actor_id: uuid.UUID = Field(foreign_key="actor.id")
    emoji: str  # e.g. "+1", "rocket"
    created_at: datetime


class Attachment(SQLModel, table=True):
    """File attached inline to a comment or description (e.g. pasted image)."""
    id: uuid.UUID = Field(default_factory=uuid7, primary_key=True)
    node_id: uuid.UUID = Field(foreign_key="nodecontent.id", index=True)
    filename: str
    size_bytes: int
    mime_type: str
    s3_key: str  # MinIO object key
    uploader_id: uuid.UUID = Field(foreign_key="actor.id")
    uploaded_at: datetime


class ArtifactContent(SQLModel, table=True):
    """Layer 4: external context attached to a component or issue.

    Topology (HAS_ARTIFACT edge) lives in Neo4j; rich metadata lives here.
    POST-ALPHA: introduced alongside Artifact nodes and MinIO.
    """
    id: uuid.UUID = Field(primary_key=True)  # matches Neo4j Artifact node id
    title: str
    kind: str                     # "link" | "file" | "embed"
    url: str | None = None        # external URL (Figma, Docs, etc.)
    file_ref: str | None = None   # MinIO s3_key for uploaded files
    mime_type: str | None = None
    size_bytes: int | None = None
    node_id: uuid.UUID = Field(foreign_key="nodecontent.id", index=True)
    created_by: uuid.UUID = Field(foreign_key="actor.id")
    created_at: datetime


class Actor(SQLModel, table=True):
    """Human user or AI agent."""
    id: uuid.UUID = Field(default_factory=uuid7, primary_key=True)
    type: str  # "user" | "agent"
    name: str
    email: str | None = None
    authentik_uid: str | None = None  # OIDC subject claim
    # dict fields need an explicit JSON column; theme, notifications, etc.
    preferences: dict = Field(default_factory=dict, sa_column=Column(JSON))
    created_at: datetime


class Workspace(SQLModel, table=True):
    id: uuid.UUID = Field(default_factory=uuid7, primary_key=True)
    name: str
    slug: str = Field(unique=True, index=True)
    created_at: datetime


class WorkspaceMember(SQLModel, table=True):
    workspace_id: uuid.UUID = Field(foreign_key="workspace.id", primary_key=True)
    actor_id: uuid.UUID = Field(foreign_key="actor.id", primary_key=True)
    role: str  # workspace-level role
    joined_at: datetime


class ProjectConfig(SQLModel, table=True):
    id: uuid.UUID = Field(primary_key=True)  # matches Neo4j Project id
    workspace_id: uuid.UUID = Field(foreign_key="workspace.id", index=True)
    name: str
    # JSON: custom statuses, defaults
    settings: dict = Field(default_factory=dict, sa_column=Column(JSON))
    created_at: datetime


class PolicyRule(SQLModel, table=True):
    id: uuid.UUID = Field(default_factory=uuid7, primary_key=True)
    project_id: uuid.UUID = Field(foreign_key="projectconfig.id", index=True)
    actor_id: uuid.UUID | None = Field(default=None)  # null = role-level
    role_name: str | None = None
    action: str          # e.g. "read_node", "create_child", "*"
    resource_scope: str  # "global" | "subtree:{node_id}" | "node:{node_id}"
    effect: str          # "allow" | "deny"


class AuditLog(SQLModel, table=True):
    id: uuid.UUID = Field(default_factory=uuid7, primary_key=True)
    project_id: uuid.UUID = Field(foreign_key="projectconfig.id", index=True)
    actor_id: uuid.UUID = Field(foreign_key="actor.id")
    action: str  # e.g. "status_changed", "reparented"
    node_id: uuid.UUID | None = None
    before: dict | None = Field(default=None, sa_column=Column(JSON))  # JSON snapshot
    after: dict | None = Field(default=None, sa_column=Column(JSON))   # JSON snapshot
    created_at: datetime = Field(index=True)


class WebhookConfig(SQLModel, table=True):
    id: uuid.UUID = Field(default_factory=uuid7, primary_key=True)
    project_id: uuid.UUID = Field(foreign_key="projectconfig.id", index=True)
    url: str
    secret_hash: str  # hashed, never stored plaintext
    events: list[str] = Field(default_factory=list, sa_column=Column(JSON))
    active: bool = True
    consecutive_failures: int = 0
    created_at: datetime
```
|
|
|
|
## Dual-Database Consistency
|
|
|
|
Neo4j and Postgres are **not replicated** — they own different data, linked by UUID. Both writes happen in the same API request. The consistency strategy for v0.1:
|
|
|
|
### Write Order
|
|
|
|
1. **Postgres first.** Open a SQLAlchemy transaction. Write content/metadata. Do not commit yet.
|
|
2. **Neo4j second.** Perform the graph mutation (create node, update properties, create edge).
|
|
3. **Commit Postgres.** If Postgres commit succeeds, the operation is complete.
|
|
|
|
### Failure Handling
|
|
|
|
- **Neo4j write fails:** Rollback the Postgres transaction (it hasn't committed). Clean failure, no orphans.
|
|
- **Postgres commit fails after Neo4j succeeds:** Issue a compensating operation on Neo4j (delete the node/revert the property change). Log the incident for review.
|
|
- **Partial Neo4j failure (e.g., network timeout with unknown state):** Flag the UUID for reconciliation review.
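
The write order and the first two failure cases can be sketched with in-memory stand-ins. `FakePostgres`, `FakeNeo4j`, and `create_node` are hypothetical names for illustration; a real implementation would use a SQLAlchemy session and the Neo4j driver, and the unknown-state timeout case is not modeled here.

```python
class FakePostgres:
    """In-memory stand-in for a Postgres session (illustration only)."""

    def __init__(self, fail_commit: bool = False) -> None:
        self.rows: dict[str, dict] = {}
        self.pending: dict[str, dict] = {}
        self.fail_commit = fail_commit

    def write(self, node_id: str, content: dict) -> None:
        self.pending[node_id] = content  # staged, not yet committed

    def commit(self) -> None:
        if self.fail_commit:
            self.pending.clear()
            raise RuntimeError("postgres commit failed")
        self.rows.update(self.pending)
        self.pending.clear()

    def rollback(self) -> None:
        self.pending.clear()


class FakeNeo4j:
    """In-memory stand-in for graph writes (illustration only)."""

    def __init__(self, fail_write: bool = False) -> None:
        self.nodes: dict[str, dict] = {}
        self.fail_write = fail_write

    def create(self, node_id: str, props: dict) -> None:
        if self.fail_write:
            raise RuntimeError("neo4j write failed")
        self.nodes[node_id] = props

    def delete(self, node_id: str) -> None:
        self.nodes.pop(node_id, None)


def create_node(pg, graph, node_id: str, content: dict, props: dict, incidents: list) -> bool:
    pg.write(node_id, content)        # 1. Postgres first, uncommitted
    try:
        graph.create(node_id, props)  # 2. Neo4j second
    except RuntimeError:
        pg.rollback()                 # nothing was committed, so no orphans
        return False
    try:
        pg.commit()                   # 3. commit Postgres
    except RuntimeError:
        graph.delete(node_id)         # compensating operation on Neo4j
        incidents.append(node_id)     # recorded for later review
        return False
    return True
```

Returning `False` keeps the sketch short; an API handler would more likely raise and map the failure to a `500` response.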
|
|
|
|
### Reconciliation Job
|
|
|
|
A periodic background task (Taskiq, runs every 15 minutes) checks for inconsistencies:
|
|
|
|
- UUIDs present in Neo4j but missing from Postgres (orphan graph nodes)
|
|
- UUIDs present in Postgres `NodeContent` but missing from Neo4j (orphan content)
|
|
- Mismatched lightweight properties (status, assignee) between Neo4j and Postgres audit log
|
|
|
|
Orphans are logged and surfaced in an admin dashboard. Auto-repair is deferred — manual review for v0.1.
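
The two orphan checks reduce to set differences over the UUIDs known to each store. A minimal sketch, assuming the job has already loaded both ID sets (the `find_orphans` name and the result keys are illustrative):

```python
def find_orphans(neo4j_ids: set[str], postgres_ids: set[str]) -> dict[str, set[str]]:
    """Compare the UUIDs known to each store and report one-sided entries."""
    return {
        "orphan_graph_nodes": neo4j_ids - postgres_ids,  # in Neo4j, missing from Postgres
        "orphan_content": postgres_ids - neo4j_ids,      # in Postgres, missing from Neo4j
    }


if __name__ == "__main__":
    print(find_orphans({"a", "b", "c"}, {"b", "c", "d"}))
```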
|
|
|
|
### What's Eventually Consistent
|
|
|
|
- **Meilisearch index:** Updated asynchronously via Taskiq. Acceptable lag of seconds.
|
|
- **Redis cache:** Invalidated on mutation. TTL-based expiry as fallback.
|
|
- **Centrifugo events:** Fire-and-forget publish. Missed events are recoverable by client re-fetch.
|
|
|
|
## Backend Architecture
|
|
|
|
### FastAPI Application Structure
|
|
|
|
```
|
|
non-linear-api/
|
|
├── app/
|
|
│ ├── main.py # App, middleware, startup/shutdown
|
|
│ ├── config.py # Settings from env vars
|
|
│ ├── dependencies.py # Shared deps (db sessions, auth, current_user)
|
|
│ ├── auth/ # Authentik integration
|
|
│ │ ├── oidc.py # Token validation, OIDC discovery
|
|
│ │ ├── permissions.py # Policy engine evaluation
|
|
│ │ └── agent_tokens.py # API token management for agents
|
|
│ ├── graph/ # Neo4j layer
|
|
│ │ ├── connection.py # Neo4j driver management
|
|
│ │ ├── queries.py # Cypher query templates
|
|
│ │ ├── mutations.py # Graph write operations
|
|
│ │ └── traversal.py # Subtree, path, neighbor queries
|
|
│ ├── content/ # Postgres layer
|
|
│ │ ├── models.py # SQLAlchemy/SQLModel models
|
|
│ │ ├── descriptions.py # Rich text CRUD
|
|
│ │ ├── comments.py # Comment thread CRUD
|
|
│ │ ├── attachments.py # Inline attachment metadata + MinIO upload/download — POST-ALPHA
|
|
│ │ └── artifacts.py # Layer 4: artifact CRUD (links, files, embeds) — POST-ALPHA
|
|
│ ├── connections/ # Layer 3: code connection analysis — POST-ALPHA
|
|
│ │ ├── inference.py # Auto-infer dependencies from repo analysis
|
|
│ │ └── manual.py # Manual code connection CRUD
|
|
│ ├── search/ # Meilisearch integration
|
|
│ │ ├── indexer.py # Index updates on mutations
|
|
│ │ └── search.py # Query interface
|
|
│ ├── realtime/ # WebSocket layer
|
|
│ │ ├── manager.py # Connection management
|
|
│ │ └── events.py # Event types and broadcasting
|
|
│ ├── tasks/ # Taskiq background jobs
|
|
│ │ ├── webhooks.py # Deliver webhooks to agent endpoints
|
|
│ │ ├── indexing.py # Async search index updates
|
|
│ │ ├── notifications.py # Notification delivery
|
|
│ │ └── connections.py # Layer 3: periodic code connection inference — POST-ALPHA
|
|
│ └── api/v1/ # Route handlers
|
|
│ ├── nodes.py # CRUD + tree operations
|
|
│ ├── links.py # Lateral link management (Layer 2 + Layer 3)
|
|
│ ├── projects.py # Project CRUD
|
|
│ ├── comments.py # Comment endpoints
|
|
│ ├── attachments.py # Inline upload/download
|
|
│ ├── artifacts.py # Layer 4: artifact endpoints — POST-ALPHA
|
|
│ ├── connections.py # Layer 3: code connection endpoints — POST-ALPHA
|
|
│ ├── search.py # Search endpoint
|
|
│ └── agent.py # Agent-specific API surface
|
|
├── tests/
|
|
├── alembic/ # Postgres migrations
|
|
├── docker-compose.yml
|
|
└── pyproject.toml
|
|
```
|
|
|
|
### Request Flows
|
|
|
|
**Typical read ("get node with full context"):**
|
|
|
|
```
Client → FastAPI → Auth middleware (validate token via Authentik)
                 → Policy engine (check permissions)
                 → Neo4j: fetch node + parent + children + links
                 → Postgres: fetch description, comments, attachment meta
                 → Merge response → Client
```
|
|
|
|
**Typical write ("change node status"):**
|
|
|
|
```
Client → FastAPI → Auth → Policy engine
                 → Neo4j: update node status
                 → Redis: invalidate cache
                 → Centrifugo: publish event, broadcast to connected clients
                 → Taskiq: queue webhook delivery, search index update
                 → Response → Client
```
|
|
|
|
### Sync Strategy (Neo4j ↔ Postgres)
|
|
|
|
Neo4j and Postgres are not replicated: each owns different data, linked by UUID. Both writes happen in the same API request, using the compensating-transaction pattern described under Dual-Database Consistency. Eventual consistency is acceptable for the search index and the cache.
|
|
|
|
## Auth Architecture
|
|
|
|
```
┌──────────┐      OIDC token      ┌───────────┐
│ Vue App  ├─────────────────────►│ Authentik │
└────┬─────┘     (login flow)     └─────┬─────┘
     │                                  │
     │ Bearer token                     │ Token introspection
     ▼                                  ▼
┌──────────┐◄─────────────────────┌───────────┐
│ FastAPI  │    validate token    │ Authentik │
│ (resource│    check claims      │ (OIDC     │
│  server) │                      │  provider)│
└──────────┘                      └───────────┘
```
|
|
|
|
- **Human users:** OIDC login flow. JWT access tokens.
|
|
- **AI agents:** API tokens issued through Authentik, tied to agent actor accounts.
|
|
- **FastAPI:** pure resource server. Validates tokens, reads claims, enforces policies.
|
|
|
|
## API Error Contract
|
|
|
|
All error responses use a consistent envelope:
|
|
|
|
```json
{
  "error": {
    "code": "validation_error",
    "message": "Human-readable description",
    "details": [
      { "field": "title", "message": "Field is required" }
    ]
  }
}
```
|
|
|
|
### HTTP Status Codes
|
|
|
|
| Code | Usage |
|------|-------|
| `400` | Malformed request (bad JSON, missing required fields) |
| `404` | Resource not found **or** actor lacks permission to see it. Permission-denied nodes return 404 (not 403) to prevent information leakage about resource existence. |
| `409` | Conflict (e.g., duplicate `short_id`, stale update) |
| `422` | Validation error. Standard FastAPI/Pydantic response with field-level detail. |
| `429` | Rate limited. Includes `Retry-After` header (seconds). |
| `500` | Internal server error. Logged with correlation ID for debugging. |
|
|
|
|
### Rate Limiting
|
|
|
|
- Agent API: token bucket per actor, configurable per role (default: 100 req/min).
|
|
- Human API: higher limits (default: 300 req/min).
|
|
- Enforced via Redis. `429` response includes `Retry-After` and `X-RateLimit-Remaining` headers.
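
A token bucket per actor can be sketched as follows. This version keeps state in memory purely for illustration; in the design above the bucket state would live in Redis. The `TokenBucket` class and its method names are assumptions, not existing code.

```python
import math
from dataclasses import dataclass


@dataclass
class TokenBucket:
    capacity: float        # maximum burst size
    refill_per_sec: float  # steady-state rate
    tokens: float          # tokens currently available
    updated_at: float      # timestamp of the last refill, in seconds

    @classmethod
    def per_minute(cls, limit: int, now: float) -> "TokenBucket":
        """Create a full bucket for a 'limit requests per minute' policy."""
        return cls(capacity=limit, refill_per_sec=limit / 60.0, tokens=limit, updated_at=now)

    def try_acquire(self, now: float) -> tuple[bool, int]:
        """Return (allowed, retry_after_seconds) for one request at time `now`."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True, 0
        # Seconds until one full token is available, rounded up for Retry-After.
        retry_after = math.ceil((1.0 - self.tokens) / self.refill_per_sec)
        return False, retry_after
```

With the default agent limit, `TokenBucket.per_minute(100, now)` allows a burst of 100 requests and then refills at 100 tokens per minute. The second tuple element maps to the `Retry-After` header.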
|
|
|
|
## Security
|
|
|
|
### Input Sanitization
|
|
|
|
- **Cypher injection:** All Neo4j queries use parameterized Cypher exclusively. User-supplied values are never interpolated into query strings. The `graph/queries.py` module enforces this by accepting only typed parameters.
|
|
- **SQL injection:** SQLModel/SQLAlchemy parameterized queries. No raw SQL with string formatting.
|
|
- **XSS prevention:** All markdown content (descriptions, comments) is rendered to HTML and sanitized server-side with `nh3` (Rust-based HTML sanitizer) before storage. Both the raw markdown and the pre-rendered, sanitized HTML are stored, and the frontend renders only the sanitized HTML.
|
|
- **File upload validation:** MIME type validation against allowlist (images, PDFs, common doc formats). Size limit: 25 MB per file. Filename sanitization to prevent path traversal.
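
The upload checks can be illustrated with the standard library alone. The allowlist below is a small illustrative subset, treating 25 MB as 25 MiB is an assumption, and the function names are hypothetical.

```python
import re

MAX_UPLOAD_BYTES = 25 * 1024 * 1024  # assumption: "25 MB" interpreted as 25 MiB
ALLOWED_MIME_TYPES = {"image/png", "image/jpeg", "application/pdf"}  # illustrative subset


def sanitize_filename(name: str) -> str:
    """Drop any directory part and replace characters outside a safe set."""
    base = name.replace("\\", "/").rsplit("/", 1)[-1]
    cleaned = re.sub(r"[^A-Za-z0-9._-]", "_", base).lstrip(".")
    return cleaned or "upload"


def validate_upload(filename: str, mime_type: str, size_bytes: int) -> str:
    """Return the sanitized filename, or raise ValueError if the upload is rejected."""
    if mime_type not in ALLOWED_MIME_TYPES:
        raise ValueError(f"MIME type not allowed: {mime_type}")
    if size_bytes > MAX_UPLOAD_BYTES:
        raise ValueError("file exceeds the 25 MB limit")
    return sanitize_filename(filename)
```

The declared MIME type is client-supplied, so a production check would likely also inspect the file contents.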
|
|
|
|
### Transport & Headers
|
|
|
|
- **TLS:** All traffic encrypted via Caddy reverse proxy (automatic Let's Encrypt certificates).
|
|
- **CSRF:** SameSite=Lax cookies for browser sessions. Bearer token API calls are inherently CSRF-safe.
|
|
- **Content-Security-Policy:** Strict CSP headers served by Caddy. `script-src 'self'`, no inline scripts, no `eval`.
|
|
- **CORS:** Allowlist of known origins (frontend domain). No wildcard in production.
|
|
- **Security headers:** `X-Content-Type-Options: nosniff`, `X-Frame-Options: DENY`, `Strict-Transport-Security`.
|
|
|
|
## Design Language
|
|
|
|
Targets Linear's aesthetic: minimal, fast, slightly dark-IDE feel.
|
|
|
|
- **Spacing:** tight, no wasted space
|
|
- **Colors:** muted base palette, high-contrast accents only for status/priority
|
|
- **Borders:** almost none — separation via spacing and subtle background shifts
|
|
- **Dark mode:** default, light mode secondary
|
|
- **Typography:** Inter, small-but-readable sizes
|
|
- **Animations:** subtle slides and fades, 100-150ms, nothing bouncy
|
|
- **Optimistic updates:** every interaction feels instant, syncs in background
|
|
|
|
## Real-Time Updates (Centrifugo)
|
|
|
|
Centrifugo handles both live UI updates and notification delivery over WebSocket. Redis is no longer used for WebSocket pub/sub directly — Centrifugo manages its own connections and subscribes to events published by the backend via its server API.
|
|
|
|
### Channel Structure
|
|
|
|
| Channel | Scope | Subscribers |
|---------|-------|-------------|
| `project:{id}` | All mutations in a project | All connected project members |
| `node:{id}` | Mutations to a specific node | Clients viewing the focus widget for that node |
| `user:{id}` | Personal notifications | Single user's connected clients |
|
|
|
|
### Events Pushed
|
|
|
|
| Event | Layer | Channel | Payload |
|-------|-------|---------|---------|
| `node.status_changed` | 2 | `project:{id}` + `node:{id}` | node_id, old_status, new_status, actor |
| `node.created` | 1/2 | `project:{id}` | node_id, parent_id, type, title, actor |
| `node.deleted` | 1/2 | `project:{id}` + `node:{id}` | node_id, actor |
| `node.reparented` | 1/2 | `project:{id}` + `node:{id}` | node_id, old_parent, new_parent, actor |
| `comment.added` | 2 | `node:{id}` | comment_id, node_id, author, preview |
| `link.changed` | 2/3 | `project:{id}` | source_id, target_id, link_type, layer, action (created/removed) |
| `assignment.changed` | 2 | `project:{id}` + `node:{id}` | node_id, old_assignee, new_assignee |
| `artifact.attached` | 4 | `project:{id}` + `node:{id}` | artifact_id, node_id, title, kind, actor |
| `artifact.removed` | 4 | `project:{id}` + `node:{id}` | artifact_id, node_id, actor |
| `connection.inferred` | 3 | `project:{id}` | source_id, target_id, link_type, source: "inferred" |
| `notification` | — | `user:{id}` | notification object |
|
|
|
|
The `layer` field on `link.changed` events tells the client which layer the change affects, enabling clients to ignore events for inactive layers.
|
|
|
|
### Backend Publish Flow
|
|
|
|
```
Mutation request → Postgres + Neo4j writes
                 → Centrifugo server API: publish event to relevant channels
                 → Taskiq: queue webhook delivery + search index update
                 → Response to client
```
|
|
|
|
The backend publishes to Centrifugo via its HTTP server API (not through Redis pub/sub). This gives direct control over which channels receive which events.
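
One way to keep that control explicit is a routing table from event type to channel scopes, mirroring the Events Pushed table. The sketch below only computes channel names; the actual publish call to Centrifugo is left out, and `EVENT_CHANNELS` and `channels_for` are hypothetical names.

```python
# Channel scopes per event, copied from the "Events Pushed" table.
EVENT_CHANNELS: dict[str, tuple[str, ...]] = {
    "node.status_changed": ("project", "node"),
    "node.created": ("project",),
    "node.deleted": ("project", "node"),
    "node.reparented": ("project", "node"),
    "comment.added": ("node",),
    "link.changed": ("project",),
    "assignment.changed": ("project", "node"),
    "artifact.attached": ("project", "node"),
    "artifact.removed": ("project", "node"),
    "connection.inferred": ("project",),
    "notification": ("user",),
}


def channels_for(event: str, **ids: str) -> list[str]:
    """Resolve channel names such as 'project:p1' for an event.

    Expects keyword arguments named after the scopes: project, node, user.
    """
    channels = []
    for scope in EVENT_CHANNELS[event]:
        if scope not in ids:
            raise ValueError(f"{event} requires a {scope} id")
        channels.append(f"{scope}:{ids[scope]}")
    return channels


if __name__ == "__main__":
    print(channels_for("node.status_changed", project="p1", node="n1"))
```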
|
|
|
|
### Client-Side Handling
|
|
|
|
- **Pinia store:** Incoming Centrifugo events are applied to the Pinia store. The graph view, focus widget, and list view all react to store changes.
|
|
- **Optimistic updates:** The client applies mutations locally before the server responds. If the server rejects the mutation (4xx), the client reverts the optimistic change by re-fetching the affected node.
|
|
- **Conflict model:** Last-write-wins for simple fields (status, assignee, labels). The server is the source of truth. When two clients modify the same field concurrently, the last write committed to Neo4j is the one that Centrifugo broadcasts.
|
|
- **Reconnection:** On WebSocket disconnect, the client re-subscribes to channels and fetches the current state to catch up on missed events.
|
|
|
|
### Cross-Platform
|
|
|
|
- **Tauri desktop:** No offline support. Tauri wraps the Vue app as-is. When the network is unavailable, the app shows a connection-lost banner and retries. No local mutation queue.
|
|
|
|
## Docker Compose
|
|
|
|
### Alpha (~10 containers)
|
|
|
|
```yaml
services:
  api:           # FastAPI (uvicorn)
  frontend:      # Vue 3 (Vite)
  worker:        # Taskiq worker (same codebase as api)
  neo4j:         # Graph database
  postgres:      # Relational database
  redis:         # Cache + rate limiting
  centrifugo:    # Real-time WebSocket server
  caddy:         # Reverse proxy + automatic TLS
  authentik:     # Identity provider (server + worker)
  authentik-db:  # Authentik's own Postgres
```
|
|
|
|
~10 containers. Runs comfortably on 16GB RAM. Search via Postgres `tsvector`.
|
|
|
|
### Development (v0.1-full)
|
|
|
|
```yaml
services:
  api:           # FastAPI (uvicorn --reload)
  frontend:      # Vue 3 (vite dev server)
  worker:        # Taskiq worker (same codebase as api)
  neo4j:         # Graph database
  postgres:      # Relational database
  redis:         # Cache + rate limiting
  meilisearch:   # Search engine
  minio:         # Object storage
  centrifugo:    # Real-time WebSocket server
  caddy:         # Reverse proxy + automatic TLS
  authentik:     # Identity provider (server + worker)
  authentik-db:  # Authentik's own Postgres
```
|
|
|
|
~13 containers. Runs comfortably on 16GB RAM.
|
|
|
|
### Production (Single-Node, v0.1-full)
|
|
|
|
Same Docker Compose topology with production-grade additions:
|
|
|
|
```yaml
services:
  # ... all of the above (including caddy), plus:
  vault:       # Secrets management (HashiCorp Vault)
  prometheus:  # Metrics collection
  grafana:     # Dashboards + alerting
  loki:        # Log aggregation
  tempo:       # Distributed tracing
```
|
|
|
|
~18 containers total. Recommended: 32GB RAM, 4+ CPU cores for production.
|
|
|
|
## Reverse Proxy (Caddy)
|
|
|
|
Caddy serves as the single entry point for all traffic:
|
|
|
|
- **Automatic TLS** via Let's Encrypt (ACME). Zero-config HTTPS.
|
|
- **Routes:** `/api/*` → FastAPI, `/ws/*` → Centrifugo, `/*` → Vue frontend (nginx or static files).
|
|
- **Security headers:** CSP, HSTS, X-Frame-Options, X-Content-Type-Options injected at this layer.
|
|
- **Rate limiting:** Basic connection-level rate limiting as a first defense layer (application-level rate limiting in FastAPI for finer control).
|
|
|
|
## Secrets Management
|
|
|
|
> **Alpha:** Docker secrets or environment variables. Vault is introduced post-alpha.
|
|
|
|
### HashiCorp Vault (post-alpha)
|
|
|
|
- All sensitive configuration (database passwords, Authentik client secrets, agent API token signing keys, webhook HMAC secrets, MinIO credentials) stored in Vault.
|
|
- FastAPI reads secrets from Vault at startup via the `hvac` Python client.
|
|
- Secret rotation supported without application restart (Vault dynamic secrets for Postgres credentials).
|
|
|
|
### Docker Secrets (Fallback)
|
|
|
|
For simpler deployments that don't want Vault overhead, Docker secrets via compose files are supported. Environment variables as the last resort.
|
|
|
|
## Observability (post-alpha)
|
|
|
|
> **Alpha:** Structured JSON logs via `structlog` + `docker logs`. The full observability stack below is introduced post-alpha.
|
|
|
|
### Metrics (Prometheus + Grafana)
|
|
|
|
- **FastAPI:** `prometheus-fastapi-instrumentator` exposes request latency, status codes, in-flight requests at `/metrics`.
|
|
- **Neo4j:** Neo4j Prometheus plugin or `neo4j-exporter` for query latency, cache hit rates, transaction counts.
|
|
- **Postgres:** `postgres_exporter` for connection pool, query stats, replication lag.
|
|
- **Redis:** `redis_exporter` for memory, hit rate, connected clients.
|
|
- **Centrifugo:** Built-in Prometheus metrics for connections, channels, messages.
|
|
- **Grafana dashboards:** Pre-built dashboards for each service. Alerting rules for error rate spikes, high latency, container restarts.
|
|
|
|
### Tracing (OpenTelemetry + Tempo)
|
|
|
|
- OpenTelemetry SDK instrumented in FastAPI. Traces span the full request lifecycle: auth → policy check → Neo4j query → Postgres query → response.
|
|
- Trace context propagated to Taskiq workers (webhook delivery, indexing).
|
|
- Traces stored in Grafana Tempo, queryable from Grafana.
|
|
|
|
### Logging (Structured JSON + Loki)
|
|
|
|
- All services emit structured JSON logs (Python `structlog` for FastAPI).
|
|
- Fields: timestamp, level, correlation_id, actor_id, action, duration_ms.
|
|
- Collected by Grafana Loki via Docker logging driver or Promtail.
|
|
- Correlation ID links logs across FastAPI → Taskiq → Centrifugo for a single request.
|
|
|
|
### Health Checks
|
|
|
|
Every service exposes a health check endpoint used by Docker Compose `healthcheck` directives:
|
|
|
|
- `GET /health` on FastAPI, Centrifugo
|
|
- TCP checks for Neo4j, Postgres, Redis, Meilisearch, MinIO
|
|
- Grafana alerts on health check failures.
|
|
|
|
## Database Migrations
|
|
|
|
### Postgres (Alembic)
|
|
|
|
- Alembic manages all Postgres schema migrations.
|
|
- Migration files stored in `alembic/versions/`.
|
|
- Auto-generated from SQLModel model changes (`alembic revision --autogenerate`).
|
|
- Applied on deployment: `alembic upgrade head` runs before the API container starts.
|
|
|
|
### Neo4j (Versioned Cypher Scripts)
|
|
|
|
- Migration scripts stored in `neo4j/migrations/` as numbered Cypher files (`001_initial_schema.cypher`, `002_add_cycle_nodes.cypher`).
|
|
- A lightweight migration runner (Python script) tracks applied migrations in a Neo4j `:Migration` node.
|
|
- Applied on deployment before the API container starts.
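
The core of such a runner is deciding which scripts are still pending. A minimal sketch, assuming the applied names have already been read from the `:Migration` bookkeeping in Neo4j; `pending_migrations` and the filename pattern are illustrative, inferred from the example filenames above.

```python
import re

# Matches names like "001_initial_schema.cypher" (pattern inferred from the examples).
MIGRATION_NAME = re.compile(r"^\d{3}_[a-z0-9_]+\.cypher$")


def pending_migrations(available: list[str], applied: set[str]) -> list[str]:
    """Return not-yet-applied migration files in numeric (lexicographic) order."""
    valid = [name for name in available if MIGRATION_NAME.match(name)]
    return sorted(name for name in valid if name not in applied)


if __name__ == "__main__":
    files = ["002_add_cycle_nodes.cypher", "001_initial_schema.cypher", "README.md"]
    print(pending_migrations(files, {"001_initial_schema.cypher"}))
```

Zero-padded three-digit prefixes make lexicographic order match numeric order, so a plain `sorted` is enough.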
|
|
|
|
## Testing Strategy
|
|
|
|
### Integration Tests (Primary)
|
|
|
|
- **Framework:** pytest with testcontainers.
|
|
- **Containers:** Neo4j, Postgres, Redis, Meilisearch spun up per test session (shared across tests for speed, reset between test classes).
|
|
- **Scope:** API endpoint tests hitting real databases. Policy engine tests with real Neo4j graph structures. Dual-DB consistency tests verifying write-order semantics.
|
|
- **Fixtures:** Factory functions that create graph structures (components, issues, links) for test scenarios.
|
|
|
|
### End-to-End Tests
|
|
|
|
- **Framework:** Playwright against the full Docker Compose stack.
|
|
- **Scope:** Critical user flows — create project, add components, navigate graph, triage inbox, agent API workflows.
|
|
- **Environment:** Dedicated `docker-compose.test.yml` with ephemeral containers.
|
|
|
|
### What's Not Mandated
|
|
|
|
Isolated unit tests are not required by convention. The dual-DB architecture makes mocking both databases brittle. Integration tests with real containers are the priority.
|
|
|
|
## CI/CD Pipeline
|
|
|
|
```
push/MR → lint → test → build → push → deploy
```
|
|
|
|
| Stage | Tools | Description |
|-------|-------|-------------|
| **Lint** | ruff (Python), eslint + prettier (Vue/TS) | Code style and static analysis |
| **Test** | pytest + testcontainers, Playwright | Integration + E2E tests |
| **Build** | Docker | Build API, frontend, worker images |
| **Push** | Container registry | Push tagged images to GitLab Container Registry |
| **Deploy** | SSH + docker compose pull | Pull new images on production server, rolling restart |
|
|
|
|
CI runs on GitLab CI. Pipeline definition in `.gitlab-ci.yml`. Testcontainers require Docker-in-Docker or a privileged runner.
|
|
|
|
## Open Technical Questions
|
|
|
|
1. **Graph viz library:** D3 vs Cytoscape — prototype comparison pending
|
|
2. **Neo4j driver:** official `neo4j` Python driver vs `neomodel` OGM
|
|
3. **Gantt implementation:** custom or frappe-gantt as starting point
|