Roadmap¶

Tracking of deferred features, planned work, and known design ideas. Items here are explicit decisions to not build something now; the rationale and the unblocking criteria are documented per item so future contributors can pick them up with full context.

When you ship or defer something significant, update this page in the same change.

Sandbox / `--isolate`¶

The --isolate flag (see AI sandbox guide) ships with a deliberate set of tradeoffs. These are the next logical steps and the reasons they aren't in v1.

`--isolate-allow-net=<host[,...]>` — network egress allowlist¶

What: Restrict the AI's network reach to a specific host list (e.g. api.anthropic.com, api.bnerd.cloud, configured Kubernetes API servers). Currently --isolate is filesystem-bounded but allows unrestricted egress, leaving prompt-injection-driven exfiltration via web_fetch open.

Why deferred: It's a real architectural step up, not a one-line addition. The reference design is ExitBox/Cloud-Exit's "moat" pattern: the container has no DNS and no direct internet access, and a sidecar Squid proxy in a second container holds the only egress route and enforces the allowlist, answering 403 Forbidden for blocked hosts. The allowlist hot-reloads via a Unix socket so the user can grant a domain mid-session without restarting.

When to build: When users start pointing bnerd at self-hosted or third-party LLM endpoints they don't fully trust. Until then the filesystem boundary covers the everyday "don't poop in my home dir" risk.
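The allowlist decision itself is small; the architecture around it is the expensive part. As a minimal sketch of only that decision, assuming exact host matching with no wildcards, the Go below parses a flag value and maps a proxy target to 200 or 403. The function names `parseAllowlist` and `egressStatus` are hypothetical and not part of bnerd, and in the reference design this check would live in the Squid sidecar rather than in Go.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"strings"
)

// parseAllowlist turns a flag value such as
// "api.anthropic.com,api.bnerd.cloud" into a lookup set.
func parseAllowlist(flagValue string) map[string]bool {
	allowed := make(map[string]bool)
	for _, h := range strings.Split(flagValue, ",") {
		if h = strings.ToLower(strings.TrimSpace(h)); h != "" {
			allowed[h] = true
		}
	}
	return allowed
}

// egressStatus returns the status a filtering proxy would answer with:
// 200 for allowlisted hosts, 403 Forbidden for everything else.
func egressStatus(allowed map[string]bool, target string) int {
	host := target
	if h, _, err := net.SplitHostPort(target); err == nil {
		host = h // strip the port when one is present
	}
	if allowed[strings.ToLower(host)] {
		return http.StatusOK
	}
	return http.StatusForbidden
}

func main() {
	allowed := parseAllowlist("api.anthropic.com, api.bnerd.cloud")
	for _, target := range []string{"api.anthropic.com:443", "evil.example.com:443"} {
		fmt.Println(target, egressStatus(allowed, target))
	}
}
```

Deny-by-default is the point of the sketch: a host that is missing from the set is refused, so a typo in the flag fails closed.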

Landlock LSM as in-process defense in depth¶

What: On Linux 5.13+, call landlock_create_ruleset from inside the bnerd process to drop its own filesystem reach to a workdir + a few specific paths. Belt and braces on top of bwrap/podman.

Why deferred: Diminishing returns. bwrap already enforces the boundary at the namespace level; Landlock would catch a hypothetical bwrap escape, but those are extremely rare and not in our threat model. Worth adding when we touch this code anyway.

When to build: Opportunistically — next time we're in pkg/isolate/ for another reason.

macOS native `sandbox-exec` path¶

What: A native macOS isolation runtime using sandbox-exec and SBPL profiles, sitting alongside bwrap/podman/docker in the runtime priority list. Avoids the need for podman-machine on macOS.

Why deferred: sandbox-exec is deprecated by Apple (it still works, but its SBPL profile language is undocumented), and podman-machine works fine for the macOS users we have today. Not worth the maintenance burden for a deprecated API.

When to build: If/when podman-machine on macOS proves unreliable for users, or if Apple ships a public replacement for sandbox-exec.

Published container images¶

What: Build bnerd-isolate:vX.Y.Z images in CI and push to registry.gitlab.com/cloud/app/cli. First-time --isolate invocation pulls instead of building locally.

Why deferred: The local build works and takes ~2 minutes, once. Deferring saves a CI integration job and image hosting until there's user demand.

When to build: When the local-build-on-first-use latency becomes a real friction point, or when we want to ship signed/SBOM'd images.

Granular `--isolate-mount` / `--isolate-no-mount` flags¶

What: Per-mount opt-in/opt-out flags so users can carve specific tradeoffs (e.g. "isolate me but mount ~/.aws for this session", or "strict mode but keep SSH agent forwarding").

Why deferred: Two-tier on/strict covers ~95% of cases without exploding the flag surface. Adding granular flags now would lock us into an interface before we know what users actually need.

When to build: After observing real user requests for combinations that the two-tier model can't express.

Per-tool sandbox capability declarations¶

What: Each AI tool declares which paths/credentials it needs. The sandbox is constructed dynamically based on which tools are enabled by --ai-mode rather than a fixed mount table.

Why deferred: Premature abstraction. The fixed mount table is short and audit-friendly; dynamic construction would obscure what's exposed.

When to build: Only if the tool surface grows so much that the fixed mount table accumulates many "X tool needs Y" exceptions.

Distribution & operations¶

The CLI is currently distributed as binaries from the GitLab Package Registry plus go install. These items are the next steps for putting bnerd in front of users who don't already have a Go toolchain or a GitLab login — captured here so the gap is visible, not because any of them is overdue.

OS keychain integration for tokens¶

What: Load the API token, Anthropic API key, Slack tokens, and per-email-account IMAP/SMTP passwords from the platform keychain — Secret Service / pass on Linux, Keychain Services on macOS, Credential Manager on Windows. Plaintext ~/.bnerd.yaml (mode 0600) remains the fallback for headless CI environments.

Why deferred: YAML+0600 is correct for a single-user dev workstation, and that's where every bnerd user lives today. Keychain integration is three platform-specific code paths plus a first-run migration UX ("move your token into the keychain?"), and none of that buys anything for the current user base.

When to build: First request from a multi-user environment, the first time a security-conscious org evaluating bnerd pushes back on plaintext token storage, or once enough users have multiple email accounts that the cumulative app-password footprint becomes uncomfortable.

Distribution channels: Homebrew, deb/rpm, container image¶

What: Beyond the GitLab Package Registry binaries, ship a Homebrew tap (brew install bnerd/tap/bnerd), .deb / .rpm packages built in CI, and a bnerd:vX.Y.Z container image. (The separate bnerd-isolate runtime image is covered above under Sandbox.)

Why deferred: Today's users install via make install or grab a binary from the registry. Each new channel is a recurring tax — formula bumps, package-repo signing keys, manifest lists — that isn't worth paying before there's a user base outside the Bnerd team.

When to build: First external contributor or user who asks "how do I install this without Go?", or alongside any public-launch announcement.

Release signing¶

What: Each tagged release ships a signature alongside the existing SHA256SUMS, either cosign keyless OIDC (GitLab CI provides the OIDC token; verifiers check against the GitLab project's identity) or a GPG signature with a managed key. getting-started/installation.md already documents checksum verification; signing extends that to provenance.

Why deferred: Checksums and SBOM are already published, so a download can be verified against the published manifest. Provenance — proving the manifest itself comes from the bnerd publisher — needs a key-custody decision (cosign keyless vs. managed GPG) and a verification UX users have to learn. That's worth landing alongside the public distribution channels above, when the audience for the verification story is wider than the team.

When to build: Alongside the Homebrew / deb / rpm rollout. Pick the signing tool at that point — cosign keyless is the lower-friction default if GitLab CI's OIDC token issuer is acceptable to downstream verifiers.

`bnerd version --check-update` and config-schema migration¶

What: Two related capabilities:

bnerd version --check-update queries the GitLab releases API once and prints whether a newer release is available. Read-only, no auto-install.
~/.bnerd.yaml carries a schema version field; on startup, an out-of-date schema triggers an interactive bnerd config migrate that rewrites the file in place.

Why deferred: The user list is small enough that version mismatches surface in chat, not in tooling. The config schema has only grown backwards-compatibly so far (vpn:, --isolate settings), so a migration tool would have nothing to migrate.

When to build: First time a config-schema change isn't backwards-compatible — that's the forcing function for both. Ship the version check at the same time so users have a self-serve path to "you need to upgrade".
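A stepwise migration could look roughly like the sketch below. Everything specific in it is an assumption made for illustration: the `schema_version` field name, the current version number, and the example rename are hypothetical, and an in-memory map stands in for the parsed ~/.bnerd.yaml.

```go
package main

import "fmt"

// currentSchema is a hypothetical version number for this sketch.
const currentSchema = 2

// migrations[v] upgrades a config from schema v to v+1.
var migrations = map[int]func(cfg map[string]any){
	1: func(cfg map[string]any) {
		// Hypothetical breaking change: rename "api_url" to "api-url".
		if v, ok := cfg["api_url"]; ok {
			cfg["api-url"] = v
			delete(cfg, "api_url")
		}
	},
}

// schemaVersion reads the version field; files that predate it count as 1.
func schemaVersion(cfg map[string]any) int {
	if v, ok := cfg["schema_version"].(int); ok {
		return v
	}
	return 1
}

// migrate applies every pending step in order and records the new version.
func migrate(cfg map[string]any) error {
	v := schemaVersion(cfg)
	if v > currentSchema {
		return fmt.Errorf("config schema %d is newer than supported %d; upgrade bnerd", v, currentSchema)
	}
	for ; v < currentSchema; v++ {
		step, ok := migrations[v]
		if !ok {
			return fmt.Errorf("no migration from schema %d", v)
		}
		step(cfg)
		cfg["schema_version"] = v + 1
	}
	return nil
}

func main() {
	cfg := map[string]any{"api_url": "https://api.example.invalid"}
	if err := migrate(cfg); err != nil {
		fmt.Println("migrate:", err)
		return
	}
	fmt.Println(cfg)
}
```

The "newer than supported" branch is where the version check and the migration meet: a config written by a newer bnerd is the case where the user needs to be told to upgrade.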

Differentiated exit codes¶

What: Replace the universal os.Exit(1) on error with a documented set: 2 for auth, 3 for network/API, 4 for validation/usage, 5 for sandbox/isolation. Documented in reference/global-flags.md so scripts can branch on them.

Why deferred: No user has yet built a non-trivial script around bnerd's error behavior, so picking exit-code conventions in advance just risks ossifying them in the wrong shape.

When to build: When the first user asks "how do I tell from a script whether this failed because the token expired or because the network is down" — that conversation is the spec for the codes.
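For illustration, the proposed codes could be mapped from wrapped errors as in the sketch below. The numeric values come from the entry above; the sentinel errors and the `exitCodeFor` helper are hypothetical names and do not describe how bnerd classifies errors today.

```go
package main

import (
	"errors"
	"fmt"
)

// Exit codes as proposed in this entry; 1 stays the catch-all.
const (
	ExitOK         = 0
	ExitGeneric    = 1
	ExitAuth       = 2
	ExitNetwork    = 3
	ExitValidation = 4
	ExitSandbox    = 5
)

// Hypothetical sentinel errors that command code would wrap with %w.
var (
	ErrAuth       = errors.New("authentication failed")
	ErrNetwork    = errors.New("network or API failure")
	ErrValidation = errors.New("invalid usage")
	ErrSandbox    = errors.New("sandbox failure")
)

// exitCodeFor maps an error chain to a process exit code.
func exitCodeFor(err error) int {
	switch {
	case err == nil:
		return ExitOK
	case errors.Is(err, ErrAuth):
		return ExitAuth
	case errors.Is(err, ErrNetwork):
		return ExitNetwork
	case errors.Is(err, ErrValidation):
		return ExitValidation
	case errors.Is(err, ErrSandbox):
		return ExitSandbox
	default:
		return ExitGeneric
	}
}

func main() {
	err := fmt.Errorf("list zones: %w", ErrAuth)
	// A real command would finish with os.Exit(exitCodeFor(err)).
	fmt.Printf("error: %v (exit code %d)\n", err, exitCodeFor(err))
}
```

Keeping the mapping in one function means the documented table and the behavior cannot drift apart across commands.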

Structured logging via `--log-format=json`¶

What: Optional JSON-line output for bnerd's stdout/stderr (or at minimum the audit log), making bnerd ingestible by Loki / logstash / similar pipelines.

Why deferred: The current debug log (/tmp/bnerd-debug.log via BNERD_DEBUG=1) is sufficient for human-driven sessions, and the MCP audit log is already written as JSON lines. Adding structured logging across all output paths means routing every print site through a logger abstraction, which is a large diff for a feature that is only useful in service-style deployments. bnerd is a CLI.

When to build: If/when bnerd is run as a long-lived daemon (e.g. an MCP server in a Kubernetes pod) and the operator wants its logs ingested by their observability stack.
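If this is built, one low-cost route is the standard library's log/slog package (Go 1.21+), which already ships text and JSON handlers. The sketch below is an assumption about how a `--log-format` value might select a handler; `newLogger` and `render` are hypothetical helpers, not existing bnerd code.

```go
package main

import (
	"bytes"
	"io"
	"log/slog"
	"os"
)

// newLogger picks a handler from the flag value: "json" yields JSON lines,
// anything else yields slog's key=value text format.
func newLogger(format string, w io.Writer) *slog.Logger {
	if format == "json" {
		return slog.New(slog.NewJSONHandler(w, nil))
	}
	return slog.New(slog.NewTextHandler(w, nil))
}

// render logs a single record into a buffer and returns the raw line.
func render(format, msg string, args ...any) string {
	var buf bytes.Buffer
	newLogger(format, &buf).Info(msg, args...)
	return buf.String()
}

func main() {
	logger := newLogger("json", os.Stderr)
	logger.Info("zone listed", "zone", "example.com", "records", 12)
	os.Stdout.WriteString(render("text", "zone listed", "zone", "example.com"))
}
```

The handler choice is the easy part; the large diff described above is replacing every direct print call with a logger call.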

CLI ergonomics & reliability¶

These are quality-of-life and reliability features that the current user base hasn't asked for yet. Captured here so the design questions are written down rather than re-derived each time someone hits the gap.

Named profiles / context switching¶

What: Multiple named configurations in ~/.bnerd.yaml (or ~/.config/bnerd/contexts/) plus bnerd --profile <name> and bnerd config use-context <name> to switch between them. The kubeconfig-context pattern, applied to bnerd's own auth + org + project triple.

Why deferred: Today's users live on one org. Switching means editing ~/.bnerd.yaml or exporting BNERD_* env vars — clunky but rare. Adding a context system before it's needed would make the config schema breaking-change-prone.

When to build: First user who works across two organizations on the same machine. The schema choice (single file with multiple contexts vs. context-per-file like kubeconfig) is the main design question and should be informed by how that user already organizes their auth.

`bnerd doctor` — diagnostic bundle for support¶

What: A top-level bnerd doctor command that gathers version, OS, runtime info, config (with token + AI keys redacted), connectivity check to api-url, MCP audit log tail, and optionally isolate doctor output — all in a single shareable report. Mirrors the isolate doctor pattern but for the whole CLI.

Why deferred: Today's user-to-author distance is short enough that "what does bnerd config show say?" works. Diagnostic bundles matter when support has to debug at arm's length — typically alongside a wider distribution surface.

When to build: When the first external user files a bug that takes more than two round-trips to triage. That's also when redaction matters: doctor must be safe to paste in a public issue tracker.
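Redaction is the part of doctor that has to be right on the first release. A minimal sketch, assuming the config is available as a flat string map and that sensitive keys can be recognized by name fragments: the fragment list and the key names in the example are hypothetical, and a real implementation would more likely redact by explicit field than by substring.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// sensitiveFragments is a hypothetical list of key-name fragments.
var sensitiveFragments = []string{"token", "key", "password", "secret"}

// isSensitive reports whether a config key looks like a credential.
func isSensitive(key string) bool {
	k := strings.ToLower(key)
	for _, frag := range sensitiveFragments {
		if strings.Contains(k, frag) {
			return true
		}
	}
	return false
}

// redact returns a copy of cfg with credential values masked.
func redact(cfg map[string]string) map[string]string {
	out := make(map[string]string, len(cfg))
	for k, v := range cfg {
		if isSensitive(k) && v != "" {
			v = "[REDACTED]"
		}
		out[k] = v
	}
	return out
}

func main() {
	report := redact(map[string]string{
		"api-url":           "https://api.example.invalid",
		"token":             "s3cr3t",
		"anthropic-api-key": "sk-example",
	})
	keys := make([]string, 0, len(report))
	for k := range report {
		keys = append(keys, k)
	}
	sort.Strings(keys) // stable order so two reports diff cleanly
	for _, k := range keys {
		fmt.Printf("%s: %s\n", k, report[k])
	}
}
```

Working on a copy matters here: the report must never mutate the live config that the rest of the process keeps using.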

HTTP retries with exponential backoff¶

What: The pkg/client HTTP client retries on transient failures (HTTP 429, 502, 503, 504, network errors) with exponential backoff and jitter, capped at ~3 attempts. Read-only methods retry by default; write methods retry only when the body is a deterministic JSON payload (i.e. POST/PATCH idempotency is the caller's concern).

Why deferred: The bnerd CloudAPI is internal and stable, so transient failures haven't been a recurring complaint. Adding retries before observing real failure shapes risks retrying the wrong things — e.g. retrying a POST that already succeeded server-side and double-charging billing. The session-level retry in pkg/ai/chat/session.go covers the AI-streaming case, which is where transient failure has actually bitten us.

When to build: First time a flaky network or a CloudAPI deployment causes a user-visible spurious failure. The retry policy should be informed by the actual error shape, not invented in advance.
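The policy described in this entry can be sketched in a few lines. The Go below is illustrative only: it assumes a 200 ms base delay and up to 50% added jitter, neither of which is specified above, and `withRetry`, `backoff`, and `noSleep` are hypothetical names rather than pkg/client functions. It also leaves out the read-versus-write distinction, which is the part the entry says should wait for real failure data.

```go
package main

import (
	"fmt"
	"math/rand"
	"net/http"
	"time"
)

const maxAttempts = 3

// retryableStatus reports whether a status is in the transient set.
func retryableStatus(code int) bool {
	switch code {
	case http.StatusTooManyRequests, http.StatusBadGateway,
		http.StatusServiceUnavailable, http.StatusGatewayTimeout:
		return true
	}
	return false
}

// backoff returns base * 2^attempt plus up to 50% random jitter.
func backoff(attempt int, base time.Duration) time.Duration {
	d := base << uint(attempt)
	return d + time.Duration(rand.Int63n(int64(d)/2+1))
}

// noSleep is a stand-in for time.Sleep so the example runs instantly.
func noSleep(time.Duration) {}

// withRetry calls do until it yields a non-retryable result or the attempt
// budget is spent. Any non-nil error is treated as a network error.
func withRetry(do func() (int, error), sleep func(time.Duration)) (int, error) {
	var status int
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		status, err = do()
		if err == nil && !retryableStatus(status) {
			return status, nil
		}
		if attempt < maxAttempts-1 {
			sleep(backoff(attempt, 200*time.Millisecond))
		}
	}
	if err == nil {
		err = fmt.Errorf("giving up after %d attempts, last status %d", maxAttempts, status)
	}
	return status, err
}

func main() {
	responses := []int{503, 503, 200} // simulated server behavior
	calls := 0
	status, err := withRetry(func() (int, error) {
		s := responses[calls]
		calls++
		return s, nil
	}, noSleep)
	fmt.Println(status, err, "after", calls, "calls")
}
```

Passing the sleep function in keeps the loop testable without real delays; production code would pass time.Sleep or a context-aware wait.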

Confirmation prompts for destructive operations¶

What: Destructive commands (bnerd dns records delete, bnerd domains delete, bnerd k8s ... delete, bnerd vpn down while traffic is in flight, etc.) prompt for confirmation by default. A --yes / -y flag, plus BNERD_NONINTERACTIVE=1, bypass the prompt for scripts.

Why deferred: The cost of an accidental delete in the current user base is low: bnerd users tend to be the same people who set up the resources, and the CloudAPI has its own server-side guardrails. Adding prompts before there's a near-miss creates friction without paying for itself, and settling the prompt wording and the "is stdin a TTY?" detection before a real incident risks a half-baked policy.

When to build: First time someone deletes the wrong DNS record from muscle memory. That conversation is also the spec for which operations need the prompt and which don't.
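A default-deny prompt with the two bypasses named above might be shaped like this sketch. The helper names are hypothetical, the prompt wording is a placeholder, and TTY detection is deliberately left out because that policy question is exactly what this entry defers.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"strings"
)

// skipPrompt reports whether confirmation is bypassed: --yes / -y was
// passed, or BNERD_NONINTERACTIVE is set to "1".
func skipPrompt(assumeYes bool, nonInteractive string) bool {
	return assumeYes || nonInteractive == "1"
}

// isYes accepts only an explicit "y" or "yes". Anything else, including an
// empty line, counts as "no", so the default is the safe choice.
func isYes(line string) bool {
	answer := strings.ToLower(strings.TrimSpace(line))
	return answer == "y" || answer == "yes"
}

// confirm asks on out and reads one line from in unless a bypass applies.
func confirm(in io.Reader, out io.Writer, action string, assumeYes bool) bool {
	if skipPrompt(assumeYes, os.Getenv("BNERD_NONINTERACTIVE")) {
		return true
	}
	fmt.Fprintf(out, "%s? [y/N]: ", action)
	line, _ := bufio.NewReader(in).ReadString('\n') // EOF reads as "no"
	return isYes(line)
}

func main() {
	// Simulated stdin so the example runs unattended.
	ok := confirm(strings.NewReader("y\n"), os.Stdout, "Delete record www.example.com", false)
	fmt.Println("\nconfirmed:", ok)
}
```

Taking the reader and writer as parameters keeps the prompt testable and leaves room to add the TTY check in one place later.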

Shell completion¶

What: bnerd completion bash|zsh|fish|powershell emits a completion script. Cobra generates this for free; the only work is registering the command and documenting the install one-liner.

Why deferred: Discoverability via --help covers the case today, and the user base is small enough that no one has asked. Adding a shipped completion script also means committing to keep flag names stable across versions — a soft compatibility commitment we haven't formalised yet.

When to build: Alongside the Homebrew tap (above) — completion install is one of the standard Homebrew formula post-install steps, so the two compose naturally.

Audit log query CLI¶

What: bnerd audit list, bnerd audit show <id>, bnerd audit grep <pattern> — read-only commands over ~/.bnerd/audit.log (the JSON-line audit trail produced by the MCP server and AI session). Filters by tool name, time range, denied-only, etc.

Why deferred: The audit log is a structured JSON-line file, so jq -c < ~/.bnerd/audit.log | grep ... works fine for the small number of users who currently inspect it. A dedicated subcommand makes the format an interface, which is a commitment to backwards-compatibility that doesn't pay off until more than a handful of users rely on the data.

When to build: First time a user (or compliance reviewer) asks "what did the AI do yesterday?" and the answer needs to live in something more durable than a jq one-liner. The query schema is the design question worth deferring until then.
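For a sense of scale, the core of such a subcommand is a filter over JSON lines, as in the sketch below. The field names `time`, `tool`, and `denied` and the tool names in the sample are assumptions made for the example; the real audit log schema is not documented on this page, which is part of why the query interface is deferred.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// entry holds the fields this sketch filters on (hypothetical schema).
type entry struct {
	Time   string `json:"time"`
	Tool   string `json:"tool"`
	Denied bool   `json:"denied"`
}

// sampleLog stands in for ~/.bnerd/audit.log.
func sampleLog() io.Reader {
	return strings.NewReader(`{"time":"2025-01-01T10:00:00Z","tool":"dns_records_list","denied":false}
{"time":"2025-01-01T10:01:00Z","tool":"dns_records_delete","denied":true}
`)
}

// filter keeps entries matching the tool name (empty means any tool) and,
// when deniedOnly is set, only those that were denied.
func filter(r io.Reader, tool string, deniedOnly bool) ([]entry, error) {
	var out []entry
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue
		}
		var e entry
		if err := json.Unmarshal([]byte(line), &e); err != nil {
			return nil, fmt.Errorf("bad audit line: %w", err)
		}
		if (tool == "" || e.Tool == tool) && (!deniedOnly || e.Denied) {
			out = append(out, e)
		}
	}
	return out, sc.Err()
}

func main() {
	denied, err := filter(sampleLog(), "", true)
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, e := range denied {
		fmt.Println(e.Time, e.Tool)
	}
}
```

Note that the struct is where the compatibility commitment would begin: once a command decodes into named fields, renaming them in the log becomes a breaking change.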

AI Chat¶

These are flagged as future ideas in private session memory. Listed here so they're visible to anyone reading the codebase.

Plan mode¶

What: AI explores (reads files, checks state), proposes a structured plan, waits for user approval, then executes. Composes naturally with --isolate: planning happens inside the sandbox without any execution risk.

Design notes: Conversation-level state machine — normal → planning → awaiting_approval → executing. The TUI needs a new view for the plan-approval gate.
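The state machine in the design notes is small enough to write down as a transition table. In the sketch below the main path (normal, planning, awaiting_approval, executing) comes from the notes; the edges for revising or rejecting a plan and for returning to normal after execution are assumptions, and the type and function names are hypothetical.

```go
package main

import "fmt"

// State is one phase of a plan-mode conversation.
type State string

const (
	Normal           State = "normal"
	Planning         State = "planning"
	AwaitingApproval State = "awaiting_approval"
	Executing        State = "executing"
)

// transitions lists the allowed next states for each state.
var transitions = map[State][]State{
	Normal:   {Planning},
	Planning: {AwaitingApproval},
	// Approve, revise, or reject. The last two edges are assumptions.
	AwaitingApproval: {Executing, Planning, Normal},
	Executing:        {Normal}, // assumed: return to normal when done
}

// next validates a transition. On failure the state is left unchanged.
func next(from, to State) (State, error) {
	for _, allowed := range transitions[from] {
		if allowed == to {
			return to, nil
		}
	}
	return from, fmt.Errorf("invalid transition %s -> %s", from, to)
}

func main() {
	s := Normal
	for _, to := range []State{Planning, AwaitingApproval, Executing, Normal} {
		var err error
		if s, err = next(s, to); err != nil {
			fmt.Println(err)
			return
		}
		fmt.Println("now:", s)
	}
}
```

The property worth keeping in any real implementation is that executing is reachable only through awaiting_approval, which is what makes the approval gate meaningful.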

Local tooling system¶

What: AI can download tools (terraform, flux, kubectl, k9s, helm) to a project-local .local/bin/, then execute them as structured tool calls. Version pinning via .local/tools.yaml.

Composes with --isolate: Trivially — $PWD/.local/bin/ is mounted RW, so downloaded tools persist across sandbox runs without touching the host. This is also the documented escape hatch for "tool not in the container image" today.

Cross-provider model switching¶

What: Provider profiles (named backends) assigned to roles (plan/exec/escalation). Lazy backend creation, phase-boundary switching, hard+soft escalation. Supports: Anthropic-only, self-hosted-only, hybrid, layered self-hosted.

Composes with --isolate=strict: This is exactly the use case that motivates strict mode — running an untrusted local model with kubeconfig dropped.

Confirmation-dialog scroll discoverability¶

What: Across all blocking AI interactions (plan approval, tool confirm, phase checkpoint, plan question, proposal blocks), pgup / pgdown / ctrl+u / ctrl+d already scroll the chat scrollback — but this isn't signposted in the UI, and the routing is disabled while the refine-input textarea is focused.

Why deferred: The Slack/Ticket proposal blocks already gained inline-scrollable bodies (PgUp/PgDn within the proposal, boundary-fallthrough to chat scrollback). That covers the most-painful case. The remaining gap is a visible hint line and routing during refine-input focus.

When to build: Next time a user reports "I can't see the X I'm being asked to approve" — at that point, add a one-line hint to the relevant dialog block and audit the input-focused key branches in pkg/tui/chat_keys.go to allow scroll keys through to the chat viewport.

Migrate OpenAI backend to the official Go SDK¶

What: Replace the hand-rolled HTTP + SSE implementation in pkg/ai/chat/openai.go with github.com/openai/openai-go (v3.x, requires Go 1.22+). Keep base URL configurable so LiteLLM / vLLM / Ollama still work as drop-in compat servers.

Why deferred: The hand-rolled code with targeted fixes (assistant content: null on tool-only turns, name on role: tool messages, mid-stream {"error": ...} frame parsing routed through the existing retry loop, sanitized tool args, defensive empty-ID filter, lenient data: prefix) handles the LiteLLM cross-provider routing surface today. SDK gains (built-in retry on 408/409/429/5xx, typed errors, OpenAI-maintained protocol evolution) are real but not yet load-bearing — and the SDK is OpenAI-spec-strict, so any compat-server quirks it doesn't tolerate would still need shims.

When to build: When the next class of compat-server wire-format bug surfaces — that's the signal that hand-rolling has stopped being cheap. Likely triggers: streaming reasoning tokens, parallel tool-call quirks, new fields in completion responses, or a structural compat issue requiring a fifth shim on top of the current four.

OpenProject integration¶

The OpenProject integration (guide) ships read + non-destructive write tools. These items were explicitly deferred during the design phase and aren't on the schedule.

Destructive operations (delete work package, delete time entry, archive project)¶

What: AI tools to remove or archive OpenProject resources. Currently the AI cannot delete or archive anything; the OpenProject web UI handles those cases.

Why deferred: Rare for leadership work; the riskiest class of operation if the AI misfires; the web UI's confirmations are already adequate.

When to build: When a concrete recurring need surfaces (e.g. weekly batch cleanup). At that point also revisit whether to add a fourth archive safety class above Destructive for soft-delete/archive operations that are reversible.

Service-account / hybrid identity mode¶

What: Allow the integration to authenticate as a dedicated ai-assistant@… user instead of the human's personal API key, or use a hybrid (service account for comments/work-package edits, personal key only for time entries).

Why deferred: Single user; the personal-token attribution is accurate enough — Bernd is responsible for whatever the AI did with his authority.

When to build: Team members onboard, or compliance requires distinguishing AI-authored content from human-authored content in the OpenProject journal.

Project scoping aligned with `--project-id`¶

What: Apply the bnerd --project-id / BNERD_PROJECT_ID setting to OpenProject writes so the AI cannot accidentally touch a different project than the active CloudAPI project.

Why deferred: No mapping exists between CloudAPI projects (UUIDs/identifiers) and OpenProject projects (different identifier space). Without that mapping, scoping would be either always-no-op or always-wrong.

When to build: Once a CloudAPI ↔ OpenProject project mapping is defined (likely as a config field per CloudAPI project, or as a tag on the OpenProject project).

Full CLI mirror (`bnerd op work-packages create / update`, etc.)¶

What: Mirror every AI write tool with a bnerd op … CLI subcommand so OpenProject can be scripted from the shell.

Why deferred: AI is the primary surface; the existing smoke-test CLI (bnerd op whoami, bnerd op projects list, etc.) is enough to verify connectivity without committing to a parallel write surface.

When to build: When shell scripting against OpenProject becomes a recurring need that the AI flow doesn't serve.

Microsoft Teams triage backend¶

What: A pkg/teams/ package implementing inbox.MessageSource (the same contract Slack and Email use today). The inbox surface (inbox_list, inbox_get_thread, etc.) and the merged peek already work source-agnostically — only the Teams adapter remains: authenticate via the Bot Framework + Microsoft Graph, project channel posts + DMs + replies into inbox.Entry, send via Graph's chat/messages endpoint.

Why deferred: Bot Framework auth is its own infrastructure project (Azure Bot registration, channel registration, webhook endpoint or proactive messaging), large enough to deserve its own focused effort once email has proven that the abstraction holds under a second backend.

When to build: When a customer asks for Teams parity. The abstraction is ready; only the adapter is missing.

Mattermost triage backend¶

What: A pkg/mattermost/ package implementing inbox.MessageSource. Closest protocol cousin to Slack — channels, threads, DMs, websocket-pushed events — so the adapter shape would mirror pkg/slack/source/ almost line-for-line. Single PAT per workspace (no separate bot/app/user-token split), simpler auth than Slack.

Why deferred: Slack covers most customer messaging today; Mattermost is "nice to have" without a concrete request driving it. The abstraction is validated by Slack + Email, so when a Mattermost-first team asks for the integration the adapter can be built without re-litigating the design.

When to build: First customer ask.

IMAP IDLE push (replace email timer poller)¶

What: Swap the 60-second IMAP SEARCH UNSEEN poll loop in pkg/email/poller.go for a long-lived IMAP IDLE connection per account, falling back to polling when the server rejects IDLE or the connection drops persistently.

Why deferred: 60s latency is acceptable for triage: most replies aren't time-critical, and a long-lived IDLE connection per account adds material complexity (timeout handling, reconnect-with-backoff, capability probing). The existing poller pattern is identical to the Slack user-scope poller, which has been running fine for months.

When to build: When users complain that 60s feels laggy, or when an email account hits an IMAP server with aggressive idle-session limits.

Gmail / Microsoft Graph email adapters¶

What: Replace IMAP+SMTP with provider-native APIs for Gmail (Google API) and Outlook/Office365 (Microsoft Graph). Adds richer features: server-side label/category filters, conversation-thread IDs as first-class handles (no header heuristics), focused-inbox awareness, send-as-alias support.

Why deferred: IMAP+SMTP is the universal floor: it works for Gmail-with-app-password, Outlook-with-app-password, Fastmail, Proton Bridge, self-hosted Dovecot, and everything else. Provider-native adapters require OAuth infrastructure that doesn't exist in bnerd today (browser callback flow, refresh tokens, per-provider scopes). The result would be two backends pretending to be one.

When to build: When a paying user needs Gmail labels or Outlook focused-inbox specifically, and the OAuth infrastructure is worth building.

Email attachments¶

What: Implement ReplyOptions.Attachments (already declared in pkg/inbox/types.go as a roadmap placeholder) for the email backend: multipart/mixed message construction in pkg/email/smtp.go, attachment download via IMAP BODYSTRUCTURE + BODY[n] fetch on read.

Why deferred: Most AI-drafted replies are body-only. Adding attachments means new tool schemas (inbox_attach_file?), file-upload UI in the proposal block, and MIME composition that handles content-type detection. Not blocking the core triage flow.

When to build: First user request — likely from someone who wants the AI to reply with a generated report or screenshot.

Slack inbox auto-triage on peek-open¶

What: When the Ctrl-S peek overlay opens with unread messages, kick off a background LLM call that classifies each unread entry (urgent / support / question / FYI / noise) and pre-drafts a suggested action. Cache results in triage_cache.json keyed by the event-ID set; invalidate on new arrivals. Render the classification + suggestion as a chip on each row.

Why deferred: The Phase 4 system-prompt awareness already gives the AI enough context to volunteer triage when you switch to :pa. Auto-triage on peek-open adds an LLM cost on every overlay glance, plus the plumbing (side-channel chat session, async row updates, cache invalidation) is significant. The simpler "open :pa, ask 'what's pending?'" flow shipped today covers the same need without the infra.

When to build: If users routinely complain that the manual :pa triage flow has too much friction, or once a cheap classification model is wired in for short prompts.
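Keying the cache by the event-ID set means the key must not depend on arrival order or duplicates. One way to get that, shown as a sketch with a hypothetical `cacheKey` helper and an assumed SHA-256 digest (the entry does not name a hash), is to sort, deduplicate, and hash the IDs:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// cacheKey derives a stable key from the set of unread event IDs.
// Order and duplicates do not affect the result.
func cacheKey(eventIDs []string) string {
	ids := append([]string(nil), eventIDs...) // copy; leave the caller's slice alone
	sort.Strings(ids)
	unique := make([]string, 0, len(ids))
	for _, id := range ids {
		if len(unique) == 0 || id != unique[len(unique)-1] {
			unique = append(unique, id)
		}
	}
	sum := sha256.Sum256([]byte(strings.Join(unique, "\n")))
	return hex.EncodeToString(sum[:])
}

func main() {
	a := cacheKey([]string{"evt-2", "evt-1"})
	b := cacheKey([]string{"evt-1", "evt-2", "evt-2"})
	c := cacheKey([]string{"evt-1", "evt-2", "evt-3"})
	fmt.Println("same set, same key:", a == b)
	fmt.Println("new arrival changes key:", a != c)
}
```

With this shape, "invalidate on new arrivals" needs no extra bookkeeping: a new event changes the set, the key no longer matches, and the cached classification is simply not found.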

Standalone `:slack` resource view¶

What: Promote the Ctrl-S peek overlay into a full TUI resource view (table, filter /, detail d, edit/reply modal e) accessible via :slack, on equal footing with :zones, :k8s, etc.

Why deferred: The overlay (with the Phase 3 Subscriptions tab) now covers the at-a-glance + per-conversation management needs, and the AI covers the deep-search need. A full resource view is more code to maintain and most of its richer keybindings would duplicate what the overlay or the AI already does better.

When to build: When users start asking for column sorting beyond what the overlay's grouping provides, or for other power-user list operations the overlay can't do.

Slack-app read-state mirroring (user-token only)¶

What: Use conversations.info.last_read (per-user-token) to defer to the Slack desktop/mobile app's read state, so a message you opened on your phone doesn't keep counting as unread in the bnerd inbox.

Why deferred: With the user-scope poller now shipping, the user token is the primary path for personal-inbox setups — the historical "you'd force everyone to set up xoxp" friction no longer applies. The remaining work is just plumbing last_read into the inbox state so app-side reads silently drop entries.

When to build: When the inbox unread count routinely overshoots reality because messages were read in the Slack app — or when a user explicitly asks for it.

Cross-source auto-classification on arrival¶

What: Run a lightweight LLM classification on every incoming inbox event regardless of source (urgent / support / question / fyi / noise) so the indicator can show "1 likely support" instead of a raw count. With the source-agnostic inbox.Hub shipped, classification can run uniformly across Slack + Email + future Mattermost/Teams from a single dispatcher.

Why deferred: Cost scales with traffic volume across every configured source — multiple email accounts plus a chatty Slack workspace adds up. Runs even while the user is offline; adds infra complexity (background agent loop, retries, rate limits, classification cache invalidation). The on-demand :pa model already covers the same value when the user actually engages.

When to build: When on-demand triage feels too lazy in practice, a cheap classifier (Haiku-tier) is wired in, AND a user has enough cross-source volume that the noise floor genuinely warrants pre-filtering.

AI-write deny rules¶

What: Hard-coded server-side-style policies that block specific high-risk writes even when the safety mode and confirmation say "go" — e.g. block status changes on closed tickets, block edits to time entries from prior months, block reassignments to other users.

Why deferred: Over-restricting is hard to undo, and the confirmation prompt + audit log already provide visibility. Without a concrete near-miss to model the rules on, picking them in advance risks being either useless or annoying.

When to build: After the first observed misfire — the incident report becomes the spec for the rule that would have caught it.

Web UI (`bnerd web`)¶

The bnerd web command now serves the AI chat (chat/code/pa, switchable) and a resource dashboard spanning DNS, compute, Kubernetes, tickets, domains, apps, org/access, invoices, and billing — all on the shared pkg/resources layer. These are the capabilities still deliberately deferred.

Browser inbox / personal-assistant peek¶

What: The full Slack/email inbox surface in the browser (thread list, expand, reply compose, subscription management) with live updates, matching the TUI's Ctrl-S peek. The PA chat mode (and its Slack/email tools, once inbox.Hub is wired into the web process) is the precursor; this epic is the dedicated UI on top.

Why deferred: It is a large, stateful, real-time subsystem: background pollers running in the web process, the inbox.Hub lifecycle, and a push channel (SSE/WS) for live thread updates — well beyond the request/response dashboard pattern.

When to build: After the PA chat tools are wired and there's demand to triage the inbox from the browser. Start from the wired inbox.Hub; add a thread-list view and reuse the existing reply-proposal dialog bridge.

In-browser code editor¶

What: Hand-editing files in the browser for :code mode (on top of the read-only file tree + diff review that ships first), i.e. a real editor pane.

Why deferred: A usable code editor needs a JS editor library (CodeMirror/Monaco), which reintroduces a JS/TS build step, the very thing the server-rendered htmx stack was chosen to avoid. The agent already edits files (with diffs shown), so hand-editing is a convenience, not a blocker.

When to build: If direct in-browser editing becomes a real need, and we accept adding a front-end build step for that view (or revisit the SPA question).

Public localhost REST API¶

What: A documented, stable JSON API on localhost for other local tools and scripts to consume (list/inspect resources, drive the agent programmatically), distinct from the internal endpoints that power the web UI.

Why deferred: bnerd web's endpoints exist only to serve its own browser UI and are free to change shape. A public API is a contract — it needs versioning, stability guarantees, and a security review of what non-browser clients may do. The CLI and MCP server already cover scripting and external-tool integration today.

When to build: When a concrete local-automation use case appears that the CLI and MCP server cannot serve, justifying a committed, versioned surface.

How to update this page¶

When you:

Deliver a deferred item — remove its entry here, mention it in the relevant guide/reference page, and link from the changelog if applicable.
Defer a new feature during design or PR review — add an entry here with the same What / Why deferred / When to build shape.
Re-prioritize something — update the "When to build" line; don't silently move items around.

Keep entries terse. The point is that a future contributor can read the entry and either pick it up or know exactly why it's still on the shelf.