
Roadmap

Tracking of deferred features, planned work, and known design ideas. Items here are explicit decisions to not build something now; the rationale and the unblocking criteria are documented per item so future contributors can pick them up with full context.

When you ship or defer something significant, update this page in the same change.

Sandbox / --isolate

The --isolate flag (see AI sandbox guide) ships with a deliberate set of tradeoffs. These are the next logical steps and the reasons they aren't in v1.

--isolate-allow-net=<host[,...]> — network egress allowlist

What: Restrict the AI's network reach to a specific host list (e.g. api.anthropic.com, api.bnerd.net, configured Kubernetes API servers). Currently --isolate is filesystem-bounded but allows unrestricted egress, leaving prompt-injection-driven exfiltration via web_fetch open.

Why deferred: It's a real architectural step up, not a one-line addition. The reference design is ExitBox/Cloud-Exit's "moat" pattern: the container has no DNS and no direct internet access; a sidecar Squid proxy in a second container holds the only egress route and enforces the allowlist, answering 403 Forbidden for blocked hosts. The allowlist hot-reloads via a Unix socket, so the user can grant a domain mid-session without restarting.

When to build: When users start pointing bnerd at self-hosted or third-party LLM endpoints they don't fully trust. Until then the filesystem boundary covers the everyday "don't poop in my home dir" risk.
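
To make the allowlist decision concrete, here is a minimal Go sketch of the host check a proxy in the "moat" design would apply. It is an illustration only: the reference design uses Squid, and the names allowlist, newAllowlist, allows, and statusFor are hypothetical. Exact-match hostnames are an assumption; wildcard or suffix matching is a design question left open here.

```go
package main

import (
	"net"
	"net/http"
	"strings"
)

// allowlist holds the lower-cased hostnames that may be reached.
// Hypothetical type for illustration; not part of bnerd today.
type allowlist map[string]struct{}

// newAllowlist builds an allowlist from a host list such as the
// comma-separated value of --isolate-allow-net, already split.
func newAllowlist(hosts ...string) allowlist {
	al := make(allowlist, len(hosts))
	for _, h := range hosts {
		al[strings.ToLower(strings.TrimSpace(h))] = struct{}{}
	}
	return al
}

// allows reports whether hostport ("host" or "host:port") is on the list.
// Matching is exact and case-insensitive (assumption: no wildcards).
func (al allowlist) allows(hostport string) bool {
	host := hostport
	if h, _, err := net.SplitHostPort(hostport); err == nil {
		host = h
	}
	_, ok := al[strings.ToLower(host)]
	return ok
}

// statusFor returns the HTTP status the proxy would answer with:
// 200 for permitted hosts, 403 Forbidden for everything else.
func (al allowlist) statusFor(hostport string) int {
	if al.allows(hostport) {
		return http.StatusOK
	}
	return http.StatusForbidden
}
```

With newAllowlist("api.anthropic.com", "api.bnerd.net"), a request to api.anthropic.com:443 passes, while any other host gets 403.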

Landlock LSM as in-process defense in depth

What: On Linux 5.13+, use the Landlock syscalls (landlock_create_ruleset, landlock_add_rule, landlock_restrict_self) from inside the bnerd process to restrict its own filesystem access to the workdir plus a few specific paths. Belt and braces on top of bwrap/podman.

Why deferred: Diminishing returns. bwrap already enforces the boundary at the namespace level; Landlock would catch a hypothetical bwrap escape, but those are extremely rare and not in our threat model. Worth adding when we touch this code anyway.

When to build: Opportunistically — next time we're in pkg/isolate/ for another reason.

macOS native sandbox-exec path

What: A native macOS isolation runtime using sandbox-exec and SBPL profiles, sitting alongside bwrap/podman/docker in the runtime priority list. Avoids the need for podman-machine on macOS.

Why deferred: sandbox-exec is deprecated by Apple (it still works, but its SBPL profile language is undocumented), and podman-machine works fine for the macOS users we have today. Not worth the maintenance burden for a deprecated API.

When to build: If/when podman-machine on macOS proves unreliable for users, or if Apple ships a public replacement for sandbox-exec.

Published container images

What: Build bnerd-isolate:vX.Y.Z images in CI and push to registry.gitlab.com/cloud/app/cli. First-time --isolate invocation pulls instead of building locally.

Why deferred: The local build works and takes ~2 minutes, once. Deferring avoids a CI integration job and image hosting until there's user demand.

When to build: When the local-build-on-first-use latency becomes a real friction point, or when we want to ship signed/SBOM'd images.

Granular --isolate-mount / --isolate-no-mount flags

What: Per-mount opt-in/opt-out flags so users can carve specific tradeoffs (e.g. "isolate me but mount ~/.aws for this session", or "strict mode but keep SSH agent forwarding").

Why deferred: Two-tier on/strict covers ~95% of cases without exploding the flag surface. Adding granular flags now would lock us into an interface before we know what users actually need.

When to build: After observing real user requests for combinations that the two-tier model can't express.

Per-tool sandbox capability declarations

What: Each AI tool declares which paths/credentials it needs. The sandbox is constructed dynamically based on which tools are enabled by --ai-mode rather than a fixed mount table.

Why deferred: Premature abstraction. The fixed mount table is short and audit-friendly; dynamic construction would obscure what's exposed.

When to build: Only if the tool surface grows so much that the fixed mount table accumulates many "X tool needs Y" exceptions.

Distribution & operations

The CLI is currently distributed as binaries from the GitLab Package Registry plus go install. These items are the next steps for putting bnerd in front of users who don't already have a Go toolchain or a GitLab login — captured here so the gap is visible, not because any of them is overdue.

OS keychain integration for tokens

What: Load the API token (and the Anthropic API key) from the platform keychain — Secret Service / pass on Linux, Keychain Services on macOS, Credential Manager on Windows. Plaintext ~/.bnerd.yaml (mode 0600) remains the fallback for headless CI environments.

Why deferred: YAML+0600 is correct for a single-user dev workstation, and that's where every bnerd user lives today. Keychain integration is three platform-specific code paths plus a first-run migration UX ("move your token into the keychain?"), and none of that buys anything for the current user base.

When to build: First request from a multi-user environment, or the first time a security-conscious org evaluating bnerd pushes back on plaintext token storage.

Distribution channels: Homebrew, deb/rpm, container image

What: Beyond the GitLab Package Registry binaries, ship a Homebrew tap (brew install bnerd/tap/bnerd), .deb / .rpm packages built in CI, and a bnerd:vX.Y.Z container image. (The separate bnerd-isolate runtime image is covered above under Sandbox.)

Why deferred: Today's users install via make install or grab a binary from the registry. Each new channel is a recurring tax — formula bumps, package-repo signing keys, manifest lists — that isn't worth paying before there's a user base outside the Bnerd team.

When to build: First external contributor or user who asks "how do I install this without go?", or alongside any public-launch announcement.

Release signing

What: Each tagged release ships a signature alongside the existing SHA256SUMS, either cosign keyless OIDC (GitLab CI provides the OIDC token; verifiers check against the GitLab project's identity) or a GPG signature with a managed key. getting-started/installation.md already documents checksum verification; signing extends that to provenance.

Why deferred: Checksums and SBOM are already published, so a download can be verified against the published manifest. Provenance — proving the manifest itself comes from the bnerd publisher — needs a key-custody decision (cosign keyless vs. managed GPG) and a verification UX users have to learn. That's worth landing alongside the public distribution channels above, when the audience for the verification story is wider than the team.

When to build: Alongside the Homebrew / deb / rpm rollout. Pick the signing tool at that point — cosign keyless is the lower-friction default if GitLab CI's OIDC token issuer is acceptable to downstream verifiers.

bnerd version --check-update and config-schema migration

What: Two related capabilities:

  • bnerd version --check-update queries the GitLab releases API once and prints whether a newer release is available. Read-only, no auto-install.
  • ~/.bnerd.yaml carries a schema version field; on startup, an out-of-date schema triggers an interactive bnerd config migrate that rewrites the file in place.

Why deferred: The user list is small enough that version mismatches surface in chat, not in tooling. The config schema has only grown backwards-compatibly so far (vpn:, --isolate settings), so a migration tool would have nothing to migrate.

When to build: First time a config-schema change isn't backwards-compatible — that's the forcing function for both. Ship the version check at the same time so users have a self-serve path to "you need to upgrade".
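
The comparison behind a version check could look like the Go sketch below. It assumes release tags follow the vX.Y.Z shape used elsewhere on this page, with no pre-release suffixes; parseVersion and updateAvailable are hypothetical names, and fetching the latest tag from the GitLab releases API is left out.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion splits a tag like "v1.2.3" into its three numeric parts.
// Assumption: tags are plain vX.Y.Z without pre-release suffixes.
func parseVersion(s string) ([3]int, error) {
	var v [3]int
	parts := strings.Split(strings.TrimPrefix(s, "v"), ".")
	if len(parts) != 3 {
		return v, fmt.Errorf("invalid version %q", s)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return v, fmt.Errorf("invalid version %q", s)
		}
		v[i] = n
	}
	return v, nil
}

// updateAvailable reports whether latest is strictly newer than current.
// It only compares; it never installs anything.
func updateAvailable(current, latest string) (bool, error) {
	c, err := parseVersion(current)
	if err != nil {
		return false, err
	}
	l, err := parseVersion(latest)
	if err != nil {
		return false, err
	}
	for i := range c {
		if l[i] != c[i] {
			return l[i] > c[i], nil
		}
	}
	return false, nil
}
```

Comparing numerically per component matters: a plain string comparison would rank v1.10.0 below v1.2.3.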

Differentiated exit codes

What: Replace the universal os.Exit(1) on error with a documented set: 2 for auth, 3 for network/API, 4 for validation/usage, 5 for sandbox/isolation. Documented in reference/global-flags.md so scripts can branch on them.

Why deferred: No user has yet built a non-trivial script around bnerd's error behavior, so picking exit-code conventions in advance just risks ossifying them in the wrong shape.

When to build: When the first user asks "how do I tell from a script whether this failed because the token expired or because the network is down" — that conversation is the spec for the codes.

Structured logging via --log-format=json

What: Optional JSON-line output for bnerd's stdout/stderr (or at minimum the audit log), making bnerd ingestible by Loki / logstash / similar pipelines.

Why deferred: The current debug log (/tmp/bnerd-debug.log via BNERD_DEBUG=1) is sufficient for human-driven sessions, and the MCP audit log already lands as JSON-line. Adding structured logging across all output paths means routing every print site through a logger abstraction — a large diff for a feature only useful in service-style deployments. bnerd is a CLI.

When to build: If/when bnerd is run as a long-lived daemon (e.g. an MCP server in a Kubernetes pod) and the operator wants its logs ingested by their observability stack.

CLI ergonomics & reliability

These are quality-of-life and reliability features that the current user base hasn't asked for yet. Captured here so the design questions are written down rather than re-derived each time someone hits the gap.

Named profiles / context switching

What: Multiple named configurations in ~/.bnerd.yaml (or ~/.config/bnerd/contexts/) plus bnerd --profile <name> and bnerd config use-context <name> to switch between them. The kubeconfig-context pattern, applied to bnerd's own auth + org + project triple.

Why deferred: Today's users live on one org. Switching means editing ~/.bnerd.yaml or exporting BNERD_* env vars — clunky but rare. Adding a context system before it's needed would make the config schema breaking-change-prone.

When to build: First user who works across two organizations on the same machine. The schema choice (single file with multiple contexts vs. context-per-file like kubeconfig) is the main design question and should be informed by how that user already organizes their auth.

bnerd doctor — diagnostic bundle for support

What: A top-level bnerd doctor command that gathers version, OS, runtime info, config (with token + AI keys redacted), connectivity check to api-url, MCP audit log tail, and optionally isolate doctor output — all in a single shareable report. Mirrors the isolate doctor pattern but for the whole CLI.

Why deferred: Today's user-to-author distance is short enough that "what does bnerd config show say?" works. Diagnostic bundles matter when support has to debug at arm's length — typically alongside a wider distribution surface.

When to build: When the first external user files a bug that takes more than two round-trips to triage. That's also when redaction matters: doctor must be safe to paste in a public issue tracker.
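
Since redaction is the safety-critical part of doctor, a minimal Go sketch of that step may help. The key names passed in ("token", "anthropic_api_key") and the function name redactConfig are assumptions for illustration; the real config keys may differ.

```go
package main

import "strings"

// redacted replaces any non-empty secret value in the report.
const redacted = "[REDACTED]"

// redactConfig returns a copy of cfg with the values of secretKeys masked.
// Keys are matched case-insensitively. Empty values are left empty so the
// report still shows that a secret is not set. The input map is not modified.
func redactConfig(cfg map[string]string, secretKeys ...string) map[string]string {
	secret := make(map[string]bool, len(secretKeys))
	for _, k := range secretKeys {
		secret[strings.ToLower(k)] = true
	}
	out := make(map[string]string, len(cfg))
	for k, v := range cfg {
		if secret[strings.ToLower(k)] && v != "" {
			out[k] = redacted
		} else {
			out[k] = v
		}
	}
	return out
}
```

Working on a copy keeps the live config usable after the report is built. An explicit list of secret keys is the simplest option, but it fails open if a new secret field is added and the list is not updated.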

HTTP retries with exponential backoff

What: The pkg/client HTTP client retries on transient failures (HTTP 429, 502, 503, 504, network errors) with exponential backoff and jitter, capped at ~3 attempts. Read-only methods retry by default; write methods retry only when the body is a deterministic JSON payload (i.e. POST/PATCH idempotency is the caller's concern).

Why deferred: The bnerd CloudAPI is internal and stable, so transient failures haven't been a recurring complaint. Adding retries before observing real failure shapes risks retrying the wrong things — e.g. retrying a POST that already succeeded server-side and double-charging billing. The session-level retry in pkg/ai/chat/session.go covers the AI-streaming case, which is where transient failure has actually bitten us.

When to build: First time a flaky network or a CloudAPI deployment causes a user-visible spurious failure. The retry policy should be informed by the actual error shape, not invented in advance.

Confirmation prompts for destructive operations

What: Destructive commands (bnerd dns records delete, bnerd domains delete, bnerd k8s ... delete, bnerd vpn down while traffic is in flight, etc.) prompt for confirmation by default. A --yes / -y flag, plus BNERD_NONINTERACTIVE=1, bypass the prompt for scripts.

Why deferred: The cost of an accidental delete in the current user base is low: bnerd users tend to be the same people who set up the resources, and the CloudAPI has its own server-side guardrails. Adding prompts before there's a near-miss creates friction without paying for itself, and settling the prompt wording and the "is stdin a TTY?" detection before a real incident risks a half-baked policy.

When to build: First time someone deletes the wrong DNS record from muscle memory. That conversation is also the spec for which operations need the prompt and which don't.

Shell completion

What: bnerd completion bash|zsh|fish|powershell emits a completion script. Cobra generates this for free; the only work is registering the command and documenting the install one-liner.

Why deferred: Discoverability via --help covers the case today, and the user base is small enough that no one has asked. Adding a shipped completion script also means committing to keep flag names stable across versions — a soft compatibility commitment we haven't formalised yet.

When to build: Alongside the Homebrew tap (above) — completion install is one of the standard Homebrew formula post-install steps, so the two compose naturally.

Audit log query CLI

What: bnerd audit list, bnerd audit show <id>, bnerd audit grep <pattern> — read-only commands over ~/.bnerd/audit.log (the JSON-line audit trail produced by the MCP server and AI session). Filters by tool name, time range, denied-only, etc.

Why deferred: The audit log is a structured JSON-line file, so jq -c . ~/.bnerd/audit.log | grep ... works fine for the small number of users who currently inspect it. A dedicated subcommand makes the format an interface, which is a commitment to backwards-compatibility that doesn't pay off until more than a handful of users rely on the data.

When to build: First time a user (or compliance reviewer) asks "what did the AI do yesterday?" and the answer needs to live in something more durable than a jq one-liner. The query schema is the design question worth deferring until then.

AI Chat

These are flagged as future ideas in private session memory. Listed here so they're visible to anyone reading the codebase.

Plan mode

What: AI explores (reads files, checks state), proposes a structured plan, waits for user approval, then executes. Composes naturally with --isolate: planning happens inside the sandbox without any execution risk.

Design notes: Conversation-level state machine — normal → planning → awaiting_approval → executing. The TUI needs a new view for the plan-approval gate.
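
The state machine in the design notes could be encoded as a transition table, as in the Go sketch below. The four state names come from the notes; the two return edges to normal (after a rejected plan and after execution finishes) are assumptions, and phase, allowed, and transition are hypothetical names.

```go
package main

import "fmt"

// phase is the conversation-level state for plan mode.
type phase string

const (
	phaseNormal           phase = "normal"
	phasePlanning         phase = "planning"
	phaseAwaitingApproval phase = "awaiting_approval"
	phaseExecuting        phase = "executing"
)

// allowed lists the legal next phases for each phase.
var allowed = map[phase][]phase{
	phaseNormal:   {phasePlanning},
	phasePlanning: {phaseAwaitingApproval},
	// Assumption: a rejected plan returns the conversation to normal.
	phaseAwaitingApproval: {phaseExecuting, phaseNormal},
	// Assumption: finishing execution returns the conversation to normal.
	phaseExecuting: {phaseNormal},
}

// transition returns the new phase, or an error (and the unchanged phase)
// if the move is not in the table.
func transition(from, to phase) (phase, error) {
	for _, p := range allowed[from] {
		if p == to {
			return to, nil
		}
	}
	return from, fmt.Errorf("invalid transition %s -> %s", from, to)
}
```

The property that matters is that executing is reachable only from awaiting_approval, so no code path can skip the approval gate.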

Local tooling system

What: AI can download tools (terraform, flux, kubectl, k9s, helm) to a project-local .local/bin/, then execute them as structured tool calls. Version pinning via .local/tools.yaml.

Composes with --isolate: Trivially — $PWD/.local/bin/ is mounted RW, so downloaded tools persist across sandbox runs without touching the host. This is also the documented escape hatch for "tool not in the container image" today.

Cross-provider model switching

What: Provider profiles (named backends) assigned to roles (plan/exec/escalation). Lazy backend creation, phase-boundary switching, hard+soft escalation. Supports: Anthropic-only, self-hosted-only, hybrid, layered self-hosted.

Composes with --isolate=strict: This is exactly the use case that motivates strict mode — running an untrusted local model with kubeconfig dropped.

How to update this page

When you:

  • Deliver a deferred item — remove its entry here, mention it in the relevant guide/reference page, and link from the changelog if applicable.
  • Defer a new feature during design or PR review — add an entry here with the same What / Why deferred / When to build shape.
  • Re-prioritize something — update the "When to build" line; don't silently move items around.

Keep entries terse. The point is that a future contributor can read the entry and either pick it up or know exactly why it's still on the shelf.