Safety Model¶

Agent teams build on the same safety infrastructure that governs every bnerd AI session. No new permission system is introduced; the team layer adds connectivity between sessions, not new safety semantics.

SafetyMode floor¶

Every team recipe declares a safety floor (defaulting to read-only). When you run a team, the --safety flag may only tighten this floor — it can never loosen it:

# The cloud-app recipe declares safety: non-destructive.
# This flag tightens it further to read-only for this run.
bnerd team run cloud-app "explore the codebase" --safety=read-only

# This would be rejected: full is looser than non-destructive.
bnerd team run cloud-app "…" --safety=full  # error: flag loosens the recipe floor

The effective mode is passed to every teammate's tool registry at spawn. A teammate cannot exceed the team's effective mode regardless of what its role file declares.
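The tighten-only rule can be sketched as a comparison on an ordered enum. This is a minimal illustration, not bnerd's implementation: the `SafetyMode` ordering (read-only, then non-destructive, then full) is inferred from the examples above, and `effective_mode` is a hypothetical helper name.

```python
from enum import IntEnum


class SafetyMode(IntEnum):
    # Lower value = more restrictive. Ordering inferred from the text above.
    READ_ONLY = 0
    NON_DESTRUCTIVE = 1
    FULL = 2


def effective_mode(recipe_floor, flag=None):
    """Return the mode used for a run; reject a flag looser than the recipe."""
    if flag is None:
        return recipe_floor
    if flag > recipe_floor:
        raise ValueError("flag loosens the recipe floor")
    return flag
```

With a recipe that declares `NON_DESTRUCTIVE`, passing `READ_ONLY` is accepted and passing `FULL` raises, which mirrors the two CLI invocations shown above.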

Per-role scoped tool registries¶

Each teammate's tool registry is built in this order (most restrictive wins):

SafetyMode floor — the team's effective mode, established first
Role base toolset — the appropriate category of tools for the role (read, write, etc.) filtered by the safety floor
disallowedTools — tools explicitly blocked by the role file, removed unconditionally
tools allowlist — if present, only the listed tool names survive; everything else is dropped (Claude Code tool names are translated to bnerd names at this step)
Team coordination tools — always registered last, not removable by role policy: team_send_message, team_task_claim, team_task_complete, team_task_list, team_task_get

Unknown Claude Code tool names (Agent, Task, Skill) are dropped with a logged warning in phase 1.
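The build order above can be sketched as a sequence of set operations. The sketch rests on several assumptions: `build_registry`, `floor_allows`, and the `CLAUDE_TO_BNERD` table are hypothetical, `fs_read_file` is an invented name used for illustration, and the real translation table is not given in this document.

```python
import logging

log = logging.getLogger("registry_sketch")

# From the text: coordination tools are always registered last.
COORDINATION_TOOLS = {
    "team_send_message", "team_task_claim", "team_task_complete",
    "team_task_list", "team_task_get",
}

# Hypothetical mapping; the real Claude Code -> bnerd table is an assumption.
CLAUDE_TO_BNERD = {"Read": "fs_read_file", "Write": "fs_write_file",
                   "Edit": "fs_patch_file"}
UNKNOWN_CLAUDE_NAMES = {"Agent", "Task", "Skill"}


def build_registry(base_tools, floor_allows, disallowed=(), allowlist=None):
    # Steps 1-2: role base toolset, filtered by the safety floor.
    tools = {t for t in base_tools if floor_allows(t)}
    # Step 3: explicitly blocked tools are removed unconditionally.
    tools -= set(disallowed)
    # Step 4: if an allowlist exists, only translated, listed names survive.
    if allowlist is not None:
        keep = set()
        for name in allowlist:
            if name in UNKNOWN_CLAUDE_NAMES:
                log.warning("dropping unknown tool name %s", name)
                continue
            keep.add(CLAUDE_TO_BNERD.get(name, name))
        tools &= keep
    # Step 5: coordination tools are added last, so role policy cannot remove them.
    return tools | COORDINATION_TOOLS
```

Because the coordination tools are unioned in after every filter has run, an empty allowlist still leaves a teammate able to message the team and work the task list.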

Confirmations¶

Tools classified as requiring confirmation surface a [CONFIRM] prompt in the headless output (or a modal in the TUI). The --auto-approve flag controls what happens:

Policy	Behaviour
`never` (default)	Every confirmation pauses and waits for operator input (`y/N/explain`)
`safe`	`SafetyRead` tools are auto-approved; `SafetyWrite` and `SafetyDestructive` still pause
`all`	All confirmations are auto-approved; safe only when combined with `--safety=read-only`

In MCP mode, pending confirmations are queued under a ULID and resolved by the calling AI with bnerd_team_approve(team_id, confirm_id, decision).
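The policy table can be read as a single decision function. The following is a sketch only: `needs_operator` is a hypothetical name, and the tool classes are passed as plain strings rather than whatever type bnerd uses internally.

```python
def needs_operator(policy, tool_class):
    """Return True if the confirmation must pause for operator input."""
    if policy == "never":
        return True  # default: every confirmation pauses
    if policy == "safe":
        # Only SafetyRead is auto-approved; write and destructive still pause.
        return tool_class != "SafetyRead"
    if policy == "all":
        return False  # everything is auto-approved
    raise ValueError(f"unknown auto-approve policy: {policy}")
```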

Gate machinery¶

The task list encodes several levels of gates that prevent premature completion:

Dependency gates¶

A task with blocked_by: [3, 5] cannot be claimed until tasks 3 and 5 are completed. Eligibility is re-evaluated automatically when any task transitions to completed — no explicit unblock call is needed.
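A minimal sketch of the eligibility check, assuming tasks are stored in a dict keyed by ID and that an unclaimed task has status `pending` (that status name and the `claimable` helper are assumptions; `completed` and `blocked_by` come from the text):

```python
def claimable(task_id, tasks):
    """A task is claimable once every task it is blocked by has completed."""
    task = tasks[task_id]
    if task["status"] != "pending":  # "pending" is an assumed status name
        return False
    return all(tasks[dep]["status"] == "completed"
               for dep in task.get("blocked_by", []))
```

Because the check reads the current status of each dependency, eligibility follows from state alone and no separate unblock step is needed.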

Multi-gate completion (review pairs)¶

A task can declare named gates that must each be satisfied before it can complete:

{
  "subject": "add POST /networks to hq",
  "gates": [
    { "name": "code-review",    "by_role": "code-reviewer" },
    { "name": "security-review","by_role": "security-reviewer" }
  ]
}

Each gate is satisfied by a teammate calling team_task_satisfy_gate(task_id, gate_name, decision, at_ref?). The task transitions to completed only when all declared gates are satisfied. This is the canonical pattern for critic–actor pairs.

Tip-alignment¶

Each gate satisfaction can carry an at_ref (a commit SHA or an artifact content hash). When a task tries to complete, the runtime verifies that all gates were satisfied against the same at_ref. If the refs diverge, the task is held in awaiting_approval and a StreamTipMismatch event is emitted. The team-lead is notified and must reconcile before the task can proceed.

This prevents a reviewer from approving a stale commit while the author has already pushed new changes.
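Multi-gate completion and the tip-alignment check can be sketched together. The sketch makes several assumptions: `completion_state` is a hypothetical helper, the `blocked` label for a task with unsatisfied gates is invented, the `decision` argument is ignored, and gates satisfied without an at_ref are skipped when comparing refs (the document does not say how a missing ref is treated).

```python
def completion_state(declared_gates, satisfied):
    """declared_gates: list of gate names.
    satisfied: dict mapping gate name -> at_ref (or None if no ref was given).
    """
    # Every declared gate must be satisfied before completion is considered.
    if any(name not in satisfied for name in declared_gates):
        return "blocked"  # assumed label, not from the document
    # Tip alignment: all provided refs must agree.
    refs = {satisfied[name] for name in declared_gates
            if satisfied[name] is not None}
    if len(refs) > 1:
        return "awaiting_approval"  # StreamTipMismatch would be emitted here
    return "completed"
```

For the two-reviewer example above, a code review against one commit and a security review against a later commit would land in `awaiting_approval` rather than `completed`.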

Human-gate tasks (RequiresApproval)¶

A task with requires_approval: true cannot transition to completed without an explicit team_task_approve(id, decision, feedback?) call from the team-lead. This is the standard pattern for phase-boundary gates:

Phase-2 gate task
  requires_approval: true
  blocked_by: [all phase-1 task IDs]

Phase-2 tasks are blocked until the gate task is approved, so no teammate starts phase-2 work until you have reviewed phase-1 outcomes and given the go-ahead.

Plan-approval gates (per-teammate)¶

Roles with permissionMode: plan (or teams with plan_approval: true) enter plan-mode on spawn. The teammate produces a plan and calls team_request_plan_approval(plan_summary). The team-lead reviews and approves or rejects with feedback. On rejection, the teammate revises and re-requests. On approval, the teammate transitions to execution mode.
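The request, review, and revise loop is a small state machine. The sketch below is illustrative only: the `PlanGate` class, its method names, and the `plan` and `execution` mode labels are assumptions rather than bnerd types.

```python
class PlanGate:
    """Tracks one teammate's progress from plan-mode to execution mode."""

    def __init__(self):
        self.mode = "plan"       # teammates start in plan-mode on spawn
        self.pending = None      # plan summary awaiting review
        self.feedback = None     # last rejection feedback, if any

    def request_plan_approval(self, plan_summary):
        if self.mode != "plan":
            raise RuntimeError("already executing")
        self.pending = plan_summary

    def review(self, approved, feedback=None):
        if self.pending is None:
            raise RuntimeError("no plan awaiting review")
        self.pending = None
        if approved:
            self.mode = "execution"
        else:
            # The teammate stays in plan-mode and revises using the feedback.
            self.feedback = feedback
```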

Secrets masking¶

Teammates inherit the same secrets-scrubbing pipeline as the main chat session. API keys, tokens, and other sensitive values detected in tool call arguments or outputs are masked before they reach the model context. Masking is applied on every emit path — it cannot be disabled by a role or task instruction.
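As a rough illustration of what masking on an emit path might look like, the sketch below applies regular expressions to text before it is emitted. The two patterns and the `[MASKED]` placeholder are invented for the example; the detectors bnerd actually uses are not described in this document.

```python
import re

# Illustrative patterns only; the real detection rules are an assumption.
PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),
    re.compile(r"(?i)bearer\s+[A-Za-z0-9._\-]+"),
]


def mask(text):
    """Replace anything that looks like a secret before text is emitted."""
    for pattern in PATTERNS:
        text = pattern.sub("[MASKED]", text)
    return text
```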

Always-on skills¶

Teammates with write tools always have two skills loaded unconditionally, regardless of role configuration:

verify-before-done — the teammate must verify an artifact actually exists and contains the expected content before reporting a task done. A green build exit code alone is not sufficient.
destructive-action-guard — overrides any task instruction that would otherwise lead to a destructive action without explicit human approval.

These are not listed in the role's skills: field; they are injected by the Coordinator after the role's own skills are loaded.

File-write locking¶

A coordinator-owned file lock map prevents two write-capable teammates from editing the same file simultaneously. If a teammate's fs_write_file or fs_patch_file call targets a file currently held by another teammate, the tool returns an error suggesting coordination via team_send_message. Read-only teams do not incur this overhead.
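A lock map of this kind can be sketched as a dict from path to owning teammate. This is a single-threaded illustration under assumed names (`FileLockMap`, `acquire`, `release`); when and how the real coordinator releases a lock is not specified here.

```python
class FileLockMap:
    """Maps a file path to the teammate currently holding it."""

    def __init__(self):
        self._owners = {}

    def acquire(self, path, teammate):
        # setdefault records the caller as owner only if the path is free.
        owner = self._owners.setdefault(path, teammate)
        if owner != teammate:
            raise RuntimeError(
                f"{path} is held by {owner}; coordinate via team_send_message"
            )

    def release(self, path, teammate):
        # Only the current owner can release the lock.
        if self._owners.get(path) == teammate:
            del self._owners[path]
```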

Pause as a safety valve¶

team_pause(reason) — callable from the team-lead or a workflow-lead — transitions the Coordinator to paused. All teammates finish their current tool call and idle; new task claims are refused. bnerd team resume or the chat-lead's team_resume() tool lifts the pause.

Pausing is the intended response when something unexpected surfaces mid-run:

> pause the team, something looks wrong with task #7

The coordinator records the reason and the time; the bnerd team status output shows the pause state so you can review and decide.
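The pause behaviour described in this section can be sketched as a state flag that claim requests consult. `CoordinatorSketch`, the `running` state label, and the method names are assumptions made for illustration; only the `paused` state, the recorded reason, and the recorded time come from the text.

```python
import time


class CoordinatorSketch:
    def __init__(self):
        self.state = "running"   # assumed name for the non-paused state
        self.pause_reason = None
        self.paused_at = None

    def pause(self, reason):
        # Record why and when, so a status view can show it.
        self.state = "paused"
        self.pause_reason = reason
        self.paused_at = time.time()

    def resume(self):
        self.state = "running"

    def claim(self, task_id):
        # New claims are refused while paused; in-flight tool calls are
        # outside the scope of this sketch.
        if self.state == "paused":
            raise RuntimeError(f"team is paused: {self.pause_reason}")
        return task_id
```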