Cloud agent (AWS / GCP / K8s)

A cloud Puck is the same puck binary as the endpoint version, deployed inside the customer’s cloud account (AWS, GCP, Azure, K8s). The “host” it investigates is the cloud control plane itself: IAM trust graphs, S3 / GCS reachability, Lambda execution roles, cross-account assume-role chains, EC2 / VM metadata, etc.

Same protocol, same heartbeat, same tag policies. Only three things differ:

Where it runs — an ECS task / Cloud Run service / K8s pod inside the customer’s account, not on a developer’s laptop.
What it can call — aws *, gcloud *, kubectl * instead of ps, lsof, find. Tag policy provides the prompt + the catalog override.
How it authenticates to the cloud — the workload identity role attached to the compute instance. The brain never holds AWS keys.

Why a cloud Puck instead of a brain-side connector

Two reasons:

The brain never holds the customer’s cloud credentials. A cloud Puck runs in the customer’s environment under the customer’s workload-identity role. They control the boundary; you don’t have to.
Same investigation model. The pathfinder + plan + finding pipeline already handles cross-host correlation. A cloud Puck is one more curious agent in the fleet — the brain stitches its findings into the same graph as endpoint Pucks. “This AWS key found on eng-laptop-47” + “this AWS key resolves to AdministratorAccess via the cloud-aws Puck” automatically share a node.

You’d want a brain-side connector only if a cloud Puck couldn’t run in-account. In every realistic deployment it can.

Anatomy of a cloud Puck

                          customer AWS account
       ┌──────────────────────────────────────────────┐
       │   ECS Fargate task (or EKS pod, Cloud Run)   │
       │                                              │
       │   ┌─────────────────────────────────┐        │
       │   │  puck (same binary)             │        │
       │   │                                 │        │
       │   │  --tags cloud-aws,account-prod  │        │
       │   │                                 │        │
       │   │  Workload identity role:        │        │
       │   │  PuckCloudAgentReadOnly         │        │
       │   └──┬──────────────────────────────┘        │
       │      │ HTTPS (443)                           │
       └──────┼────────────────────────────────────────┘
              │
              ▼
        Puck brain (your hosted control plane)
        Receives heartbeats, dispatches plans.

The cloud Puck heartbeats to the brain, polls for plans, executes read-only AWS API calls under its attached IAM role, and ships results back. No cloud credentials traverse the brain.

Tag policies

The shipped cloud-aws and cloud-gcp policies are loaded by demo-seed.sql and visible in Settings → Tag policies out of the box:

cloud-aws
- extra_system_prompt: framing for the LLM (“you are a cloud Puck running inside the customer’s AWS account, the host is the AWS control plane, walk IAM trust policies…”)
- extra_allowed: read-only aws iam *, aws s3 *, aws sts *, aws lambda list-*, aws rds describe-*, aws ec2 describe-*, aws ecs describe-*, aws cloudtrail lookup-events
- extra_denied: anything matching *create*, *delete*, *put*, *update*, *attach*, *detach*
- severity_floor: high — cloud findings are typically critical
- default_environment_class: cloud
- tag_findings_with: { layer: cloud, plane: aws }
cloud-gcp — same shape, gcloud * and gsutil * allowlist; deny list scoped to *create*, *delete*, *set-iam-policy*, etc.
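
To make the interaction between extra_allowed and extra_denied concrete, here is a minimal Python sketch. It assumes glob-style matching over the whole command line, with deny taking precedence over allow. Puck's real matcher is not specified here, so is_permitted and that evaluation order are illustrative assumptions; only the patterns are copied from the cloud-aws policy above.

```python
from fnmatch import fnmatchcase

# Patterns copied from the cloud-aws policy above. The matching rules
# (glob semantics, deny wins over allow) are assumptions for illustration.
EXTRA_ALLOWED = [
    "aws iam *", "aws s3 *", "aws sts *", "aws lambda list-*",
    "aws rds describe-*", "aws ec2 describe-*", "aws ecs describe-*",
    "aws cloudtrail lookup-events",
]
EXTRA_DENIED = ["*create*", "*delete*", "*put*", "*update*", "*attach*", "*detach*"]


def is_permitted(command: str) -> bool:
    """Return True if the command matches an allow pattern and no deny pattern."""
    if any(fnmatchcase(command, pattern) for pattern in EXTRA_DENIED):
        return False
    return any(fnmatchcase(command, pattern) for pattern in EXTRA_ALLOWED)


for candidate in ("aws iam list-roles", "aws iam delete-role --role-name x", "ps aux"):
    print(candidate, "->", is_permitted(candidate))
```

Under this naive whole-string matching, a read-only call such as aws iam list-attached-role-policies is also rejected because it contains "attach", which is one reason a real matcher may anchor its patterns on the subcommand verb instead.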

Customers can layer additional tags for account boundaries:

puck --api-key … \
     --tags cloud-aws,account-prod-payments,team-platform

The brain merges every applicable policy. account-prod-payments can attach its own route_findings_to_webhook (PagerDuty for prod) or max_investigations_per_day budget.
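
The merge rules themselves are not spelled out above, so the following Python sketch only illustrates one plausible set: pattern lists are unioned, the stricter severity_floor and the smaller max_investigations_per_day win, and for other scalar fields the later tag wins. Every one of those rules, the severity ordering, and the merge_policies name are assumptions, not the brain's documented behaviour.

```python
from typing import Any

# Assumed severity ordering, lowest to highest; not taken from the product.
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


def merge_policies(policies: list[dict[str, Any]]) -> dict[str, Any]:
    """Fold tag policies left to right under the assumed rules described above."""
    merged: dict[str, Any] = {"extra_allowed": [], "extra_denied": []}
    for policy in policies:
        for key, value in policy.items():
            if key in ("extra_allowed", "extra_denied"):
                # Union of patterns, keeping first-seen order.
                for pattern in value:
                    if pattern not in merged[key]:
                        merged[key].append(pattern)
            elif key == "severity_floor" and key in merged:
                # Keep the stricter (higher) floor.
                merged[key] = max(merged[key], value, key=SEVERITY_ORDER.index)
            elif key == "max_investigations_per_day" and key in merged:
                # Keep the tighter budget.
                merged[key] = min(merged[key], value)
            else:
                # Other scalars, such as route_findings_to_webhook: later tag wins.
                merged[key] = value
    return merged
```

With these rules, an account-prod-payments policy could add a webhook and lower the daily budget without being able to loosen the deny list contributed by cloud-aws.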

ECS / Fargate deployment

IAM role

Two policies on the workload role:

PuckCloudAgentRead — what the Puck investigates. Start with AWS-managed ReadOnlyAccess and pare down based on what Pathfinder actually needs:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "iam:Get*", "iam:List*", "iam:Simulate*",
        "s3:GetBucketAcl", "s3:GetBucketPolicy", "s3:GetBucketTagging",
        "s3:ListAllMyBuckets", "s3:ListBucket",
        "sts:GetCallerIdentity",
        "lambda:Get*", "lambda:List*",
        "rds:Describe*", "rds:List*",
        "ec2:Describe*",
        "ecs:Describe*", "ecs:List*",
        "cloudtrail:LookupEvents", "cloudtrail:GetTrail"
      ],
      "Resource": "*"
    }
  ]
}

A permissions boundary on the role is recommended: have it explicitly deny every *Create*, *Delete*, *Put*, *Update*, *Attach*, *Detach* action. The tag-policy extra_denied is defence in depth, not the security boundary.

PuckBrainEgress — VPC egress to the brain. Usually no policy needed; Fargate’s default outbound rule plus a security group allowing 443 to your brain host is enough.
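
Before attaching a pared-down read policy, it can be linted offline for actions that contradict the read-only intent. The Python sketch below is a heuristic, not an AWS tool: it flags Allow-ed actions whose name starts with one of the mutating verbs listed above, or that begin with a wildcard. It does not expand partial wildcards such as s3:P* or consult AWS's action catalogue, and mutating_actions is a hypothetical helper name.

```python
import json
import re

# Verb prefixes named in the boundary recommendation above.
MUTATING_PREFIX = re.compile(r"(create|delete|put|update|attach|detach)", re.IGNORECASE)


def mutating_actions(policy_json: str) -> list[str]:
    """Return Allow-ed actions that look mutating or too broad to judge."""
    statements = json.loads(policy_json)["Statement"]
    if isinstance(statements, dict):  # IAM permits a single statement object
        statements = [statements]
    flagged = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # "Action" may be a string or a list
            actions = [actions]
        for action in actions:
            name = action.split(":", 1)[-1]  # drop the "service:" prefix
            if name.startswith("*") or MUTATING_PREFIX.match(name):
                flagged.append(action)
    return flagged
```

An empty result does not prove the policy is safe; it only means none of the obvious verbs slipped in while paring down ReadOnlyAccess.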

Trust policy

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": { "Service": "ecs-tasks.amazonaws.com" },
    "Action": "sts:AssumeRole"
  }]
}

Task definition

{
  "family": "puck-cloud-agent",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "256",
  "memory": "512",
  "executionRoleArn": "arn:aws:iam::ACCOUNT:role/ecsTaskExecutionRole",
  "taskRoleArn":      "arn:aws:iam::ACCOUNT:role/PuckCloudAgentReadOnly",
  "containerDefinitions": [{
    "name":  "puck",
    "image": "ghcr.io/puck-security/agent:latest",
    "essential": true,
    "command": [
      "--brain-url", "https://brain.your-org.example",
      "--api-key",   "${PUCK_API_KEY}",
      "--tags",      "cloud-aws,account-prod-payments"
    ],
    "secrets": [
      { "name": "PUCK_API_KEY", "valueFrom": "arn:aws:secretsmanager:…:puck/api-key" }
    ],
    "logConfiguration": {
      "logDriver": "awslogs",
      "options": {
        "awslogs-group":         "/ecs/puck-cloud-agent",
        "awslogs-region":        "us-east-1",
        "awslogs-stream-prefix": "puck"
      }
    }
  }]
}

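Run it as a long-lived Fargate service (one task, restarted on exit), not as a scheduled task. The Puck heartbeats every 60s and pulls plans on demand, so its uptime should match that of endpoint Pucks. Note that ECS passes command to the container without shell or variable expansion, so the literal string ${PUCK_API_KEY} reaches the binary unless the image's entrypoint expands it; if it does not, wrap the invocation in sh -c so the injected secret is read at runtime.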

GCP / Cloud Run deployment

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: puck-cloud-agent
spec:
  template:
    spec:
      serviceAccountName: puck-cloud-agent@PROJECT.iam.gserviceaccount.com
      containers:
        - image: ghcr.io/puck-security/agent:latest
          args:
            - --brain-url
            - https://brain.your-org.example
            - --api-key
            - $(PUCK_API_KEY)
            - --tags
            - cloud-gcp,project-prod-payments
          env:
            - name: PUCK_API_KEY
              valueFrom:
                secretKeyRef:
                  name:   puck-api-key
                  key:    latest

Cloud Run scales the service to zero by default — for a Puck you want min instances = 1 so heartbeats don’t lapse:

gcloud run services update puck-cloud-agent \
       --min-instances=1 --no-cpu-throttling

IAM role bindings on the project grant that service account its read-only roles (roles/viewer, roles/iam.securityReviewer, roles/cloudasset.viewer); Cloud Run then runs the Puck under that identity, with no exported key.

Kubernetes deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name:      puck-cloud-agent
  namespace: security
spec:
  replicas: 1
  selector: { matchLabels: { app: puck-cloud-agent } }
  template:
    metadata:
      labels: { app: puck-cloud-agent }
    spec:
      serviceAccountName: puck-cloud-agent
      containers:
        - name:  puck
          image: ghcr.io/puck-security/agent:latest
          args:
            - --brain-url=https://brain.your-org.example
            - --api-key=$(PUCK_API_KEY)
            - --tags=cloud-aws,k8s-cluster-prod
          env:
            - name: PUCK_API_KEY
              valueFrom:
                secretKeyRef:
                  name: puck-api-key
                  key:  api-key

For AWS, attach an IRSA (IAM Roles for Service Accounts) annotation on the ServiceAccount:

apiVersion: v1
kind: ServiceAccount
metadata:
  name:      puck-cloud-agent
  namespace: security
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::ACCOUNT:role/PuckCloudAgentReadOnly
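
A role used through IRSA is assumed via the cluster's OIDC provider with sts:AssumeRoleWithWebIdentity, so the ECS trust policy shown earlier (ecs-tasks.amazonaws.com) does not cover this case. The Python sketch below generates the usual IRSA trust policy. The account ID and OIDC provider host are placeholders, irsa_trust_policy is a hypothetical helper, and the namespace and ServiceAccount names are taken from the manifests above.

```python
import json


def irsa_trust_policy(account_id: str, oidc_provider: str,
                      namespace: str, service_account: str) -> dict:
    """Build a trust policy letting one Kubernetes ServiceAccount assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    f"{oidc_provider}:sub": f"system:serviceaccount:{namespace}:{service_account}",
                    f"{oidc_provider}:aud": "sts.amazonaws.com",
                }
            },
        }],
    }


# Placeholder account ID and OIDC provider; substitute the cluster's real values.
policy = irsa_trust_policy(
    account_id="111122223333",
    oidc_provider="oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE",
    namespace="security",
    service_account="puck-cloud-agent",
)
print(json.dumps(policy, indent=2))
```

Pinning the sub condition to a single namespace and ServiceAccount keeps other pods in the cluster from assuming the Puck's role.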

Multi-account pattern

You’ll typically deploy one cloud Puck per account. The account-* tag gives the brain a stable handle for scope selectors:

# Production payments account
puck --tags cloud-aws,account-prod-payments

# Production data account
puck --tags cloud-aws,account-prod-data

# Sandbox / dev
puck --tags cloud-aws,account-sandbox

Investigations can then target a specific account:

POST /v1/investigations
{
  "query": "What does an OrgAccountAccessRole assumption from sandbox actually reach in prod-payments?",
  "scope": {
    "type": "selector",
    "filter": { "tags": { "label": "account-prod-payments" } }
  },
  "depth": "deep"
}

The brain selects the cloud Puck with account-prod-payments as the pathfinder and runs the IAM walk inside that account.
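
Because the selector differs only in the account tag, the same question can be fanned out across accounts by generating one request body per tag. The Python sketch below builds those bodies with the field names from the request above; the investigation_request helper and the sample query are illustrative, and sending the request is left out because the base URL and authentication depend on the deployment.

```python
import json


def investigation_request(query: str, account_tag: str, depth: str = "deep") -> dict:
    """Build the body for POST /v1/investigations scoped to one account tag."""
    return {
        "query": query,
        "scope": {
            "type": "selector",
            "filter": {"tags": {"label": account_tag}},
        },
        "depth": depth,
    }


# Account tags from the multi-account example above.
accounts = ["account-prod-payments", "account-prod-data", "account-sandbox"]
bodies = [
    investigation_request("Which IAM roles can be assumed from outside this account?", tag)
    for tag in accounts
]
print(json.dumps(bodies[0], indent=2))
```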

Verifying the install

Once the task is running, confirm:

Heartbeat — GET /api/v1/agents should show the new agent with the cloud-aws and account-* labels and an OS of whatever container runtime hosted it.

Caller identity — kick off a smoke investigation:

curl -X POST https://brain.your-org.example/api/v1/query \
  -H "Authorization: Bearer $PUCK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "query": "What IAM role am I running as, and what can it reach?" }'

The investigation narrative should reference the role you attached (PuckCloudAgentReadOnly) and the resources visible to it.

Tag policy applied — the Puck’s heartbeat response carries the labels the brain has on file, and the pathfinder prompts will include the cloud system-prompt prefix the next time it investigates this host.
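
The heartbeat check can be scripted once the agent list has been fetched. The Python sketch below assumes each agent record in the GET /api/v1/agents response carries a labels list; that shape is not documented here, so both the sample record and the missing_labels helper are hypothetical.

```python
from fnmatch import fnmatchcase


def missing_labels(agent: dict, required: list[str]) -> list[str]:
    """Return the required label patterns that no label on the agent matches."""
    labels = agent.get("labels", [])
    return [
        pattern for pattern in required
        if not any(fnmatchcase(label, pattern) for label in labels)
    ]


# Hypothetical agent record; the real response shape may differ.
agent = {"hostname": "puck-cloud-agent", "labels": ["cloud-aws", "account-prod-payments"]}
print(missing_labels(agent, ["cloud-aws", "account-*"]))  # prints []
```

A non-empty result points at a tag that was dropped from --tags in the task definition or manifest.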

Troubleshooting

| Symptom | Cause | Fix |
| --- | --- | --- |
| Heartbeat 401s | Wrong or expired `PUCK_API_KEY` | Rotate via Settings → API key, then redeploy the task |
| `aws sts get-caller-identity` returns `AccessDenied` | Workload role not attached or trust policy wrong | Check the task role on the running container; for Fargate, look at `taskRoleArn` |
| `Read-only access denied` on a specific service | `ReadOnlyAccess` doesn’t cover newer services (e.g. Q in 2024+) | Add explicit `Get*`/`List*`/`Describe*` on the missing service to `PuckCloudAgentReadOnly` |
| Pathfinder runs `ps` / `lsof` | The cloud Puck is using endpoint commands | Confirm the `cloud-aws` tag is present and the policy has the cloud `extra_allowed` set; the prompt in the policy steers it away from endpoint commands |
| Cross-account investigations fail | The role can’t `sts:AssumeRole` into linked accounts | Either grant the workload role `sts:AssumeRole` on the target roles in the linked accounts (which must also trust it), or deploy a separate cloud Puck in each account (preferred: narrower blast radius) |