
Cloud agent (AWS / GCP / K8s)

A cloud Puck is the same puck binary as the endpoint version, deployed inside the customer’s cloud account (AWS, GCP, Azure, K8s). The “host” it investigates is the cloud control plane itself: IAM trust graphs, S3 / GCS reachability, Lambda execution roles, cross-account assume-role chains, EC2 / VM metadata, etc.

Same protocol, same heartbeat, same tag policies. Three things differ:

  • Where it runs: an ECS task, Cloud Run service, or K8s pod inside the customer’s account, not a developer’s laptop.
  • What it can call: aws *, gcloud *, kubectl * instead of ps, lsof, find. The tag policy provides the prompt and the catalog override.
  • How it authenticates to the cloud: the workload identity role attached to the compute instance. The brain never holds AWS keys.

Why a cloud Puck instead of a brain-side connector

Two reasons:

  1. The brain never holds the customer’s cloud credentials. A cloud Puck runs in the customer’s environment under the customer’s workload-identity role. They control the boundary; you don’t have to.

  2. Same investigation model. The pathfinder + plan + finding pipeline already handles cross-host correlation. A cloud Puck is one more curious agent in the fleet — the brain stitches its findings into the same graph as endpoint Pucks. “This AWS key found on eng-laptop-47” + “this AWS key resolves to AdministratorAccess via the cloud-aws Puck” automatically share a node.

You’d want a brain-side connector only if a cloud Puck couldn’t run in-account. In every realistic deployment it can.

Anatomy of a cloud Puck

customer AWS account
┌──────────────────────────────────────────────┐
│  ECS Fargate task (or EKS pod, Cloud Run)    │
│                                              │
│   ┌─────────────────────────────────┐        │
│   │  puck (same binary)             │        │
│   │                                 │        │
│   │  --tags cloud-aws,account-prod  │        │
│   │                                 │        │
│   │  Workload identity role:        │        │
│   │    PuckCloudAgentReadOnly       │        │
│   └──┬──────────────────────────────┘        │
│      │ HTTPS (443)                           │
└──────┼───────────────────────────────────────┘
       ▼
Puck brain (your hosted control plane)
Receives heartbeats, dispatches plans.

The cloud Puck heartbeats to the brain, polls for plans, executes read-only AWS API calls under its attached IAM role, and ships results back. No cloud credentials traverse the brain.
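The cycle described above can be sketched as a small loop. This is an illustration only, not the agent's actual implementation: `run_agent` and its four callables are hypothetical names, and the 60-second default is the heartbeat interval mentioned in the ECS deployment notes. What it shows is that plans execute locally under the attached role and only results travel back to the brain.

```python
import time
from typing import Any, Callable, Optional


def run_agent(
    send_heartbeat: Callable[[], None],
    fetch_plan: Callable[[], Optional[Any]],
    execute: Callable[[Any], Any],
    ship_result: Callable[[Any], None],
    interval_s: float = 60.0,
    max_cycles: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Heartbeat, poll for a plan, execute it locally, ship the result.

    All four callables are hypothetical stand-ins for the real transport
    and executor. Returns the number of cycles completed.
    """
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        send_heartbeat()                 # tell the brain this agent is alive
        plan = fetch_plan()              # None means nothing is queued
        if plan is not None:
            # Execution uses the local workload role; only the result is sent.
            ship_result(execute(plan))
        cycles += 1
        sleep(interval_s)
    return cycles
```

Passing `max_cycles` and a no-op `sleep` makes the loop easy to exercise in a test without waiting on the real interval.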

Tag policies

The shipped cloud-aws and cloud-gcp policies are loaded by demo-seed.sql and visible in Settings → Tag policies out of the box:

  • cloud-aws

    • extra_system_prompt: framing for the LLM (“you are a cloud Puck running inside the customer’s AWS account, the host is the AWS control plane, walk IAM trust policies…”)
    • extra_allowed: read-only aws iam *, aws s3 *, aws sts *, aws lambda list-*, aws rds describe-*, aws ec2 describe-*, aws ecs describe-*, aws cloudtrail lookup-events
    • extra_denied: anything matching *create*, *delete*, *put*, *update*, *attach*, *detach*
    • severity_floor: high — cloud findings are typically critical
    • default_environment_class: cloud
    • tag_findings_with: { layer: cloud, plane: aws }
  • cloud-gcp — same shape, gcloud * and gsutil * allowlist; deny list scoped to *create*, *delete*, *set-iam-policy*, etc.
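To make the interaction between extra_allowed and extra_denied concrete, here is a minimal sketch of one way such patterns could be evaluated. It assumes shell-style globs matched against the whole command line, with the deny list taking precedence; the source does not specify the real matching rules, and `is_permitted` is a hypothetical helper.

```python
from fnmatch import fnmatchcase

# Patterns copied from the cloud-aws policy described above.
EXTRA_ALLOWED = [
    "aws iam *", "aws s3 *", "aws sts *", "aws lambda list-*",
    "aws rds describe-*", "aws ec2 describe-*", "aws ecs describe-*",
    "aws cloudtrail lookup-events",
]
EXTRA_DENIED = ["*create*", "*delete*", "*put*", "*update*", "*attach*", "*detach*"]


def is_permitted(command: str) -> bool:
    """Return True if the command matches an allow glob and no deny glob.

    Assumption: globs apply to the normalised, lower-cased command line,
    and a deny match always wins over an allow match.
    """
    cmd = " ".join(command.split()).lower()
    if any(fnmatchcase(cmd, pattern) for pattern in EXTRA_DENIED):
        return False
    return any(fnmatchcase(cmd, pattern) for pattern in EXTRA_ALLOWED)


if __name__ == "__main__":
    for candidate in (
        "aws sts get-caller-identity",
        "aws iam delete-role --role-name demo",
        "ps aux",
    ):
        print(candidate, "->", is_permitted(candidate))
```

Under this assumed matching, substrings count, so a read-only call such as `aws iam list-attached-role-policies` is also caught by `*attach*`. Whether the real matcher applies patterns to the whole command line or only to the operation name is not stated in the source.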

Customers can layer additional tags for account boundaries:

puck --api-key "$PUCK_API_KEY" \
  --tags cloud-aws,account-prod-payments,team-platform

The brain merges every applicable policy. account-prod-payments can attach its own route_findings_to_webhook (PagerDuty for prod) or max_investigations_per_day budget.

ECS / Fargate deployment

IAM role

Two policies on the workload role:

  1. PuckCloudAgentRead — what the Puck investigates. Start with AWS-managed ReadOnlyAccess and pare down based on what Pathfinder actually needs:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "iam:Get*", "iam:List*", "iam:Simulate*",
            "s3:GetBucketAcl", "s3:GetBucketPolicy", "s3:GetBucketTagging",
            "s3:ListAllMyBuckets", "s3:ListBucket",
            "sts:GetCallerIdentity",
            "lambda:Get*", "lambda:List*",
            "rds:Describe*", "rds:List*",
            "ec2:Describe*",
            "ecs:Describe*", "ecs:List*",
            "cloudtrail:LookupEvents", "cloudtrail:GetTrail"
          ],
          "Resource": "*"
        }
      ]
    }

    Boundary policy is recommended — explicitly deny every *Create*, *Delete*, *Put*, *Update*, *Attach*, *Detach* action. The tag-policy extra_denied is defence in depth, not the boundary.

  2. PuckBrainEgress — VPC egress to the brain. Usually no policy needed; Fargate’s default outbound rule plus a security group allowing 443 to your brain host is enough.
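When paring the read policy down, it can help to check mechanically that nothing write-capable has crept in. The sketch below is one hedged way to do that: `non_read_only_actions` is a hypothetical helper, and the list of read-only verb prefixes is an assumption drawn from the actions in the policy above. A name-based check is only a coarse filter and does not replace the boundary policy.

```python
import json
from fnmatch import fnmatchcase

# Assumed read-only verb prefixes, taken from the actions listed above.
READ_ONLY_VERBS = ("Get*", "List*", "Describe*", "Simulate*", "Lookup*")

# The PuckCloudAgentRead policy document from this section.
PUCK_CLOUD_AGENT_READ = """
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": [
      "iam:Get*", "iam:List*", "iam:Simulate*",
      "s3:GetBucketAcl", "s3:GetBucketPolicy", "s3:GetBucketTagging",
      "s3:ListAllMyBuckets", "s3:ListBucket",
      "sts:GetCallerIdentity",
      "lambda:Get*", "lambda:List*",
      "rds:Describe*", "rds:List*",
      "ec2:Describe*",
      "ecs:Describe*", "ecs:List*",
      "cloudtrail:LookupEvents", "cloudtrail:GetTrail"
    ],
    "Resource": "*"
  }]
}
"""


def non_read_only_actions(policy_json: str) -> list[str]:
    """Return Allow-ed actions whose verb does not look read-only."""
    statements = json.loads(policy_json)["Statement"]
    if isinstance(statements, dict):      # IAM permits a single statement object
        statements = [statements]
    flagged = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):      # IAM permits a single action string
            actions = [actions]
        for action in actions:
            verb = action.partition(":")[2]   # "s3:ListBucket" -> "ListBucket"
            if not any(fnmatchcase(verb, p) for p in READ_ONLY_VERBS):
                flagged.append(action)
    return flagged


if __name__ == "__main__":
    print(non_read_only_actions(PUCK_CLOUD_AGENT_READ))
```

A bare `*` or a service-wide wildcard such as `s3:*` is flagged as well, because its verb part matches none of the read-only prefixes.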

Trust policy

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": { "Service": "ecs-tasks.amazonaws.com" },
    "Action": "sts:AssumeRole"
  }]
}

Task definition

{
  "family": "puck-cloud-agent",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "256",
  "memory": "512",
  "executionRoleArn": "arn:aws:iam::ACCOUNT:role/ecsTaskExecutionRole",
  "taskRoleArn": "arn:aws:iam::ACCOUNT:role/PuckCloudAgentReadOnly",
  "containerDefinitions": [{
    "name": "puck",
    "image": "ghcr.io/puck-security/agent:latest",
    "essential": true,
    "command": [
      "--brain-url", "https://brain.your-org.example",
      "--api-key", "${PUCK_API_KEY}",
      "--tags", "cloud-aws,account-prod-payments"
    ],
    "secrets": [
      { "name": "PUCK_API_KEY", "valueFrom": "arn:aws:secretsmanager:…:puck/api-key" }
    ],
    "logConfiguration": {
      "logDriver": "awslogs",
      "options": {
        "awslogs-group": "/ecs/puck-cloud-agent",
        "awslogs-region": "us-east-1",
        "awslogs-stream-prefix": "puck"
      }
    }
  }]
}

Run as a long-lived Fargate service (one task, restart on exit) — not a scheduled task. The Puck heartbeats every 60s and pulls plans on demand; uptime should match endpoint Pucks.

GCP / Cloud Run deployment

cloud-run.yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: puck-cloud-agent
spec:
  template:
    spec:
      serviceAccountName: puck-cloud-agent@PROJECT.iam.gserviceaccount.com
      containers:
        - image: ghcr.io/puck-security/agent:latest
          args:
            - --brain-url
            - https://brain.your-org.example
            - --api-key
            - $(PUCK_API_KEY)
            - --tags
            - cloud-gcp,project-prod-payments
          env:
            - name: PUCK_API_KEY
              valueFrom:
                secretKeyRef:
                  name: puck-api-key
                  key: latest

Cloud Run scales the service to zero by default — for a Puck you want min instances = 1 so heartbeats don’t lapse:

gcloud run services update puck-cloud-agent \
--min-instances=1 --no-cpu-throttling

Workload-identity binding gives the service account the read-only roles (roles/viewer, roles/iam.securityReviewer, roles/cloudasset.viewer).

Kubernetes deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: puck-cloud-agent
  namespace: security
spec:
  replicas: 1
  selector: { matchLabels: { app: puck-cloud-agent } }
  template:
    metadata:
      labels: { app: puck-cloud-agent }
    spec:
      serviceAccountName: puck-cloud-agent
      containers:
        - name: puck
          image: ghcr.io/puck-security/agent:latest
          args:
            - --brain-url=https://brain.your-org.example
            - --api-key=$(PUCK_API_KEY)
            - --tags=cloud-aws,k8s-cluster-prod
          env:
            - name: PUCK_API_KEY
              valueFrom:
                secretKeyRef:
                  name: puck-api-key
                  key: api-key

For AWS, attach an IRSA (IAM Roles for Service Accounts) annotation on the ServiceAccount:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: puck-cloud-agent
  namespace: security
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::ACCOUNT:role/PuckCloudAgentReadOnly
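The ecs-tasks.amazonaws.com trust policy shown for the ECS deployment does not let a pod assume the role. With IRSA the role is assumed through sts:AssumeRoleWithWebIdentity against the cluster's OIDC provider. The sketch below builds such a trust policy; `irsa_trust_policy` is a hypothetical helper, and the account ID and OIDC provider values in the usage lines are placeholders you would replace with your cluster's own.

```python
import json


def irsa_trust_policy(
    account_id: str,
    oidc_provider: str,
    namespace: str,
    service_account: str,
) -> dict:
    """Build an IAM trust policy for an EKS service account (IRSA).

    oidc_provider is the cluster's OIDC issuer without the https:// prefix.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    # Restrict the role to exactly one namespace/service account.
                    f"{oidc_provider}:aud": "sts.amazonaws.com",
                    f"{oidc_provider}:sub": (
                        f"system:serviceaccount:{namespace}:{service_account}"
                    ),
                }
            },
        }],
    }


if __name__ == "__main__":
    policy = irsa_trust_policy(
        account_id="111122223333",                                  # placeholder
        oidc_provider="oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE",  # placeholder
        namespace="security",
        service_account="puck-cloud-agent",
    )
    print(json.dumps(policy, indent=2))
```

The namespace and service account name match the manifests above, so the `sub` condition limits the role to the Puck pod's identity.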

Multi-account pattern

You’ll typically deploy one cloud Puck per account. The account-* tag gives the brain a stable handle for scope selectors:

# Production payments account
puck --tags cloud-aws,account-prod-payments
# Production data account
puck --tags cloud-aws,account-prod-data
# Sandbox / dev
puck --tags cloud-aws,account-sandbox

Investigations can then target a specific account:

POST /v1/investigations
{
  "query": "What does an OrgAccountAccessRole assumption from sandbox actually reach in prod-payments?",
  "scope": {
    "type": "selector",
    "filter": { "tags": { "label": "account-prod-payments" } }
  },
  "depth": "deep"
}

The brain selects the cloud Puck with account-prod-payments as the pathfinder and runs the IAM walk inside that account.
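As a rough illustration of how a selector scope narrows the fleet, the sketch below filters a list of agents by the label in the request's `scope`. The agent record shape (`id`, `labels`) and the `select_agents` helper are assumptions made for the example; the brain's actual selection logic is not described in the source.

```python
from typing import Iterable


def select_agents(agents: Iterable[dict], scope: dict) -> list[dict]:
    """Return agents whose labels include the label named in a selector scope."""
    if scope.get("type") != "selector":
        raise ValueError("this sketch only handles scope type 'selector'")
    wanted = scope["filter"]["tags"]["label"]
    return [agent for agent in agents if wanted in agent.get("labels", [])]


# Hypothetical fleet: one cloud Puck per account plus an endpoint Puck.
FLEET = [
    {"id": "puck-1", "labels": ["cloud-aws", "account-prod-payments"]},
    {"id": "puck-2", "labels": ["cloud-aws", "account-prod-data"]},
    {"id": "puck-3", "labels": ["cloud-aws", "account-sandbox"]},
    {"id": "eng-laptop-47", "labels": ["endpoint"]},
]

# The scope object from the request body above.
SCOPE = {
    "type": "selector",
    "filter": {"tags": {"label": "account-prod-payments"}},
}

if __name__ == "__main__":
    print([agent["id"] for agent in select_agents(FLEET, SCOPE)])
```

Because each account carries its own account-* tag, a selector like this resolves to a single cloud Puck, which is what keeps the investigation inside one account.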

Verifying the install

Once the task is running, confirm:

  1. Heartbeat: GET /api/v1/agents should show the new agent with the cloud-aws and account-* labels and an OS of whatever container runtime hosted it.

  2. Caller identity — kick off a smoke investigation:

    curl -X POST https://brain.your-org.example/api/v1/query \
      -H "Authorization: Bearer $PUCK_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{ "query": "What IAM role am I running as, and what can it reach?" }'

    The investigation narrative should reference the role you attached (PuckCloudAgentReadOnly) and the resources visible to it.

  3. Tag policy applied — the Puck’s heartbeat response carries the labels the brain has on file, and the pathfinder prompts will include the cloud system-prompt prefix the next time it investigates this host.

Troubleshooting

| Symptom | Cause | Fix |
| --- | --- | --- |
| Heartbeat 401s | Wrong / expired PUCK_API_KEY | Rotate via Settings → API key, redeploy task |
| aws sts get-caller-identity returns AccessDenied | Workload role not attached or trust policy wrong | Check task role on the running container; for Fargate, look at taskRoleArn |
| Read-only access denied on a specific service | ReadOnlyAccess doesn’t cover newer services (e.g. Amazon Q, 2024+) | Add explicit Get*/List*/Describe* on the missing service to PuckCloudAgentReadOnly |
| Pathfinder runs ps / lsof | The cloud Puck is using endpoint commands | Confirm cloud-aws tag is present and policy has the cloud extra_allowed set; the prompt in the policy steers it away from endpoint commands |
| Cross-account investigations fail | The role can’t sts:AssumeRole into linked accounts | Either grant sts:AssumeRole * on the workload role, or deploy a separate cloud Puck in each account (preferred: narrower blast radius) |