Does the Promptbook replace a pentest?

No. It helps identify signals and organize the review. Impact validation, exploitation chains, and fix confirmation remain part of the Manual Pentest.

Why is the Manual Pentest not sold through public checkout?

Because scope, authorization, environment, and operational risk need to be evaluated before a proposal.

Codex mobile and Agent HQ: approving agent PRs on a phone still needs boundaries

Codex in ChatGPT, GitHub Agent HQ and Copilot coding agent shorten the path from issue to PR to deploy. For a SaaS with real customers, review permissions, CI, secrets, branch rules and rollback before that speed turns into excessive access.

The agent moved into the decision flow

The important shift is no longer "AI wrote code". It is that an agent can create branches, read issues, open PRs, interact with CI, request approvals and be managed from a phone. Codex, Copilot coding agent, Agent HQ and GitHub Mobile put engineering work into a faster, easier approval loop.

That is useful for founders. It becomes risky when the same speed touches payments, customer data, admin areas, plan logic or production deploys.

Review first

Agent repository permissions: read, write, secrets, actions and protected branches.
Issue scope: one closed problem, not "improve the whole product".
Mandatory CI before merge: lint, tests, typecheck, build and route smoke.
Secrets absent from logs, comments, prompts, screenshots, artifacts and PR summaries.
Human approval for database migrations, payment logic, webhooks, auth and tenant rules.
Preview or seeded data for quick testing, never real production data.
Rollback for changes touching checkout, login, uploads, admin or billing.
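
The human-approval rule above can be partly automated by flagging PRs whose changed files touch sensitive areas. What follows is a minimal sketch, not a definitive implementation: the path patterns and the function name `areas_needing_human_approval` are assumptions made for illustration, and a real repository would need patterns that match its own layout.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns for illustration only; adjust to the real repository layout.
# Substring-style patterns deliberately over-flag (for example "author.py" matches
# "*auth*"), because a false alarm is cheaper than a missed review.
SENSITIVE_PATTERNS = {
    "database migration": ["migrations/*", "*/migrations/*"],
    "payment logic": ["*billing*", "*payment*", "*checkout*"],
    "webhooks": ["*webhook*"],
    "auth": ["*auth*", "*login*"],
    "tenant rules": ["*tenant*"],
}


def areas_needing_human_approval(changed_files):
    """Return the sorted names of sensitive areas touched by the changed paths."""
    touched = set()
    for path in changed_files:
        lowered = path.lower()
        for area, patterns in SENSITIVE_PATTERNS.items():
            if any(fnmatchcase(lowered, pattern) for pattern in patterns):
                touched.add(area)
    return sorted(touched)
```

An empty result means the change touches none of the listed areas. A non-empty result is a signal to require a named human reviewer before merge, whoever opened the PR.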

The sales blocker

Customers do not care whether Codex, Copilot, Claude or Cursor built the screen. They care whether one customer can see another customer's data, whether paid access unlocks incorrectly, whether a webhook duplicates delivery or whether a mobile approval shipped without evidence.
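
The first of those concerns, one customer seeing another customer's data, is the kind of thing a small test can pin down before any agent PR is approved. The sketch below assumes a hypothetical in-memory `InvoiceStore` and `Invoice` type; a real application would run the same assertion against its actual data layer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    id: int
    tenant_id: str
    amount_cents: int


class InvoiceStore:
    """Hypothetical in-memory store standing in for a real database layer."""

    def __init__(self, invoices):
        self._invoices = {invoice.id: invoice for invoice in invoices}

    def get_for_tenant(self, tenant_id, invoice_id):
        invoice = self._invoices.get(invoice_id)
        # A record owned by another tenant is treated exactly like a missing one.
        if invoice is None or invoice.tenant_id != tenant_id:
            return None
        return invoice


def test_tenant_cannot_read_other_tenants_invoice():
    store = InvoiceStore([Invoice(1, "acme", 5000), Invoice(2, "globex", 7500)])
    # A tenant can read its own invoice.
    assert store.get_for_tenant("acme", 1) is not None
    # The same tenant gets nothing back for an invoice it does not own.
    assert store.get_for_tenant("acme", 2) is None
```

If an agent-generated change removes the tenant check in the lookup, this test fails in CI rather than in front of a customer.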

The practical path is Promptbook for the first flow review, Risk Review when the signal touches revenue or data, and Manual Pentest when a B2B customer needs scoped proof.

Before approving from mobile

Is the PR small and focused?
Did CI run on the exact commit to be merged?
Is there a test for the flow that sells, authenticates or separates customers?
Did the agent avoid production secrets?
Does the human reviewer understand the business risk?
Is rollback or a feature flag available?
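
Most of the questions above can be checked mechanically before the approval button is even shown. The following is a sketch under stated assumptions: the `PullRequest` fields, the size thresholds and the function name `mobile_approval_blockers` are all hypothetical, and the data would have to be filled in from whatever the code host and CI system actually report.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds for "small and focused"; tune them per team.
MAX_CHANGED_FILES = 10
MAX_CHANGED_LINES = 300


@dataclass
class PullRequest:
    head_sha: str                  # commit that would actually be merged
    ci_passed_sha: Optional[str]   # last commit on which CI passed, if any
    changed_files: int
    changed_lines: int
    has_critical_flow_test: bool   # covers selling, authenticating or tenant separation
    secrets_detected: bool         # result of a separate secret scan
    has_rollback_or_flag: bool


def mobile_approval_blockers(pr):
    """Return the reasons this PR should not be approved from a phone."""
    blockers = []
    if pr.changed_files > MAX_CHANGED_FILES or pr.changed_lines > MAX_CHANGED_LINES:
        blockers.append("PR is too large for a focused review")
    if pr.ci_passed_sha != pr.head_sha:
        blockers.append("CI did not pass on the exact commit to be merged")
    if not pr.has_critical_flow_test:
        blockers.append("no test covers the critical flow")
    if pr.secrets_detected:
        blockers.append("possible production secret in the change")
    if not pr.has_rollback_or_flag:
        blockers.append("no rollback or feature flag available")
    return blockers
```

An empty list only clears the mechanical checks. Whether the reviewer understands the business risk remains a human judgment that no field in this sketch can capture.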

Mobile approval works when the guardrails already exist. Without them, speed turns into over-permissioned access.