Add Browser Collector Enrollment Primitive
Context
PDPP collects personal data from several topologies. Two matter here:
- Filesystem-backed local collection (claude-code, codex): a local collector reads files on the owner's device and pushes normalized records through the device-exporter enroll/ingest path. The enrolled binding is recorded with source_kind: "local_device".
- Browser-bound collection (amazon, chase, chatgpt): a connector drives a real browser session against a logged-in provider. Today this runs natively (headed Patchright on the owner's desktop) or, in Docker, through the host browser bridge (design-host-browser-bridge-for-docker). It has no enrollment primitive: there is no way for a trusted owner agent to initiate a browser-bound connection and have it materialize through the device-exporter path.
The owner-agent control surface (add-owner-agent-control-surface) made the gap
visible and honest: POST /v1/owner/connections/intents classifies a connector
by its manifest runtime_requirements.bindings and returns unsupported for any
browser binding, with a reason that names this primitive verbatim. The intent
route is wired but dormant for browser-bound connectors precisely because the
enrollment primitive does not exist.
This change designs that primitive. It does not implement it (that is the
next lane). The output is the durable boundary: what browser_collector is, how
binding-aware enrollment gates it, what proof is required before Amazon flips
from unsupported, and what stays out of PDPP Core.
Grounding facts (verified in tree)
- reference-implementation/server/routes/ref-device-exporters.ts enroll handler hardcodes sourceKind: "local_device" and sourceBinding.kind: "local_device" (the createRequestConnectorInstanceStore().upsert(...) call). The enrollment-code handler accepts connector_id + local_binding_name and does not inspect the manifest.
- reference-implementation/server/routes/owner-connection-intent.ts classifyConnectorIntentModality maps a browser binding → browser_bound, and unsupportedReason("browser_bound") already names "a browser_collector source kind (distinct from local_device), binding-aware enrollment gating, and committed proof".
- packages/polyfill-connectors/manifests/amazon.json declares runtime_requirements.bindings: { network: { required: true }, browser: { required: true } }. claude_code.json / codex.json declare filesystem.
- packages/polyfill-connectors/src/browser-manifest-honesty.test.ts already enforces that browser-backed connectors declare bindings.browser.required: true.
- The Collection Profile (spec-collection-profile.md §1) defines standard bindings browser_automation, browser_profile, filesystem, network, interactive, loopback_listen. There is no bare browser binding in the spec registry; reference manifests use browser as a reference-local marker. This mismatch is captured in design-notes/browser-binding-launch-direction-2026-05-18.md.
Goals / Non-Goals
Goals:
- Define browser_collector as a connector-instance source kind, a peer of local_device, so browser-collected and filesystem-collected bindings for the same connector type never collide and an owner can tell them apart.
- Make device-exporter enrollment binding-aware so the source kind is derived from the manifest, not hardcoded, and a contradicting source kind is rejected.
- Specify the owner-mediated initiation flow for browser-bound connectors that reaches an actionable next step without marking the connection active.
- Make proof a precondition: no route may advertise a real browser-bound next step until a committed test + scrubbed fixture shows end-to-end ingest.
- Keep Core / Collection Profile / reference / operator boundaries explicit.
Non-Goals:
- Do not implement the primitive in this lane. This is design/spec only.
- Do not promote browser_collector (or local_device) into PDPP Core protocol vocabulary. It is reference / Collection Profile implementation vocabulary.
- Do not reconcile the bare browser binding name against the spec-defined browser_automation / browser_profile registry here. That is a separate promotion tracked by browser-binding-launch-direction-2026-05-18.md and must not be silently absorbed. This change reads whatever binding marker the manifest declares; it does not bless the name.
- Do not build a new browser transport. Browser sessions continue to run via the native headed launcher or the host browser bridge. browser_collector is an enrollment / identity primitive, not a new automation channel.
- Do not weaken the device-exporter trust model. Browser-collected ingest uses the same enrolled, revocable device credentials and the same instance-resolution rules as filesystem collection.
Decisions
1. browser_collector is a connector-instance source kind, not a spine source kind
There are two unrelated "source kind" axes in the reference, and conflating them would be a real bug:
- reference-implementation/lib/spine.ts defines SourceKind = "connector" | "provider_native" for spine event provenance. This change does not touch that union.
- The connector-instance source binding carries sourceKind: "local_device" (the enroll handler's upsert). browser_collector is a new value on this axis only, a peer of local_device.
A browser_collector source binding means: this connector instance is collected
by a local collector that drives a browser session for a browser-bound connector.
It is distinct from local_device (filesystem-read) because the trust posture,
the diagnostics an owner needs ("session expired / login required" vs "device
unreachable"), and the future health surface differ. Recording it as
local_device would assert filesystem collection that did not happen.
The source binding records enough to namespace and resolve the instance: the
connector key, the source kind (browser_collector), the device id, the local
binding name, and a source-instance id — the same shape local_device uses, with
the kind swapped. No new top-level noun is minted; per
connection-first-collection-identity-2026-05-18.md, a binding becomes
first-class only when it needs independent lifecycle, which it does not here.
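
The two axes can be sketched as separate type unions to show that the new value lands on only one of them. This is a minimal illustration, not the reference's code: the constant names, the SourceBinding field names, and the makeBinding helper are all assumptions made for the example.

```typescript
// Illustrative only: every name below is an assumption, not the reference's real types.

// Axis 1: spine event provenance (the union in reference-implementation/lib/spine.ts).
const SPINE_SOURCE_KINDS = ["connector", "provider_native"] as const;
type SpineSourceKind = (typeof SPINE_SOURCE_KINDS)[number];

// Axis 2: connector-instance source binding; browser_collector is the one new value.
const INSTANCE_SOURCE_KINDS = ["local_device", "browser_collector"] as const;
type InstanceSourceKind = (typeof INSTANCE_SOURCE_KINDS)[number];

// Hypothetical binding shape: the same fields for both kinds, only `kind` differs.
interface SourceBinding {
  connectorKey: string;
  kind: InstanceSourceKind;
  deviceId: string;
  localBindingName: string;
  sourceInstanceId: string;
}

function isSpineSourceKind(value: string): value is SpineSourceKind {
  return SPINE_SOURCE_KINDS.some((k) => k === value);
}

function isInstanceSourceKind(value: string): value is InstanceSourceKind {
  return INSTANCE_SOURCE_KINDS.some((k) => k === value);
}

// Build a binding of either kind from the same identifying fields.
function makeBinding(
  kind: InstanceSourceKind,
  fields: Omit<SourceBinding, "kind">,
): SourceBinding {
  return { ...fields, kind };
}

// Placeholder values for illustration.
const bindingFields = {
  connectorKey: "amazon",
  deviceId: "device-1",
  localBindingName: "personal",
  sourceInstanceId: "instance-1",
};
const browserBinding = makeBinding("browser_collector", bindingFields);
```

Keeping the two unions separate means a value such as browser_collector cannot be passed where spine provenance is expected without failing a type or guard check.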
2. Enrollment derives the source kind from the manifest and rejects contradictions
Today the enroll route hardcodes local_device. That is correct only because the
only enrolled connectors are filesystem-backed. The durable rule:
- The enrollment-code and enroll routes SHALL resolve the connector manifest and read runtime_requirements.bindings.
- A filesystem binding → the enrolled binding SHALL be local_device.
- A browser binding (and no filesystem binding) → the enrolled binding SHALL be browser_collector.
- If a caller supplies an explicit source kind that contradicts the manifest (e.g. local_device for a browser-only connector), the route SHALL reject the request rather than silently record the wrong kind.
- A connector with neither binding, or no registered manifest, SHALL be rejected with a typed error; enrollment SHALL NOT default to a source kind.
This mirrors the existing intent-route classifier
(classifyConnectorIntentModality) so the enroll path and the intent path agree
on the same manifest-derived placement signal. filesystem wins over browser
if a manifest declares both (defensive; no current manifest does), matching the
classifier's existing precedence.
The binding marker the route reads is whatever the manifest declares
(browser). This change deliberately does not rename it to a spec-registry
binding; see Non-Goals and Open Questions.
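
The rules in this decision can be sketched as a single resolver. This is a hedged sketch rather than the reference's implementation: the function name resolveSourceKind, the error codes, and the result type are hypothetical, and it assumes that the presence of a binding key in the manifest is what counts as declaring that binding.

```typescript
// Sketch under assumptions: type names, error codes, and the function name are hypothetical.
type InstanceSourceKind = "local_device" | "browser_collector";

interface BindingRequirement {
  required?: boolean;
}

interface ConnectorManifest {
  runtime_requirements?: { bindings?: Record<string, BindingRequirement> };
}

type EnrollmentErrorCode =
  | "manifest_not_found"
  | "no_enrollable_binding"
  | "source_kind_contradicts_manifest";

type Resolution =
  | { ok: true; sourceKind: InstanceSourceKind }
  | { ok: false; error: EnrollmentErrorCode };

function resolveSourceKind(
  manifest: ConnectorManifest | undefined,
  requested?: InstanceSourceKind,
): Resolution {
  // No registered manifest: reject, never default.
  if (!manifest) return { ok: false, error: "manifest_not_found" };

  const bindings = manifest.runtime_requirements?.bindings ?? {};

  // Assumption: a declared key counts as the binding. filesystem wins over browser.
  let derived: InstanceSourceKind | undefined;
  if ("filesystem" in bindings) derived = "local_device";
  else if ("browser" in bindings) derived = "browser_collector";

  // Neither binding declared: reject with a typed error.
  if (!derived) return { ok: false, error: "no_enrollable_binding" };

  // An explicit source kind that contradicts the manifest is rejected.
  if (requested && requested !== derived) {
    return { ok: false, error: "source_kind_contradicts_manifest" };
  }
  return { ok: true, sourceKind: derived };
}

// Mirrors the bindings amazon.json declares.
const amazonManifest: ConnectorManifest = {
  runtime_requirements: {
    bindings: { network: { required: true }, browser: { required: true } },
  },
};
const amazonResolution = resolveSourceKind(amazonManifest);
```

Returning a tagged result instead of throwing keeps each rejection reason explicit at the call site; a route could map each error code to its own typed response.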
3. New connection creation stays an intent; browser-bound gets a typed next step only after proof
The owner-agent control surface already established that initiation is an intent,
not a mutation, and that every response carries connection_active: false. This
change defines the browser-bound branch:
- Before proof: the browser_bound branch returns unsupported (current behavior). The reason names this primitive. No code change flips it early.
- After proof (decision 4): the branch MAY return a typed next_step.kind: "enroll_browser_collector" carrying a single-use enrollment code and the enroll endpoint, minted via the same deviceExporterStore.createEnrollmentCode operation the local-collector branch uses (separate bearer auth adapter; no handler cloning). The response SHALL keep connection_active: false.
The connection materializes only when the owner's collector exchanges the code,
the owner completes provider login locally (the agent never receives
credentials or drives 2FA), and the collector ingests at least one batch through
the device-exporter path. Initiation continues to emit
owner_agent.connection.initiate spine evidence (actor, connector key, modality,
next-step kind, outcome) and never logs the bearer token or the minted code.
enroll_browser_collector is reserved as a distinct next-step kind from
enroll_local_collector so the collector and the owner can see that the next
step requires a browser session and a local login, not a filesystem scan. The
contract enum already reserves unused kinds (open_url,
complete_browser_assistance, upload_file); enroll_browser_collector joins
them as a reserved-then-emitted kind, so adding it is not a contract break.
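
The before-proof and after-proof branches could take roughly the following shape. This is a sketch under stated assumptions: only next_step.kind and connection_active come from the text above; the outcome, enrollment_code, and enroll_endpoint field names, the proofLanded flag, the reason wording, and the mintCode callback (standing in for deviceExporterStore.createEnrollmentCode) are hypothetical.

```typescript
// Sketch under assumptions: field names other than next_step.kind and
// connection_active are hypothetical, as are proofLanded and mintCode.
type NextStepKind = "enroll_local_collector" | "enroll_browser_collector";

interface NextStep {
  kind: NextStepKind;
  enrollment_code: string; // single-use
  enroll_endpoint: string;
}

type IntentResponse =
  | { outcome: "unsupported"; reason: string; connection_active: false }
  | { outcome: "next_step"; next_step: NextStep; connection_active: false };

interface BrowserBoundDeps {
  proofLanded: boolean; // stands in for "the committed proof exists"
  mintCode: () => string; // stands in for the shared enrollment-code operation
  enrollEndpoint: string; // supplied by the caller; no real path is assumed here
}

function browserBoundIntent(deps: BrowserBoundDeps): IntentResponse {
  if (!deps.proofLanded) {
    // Before proof: the honest output, and no code is minted.
    return {
      outcome: "unsupported",
      reason:
        "requires a browser_collector source kind, binding-aware enrollment gating, and committed proof",
      connection_active: false,
    };
  }
  // After proof: a typed next step. The connection is still not active.
  return {
    outcome: "next_step",
    next_step: {
      kind: "enroll_browser_collector",
      enrollment_code: deps.mintCode(),
      enroll_endpoint: deps.enrollEndpoint,
    },
    connection_active: false,
  };
}
```

Typing connection_active as the literal false on every variant makes it a compile-time error for either branch to report an active connection.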
4. Proof is a precondition, not a follow-up
The acceptance bar from add-owner-agent-control-surface is explicit: claiming a
flow the reference does not prove is a faked success and is forbidden. Therefore:
- No route SHALL advertise enroll_browser_collector (or otherwise flip a browser-bound connector off unsupported) until there is a committed test that drives a browser-bound connector (Amazon) through enrollment → browser session → device-exporter ingest, and a scrubbed fixture (per the scrub-connector-fixtures pipeline) proving the ingested shape.
- The proof SHALL exercise the real enroll/ingest path, not a mock that asserts the happy path without touching the binding-aware enrollment code.
- Until the proof lands, unsupported with the named gap is the honest output.
This keeps the flip and the proof in the same reviewable unit when implementation happens, and prevents a future lane from advertising the next step on faith.
5. Browser transport is unchanged; this is identity + enrollment only
browser_collector does not introduce a new way to run a browser. The browser
session still runs through the native headed launcher or the host browser bridge
(design-host-browser-bridge-for-docker). What is new is:
- the enrollment decision (which source kind to record), and
- the instance identity (browser_collector binding) that ingest resolves against.
This keeps browser automation where it already lives (the polyfill-connector
runtime and the host bridge) and out of PDPP Core and the central server. The
central server still receives normalized records, state, health, and diagnostics
— never raw browser control or remote filesystem access — exactly as
local-device-exporter-collection requires today.
6. Multi-account Amazon is correct by construction
With browser_collector as a distinct binding under connection-first identity:
- "Tim personal Amazon" and a future "shared Amazon" are two connector instances with the same connector_id: amazon, distinct connector_instance_id, each browser_collector, each with its own local binding name, schedules, state, health, and idempotency namespace (the instance-scoping requirements in reference-connector-instances already cover this once the source kind exists).
- A second Amazon account is "enroll another browser-collector binding for amazon", not a special case. Chase and ChatGPT inherit the same path because they share the browser binding marker.
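
As a rough illustration of why two accounts do not interfere, the sketch below scopes an idempotency key by instance rather than by connector. The ConnectorInstance shape, the key format, and the sample ids are assumptions for the example; the actual scoping rules live in reference-connector-instances.

```typescript
// Hypothetical shapes; the reference's instance store and idempotency scheme may differ.
interface ConnectorInstance {
  connector_id: string;
  connector_instance_id: string;
  source_kind: "local_device" | "browser_collector";
  local_binding_name: string;
}

// Assumed scheme: scope every record key by the instance id, not the connector id.
function idempotencyKey(instance: ConnectorInstance, recordKey: string): string {
  return `${instance.connector_instance_id}:${recordKey}`;
}

// Two accounts: same connector, distinct instances (placeholder ids).
const personalAmazon: ConnectorInstance = {
  connector_id: "amazon",
  connector_instance_id: "inst-personal",
  source_kind: "browser_collector",
  local_binding_name: "tim-personal",
};
const sharedAmazon: ConnectorInstance = {
  connector_id: "amazon",
  connector_instance_id: "inst-shared",
  source_kind: "browser_collector",
  local_binding_name: "shared",
};

// The same provider-side record key lands in two separate namespaces.
const personalKey = idempotencyKey(personalAmazon, "order-123");
const sharedKey = idempotencyKey(sharedAmazon, "order-123");
```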
Risks / Trade-offs
- Risk: browser_collector proliferates source-kind values. Mitigation: it is one new value on one existing axis (connector-instance source binding), justified by a distinct trust posture and diagnostics need. No new top-level noun, no new spine source kind.
- Risk: binding-aware enrollment misclassifies a connector that declares both filesystem and browser. Mitigation: deterministic precedence (filesystem wins), matching the existing intent classifier; no current manifest declares both; a connector with neither is rejected, not defaulted.
- Risk: the bare browser binding marker gets blessed by accident. Mitigation: this change reads the marker the manifest already declares and explicitly defers the spec-registry reconciliation to browser-binding-launch-direction-2026-05-18.md. The spec delta does not name browser as a Core or Collection Profile binding.
- Risk: a future lane advertises the next step without real proof. Mitigation: the proof gate is a normative requirement, and the flip and proof are designed to land in the same unit.
- Risk: agents attempt unsafe provider automation. Mitigation: provider login and 2FA stay owner-mediated and local; the intent never returns credentials, and connection_active stays false until local ingest.
Migration Plan
This change is design only; the migration below is the implementation lane's plan, recorded so the next slice has a concrete start.
- Add browser_collector to the connector-instance source-kind type and the enroll handler's sourceBinding construction.
- Add a manifest-derived source-kind resolver shared by the enrollment-code and enroll routes (reuse the intent classifier's binding precedence).
- Reject contradicting or unresolvable source kinds with typed errors; add unit coverage for filesystem→local_device, browser→browser_collector, contradiction→reject, no-binding→reject.
- Land the Amazon end-to-end proof test + scrubbed fixture.
- Only then: flip the browser_bound intent branch to return enroll_browser_collector, and add the add-owner-agent-control-surface Amazon second-account acceptance coverage (its tasks 5.3 / 8.5).
Rollback is route-level: the intent branch reverts to unsupported; enrolled
browser_collector instances remain valid instance rows governed by normal
retention and grant rules.
Open Questions
- Should enroll_browser_collector be a distinct next-step kind, or should enroll_local_collector carry a requires_browser_session: true flag? Leaning distinct kind (decision 3) because the collector and owner need to see the browser/login requirement without parsing flags; revisit if the collector CLI would rather branch on a flag.
- Should the bare browser binding be reconciled to browser_automation / browser_profile before or after this primitive ships? Deferred to browser-binding-launch-direction-2026-05-18.md. This change works with either outcome because it reads whatever marker the manifest declares.
- Does Chase's money-adjacent posture warrant a stricter enrollment gate than Amazon (e.g. an explicit per-connector allowlist for browser_collector initiation)? Out of scope here; flagged for the implementation lane.
Acceptance Checks
- browser_collector is defined as a connector-instance source kind distinct from local_device, with a stated reason it is not local_device and not a spine source kind.
- Binding-aware enrollment gating is specified: manifest-derived source kind, contradiction rejection, no defaulting.
- The owner-mediated browser-bound initiation flow reaches a typed next step without marking the connection active.
- A proof precondition is specified before any route flips Amazon off unsupported.
- Core / Collection Profile / reference / operator boundaries are explicit, and the browser binding name reconciliation is deferred, not absorbed.
- Implementation, Amazon proof test + scrubbed fixture, and the intent-branch flip: deferred to the implementation lane (out of scope here).