Chrome Extension
The OpenScouter Chrome extension is the primary data capture interface for testers. It runs during a live test, records three parallel streams (browser events, facial snapshots, and voice audio), and sends event data to the API in real time. The extension is built on Manifest V3, which replaced the legacy Manifest V2 background page model with service workers.
Manifest V3
OpenScouter uses Manifest V3 (MV3), the current Chrome extension platform. MV3 introduced several constraints that shaped the extension architecture.
Background scripts were replaced by a service worker. Service workers are ephemeral: the browser may terminate them when idle and restart them on demand. This means the background worker cannot rely on in-memory state surviving between events. All persistent state is written to chrome.storage.local immediately when it changes.
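One way to follow this rule is a write-through helper that reads from storage on every access and persists every change at once. The sketch below is illustrative only: `createSessionState` and the `session` key are hypothetical names, and `storageArea` is assumed to expose promise-based `get` and `set` in the way chrome.storage.local does under MV3.

```javascript
// Hypothetical helper: write-through state so nothing depends on worker memory.
// `storageArea` is assumed to behave like chrome.storage.local (promise-based get/set).
function createSessionState(storageArea, key = 'session') {
  return {
    // Always read from storage; the worker may have restarted since the last write.
    async read() {
      const stored = await storageArea.get(key);
      return stored[key] ?? {};
    },
    // Merge the patch into the stored object and persist it immediately.
    async update(patch) {
      const next = { ...(await this.read()), ...patch };
      await storageArea.set({ [key]: next });
      return next;
    },
  };
}

// In the extension this might be used as:
//   const state = createSessionState(chrome.storage.local);
//   await state.update({ session_id: 'abc' });
```

Because `read` never consults a module-level variable, a freshly restarted worker sees the same state as the one that was terminated.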
Content Security Policy is stricter in MV3. Inline scripts and remote code execution are not permitted. All logic runs from locally bundled files.
The declarativeNetRequest API replaces the older webRequest blocking API. For OpenScouter’s purposes this has no impact, as the extension observes rather than intercepts network requests.
Module Architecture
The extension is composed of five key modules.
| Module | File | Responsibility |
|---|---|---|
| Background Worker | background.js | Lifecycle management, API communication, storage coordination |
| Content Script | content.js | DOM observation, user event capture, page-level data extraction |
| Camera Module | camera.js | Facial snapshot capture via MediaDevices API |
| Audio Module | audio.js | Microphone capture and streaming for voice transcription |
| Popup UI | popup/ | Tester controls, test status, task display |
The background worker and content scripts communicate via the Chrome messaging API. Camera and audio capture run in the content script context because they require access to the page’s media permissions.
Required Permissions
The extension requests the following permissions in manifest.json.
```json
{
  "permissions": ["storage", "tabs", "activeTab", "scripting", "alarms"],
  "host_permissions": ["<all_urls>"],
  "optional_permissions": ["camera", "microphone"]
}
```

Camera and microphone permissions are declared as optional. The extension requests them at runtime only after the tester explicitly grants consent in the popup UI. If consent is denied, the test proceeds with browser events only.
The <all_urls> host permission is required because the extension must capture events on any URL the tester navigates to during the study.
7-Stage Test Lifecycle
The extension moves through seven stages during a test. Each stage transition is persisted to chrome.storage.local so the background worker can resume correctly if the service worker is terminated and restarted.
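The persisted stage transitions can be pictured as a small state machine. This is a minimal sketch under stated assumptions: the stage identifiers, the transition table, and the `stage` storage key are all illustrative, and `storageArea` is assumed to behave like chrome.storage.local.

```javascript
// Assumed transition table; stage identifiers are illustrative, not the extension's actual names.
const TRANSITIONS = {
  idle: ['offer_accepted'],
  offer_accepted: ['consent_collection'],
  consent_collection: ['test_active'],
  test_active: ['task_transition', 'test_ended'],
  task_transition: ['test_active'],
  test_ended: ['cooldown'],
  cooldown: ['idle'],
};

// Validate the move, then persist the new stage so a restarted worker resumes from storage.
async function transitionTo(storageArea, nextStage) {
  const { stage = 'idle' } = await storageArea.get('stage');
  const allowed = TRANSITIONS[stage] ?? [];
  if (!allowed.includes(nextStage)) {
    throw new Error(`Invalid transition: ${stage} -> ${nextStage}`);
  }
  await storageArea.set({ stage: nextStage });
  return nextStage;
}
```

Reading the current stage from storage on every call, rather than caching it, is what lets the worker pick up where it left off after a restart.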
Stage 1: Idle
The extension is installed but no test is active. The popup shows the tester’s login status and any pending study offers.
Stage 2: Offer Accepted
The tester accepts a study offer via Telegram or the dashboard. The popup displays the study brief and task list. No data capture is running.
Stage 3: Consent Collection
The tester is presented with consent options for camera and microphone. Selections are stored before any capture begins. The extension does not open camera or microphone access until this stage is complete.
Stage 4: Test Active
The tester clicks Start Test. The extension begins capturing browser events immediately. If camera consent was granted, facial snapshots begin at the configured interval. If microphone consent was granted, audio streaming begins.
The content script injects a minimal overlay showing the current task number and a stop button. The overlay is positioned in a fixed corner and does not interfere with the site under test.
Stage 5: Task Transitions
When the tester moves to the next task, the extension records a task boundary event. All subsequent browser events are tagged with the new task_id. This allows per-task analysis in the Analyst agent.
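Task tagging can be sketched as follows. Only `task_id` comes from the description above; `createTaskTagger`, the `task_boundary` event type, and the other field names are assumptions made for illustration. In the background worker the current task would be read from storage rather than held in a closure.

```javascript
// Hypothetical tagger: stamps each event with the task that was current when it arrived.
function createTaskTagger(initialTaskId, now = Date.now) {
  let currentTaskId = initialTaskId;
  return {
    // Produce the boundary event, then switch the tag applied to later events.
    nextTask(taskId) {
      const boundary = {
        type: 'task_boundary',
        previous_task_id: currentTaskId,
        task_id: taskId,
        timestamp: now(),
      };
      currentTaskId = taskId;
      return boundary;
    },
    // Return a copy of the event carrying the current task_id.
    tag(event) {
      return { ...event, task_id: currentTaskId };
    },
  };
}
```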
Stage 6: Test Ended
The tester clicks Stop Test. The extension flushes any buffered events to the API, stops camera and microphone capture, and sends a session_end event with a final timestamp.
Stage 7: Cooldown
After the test ends, the extension enters a brief cooldown period during which the tester can review their captured data before it is submitted for analysis. The tester can flag any segment they want excluded, for example if they needed to pause unexpectedly. Flagged segments are stored with excluded = true and are not processed by the AI pipeline.
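The exclusion flag can be illustrated with two small functions. The `excluded` field comes from the description above; the segment `id` field and both function names are hypothetical.

```javascript
// Mark the segments the tester flagged; `id` is an assumed identifier field.
function flagSegments(segments, flaggedIds) {
  const flagged = new Set(flaggedIds);
  return segments.map((segment) => ({ ...segment, excluded: flagged.has(segment.id) }));
}

// Keep only the segments that downstream processing should see.
function segmentsForAnalysis(segments) {
  return segments.filter((segment) => !segment.excluded);
}
```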
Content Script Injection
The content script is injected programmatically by the background worker when a test transitions to Stage 4. This is done using chrome.scripting.executeScript.
```javascript
chrome.scripting.executeScript({
  target: { tabId: activeTabId },
  files: ['content.js']
});
```

Programmatic injection rather than static injection in manifest.json ensures the content script only runs during active tests. This reduces the extension’s footprint on sites that are not being tested.
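The content script attaches event listeners for the following events: click, dblclick, keydown, scroll, focus, blur, input, submit, popstate, and hashchange. Listeners for DOM events are attached to the document root with { capture: true, passive: true }; popstate and hashchange are dispatched on window, so those listeners are attached there. Capture-phase listeners on the document run before handlers on the target element, so events are still recorded when a page handler on the target or a bubbling ancestor calls stopPropagation. The capture phase also delivers focus and blur, which do not bubble.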
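The listener setup could look roughly like the sketch below. The event names come from the list above; `attachCaptureListeners`, the event record shape, and the cleanup function are illustrative assumptions. Because popstate and hashchange are dispatched on window rather than on the document, the sketch registers those two on the window object.

```javascript
const DOCUMENT_EVENTS = ['click', 'dblclick', 'keydown', 'scroll', 'focus', 'blur', 'input', 'submit'];
const WINDOW_EVENTS = ['popstate', 'hashchange'];

// Attach capture-phase, passive listeners and report a minimal record for each event.
function attachCaptureListeners(doc, win, onEvent) {
  const options = { capture: true, passive: true };
  const handler = (e) => onEvent({ type: e.type, timestamp: Date.now() });
  DOCUMENT_EVENTS.forEach((name) => doc.addEventListener(name, handler, options));
  WINDOW_EVENTS.forEach((name) => win.addEventListener(name, handler, options));
  // Return a cleanup function so the listeners can be removed when the test ends.
  return () => {
    DOCUMENT_EVENTS.forEach((name) => doc.removeEventListener(name, handler, options));
    WINDOW_EVENTS.forEach((name) => win.removeEventListener(name, handler, options));
  };
}

// In a content script: const detach = attachCaptureListeners(document, window, sendToBackground);
```

Here `sendToBackground` is a placeholder for whatever function forwards the record to the background worker.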
Background Worker
The service worker handles three responsibilities.
API communication. All HTTP requests to the OpenScouter API originate from the background worker. The content script sends events to the background worker via chrome.runtime.sendMessage, and the background worker batches and sends them to the API at regular intervals. Batching reduces the number of network requests and avoids triggering rate limits during high-interaction moments.
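The hand-off from content script to background worker might be wired up as in this sketch. The message shape (`type: 'browser_event'` with a `payload`), the `tab_id` field, and `registerEventRelay` are assumptions; `runtime` is expected to look like chrome.runtime.

```javascript
// Background side: accept event messages and pass them to a queue.
// `runtime` is assumed to expose onMessage.addListener, as chrome.runtime does.
function registerEventRelay(runtime, enqueue) {
  runtime.onMessage.addListener((message, sender, sendResponse) => {
    if (message?.type !== 'browser_event') return false; // ignore unrelated messages
    enqueue({ ...message.payload, tab_id: sender.tab?.id });
    sendResponse({ ok: true });
    return false; // the response was sent synchronously
  });
}

// Content side (hypothetical message shape):
//   chrome.runtime.sendMessage({ type: 'browser_event', payload: event });
```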
Storage management. Persistent state including the active session_id, study_id, task list, and consent flags is stored in chrome.storage.local. The background worker reads and writes this state. The content script never accesses storage directly.
Alarm scheduling. The chrome.alarms API is used to trigger periodic operations such as facial snapshot intervals. Alarms survive service worker termination and re-wake the worker when they fire.
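A periodic alarm could be registered along these lines. The alarm name, `scheduleSnapshots`, and the callback are hypothetical; `alarms` is assumed to have the shape of chrome.alarms (`create` and `onAlarm.addListener`). Chrome enforces a minimum alarm period, so very short snapshot intervals may need a different mechanism.

```javascript
// Hypothetical alarm name for the snapshot timer.
const SNAPSHOT_ALARM = 'facial-snapshot';

function scheduleSnapshots(alarms, periodInMinutes, onSnapshotDue) {
  // Register the listener during worker startup so it exists when an alarm wakes the worker.
  alarms.onAlarm.addListener((alarm) => {
    if (alarm.name === SNAPSHOT_ALARM) onSnapshotDue(alarm);
  });
  // Create (or replace) the repeating alarm.
  alarms.create(SNAPSHOT_ALARM, { periodInMinutes });
}

// In the extension: scheduleSnapshots(chrome.alarms, 1, requestSnapshot);
```

`requestSnapshot` stands in for whatever function asks the content script to take a snapshot.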
Data Batching and Delivery
Browser events are queued in memory by the background worker and flushed to the API every 5 seconds or when the queue reaches 50 events, whichever comes first. An unload event on the page triggers an immediate flush to avoid losing data when the browser tab is closed.
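The flush rule above can be sketched as a small batcher. `createEventBatcher` and `send` are hypothetical; `send` stands in for the API call. One simplification to note: the timer here starts when the first event of a batch is queued, so a batch is sent at most 5 seconds after its first event rather than on a fixed clock.

```javascript
// Hypothetical batcher: flush when the queue is full or the interval elapses, whichever is first.
function createEventBatcher(send, { maxEvents = 50, intervalMs = 5000 } = {}) {
  let queue = [];
  let timer = null;

  function flush() {
    if (timer !== null) {
      clearTimeout(timer);
      timer = null;
    }
    if (queue.length === 0) return; // nothing to send
    const batch = queue;
    queue = [];
    send(batch);
  }

  function push(event) {
    queue.push(event);
    if (queue.length >= maxEvents) {
      flush(); // size threshold reached first
    } else if (timer === null) {
      timer = setTimeout(flush, intervalMs); // otherwise flush when the interval elapses
    }
  }

  return { push, flush };
}
```

Calling `flush()` directly covers the immediate-flush case, for example when the page unloads or the tester stops the test.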
Facial snapshots are uploaded individually due to their size. Each snapshot is compressed to JPEG at 70% quality before transmission.
Voice audio is streamed in 4-second chunks via a WebSocket connection maintained by the background worker.