Data Flows

OpenScouter captures three parallel data streams during every test. These streams converge at the analysis stage, where corroboration logic determines the confidence level of each finding. Understanding how data flows from the extension to the database and through the AI pipeline helps developers integrate correctly and debug unexpected behavior.

The Three Stream Schemas

Each stream has a defined schema. Events are sent to the API as they occur during the test. All three streams reference the same session_id so the analysis layer can join them by time offset.

Browser Events

Browser events capture every user interaction during the test.

{
  "session_id": "sess_abc123",
  "event_type": "click",
  "timestamp": 1710000000000,
  "x": 452,
  "y": 318,
  "target_selector": "#nav-menu > li:nth-child(3) > a",
  "target_text": "Settings",
  "scroll_depth": 0.34,
  "time_on_element_ms": 2800,
  "is_rage_click": false,
  "viewport_width": 1440,
  "viewport_height": 900,
  "url": "https://example.com/dashboard",
  "task_id": "task_7"
}

The is_rage_click field is set to true when three or more clicks occur on the same element within 500 milliseconds. Rage clicks are a strong signal for interaction friction and are weighted heavily in corroboration logic.
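
The rule above can be sketched as a sliding window over the clicks on each element. This is a minimal illustration, not OpenScouter's actual implementation: the function name flag_rage_clicks is hypothetical, the 500 ms bound is treated as inclusive, and every click in a qualifying burst is flagged. All three choices are assumptions.

```python
from collections import defaultdict

RAGE_CLICK_MIN_CLICKS = 3    # "three or more clicks" (from the text)
RAGE_CLICK_WINDOW_MS = 500   # "within 500 milliseconds" (from the text)


def flag_rage_clicks(events):
    """Return copies of the events with is_rage_click set (hypothetical helper)."""
    # Group the positions of click events by the element they targeted.
    by_target = defaultdict(list)
    for i, ev in enumerate(events):
        if ev["event_type"] == "click":
            by_target[ev["target_selector"]].append(i)

    rage = set()
    for indices in by_target.values():
        indices.sort(key=lambda i: events[i]["timestamp"])
        start = 0
        for end in range(len(indices)):
            newest = events[indices[end]]["timestamp"]
            # Shrink the window until it spans at most 500 ms.
            while newest - events[indices[start]]["timestamp"] > RAGE_CLICK_WINDOW_MS:
                start += 1
            if end - start + 1 >= RAGE_CLICK_MIN_CLICKS:
                # Assumption: every click in the burst is flagged.
                rage.update(indices[start:end + 1])

    return [{**ev, "is_rage_click": i in rage} for i, ev in enumerate(events)]
```

Non-click events come back with is_rage_click set to false, which keeps the output shape uniform for downstream consumers.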

The time_on_element_ms field measures how long the pointer remained over the target before the click. Elevated dwell time on interactive elements can indicate hesitation or confusion.

Facial Expression Snapshots

Facial snapshots are captured at configurable intervals. DeepFace processes each snapshot and returns an emotion label and confidence score.

{
  "session_id": "sess_abc123",
  "snapshot_id": "snap_xyz789",
  "timestamp": 1710000004200,
  "emotion": "confused",
  "confidence": 0.81,
  "action_units": {
    "AU04": 0.72,
    "AU07": 0.45,
    "AU23": 0.38
  },
  "capture_interval_ms": 2000,
  "face_detected": true
}

Snapshots where face_detected is false are stored but excluded from analysis. The action_units object contains Facial Action Coding System (FACS) values used by DeepFace for fine-grained expression analysis.

Emotion labels produced by DeepFace: neutral, happy, sad, angry, fearful, disgusted, surprised, confused. Only confused, angry, fearful, and disgusted are treated as barrier-corroborating signals.
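
The two filtering rules in this section (skip snapshots without a detected face, and count only the four barrier-corroborating labels) can be combined in one pass. A minimal sketch; the function name barrier_snapshots is hypothetical and the field names are taken from the sample snapshot above.

```python
# Labels treated as barrier-corroborating signals (from the text).
BARRIER_EMOTIONS = {"confused", "angry", "fearful", "disgusted"}


def barrier_snapshots(snapshots):
    """Keep snapshots that can corroborate a barrier (hypothetical helper)."""
    return [
        snap
        for snap in snapshots
        # Snapshots without a detected face are stored but not analyzed.
        if snap.get("face_detected")
        and snap.get("emotion") in BARRIER_EMOTIONS
    ]
```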

Voice Transcripts

Voice transcripts arrive as segment objects. Each segment contains the transcribed text, start and end timestamps, and a sentiment classification.

{
  "session_id": "sess_abc123",
  "segment_id": "seg_def456",
  "start_ms": 1710000003100,
  "end_ms": 1710000007900,
  "text": "Wait, where did the save button go? I was just looking at it.",
  "sentiment": "negative",
  "keywords": ["save button", "missing", "confusion"],
  "confidence": 0.94,
  "task_id": "task_7"
}

Keywords are extracted client-side before the segment is sent. The keywords array is used by the Analyst agent to speed up pattern matching across large transcripts.

Corroboration Logic

A finding’s confidence level is determined by how many independent signals point to the same moment. The corroboration engine runs during Agent 2 (Analyst) processing.

Signal Types

Signal	Description	Weight
Rage click	Three or more clicks on same element within 500 ms	High
Negative emotion	Confused, angry, fearful, or disgusted facial expression at barrier moment	Medium
Negative speech	Transcript segment with negative sentiment overlapping the event	Medium
Task abandonment	Tester navigated away without completing the task	High
Elevated dwell time	Pointer dwell on element exceeds 3x the study average	Low
Repeated backtracking	Tester visited the same URL more than twice in one task	Low
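
Three of the signals in the table reduce to simple predicates. The sketch below is illustrative only: the function names are hypothetical, the overlap test assumes the segment's start_ms and end_ms are inclusive bounds, and the backtracking check assumes the caller supplies the ordered list of URLs visited within one task.

```python
from collections import Counter

DWELL_MULTIPLIER = 3       # "exceeds 3x the study average" (from the text)
MAX_VISITS_PER_URL = 2     # "more than twice in one task" (from the text)


def has_negative_speech(event, segments):
    """True if a negative transcript segment overlaps the event's timestamp."""
    return any(
        seg["session_id"] == event["session_id"]
        and seg["sentiment"] == "negative"
        and seg["start_ms"] <= event["timestamp"] <= seg["end_ms"]
        for seg in segments
    )


def is_elevated_dwell(time_on_element_ms, study_average_ms):
    """True if pointer dwell exceeds three times the study average."""
    return time_on_element_ms > DWELL_MULTIPLIER * study_average_ms


def is_repeated_backtracking(visited_urls):
    """True if any URL was visited more than twice within one task."""
    return any(n > MAX_VISITS_PER_URL for n in Counter(visited_urls).values())
```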

Confidence Levels

Confidence is calculated from the combination of signals present within a 10-second window around the candidate barrier moment.

Combination	Confidence Level
Rage click + negative emotion + negative speech	HIGH
Rage click + either negative emotion or negative speech	HIGH
Task abandonment + either emotion or speech signal	HIGH
Any two medium-weight signals	MEDIUM
Single medium-weight signal only	MEDIUM
Single low-weight signal only	LOW
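
The table can be read as an ordered set of rules. The sketch below encodes it under stated assumptions: the signal names are hypothetical identifiers, the 10-second window is taken to be centered on the candidate moment (5 seconds either side), and any combination the table does not list returns None rather than a guessed level.

```python
HIGH_WEIGHT = {"rage_click", "task_abandonment"}
MEDIUM_WEIGHT = {"negative_emotion", "negative_speech"}
LOW_WEIGHT = {"elevated_dwell", "repeated_backtracking"}

WINDOW_MS = 10_000  # 10-second window (from the text)


def signals_in_window(signals, moment_ms, window_ms=WINDOW_MS):
    """Collect signal kinds near the candidate moment.

    `signals` is an iterable of (kind, timestamp_ms) pairs.
    Assumption: the window is centered on the moment.
    """
    half = window_ms / 2
    return {kind for kind, ts in signals if abs(ts - moment_ms) <= half}


def classify_confidence(kinds):
    """Map a set of signal kinds to HIGH, MEDIUM, LOW, or None."""
    medium = kinds & MEDIUM_WEIGHT
    # Rage click or task abandonment plus an emotion or speech signal.
    if kinds & HIGH_WEIGHT and medium:
        return "HIGH"
    # Any two medium-weight signals.
    if len(medium) >= 2:
        return "MEDIUM"
    # A single medium-weight signal and nothing else.
    if len(kinds) == 1 and medium:
        return "MEDIUM"
    # A single low-weight signal and nothing else.
    if len(kinds) == 1 and kinds & LOW_WEIGHT:
        return "LOW"
    # Not covered by the table; left undecided on purpose.
    return None
```

Returning None for unlisted combinations (for example, a rage click with no other signal) keeps the sketch from inventing behavior the table does not specify.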

HIGH confidence findings are presented to the tester first in the confirmation interface. LOW confidence findings are presented last and include a note explaining which signals were detected.

Cross-Tester Corroboration

When the Synthesizer runs after all tests for a study are complete, it applies an additional corroboration pass across testers.

If three or more testers independently confirm the same barrier at the same location, the finding’s confidence level is boosted by one tier. A MEDIUM finding confirmed by three or more testers becomes HIGH in the study-level report. This cross-tester boost is stored on the job_reports record and is visible in the final business report.
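
The one-tier boost can be sketched as follows. The function name is hypothetical, and two details are assumptions because the text does not state them: testers are counted by distinct ID, and a finding that is already HIGH stays HIGH.

```python
TIERS = ["LOW", "MEDIUM", "HIGH"]
MIN_CONFIRMING_TESTERS = 3  # "three or more testers" (from the text)


def boost_confidence(level, confirming_tester_ids):
    """Raise the level by one tier when enough distinct testers confirm."""
    distinct = len(set(confirming_tester_ids))
    if distinct < MIN_CONFIRMING_TESTERS:
        return level
    # Assumption: the boost is capped at the top tier.
    return TIERS[min(TIERS.index(level) + 1, len(TIERS) - 1)]
```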

Timeline Analysis

The Analyst agent divides each test into 30-second time buckets. This segmentation serves two purposes: it bounds the search space when clustering events, and it provides the time-axis data that renders as the activity timeline in the dashboard.
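
Assigning an event to its bucket is integer division on the offset from the start of the test. A minimal sketch; the function names are hypothetical, and it assumes buckets are aligned to the test start and are half-open (the end bound belongs to the next bucket).

```python
from collections import defaultdict

BUCKET_MS = 30_000  # 30-second buckets (from the text)


def bucket_bounds(timestamp_ms, test_start_ms):
    """Return (bucket_start_ms, bucket_end_ms) for a timestamp."""
    index = (timestamp_ms - test_start_ms) // BUCKET_MS
    start = test_start_ms + index * BUCKET_MS
    return start, start + BUCKET_MS


def bucketize(events, test_start_ms):
    """Group events by the start time of the bucket they fall into."""
    buckets = defaultdict(list)
    for ev in events:
        start, _ = bucket_bounds(ev["timestamp"], test_start_ms)
        buckets[start].append(ev)
    return dict(buckets)
```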

Event Clustering

Within each 30-second bucket, events are clustered by proximity. Proximity is measured in two dimensions: time offset within the bucket, and spatial position on the page.

Events that occur within 2 seconds of each other and within 200 pixels of each other are treated as a single interaction cluster. A cluster containing a rage click, a negative emotion snapshot, and a negative transcript segment triggers a HIGH confidence candidate regardless of the individual signal weights.
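
One way to realize this proximity rule is single-linkage clustering with a union-find structure, sketched below. This is a swapped-in illustration, not the Analyst agent's documented algorithm. Assumptions: distance is Euclidean, both thresholds are inclusive, linking is transitive, and items without x and y coordinates (such as snapshots or transcript segments) are linked on time alone.

```python
import math

MAX_GAP_MS = 2_000       # "within 2 seconds" (from the text)
MAX_DISTANCE_PX = 200    # "within 200 pixels" (from the text)


def is_near(a, b):
    """True if two items are close in time and, where known, in space."""
    if abs(a["timestamp"] - b["timestamp"]) > MAX_GAP_MS:
        return False
    if "x" in a and "x" in b:
        return math.hypot(a["x"] - b["x"], a["y"] - b["y"]) <= MAX_DISTANCE_PX
    # Assumption: items without coordinates are matched on time only.
    return True


def cluster_events(events):
    """Group events into interaction clusters (single linkage)."""
    parent = list(range(len(events)))

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(events)):
        for j in range(i + 1, len(events)):
            if is_near(events[i], events[j]):
                parent[find(i)] = find(j)

    clusters = {}
    for i, ev in enumerate(events):
        clusters.setdefault(find(i), []).append(ev)
    return list(clusters.values())
```

The pairwise comparison is quadratic in the number of events, which is acceptable here because each run is bounded by a single 30-second bucket.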

Bucket Metadata

Each time bucket produces a metadata object stored with the session_notes record.

{
  "bucket_start_ms": 1710000000000,
  "bucket_end_ms": 1710000030000,
  "event_count": 14,
  "rage_click_count": 2,
  "dominant_emotion": "confused",
  "negative_speech_segments": 1,
  "candidate_barriers": 1,
  "friction_score": 0.78
}

The friction_score is a normalized value between 0 and 1. Buckets with a friction score above 0.6 are highlighted in the dashboard timeline view.
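
Selecting the buckets to highlight is a threshold filter over the metadata objects. A minimal sketch with a hypothetical function name; how friction_score itself is computed is not described here, so the sketch only consumes the stored value.

```python
FRICTION_HIGHLIGHT_THRESHOLD = 0.6  # from the text


def highlighted_buckets(buckets):
    """Return bucket metadata objects whose friction score exceeds 0.6."""
    return [
        b for b in buckets
        if b["friction_score"] > FRICTION_HIGHLIGHT_THRESHOLD
    ]
```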

Data Retention

Raw stream data is retained for a limited period after a test completes. session_events and facial_snapshots records are deleted after 90 days. voice_segments records are deleted after 30 days because of the sensitivity of audio content.

Confirmed session_notes records and the derived reports and job_reports records are retained indefinitely unless the organization requests deletion.
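
The retention periods can be expressed as a lookup from record type to a deletion date. A minimal sketch; the function name is hypothetical, and it assumes the retention clock starts at the test's completion time. Record types not in the table return None, meaning they are kept until the organization requests deletion.

```python
from datetime import datetime, timedelta, timezone

# Retention periods in days (from the text).
RETENTION_DAYS = {
    "session_events": 90,
    "facial_snapshots": 90,
    "voice_segments": 30,
}


def deletion_due_at(record_type, test_completed_at):
    """Return when records of this type become due for deletion, or None."""
    days = RETENTION_DAYS.get(record_type)
    if days is None:
        # session_notes, reports, job_reports: retained indefinitely.
        return None
    return test_completed_at + timedelta(days=days)


completed = datetime(2024, 3, 1, tzinfo=timezone.utc)
print(deletion_due_at("voice_segments", completed))  # 2024-03-31 00:00:00+00:00
```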