System Overview
OpenScouter is a multi-layer platform that connects neurodivergent testers with businesses commissioning accessibility studies. The architecture centers on a 4-agent AI pipeline that processes live test data, applies corroboration logic, and generates structured reports. A human-in-the-loop gate ensures no AI-generated finding reaches a report without tester confirmation.
Platform Components
OpenScouter consists of five primary components that work together across each study lifecycle.
| Component | Technology | Role |
|---|---|---|
| Web Application | Next.js 14, React | Business and tester dashboards, study management |
| Chrome Extension | Manifest V3 | Live data capture during tests |
| API Layer | Next.js API routes, Supabase | Data persistence, webhook handling, agent orchestration |
| AI Pipeline | OpenAI, Anthropic, Google Gemini | 4-agent sequential analysis |
| Database | Supabase (PostgreSQL) | Structured storage with Row-Level Security |
The 4-Agent AI Pipeline
Each test moves through four agents in sequence. Agents operate on data produced by earlier stages. The pipeline is designed so failures in one stage degrade gracefully rather than blocking the entire study.
Agent 1: Scouty
Endpoint: POST /api/ai
Scouty handles real-time chat and initial analysis during a live test. It ingests the three parallel data streams from the Chrome extension as events arrive: browser interactions, facial expressions, and voice transcripts. Scouty identifies candidate barrier moments by correlating signals across all three streams and maintains a running list of moments with preliminary severity flags.
Scouty is the only agent that operates in real time during the test. All other agents process data after the test concludes.
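The cross-stream correlation described above can be pictured as grouping events that fall close together in time and keeping the groups where more than one stream contributed a signal. The following is a minimal sketch under stated assumptions, not OpenScouter's actual logic: the `StreamEvent` shape, the 2000 ms window, and the "at least two streams" rule are all hypothetical.

```typescript
// Hypothetical event shape; field names are illustrative assumptions.
type StreamKind = "browser" | "face" | "voice";

interface StreamEvent {
  stream: StreamKind;
  timestampMs: number; // milliseconds since the test started
  signal: string;      // e.g. "rage_click", "frown", "sigh"
}

interface CandidateMoment {
  startMs: number;
  endMs: number;
  streams: StreamKind[];
  events: StreamEvent[];
}

// Group time-ordered events into windows anchored at the first event of each
// group, and keep windows where at least `minStreams` distinct streams fired.
function findCandidateMoments(
  events: StreamEvent[],
  windowMs = 2000, // assumed window size
  minStreams = 2,  // assumed corroboration threshold
): CandidateMoment[] {
  const sorted = events.slice().sort((a, b) => a.timestampMs - b.timestampMs);
  const moments: CandidateMoment[] = [];
  let group: StreamEvent[] = [];
  let groupStart = 0;
  let groupEnd = 0;

  const flush = (): void => {
    const streams: StreamKind[] = [];
    for (const e of group) {
      if (streams.indexOf(e.stream) === -1) streams.push(e.stream);
    }
    if (streams.length >= minStreams) {
      moments.push({ startMs: groupStart, endMs: groupEnd, streams, events: group });
    }
  };

  for (const event of sorted) {
    // Close the current group once an event falls outside its window.
    if (group.length > 0 && event.timestampMs - groupStart > windowMs) {
      flush();
      group = [];
    }
    if (group.length === 0) groupStart = event.timestampMs;
    group.push(event);
    groupEnd = event.timestampMs;
  }
  flush();
  return moments;
}
```

A real-time version would process events incrementally as they arrive rather than sorting a complete array; the batch form is used here only to keep the grouping rule easy to read.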
Agent 2: Analyst
Endpoint: POST /api/sessions/[id]/analyze
The Analyst runs immediately after a test ends. It takes Scouty’s candidate list and performs deep-dive analysis on each moment. The Analyst maps every barrier to the relevant WCAG success criterion, assigns a severity rating, and identifies cross-task patterns where the same barrier surfaces multiple times.
The Analyst produces the structured finding objects that the tester reviews in the confirmation interface. Its output is stored as session_notes records with a confirmed flag set to false.
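As an illustration of the Analyst's output, the sketch below converts findings into unconfirmed records and detects cross-task patterns. Only the `session_notes` table name and the `confirmed` flag come from the description above; every other field name, the severity scale, and the "same WCAG criterion in more than one task" rule are assumptions made for the example.

```typescript
type Severity = "low" | "medium" | "high" | "critical"; // assumed scale

interface AnalystFinding {
  taskId: string;
  description: string;
  wcagCriterion: string; // success criterion number, e.g. "1.4.3"
  severity: Severity;
}

// Hypothetical row shape for a session_notes record.
interface SessionNote extends AnalystFinding {
  sessionId: string;
  confirmed: boolean;
}

// Every note starts unconfirmed; only the tester can flip the flag later.
function toSessionNotes(sessionId: string, findings: AnalystFinding[]): SessionNote[] {
  return findings.map((f) => ({ ...f, sessionId, confirmed: false }));
}

// Return the WCAG criteria that surface in more than one distinct task.
function crossTaskPatterns(findings: AnalystFinding[]): string[] {
  const tasksByCriterion: Record<string, string[]> = {};
  for (const f of findings) {
    const tasks = tasksByCriterion[f.wcagCriterion] ?? [];
    if (tasks.indexOf(f.taskId) === -1) tasks.push(f.taskId);
    tasksByCriterion[f.wcagCriterion] = tasks;
  }
  return Object.keys(tasksByCriterion).filter(
    (criterion) => (tasksByCriterion[criterion] ?? []).length > 1,
  );
}
```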
Human-in-the-Loop Gate
After Agent 2 completes, the pipeline pauses. The tester reviews every AI-generated finding, confirming, rejecting, or annotating each one. Agent 3 does not run until the tester sets notes_confirmed = true on the study.
This gate is not optional and cannot be bypassed via the API. It is a core product invariant that keeps tester expertise authoritative over AI output.
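One way to enforce an invariant like this is a server-side check that runs before any Report Writer work starts. The sketch below is an assumption about how such a guard might look, not the platform's code: the `Study` shape, the function name, and the 409 status are hypothetical, and only the `notes_confirmed` field comes from the text.

```typescript
// Hypothetical minimal view of a study row.
interface Study {
  id: string;
  notes_confirmed: boolean;
}

type GateResult =
  | { allowed: true }
  | { allowed: false; status: number; reason: string };

// Refuse to proceed unless the tester has confirmed the findings.
function checkConfirmationGate(study: Study): GateResult {
  if (study.notes_confirmed === true) {
    return { allowed: true };
  }
  return {
    allowed: false,
    status: 409, // assumed status code for "precondition not met"
    reason: `Study ${study.id} has unconfirmed findings; the Report Writer cannot run.`,
  };
}
```

Placing the check inside the handler, rather than relying on the client to call endpoints in order, is what makes a gate of this kind hard to bypass through the API.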
Agent 3: Report Writer
Endpoint: POST /api/reports
Trigger condition: notes_confirmed = true on the parent study
The Report Writer transforms confirmed findings into the individual tester report. It generates both technical descriptions and Plain English summaries for every finding. It structures output according to the selected tone profile and produces WCAG references with success criterion codes and conformance levels.
The Report Writer creates one reports record per tester per study.
Agent 4: Synthesizer
Endpoint: POST /api/reports/job/[id]
The Synthesizer runs after all tester reports for a study are complete. It performs cross-test analysis, comparing findings across all testers to identify which barriers are widespread versus profile-specific. It produces the ND stratification data that appears in the final business-facing study report and stores output as a job_reports record.
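The widespread versus profile-specific distinction can be sketched as a grouping step over the individual tester reports. This example rests on assumptions: the `TesterReport` shape, the idea that each tester has a single profile label, and the rule that a barrier is "widespread" when testers from at least two distinct profiles report it are all hypothetical.

```typescript
interface TesterReport {
  testerId: string;
  profile: string;      // hypothetical profile label for the tester
  barrierIds: string[]; // hypothetical stable identifiers for findings
}

interface StratifiedBarrier {
  barrierId: string;
  profiles: string[];
  testerCount: number;
  scope: "widespread" | "profile-specific";
}

function stratify(reports: TesterReport[], minProfiles = 2): StratifiedBarrier[] {
  const byBarrier: Record<string, { profiles: string[]; testers: string[] }> = {};

  // Collect the distinct profiles and testers that reported each barrier.
  for (const report of reports) {
    for (const barrierId of report.barrierIds) {
      const entry = byBarrier[barrierId] ?? { profiles: [], testers: [] };
      if (entry.profiles.indexOf(report.profile) === -1) entry.profiles.push(report.profile);
      if (entry.testers.indexOf(report.testerId) === -1) entry.testers.push(report.testerId);
      byBarrier[barrierId] = entry;
    }
  }

  return Object.keys(byBarrier)
    .sort()
    .map((barrierId): StratifiedBarrier => {
      const entry = byBarrier[barrierId] ?? { profiles: [], testers: [] };
      return {
        barrierId,
        profiles: entry.profiles,
        testerCount: entry.testers.length,
        scope: entry.profiles.length >= minProfiles ? "widespread" : "profile-specific",
      };
    });
}
```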
Model Strategy
The pipeline uses multiple AI providers: OpenAI, Anthropic (Claude), and Google Gemini. The platform implements automatic failover across providers. When the primary provider is unavailable or returns an error, requests are routed to the next available provider.
Every AI response includes a usedFallback boolean field. This field is stored on the relevant database record so monitoring queries can track fallback rates over time. A sustained increase in usedFallback = true records indicates primary provider degradation.
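The failover behavior can be sketched as trying an ordered list of providers and recording whether any provider other than the first produced the answer. This is a minimal illustration, not the platform's implementation: the `Provider` interface and `completeWithFailover` are hypothetical, and the providers are plain functions rather than real SDK clients. Only the `usedFallback` field name comes from the text.

```typescript
// Hypothetical provider abstraction; real SDK calls would sit behind `complete`.
interface Provider {
  name: string;
  complete: (prompt: string) => Promise<string>;
}

interface AiResult {
  text: string;
  provider: string;
  usedFallback: boolean;
}

// Try providers in order. The first entry is treated as the primary, so any
// success from a later entry is reported with usedFallback = true.
async function completeWithFailover(
  providers: Provider[],
  prompt: string,
): Promise<AiResult> {
  const errors: string[] = [];
  let index = 0;
  for (const provider of providers) {
    try {
      const text = await provider.complete(prompt);
      return { text, provider: provider.name, usedFallback: index > 0 };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      errors.push(`${provider.name}: ${message}`);
    }
    index++;
  }
  throw new Error(`All providers failed: ${errors.join("; ")}`);
}
```

Persisting `usedFallback` alongside each result, as described above, lets a monitoring query compute the fallback rate as the share of records where the flag is true over a given period.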
Tone Profiles
The Report Writer applies a tone profile when generating report text. Tone profiles affect word choice and framing but do not alter the factual content of findings.
| Profile | Description |
|---|---|
| Standard | Direct, neutral technical language. Suitable for developer-facing reports. |
| Supportive | Encouraging framing. Acknowledges tester effort and product progress. |
| Moderate | Balanced tone between Standard and Supportive. General-purpose default. |
| Restorative | Trauma-informed language for studies involving sensitive user groups. |
The tone profile is set at the study level when the business creates the study. It cannot be changed after the Report Writer has run.
Data Flow Summary
The following stages describe how data moves from a live test to a delivered report.
- Tester installs the Chrome extension and begins a test
- Extension streams browser events, facial snapshots, and voice transcripts to the API
- Scouty processes incoming events and maintains candidate findings in real time
- Test ends; Analyst runs and produces structured session_notes records
- Tester reviews findings in the confirmation interface and sets notes_confirmed = true
- Report Writer runs and creates a reports record for the tester
- When all testers for a study have confirmed, the Synthesizer runs and creates a job_reports record
- Business receives the completed study report
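The stages above can be summarized as a per-tester progression plus a study-level readiness check for the Synthesizer. The sketch below is illustrative only: the stage names, the `TesterProgress` shape, and the choice to track confirmation per tester are assumptions, not fields from the OpenScouter schema.

```typescript
type TesterStage =
  | "testing"
  | "awaiting_confirmation"
  | "writing_report"
  | "report_ready";

// Hypothetical progress flags for one tester in one study.
interface TesterProgress {
  testerId: string;
  testEnded: boolean;
  notesConfirmed: boolean;
  hasReport: boolean;
}

// Derive the current stage from the flags, in pipeline order.
function stageOf(progress: TesterProgress): TesterStage {
  if (!progress.testEnded) return "testing";
  if (!progress.notesConfirmed) return "awaiting_confirmation";
  if (!progress.hasReport) return "writing_report";
  return "report_ready";
}

// The Synthesizer may run only when every tester has a finished report.
function synthesizerMayRun(testers: TesterProgress[]): boolean {
  return testers.length > 0 && testers.every((t) => stageOf(t) === "report_ready");
}
```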
Technology Decisions
Next.js API routes handle all server-side logic. This keeps the API collocated with the frontend without requiring a separate server process.
Supabase provides the PostgreSQL database, authentication, Row-Level Security, and real-time subscriptions. RLS policies enforce data isolation between organizations at the database level, not just the application level.
Manifest V3 is used for the Chrome extension because it is the current and supported extension platform. In Manifest V3, service workers replace persistent background pages and can be terminated when idle, so the extension's handling of state between events had to be restructured.
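A common way to cope with a service worker that may be stopped between events is to keep no state in module-level variables and to read and write it through a storage abstraction on every event. The sketch below shows that pattern under assumptions: the `KeyValueStore` interface, the `CaptureState` shape, and the `captureState` key are hypothetical, and nothing here is taken from the OpenScouter extension.

```typescript
// Hypothetical storage abstraction so the logic can run outside the browser.
interface KeyValueStore {
  get(key: string): Promise<unknown>;
  set(key: string, value: unknown): Promise<void>;
}

interface CaptureState {
  sessionId: string;
  eventCount: number;
}

// Load state, update it, and persist it again on every event, so nothing is
// lost if the service worker is stopped before the next event arrives.
async function recordEvent(store: KeyValueStore, sessionId: string): Promise<CaptureState> {
  const saved = (await store.get("captureState")) as CaptureState | undefined;
  const current: CaptureState =
    saved && saved.sessionId === sessionId ? saved : { sessionId, eventCount: 0 };
  const next: CaptureState = { sessionId, eventCount: current.eventCount + 1 };
  await store.set("captureState", next);
  return next;
}

// In-memory store for local experiments and tests.
function memoryStore(): KeyValueStore {
  const data: Record<string, unknown> = {};
  return {
    get: async (key: string): Promise<unknown> => data[key],
    set: async (key: string, value: unknown): Promise<void> => {
      data[key] = value;
    },
  };
}
```

Inside an extension, the same interface could be backed by `chrome.storage.session` or `chrome.storage.local` instead of the in-memory store.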