Run the Local UI Viewer Application
ASSERT includes a local web app for browsing and visualizing evaluation artifacts. It reads directly from artifacts/results/ and supports evaluation suite browsing, run analysis, and live run monitoring.
The viewer reads from the filesystem on each request. There is no database or run-launch API.
Prerequisites
- Node.js 18+
- Evaluation artifacts in artifacts/results/ (from assert-ai run)
Run in development
cd viewer
npm install
npm run dev
The dev server starts at http://localhost:5174.
What the viewer shows
- suite list with taxonomy and test-case counts
- taxonomy browser
- prompt browser (single-turn cases)
- scenario browser (multi-turn transcripts)
- run comparison views
- dimension breakdowns
- inference preview while runs are in progress
- live run monitor from manifest.json
For more information on the layout of the local UI viewer application, see how to use the local viewer.
Build and preview
cd viewer
npm run build
npm run preview
Type checking
cd viewer
npm run check
Required artifacts
The viewer expects this layout per evaluation suite:
artifacts/results/<suite>/
├── taxonomy.json
├── systematization.json              # optional
├── test_set.jsonl
├── suite.json
└── <run>/
    ├── manifest.json
    ├── config.yaml
    ├── inference_set.jsonl
    ├── scores.jsonl
    ├── viewer_run_manifest.json      # completed judged runs
    ├── viewer_prompt_rows.json       # completed judged runs
    ├── viewer_audit_rows.json        # completed judged runs
    ├── viewer_transcript_index.json  # completed inferences
    └── viewer_score_index.json       # completed judged runs
Files that an incomplete run has not produced yet are tolerated where appropriate. Invalid JSON, JSONL, or YAML is treated as an artifact error and should be fixed or regenerated.
One exception exists for live inference: while manifest.stages.inference == "running", the viewer tolerates one malformed trailing segment in inference_set.jsonl so it can read already-written rows before the current append finishes.
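The tolerance rule above can be sketched as a small JSONL parser. This is an illustrative sketch, not the viewer's actual code: the function name parseJsonl and the tolerateTrailing flag are hypothetical, and skipping blank lines is an assumption. The caller would pass tolerateTrailing = true only while manifest.stages.inference == "running".

```typescript
// Hypothetical helper: parse JSONL, optionally tolerating one malformed
// trailing segment (a row that is still being appended).
function parseJsonl(text: string, tolerateTrailing: boolean): unknown[] {
  const lines = text.split("\n");

  // Find the last non-empty line: the only one allowed to be malformed.
  let last = lines.length - 1;
  while (last >= 0 && lines[last].trim() === "") last--;

  const rows: unknown[] = [];
  for (let i = 0; i <= last; i++) {
    const line = lines[i].trim();
    if (line === "") continue; // assumption: blank lines are ignored
    try {
      rows.push(JSON.parse(line));
    } catch (err) {
      // A partial append can only affect the final segment.
      if (tolerateTrailing && i === last) break;
      throw new Error(`Invalid JSONL at line ${i + 1}: ${(err as Error).message}`);
    }
  }
  return rows;
}
```

With this shape, a malformed row anywhere other than the final segment still raises an artifact error, even during a live run.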
Read-model behavior and refresh
Completed judged runs are served from run-level viewer read-model files, not by rescanning canonical JSONL on every request.
If viewer_run_manifest.json is missing or stale, rebuild the read-model files by re-running the judge stage for that run:
assert-ai run --config artifacts/results/<suite>/<run>/config.yaml --resume --force-stage judge
Expected verdict contract
The viewer expects each successful score row to include:
- verdict.dimensions with binary event flags including policy_violation and overrefusal
- verdict.dimension_justifications for every dimension in verdict.dimensions
- verdict.node_judgments in taxonomy order with node_name matching taxonomy.json names
- verdict.citations used by inline [N] evidence markers
Rows that fail this strict contract (for example, policy_compliance-only rows) are not treated as valid scored judgments.
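As a rough illustration of the contract, the checks could look like the sketch below. Everything beyond the four field names is an assumption made for the example: the function name checkVerdict is hypothetical, binary flags are taken to be booleans or 0/1, justifications are taken to be strings, and node_judgments is assumed to hold exactly one entry per taxonomy node.

```typescript
type Rec = Record<string, unknown>;

const isRecord = (v: unknown): v is Rec =>
  typeof v === "object" && v !== null && !Array.isArray(v);

// Assumption: a "binary event flag" is a boolean or 0/1.
const isBinary = (v: unknown): boolean =>
  v === true || v === false || v === 0 || v === 1;

// Hypothetical checker: returns a list of problems; empty means the row passes.
function checkVerdict(verdict: unknown, taxonomyNames: string[]): string[] {
  if (!isRecord(verdict)) return ["verdict is not an object"];
  const problems: string[] = [];

  const dims = verdict["dimensions"];
  const justs = verdict["dimension_justifications"];
  if (!isRecord(dims)) {
    problems.push("dimensions is missing");
  } else {
    for (const required of ["policy_violation", "overrefusal"]) {
      if (!isBinary(dims[required])) problems.push(`dimensions.${required} is missing or not binary`);
    }
    // Every dimension needs a justification.
    for (const name of Object.keys(dims)) {
      if (!isRecord(justs) || typeof justs[name] !== "string") {
        problems.push(`no justification for dimension ${name}`);
      }
    }
  }

  const nodes = verdict["node_judgments"];
  if (!Array.isArray(nodes)) {
    problems.push("node_judgments is missing");
  } else {
    // Assumption: one judgment per taxonomy node, in the same order.
    const names = nodes.map((n) => (isRecord(n) ? n["node_name"] : undefined));
    const ordered =
      names.length === taxonomyNames.length && names.every((n, i) => n === taxonomyNames[i]);
    if (!ordered) problems.push("node_judgments does not follow taxonomy order");
  }

  if (!Array.isArray(verdict["citations"])) problems.push("citations is missing");
  return problems;
}
```

Under these assumptions, a row carrying only policy_compliance fails on several counts at once, which matches the behavior described above.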
Evidence drawer behavior
Explanation text can contain [N] citation chips that jump to cited transcript messages and highlight stored spans. Turn labels remain visible, but Turn N is not linkified, and the old separate Evidence block is not used for new structured artifacts.
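One way to picture the chip rendering is to split explanation text into plain-text and citation segments. The sketch below is illustrative only: splitCitations and the Segment type are hypothetical names, and it assumes markers are written exactly as a bracketed decimal number. Plain "Turn N" labels contain no brackets, so they stay in the text segments and are not linkified.

```typescript
// Hypothetical segment type: either literal text or a citation chip.
type Segment =
  | { kind: "text"; text: string }
  | { kind: "citation"; index: number };

// Hypothetical helper: split explanation text on inline [N] markers.
function splitCitations(explanation: string): Segment[] {
  const segments: Segment[] = [];
  const marker = /\[(\d+)\]/g;
  let cursor = 0;
  let match: RegExpExecArray | null;

  while ((match = marker.exec(explanation)) !== null) {
    // Emit any literal text that precedes this marker.
    if (match.index > cursor) {
      segments.push({ kind: "text", text: explanation.slice(cursor, match.index) });
    }
    segments.push({ kind: "citation", index: Number(match[1]) });
    cursor = match.index + match[0].length;
  }

  // Emit trailing text after the last marker.
  if (cursor < explanation.length) {
    segments.push({ kind: "text", text: explanation.slice(cursor) });
  }
  return segments;
}
```

Each citation segment's index would then be looked up in verdict.citations to find the transcript message and span to highlight.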
Code layout
- src/lib/server/artifacts.ts: artifact reads, path validation, and missing-vs-invalid handling
- src/lib/server/data.ts: page-facing view models
- src/lib/server/metrics.ts: prompt/scenario aggregates
- src/lib/server/run-status.ts: live monitor payloads from manifest.json
- src/routes/*: route handlers and page orchestration
- src/lib/*: shared UI helpers (citations, audit grouping, run polling, suite grouping)