Results and artifacts
ASSERT writes evaluation results and other local artifacts under the artifacts folder, grouped by evaluation suite (the suite is configured in each evaluation config YAML):
artifacts/results/<suite>/
Run-level outputs are located under each evaluation suite:
artifacts/results/<suite>/<run>/
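The directory convention above can be walked with a few lines of Python. This is a minimal sketch, not part of ASSERT: `list_runs` is a hypothetical helper, and it assumes that every subdirectory of a suite directory is a run, which matches the layout described here but is not otherwise verified.

```python
from pathlib import Path


def list_runs(results_root="artifacts/results"):
    """Map each suite directory name to the sorted names of its run subdirectories.

    Assumption: any subdirectory of a suite directory is treated as a run.
    """
    root = Path(results_root)
    if not root.is_dir():
        # Nothing has been written yet, or the path is wrong.
        return {}
    runs = {}
    for suite_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Suite-level files are skipped; only directories are kept.
        runs[suite_dir.name] = sorted(
            p.name for p in suite_dir.iterdir() if p.is_dir()
        )
    return runs


if __name__ == "__main__":
    for suite, run_names in list_runs().items():
        print(suite, run_names)
```

For day-to-day use, the results commands of the assert-ai CLI are likely the better entry point; a script like this is mainly useful when post-processing many runs.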
Artifact layout and description
artifacts/results/<suite>/
├── suite.json
├── taxonomy.json
├── test_set.jsonl
└── <run>/
    ├── manifest.json
    ├── config.yaml
    ├── inference_set.jsonl
    ├── scores.jsonl
    └── metrics.json
suite.json: evaluation suite metadata
taxonomy.json: behavior categories generated from your evaluation config YAML in the systematization step of the pipeline
test_set.jsonl: single-turn prompt and multi-turn scenario test cases generated by the test set generation step of the pipeline
manifest.json: stage-by-stage run status and timestamps
config.yaml: frozen config snapshot used for this run
inference_set.jsonl: target outputs plus trace references/events
scores.jsonl: per-case judge verdicts, dimensions, and evidence
metrics.json: aggregate rates by dimension and category, along with token usage metadata
Tip: After a run, start with metrics.json, then review scores.jsonl before inspecting inference_set.jsonl more closely.
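To follow that reading order from a script, the sketch below loads metrics.json and scores.jsonl from a run directory. The helper names (`read_json`, `read_jsonl`, `load_run`) are hypothetical and not part of ASSERT. The field names inside these files are not documented in this section, so the code only parses the files and makes no assumption about their schema.

```python
import json
from pathlib import Path


def read_json(path):
    """Parse a single JSON document from a file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def read_jsonl(path):
    """Parse a JSON Lines file into a list, skipping blank lines."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def load_run(run_dir):
    """Load the aggregate metrics and per-case scores for one run directory."""
    run_dir = Path(run_dir)
    return {
        "metrics": read_json(run_dir / "metrics.json"),
        "scores": read_jsonl(run_dir / "scores.jsonl"),
    }
```

Calling `load_run` with a path of the form artifacts/results/<suite>/<run> returns both artifacts in one dictionary; printing `run["metrics"]` and `len(run["scores"])` gives a quick overview before opening inference_set.jsonl.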
Useful CLI commands for viewing results
assert-ai results list
assert-ai results status <suite>
assert-ai results status <suite> <run>
assert-ai results compare <suite> <run-a> <run-b>
assert-ai results compare-suites <suite-a>/<run-a> <suite-b>/<run-b>
See CLI Commands for full options.
View evaluation suite artifacts and run results in a local UI app
The viewer is a local inspector and editing application for viewing run status and evaluation suite artifacts, such as a richly rendered taxonomy of behavior categories and their associated policy labels.
cd viewer
npm install
npm run dev
The locally hosted UI server starts at http://localhost:5174. Open this URL in your browser to view the inspector.