View results in Local UI Viewer Application
Portable local artifacts are useful on their own, but ASSERT also ships with a locally hosted UI viewer: a web application that renders the results and artifacts richly to support analysis.
Evaluation Suite list
This is the first page you'll land on. Use the Evaluation Suite list page to quickly scan the available suites and jump to a specific suite, which contains its taxonomies, test sets, and evaluation results.

You can also click the "New evaluation" button in the top right to walk through a guided UI wizard that helps you author an evaluation config and create a new evaluation suite.
Create a new evaluation flow
The "Create new evaluation" flow guides you through selecting source artifacts and run settings. In just three steps, set up the whole evaluation pipeline and hit run.
1. Input specification
First, define your behavior name and description, or reuse one that you've already run in a different evaluation suite. Select the target type you'd like to evaluate (currently only a hosted model or a prompt agent is supported), and fill in the application context and system prompt used by your model or prompt agent.

2. Category and evaluation set
Next, set up your evaluation pipeline, which consists of the systematization, test set generation, and judge stages. Choose the model you want to use for each stage of the pipeline, along with its parameters.

3. Summary and submit
Finally, review your evaluation configuration and submit the run. You'll be redirected to a monitoring page where you can track the run status and see when the pipeline has finished running.
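
The three wizard steps above can be pictured as one draft that is checked before submission. The sketch below is purely illustrative: the class names, field names, target type labels, and stage keys are assumptions made for this example and are not ASSERT's actual config schema.

```python
from dataclasses import dataclass, field

# Hypothetical names throughout: they mirror the wizard steps,
# not ASSERT's real evaluation config format.
SUPPORTED_TARGETS = ("hosted_model", "prompt_agent")
REQUIRED_STAGES = ("systematization", "test_set_generation", "judge")


@dataclass
class StageSettings:
    model: str
    parameters: dict = field(default_factory=dict)


@dataclass
class EvaluationDraft:
    # Step 1: input specification
    behavior_name: str
    behavior_description: str
    target_type: str
    application_context: str
    system_prompt: str
    # Step 2: one StageSettings entry per pipeline stage
    stages: dict = field(default_factory=dict)


def missing_pieces(draft: EvaluationDraft) -> list:
    """Step 3: list what is still missing before the draft could be submitted."""
    problems = []
    if not draft.behavior_name.strip():
        problems.append("behavior_name is empty")
    if draft.target_type not in SUPPORTED_TARGETS:
        problems.append(f"unsupported target_type: {draft.target_type}")
    for stage in REQUIRED_STAGES:
        if stage not in draft.stages:
            problems.append(f"no model configured for stage: {stage}")
    return problems


draft = EvaluationDraft(
    behavior_name="example-behavior",
    behavior_description="Placeholder description",
    target_type="hosted_model",
    application_context="Placeholder context",
    system_prompt="You are a helpful assistant.",
    stages={
        "systematization": StageSettings(model="model-a"),
        "test_set_generation": StageSettings(model="model-a"),
    },
)
print(missing_pieces(draft))
```

Here the draft lacks a judge stage, so the check reports it. In the viewer, the summary step plays this role by letting you review the configuration before the run starts.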

Evaluation suite overview tabs
The suite details page contains tabs for taxonomy, test set content, and run-level evaluation results.
1. Review taxonomy
Take a look at the generated taxonomy and the encoded policies (permissible/not permissible), with the definition of each behavior category and its policy label.

2. Review generated test set
Browse the single-turn prompt test cases and multi-turn scenarios generated from the taxonomy. To share the test set easily, you can download it as a .csv file to your local machine.
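
Once downloaded, the .csv file can be inspected with standard tooling. The following is a minimal sketch that counts test cases per category using only the Python standard library. The column names (`category`, `prompt`) and the sample rows are assumptions for illustration; check the header row of your own download for the real column names.

```python
import csv
import io
from collections import Counter


def summarize_test_set(csv_text: str, category_column: str = "category") -> Counter:
    """Count test cases per category in the text of a test set CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter()
    for row in reader:
        # Rows without a usable category value are grouped under a placeholder.
        counts[row.get(category_column) or "(uncategorized)"] += 1
    return counts


# Hypothetical sample standing in for a downloaded file.
sample_csv = (
    "category,prompt\n"
    "refusal,Example prompt A\n"
    "refusal,Example prompt B\n"
    "tone,Example prompt C\n"
)

for category, count in summarize_test_set(sample_csv).most_common():
    print(category, count)
```

For a real file, read it with `open(path, newline="", encoding="utf-8")` and pass the contents to `summarize_test_set`.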

3. Review evaluation results
Take a look at all the evaluation runs completed in the evaluation suite, along with their high-level metrics.

Evaluation run summary and result tables
Within a single run, the viewer exposes a high-level summary plus row-level result drilldowns.

View all the rows of the evaluation run as a flat list or grouped by judge dimension.
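
The difference between the two views can be sketched in a few lines: the flat list is the rows as they are, and the grouped view buckets the same rows by their judge dimension. The row shape and the `dimension` and `verdict` keys are hypothetical, chosen only for this example.

```python
from collections import defaultdict


def group_by_dimension(rows, key="dimension"):
    """Bucket result rows by judge dimension, preserving row order within each bucket."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[key]].append(row)
    return dict(grouped)


# Hypothetical result rows; real rows carry more fields.
rows = [
    {"id": "r1", "dimension": "safety", "verdict": "pass"},
    {"id": "r2", "dimension": "accuracy", "verdict": "fail"},
    {"id": "r3", "dimension": "safety", "verdict": "fail"},
]

for dimension, members in group_by_dimension(rows).items():
    print(dimension, [row["id"] for row in members])
```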

When you click a specific row, the viewer opens a detailed view of each interaction, with the judge verdict, confidence, and citations.

Compare runs
Use the compare view to inspect differences across runs side by side. Optionally, toggle to see where the runs disagree.
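
Conceptually, a disagreement is a test case that both runs evaluated but judged differently. The sketch below illustrates that idea, assuming each run can be reduced to a mapping from a test case ID to a verdict string. The IDs, verdict labels, and function name are hypothetical and do not describe how the viewer computes its comparison.

```python
def find_disagreements(run_a: dict, run_b: dict) -> dict:
    """Return {case_id: (verdict_a, verdict_b)} for shared cases whose verdicts differ."""
    shared_ids = run_a.keys() & run_b.keys()
    return {
        case_id: (run_a[case_id], run_b[case_id])
        for case_id in sorted(shared_ids)
        if run_a[case_id] != run_b[case_id]
    }


# Hypothetical verdicts from two runs over overlapping test cases.
baseline = {"case-1": "pass", "case-2": "fail", "case-3": "pass"}
candidate = {"case-1": "pass", "case-2": "pass", "case-4": "fail"}

print(find_disagreements(baseline, candidate))
```

Cases present in only one run (`case-3` and `case-4` here) are skipped, because there is no second verdict to compare against.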
