CLI Commands
This page lists command signatures and key options.
Base command
assert-ai [GLOBAL_OPTIONS] COMMAND [ARGS] [OPTIONS]
Global options
-v,--verbose-q,--quiet--log-file <path>--output text|json
Command groups
init: interactive config generation assistantrun: execute pipeline stagesresults: list/status/compare suites and runsanalysis: post-hoc metrics commandsjudge-traces: score pre-collected OTel traceslibrary: browse built-in behavior/judge presets
init
Design an eval config with an LLM assistant.
assert-ai init [OPTIONS]
Options:
-o, --output <path>optional, defaulteval_config.yaml--describe <text>optional--from <path>optional--behavior <name>optional--judge-preset <name>optional--dimensions <csv>optional--model <litellm-model>optional, defaultazure/gpt-4o-mini--env-file <path>optional, default.env--non-interactiveoptional flag--max-turns <int>optional, default20--forceoptional flag--dry-runoptional flag--no-coloroptional flag
run
Run the evaluation pipeline from evaluation config YAML file.
assert-ai run --config <path> [OPTIONS]
Required:
--config <path>
Optional:
--force-stage <stage>repeatable (systematize,test_set,inference,judge)--strict--override <key=value>repeatable-v,--verbose-q,--quiet--log-file <path>--output text|json
results list
List suites or list runs for one suite.
assert-ai results list [OPTIONS]
Options:
--results-dir <path>optional--suite <suite-id>optional--jsonoptional flag--no-coloroptional flag
results status
Show suite summary or run details.
assert-ai results status <suite> [run] [OPTIONS]
Args:
suiterequiredrunoptional
Options:
--results-dir <path>optional--jsonoptional flag--no-coloroptional flag
results compare
Compare runs in the same suite or across suites.
assert-ai results compare <suite> <run1> <run2> [run3 ...] [OPTIONS]
assert-ai results compare <suite1>/<run1> <suite2>/<run2> [suite3/run3 ...] [OPTIONS]
Options:
--results-dir <path>optional--metric <dimension>optional, defaultpolicy_violation--limit <int>optional, default8--jsonoptional flag--no-coloroptional flag
results compare-suites
Compare named runs across different suites.
assert-ai results compare-suites <suite1>/<run1> <suite2>/<run2> [OPTIONS]
Options:
--results-dir <path>optional--metric <dimension>optional--jsonoptional flag--no-coloroptional flag
analysis test-set-metrics
Compute test-set coverage/diversity metrics.
assert-ai analysis test-set-metrics --taxonomy <path> --test_set <path> [OPTIONS]
Required:
--taxonomy <path>--test_set <path>
Optional:
--embed-model <name>defaulttext-embedding-3-large--embed-backend openai|hfdefaultopenai--k <int>repeatable--example-distance-thresh <float>default0.2--presence-coverageflag--out-json <path>defaultartifacts/analysis/test_set_metrics.json--out-md <path>
judge-traces
Judge pre-collected OTel traces without running inference.
assert-ai judge-traces --traces <path> --config <path> [OPTIONS]
Required:
--traces <path>--config <path>
Optional:
--group-by <attribute>defaultsession.id--output <path>
library list
List available built-in presets.
assert-ai library list [OPTIONS]
Options:
-k, --kind behavior|judge_preset--json--no-color
library show
Show one preset.
assert-ai library show <name> [OPTIONS]
Options:
-k, --kind behavior|judge_preset--json