poq.toml Specification
The poq.toml file for each Sapien project is its system spec. It turns the data you upload into a working project: the fields on each review item, what validators see, the questions they answer, and how they are scored.
Every poq.toml has three phases (ingestion, validation, attestation) plus a validators config and file metadata:
| Phase / config | Namespace | Purpose |
|---|---|---|
| File metadata | [project] | Spec format version (spec_version) and optional tag |
| Ingestion | [ingestion.*] | Sources, joins, field projection → datapoints |
| Validation | [validation.*] | Task UI, rubric, scoring |
| Validators | [validators.*] | Who reviews: counts, pay, classes, routing, qualification |
| Attestation | [attestation.*] | Signed PoQ report export |
Ingestion — Turn uploaded files into datapoint rows: declare sources, optional joins, and which columns each task carries.
- [[ingestion.sources]](#the-ingestionsources-section) (required): Where your data lives and how Sapien reads it.
- [[ingestion.joins]](#the-ingestionjoins-section) (optional): Combine separate sources before field projection.
- [ingestion.fields](#the-ingestionfields-section) (required): Name each column on a task and say which file it came from (e.g. title = "findings.title").
Validation — What validators see on each task, the questions they answer, and how answers are scored.
- [[validation.evidence]](#the-validationevidence-section) (required): What validators see on the task page.
- [[validation.rubric]](#the-validationrubric-section) (required): Questions validators answer.
- [validation.verdict](#the-validationverdict-section) (optional): Item-level pass/fail rollup across rubric rows.
- [validation.ground_truth](#the-validationground_truth-section) (optional): Secret labels, prefill hints, and difficulty binning.
- [[validation.instructions]](#the-validationinstructions-section) (optional): Onboarding slides. Currently not rendered.
Validators — Who reviews each item, how many validators, pay and stake, routing by field values, and gates before claim.
- [validators](#the-validators-section) (required): Default validator count, pay, and stake.
- [[validators.classes]](#the-validatorsclasses-section) (optional): Validator personas (human or AI), pay, and consensus weight.
- [[validators.routes]](#the-validatorsroutes-section) (optional): Route items to validator counts and classes.
- [[validators.qualification]](#the-validatorsqualification-section) (optional): Profile fields required before claim.
- [validators.claims](#the-validatorsclaims-section) (optional): Claim TTL before a task expires.
- [validators.actions](#the-validatorsactions-section) (optional): Task-page actions (e.g. conflict-of-interest self-decline).
Attestation — Signed proof-of-quality report export: schema, signing key, payload fields, and output format.
- [attestation](#the-attestation-section) (optional): Signed report download settings.
Migration from the flat layout
| Today | Proposed |
|---|---|
| [[inputs]] | [[ingestion.sources]] |
| [[data.joins]] | [[ingestion.joins]] |
| [[data]] (N blocks with name / source) | [ingestion.fields] (one table, one line per field) |
| [[evidence]] | [[validation.evidence]] (field → ingestion_field) |
| [[rubric.rows]] | [[validation.rubric]] |
| [verdict] | [validation.verdict] |
| [ground_truth] / [stage.ground_truth] | [validation.ground_truth] (same keys) |
| [instructions] | [[validation.instructions]] |
| [validator_actions] | [validators.actions] |
| [validation] + claim_ttl_minutes | [validators.claims] + duration_minutes |
| [validators] | [validators] (defaults only) |
| [[validators.classes]] | [[validators.classes]] |
| [[validators.routes]] | [[validators.routes]] |
| [[qualification]] / [qualification] | [[validators.qualification]] |
| [stage.ground_truth] | [validation.ground_truth] |
| [stage.attestation] | [attestation] (top-level) |
| [attestation] | [attestation] (unchanged) |
Unknown keys are rejected. The parser is strict: any key or table that isn't part of the schema (a typo, an extra field, or a leftover from the flat layout such as [[inputs]] or [stage.*]) fails at parse time with an error naming the offending key, rather than being silently ignored. Migrate old specs to the namespaced layout above.
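The strict behaviour described above can be pictured with a small check over an already-parsed spec. This is only a sketch: the allowed set is copied from the namespace table at the top of this page, while the function name and the error wording are hypothetical and not Sapien's actual parser output.

```python
# Top-level tables listed in the namespace table of this spec.
ALLOWED_TOP_LEVEL = {"project", "ingestion", "validation", "validators", "attestation"}


def reject_unknown_tables(spec):
    """Raise on the first top-level key that is not part of the namespaced layout."""
    for key in spec:
        if key not in ALLOWED_TOP_LEVEL:
            # Hypothetical wording; the real parser's message may differ.
            raise ValueError(f"unknown key: {key}")


# A leftover flat-layout table is rejected rather than silently ignored.
try:
    reject_unknown_tables({"project": {"spec_version": "1"}, "inputs": [{}]})
    message = None
except ValueError as err:
    message = str(err)
```

A real implementation would also have to walk nested tables; the sketch stops at the top level to keep the idea visible.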
The Data Lifecycle: From Files to Review
Before writing your spec, it is helpful to understand how Sapien transforms your files into tasks.
1. File Upload
You upload your files (CSVs, JSONs, images, or Markdown) from a project folder.
2. Ingestion
When you Ingest a project, Sapien runs your poq.toml spec to build the review items.
- The Spreadsheet Model: Sapien treats every collection of files as a temporary spreadsheet.
- Rows: Each individual item in your file (a CSV line, a JSON file, a Markdown section).
- Columns: The data inside those items (a CSV header, a JSON key, a regex capture).
- The Mapping: [ingestion.fields] is the wiring. Each table key is the column name used everywhere else; each value is <source_id>.<column> at ingest time.
  - Example: finding_title = "findings.title" keeps the JSON title property under the name finding_title.
- The Merge (Optional): If you have multiple sources (e.g., labels in a CSV and images in a folder), [[ingestion.joins]] lines them up into one wider spreadsheet before field projection.
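The mapping step above can be read as a plain projection from a wide row to the columns you named. The sketch below assumes a joined row is represented as a dict keyed by source id and then by column; that representation, the function name, and the sample values are illustrative assumptions, not Sapien internals.

```python
def project_fields(fields, joined_row):
    """Build one datapoint from one joined row.

    fields: the [ingestion.fields] table, e.g. {"finding_title": "findings.title"}
    joined_row: values keyed by source id, then by column (assumed shape)
    """
    datapoint = {}
    for key, ref in fields.items():
        # Split "<source_id>.<column>" at the first dot only.
        source_id, column = ref.split(".", 1)
        datapoint[key] = joined_row[source_id][column]
    return datapoint


fields = {"id": "findings.id", "finding_title": "findings.title"}
row = {"findings": {"id": "F-04", "title": "Example title", "unused": "not projected"}}
datapoint = project_fields(fields, row)
```

Only the keys declared in the mapping survive, which matches the idea that undeclared columns do not reach the review item.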
3. Datapoints
Once ingestion finishes, the temporary spreadsheets are deleted. What remains are review items, stored in a database table.
- One row = one task: Each review item is a single row in the database.
- Persistent data: The keys you declared in [ingestion.fields] are saved on that row.
4. Review
When a validator starts a task, they aren't looking at your original files. They are looking at one specific review item that was ingested.
- The [validation.*] sections tell the UI what to show, what to ask, and how items are scored; the [validators.*] sections decide who validates, how many, pay, routing, and qualification gates.
A working spec needs at minimum: [project] with spec_version, at least one [[ingestion.sources]], [ingestion.fields] with a key named id, at least one [[validation.evidence]], at least one [[validation.rubric]], and [validators]. Every other section is optional.
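As a reading aid, the minimum listed above can be written as a checklist over a parsed spec. The sketch assumes the spec has already been loaded into nested dicts (arrays of tables as lists); the function name and the labels it returns are hypothetical.

```python
def missing_minimum(spec):
    """Return the minimum-spec pieces that are absent from a parsed poq.toml dict."""
    missing = []
    if "spec_version" not in spec.get("project", {}):
        missing.append("[project].spec_version")
    ingestion = spec.get("ingestion", {})
    if not ingestion.get("sources"):
        missing.append("[[ingestion.sources]]")
    if "id" not in ingestion.get("fields", {}):
        missing.append("[ingestion.fields].id")
    validation = spec.get("validation", {})
    if not validation.get("evidence"):
        missing.append("[[validation.evidence]]")
    if not validation.get("rubric"):
        missing.append("[[validation.rubric]]")
    if "validators" not in spec:
        missing.append("[validators]")
    return missing


minimal = {
    "project": {"spec_version": "1"},
    "ingestion": {"sources": [{"id": "findings"}], "fields": {"id": "findings.id"}},
    "validation": {"evidence": [{"type": "markdown"}], "rubric": [{"id": "severity"}]},
    "validators": {},
}
```

The check covers presence only; it does not validate the contents of each section.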
The [project] section
The [project] header declares which version of the spec format you wrote your file against.
[project]
spec_version = "1"
Today the only accepted value for spec_version is "1". Any other value is rejected as soon as the file is read.
| option | accepts | required | description | example |
|---|---|---|---|---|
| spec_version | string | yes | Tells Sapien which version of the spec format you wrote. Set to "1", the only value accepted today. | "1" |
Ingestion (ingestion.*)
Everything before review: declare sources, optional joins, and the field projection that becomes each datapoint row.
The [[ingestion.sources]] section
[[ingestion.sources]] tells Sapien which files contain your project's data and how to read them. Every batch of files is its own [[ingestion.sources]] block.
One Primary Source: Sapien builds your task list from the first [[ingestion.sources]] block listed in your TOML. If you want tasks from multiple files or folders, you should group them into this first block using a glob.
- Multiple Markdown files: Use path_glob = "reports/*.md". Every section in every matching file will become a task.
- Multiple JSON folders: Use path_glob = "{folder_a,folder_b}/*.json". Every JSON file in both folders will become a task.
If you create separate [[ingestion.sources]] blocks (with different ids), Sapien assumes they are different types of data that need to be joined together (like joining a CSV of labels to a folder of images). Data in secondary inputs will only appear in your tasks if you explicitly join them to the first input.
[[ingestion.sources]]
id = "findings"
type = "json"
path_glob = "findings/*.json"
The id is the label you pick for this batch. Wire columns in [ingestion.fields] using that id as the prefix:
[ingestion.fields]
title = "findings.title" # "findings" is the source id; ".title" is the column
With multiple inputs (CSV + image folder), ids disambiguate sources:
[[ingestion.sources]]
id = "labels"
type = "csv"
path = "labels.csv"
[[ingestion.sources]]
id = "images"
type = "file_collection"
path_glob = "images/*.jpg"
[ingestion.fields]
diagnosis = "labels.diagnosis"
Input fields
| option | accepts | required | description | example |
|---|---|---|---|---|
| id | string | yes | Short name for this batch. Used in [ingestion.fields] values (findings.title), join left/right, and route match keys. Must be unique across [[ingestion.sources]]. | "findings" |
| type | enum | yes | How Sapien reads this batch: csv, json, file_collection, or markdown_split. | "json" |
| path | string | no (defaults to <id>.<ext> for csv/json) | Single uploaded file when this input is one file. Cannot start with / or contain .. (a parent-directory segment). | "findings.csv" |
| path_glob | string | yes for file_collection; one of path or path_glob for json and markdown_split | Pattern matching many files (e.g. images/*.jpg). Use instead of path for folders. | "images/*.jpg" |
| splitter.* | see below | yes for markdown_split (at least splitter.regex) | For markdown_split only; full list in Splitter settings. | splitter.regex = "^## (?P<id>.+)$" |
Per-type field set
Each type allows a different subset of keys. The compiler rejects forbidden keys with a path-prefixed error (for example inputs[1].path_glob: only valid on file_collection or json inputs).
| Key | csv | json | file_collection | markdown_split |
|---|---|---|---|---|
| id | required | required | required | required |
| type | required | required | required | required |
| path | optional (defaults <id>.csv) | optional (defaults <id>.json) | forbidden | optional |
| path_glob | forbidden | optional | required | optional |
| file_id_strategy | forbidden | forbidden | required | forbidden |
| splitter.regex | forbidden | forbidden | forbidden | required |
| splitter.end_regex, [[ingestion.sources.splitter.metadata]] | forbidden | forbidden | forbidden | optional |
json and markdown_split accept either path (single file) or path_glob (many files). Setting both is a compile-time error.
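The per-type table and the path / path_glob rule can be expressed as a lookup plus one extra check. In this sketch each source is a flat dict whose keys are spelled as in the table (a real TOML parser would nest splitter.*), and the function name and error strings are hypothetical rather than the compiler's actual output.

```python
# Key rules per source type, transcribed from the table above.
RULES = {
    "csv": {"required": {"id", "type"}, "optional": {"path"}},
    "json": {"required": {"id", "type"}, "optional": {"path", "path_glob"}},
    "file_collection": {
        "required": {"id", "type", "path_glob", "file_id_strategy"},
        "optional": set(),
    },
    "markdown_split": {
        "required": {"id", "type", "splitter.regex"},
        "optional": {"path", "path_glob", "splitter.end_regex", "splitter.metadata"},
    },
}


def check_source(source):
    """Return a list of problems for one [[ingestion.sources]] block."""
    kind = source["type"]
    rules = RULES[kind]
    keys = set(source)
    errors = [f"{k}: required on {kind}" for k in sorted(rules["required"] - keys)]
    forbidden = keys - rules["required"] - rules["optional"]
    errors += [f"{k}: not allowed on {kind}" for k in sorted(forbidden)]
    # path and path_glob are alternatives, never a pair.
    if "path" in keys and "path_glob" in keys:
        errors.append("path and path_glob are mutually exclusive")
    return errors


csv_errors = check_source({"id": "labels", "type": "csv", "path_glob": "*.csv"})
json_errors = check_source(
    {"id": "findings", "type": "json", "path": "a.json", "path_glob": "findings/*.json"}
)
```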
The markdown_split input type
Use markdown_split when one or more Markdown files should become many review items. Each regex match in splitter.regex becomes one row on that input.
Write the pattern from header lines in your file — copy the fixed prefix literally, use (?P<name>...) for the parts that change on each row:
| Header line in your Markdown (string) | splitter.regex |
|---|---|
| ## F-01: Missing rate limit | '^##\s+(?P<id>F-\d+):?\s+(?P<title>.+)$' |
| ## F-02: SQL injection | (same regex: one pattern matches every section header) |
Optional splitter.end_regex — stop the section body before the next header when subsections use the same ## level:
| Line where the section should end (string) | splitter.end_regex |
|---|---|
| ## Next finding | '^##\s+' |
Optional [[ingestion.sources.splitter.metadata]] — extract extra columns from a line in the file or in each section.
Repository row (document header table) — string → regex:
| Repository | my-repo |
'^\|\s*Repository\s*\|\s*(?P<repository>[^|]+?)\s*\|'
File line (per section) — string → regex:
- **File**: `src/auth.ts:L42`
'^-\s*\*\*File\*\*:\s*`?(?P<source_file>.+?):L(?P<line_number>\d+)'
[[ingestion.sources]]
id = "audit_findings"
type = "markdown_split"
path_glob = "reports/*.md"
splitter.regex = '^##\s+(?P<id>F-\d+):?\s+(?P<title>.+)$'
splitter.end_regex = '^##\s+'
[[ingestion.sources.splitter.metadata]]
scope = "document"
column = "repository"
regex = '^\|\s*Repository\s*\|\s*(?P<repository>[^|]+?)\s*\|'
Splitter settings
These keys apply only when type = "markdown_split". Regex values use RE2; Sapien adds multiline matching ((?m)) at compile time — do not prefix patterns yourself.
| option | accepts | required | description | example |
|---|---|---|---|---|
| splitter.regex | string (regex) | yes (markdown_split) | Header pattern that starts each section. Each match becomes one row. Named captures ((?P<id>...)) become output columns on this input. | ## F-01: … → '^##\s+(?P<id>F-\d+):?\s+(?P<title>.+)$' |
| splitter.end_regex | string (regex) | no | Optional early stop for body. After a header match, Sapien scans forward for this pattern; the section ends there instead of at the next header (or EOF). | ## Next … → '^##\s+' |
| [[ingestion.sources.splitter.metadata]] | array of tables with fields listed below | no | Repeat to add extra output columns. Each row uses scope, column, regex, and optional capture below. | [[ingestion.sources.splitter.metadata]] with scope = "document", column = "repository" |
| scope | document, section | yes (each metadata row) | document: run regex once on the full file; value copied to every row from that file. section: run regex on each split section (body span) only. | "document" |
| column | string | yes (each metadata row) | Output column name. Wire in [ingestion.fields] as repository = "<input_id>.repository". | "repository" |
| regex | string (regex) | yes (each metadata row) | Pattern to extract the value. Use a named capture matching column, or a single unnamed capture group. | see string → regex examples above |
| capture | string | no | Named capture to read from regex when it differs from column. Defaults to column. Use when one regex fills several columns. | "line_number" |
After ingest, each split row exposes columns you can wire in [ingestion.fields] — e.g. named captures from splitter.regex, metadata column names, and built-ins such as body and row_id:
[ingestion.fields]
finding_body = "audit_findings.body"
repo = "audit_findings.repository"
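To see how a header pattern and an end pattern carve a file into rows, the splitting rules described above can be approximated in a few lines. This is a sketch under assumptions: it uses Python's re module rather than RE2, sets re.MULTILINE by hand to stand in for the (?m) that Sapien adds, and the function name, sample text, and whitespace trimming of the body are illustrative only.

```python
import re

# Patterns taken from the markdown_split example in this section.
HEADER = re.compile(r"^##\s+(?P<id>F-\d+):?\s+(?P<title>.+)$", re.MULTILINE)
END = re.compile(r"^##\s+", re.MULTILINE)


def split_sections(text, header=HEADER, end=END):
    """Return one dict per header match: named captures plus a 'body' column."""
    rows = []
    matches = list(header.finditer(text))
    for i, match in enumerate(matches):
        body_start = match.end()
        # Default stop: the next header match, or end of file.
        stop = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        if end is not None:
            # Optional early stop: first end-pattern hit after the header line.
            early = end.search(text, body_start)
            if early and early.start() < stop:
                stop = early.start()
        row = match.groupdict()
        row["body"] = text[body_start:stop].strip()
        rows.append(row)
    return rows


sample = """# Report
## F-01: Missing rate limit
Body one.
## Appendix
Not part of F-01.
## F-02: SQL injection
Body two.
"""
rows = split_sections(sample)
```

In the sample, the "## Appendix" line does not match the header pattern, but it does match the end pattern, so the first section stops there instead of running on to the next finding.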
The [[ingestion.joins]] section
The purpose of [[ingestion.joins]] is to line up related data from different files so they can be treated as a single task.
Use this section only if you declared more than one [[ingestion.sources]] block. It merges your separate data sources into one wider "spreadsheet" before you pick your final columns in [ingestion.fields].
[[ingestion.joins]]
left = "labels"
right = "images"
left_on = "case_id"
right_on = "file_id"
type = "left"
| option | accepts | required | description | example |
|---|---|---|---|---|
| left | string | yes | Input id on the left side of the join (usually the table you want to keep all rows from). | "labels" |
| right | string | yes | Input id on the right side. | "images" |
| left_on | string | yes | Column on the left batch to match on. | "case_id" |
| right_on | string | yes | Column on the right batch to match on, often file_id for file collections. | "file_id" |
| type | enum | yes | Join mode: left keeps all left rows; inner drops non-matching rows. | "left" |
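The difference between the two join modes is easiest to see on a tiny dataset. The sketch below is a plain-Python stand-in for the join, not the SQL that ingest generates; the function name and the sample rows are hypothetical.

```python
def join_rows(left_rows, right_rows, left_on, right_on, how="left"):
    """Pair rows on matching keys; 'left' keeps unmatched left rows, 'inner' drops them."""
    index = {}
    for right in right_rows:
        index.setdefault(right[right_on], []).append(right)
    joined = []
    for left in left_rows:
        matches = index.get(left[left_on], [])
        if matches:
            joined.extend((left, right) for right in matches)
        elif how == "left":
            # No match: keep the left row with an empty right side.
            joined.append((left, None))
    return joined


labels = [{"case_id": "c1"}, {"case_id": "c2"}]
images = [{"file_id": "c1", "path": "images/c1.jpg"}]

left_result = join_rows(labels, images, "case_id", "file_id", how="left")
inner_result = join_rows(labels, images, "case_id", "file_id", how="inner")
```

With type = "left", case c2 still becomes a row with no image attached; with type = "inner" it disappears from the result.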
FROM-root order (required for connected joins)
In plain terms: think of building one big spreadsheet by gluing smaller ones together. Start with your main file (one row per task — e.g. labels.csv) and list it first. Each join then glues a new file onto what you already have: left is the file you already have, right is the file you're adding. Always glue new files onto the main one — never the other way around. If you flip it, the tool can't attach your main file and ingest fails.
Ingest builds SQL with the first [[ingestion.sources]] entry as the FROM root (one row per review item — map id from this source). Each [[ingestion.joins]] row adds right as a new table; left must already be in the FROM chain.
Getting this backwards (e.g. left = "images", right = "labels" when labels is primary) produces ingest SQL where the primary table never enters FROM.
# Primary CSV first, then join enrichment
[[ingestion.sources]]
id = "labels"
type = "csv"
path = "labels.csv"
[[ingestion.sources]]
id = "images"
type = "file_collection"
path_glob = "images/*.jpg"
[[ingestion.joins]]
left = "labels" # primary / already in FROM
right = "images" # new table
left_on = "image_id"
right_on = "file_id"
type = "left"
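The ordering rule can be checked mechanically: start from the first source and require every join's left side to be attached already. This is a minimal sketch of that rule; the function name and error text are hypothetical, and it says nothing about how Sapien reports the failure.

```python
def check_join_order(source_ids, joins):
    """Raise if a join's left side is not yet part of the FROM chain."""
    in_from = [source_ids[0]]  # the first source is the FROM root
    for index, join in enumerate(joins):
        if join["left"] not in in_from:
            raise ValueError(
                f"joins[{index}]: left = {join['left']!r} is not in the FROM chain yet"
            )
        in_from.append(join["right"])
    return in_from


sources = ["labels", "images"]
good = [{"left": "labels", "right": "images"}]
flipped = [{"left": "images", "right": "labels"}]

chain = check_join_order(sources, good)
try:
    check_join_order(sources, flipped)
    flipped_error = None
except ValueError as err:
    flipped_error = str(err)
```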
The [ingestion.fields] section
[ingestion.fields] is one flat table that defines every column on each review item. Each key is the column name used everywhere after ingest; each value is where the data comes from: <source_id>.<column>.
If you used [[ingestion.joins]] to merge multiple sources, values can reference columns from any joined source.
After ingest, those columns are stored on each review item in the database. See The Data Lifecycle.
Every project must include exactly one key named id. That key is the unique identifier for each review task.
With this layout, ten fields take about ten lines, instead of the roughly thirty lines that ten repeated two-key [[data]] blocks needed in the flat layout.
Worked example: JSON findings (finding-004.json)
Example file: datasets/test-audit-contract/findings/finding-004.json in poq-monorepo. One JSON file = one review item.
Step 1 — declare the input (gives you the findings. prefix):
[[ingestion.sources]]
id = "findings"
type = "json"
path_glob = "findings/*.json"
Step 2 — map JSON keys to review-item columns. Each row below becomes one line in [ingestion.fields]. The value uses the JSON property name after ingest; the key is what you use everywhere else — and what lands in datapoint.canonical_fields.
| JSON key in file | Value in [ingestion.fields] | Suggested key | Value in finding-004.json |
|---|---|---|---|
| id | findings.id | id | "F-04" |
| title | findings.title | finding_title | "Centralization risk — single EOA owner" |
| description | findings.description | description | (markdown narrative) |
| sourcePath | findings.sourcePath | source_path | "src/Counter.sol" |
| lineNumber | findings.lineNumber | line_number | 7 |
| proposedSeverity | findings.proposedSeverity | proposed_severity | "low" |
| repository | findings.repository | repository | repo URL |
| commitSha | findings.commitSha | commit_sha | pinned commit hash |
[ingestion.fields]
id = "findings.id"
finding_title = "findings.title"
description = "findings.description"
source_path = "findings.sourcePath"
line_number = "findings.lineNumber"
proposed_severity = "findings.proposedSeverity"
repository = "findings.repository"
commit_sha = "findings.commitSha"
Step 3 — use these keys in later sections. Reference [ingestion.fields] keys only — never raw <source_id>.<column> paths outside that table.
For example, to show the finding's description to a validator:
[[validation.evidence]]
type = "markdown"
ingestion_field = "description"
Or to route tasks based on severity:
[[validators.routes]]
match = { proposed_severity = "low" }
total = 5
You should never reference the original source (like findings.sourcePath or findings.proposedSeverity) outside of [ingestion.fields].
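One way to catch a raw source path that slipped into a later section is to compare every ingestion_field value against the keys of [ingestion.fields]. The sketch assumes the spec is already parsed into dicts and lists; the function name is hypothetical and only the ingestion_field key is inspected.

```python
def unknown_field_refs(fields, evidence):
    """Return ingestion_field values that are not keys of [ingestion.fields]."""
    return [
        block["ingestion_field"]
        for block in evidence
        if "ingestion_field" in block and block["ingestion_field"] not in fields
    ]


fields = {"id": "findings.id", "description": "findings.description"}
evidence = [
    {"type": "markdown", "ingestion_field": "description"},           # a declared key
    {"type": "markdown", "ingestion_field": "findings.sourcePath"},   # a raw source path
]
bad_refs = unknown_field_refs(fields, evidence)
```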
CSV + images example (two inputs, no JSON):
[ingestion.fields]
id = "labels.case_id"
image_path = "images.path"
| Key (table) | Value (ingest wiring) | Required | Description |
|---|---|---|---|
| (each key) | string | yes | Key — column name used everywhere else (ingestion_field, route match keys). One key must be id. Value — <source_id>.<column>. After joins, values may reference any joined source. |
Validation (validation.*)
What validators interact with and how their answers are scored: task UI, evidence, rubric, verdict rollup, ground truth, and onboarding. Who does the reviewing — counts, pay, classes, routing, claim, and qualification gates — lives in the Validators section.
Values of ingestion_field = "..." are always keys from [ingestion.fields] unless this doc marks them as literal UI copy. Keys on source_excerpt / source_link blocks (repository, path, commit_sha, …) also hold [ingestion.fields] key names.
The [[validation.evidence]] section
Each [[validation.evidence]] block is one thing validators see on the task page (an image, markdown text, a source link, a fact table, etc.). Blocks resolve per-row values from [ingestion.fields] only — do not put static markdown or catalogue text in evidence (use [[validation.instructions]] for fixed validator guidance).
When a validator opens a task, each evidence block says which fields to show and how.
Every block starts with type. After that, keys on the block map render inputs from [ingestion.fields] keys. Values on the right of ingestion_field (and on source_excerpt / source_link keys) are always [ingestion.fields] key names, not file paths or raw JSON properties.
Optional title on any evidence block sets the panel heading above that widget. The value is literal display text (not an [ingestion.fields] key). Omit title to render no heading — there are no default headings per type.
Worked example: one task page, every evidence type
This example builds on the JSON findings walkthrough. First map the columns you need in [ingestion.fields]:
[ingestion.fields]
description = "findings.description"
image_path = "images.path"
finding_json = "findings.payload"
repository = "findings.repository"
source_path = "findings.sourcePath"
commit_sha = "findings.commitSha"
line_number = "findings.lineNumber"
finding_title = "findings.title"
proposed_severity = "findings.proposedSeverity"
Add one [[validation.evidence]] block per widget. Order in the file is display order (markdown and images tend to appear first).
[[validation.evidence]]
type = "markdown"
title = "Finding"
ingestion_field = "description"
[[validation.evidence]]
type = "image"
title = "Dermoscopy image"
ingestion_field = "image_path"
[[validation.evidence]]
type = "json_finding"
ingestion_field = "finding_json"
[[validation.evidence]]
type = "source_excerpt"
repository = "repository"
path = "source_path"
commit_sha = "commit_sha"
[[validation.evidence]]
type = "source_link"
repository = "repository"
path = "source_path"
commit_sha = "commit_sha"
line_number = "line_number"
label = "Open in GitHub"
[[validation.evidence]]
type = "datapoint_facts"
fields = [
{ label = "Title", field = "finding_title" },
{ label = "Severity", field = "proposed_severity" },
]
You rarely need every type on one project — pick the blocks that match your data. A melanoma project might only declare the image block; an audit might use markdown, source_excerpt, and datapoint_facts.
How to read a block: repository = "repository" means “read the repository column from this task’s row.” The string on the right is an [ingestion.fields] key, not a GitHub URL literal.
Evidence types
| type | What validators see | Wiring pattern |
|---|---|---|
| markdown | Rich text (finding narrative) | ingestion_field |
| image | Image from a URL column | ingestion_field |
| json_finding | JSON payload (inline or URL) | ingestion_field |
| source_excerpt | Pinned GitHub file in the panel | flat keys on the block |
| source_link | Link out to GitHub file + line | flat keys on the block |
| datapoint_facts | Label/value fact table | fields = [{ label, field }, …] on the block |
Shared on all types: optional title (literal panel heading; omit for no heading).
Single-column evidence (image, markdown, json_finding)
One column from the row — set type and ingestion_field:
[[validation.evidence]]
type = "markdown"
ingestion_field = "description"
| option | accepts | required | description | example |
|---|---|---|---|---|
| type | enum | yes | image, markdown, or json_finding. | "markdown" |
| ingestion_field | string | yes | The [ingestion.fields] key whose cell value feeds the widget. | "description" |
| title | string | no | Literal panel heading; omit for no heading. | "Finding" |
source_excerpt
Fetches and displays source from a public GitHub repo at a pinned commit (in-panel viewer):
[[validation.evidence]]
type = "source_excerpt"
repository = "repository"
path = "source_path"
commit_sha = "commit_sha"
| option | accepts | required | description | example |
|---|---|---|---|---|
| type | enum | yes | Must be source_excerpt. | "source_excerpt" |
| title | string | no | Literal panel heading; omit for no heading. | "Source" |
| repository | string | yes | [ingestion.fields] key for repo URL or owner/repo. | "repository" |
| path | string | one of path or paths | [ingestion.fields] key for one file path in the repo. | "source_path" |
| paths | string | one of path or paths | [ingestion.fields] key whose cell is an array of paths (UI shows a file picker). | "source_paths" |
| commit_sha | string | no | [ingestion.fields] key for commit hash; UI defaults to main when empty. | "commit_sha" |
Use path or paths, not both.
source_link
Compact link to a pinned GitHub file (opens in a new tab). Verifies the path exists at the commit before linking.
[[validation.evidence]]
type = "source_link"
repository = "repository"
path = "source_path"
commit_sha = "commit_sha"
line_number = "line_number"
label = "Open in GitHub"
| option | accepts | required | description | example |
|---|---|---|---|---|
| type | enum | yes | Must be source_link. | "source_link" |
| title | string | no | Literal panel heading; omit for no heading. | "Open source" |
| repository | string | yes | [ingestion.fields] key for repo URL or owner/repo. | "repository" |
| path | string | yes | [ingestion.fields] key for file path in the repo. | "source_path" |
| commit_sha | string | yes | [ingestion.fields] key for pinned commit hash. | "commit_sha" |
| line_number | string | no | [ingestion.fields] key for line number (GitHub line anchor). | "line_number" |
| label | string | no | Literal button copy, or an [ingestion.fields] key when label text varies per row. | "Open in GitHub" |
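Because every value on the block is a key into the task's row, resolving a source_link comes down to a few dictionary lookups. The sketch below assembles a link in GitHub's usual blob URL shape (repository, then blob, commit, path, and an optional #L anchor); how Sapien actually builds and verifies the link is not documented here, so the function name, the normalisation of the repository value, and the sample row are all assumptions.

```python
GITHUB = "https://github.com/"


def github_link(row, block):
    """Resolve a source_link block against one row and return a blob URL."""
    repo = row[block["repository"]]
    # Accept either a full URL or owner/repo (assumed normalisation).
    if repo.startswith(GITHUB):
        repo = repo[len(GITHUB):]
    repo = repo.rstrip("/")
    if repo.endswith(".git"):
        repo = repo[:-4]
    url = f"{GITHUB}{repo}/blob/{row[block['commit_sha']]}/{row[block['path']]}"
    line_key = block.get("line_number")
    if line_key and row.get(line_key) is not None:
        url += f"#L{row[line_key]}"
    return url


block = {
    "type": "source_link",
    "repository": "repository",
    "path": "source_path",
    "commit_sha": "commit_sha",
    "line_number": "line_number",
}
row = {
    "repository": "example-org/example-repo",  # hypothetical repository
    "source_path": "src/Counter.sol",
    "commit_sha": "abc123",                    # hypothetical commit hash
    "line_number": 7,
}
link = github_link(row, block)
```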
datapoint_facts
Structured label/value rows (severity, id, found-by, etc.). Declare fields as an inline array on the block — each element has label (UI copy) and field (an [ingestion.fields] key):
[[validation.evidence]]
type = "datapoint_facts"
title = "Finding facts"
fields = [
{ label = "Title", field = "finding_title" },
{ label = "Severity", field = "proposed_severity" },
]
| option | accepts | required | description | example |
|---|---|---|---|---|
| type | enum | yes | Must be datapoint_facts. | "datapoint_facts" |
| title | string | no | Literal panel heading above the fact table; omit for none. | "Finding facts" |
| fields | array of tables | yes | Ordered rows shown in the fact table. At least one element required. | see example above |
| fields[].label | string | yes (each fields element) | Fixed heading beside the value (UI copy, not a column). | "Severity" |
| fields[].field | string | yes (each fields element) | [ingestion.fields] key whose cell value is shown. | "proposed_severity" |
The [[validation.rubric]] section
Each [[validation.rubric]] block is one question validators answer on the task page — its label, answer buttons (scale.*), and how that single question is scored.
This section is only about questions. It does not set the overall pass/fail for the whole item; that optional rollup lives in the separate [validation.verdict] section below.
[[validation.rubric]]
id = "severity"
label = "Severity"
prompt = "How severe is this finding?"
scale.type = "ordinal"
scale.labels = ["info", "low", "medium", "high", "critical"]
consensus_weight = 1.0
Add one [[validation.rubric]] block per question. Reference rows elsewhere by id (for example route match keys, verdict.match skip rules, or ground-truth wiring).
Rubric row fields
Each [[validation.rubric]] block sets question copy, scale (scale.*), optional verdict overrides (verdict.verified_threshold, verdict.match), and consensus weight on the same row.
| option | accepts | required | description | example |
|---|---|---|---|---|
| id | string | yes | Internal name for this question; unique within the rubric. | "severity" |
| label | string | yes | Heading validators see above the answer buttons. | "Severity" |
| prompt | string | no | Extra question text under the label. | "How severe is this finding?" |
| tooltip | string | no | Help text below the buttons (visible, not hover-only). | "Match severity to OWASP guidance." |
| role | string | no | Tag for UI affordances (e.g. influence_gauge, certainty). | "influence_gauge" |
| scale.type | enum | yes | Answer template: likert, ordinal, numeric. | "ordinal" |
| scale.size | int | yes (likert only) | Likert point count; must be 5 or 7 when scale.type is likert. | 5 |
| scale.labels | array of strings | yes (ordinal); no (likert); optional (numeric) | Button labels. Must match scale.size on likert rows if set. | ["low", "medium", "high"] |
| scale.values | array of numbers | yes (numeric) | Stored scores (0–100). | [0, 50, 100] |
| verdict.verified_threshold | number | no | Share of validators that must agree on the winning label for verified on this row. See Row-level verdict. | 0.85 |
| verdict.match | map (row id → labels) | no | Skip this row (n_a) when another row’s consensus matches a trigger label. | { validity = ["False positive"] } |
| consensus_weight | number | no (default 1.0) | How much this row counts in consensus deviation math. 0 excludes the row (advisory only). | 1.0 |
Do not rename consensus_weight to importance. The removed keys verdict.type and verdict.contested_rule.* cause a parse error.
The scale block (scale.*)
There is no separate top-level [scale] section. On each [[validation.rubric]] block, dotted keys (scale.type, scale.size, …) define the answer buttons validators see and the 0–100 number stored when they pick one.
At compile time the server turns every scale into a list of (label, value) pairs. Validators click labels; consensus and storage always use the numeric values. Different rows in the same project can use different scale types (e.g. severity as ordinal, confidence as numeric, diagnosis as 7-point Likert).
scale.type — which answer template
Required. Picks one of three templates:
| scale.type | Use for | Also set on the row |
|---|---|---|
| likert | Symmetric agree/disagree or frequency scales (“Strongly disagree” … “Strongly agree”, “Never” … “Always”) | scale.size = 5 or 7; optional scale.labels (must match size) |
| ordinal | Your own ordered categories (severity, validity, …) | Required scale.labels: one label per button; count sets the number of options |
| numeric | Explicit 0–100 steps you choose (e.g. certainty 0/25/50/75/100) | Required scale.values; optional matching scale.labels |
For ordinal and likert types, the compiler auto-generates evenly spaced 0–100 values from the label count or scale.size. For numeric, you declare the stored scores in scale.values.
AI validator classes: When any [[validators.classes]] row has type = "ai", every rubric scale must resolve to integer-only anchors (the AI worker constrains model output to integers per row). Practical rules: likert with scale.size = 5 is fine; scale.size = 7 produces fractional steps and is rejected. ordinal label counts must satisfy (count - 1) divides 100 (e.g. five labels). numeric scales must use integer scale.values. See AI agent classes for class fields.
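The integer-anchor rule follows directly from even spacing, which a short calculation makes concrete. The sketch assumes the auto-generated values are 100 * i / (count - 1) for i from 0 to count - 1; that exact formula is an inference from "evenly spaced 0–100 values", and both function names are hypothetical.

```python
from fractions import Fraction


def auto_values(count):
    """Evenly spaced anchors from 0 to 100 for `count` options (count >= 2)."""
    return [Fraction(100 * i, count - 1) for i in range(count)]


def integer_only(values):
    """True when every anchor is a whole number, as AI validator classes require."""
    return all(Fraction(v).denominator == 1 for v in values)


five_point = auto_values(5)    # ordinal with five labels, or likert size 5
seven_point = auto_values(7)   # likert size 7
binary = auto_values(2)        # ordinal yes/no
```

Five options land on 0, 25, 50, 75, 100, while seven options step by 100/6 and therefore fail the integer check, which is the same condition as (count - 1) dividing 100.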
Examples by scale type
Ordinal (multi-bucket) — labels required; values auto-spaced:
[[validation.rubric]]
id = "severity"
label = "Severity"
scale.type = "ordinal"
scale.labels = ["info", "low", "medium", "high", "critical"]
Ordinal (binary yes/no) — still scale.type = "ordinal", but exactly two labels:
[[validation.rubric]]
id = "is_valid"
label = "Is this finding valid?"
scale.type = "ordinal"
scale.labels = ["no", "yes"]
Skip a row when another row’s consensus matches — use verdict.match on the dependent row:
[[validation.rubric]]
id = "validity"
label = "Validity"
prompt = "Is this finding a real issue?"
scale.type = "ordinal"
scale.labels = ["False positive", "Unlikely valid", "Unclear", "Likely valid", "Clearly valid"]
[[validation.rubric]]
id = "fix_soundness"
label = "Fix soundness"
prompt = "How sound is the proposed fix?"
scale.type = "ordinal"
scale.labels = ["Unsound", "Weak", "Partial", "Mostly sound", "Fully sound"]
verdict.match = { validity = ["False positive"] }
When validators agree the finding is a False positive, fix_soundness is treated as n/a — see Row-level verdict.
Likert — size required; labels optional:
[[validation.rubric]]
id = "diagnosis_match"
label = "Does the diagnosis match?"
scale.type = "likert"
scale.size = 7
Numeric — values required; labels optional:
[[validation.rubric]]
id = "certainty"
label = "How certain are you?"
scale.type = "numeric"
scale.values = [0, 25, 50, 75, 100]
scale.labels = ["none", "low", "medium", "high", "certain"]
[[validation.rubric]]
id = "confidence"
label = "Confidence Level"
prompt = "How confident are you in this assessment?"
scale.type = "numeric"
scale.values = [0, 20, 80, 100]
scale.labels = ["Guessing", "Low", "High", "Certain"]
Row-level verdict
Each [[validation.rubric]] block can carry optional verdict.verified_threshold and verdict.match. After validators submit, the engine assigns one outcome per row — then, if you declared [validation.verdict], rolls those up into one outcome per item.
Two levels — same word, different jobs
[[validation.rubric]].verdict.* → per-question outcome (verified, insufficient_signal, n_a)
[validation.verdict] → how row outcomes combine for the whole item
Do not confuse per-row verdict.* overrides with the item-level [validation.verdict] section below.
Row outcomes (runtime)
These are computed — not TOML keys you set:
| outcome | Meaning |
|---|---|
| verified | Enough validators agreed on the same answer (default ≥ 80% on the winning label). |
| insufficient_signal | Not enough agreement to call verified (e.g. scattered votes with no clear winner). |
| n_a | This question does not apply; set by verdict.match when another row’s consensus matches a trigger label. Skipped in item-level rollup. |
How each row is classified
- If verdict.match matches referenced rows’ consensus labels → n_a
- If there are no valid votes or ties cannot be resolved → insufficient_signal
- If the winning label’s vote share ≥ verified_threshold (default 0.8) → verified
- Otherwise → insufficient_signal
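The classification order above can be sketched as a small function. This is an illustration, not the engine's implementation: it assumes unweighted votes, treats any tie for first place as unresolvable, and takes the verdict.match result as a precomputed flag. The name classify_row and its parameters are hypothetical.

```python
from collections import Counter


def classify_row(votes, verified_threshold=0.8, match_triggered=False):
    """Return the row outcome for a list of submitted labels.

    Assumptions: every vote has weight 1, and a tie for the top label
    cannot be resolved (the real engine may apply tiebreakers first).
    """
    if match_triggered:              # verdict.match fired on a referenced row
        return "n_a"
    if not votes:                    # no valid votes
        return "insufficient_signal"
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "insufficient_signal"  # unresolved tie
    share = ranked[0][1] / len(votes)
    if share >= verified_threshold:
        return "verified"
    return "insufficient_signal"
```

With the default threshold, four of five validators choosing the same label gives a share of 0.8 and a verified row, while two of three gives about 0.67 and insufficient_signal.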
Optional row verdict keys
| option | accepts | required | description | example |
|---|---|---|---|---|
| verdict.verified_threshold | number | no | Share of validators that must agree on the winning label for verified. Defaults to 0.8, or [validation.verdict].verified_threshold when set. | 0.85 |
| verdict.match | map (row id → labels) | no | Rubric row id keys; values are consensus labels that trigger n_a (scalar or array). | { validity = ["False positive"] } |
Removed keys (parse error if present): verdict.type, verdict.contested_rule.*, [stage.verdict].
See Examples by scale type for full row snippets including verdict.match skip rules.
The [validation.verdict] section (datapoint-level)
Optional. Use when multiple [[validation.rubric]] questions must combine into one outcome for the whole item — for example KYC where ID match and document authenticity both must pass.
This is item-level composition. Per-row outcomes (verified, insufficient_signal, n_a) come from Row-level verdict on each [[validation.rubric]] block.
Skip [validation.verdict] when one rubric row is enough or when you do not need a single rolled-up item outcome.
[validation.verdict]
composition = "strict_and"
verified_threshold = 0.8
| option | accepts | required | description | example |
|---|---|---|---|---|
| composition | enum | no (default strict_and) | How row outcomes combine into one item result. Only strict_and is supported today. | "strict_and" |
| verified_threshold | number | no | Agreement threshold (0.0–1.0) for verified consensus across validators. | 0.8 |
| tiebreakers | array of strings | no | Tie-break rules when validators disagree. Seniority tiebreaker is added automatically when validator classes exist. | ["majority", "validator:class_priority"] |
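One way to read strict_and is sketched below. The spec only states that n_a rows are skipped in the rollup and that, in the KYC example, both rows must pass. Everything else here is an assumption made for illustration: the item-level outcome names, the all-n_a case, and the function name roll_up_strict_and are not taken from the engine.

```python
def roll_up_strict_and(row_outcomes: dict) -> str:
    """Combine per-row outcomes into one item outcome (hypothetical sketch).

    row_outcomes maps rubric row id -> "verified" | "insufficient_signal" | "n_a".
    """
    # n_a rows are skipped in the item-level rollup.
    applicable = [o for o in row_outcomes.values() if o != "n_a"]
    if not applicable:
        return "n_a"  # assumption: nothing applicable means nothing to verify
    if all(o == "verified" for o in applicable):
        return "verified"
    return "insufficient_signal"
```

Under this reading, an item with id_match verified and doc_authentic at insufficient_signal does not verify, while a skipped fix_soundness row does not block an otherwise verified item.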
The [validation.ground_truth] section
Use [validation.ground_truth] when your dataset includes columns validators should not see directly (golden labels, difficulty bins) or columns shown as hints (model predictions, pre-filled labels). Values are [ingestion.fields] key names, not raw source paths.
Skip when there is no answer key or prefill (e.g. open-ended preference data).
[validation.ground_truth]
golden_label_field = "label"
prefill_hint_field = "label_shown"
difficulty_field = "difficulty"
| option | accepts | required | description | example |
|---|---|---|---|---|
| golden_label_field | string | no | [ingestion.fields] key with the secret correct answer. Stripped before the validator UI. | "label" |
| prefill_hint_field | string | no | [ingestion.fields] key with a hint or model prediction shown to validators. | "label_shown" |
| difficulty_field | string | no | [ingestion.fields] key used for difficulty routing/binning. Stripped before the validator UI. | "difficulty" |
Future: Per-rubric-row [[validation.answer_key]] and [[validation.prefill]] blocks may replace this table in a follow-up change.
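The effect of the stripping rules in the table can be sketched as a filter over one datapoint. This is a minimal illustration, assuming a datapoint is a flat mapping keyed by [ingestion.fields] names. The function name validator_view and the sample values are hypothetical.

```python
def validator_view(datapoint: dict, ground_truth: dict) -> dict:
    """Drop the fields that are stripped before the validator UI.

    golden_label_field and difficulty_field are hidden; the prefill hint
    stays because it is meant to be shown to validators.
    """
    hidden = {
        ground_truth.get("golden_label_field"),
        ground_truth.get("difficulty_field"),
    }
    hidden.discard(None)  # unset options hide nothing
    return {key: value for key, value in datapoint.items() if key not in hidden}


ground_truth = {
    "golden_label_field": "label",
    "prefill_hint_field": "label_shown",
    "difficulty_field": "difficulty",
}
datapoint = {  # hypothetical row
    "id": "img-001",
    "label": "melanoma",
    "label_shown": "benign",
    "difficulty": "hard",
}
visible = validator_view(datapoint, ground_truth)
```

Here visible keeps id and label_shown only, so the secret label and the difficulty bin never reach the task page.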
The [[validation.instructions]] section
[[validation.instructions]] holds onboarding slides for validators. Currently not rendered — the onboarding modal uses hardcoded copy.
Use TOML literal strings ('''…''', triple single quotes) for markdown content and any other verbatim user or policy text. Literal strings do not process backslash escapes, so Markdown escapes like \<NONE\> in tables parse correctly. Do not use basic strings ("""…""") for slide bodies — invalid escapes (e.g. \<) fail TOML parse with invalid escaped character. Reserve basic strings for short values that need escapes (e.g. delimiter = "\n\n---\n" under attestation).
[[validation.instructions]]
type = "markdown"
content = '''
Rate each finding using the OWASP severity definitions.
'''
[[validation.instructions]]
type = "image"
src = "instructions/severity.png"
caption = "Severity ladder"
| option | accepts | required | description | example |
|---|---|---|---|---|
| type | string | yes | markdown or image. | "markdown" |
| content | string | yes (markdown slides) | Markdown slide body; use '''…''' literals. | see example above |
| src | string | yes (image slides) | Image path under the project upload prefix. | "instructions/severity.png" |
| caption | string | no | Caption under an image slide. | "Severity ladder" |
The same '''…''' literal rule applies to prompt on [[validators.classes]] when type = "ai" — prompts often embed sample rows or policy text with backslashes.
Validators (validators.*)
Who does the reviewing and under what terms: default validator count/pay/stake, validator classes (human or AI), per-item routing and escalation, claim duration, qualification gates, and task-page actions. These were previously at the top level or under [stage.*].
The [validators.claims] section
Optional claim duration — how long a validator has to finish one item after claiming it. Default 5 minutes.
[validators.claims]
duration_minutes = 45
| option | accepts | required | description | example |
|---|---|---|---|---|
| duration_minutes | int | no (default 5) | Minutes before an unfinished claim expires. Max 24 hours. | 45 |
The [validators.actions] section
Optional validator task-page actions. Today: conflict-of-interest self-decline.
[validators.actions]
allow_conflict_of_interest_self_decline = true
| option | accepts | required | description | example |
|---|---|---|---|---|
| allow_conflict_of_interest_self_decline | bool | no | When true, validators can decline a claimed item due to conflict of interest. | true |
| conflict_of_interest_decline_button_label | string | no | Button label override. | "Decline due to conflict" |
| conflict_of_interest_decline_confirm_title | string | no | Confirmation dialog title override. | "Decline this validation?" |
| conflict_of_interest_decline_confirm_body | string | no | Confirmation dialog body override. | "Use this when you have a conflict of interest." |
The [validators] section
Default validator count, pay, and stake. Declare [[validators.classes]] and [[validators.routes]] separately when you need personas or item-specific routing.
[validators]
num_validators = 3
reward_usd = "1.00"
stake_usd = "0.00"
| option | accepts | required | description | example |
|---|---|---|---|---|
| num_validators | int | yes | Default validators per item when no route matches. | 3 |
| reward_usd | string | yes | Default reward per completed review (quoted decimal string). | "1.00" |
| stake_usd | string | yes | Default stake required to claim ("0.00" = none). | "0.00" |
The [[validators.classes]] section
Under [validators.*], classes are validator personas (pay, stake, consensus weight, and optionally an LLM agent) — not rubric answer types.
Each class is filled either by humans claiming slots in the portal (type = "human", the default when type is omitted) or by an AI agent (type = "ai") backed by a synthetic system user and the server AI worker. Routing, composition, and escalation treat both kinds the same — you can compose multiple AI classes on the default route and escalate to a human senior reviewer when consensus is low.
Class fields
| option | accepts | required | description | example |
| ------------------ | ------- | ------------------ | --------------------------------------------------------------------------------------- | -------------------------------------------------------------------------- |
| id | string | yes | Class id used in routes. | "senior" |
| label | string | no | Display name in admin UI. | "Senior auditor" |
| type | human, ai | no (default human) | Who fills slots for this class. See AI agent classes. | "ai" |
| model | string | when type = "ai" | OpenRouter model slug — must be one of Allowed model values. | "anthropic/claude-sonnet-4.5" |
| prompt | string | when type = "ai" | System prompt sent to the model (max 64 KiB); use '''…''' literals for verbatim text. | see AI example below |
| reward_usd | string | yes | Per-review reward for this class. | "10.00" |
| stake_usd | string | yes | Per-review stake for this class. | "0.00" |
| priority | int | yes | Tie-break priority (higher = stronger). | 10 |
| consensus_weight | number | no (default 1.0) | Per-class vote weight in weighted consensus. | 1.0 |
Human example:
[[validators.classes]]
id = "senior"
label = "Senior reviewer"
reward_usd = "20.00"
stake_usd = "0.00"
priority = 10
AI agent classes
Set type = "ai" and declare model plus prompt to fill a class with an LLM instead of human validators. The worker polls unclaimed validations for that class, packages the same evidence and rubric a human would see, calls the model, and submits via the standard validation path (audit, staking, and consensus unchanged).
[[validators.classes]]
id = "opus-reviewer"
label = "Opus 4.7 reviewer"
type = "ai"
model = "anthropic/claude-opus-4.7"
prompt = '''
You are an expert reviewer. Score each rubric row using only the discrete
anchors provided. Ground every judgment in the evidence; if evidence is
insufficient, pick the midpoint and say so in the rationale.
'''
priority = 30
reward_usd = "0.00"
stake_usd = "0.00"
Constraints
- model must be an OpenRouter slug from the allowlist below. Compile-time validation rejects unknown slugs and echoes this list in the error.
- prompt is required for type = "ai" and is sent verbatim as the system message (max 64 KiB).
- type = "human" (or omitted type) must not set model or prompt; misplaced AI-only fields are rejected.
- AI classes have no wallet. reward_usd and stake_usd default to "0.00" when omitted; non-zero stake can be set but is not slashable for the synthetic user.
- consensus_weight defaults to 1.0 unless you explicitly weight AI votes differently.
- Rubric scales must use integer-only anchors when any AI class is declared (see The scale block).
Allowed model values
Each slug is an OpenRouter model id. All entries support multimodal evidence (images in messages) and strict JSON-schema response_format for rubric vectors.
| model slug | Context (tokens) | Notes |
|---|---|---|
| anthropic/claude-haiku-4.5 | 200,000 | Lowest-latency Anthropic option; good for high-volume first-pass classes. |
| anthropic/claude-opus-4.6 | 200,000 | Prior Opus generation; kept for reproducibility of votes pinned to this version. |
| anthropic/claude-opus-4.7 | 200,000 | Latest Opus; highest-quality reasoning; high-stakes or final-step reviewers. |
| anthropic/claude-sonnet-4.5 | 200,000 | Default for most AI reviewer classes: strong vision, fast, cheaper than Opus. |
| google/gemini-2.5-flash | 1,000,000 | Fast and cheap with full multimodal support. |
| google/gemini-2.5-pro | 1,000,000 | Long context, strong vision; best when evidence payloads are large. |
| x-ai/grok-4.3 | 1,000,000 | xAI flagship; vision + structured outputs; vendor diversity. |
| meta-llama/llama-4-maverick | 1,048,576 | Open-weight multimodal flagship; useful for diversity in multi-agent compositions. |
| mistralai/mistral-medium-3.1 | 131,072 | European-hosted multimodal option. |
| qwen/qwen3-vl-235b-a22b-instruct | 131,072 | Open-weight multimodal (Alibaba); broad provider coverage. |
| openai/gpt-4o | 128,000 | Strong multimodal generalist; diversity in multi-agent compositions. |
| openai/gpt-4o-mini | 128,000 | Cheap, fast multimodal; first-pass or budget-bound classes. |
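The class constraints and the allowlist can be pre-checked before upload. The sketch below is a hypothetical lint helper, not the compiler's validation: the name check_class, the error wording, and measuring the 64 KiB prompt limit in UTF-8 bytes are all assumptions.

```python
ALLOWED_MODELS = {
    "anthropic/claude-haiku-4.5", "anthropic/claude-opus-4.6",
    "anthropic/claude-opus-4.7", "anthropic/claude-sonnet-4.5",
    "google/gemini-2.5-flash", "google/gemini-2.5-pro",
    "x-ai/grok-4.3", "meta-llama/llama-4-maverick",
    "mistralai/mistral-medium-3.1", "qwen/qwen3-vl-235b-a22b-instruct",
    "openai/gpt-4o", "openai/gpt-4o-mini",
}
MAX_PROMPT_BYTES = 64 * 1024  # assumption: limit counted in UTF-8 bytes


def check_class(cls: dict) -> list:
    """Return a list of problems for one [[validators.classes]] table."""
    errors = []
    kind = cls.get("type", "human")  # type defaults to human when omitted
    if kind == "ai":
        model = cls.get("model")
        if model not in ALLOWED_MODELS:
            errors.append(f"unknown model {model!r}; allowed: {sorted(ALLOWED_MODELS)}")
        prompt = cls.get("prompt")
        if not prompt:
            errors.append('prompt is required when type = "ai"')
        elif len(prompt.encode("utf-8")) > MAX_PROMPT_BYTES:
            errors.append("prompt exceeds 64 KiB")
    elif kind == "human":
        for key in ("model", "prompt"):
            if key in cls:
                errors.append(f"{key} is only valid when type = \"ai\"")
    else:
        errors.append(f"unknown type {kind!r}")
    return errors
```

A human class that sets model, or an AI class with a slug outside the table, each produce one error under this sketch; a class that follows the constraints produces an empty list.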
Example route mixing AI panel + human escalation:
[[validators.routes]]
total = 3
[[validators.routes.composition]]
class = "opus-reviewer"
count = 1
[[validators.routes.composition]]
class = "sonnet-reviewer"
count = 1
[[validators.routes.composition]]
class = "maverick-reviewer"
count = 1
[[validators.routes.escalation]]
match = { consensus_below = 0.6 }
add = 1
[[validators.routes.escalation.composition]]
class = "senior"
count = 1
Fixture: server/internal/projectspec/testdata/validators_ai.poq.toml. Design notes: issue #633; implementation server/internal/aivalidator.
The [[validators.routes]] section
Under [validators.*], routes assign validator counts and class slots to matching items — not HTTP or API routes.
Route fields
Use match to select items by [ingestion.fields] column names. A key may be a scalar (exact match) or an array (any listed value). Omit match on exactly one catch-all route (or use match = {}).
[[validators.routes]]
match = { proposed_severity = ["high", "critical"], image_type = "skin_lesion" }
total = 5
[[validators.routes.composition]]
class = "senior"
count = 2
[[validators.routes]]
total = 3 # catch-all — no match
[[validators.routes.escalation]]
match = { consensus_below = 0.6 }
add = 2
consensus_below is a reserved runtime metric (not an [ingestion.fields] key). Other match keys must match an [ingestion.fields] key.
| option | accepts | required | description | example |
|---|---|---|---|---|
| match | table | no (omit on exactly one default route) | Match review items: [ingestion.fields] key → scalar or array of allowed values. | match = { difficulty = "hard" } |
| total | int | yes | Validators per matching item. | 5 |
| composition | array of tables | yes | Class slots: { class, count }. Use class = "*" for any class. | { class = "senior", count = 2 } |
| escalation | array of tables | no | Add validators when agreement is below threshold. Use match.consensus_below on escalation steps. | match = { consensus_below = 0.5 }, add = 2 |
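The match semantics (scalar means exact match, array means any listed value, and one catch-all route with no match) can be sketched as follows. This is an illustration only: it assumes every key in match must be satisfied and that specific routes take precedence over the catch-all. The function names route_matches and select_route are hypothetical.

```python
from typing import Optional


def route_matches(match: dict, item: dict) -> bool:
    """True when every match key accepts the item's field value."""
    for key, allowed in match.items():
        # A scalar is an exact match; an array accepts any listed value.
        allowed_values = allowed if isinstance(allowed, list) else [allowed]
        if item.get(key) not in allowed_values:
            return False
    return True


def select_route(routes: list, item: dict) -> Optional[dict]:
    """Pick the first specific route that matches, else the catch-all."""
    catch_all = None
    for route in routes:
        match = route.get("match") or {}
        if not match:
            catch_all = route  # omitted match or match = {}
        elif route_matches(match, item):
            return route
    return catch_all


routes = [
    {"match": {"proposed_severity": ["high", "critical"], "image_type": "skin_lesion"}, "total": 5},
    {"total": 3},  # catch-all
]
critical = select_route(routes, {"proposed_severity": "critical", "image_type": "skin_lesion"})
low = select_route(routes, {"proposed_severity": "low", "image_type": "skin_lesion"})
```

With the routes from the example above, the critical item lands on the five-validator route and the low-severity item falls through to the catch-all with three.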
The [[validators.qualification]] section
Each [[validators.qualification]] block defines one profile question that validators answer before claiming items.
[[validators.qualification]]
key = "audit_firm"
label = "Audit firm"
type = "text"
required = true
| option | accepts | required | description | example |
|---|---|---|---|---|
| key | string | yes | Stored profile key (snake_case, unique). | "audit_firm" |
| label | string | yes | Form question text. | "Audit firm" |
| type | enum | no (default text) | Input type; only text today. | "text" |
| required | bool | no (default false) | Must be filled before claim when true. | true |
Attestation (attestation.*)
Signed PoQ report export — the output phase after review completes.
The [attestation] section
Signed PoQ report export settings. This section controls how the signed YAML attestation is constructed and formatted for download.
[attestation]
schema_version = "poq.attestation/v1"
signing_key_id = "poq-prod-2024"
[attestation.payload]
include_per_validator_votes = true
include_rationales_inline = false
outputs_alias = "findings"
[attestation.payload.outputs]
report_hash = "audit_report_hash"
severity_in_report = "proposed_severity"
repo_url = "repository"
commit_hash = "commit_sha"
[[attestation.metadata]]
field_name = "report_source"
channel = "portal"
sender = "0xC921CF11A568142223a52C7f0b4AE7023fb3326B"
[attestation.output]
delimiter = "\n\n---\n"
fence = "yaml"
Attestation fields
| option | accepts | required | description | example |
|---|---|---|---|---|
| schema_version | string | yes | Must be poq.attestation/v1 for the current signer. | "poq.attestation/v1" |
| signing_key_id | string | yes | Matches the server-configured signing key id (e.g. sapien-prod-ed25519-v1). | "poq-prod-2024" |
| payload | table | yes | Nested table for payload options (see below). | [attestation.payload] |
| metadata | array | no | Extra signed report header fields (see below). | [[attestation.metadata]] |
| output | table | yes | Download formatting for appended mode (see below). | [attestation.output] |
The [attestation.payload] block
| option | accepts | required | description | example |
|---|---|---|---|---|
| include_per_validator_votes | bool | no | When true, the signed YAML includes each validator's specific answer for every rubric row. | true |
| include_rationales_inline | bool | no | When false, rationales use opaque references instead of inline text. | false |
| outputs_alias | string | no | Names the main findings array in the YAML. Use "findings" for security audits; defaults to "sub_reports". | "findings" |
The [attestation.payload.outputs] block
Maps attestation fields to canonical field names defined in [ingestion.fields].
| option | accepts | required | description | example |
|---|---|---|---|---|
| report_hash | string | no | Field holding the hash of the original input document. | "audit_report_hash" |
| severity_in_report | string | no | Field holding the original tool-reported severity. | "proposed_severity" |
| repo_url | string | no | Field holding the repository URL (emitted in audit_target). | "repository" |
| commit_hash | string | no | Field holding the pinned commit SHA (emitted in audit_target). | "commit_sha" |
The [[attestation.metadata]] block
Used for literal top-level YAML metadata blocks.
| option | accepts | required | description | example |
|---|---|---|---|---|
| field_name | string | yes | The top-level YAML key name (e.g. report_source). | "report_source" |
| (others) | any | no | Any other keys in the table are emitted as key-value pairs under field_name. | channel = "portal" |
The [attestation.output] block
| option | accepts | required | description | example |
|---|---|---|---|---|
| delimiter | string | yes | String inserted between the original document and the attestation in "appended" mode. | "\n\n---\n" |
| fence | enum | no | Optional markdown fence type. Only "yaml" is supported today (wraps the attestation in a yaml-tagged Markdown code fence). | "yaml" |
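How delimiter and fence might shape an appended download is sketched below. The exact byte layout the exporter produces is not documented here, so the newline handling around the fence is an assumption, and render_appended is a hypothetical helper name.

```python
from typing import Optional


def render_appended(original: str, attestation_yaml: str,
                    delimiter: str, fence: Optional[str] = None) -> str:
    """Join the original document and the attestation for "appended" mode.

    Assumption: with fence = "yaml" the attestation is wrapped in a
    yaml-tagged Markdown code fence, one fence marker per line.
    """
    body = attestation_yaml
    if fence == "yaml":
        ticks = "`" * 3  # Markdown fence marker
        body = f"{ticks}yaml\n{attestation_yaml.rstrip()}\n{ticks}\n"
    return original + delimiter + body


appended = render_appended(
    "# Audit report\n",                       # hypothetical original document
    "schema_version: poq.attestation/v1\n",   # hypothetical attestation body
    delimiter="\n\n---\n",
    fence="yaml",
)
```

The delimiter is a basic TOML string so that its \n escapes become real newlines, which is why it is the one value in this file where basic strings are preferred over literals.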
Examples by project type
Smart contract audit findings
[project]
spec_version = "1"
[[ingestion.sources]]
id = "audit_report"
type = "markdown_split"
path_glob = "reports/*.md"
splitter.regex = '^##\s+(?P<id>F-\d+):?\s+(?P<title>.+)$'
[ingestion.fields]
id = "audit_report.id"
description = "audit_report.body"
[[validation.evidence]]
type = "markdown"
ingestion_field = "description"
[[validation.evidence]]
type = "source_excerpt"
repository = "repository"
path = "source_path"
commit_sha = "commit_sha"
[[validation.rubric]]
id = "severity"
label = "Severity"
scale.type = "ordinal"
scale.labels = ["info", "low", "medium", "high", "critical"]
[validators]
num_validators = 3
reward_usd = "5.00"
stake_usd = "0.00"
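The splitter.regex in this example can be tried against a sample report before uploading. The sketch below assumes the pattern ends with a `$` anchor, that the engine applies it line by line (modelled here with re.MULTILINE), and that Python's re module treats this pattern the same way as the engine's regex dialect. The report text is hypothetical.

```python
import re

# Same pattern as splitter.regex, compiled in multiline mode (assumption).
SPLITTER = re.compile(r"^##\s+(?P<id>F-\d+):?\s+(?P<title>.+)$", re.MULTILINE)

# Hypothetical audit report with two findings.
report = """# Audit report

## F-1: Reentrancy in withdraw()
Details of the first finding.

## F-2 Unchecked return value
Details of the second finding.
"""

# Each heading match yields the named groups the spec can project as fields.
findings = [(m.group("id"), m.group("title")) for m in SPLITTER.finditer(report)]
```

Both headings match, with and without the colon after the finding id, and the top-level `# Audit report` title is ignored because it has only one `#`.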
Medical imaging (melanoma)
[project]
spec_version = "1"
[[ingestion.sources]]
id = "labels"
type = "csv"
path = "labels.csv"
[[ingestion.sources]]
id = "images"
type = "file_collection"
path_glob = "images/*.jpg"
[[ingestion.joins]]
left = "labels"
right = "images"
left_on = "image_id"
right_on = "file_id"
type = "left"
[ingestion.fields]
id = "labels.image_id"
image_url = "images.url"
label = "labels.label"
label_shown = "labels.label_shown"
difficulty = "labels.difficulty"
[validation.ground_truth]
golden_label_field = "label"
prefill_hint_field = "label_shown"
difficulty_field = "difficulty"
[[validation.evidence]]
type = "image"
ingestion_field = "image_url"
[[validation.rubric]]
id = "diagnosis"
label = "Diagnosis"
scale.type = "likert"
scale.size = 5
[validators]
num_validators = 3
reward_usd = "1.00"
stake_usd = "0.00"
[[validators.routes]]
match = { difficulty = "hard" }
total = 5
KYC document review
[project]
spec_version = "1"
[[ingestion.joins]]
left = "applicants"
right = "documents"
left_on = "user_id"
right_on = "owner_id"
type = "inner"
[[validation.rubric]]
id = "id_match"
label = "Does the ID match?"
scale.type = "ordinal"
scale.labels = ["no", "yes"]
verdict.verified_threshold = 0.8
[[validation.rubric]]
id = "doc_authentic"
label = "Is the document authentic?"
scale.type = "ordinal"
scale.labels = ["no", "yes"]
verdict.verified_threshold = 0.8
[validation.verdict]
composition = "strict_and"
[validators]
num_validators = 3
reward_usd = "0.50"
stake_usd = "0.00"
[validators.actions]
allow_conflict_of_interest_self_decline = true
For developers
On-disk TOML in this reference is the authoring shape accepted by the parser (ingestion.*, validation.*, validators.*, top-level [attestation]). Legacy [stage.*] and [[inputs]] roots are no longer accepted.
Compiled runtime JSON still uses internal names (rubric_rows, ground_truth, evidence_blocks); mapping from the namespaced TOML layout to compile output is 1