poq.toml Specification

The poq.toml file for each Sapien project is its system spec. It turns the data you upload into a working project: the fields on each review item, what validators see, the questions they answer, and how they are scored.

Every poq.toml has three phases (ingestion, validation, attestation) plus a validators config and file metadata:

| Phase / config | Namespace | Purpose |
| --- | --- | --- |
| File metadata | [project] | Spec format version (spec_version) and optional tag |
| Ingestion | [ingestion.*] | Sources, joins, field projection → datapoints |
| Validation | [validation.*] | Task UI, rubric, scoring |
| Validators | [validators.*] | Who reviews: counts, pay, classes, routing, qualification |
| Attestation | [attestation.*] | Signed PoQ report export |

Ingestion — Turn uploaded files into datapoint rows: declare sources, optional joins, and which columns each task carries.

  • [[ingestion.sources]](#the-ingestionsources-section) (required) — Where your data lives and how Sapien reads it.
  • [[ingestion.joins]](#the-ingestionjoins-section) (optional) — Combine separate sources before field projection.
  • [ingestion.fields](#the-ingestionfields-section) (required) — Name each column on a task and say which file it came from (e.g. title = "findings.title").

Validation — What validators see on each task, the questions they answer, and how answers are scored.

  • [[validation.evidence]](#the-validationevidence-section) (required) — What validators see on the task page.
  • [[validation.rubric]](#the-validationrubric-section) (required) — Questions validators answer.
  • [validation.verdict](#the-validationverdict-section) (optional) — Item-level pass/fail rollup across rubric rows.
  • [validation.ground_truth](#the-validationground_truth-section) (optional) — Secret labels, prefill hints, and difficulty binning.
  • [[validation.instructions]](#the-validationinstructions-section) (optional) — Onboarding slides. Currently not rendered.

Validators — Who reviews each item, how many validators, pay and stake, routing by field values, and gates before claim.

  • [validators](#the-validators-section) (required) — Default validator count, pay, and stake.
  • [[validators.classes]](#the-validatorsclasses-section) (optional) — Validator personas (human or AI), pay, and consensus weight.
  • [[validators.routes]](#the-validatorsroutes-section) (optional) — Route items to validator counts and classes.
  • [[validators.qualification]](#the-validatorsqualification-section) (optional) — Profile fields required before claim.
  • [validators.claims](#the-validatorsclaims-section) (optional) — Claim TTL before a task expires.
  • [validators.actions](#the-validatorsactions-section) (optional) — Task-page actions (e.g. conflict-of-interest self-decline).

Attestation — Signed proof-of-quality report export: schema, signing key, payload fields, and output format.

  • [attestation](#the-attestation-section) (optional) — Signed report download settings.

Migration from the flat layout

| Flat layout (old) | Namespaced layout (new) |
| --- | --- |
| [[inputs]] | [[ingestion.sources]] |
| [[data.joins]] | [[ingestion.joins]] |
| [[data]] (N blocks with name / source) | [ingestion.fields] (one table, one line per field) |
| [[evidence]] | [[validation.evidence]] (field → ingestion_field) |
| [[rubric.rows]] | [[validation.rubric]] |
| [verdict] | [validation.verdict] |
| [ground_truth] / [stage.ground_truth] | [validation.ground_truth] (same keys) |
| [instructions] | [[validation.instructions]] |
| [validator_actions] | [validators.actions] |
| [validation] + claim_ttl_minutes | [validators.claims] + duration_minutes |
| [validators] | [validators] (defaults only) |
| [[validators.classes]] | [[validators.classes]] |
| [[validators.routes]] | [[validators.routes]] |
| [[qualification]] / [qualification] | [[validators.qualification]] |
| [stage.attestation] | [attestation] (top-level) |
| [attestation] | [attestation] (unchanged) |

Unknown keys are rejected. The parser is strict: any key or table that isn't part of the schema (a typo, an extra field, or a leftover from the flat layout such as [[inputs]] or [stage.*]) fails at parse time with an error naming the offending key, rather than being silently ignored. Migrate old specs to the namespaced layout above.

The Data Lifecycle: From Files to Review

Before writing your spec, it is helpful to understand how Sapien transforms your files into tasks.

1. File Upload

You upload your files (CSVs, JSONs, images, or Markdown) from a project folder.

2. Ingestion

When you Ingest a project, Sapien runs your poq.toml spec to build the review items.

  • The Spreadsheet Model: Sapien treats every collection of files as a temporary spreadsheet.
    • Rows: Each individual item in your file (a CSV line, a JSON file, a Markdown section).
    • Columns: The data inside those items (a CSV header, a JSON key, a regex capture).
  • The Mapping: [ingestion.fields] is the wiring — each table key is the column name used everywhere else; each value is <source_id>.<column> at ingest time.
    • Example: finding_title = "findings.title" keeps the JSON title property under the name finding_title.
  • The Merge (Optional): If you have multiple sources (e.g., labels in a CSV and images in a folder), [[ingestion.joins]] lines them up into one wider spreadsheet before field projection.
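The mapping step can be sketched in Python. This is a minimal illustration, not Sapien's ingest code: it assumes a joined row can be modelled as a dictionary keyed by source id, and the project_fields helper and sample values are hypothetical.

```python
def project_fields(wide_row, fields):
    """Build one datapoint from a joined row using an [ingestion.fields]-style mapping."""
    datapoint = {}
    for key, ref in fields.items():
        # Assumption: the source id is everything before the first dot.
        source_id, _, column = ref.partition(".")
        datapoint[key] = wide_row[source_id][column]
    return datapoint


# Hypothetical joined row: one entry per source id.
wide_row = {"findings": {"id": "F-04", "title": "Centralization risk"}}
fields = {"id": "findings.id", "finding_title": "findings.title"}

datapoint = project_fields(wide_row, fields)
print(datapoint)
```

Only the keys on the left of the mapping (id, finding_title) end up on the datapoint; the original column names do not.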

3. Datapoints

Once ingestion finishes, the temporary spreadsheets are deleted. What remains are review items, stored in a database table.

  • One row = one task: Each review item is a single row in the database.
  • Persistent data: The keys you declared in [ingestion.fields] are saved on that row.

4. Review

When a validator starts a task, they aren't looking at your original files. They are looking at one specific review item that was ingested.

  • The [validation.*] sections tell the UI what to show, what to ask, and how items are scored; the [validators.*] sections decide who validates, how many, pay, routing, and qualification gates.

A working spec needs at minimum: [project] with spec_version, at least one [[ingestion.sources]], [ingestion.fields] with a key named id, at least one [[validation.evidence]], at least one [[validation.rubric]], and [validators]. Every other section is optional.
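The minimum above can be expressed as a small checklist. The sketch below is an assumption-laden illustration rather than the real compiler: it works on an already parsed spec (for example the dictionary returned by a TOML parser), and the missing_required helper and its labels are hypothetical.

```python
def missing_required(spec):
    """Return the required pieces that are absent from a parsed poq.toml dictionary."""
    ingestion = spec.get("ingestion", {})
    validation = spec.get("validation", {})
    checks = [
        ("[project].spec_version", "spec_version" in spec.get("project", {})),
        ("[[ingestion.sources]]", bool(ingestion.get("sources"))),
        ("[ingestion.fields].id", "id" in ingestion.get("fields", {})),
        ("[[validation.evidence]]", bool(validation.get("evidence"))),
        ("[[validation.rubric]]", bool(validation.get("rubric"))),
        ("[validators]", "validators" in spec),
    ]
    return [name for name, present in checks if not present]


# Hypothetical parsed spec with no rubric and no [validators] table.
spec = {
    "project": {"spec_version": "1"},
    "ingestion": {
        "sources": [{"id": "findings", "type": "json", "path_glob": "findings/*.json"}],
        "fields": {"id": "findings.id", "description": "findings.description"},
    },
    "validation": {"evidence": [{"type": "markdown", "ingestion_field": "description"}]},
}
print(missing_required(spec))
```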

The [project] section

The [project] header declares which version of the spec format you wrote your file against.

[project]
spec_version = "1"

Today the only accepted value for spec_version is "1". Any other value is rejected as soon as the file is read.

| option | accepts | required | description | example |
| --- | --- | --- | --- | --- |
| spec_version | string | yes | Tells Sapien which version of the spec format you wrote. Set to "1", the only value accepted today. | "1" |

Ingestion (ingestion.*)

Everything before review: declare sources, optional joins, and the field projection that becomes each datapoint row.

The [[ingestion.sources]] section

[[ingestion.sources]] tells Sapien which files contain your project's data and how to read them. Every batch of files is its own [[ingestion.sources]] block.

One Primary Source: Sapien builds your task list from the first [[ingestion.sources]] block listed in your TOML. If you want tasks from multiple files or folders, you should group them into this first block using a glob.

  • Multiple Markdown files: Use path_glob = "reports/*.md". Every section in every matching file will become a task.
  • Multiple JSON folders: Use path_glob = "{folder_a,folder_b}/*.json". Every JSON file in both folders will become a task.

If you create separate [[ingestion.sources]] blocks (with different ids), Sapien assumes they are different types of data that need to be joined together (like joining a CSV of labels to a folder of images). Data in secondary inputs will only appear in your tasks if you explicitly join them to the first input.


[[ingestion.sources]]
id = "findings"
type = "json"
path_glob = "findings/*.json"

The id is the label you pick for this batch. Wire columns in [ingestion.fields] using that id as the prefix:

[ingestion.fields]
title = "findings.title"    # "findings" is the source id; ".title" is the column

With multiple inputs (CSV + image folder), ids disambiguate sources:


[[ingestion.sources]]
id = "labels"
type = "csv"
path = "labels.csv"

[[ingestion.sources]]
id = "images"
type = "file_collection"
path_glob = "images/*.jpg"

[ingestion.fields]
diagnosis = "labels.diagnosis"

Input fields

| option | accepts | required | description | example |
| --- | --- | --- | --- | --- |
| id | string | yes | Short name for this batch. Used in [ingestion.fields] values (findings.title), join left/right, and route match keys. Must be unique across [[ingestion.sources]]. | "findings" |
| type | enum | yes | How Sapien reads this batch: csv, json, file_collection, or markdown_split. | "json" |
| path | string | no (defaults to <id>.<ext> for csv/json) | Single uploaded file when this input is one file. Cannot start with / or contain "..". | "findings.csv" |
| path_glob | string | yes for file_collection; one of path / path_glob for json and markdown_split | Pattern matching many files (e.g. images/*.jpg). Use instead of path for folders. | "images/*.jpg" |
| splitter.* | see below | yes for markdown_split (at least splitter.regex) | For markdown_split only; full list in Splitter settings. | splitter.regex = "^## (?P<id>.+)$" |

Per-type field set

Each type allows a different subset of keys. The compiler rejects forbidden keys with a path-prefixed error (for example inputs[1].path_glob: only valid on file_collection or json inputs).

| Key | csv | json | file_collection | markdown_split |
| --- | --- | --- | --- | --- |
| id | required | required | required | required |
| type | required | required | required | required |
| path | optional (defaults <id>.csv) | optional (defaults <id>.json) | forbidden | optional |
| path_glob | forbidden | optional | required | optional |
| file_id_strategy | forbidden | forbidden | required | forbidden |
| splitter.regex | forbidden | forbidden | forbidden | required |
| splitter.end_regex, [[ingestion.sources.splitter.metadata]] | forbidden | forbidden | forbidden | optional |

json and markdown_split accept either path (single file) or path_glob (many files). Setting both is a compile-time error.
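The per-type rules above can be read as a lookup table. The following Python sketch encodes the required and forbidden cells plus the path / path_glob exclusivity rule. It is illustrative only: the check_source helper, the flat dictionary shape, and the error wording are assumptions, not the compiler's actual behaviour or messages.

```python
# Rules copied from the per-type table; "optional" cells need no check.
KEY_RULES = {
    "path": {"csv": "optional", "json": "optional",
             "file_collection": "forbidden", "markdown_split": "optional"},
    "path_glob": {"csv": "forbidden", "json": "optional",
                  "file_collection": "required", "markdown_split": "optional"},
    "file_id_strategy": {"csv": "forbidden", "json": "forbidden",
                         "file_collection": "required", "markdown_split": "forbidden"},
    "splitter.regex": {"csv": "forbidden", "json": "forbidden",
                       "file_collection": "forbidden", "markdown_split": "required"},
}


def check_source(source):
    """Return a list of problems for one source, given as a flat dict of keys."""
    errors = []
    kind = source["type"]
    for key, per_type in KEY_RULES.items():
        rule = per_type[kind]
        if rule == "forbidden" and key in source:
            errors.append(f"{key}: not allowed on {kind} sources")
        elif rule == "required" and key not in source:
            errors.append(f"{key}: required on {kind} sources")
    if "path" in source and "path_glob" in source:
        errors.append("path and path_glob: set only one")
    return errors


print(check_source({"id": "findings", "type": "json", "path_glob": "findings/*.json"}))
print(check_source({"id": "labels", "type": "csv", "path_glob": "*.csv"}))
```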

The markdown_split input type

Use markdown_split when one or more Markdown files should become many review items. Each regex match in splitter.regex becomes one row on that input.

Write the pattern from header lines in your file — copy the fixed prefix literally, use (?P<name>...) for the parts that change on each row:

| Header line in your Markdown (string) | splitter.regex |
| --- | --- |
| ## F-01: Missing rate limit | '^##\s+(?P<id>F-\d+):?\s+(?P<title>.+)$' |
| ## F-02: SQL injection | (same regex: one pattern matches every section header) |

Optional splitter.end_regex — stop the section body before the next header when subsections use the same ## level:

| Line where the section should end (string) | splitter.end_regex |
| --- | --- |
| ## Next finding | '^##\s+' |

Optional [[ingestion.sources.splitter.metadata]] — extract extra columns from a line in the file or in each section.

Repository row (document header table) — string → regex:

| Repository | my-repo |
'^\|\s*Repository\s*\|\s*(?P<repository>[^|]+?)\s*\|'

File line (per section) — string → regex:

- **File**: `src/auth.ts:L42`
'^-\s*\*\*File\*\*:\s*`?(?P<source_file>.+?):L(?P<line_number>\d+)'

[[ingestion.sources]]
id = "audit_findings"
type = "markdown_split"
path_glob = "reports/*.md"
splitter.regex = '^##\s+(?P<id>F-\d+):?\s+(?P<title>.+)$'
splitter.end_regex = '^##\s+'

[[ingestion.sources.splitter.metadata]]
scope = "document"
column = "repository"
regex = '^\|\s*Repository\s*\|\s*(?P<repository>[^|]+?)\s*\|'

Splitter settings

These keys apply only when type = "markdown_split". Regex values use RE2; Sapien adds multiline matching ((?m)) at compile time — do not prefix patterns yourself.

| option | accepts | required | description | example |
| --- | --- | --- | --- | --- |
| splitter.regex | string (regex) | yes (markdown_split) | Header pattern that starts each section. Each match becomes one row. Named captures ((?P<id>...)) become output columns on this input. | ## F-01: … → '^##\s+(?P<id>F-\d+):?\s+(?P<title>.+)$' |
| splitter.end_regex | string (regex) | no | Optional early stop for body. After a header match, Sapien scans forward for this pattern; the section ends there instead of at the next header (or EOF). | ## Next … → '^##\s+' |
| [[ingestion.sources.splitter.metadata]] | array of tables with fields listed below | no | Repeat to add extra output columns. Each row uses scope, column, regex, and optional capture below. | [[ingestion.sources.splitter.metadata]] with scope = "document", column = "repository" |
| scope | document, section | yes (each metadata row) | document: run regex once on the full file; value copied to every row from that file. section: run regex on each split section (body span) only. | "document" |
| column | string | yes (each metadata row) | Output column name. Wire in [ingestion.fields] as repository = "<input_id>.repository". | "repository" |
| regex | string (regex) | yes (each metadata row) | Pattern to extract the value. Use a named capture matching column, or a single unnamed capture group. | see string → regex examples above |
| capture | string | no | Named capture to read from regex when it differs from column. Defaults to column. Use when one regex fills several columns. | "line_number" |

After ingest, each split row exposes columns you can wire in [ingestion.fields] — e.g. named captures from splitter.regex, metadata column names, and built-ins such as body and row_id:


[ingestion.fields]
finding_body = "audit_findings.body"
repo         = "audit_findings.repository"
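To see how a header pattern turns one Markdown file into rows, here is a rough Python sketch. It uses Python's re module with re.MULTILINE as a stand-in for RE2 with (?m), and it assumes the body is the text between one header and the next (or EOF). The split_markdown helper and the sample document are hypothetical, not Sapien's splitter.

```python
import re

# Patterns copied from the examples above; re.MULTILINE plays the role of (?m).
HEADER = re.compile(r'^##\s+(?P<id>F-\d+):?\s+(?P<title>.+)$', re.MULTILINE)
REPOSITORY = re.compile(r'^\|\s*Repository\s*\|\s*(?P<repository>[^|]+?)\s*\|', re.MULTILINE)


def split_markdown(text):
    """Return one dict per header match, with document-scope metadata copied to each."""
    doc_match = REPOSITORY.search(text)  # scope = "document": searched once per file
    repository = doc_match.group("repository") if doc_match else None
    headers = list(HEADER.finditer(text))
    rows = []
    for i, match in enumerate(headers):
        # Assumption: the body runs until the next header, or to the end of the file.
        end = headers[i + 1].start() if i + 1 < len(headers) else len(text)
        rows.append({
            "id": match.group("id"),
            "title": match.group("title"),
            "body": text[match.end():end].strip(),
            "repository": repository,
        })
    return rows


sample = """| Repository | my-repo |

## F-01: Missing rate limit
No throttling on login.

## F-02: SQL injection
Unsanitized query parameter.
"""
rows = split_markdown(sample)
for row in rows:
    print(row["id"], "|", row["title"], "|", row["repository"])
```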

The [[ingestion.joins]] section

The purpose of [[ingestion.joins]] is to line up related data from different files so they can be treated as a single task.

Use this section only if you declared more than one [[ingestion.sources]] block. It merges your separate data sources into one wider "spreadsheet" before you pick your final columns in [ingestion.fields].


[[ingestion.joins]]
left = "labels"
right = "images"
left_on = "case_id"
right_on = "file_id"
type = "left"

| option | accepts | required | description | example |
| --- | --- | --- | --- | --- |
| left | string | yes | Input id on the left side of the join (usually the table you want to keep all rows from). | "labels" |
| right | string | yes | Input id on the right side. | "images" |
| left_on | string | yes | Column on the left batch to match on. | "case_id" |
| right_on | string | yes | Column on the right batch to match on; often file_id for file collections. | "file_id" |
| type | enum | yes | Join mode: left keeps all left rows; inner drops non-matching rows. | "left" |

FROM-root order (required for connected joins)

In plain terms: think of building one big spreadsheet by gluing smaller ones together. Start with your main file (one row per task — e.g. labels.csv) and list it first. Each join then glues a new file onto what you already have: left is the file you already have, right is the file you're adding. Always glue new files onto the main one — never the other way around. If you flip it, the tool can't attach your main file and ingest fails.

Ingest builds SQL with the first [[ingestion.sources]] entry as the FROM root (one row per review item — map id from this source). Each [[ingestion.joins]] row adds right as a new table; left must already be in the FROM chain.

Getting this backwards (e.g. left = "images", right = "labels" when labels is primary) produces ingest SQL where the primary table never enters FROM.

# Primary CSV first, then join enrichment
[[ingestion.sources]]
id = "labels"
type = "csv"
path = "labels.csv"

[[ingestion.sources]]
id = "images"
type = "file_collection"
path_glob = "images/*.jpg"

[[ingestion.joins]]
left = "labels"      # primary / already in FROM
right = "images"     # new table
left_on = "image_id"
right_on = "file_id"
type = "left"
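The ordering rule can be checked mechanically. The sketch below is a simplified model of the FROM chain, assuming source ids and joins are given in file order; the check_join_order helper and its error text are hypothetical and do not reproduce Sapien's SQL generation.

```python
def check_join_order(source_ids, joins):
    """Raise ValueError if any join's left side is not yet part of the FROM chain."""
    if not source_ids:
        raise ValueError("at least one source is required")
    in_from = [source_ids[0]]  # the first source is the FROM root
    for index, join in enumerate(joins):
        if join["left"] not in in_from:
            raise ValueError(
                f"joins[{index}]: left {join['left']!r} is not in the FROM chain yet"
            )
        in_from.append(join["right"])  # each join adds its right side
    return in_from


sources = ["labels", "images"]
good = [{"left": "labels", "right": "images"}]
flipped = [{"left": "images", "right": "labels"}]

print(check_join_order(sources, good))
try:
    check_join_order(sources, flipped)
except ValueError as error:
    print("rejected:", error)
```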

The [ingestion.fields] section

[ingestion.fields] is one flat table that defines every column on each review item. Each key is the column name used everywhere after ingest; each value is where the data comes from: <source_id>.<column>.

If you used [[ingestion.joins]] to merge multiple sources, values can reference columns from any joined source.

After ingest, those columns are stored on each review item in the database. See The Data Lifecycle.

Every project must include exactly one key named id. That key is the unique identifier for each review task.

Ten fields need about ten lines — not thirty repeating two-key blocks.

Worked example: JSON findings (finding-004.json)

Example file: datasets/test-audit-contract/findings/finding-004.json in poq-monorepo. One JSON file = one review item.

Step 1 — declare the input (gives you the findings. prefix):


[[ingestion.sources]]
id = "findings"
type = "json"
path_glob = "findings/*.json"

Step 2 — map JSON keys to review-item columns. Each row below becomes one line in [ingestion.fields]. The value uses the JSON property name after ingest; the key is what you use everywhere else — and what lands in datapoint.canonical_fields.

| JSON key in file | Value in [ingestion.fields] | Suggested key | Value in finding-004.json |
| --- | --- | --- | --- |
| id | findings.id | id | "F-04" |
| title | findings.title | finding_title | "Centralization risk — single EOA owner" |
| description | findings.description | description | (markdown narrative) |
| sourcePath | findings.sourcePath | source_path | "src/Counter.sol" |
| lineNumber | findings.lineNumber | line_number | 7 |
| proposedSeverity | findings.proposedSeverity | proposed_severity | "low" |
| repository | findings.repository | repository | repo URL |
| commitSha | findings.commitSha | commit_sha | pinned commit hash |

[ingestion.fields]
id                = "findings.id"
finding_title     = "findings.title"
description       = "findings.description"
source_path       = "findings.sourcePath"
line_number       = "findings.lineNumber"
proposed_severity = "findings.proposedSeverity"
repository        = "findings.repository"
commit_sha        = "findings.commitSha"

Step 3 — use these keys in later sections. Reference [ingestion.fields] keys only — never raw <source_id>.<column> paths outside that table.

For example, to show the finding's description to a validator:

[[validation.evidence]]
type = "markdown"
ingestion_field = "description"

Or to route tasks based on severity:

[[validators.routes]]
match = { proposed_severity = "low" }
total = 5

You should never reference the original source (like findings.sourcePath or findings.proposedSeverity) outside of [ingestion.fields].
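A quick way to catch raw source paths is to compare every ingestion_field value against the declared keys. This Python sketch assumes the spec is already parsed into dictionaries; the unknown_field_refs helper is hypothetical, and how the real compiler reports such references is not covered here.

```python
def unknown_field_refs(fields, evidence_blocks):
    """Return (block index, value) pairs whose ingestion_field is not a declared key."""
    unknown = []
    for index, block in enumerate(evidence_blocks):
        ref = block.get("ingestion_field")
        if ref is not None and ref not in fields:
            unknown.append((index, ref))
    return unknown


fields = {"id": "findings.id", "description": "findings.description"}
evidence = [
    {"type": "markdown", "ingestion_field": "description"},           # declared key
    {"type": "markdown", "ingestion_field": "findings.description"},  # raw source path
]
print(unknown_field_refs(fields, evidence))
```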

CSV + images example (two inputs, no JSON):

[ingestion.fields]
id         = "labels.case_id"
image_path = "images.path"

| Key (table) | Value (ingest wiring) | Required | Description |
| --- | --- | --- | --- |
| (each key) | string | yes | Key: column name used everywhere else (ingestion_field, route match keys). One key must be id. Value: <source_id>.<column>. After joins, values may reference any joined source. |

Validation (validation.*)

What validators interact with and how their answers are scored: task UI, evidence, rubric, verdict rollup, ground truth, and onboarding. Who does the reviewing — counts, pay, classes, routing, claim, and qualification gates — lives in the Validators section.

Values of ingestion_field = "..." are always keys from [ingestion.fields] unless this doc marks them as literal UI copy. Keys on source_excerpt / source_link blocks (repository, path, commit_sha, …) also hold [ingestion.fields] key names.

The [[validation.evidence]] section

Each [[validation.evidence]] block is one thing validators see on the task page (an image, markdown text, a source link, a fact table, etc.). Blocks resolve per-row values from [ingestion.fields] only — do not put static markdown or catalogue text in evidence (use [[validation.instructions]] for fixed validator guidance).

When a validator opens a task, each evidence block says which fields to show and how.

Every block starts with type. After that, keys on the block map render inputs from [ingestion.fields] keys. Values on the right of ingestion_field (and on source_excerpt / source_link keys) are always [ingestion.fields] key names, not file paths or raw JSON properties.

Optional title on any evidence block sets the panel heading above that widget. The value is literal display text (not an [ingestion.fields] key). Omit title to render no heading — there are no default headings per type.

Worked example: one task page, every evidence type

This example builds on the JSON findings walkthrough. First map the columns you need in [ingestion.fields]:


[ingestion.fields]
description       = "findings.description"
image_path        = "images.path"
finding_json      = "findings.payload"
repository        = "findings.repository"
source_path       = "findings.sourcePath"
commit_sha        = "findings.commitSha"
line_number       = "findings.lineNumber"
finding_title     = "findings.title"
proposed_severity = "findings.proposedSeverity"

Add one [[validation.evidence]] block per widget. Order in the file is display order (markdown and images tend to appear first).


[[validation.evidence]]
type = "markdown"
title = "Finding"
ingestion_field = "description"

[[validation.evidence]]
type = "image"
title = "Dermoscopy image"
ingestion_field = "image_path"

[[validation.evidence]]
type = "json_finding"
ingestion_field = "finding_json"

[[validation.evidence]]
type = "source_excerpt"
repository = "repository"
path = "source_path"
commit_sha = "commit_sha"

[[validation.evidence]]
type = "source_link"
repository = "repository"
path = "source_path"
commit_sha = "commit_sha"
line_number = "line_number"
label = "Open in GitHub"

[[validation.evidence]]
type = "datapoint_facts"
fields = [
  { label = "Title", field = "finding_title" },
  { label = "Severity", field = "proposed_severity" },
]

You rarely need every type on one project — pick the blocks that match your data. A melanoma project might only declare the image block; an audit might use markdown, source_excerpt, and datapoint_facts.

How to read a block: repository = "repository" means “read the repository column from this task’s row.” The string on the right is a [ingestion.fields] key, not a GitHub URL literal.

Evidence types

| type | What validators see | Wiring pattern |
| --- | --- | --- |
| markdown | Rich text (finding narrative) | ingestion_field |
| image | Image from a URL column | ingestion_field |
| json_finding | JSON payload (inline or URL) | ingestion_field |
| source_excerpt | Pinned GitHub file in the panel | flat keys on the block |
| source_link | Link out to GitHub file + line | flat keys on the block |
| datapoint_facts | Label/value fact table | fields = [{ label, field }, …] on the block |

Shared on all types: optional title (literal panel heading; omit for no heading).

Single-column evidence (image, markdown, json_finding)

One column from the row — set type and ingestion_field:


[[validation.evidence]]
type = "markdown"
ingestion_field = "description"

| option | accepts | required | description | example |
| --- | --- | --- | --- | --- |
| type | enum | yes | image, markdown, or json_finding. | "markdown" |
| ingestion_field | string | yes | The [ingestion.fields] key whose cell value feeds the widget. | "description" |
| title | string | no | Literal panel heading; omit for no heading. | "Finding" |

source_excerpt

Fetches and displays source from a public GitHub repo at a pinned commit (in-panel viewer):


[[validation.evidence]]
type = "source_excerpt"
repository = "repository"
path = "source_path"
commit_sha = "commit_sha"

| option | accepts | required | description | example |
| --- | --- | --- | --- | --- |
| type | enum | yes | Must be source_excerpt. | "source_excerpt" |
| title | string | no | Literal panel heading; omit for no heading. | "Source" |
| repository | string | yes | [ingestion.fields] key for repo URL or owner/repo. | "repository" |
| path | string | one of path or paths | [ingestion.fields] key for one file path in the repo. | "source_path" |
| paths | string | one of path or paths | [ingestion.fields] key whose cell is an array of paths (UI shows a file picker). | "source_paths" |
| commit_sha | string | no | [ingestion.fields] key for commit hash; UI defaults to main when empty. | "commit_sha" |

Use path or paths, not both.

source_link

Compact link to a pinned GitHub file (opens in a new tab). Verifies the path exists at the commit before linking.


[[validation.evidence]]
type = "source_link"
repository = "repository"
path = "source_path"
commit_sha = "commit_sha"
line_number = "line_number"
label = "Open in GitHub"

| option | accepts | required | description | example |
| --- | --- | --- | --- | --- |
| type | enum | yes | Must be source_link. | "source_link" |
| title | string | no | Literal panel heading; omit for no heading. | "Open source" |
| repository | string | yes | [ingestion.fields] key for repo URL or owner/repo. | "repository" |
| path | string | yes | [ingestion.fields] key for file path in the repo. | "source_path" |
| commit_sha | string | yes | [ingestion.fields] key for pinned commit hash. | "commit_sha" |
| line_number | string | no | [ingestion.fields] key for line number (GitHub line anchor). | "line_number" |
| label | string | no | Literal button copy, or a [ingestion.fields] key when label text varies per row. | "Open in GitHub" |

datapoint_facts

Structured label/value rows (severity, id, found-by, etc.). Declare fields as an inline array on the block — each element has label (UI copy) and field (an [ingestion.fields] key):


[[validation.evidence]]
type = "datapoint_facts"
title = "Finding facts"
fields = [
  { label = "Title", field = "finding_title" },
  { label = "Severity", field = "proposed_severity" },
]

| option | accepts | required | description | example |
| --- | --- | --- | --- | --- |
| type | enum | yes | Must be datapoint_facts. | "datapoint_facts" |
| title | string | no | Literal panel heading above the fact table; omit for none. | "Finding facts" |
| fields | array of tables | yes | Ordered rows shown in the fact table. At least one element required. | see example above |
| fields[].label | string | yes (each fields element) | Fixed heading beside the value: UI copy, not a column. | "Severity" |
| fields[].field | string | yes (each fields element) | [ingestion.fields] key whose cell value is shown. | "proposed_severity" |

The [[validation.rubric]] section

Each [[validation.rubric]] block is one question validators answer on the task page — its label, answer buttons (scale.*), and how that single question is scored.

This section is only about questions. It does not set the overall pass/fail for the whole item; that optional rollup lives in the separate [validation.verdict] section below.


[[validation.rubric]]
id = "severity"
label = "Severity"
prompt = "How severe is this finding?"
scale.type = "ordinal"
scale.labels = ["info", "low", "medium", "high", "critical"]
consensus_weight = 1.0

Add one [[validation.rubric]] block per question. Reference rows elsewhere by id (for example route match keys, verdict.match skip rules, or ground-truth wiring).

Rubric row fields

Each [[validation.rubric]] block sets question copy, scale (scale.*), optional verdict overrides (verdict.verified_threshold, verdict.match), and consensus weight on the same row.

| option | accepts | required | description | example |
| --- | --- | --- | --- | --- |
| id | string | yes | Internal name for this question; unique within the rubric. | "severity" |
| label | string | yes | Heading validators see above the answer buttons. | "Severity" |
| prompt | string | no | Extra question text under the label. | "How severe is this finding?" |
| tooltip | string | no | Help text below the buttons (visible, not hover-only). | "Match severity to OWASP guidance." |
| role | string | no | Tag for UI affordances (e.g. influence_gauge, certainty). | "influence_gauge" |
| scale.type | enum | yes | Answer template: likert, ordinal, numeric. | "ordinal" |
| scale.size | int | yes (likert only) | Likert point count; must be 5 or 7 when scale.type is likert. | 5 |
| scale.labels | array of strings | yes (ordinal); no (likert); optional (numeric) | Button labels. Must match scale.size on likert rows if set. | ["low", "medium", "high"] |
| scale.values | array of numbers | yes (numeric) | Stored scores (0–100). | [0, 50, 100] |
| verdict.verified_threshold | number | no | Share of validators that must agree on the winning label for verified on this row. See Row-level verdict. | 0.85 |
| verdict.match | map (row id → labels) | no | Skip this row (n_a) when another row’s consensus matches a trigger label. | { validity = ["False positive"] } |
| consensus_weight | number | no (default 1.0) | How much this row counts in consensus deviation math. 0 excludes the row (advisory only). | 1.0 |

Do not rename consensus_weight to importance. Removed keys verdict.type and verdict.contested_rule.* fail parse.

The scale block (scale.*)

There is no separate top-level [scale] section. On each [[validation.rubric]] block, dotted keys (scale.type, scale.size, …) define the answer buttons validators see and the 0–100 number stored when they pick one.

At compile time the server turns every scale into a list of (label, value) pairs. Validators click labels; consensus and storage always use the numeric values. Different rows in the same project can use different scale types (e.g. severity as ordinal, confidence as numeric, diagnosis as 7-point Likert).

scale.type — which answer template

Required. Picks one of three templates:

| scale.type | Use for | Also set on the row |
| --- | --- | --- |
| likert | Symmetric agree/disagree or frequency scales (“Strongly disagree” … “Strongly agree”, “Never” … “Always”) | scale.size = 5 or 7; optional scale.labels (must match size) |
| ordinal | Your own ordered categories (severity, validity, …) | Required scale.labels: one label per button; count sets the number of options |
| numeric | Explicit 0–100 steps you choose (e.g. certainty 0/25/50/75/100) | Required scale.values; optional matching scale.labels |

For ordinal and likert types, the compiler auto-generates evenly spaced 0–100 values from the label count or scale.size. For numeric, you declare the stored scores in scale.values.

AI validator classes: When any [[validators.classes]] row has type = "ai", every rubric scale must resolve to integer-only anchors (the AI worker constrains model output to integers per row). Practical rules: likert with scale.size = 5 is fine; scale.size = 7 produces fractional steps and is rejected. ordinal label counts must satisfy (count - 1) divides 100 (e.g. five labels). numeric scales must use integer scale.values. See AI agent classes for class fields.
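The integer-anchor rule follows from how evenly spaced values fall on the 0–100 range. The Python sketch below assumes that "evenly spaced" means anchor i of n sits at 100 · i / (n − 1); the auto_values and integer_only helpers are hypothetical and only illustrate the arithmetic.

```python
from fractions import Fraction


def auto_values(count):
    """Evenly spaced anchors from 0 to 100 for `count` labels (assumed spacing rule)."""
    if count < 2:
        raise ValueError("a scale needs at least two options")
    return [Fraction(100 * i, count - 1) for i in range(count)]


def integer_only(values):
    """True when every anchor is a whole number, as AI validator classes require."""
    return all(value.denominator == 1 for value in values)


for size in (2, 5, 7):
    values = auto_values(size)
    print(size, [str(v) for v in values], integer_only(values))
```

Under this assumption a 5-point scale gives 0, 25, 50, 75, 100, while a 7-point scale steps by 100/6 and therefore lands on fractions.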

Examples by scale type

Ordinal (multi-bucket) — labels required; values auto-spaced:


[[validation.rubric]]
id = "severity"
label = "Severity"
scale.type = "ordinal"
scale.labels = ["info", "low", "medium", "high", "critical"]

Ordinal (binary yes/no) — still scale.type = "ordinal", but exactly two labels:


[[validation.rubric]]
id = "is_valid"
label = "Is this finding valid?"
scale.type = "ordinal"
scale.labels = ["no", "yes"]

Skip a row when another row’s consensus matches — use verdict.match on the dependent row:


[[validation.rubric]]
id = "validity"
label = "Validity"
prompt = "Is this finding a real issue?"
scale.type = "ordinal"
scale.labels = ["False positive", "Unlikely valid", "Unclear", "Likely valid", "Clearly valid"]

[[validation.rubric]]
id = "fix_soundness"
label = "Fix soundness"
prompt = "How sound is the proposed fix?"
scale.type = "ordinal"
scale.labels = ["Unsound", "Weak", "Partial", "Mostly sound", "Fully sound"]
verdict.match = { validity = ["False positive"] }

When validators agree the finding is a False positive, fix_soundness is treated as n/a — see Row-level verdict.

Likert — size required; labels optional:


[[validation.rubric]]
id = "diagnosis_match"
label = "Does the diagnosis match?"
scale.type = "likert"
scale.size = 7

Numeric — values required; labels optional:


[[validation.rubric]]
id = "certainty"
label = "How certain are you?"
scale.type = "numeric"
scale.values = [0, 25, 50, 75, 100]
scale.labels = ["none", "low", "medium", "high", "certain"]

[[validation.rubric]]
id = "confidence"
label = "Confidence Level"
prompt = "How confident are you in this assessment?"
scale.type = "numeric"
scale.values = [0, 20, 80, 100]
scale.labels = ["Guessing", "Low", "High", "Certain"]

Row-level verdict

Each [[validation.rubric]] block can carry optional verdict.verified_threshold and verdict.match. After validators submit, the engine assigns one outcome per row — then, if you declared [validation.verdict], rolls those up into one outcome per item.

Two levels — same word, different jobs

[[validation.rubric]].verdict.*     →    per-question outcome (verified, insufficient_signal, n_a)
[validation.verdict]                →    how row outcomes combine for the whole item

Do not confuse per-row verdict.* overrides with the item-level [validation.verdict] section below.

Row outcomes (runtime)

These are computed — not TOML keys you set:

| outcome | Meaning |
| --- | --- |
| verified | Enough validators agreed on the same answer (default ≥ 80% on the winning label). |
| insufficient_signal | Not enough agreement to call verified (e.g. scattered votes with no clear winner). |
| n_a | This question does not apply; set by verdict.match when another row’s consensus matches a trigger label. Skipped in item-level rollup. |

How each row is classified

  1. If verdict.match matches referenced rows’ consensus labels → n_a
  2. If there are no valid votes or ties cannot be resolved → insufficient_signal
  3. If the winning label’s vote share ≥ verified_threshold (default 0.8) → verified
  4. Otherwise → insufficient_signal
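The four steps above can be sketched in Python. This is an illustration under stated assumptions, not the engine's code: classify_row and the na_triggered flag are hypothetical names, votes are counted unweighted (no consensus_weight), and a tie for first place is treated as unresolved because tiebreakers are not modeled.

```python
from collections import Counter

def classify_row(votes, verified_threshold=0.8, na_triggered=False):
    """Return 'n_a', 'verified', or 'insufficient_signal' for one rubric row.

    votes: labels submitted by validators (None marks an invalid vote).
    na_triggered: True when the row's verdict.match rule fired (hypothetical flag).
    """
    # Step 1: a matching verdict.match rule wins before any vote counting.
    if na_triggered:
        return "n_a"

    valid = [v for v in votes if v is not None]
    # Step 2: no valid votes at all.
    if not valid:
        return "insufficient_signal"

    counts = Counter(valid).most_common()
    top_count = counts[0][1]
    # Step 2 (assumption): a tie for first place counts as unresolved.
    if len(counts) > 1 and counts[1][1] == top_count:
        return "insufficient_signal"

    # Steps 3 and 4: compare the winner's vote share with the threshold.
    share = top_count / len(valid)
    return "verified" if share >= verified_threshold else "insufficient_signal"
```

With five validators and the default threshold, four matching votes reach exactly 0.8 and the row is verified; three matching votes out of five do not.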

Optional row verdict keys

| option | accepts | required | description | example |
| --- | --- | --- | --- | --- |
| verdict.verified_threshold | number | no | Share of validators that must agree on the winning label for verified. Defaults to 0.8, or [validation.verdict].verified_threshold when set. | 0.85 |
| verdict.match | map (row id → labels) | no | Rubric row id keys; values are consensus labels that trigger n_a (scalar or array). | { validity = ["False positive"] } |
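A small Python sketch of how these two keys might be read. The function names and the consensus dictionary are hypothetical, and one behavior is an assumption the spec does not state: when verdict.match names several rows, the sketch requires all of them to match.

```python
DEFAULT_THRESHOLD = 0.8

def resolve_threshold(row_verdict, item_verdict=None):
    """Pick the row's threshold: row override, then item-level value, then 0.8."""
    if "verified_threshold" in row_verdict:
        return row_verdict["verified_threshold"]
    if item_verdict and "verified_threshold" in item_verdict:
        return item_verdict["verified_threshold"]
    return DEFAULT_THRESHOLD

def match_triggers_na(match, consensus):
    """True when each referenced row's consensus label is one of its trigger labels.

    match: rubric row id -> label or list of labels (the verdict.match table).
    consensus: rubric row id -> consensus label (hypothetical runtime data).
    Assumption: with several row ids, every one must match.
    """
    if not match:
        return False
    for row_id, labels in match.items():
        # A scalar is treated as a one-element list.
        triggers = [labels] if isinstance(labels, str) else list(labels)
        if consensus.get(row_id) not in triggers:
            return False
    return True
```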

Removed keys (parse error if present): verdict.type, verdict.contested_rule.*, [stage.verdict].

See Examples by scale type for full row snippets including verdict.match skip rules.

The [validation.verdict] section (datapoint-level)

Optional. Use when multiple [[validation.rubric]] questions must combine into one outcome for the whole item — for example KYC where ID match and document authenticity both must pass.

This is item-level composition. Per-row outcomes (verified, insufficient_signal, n_a) come from Row-level verdict on each [[validation.rubric]] block.

Skip [validation.verdict] when one rubric row is enough or when you do not need a single rolled-up item outcome.

[validation.verdict]
composition = "strict_and"
verified_threshold = 0.8

| option | accepts | required | description | example |
| --- | --- | --- | --- | --- |
| composition | enum | no (default strict_and) | How row outcomes combine into one item result. Only strict_and is supported today. | "strict_and" |
| verified_threshold | number | no | Agreement threshold (0.0–1.0) for verified consensus across validators. | 0.8 |
| tiebreakers | array of strings | no | Tie-break rules when validators disagree. Seniority tiebreaker is added automatically when validator classes exist. | ["majority", "validator:class_priority"] |
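As a rough illustration of strict_and, the sketch below combines row outcomes so that every applicable row must be verified. The function name and the item-level outcome strings are assumptions (this reference does not name item outcomes), as is the choice to treat an item whose rows are all n_a as not verified.

```python
def strict_and(row_outcomes):
    """Combine per-row outcomes into one item outcome.

    row_outcomes: rubric row id -> 'verified' | 'insufficient_signal' | 'n_a'.
    """
    # n_a rows are skipped in the item-level rollup.
    considered = [o for o in row_outcomes.values() if o != "n_a"]
    # Assumption: an item with no applicable rows is not verified.
    if not considered:
        return "insufficient_signal"
    if all(o == "verified" for o in considered):
        return "verified"
    return "insufficient_signal"
```

For the KYC case, id_match and doc_authentic must both be verified for the item to pass; one unresolved row is enough to hold the item back.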

The [validation.ground_truth] section

Use [validation.ground_truth] when your dataset includes columns validators should not see directly (golden labels, difficulty bins) or columns shown as hints (model predictions, pre-filled labels). Values are [ingestion.fields] key names, not raw source paths.

Skip when there is no answer key or prefill (e.g. open-ended preference data).


[validation.ground_truth]
golden_label_field = "label"
prefill_hint_field = "label_shown"
difficulty_field   = "difficulty"

| option | accepts | required | description | example |
| --- | --- | --- | --- | --- |
| golden_label_field | string | no | [ingestion.fields] key with the secret correct answer. Stripped before the validator UI. | "label" |
| prefill_hint_field | string | no | [ingestion.fields] key with a hint or model prediction shown to validators. | "label_shown" |
| difficulty_field | string | no | [ingestion.fields] key used for difficulty routing/binning. Stripped before the validator UI. | "difficulty" |

Future: Per-rubric-row [[validation.answer_key]] and [[validation.prefill]] blocks may replace this table in a follow-up change.

The [[validation.instructions]] section

[[validation.instructions]] holds onboarding slides for validators. Currently not rendered — the onboarding modal uses hardcoded copy.

Use TOML literal strings ('''…''', triple single quotes) for markdown content and any other verbatim user or policy text. Literal strings do not process backslash escapes, so Markdown escapes like \<NONE\> in tables parse correctly. Do not use basic strings ("""…""") for slide bodies — invalid escapes (e.g. \<) fail TOML parse with invalid escaped character. Reserve basic strings for short values that need escapes (e.g. delimiter = "\n\n---\n" under attestation).


[[validation.instructions]]
type = "markdown"
content = '''
Rate each finding using the OWASP severity definitions.
'''

[[validation.instructions]]
type = "image"
src = "instructions/severity.png"
caption = "Severity ladder"

| option | accepts | required | description | example |
| --- | --- | --- | --- | --- |
| type | string | yes | markdown or image. | "markdown" |
| content | string | yes (markdown slides) | Markdown slide body; use '''…''' literals. | see example above |
| src | string | yes (image slides) | Image path under the project upload prefix. | "instructions/severity.png" |
| caption | string | no | Caption under an image slide. | "Severity ladder" |

The same '''…''' literal rule applies to prompt on [[validators.classes]] when type = "ai" — prompts often embed sample rows or policy text with backslashes.

Validators (validators.*)

Who does the reviewing and under what terms: default validator count/pay/stake, validator classes (human or AI), per-item routing and escalation, claim duration, qualification gates, and task-page actions. These were previously at the top level or under [stage.*].

The [validators.claims] section

Optional claim duration — how long a validator has to finish one item after claiming it. Default 5 minutes.


[validators.claims]
duration_minutes = 45

| option | accepts | required | description | example |
| --- | --- | --- | --- | --- |
| duration_minutes | int | no (default 5) | Minutes before an unfinished claim expires. Max 24 hours. | 45 |

The [validators.actions] section

Optional validator task-page actions. Today: conflict-of-interest self-decline.


[validators.actions]
allow_conflict_of_interest_self_decline = true

| option | accepts | required | description | example |
| --- | --- | --- | --- | --- |
| allow_conflict_of_interest_self_decline | bool | no | When true, validators can decline a claimed item due to conflict of interest. | true |
| conflict_of_interest_decline_button_label | string | no | Button label override. | "Decline due to conflict" |
| conflict_of_interest_decline_confirm_title | string | no | Confirmation dialog title override. | "Decline this validation?" |
| conflict_of_interest_decline_confirm_body | string | no | Confirmation dialog body override. | "Use this when you have a conflict of interest." |

The [validators] section

Default validator count, pay, and stake. Declare [[validators.classes]] and [[validators.routes]] separately when you need personas or item-specific routing.


[validators]
num_validators = 3
reward_usd = "1.00"
stake_usd = "0.00"

| option | accepts | required | description | example |
| --- | --- | --- | --- | --- |
| num_validators | int | yes | Default validators per item when no route matches. | 3 |
| reward_usd | string | yes | Default reward per completed review (quoted decimal string). | "1.00" |
| stake_usd | string | yes | Default stake required to claim ("0.00" = none). | "0.00" |

The [[validators.classes]] section

Under [validators.*], classes are validator personas (pay, stake, consensus weight, and optionally an LLM agent) — not rubric answer types.

Each class is filled either by humans claiming slots in the portal (type = "human", the default when type is omitted) or by an AI agent (type = "ai") backed by a synthetic system user and the server AI worker. Routing, composition, and escalation treat both kinds the same — you can compose multiple AI classes on the default route and escalate to a human senior reviewer when consensus is low.

Class fields

| option | accepts | required | description | example |
| --- | --- | --- | --- | --- |
| id | string | yes | Class id used in routes. | "senior" |
| label | string | no | Display name in admin UI. | "Senior auditor" |
| type | human, ai | no (default human) | Who fills slots for this class. See AI agent classes. | "ai" |
| model | string | when type = "ai" | OpenRouter model slug; must be one of Allowed model values. | "anthropic/claude-sonnet-4.5" |
| prompt | string | when type = "ai" | System prompt sent to the model (max 64 KiB); use '''…''' literals for verbatim text. | see AI example below |
| reward_usd | string | yes | Per-review reward for this class. | "10.00" |
| stake_usd | string | yes | Per-review stake for this class. | "0.00" |
| priority | int | yes | Tie-break priority (higher = stronger). | 10 |
| consensus_weight | number | no (default 1.0) | Per-class vote weight in weighted consensus. | 1.0 |

Human example:

[[validators.classes]]
id         = "senior"
label      = "Senior reviewer"
reward_usd = "20.00"
stake_usd  = "0.00"
priority   = 10

AI agent classes

Set type = "ai" and declare model plus prompt to fill a class with an LLM instead of human validators. The worker polls unclaimed validations for that class, packages the same evidence and rubric a human would see, calls the model, and submits via the standard validation path (audit, staking, and consensus unchanged).

[[validators.classes]]
id         = "opus-reviewer"
label      = "Opus 4.7 reviewer"
type       = "ai"
model      = "anthropic/claude-opus-4.7"
prompt     = '''
You are an expert reviewer. Score each rubric row using only the discrete
anchors provided. Ground every judgment in the evidence; if evidence is
insufficient, pick the midpoint and say so in the rationale.
'''
priority   = 30
reward_usd = "0.00"
stake_usd  = "0.00"

Constraints

  • model must be an OpenRouter slug from the allowlist below. Compile-time validation rejects unknown slugs and echoes this list in the error.
  • prompt is required for type = "ai" and is sent verbatim as the system message (max 64 KiB).
  • type = "human" (or omitted type) must not set model or prompt — misplaced AI-only fields are rejected.
  • AI classes have no wallet. reward_usd and stake_usd default to "0.00" when omitted; non-zero stake can be set but is not slashable for the synthetic user.
  • consensus_weight defaults to 1.0 unless you explicitly weight AI votes differently.
  • Rubric scales must use integer-only anchors when any AI class is declared (see The scale block).
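The class-level constraints above could be checked along these lines. This is a sketch, not the compiler's validation: validate_class and its error strings are hypothetical, the allowlist is passed in by the caller, and the 64 KiB limit is assumed to be measured in UTF-8 bytes.

```python
MAX_PROMPT_BYTES = 64 * 1024  # 64 KiB limit from the constraints above

def validate_class(cls, allowed_models):
    """Return a list of error strings for one [[validators.classes]] table."""
    errors = []
    kind = cls.get("type", "human")  # an omitted type means human
    if kind == "ai":
        model = cls.get("model")
        if model not in allowed_models:
            # Echo the allowlist, as the compile-time check is described to do.
            errors.append(f"unknown model {model!r}; allowed: {sorted(allowed_models)}")
        prompt = cls.get("prompt")
        if not prompt:
            errors.append('prompt is required when type = "ai"')
        elif len(prompt.encode("utf-8")) > MAX_PROMPT_BYTES:
            # Assumption: the limit counts UTF-8 bytes, not characters.
            errors.append("prompt exceeds 64 KiB")
    elif kind == "human":
        for key in ("model", "prompt"):
            if key in cls:
                errors.append(f'{key} is only valid when type = "ai"')
    else:
        errors.append(f"unknown type {kind!r}")
    return errors
```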

Allowed model values

Each slug is an OpenRouter model id. All entries support multimodal evidence (images in messages) and strict JSON-schema response_format for rubric vectors.

| model slug | Context (tokens) | Notes |
| --- | --- | --- |
| anthropic/claude-haiku-4.5 | 200,000 | Lowest-latency Anthropic option; good for high-volume first-pass classes. |
| anthropic/claude-opus-4.6 | 200,000 | Prior Opus generation; kept for reproducibility of votes pinned to this version. |
| anthropic/claude-opus-4.7 | 200,000 | Latest Opus; highest-quality reasoning; high-stakes or final-step reviewers. |
| anthropic/claude-sonnet-4.5 | 200,000 | Default for most AI reviewer classes: strong vision, fast, cheaper than Opus. |
| google/gemini-2.5-flash | 1,000,000 | Fast and cheap with full multimodal support. |
| google/gemini-2.5-pro | 1,000,000 | Long context, strong vision; best when evidence payloads are large. |
| x-ai/grok-4.3 | 1,000,000 | xAI flagship; vision + structured outputs; vendor diversity. |
| meta-llama/llama-4-maverick | 1,048,576 | Open-weight multimodal flagship; useful for diversity in multi-agent compositions. |
| mistralai/mistral-medium-3.1 | 131,072 | European-hosted multimodal option. |
| qwen/qwen3-vl-235b-a22b-instruct | 131,072 | Open-weight multimodal (Alibaba); broad provider coverage. |
| openai/gpt-4o | 128,000 | Strong multimodal generalist; diversity in multi-agent compositions. |
| openai/gpt-4o-mini | 128,000 | Cheap, fast multimodal; first-pass or budget-bound classes. |

Example route mixing AI panel + human escalation:

[[validators.routes]]
total = 3
[[validators.routes.composition]]
class = "opus-reviewer"
count = 1
[[validators.routes.composition]]
class = "sonnet-reviewer"
count = 1
[[validators.routes.composition]]
class = "maverick-reviewer"
count = 1

[[validators.routes.escalation]]
match = { consensus_below = 0.6 }
add = 1
[[validators.routes.escalation.composition]]
class = "senior"
count = 1

Fixture: server/internal/projectspec/testdata/validators_ai.poq.toml. Design notes: issue #633; implementation server/internal/aivalidator.

The [[validators.routes]] section

Under [validators.*], routes assign validator counts and class slots to matching items — not HTTP or API routes.

Route fields

Use match to select items by [ingestion.fields] column names. A key may be a scalar (exact match) or an array (any listed value). Omit match on exactly one catch-all route (or use match = {}).


[[validators.routes]]
match = { proposed_severity = ["high", "critical"], image_type = "skin_lesion" }
total = 5
[[validators.routes.composition]]
class = "senior"
count = 2

[[validators.routes]]
total = 3   # catch-all — no match

[[validators.routes.escalation]]
match = { consensus_below = 0.6 }
add = 2

consensus_below is a reserved runtime metric (not an [ingestion.fields] key). Every other match key must be an [ingestion.fields] key.
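The matching rule can be sketched in Python using the two routes from the example. route_matches and select_route are hypothetical names, and two behaviors are assumptions not stated in this reference: all keys in a match table must hold at once, and specific routes are tried in file order before the catch-all.

```python
def route_matches(match, item):
    """True when every key in match equals, or lists, the item's field value."""
    for field, allowed in (match or {}).items():
        # A scalar is an exact match; an array accepts any listed value.
        values = allowed if isinstance(allowed, list) else [allowed]
        if item.get(field) not in values:
            return False
    return True  # an empty or omitted match is the catch-all

def select_route(routes, item):
    """Assumption: specific routes are tried in file order, catch-all last."""
    specific = [r for r in routes if r.get("match")]
    catch_all = [r for r in routes if not r.get("match")]
    for route in specific + catch_all:
        if route_matches(route.get("match"), item):
            return route
    return None

routes = [
    {"match": {"proposed_severity": ["high", "critical"], "image_type": "skin_lesion"}, "total": 5},
    {"total": 3},  # catch-all, no match
]

severe = {"proposed_severity": "critical", "image_type": "skin_lesion"}
mild = {"proposed_severity": "low", "image_type": "skin_lesion"}
print(select_route(routes, severe)["total"], select_route(routes, mild)["total"])
```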

| option | accepts | required | description | example |
| --- | --- | --- | --- | --- |
| match | table | no (omit on exactly one default route) | Match review items: [ingestion.fields] key → scalar or array of allowed values. | match = { difficulty = "hard" } |
| total | int | yes | Validators per matching item. | 5 |
| composition | array of tables | yes | Class slots: { class, count }. Use class = "*" for any class. | { class = "senior", count = 2 } |
| escalation | array of tables | no | Add validators when agreement is below threshold. Use match.consensus_below on escalation steps. | match = { consensus_below = 0.5 }, add = 2 |
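Escalation can be pictured as a threshold check on the current agreement level. The sketch below is an assumption-heavy illustration: total_after_escalation is a hypothetical name, consensus is taken to be a share between 0.0 and 1.0, and every step whose consensus_below exceeds the current value is applied once.

```python
def total_after_escalation(route, consensus):
    """Validators required for an item once escalation steps are considered."""
    total = route["total"]
    for step in route.get("escalation", []):
        # A step fires only when agreement is strictly below its threshold.
        if consensus < step["match"]["consensus_below"]:
            total += step["add"]
    return total

catch_all = {
    "total": 3,
    "escalation": [{"match": {"consensus_below": 0.6}, "add": 2}],
}
print(total_after_escalation(catch_all, 0.5))  # agreement below 0.6, two validators added
print(total_after_escalation(catch_all, 0.9))  # agreement high enough, no change
```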

The [[validators.qualification]] section

Each [[validators.qualification]] block defines one profile question that validators answer before claiming items.


[[validators.qualification]]
key = "audit_firm"
label = "Audit firm"
type = "text"
required = true

| option | accepts | required | description | example |
| --- | --- | --- | --- | --- |
| key | string | yes | Stored profile key (snake_case, unique). | "audit_firm" |
| label | string | yes | Form question text. | "Audit firm" |
| type | enum | no (default text) | Input type; only text today. | "text" |
| required | bool | no (default false) | Must be filled before claim when true. | true |

Attestation (attestation.*)

Signed PoQ report export — the output phase after review completes.

The [attestation] section

This section controls how the signed YAML attestation is constructed and formatted for download.


[attestation]
schema_version = "poq.attestation/v1"
signing_key_id = "poq-prod-2024"

[attestation.payload]
include_per_validator_votes = true
include_rationales_inline = false
outputs_alias = "findings"

[attestation.payload.outputs]
report_hash = "audit_report_hash"
severity_in_report = "proposed_severity"
repo_url = "repository"
commit_hash = "commit_sha"

[[attestation.metadata]]
field_name = "report_source"
channel = "portal"
sender = "0xC921CF11A568142223a52C7f0b4AE7023fb3326B"

[attestation.output]
delimiter = "\n\n---\n"
fence = "yaml"

Attestation fields

| option | accepts | required | description | example |
| --- | --- | --- | --- | --- |
| schema_version | string | yes | Must be poq.attestation/v1 for the current signer. | "poq.attestation/v1" |
| signing_key_id | string | yes | Matches the server-configured signing key id (e.g. sapien-prod-ed25519-v1). | "poq-prod-2024" |
| payload | table | yes | Nested table for payload options (see below). | [attestation.payload] |
| metadata | array | no | Extra signed report header fields (see below). | [[attestation.metadata]] |
| output | table | yes | Download formatting for appended mode (see below). | [attestation.output] |

The [attestation.payload] block

| option | accepts | required | description | example |
| --- | --- | --- | --- | --- |
| include_per_validator_votes | bool | no | When true, the signed YAML includes each validator's specific answer for every rubric row. | true |
| include_rationales_inline | bool | no | When false, rationales use opaque references instead of inline text. | false |
| outputs_alias | string | no | Names the main findings array in the YAML. Use "findings" for security audits; defaults to "sub_reports". | "findings" |

The [attestation.payload.outputs] block

Maps attestation fields to canonical field names defined in [ingestion.fields].

| option | accepts | required | description | example |
| --- | --- | --- | --- | --- |
| report_hash | string | no | Field holding the hash of the original input document. | "audit_report_hash" |
| severity_in_report | string | no | Field holding the original tool-reported severity. | "proposed_severity" |
| repo_url | string | no | Field holding the repository URL (emitted in audit_target). | "repository" |
| commit_hash | string | no | Field holding the pinned commit SHA (emitted in audit_target). | "commit_sha" |

The [[attestation.metadata]] block

Used for literal top-level YAML metadata blocks.

| option | accepts | required | description | example |
| --- | --- | --- | --- | --- |
| field_name | string | yes | The top-level YAML key name (e.g. report_source). | "report_source" |
| (others) | any | no | Any other keys in the table are emitted as key-value pairs under field_name. | channel = "portal" |

The [attestation.output] block

| option | accepts | required | description | example |
| --- | --- | --- | --- | --- |
| delimiter | string | yes | String inserted between the original document and the attestation in "appended" mode. | "\n\n---\n" |
| fence | enum | no | Optional markdown fence type. Only "yaml" is supported today (wraps the attestation in a yaml code fence). | "yaml" |
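To make the two output options concrete, here is a minimal sketch of how an appended download might be assembled. append_attestation is a hypothetical helper, and the exact layout (fence placement, trailing newline) is an assumption rather than the exporter's documented behavior.

```python
FENCE = "`" * 3  # built at runtime so this sample nests cleanly in a fenced block

def append_attestation(document, attestation_yaml, delimiter="\n\n---\n", fence=None):
    """Join the original document and the attestation for an appended download."""
    body = attestation_yaml.strip("\n")
    if fence == "yaml":
        # Wrap the attestation in a yaml code fence.
        body = f"{FENCE}yaml\n{body}\n{FENCE}"
    elif fence is not None:
        raise ValueError(f"unsupported fence {fence!r}")
    # Assumption: the delimiter goes between the two parts, with one trailing newline.
    return document + delimiter + body + "\n"

result = append_attestation(
    "# Audit report",
    "schema_version: poq.attestation/v1",
    fence="yaml",
)
print(result)
```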

Examples by project type

Smart contract audit findings


[project]
spec_version = "1"

[[ingestion.sources]]
id = "audit_report"
type = "markdown_split"
path_glob = "reports/*.md"
splitter.regex = '^##\s+(?P<id>F-\d+):?\s+(?P<title>.+)$'

[ingestion.fields]
id = "audit_report.id"
description = "audit_report.body"

[[validation.evidence]]
type = "markdown"
ingestion_field = "description"

[[validation.evidence]]
type = "source_excerpt"
repository = "repository"
path = "source_path"
commit_sha = "commit_sha"

[[validation.rubric]]
id = "severity"
label = "Severity"
scale.type = "ordinal"
scale.labels = ["info", "low", "medium", "high", "critical"]

[validators]
num_validators = 3
reward_usd = "5.00"
stake_usd = "0.00"

Medical imaging (melanoma)


[project]
spec_version = "1"

[[ingestion.sources]]
id = "labels"
type = "csv"
path = "labels.csv"

[[ingestion.sources]]
id = "images"
type = "file_collection"
path_glob = "images/*.jpg"

[[ingestion.joins]]
left = "labels"
right = "images"
left_on = "image_id"
right_on = "file_id"
type = "left"

[ingestion.fields]
id           = "labels.image_id"
image_url    = "images.url"
label        = "labels.label"
label_shown  = "labels.label_shown"
difficulty   = "labels.difficulty"

[validation.ground_truth]
golden_label_field = "label"
prefill_hint_field = "label_shown"
difficulty_field   = "difficulty"

[[validation.evidence]]
type = "image"
ingestion_field = "image_url"

[[validation.rubric]]
id = "diagnosis"
label = "Diagnosis"
scale.type = "likert"
scale.size = 5

[validators]
num_validators = 3
reward_usd = "1.00"
stake_usd = "0.00"

[[validators.routes]]
match = { difficulty = "hard" }
total = 5

KYC document review


[project]
spec_version = "1"

[[ingestion.joins]]
left = "applicants"
right = "documents"
left_on = "user_id"
right_on = "owner_id"
type = "inner"

[[validation.rubric]]
id = "id_match"
label = "Does the ID match?"
scale.type = "ordinal"
scale.labels = ["no", "yes"]
verdict.verified_threshold = 0.8

[[validation.rubric]]
id = "doc_authentic"
label = "Is the document authentic?"
scale.type = "ordinal"
scale.labels = ["no", "yes"]
verdict.verified_threshold = 0.8

[validation.verdict]
composition = "strict_and"

[validators]
num_validators = 3
reward_usd = "0.50"
stake_usd = "0.00"

[validators.actions]
allow_conflict_of_interest_self_decline = true

For developers

On-disk TOML in this reference is the authoring shape accepted by the parser (ingestion.*, validation.*, validators.*, top-level [attestation]). Legacy [stage.*] and [[inputs]] roots are no longer accepted.

Compiled runtime JSON still uses internal names (rubric_rows, ground_truth, evidence_blocks); mapping from the namespaced TOML layout to compile output is 1:1 at the semantic level (no JSON shape change).

Last updated Jun 16, 2026