
VOI Schema — Verified Outcome Instance

A Verified Outcome Instance (VOI) is the atomic, immutable unit of evidence in Bukti. Every capability claim in the system is backed by one or more VOIs. This page documents the schema, the evidence-type enumeration, and the immutability rules.


What a VOI represents

A VOI is not a "verified" claim in the cryptographic sense — it is a structured record of a single piece of evidence asserting that an entity has demonstrated a capability. The name reflects the aspiration (outcome-backed verification), not the guarantee. The strength of the evidence is scored by the substantive grading system (see scoring-formula.md).


Schema

```python
from datetime import datetime
from pydantic import BaseModel, Field

class VOI(BaseModel):
    id: str                        # Unique VOI identifier (UUID)
    entity_id: str                 # Entity this evidence belongs to
    capability_id: str             # Ontology node reference
    evidence_type: EvidenceType
    evidence_text: str             # Extracted evidence snippet
    evidence_uri: str | None       # Source URL
    source_platform: str           # "github", "credly", "resume", "web", ...
    extraction_confidence: float = Field(ge=0, le=1)  # LLM parse quality (0–1)
    observed_at: datetime          # When capability was demonstrated (valid time)
    recorded_at: datetime          # When Bukti recorded this VOI (transaction time)
    raw_signal_hash: str | None    # Deduplication hash
    supersedes_id: str | None      # ID of VOI this corrects/supersedes
```


Evidence type enumeration

The EvidenceType enum defines the evidence types the system recognizes. Each type has a default weight in the scoring formula (see evidence-weights.md).

```python
from enum import StrEnum

class EvidenceType(StrEnum):
    behavioral_artifact = "behavioral_artifact"      # commits, code artifacts
    task_outcome = "task_outcome"                    # deployed projects, measurable outputs
    peer_attestation = "peer_attestation"            # third-party endorsement
    contribution_artifact = "contribution_artifact"  # open-source contributions
    publication_artifact = "publication_artifact"    # papers, articles
    credential_badge = "credential_badge"            # Credly, Open Badges, certificates
    indirect_attestation = "indirect_attestation"    # mentioned by others, not endorsed
    self_reported = "self_reported"                  # resume self-claims
    self_authored = "self_authored"                  # self-authored content about capability
```


Bi-temporal semantics

VOIs carry two timestamps:

  • observed_at (valid time): when the underlying capability was demonstrated. For a GitHub commit, this is the commit date; for a credential, it is the issue date or the event date. This is the timestamp used in decay calculations.
  • recorded_at (transaction time): when Bukti recorded the VOI in the system. This is always set to the current time at ingestion and cannot be modified.

Bi-temporal storage means Bukti can reconstruct "what did the system know about entity X on date D, for evidence that was valid at that date?" — which matters for audit trails and for detecting retroactive evidence insertion.
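The reconstruction question above can be answered by filtering on both timestamps. A minimal sketch (the `known_as_of` helper and the in-memory list are illustrative, not Bukti's actual storage layer):

```python
from datetime import datetime, timezone
from types import SimpleNamespace

def known_as_of(vois, entity_id, as_of):
    """VOIs for `entity_id` that were both recorded in the system and
    valid (i.e., the evidence had already been demonstrated) by `as_of`."""
    return [
        v for v in vois
        if v.entity_id == entity_id
        and v.recorded_at <= as_of   # the system knew about it by `as_of`
        and v.observed_at <= as_of   # the evidence was valid by `as_of`
    ]

vois = [
    SimpleNamespace(entity_id="e1",
                    observed_at=datetime(2023, 1, 10, tzinfo=timezone.utc),
                    recorded_at=datetime(2023, 2, 1, tzinfo=timezone.utc)),
    # Retroactively inserted evidence: observed early, recorded much later.
    SimpleNamespace(entity_id="e1",
                    observed_at=datetime(2023, 1, 5, tzinfo=timezone.utc),
                    recorded_at=datetime(2024, 6, 1, tzinfo=timezone.utc)),
]
# Only the first VOI was known to the system on 2023-03-01.
snapshot = known_as_of(vois, "e1", datetime(2023, 3, 1, tzinfo=timezone.utc))
```

Note how the retroactively inserted VOI is excluded from the snapshot even though its valid time predates the query date: its transaction time reveals when it actually entered the system.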


Immutability rule

VOIs are never modified. If a VOI is found to be incorrect (wrong date, wrong capability mapping, wrong extraction), the correction procedure is:

  1. Create a new VOI with the correct data.
  2. Set supersedes_id on the new VOI to point at the original VOI's id.
  3. The original VOI is preserved in the database with all original values.

The scoring system only uses the latest non-superseded VOI in each supersession chain. Superseded VOIs are retained for audit purposes.
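Resolving the head of each supersession chain is a set-membership check: a VOI is active if no other VOI names it in supersedes_id. A minimal sketch (function name is hypothetical):

```python
from types import SimpleNamespace

def latest_non_superseded(vois):
    """Keep only the heads of supersession chains: VOIs whose id is
    not referenced by any other VOI's supersedes_id."""
    superseded = {v.supersedes_id for v in vois if v.supersedes_id is not None}
    return [v for v in vois if v.id not in superseded]

original = SimpleNamespace(id="voi-1", supersedes_id=None)       # incorrect VOI, preserved
correction = SimpleNamespace(id="voi-2", supersedes_id="voi-1")  # correcting VOI
active = latest_non_superseded([original, correction])
```

Both records stay in the database; only the correction is visible to scoring.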


Extraction confidence

extraction_confidence is the LLM's self-reported quality for the parse operation that produced this VOI. It is a float on [0, 1]. In the scoring formula, it acts as the valence s_i — the degree to which this VOI is "success-like" versus "failure-like." A VOI with extraction_confidence = 0.9 contributes more positive pseudo-count than one with extraction_confidence = 0.5.

Caveat: extraction_confidence is an uncalibrated LLM self-report today. It has not been validated against held-out human-labeled VOIs. When calibration data exists, it will be recalibrated or replaced with a binary gate ("parses cleanly" / "does not"). See calibration-status.md.
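To make the valence role concrete, here is a Beta-style pseudo-count sketch: each VOI contributes `w * s_i` to the success count and `w * (1 - s_i)` to the failure count, where `s_i` is extraction_confidence and `w` is the evidence-type weight. The weight values below are hypothetical; the actual formula and weights live in scoring-formula.md and evidence-weights.md.

```python
from types import SimpleNamespace

def pseudo_counts(vois, weights):
    """Illustrative Beta-style update using extraction_confidence as
    the valence s_i. Not the actual Bukti scoring formula."""
    alpha = beta = 0.0  # success / failure pseudo-counts
    for v in vois:
        w = weights.get(v.evidence_type, 0.0)
        alpha += w * v.extraction_confidence
        beta += w * (1.0 - v.extraction_confidence)
    return alpha, beta

weights = {"behavioral_artifact": 1.0, "self_reported": 0.2}  # hypothetical values
vois = [
    SimpleNamespace(evidence_type="behavioral_artifact", extraction_confidence=0.9),
    SimpleNamespace(evidence_type="self_reported", extraction_confidence=0.5),
]
alpha, beta = pseudo_counts(vois, weights)  # high-confidence VOI dominates alpha
```

Under this sketch, the 0.9-confidence VOI adds nine times more positive than negative pseudo-count, while the 0.5-confidence VOI splits its (smaller) weight evenly.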