Cohort Independence
Not all pieces of evidence are independent. Many commits to the same repository in a single weekend share an error mode. Multiple peer endorsements from the same team share a social-loyalty bias. Treating correlated evidence as independent inflates the posterior in ways that do not reflect real-world capability.
Bukti groups correlated evidence into cohorts. Within a cohort, a sub-linear aggregation replaces naive summation: the strongest single item provides the floor, and additional in-cohort items add a bonus that grows much slower than linearly with cohort size.
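The exact aggregation formula is not given on this page. As a minimal sketch, assume the in-cohort bonus is logarithmic in cohort size; `cohort_contribution` and `bonus_scale` are hypothetical names and values chosen for illustration, not Bukti's actual implementation.

```python
import math

def cohort_contribution(weights, bonus_scale=0.25):
    """Sub-linear cohort aggregation (assumed logarithmic bonus).

    The strongest item counts in full; the remaining items only add a
    bonus proportional to log(cohort size).
    """
    if not weights:
        return 0.0
    strongest = max(weights)
    # log(1) == 0, so a single-item cohort contributes exactly its own weight.
    bonus = bonus_scale * strongest * math.log(len(weights))
    return strongest + bonus

weights = [1.0] * 50                    # fifty equally weighted, correlated items
naive = sum(weights)                    # naive summation treats them as independent
grouped = cohort_contribution(weights)  # cohort treatment
print(f"naive={naive:.2f} grouped={grouped:.2f}")
```

Under these assumed numbers, fifty correlated items contribute slightly less than two items' worth of evidence instead of fifty.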
Cohort grouping rules
The following rules run in priority order. Once a VOI is assigned to a cohort by a higher-priority rule, it cannot be reassigned.
Rule 1 — Same platform, same ISO week
VOIs from the same source platform with observed_at timestamps in the same ISO calendar week are grouped. This captures farming patterns: many submissions on a single platform within one ISO week (Monday through Sunday) collapse to one cohort.
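One way to derive a Rule 1 cohort key is Python's `datetime.isocalendar()`, which returns the ISO year, week, and weekday. The sketch below assumes timestamps are timezone-aware and normalized to UTC before bucketing; that normalization and the name `iso_week_key` are assumptions, not documented Bukti behavior.

```python
from datetime import datetime, timezone

def iso_week_key(platform, observed_at):
    """Return a (platform, ISO year, ISO week) cohort key for Rule 1."""
    # Assumption: observed_at is timezone-aware; normalize to UTC first.
    iso = observed_at.astimezone(timezone.utc).isocalendar()
    # Use the ISO year, not observed_at.year: the two differ near New Year.
    return (platform, iso[0], iso[1])

monday = datetime(2024, 12, 30, 9, 0, tzinfo=timezone.utc)
sunday = datetime(2025, 1, 5, 23, 0, tzinfo=timezone.utc)
next_monday = datetime(2025, 1, 6, 0, 0, tzinfo=timezone.utc)

print(iso_week_key("platform-x", monday))       # ISO week 1 of 2025
print(iso_week_key("platform-x", sunday))       # same cohort key as monday
print(iso_week_key("platform-x", next_monday))  # a new ISO week, new cohort
```

Keying on the ISO year matters at year boundaries: 30 December 2024 belongs to ISO week 1 of 2025, so pairing the calendar year with the ISO week would split one week into two cohorts.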
Rule 2 — Same attestor
Peer-attestation VOIs from the same attestor are grouped. This captures cases where one person attests to the same capability multiple times, or where multiple attestations arrive from the same source account.
Attack this stops: a single person endorsing a friend many times across many different capabilities does not inflate the friend's posterior linearly — the attestation cohort contributes as a single-attestor batch.
Rule 3 — Same employer, same quarter
Indirect-attestation VOIs sharing an organizational source domain in the same calendar quarter are grouped. This captures employer-mention clusters that share a common organizational context.
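A Rule 3 key can be sketched the same way. The function name, the lower-casing of the domain, and the use of standard calendar quarters starting in January are assumptions for illustration.

```python
from datetime import date

def employer_quarter_key(org_domain, observed_on):
    """Return an (organizational domain, year, quarter) cohort key for Rule 3."""
    # Months 1-3 map to Q1, 4-6 to Q2, 7-9 to Q3, 10-12 to Q4.
    quarter = (observed_on.month - 1) // 3 + 1
    # Assumption: domains are compared case-insensitively.
    return (org_domain.lower(), observed_on.year, quarter)

print(employer_quarter_key("Example.com", date(2025, 3, 31)))
print(employer_quarter_key("example.com", date(2025, 4, 1)))
```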
Rule 4 — Same credential event
Credential-badge VOIs from the same issuance event are grouped. This stops a workshop that issues several separate badges for several sub-skills of the same topic from producing several independent contributions.
Attack this stops: obtaining all badges from a single training course and claiming independent evidence for several capabilities. All badges from the same event collapse to one cohort, which contributes its strongest badge at full weight plus a small in-cohort bonus.
Rule 5 — Capability context
VOIs for capabilities that share a parent in the capability DAG are treated as having partial cohort overlap.
Attack this stops: "many framework certificates from one workshop" inflating across what appear to be distinct capabilities when the underlying evidence all comes from the same course.
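The priority ordering of Rules 1 to 4 can be sketched as a first-match-wins loop. Everything below is illustrative: the VOI is modeled as a plain dict, and the field names (`kind`, `platform`, `attestor`, `org_domain`, `event_id`) are hypothetical. A rule is assumed to apply only when the VOI carries the fields it needs. Rule 5 is left out because it describes partial overlap, not a grouping key.

```python
from datetime import date

def rule_platform_week(voi):
    if voi.get("platform") and voi.get("observed_at"):
        iso = voi["observed_at"].isocalendar()
        return ("platform-week", voi["platform"], iso[0], iso[1])
    return None

def rule_attestor(voi):
    if voi.get("kind") == "peer_attestation" and voi.get("attestor"):
        return ("attestor", voi["attestor"])
    return None

def rule_employer_quarter(voi):
    if (voi.get("kind") == "indirect_attestation"
            and voi.get("org_domain") and voi.get("observed_at")):
        d = voi["observed_at"]
        return ("employer-quarter", voi["org_domain"], d.year, (d.month - 1) // 3 + 1)
    return None

def rule_credential_event(voi):
    if voi.get("kind") == "credential_badge" and voi.get("event_id"):
        return ("credential-event", voi["event_id"])
    return None

# Highest priority first; the first rule that yields a key wins.
RULES = [rule_platform_week, rule_attestor, rule_employer_quarter, rule_credential_event]

def assign_cohort(voi):
    for rule in RULES:
        key = rule(voi)
        if key is not None:
            return key  # never reassigned by a lower-priority rule
    return ("singleton", voi["id"])  # no rule applies: the VOI stands alone

both = {"id": 1, "kind": "peer_attestation", "attestor": "acct-7",
        "platform": "platform-x", "observed_at": date(2025, 3, 3)}
attest_only = {"id": 2, "kind": "peer_attestation", "attestor": "acct-7"}
print(assign_cohort(both)[0])         # platform-week (Rule 1 outranks Rule 2)
print(assign_cohort(attest_only)[0])  # attestor
```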
Concrete attack scenarios
Single-platform farming
An entity submits a large volume of work on a single platform over a short window, all mapped to the same capability. Without cohort grouping, every submission contributes independent pseudo-counts, easily saturating the posterior. With Rule 1, the window collapses to one or two cohorts (depending on whether it crosses an ISO week boundary), each contributing its strongest item plus a sub-linear in-cohort bonus. The effect on the posterior is far smaller than under naive aggregation.
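To make the scenario concrete, the sketch below runs 40 submissions over five days that straddle an ISO week boundary. The per-item weight of 0.5, the logarithmic bonus, and `BONUS_SCALE` are all assumed for illustration; the real formula and weights are not given on this page.

```python
import math
from collections import defaultdict
from datetime import date, timedelta

BONUS_SCALE = 0.25  # hypothetical tuning constant, not a Bukti value

def cohort_contribution(weights):
    # Assumed form: strongest item in full plus a logarithmic bonus.
    strongest = max(weights)
    return strongest + BONUS_SCALE * strongest * math.log(len(weights))

# 8 submissions per day, Friday 2025-03-07 through Tuesday 2025-03-11,
# each with an illustrative evidence weight of 0.5.
start = date(2025, 3, 7)
submissions = [(start + timedelta(days=d), 0.5) for d in range(5) for _ in range(8)]

# Rule 1: bucket by (ISO year, ISO week); the platform is the same throughout.
cohorts = defaultdict(list)
for day, weight in submissions:
    iso = day.isocalendar()
    cohorts[(iso[0], iso[1])].append(weight)

naive_total = sum(w for _, w in submissions)
grouped_total = sum(cohort_contribution(ws) for ws in cohorts.values())
print(f"naive={naive_total:.2f} cohorts={len(cohorts)} grouped={grouped_total:.2f}")
```

With these assumed values, the 40 submissions fall into two cohorts (24 items before the Monday boundary, 16 after), and the combined contribution drops from 20.0 under naive summation to roughly 1.74.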
Reciprocal endorsement ring
Entity A and Entity B each create accounts and attest to each other's capabilities multiple times. Without cohort grouping, those attestations contribute linearly. With Rule 2, all attestations from the same attestor collapse to a single cohort. In addition, each of these attestors starts at the lowest identity grade, which further reduces the effective weight of its peer attestations.
What cohort grouping does not fix
Cohort grouping addresses within-session and same-source correlation. It does not:
- Fix independently sourced but genuinely correlated evidence (e.g., two former colleagues at the same company who independently attest the same skill; they share an error mode, but the same-employer rule, Rule 3, applies only to indirect attestation).
- Fix calibration errors in the weights themselves.
- Replace an attestor-reliability system that downweights systematically unreliable attestors.
Related pages
- scoring-formula.md — how cohort pseudo-counts feed into α and β
- contradiction-detection.md — negative evidence handling
- evidence-weights.md — the weight categories that cohort contributions are built from