We track who reviewed what so that calibration drift and inter-rater reliability can be measured. Your rater_id is attached to every eval you submit.
Computed live from the evals.jsonl store.
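As a rough illustration of how a reliability figure could be derived from such a store, here is a minimal sketch that computes raw pairwise agreement between raters. It assumes each line of evals.jsonl is a JSON object with rater_id plus two hypothetical fields, item_id and label; the real schema may differ, and the dashboard may use a different statistic (for example a chance-corrected one such as Cohen's kappa).

```python
import json
from collections import defaultdict
from itertools import combinations


def load_evals(lines):
    """Parse JSONL lines into dicts, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def pairwise_agreement(evals):
    """Share of rater pairs that gave the same label to the same item.

    Assumes hypothetical fields "item_id" and "label" alongside "rater_id".
    Returns None when no item has been labeled by two or more raters.
    """
    by_item = defaultdict(dict)
    for e in evals:
        # One label per (item, rater); a later record overwrites an earlier one.
        by_item[e["item_id"]][e["rater_id"]] = e["label"]

    agree = total = 0
    for labels in by_item.values():
        for a, b in combinations(sorted(labels), 2):
            total += 1
            agree += labels[a] == labels[b]
    return agree / total if total else None
```

To run it against a file, pass an open file handle to load_evals (for instance `with open("evals.jsonl") as f: evals = load_evals(f)`) and hand the result to pairwise_agreement. Raw agreement is used here only because it is the simplest measure to read; it does not correct for agreement that would occur by chance.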
See the full dashboard →