evals-feed¶
The issues view's evals section — the project's current measured loss as a feed, the left box's default tab (the Evals|Threads switcher, [[issues-view]]). Latest reading per scenario, fresh first, video first; title-only rows, media strictly lazy.
raw source¶
The issues view (issues-view) is where a human reviews the project, and evals outrank the issues there: the freshest measurements lead, above the discussion, with the issues section pinned below (never pushed off-screen — the surface's outer container never scrolls; each section scrolls internally). A feed of every reading ever filed grows without bound; a feed of the project's current loss does not. The unit of this feed is the scenario, not the reading: yatsu already defines the latest reading per scenario as the current score, so the feed is bounded by declared scenarios (structural, slow-growing), never by measurement count. Review attends to what still counts.
expanded spec¶
Default view: latest reading per scenario, fresh only, newest first, evidence-kind filter defaulting
to video, falling back to image when no video reading exists and to all when neither media kind does;
stale readings collapse to a count badge, expanded on demand. The chips (video | image | note | all, the
stale toggle) live in this group's sticky head and are this group's own state — issues-view owns the
page shell (split, selection, j/k), never this group's filters.
Kinds are honest. A reading's kind is its evidence: video/image/transcript when a blob exists
(a legacy blob with no recorded kind is an image — every legacy capture was one), and note when no
blob exists at all (a verdict filed with prose only). A blob-less reading is never claimed by the media
filters and its row never advertises media it lacks — the note chip and tag are its own.
Rows are title-only, always — verdict mark · scenario · node · evidence-kind tag · relative time —
no media request of any kind in the list. Selecting a row opens it in the page's DETAIL pane as the
annotator — media loads there, a <video> element exists only there. The group reports its visible
rows upward so the page's j/k walk one flat list across both groups; history drills down per scenario
(the node's yatsu-eval-tab scaffold), not in the list.
One data path, one computation. The board nodes arrive as a prop from the app's single board
poll + SSE subscription — the section fetches nothing of its own — and latest-per-scenario is
scenarioStates, the same computation behind the node badge, the focus panel, and the eval tab; the feed
never re-derives the current score its own way. At scale the board fold itself converges to the same
semantics — latest reading per scenario plus a history count, the full timeline served per node on
demand — one convergence shared by this feed, the node eval tab, and board-lean;
clean --keep-latest already aligns the evidence bytes with it.