The current screen does not properly render some details of the eval results, which can make them hard to understand:
- Show what is expected and what the actual result is only when a mistake is flagged: no repetition on success, so attention stays on the mistakes to analyze.
- Fix the false positives caused by an outright failure on the LLM side: if an example is expected to detect nothing, we currently flag a success even when the pipeline itself failed (see the sketch below).
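
A minimal sketch of both points, assuming the eval result can carry a pipeline error separately from its detections; every name here (PipelineResult, Verdict, evaluate, render_row) is hypothetical, not the app's actual API:

```python
from dataclasses import dataclass
from enum import Enum, auto


class Verdict(Enum):
    SUCCESS = auto()
    MISTAKE = auto()
    PIPELINE_ERROR = auto()  # distinct outcome: the LLM call itself failed


@dataclass
class PipelineResult:
    detections: list[str]
    error: str | None = None  # set when the pipeline failed outright


def evaluate(expected: list[str], result: PipelineResult) -> Verdict:
    # A pipeline failure must never count as a match, even when the
    # example expects no detections (empty list vs. empty list).
    if result.error is not None:
        return Verdict.PIPELINE_ERROR
    return Verdict.SUCCESS if result.detections == expected else Verdict.MISTAKE


def render_row(expected: list[str], result: PipelineResult) -> str:
    # Only mistakes repeat the expected/actual pair; successes stay terse.
    verdict = evaluate(expected, result)
    if verdict is Verdict.SUCCESS:
        return "success"
    if verdict is Verdict.PIPELINE_ERROR:
        return f"pipeline error: {result.error}"
    return f"mistake: expected {expected}, got {result.detections}"
```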
Planned
Feature requests and bug reports
About 1 month ago

Hervé Labas