Flag mishaps in the analysis to feed benchmarks

Enable users to flag any part of a completed analysis where the LLM misinterpreted something and the result needs an adjustment.

This makes it possible to fine-tune the analysis by feeding each flagged example into the benchmark samples. Over time you build your own verified dataset, which gives you the tools to adjust the analysis pipeline and improve its quality.
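
To make the idea concrete, here is a minimal sketch of how flagged examples could accumulate into a benchmark set and be used to score a revised pipeline. Everything in it is an assumption for illustration: the `Flag` and `BenchmarkSet` names, their fields, and the exact-match scoring are hypothetical and do not describe the product's actual data model.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Flag:
    """One user-reported misinterpretation (hypothetical shape)."""
    analysis_id: str       # which analysis the flag belongs to
    source_text: str       # the input the LLM analyzed
    llm_output: str        # what the LLM concluded
    corrected_output: str  # what the user says it should have been


@dataclass
class BenchmarkSet:
    """Verified samples accumulated from user flags (hypothetical)."""
    samples: list[Flag] = field(default_factory=list)

    def add_flag(self, flag: Flag) -> None:
        # Each flag becomes one verified benchmark sample.
        self.samples.append(flag)

    def score(self, analyze: Callable[[str], str]) -> float:
        """Fraction of samples where `analyze` matches the user's correction."""
        if not self.samples:
            return 0.0
        hits = sum(
            1 for s in self.samples
            if analyze(s.source_text) == s.corrected_output
        )
        return hits / len(self.samples)


bench = BenchmarkSet()
bench.add_flag(Flag("a1", "The onboarding was not bad at all", "negative", "positive"))
# Re-run a candidate pipeline against the flagged samples after each adjustment.
print(bench.score(lambda text: "positive"))
```

Exact string matching is the simplest possible comparison; a real benchmark would likely need a looser notion of a correct answer.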

Status

Planned

Board

Feature requests and bug reports

Date

About 1 month ago

Author

Hervé Labas
