Leaderboard
Submissions are evaluated per task family and per baseline profile (coding · vision · fusion). Every entry links back to its reproduce-recipe and tool-trace so that any third party can re-run it.
| Submission | Profile | Family | Score | Date | Trace |
|---|---|---|---|---|---|
No submissions yet.
The 6 task families above are open for baselines. To submit a run, follow the submission guide; your entry appears here within 24 hours of evaluation.