Qofi builds finance-native environments and real deal challenges to better understand and shape the behavior of frontier intelligences.
How It Works
Built From Real Work, Graded by Real Operators
Workflow-Faithful Tasks
Agents work the way analysts do: documents, models, and judgment calls in sequence, evaluated on the full trajectory rather than a single answer.
Operator-Graded Rewards
Reward functions are written and enforced by practitioners. Where a rubric cannot capture a judgment call, a former operator grades it.
Eval and Training Modes
The same environment runs as a held-out benchmark or a training signal: measure capability, then close the gap with the data only Qofi can produce.
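As an illustration of the ideas above, here is a minimal sketch of trajectory-level scoring in which a rubric grades each step and a human operator grades whatever the rubric cannot judge. Every name here (`Step`, `score_trajectory`, `rubric`, `operator_grade`) is hypothetical and not part of any Qofi API, and the equal-weight average across steps is an assumption made for simplicity.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Step:
    """One action in an agent trajectory (hypothetical structure)."""
    kind: str    # e.g. "document", "model", or "judgment"
    output: str  # what the agent produced at this step


# A rubric returns a score in [0, 1], or None when it cannot judge the step.
Rubric = Callable[[Step], Optional[float]]
# An operator grade always returns a score in [0, 1].
OperatorGrade = Callable[[Step], float]


def score_trajectory(steps: List[Step], rubric: Rubric,
                     operator_grade: OperatorGrade) -> float:
    """Average per-step scores, deferring to the operator where the rubric abstains."""
    if not steps:
        return 0.0
    total = 0.0
    for step in steps:
        score = rubric(step)
        if score is None:  # rubric cannot capture this judgment call
            score = operator_grade(step)
        total += score
    return total / len(steps)


# Illustrative stand-ins only: a toy rubric and a fixed operator grade.
def rubric(step: Step) -> Optional[float]:
    if step.kind == "judgment":
        return None  # abstain on judgment calls
    return 1.0 if step.output else 0.0


def operator_grade(step: Step) -> float:
    return 0.5  # placeholder for a human-assigned grade


steps = [
    Step("document", "parsed filing"),
    Step("model", ""),  # empty output fails the toy rubric
    Step("judgment", "proceed"),
]
trajectory_score = score_trajectory(steps, rubric, operator_grade)
```

Under these assumptions the same `score_trajectory` value could be reported as a benchmark result in evaluation mode or passed back as a reward in training mode; how the real environments weight steps is not stated in the source.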
Evaluation
Measure Any Model Against the Standard
Run a model through a Qofi environment and see exactly where it holds up and where it breaks. Models are scored on the deal-level reasoning that public benchmarks miss, on tasks where results still have headroom.
Trajectory scoring · Held-out tasks · Operator rubrics
Coverage
Environments Across the Deal Spectrum
Each environment is drawn from a live workflow in the repo, and the library grows with every engagement.
Data Sources
Deal Domains
Evaluate a Model in a Qofi Environment
Experiment with Qofi RL environments today.