Leaderboard: How agent architecture affects LLM decision-making

Models × agent modes × quests. Same task, different cognitive scaffolding, different outcomes.

Success Rate by Model and Mode
Steps vs Success Rate
Mode Impact: Success Rate Lift over Baseline